All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] gpu: host1x: Do not output error message for deferred probe
@ 2019-06-04 15:31 Thierry Reding
  2019-06-04 16:07 ` Dmitry Osipenko
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Thierry Reding @ 2019-06-04 15:31 UTC (permalink / raw)
  To: Thierry Reding; +Cc: linux-tegra, dri-devel

From: Thierry Reding <treding@nvidia.com>

When deferring probe, avoid logging a confusing error message. While at
it, make the error message more informational.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/gpu/host1x/dev.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
index c55e2d634887..5a3f797240d4 100644
--- a/drivers/gpu/host1x/dev.c
+++ b/drivers/gpu/host1x/dev.c
@@ -247,8 +247,11 @@ static int host1x_probe(struct platform_device *pdev)
 
 	host->clk = devm_clk_get(&pdev->dev, NULL);
 	if (IS_ERR(host->clk)) {
-		dev_err(&pdev->dev, "failed to get clock\n");
 		err = PTR_ERR(host->clk);
+
+		if (err != -EPROBE_DEFER)
+			dev_err(&pdev->dev, "failed to get clock: %d\n", err);
+
 		return err;
 	}
 
-- 
2.21.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] gpu: host1x: Do not output error message for deferred probe
  2019-06-04 15:31 [PATCH] gpu: host1x: Do not output error message for deferred probe Thierry Reding
@ 2019-06-04 16:07 ` Dmitry Osipenko
  2019-06-05  8:28   ` Thierry Reding
  2019-06-04 18:35 ` Dmitry Osipenko
  2019-06-05 12:40 ` Daniel Vetter
  2 siblings, 1 reply; 10+ messages in thread
From: Dmitry Osipenko @ 2019-06-04 16:07 UTC (permalink / raw)
  To: Thierry Reding; +Cc: linux-tegra, dri-devel

04.06.2019 18:31, Thierry Reding пишет:
> From: Thierry Reding <treding@nvidia.com>
> 
> When deferring probe, avoid logging a confusing error message. While at
> it, make the error message more informational.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/gpu/host1x/dev.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
> index c55e2d634887..5a3f797240d4 100644
> --- a/drivers/gpu/host1x/dev.c
> +++ b/drivers/gpu/host1x/dev.c
> @@ -247,8 +247,11 @@ static int host1x_probe(struct platform_device *pdev)
>  
>  	host->clk = devm_clk_get(&pdev->dev, NULL);
>  	if (IS_ERR(host->clk)) {
> -		dev_err(&pdev->dev, "failed to get clock\n");
>  		err = PTR_ERR(host->clk);
> +
> +		if (err != -EPROBE_DEFER)
> +			dev_err(&pdev->dev, "failed to get clock: %d\n", err);
> +
>  		return err;
>  	}

The clock driver should be available at the time of host1x's probing on
all Tegra's because it is one of essential core drivers that become
available early during boot.

I guess you're making this change for T186, is it because the BPMP
driver's probe getting deferred? If yes, won't it be possible to fix the
defer of the clock driver instead of making such changes in the affected
drivers?

-- 
Dmitry
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] gpu: host1x: Do not output error message for deferred probe
  2019-06-04 15:31 [PATCH] gpu: host1x: Do not output error message for deferred probe Thierry Reding
  2019-06-04 16:07 ` Dmitry Osipenko
@ 2019-06-04 18:35 ` Dmitry Osipenko
  2019-06-05 12:40 ` Daniel Vetter
  2 siblings, 0 replies; 10+ messages in thread
From: Dmitry Osipenko @ 2019-06-04 18:35 UTC (permalink / raw)
  To: Thierry Reding; +Cc: linux-tegra, dri-devel

04.06.2019 18:31, Thierry Reding пишет:
> From: Thierry Reding <treding@nvidia.com>
> 
> When deferring probe, avoid logging a confusing error message. While at
> it, make the error message more informational.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/gpu/host1x/dev.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
> index c55e2d634887..5a3f797240d4 100644
> --- a/drivers/gpu/host1x/dev.c
> +++ b/drivers/gpu/host1x/dev.c
> @@ -247,8 +247,11 @@ static int host1x_probe(struct platform_device *pdev)
>  
>  	host->clk = devm_clk_get(&pdev->dev, NULL);
>  	if (IS_ERR(host->clk)) {
> -		dev_err(&pdev->dev, "failed to get clock\n");
>  		err = PTR_ERR(host->clk);
> +
> +		if (err != -EPROBE_DEFER)
> +			dev_err(&pdev->dev, "failed to get clock: %d\n", err);
> +
>  		return err;
>  	}

The clock driver should be available at the time of host1x's probing on
all Tegra's because it is one of essential core drivers that become
available early during boot.

I guess you're making this change for T186, is it because the BPMP
driver's probe getting deferred? If yes, won't it be possible to fix the
defer of the clock driver instead of making such changes in all of the
affected drivers?

[it appeared to me that first email got dropped by gmail, so I'm
re-sending it second time just in a case]

-- 
Dmitry
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] gpu: host1x: Do not output error message for deferred probe
  2019-06-04 16:07 ` Dmitry Osipenko
@ 2019-06-05  8:28   ` Thierry Reding
  2019-06-05 11:25     ` Dmitry Osipenko
  0 siblings, 1 reply; 10+ messages in thread
From: Thierry Reding @ 2019-06-05  8:28 UTC (permalink / raw)
  To: Dmitry Osipenko; +Cc: linux-tegra, dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 2645 bytes --]

On Tue, Jun 04, 2019 at 07:07:42PM +0300, Dmitry Osipenko wrote:
> 04.06.2019 18:31, Thierry Reding пишет:
> > From: Thierry Reding <treding@nvidia.com>
> > 
> > When deferring probe, avoid logging a confusing error message. While at
> > it, make the error message more informational.
> > 
> > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > ---
> >  drivers/gpu/host1x/dev.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
> > index c55e2d634887..5a3f797240d4 100644
> > --- a/drivers/gpu/host1x/dev.c
> > +++ b/drivers/gpu/host1x/dev.c
> > @@ -247,8 +247,11 @@ static int host1x_probe(struct platform_device *pdev)
> >  
> >  	host->clk = devm_clk_get(&pdev->dev, NULL);
> >  	if (IS_ERR(host->clk)) {
> > -		dev_err(&pdev->dev, "failed to get clock\n");
> >  		err = PTR_ERR(host->clk);
> > +
> > +		if (err != -EPROBE_DEFER)
> > +			dev_err(&pdev->dev, "failed to get clock: %d\n", err);
> > +
> >  		return err;
> >  	}
> 
> The clock driver should be available at the time of host1x's probing on
> all Tegra's because it is one of essential core drivers that become
> available early during boot.

That's the currently baked-in assumption. However, there can be any
number of reasons for why the clocks may not show up as early as
expected, as evidenced in the case of Tegra186.

> I guess you're making this change for T186, is it because the BPMP
> driver's probe getting deferred? If yes, won't it be possible to fix the
> defer of the clock driver instead of making such changes in the affected
> drivers?

The reason why this is now happening on Tegra186 is because the BPMP is
bound to an IOMMU to avoid getting faults from the new no-bypass policy
that the ARM SMMU driver is implementing as of v5.2-rc1.

As a result of binding to an IOMMU, the first probe of the BPMP driver
will get deferred, so any driver trying to request a clock after that
and before BPMP gets probed successfully the next time, any clk_get()
calls will fail with -EPROBE_DEFER.

This is a bit unfortunate, but like I said, these kinds of things can
happen, and probe deferral was designed specifically to deal with that
kind of situation so that we wouldn't have to rely on all of these
built-in assumptions that occasionally break.

The driver also already handles deferred probe properly. The only thing
that this patch really changes is to no longer consider -EPROBE_DEFER an
error. It's in fact a pretty common situation in many drivers and should
not warrant a kernel log message.

Thierry

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] gpu: host1x: Do not output error message for deferred probe
  2019-06-05  8:28   ` Thierry Reding
@ 2019-06-05 11:25     ` Dmitry Osipenko
  2019-06-05 12:32       ` Thierry Reding
  0 siblings, 1 reply; 10+ messages in thread
From: Dmitry Osipenko @ 2019-06-05 11:25 UTC (permalink / raw)
  To: Thierry Reding; +Cc: linux-tegra, dri-devel

05.06.2019 11:28, Thierry Reding пишет:
> On Tue, Jun 04, 2019 at 07:07:42PM +0300, Dmitry Osipenko wrote:
>> 04.06.2019 18:31, Thierry Reding пишет:
>>> From: Thierry Reding <treding@nvidia.com>
>>>
>>> When deferring probe, avoid logging a confusing error message. While at
>>> it, make the error message more informational.
>>>
>>> Signed-off-by: Thierry Reding <treding@nvidia.com>
>>> ---
>>>  drivers/gpu/host1x/dev.c | 5 ++++-
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
>>> index c55e2d634887..5a3f797240d4 100644
>>> --- a/drivers/gpu/host1x/dev.c
>>> +++ b/drivers/gpu/host1x/dev.c
>>> @@ -247,8 +247,11 @@ static int host1x_probe(struct platform_device *pdev)
>>>  
>>>  	host->clk = devm_clk_get(&pdev->dev, NULL);
>>>  	if (IS_ERR(host->clk)) {
>>> -		dev_err(&pdev->dev, "failed to get clock\n");
>>>  		err = PTR_ERR(host->clk);
>>> +
>>> +		if (err != -EPROBE_DEFER)
>>> +			dev_err(&pdev->dev, "failed to get clock: %d\n", err);
>>> +
>>>  		return err;
>>>  	}
>>
>> The clock driver should be available at the time of host1x's probing on
>> all Tegra's because it is one of essential core drivers that become
>> available early during boot.
> 
> That's the currently baked-in assumption. However, there can be any
> number of reasons for why the clocks may not show up as early as
> expected, as evidenced in the case of Tegra186.
> 
>> I guess you're making this change for T186, is it because the BPMP
>> driver's probe getting deferred? If yes, won't it be possible to fix the
>> defer of the clock driver instead of making such changes in the affected
>> drivers?
> 
> The reason why this is now happening on Tegra186 is because the BPMP is
> bound to an IOMMU to avoid getting faults from the new no-bypass policy
> that the ARM SMMU driver is implementing as of v5.2-rc1.
> 
> As a result of binding to an IOMMU, the first probe of the BPMP driver
> will get deferred, so any driver trying to request a clock after that
> and before BPMP gets probed successfully the next time, any clk_get()
> calls will fail with -EPROBE_DEFER.
> 
> This is a bit unfortunate, but like I said, these kinds of things can
> happen, and probe deferral was designed specifically to deal with that
> kind of situation so that we wouldn't have to rely on all of these
> built-in assumptions that occasionally break.
> 
> The driver also already handles deferred probe properly. The only thing
> that this patch really changes is to no longer consider -EPROBE_DEFER an
> error. It's in fact a pretty common situation in many drivers and should
> not warrant a kernel log message.

You're trying to mask symptoms instead of curing the decease and it looks
like the decease could be cured.

Won't something like this work for you?

From fbeabba5f1151e96edc38620db67593585558ca0 Mon Sep 17 00:00:00 2001
From: Dmitry Osipenko <digetx@gmail.com>
Date: Wed, 5 Jun 2019 14:02:00 +0300
Subject: [PATCH 1/2] iommu/arm-smmu: Move driver registration to subsys level

On some platforms there is a dependency on the IOMMU availability that
comes up early during of the boot process. One example is NVIDIA Tegra186
which uses firmware, called BPMP, which manages lots of core functions
like system clocks for example. That firmware driver require IOMMU
functionality and hence the driver's probing is getting deferred because
the ARM's SMMU driver is probed later, thus all the drivers that depend
on the BPMP availability are also getting deferred on Tegra186. Let's move
SMMU driver's registration to an earlier boot stage, allowing drivers like
BPMP to probe successfully without the defer.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/iommu/arm-smmu.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 5e54cc0a28b3..08919f2fdf04 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -2410,4 +2410,9 @@ static struct platform_driver arm_smmu_driver = {
 	.probe	= arm_smmu_device_probe,
 	.shutdown = arm_smmu_device_shutdown,
 };
-builtin_platform_driver(arm_smmu_driver);
+
+static int __init arm_smmu_init(void)
+{
+	return platform_driver_register(&arm_smmu_driver);
+}
+subsys_initcall(arm_smmu_init);
-- 
2.21.0

From 12ec90e22405b6d5574cbfcbd33b92736ad6bfe4 Mon Sep 17 00:00:00 2001
From: Dmitry Osipenko <digetx@gmail.com>
Date: Wed, 5 Jun 2019 14:13:15 +0300
Subject: [PATCH 2/2] firmware/tegra: Move driver registration to subsys-sync
 level

The BPMP driver depends on the ARM SMMU driver which is now getting
probed from the subsys level, hence let's move the BPMP's probing to
the subsys-sync level in order to avoid probe deferring of the BPMP's
driver.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 drivers/firmware/tegra/bpmp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/firmware/tegra/bpmp.c b/drivers/firmware/tegra/bpmp.c
index dd775e8ba5a0..3f649e12f9f6 100644
--- a/drivers/firmware/tegra/bpmp.c
+++ b/drivers/firmware/tegra/bpmp.c
@@ -883,4 +883,4 @@ static int __init tegra_bpmp_init(void)
 {
 	return platform_driver_register(&tegra_bpmp_driver);
 }
-core_initcall(tegra_bpmp_init);
+subsys_initcall_sync(tegra_bpmp_init);
-- 
2.21.0


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] gpu: host1x: Do not output error message for deferred probe
  2019-06-05 11:25     ` Dmitry Osipenko
@ 2019-06-05 12:32       ` Thierry Reding
  2019-06-05 12:40         ` Dmitry Osipenko
  0 siblings, 1 reply; 10+ messages in thread
From: Thierry Reding @ 2019-06-05 12:32 UTC (permalink / raw)
  To: Dmitry Osipenko; +Cc: linux-tegra, dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 6424 bytes --]

On Wed, Jun 05, 2019 at 02:25:43PM +0300, Dmitry Osipenko wrote:
> 05.06.2019 11:28, Thierry Reding пишет:
> > On Tue, Jun 04, 2019 at 07:07:42PM +0300, Dmitry Osipenko wrote:
> >> 04.06.2019 18:31, Thierry Reding пишет:
> >>> From: Thierry Reding <treding@nvidia.com>
> >>>
> >>> When deferring probe, avoid logging a confusing error message. While at
> >>> it, make the error message more informational.
> >>>
> >>> Signed-off-by: Thierry Reding <treding@nvidia.com>
> >>> ---
> >>>  drivers/gpu/host1x/dev.c | 5 ++++-
> >>>  1 file changed, 4 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
> >>> index c55e2d634887..5a3f797240d4 100644
> >>> --- a/drivers/gpu/host1x/dev.c
> >>> +++ b/drivers/gpu/host1x/dev.c
> >>> @@ -247,8 +247,11 @@ static int host1x_probe(struct platform_device *pdev)
> >>>  
> >>>  	host->clk = devm_clk_get(&pdev->dev, NULL);
> >>>  	if (IS_ERR(host->clk)) {
> >>> -		dev_err(&pdev->dev, "failed to get clock\n");
> >>>  		err = PTR_ERR(host->clk);
> >>> +
> >>> +		if (err != -EPROBE_DEFER)
> >>> +			dev_err(&pdev->dev, "failed to get clock: %d\n", err);
> >>> +
> >>>  		return err;
> >>>  	}
> >>
> >> The clock driver should be available at the time of host1x's probing on
> >> all Tegra's because it is one of essential core drivers that become
> >> available early during boot.
> > 
> > That's the currently baked-in assumption. However, there can be any
> > number of reasons for why the clocks may not show up as early as
> > expected, as evidenced in the case of Tegra186.
> > 
> >> I guess you're making this change for T186, is it because the BPMP
> >> driver's probe getting deferred? If yes, won't it be possible to fix the
> >> defer of the clock driver instead of making such changes in the affected
> >> drivers?
> > 
> > The reason why this is now happening on Tegra186 is because the BPMP is
> > bound to an IOMMU to avoid getting faults from the new no-bypass policy
> > that the ARM SMMU driver is implementing as of v5.2-rc1.
> > 
> > As a result of binding to an IOMMU, the first probe of the BPMP driver
> > will get deferred, so any driver trying to request a clock after that
> > and before BPMP gets probed successfully the next time, any clk_get()
> > calls will fail with -EPROBE_DEFER.
> > 
> > This is a bit unfortunate, but like I said, these kinds of things can
> > happen, and probe deferral was designed specifically to deal with that
> > kind of situation so that we wouldn't have to rely on all of these
> > built-in assumptions that occasionally break.
> > 
> > The driver also already handles deferred probe properly. The only thing
> > that this patch really changes is to no longer consider -EPROBE_DEFER an
> > error. It's in fact a pretty common situation in many drivers and should
> > not warrant a kernel log message.
> 
> You're trying to mask symptoms instead of curing the decease and it looks
> like the decease could be cured.

There's nothing here to cure. -EPROBE_DEFER was designed specifically to
avoid having to play these kinds of games with initcall levels.

What this patch tries to do is just to avoid printing an error message
for something that is not an error. -EPROBE_DEFER is totally expected to
happen, it's normal, it's not something that we should bother users with
because things end up sorting themselves out in the end.

> Won't something like this work for you?

I'm sure we could find a number of ways to fix this. But there's no need
to fix this because it's not broken. What is broken is that we output an
error message when this happens and make an elephant out of a fly.

Thierry

> From fbeabba5f1151e96edc38620db67593585558ca0 Mon Sep 17 00:00:00 2001
> From: Dmitry Osipenko <digetx@gmail.com>
> Date: Wed, 5 Jun 2019 14:02:00 +0300
> Subject: [PATCH 1/2] iommu/arm-smmu: Move driver registration to subsys level
> 
> On some platforms there is a dependency on the IOMMU availability that
> comes up early during of the boot process. One example is NVIDIA Tegra186
> which uses firmware, called BPMP, which manages lots of core functions
> like system clocks for example. That firmware driver require IOMMU
> functionality and hence the driver's probing is getting deferred because
> the ARM's SMMU driver is probed later, thus all the drivers that depend
> on the BPMP availability are also getting deferred on Tegra186. Let's move
> SMMU driver's registration to an earlier boot stage, allowing drivers like
> BPMP to probe successfully without the defer.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/iommu/arm-smmu.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 5e54cc0a28b3..08919f2fdf04 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -2410,4 +2410,9 @@ static struct platform_driver arm_smmu_driver = {
>  	.probe	= arm_smmu_device_probe,
>  	.shutdown = arm_smmu_device_shutdown,
>  };
> -builtin_platform_driver(arm_smmu_driver);
> +
> +static int __init arm_smmu_init(void)
> +{
> +	return platform_driver_register(&arm_smmu_driver);
> +}
> +subsys_initcall(arm_smmu_init);
> -- 
> 2.21.0
> 
> From 12ec90e22405b6d5574cbfcbd33b92736ad6bfe4 Mon Sep 17 00:00:00 2001
> From: Dmitry Osipenko <digetx@gmail.com>
> Date: Wed, 5 Jun 2019 14:13:15 +0300
> Subject: [PATCH 2/2] firmware/tegra: Move driver registration to subsys-sync
>  level
> 
> The BPMP driver depends on the ARM SMMU driver which is now getting
> probed from the subsys level, hence let's move the BPMP's probing to
> the subsys-sync level in order to avoid probe deferring of the BPMP's
> driver.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/firmware/tegra/bpmp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/firmware/tegra/bpmp.c b/drivers/firmware/tegra/bpmp.c
> index dd775e8ba5a0..3f649e12f9f6 100644
> --- a/drivers/firmware/tegra/bpmp.c
> +++ b/drivers/firmware/tegra/bpmp.c
> @@ -883,4 +883,4 @@ static int __init tegra_bpmp_init(void)
>  {
>  	return platform_driver_register(&tegra_bpmp_driver);
>  }
> -core_initcall(tegra_bpmp_init);
> +subsys_initcall_sync(tegra_bpmp_init);
> -- 
> 2.21.0
> 
> 

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] gpu: host1x: Do not output error message for deferred probe
  2019-06-04 15:31 [PATCH] gpu: host1x: Do not output error message for deferred probe Thierry Reding
  2019-06-04 16:07 ` Dmitry Osipenko
  2019-06-04 18:35 ` Dmitry Osipenko
@ 2019-06-05 12:40 ` Daniel Vetter
  2 siblings, 0 replies; 10+ messages in thread
From: Daniel Vetter @ 2019-06-05 12:40 UTC (permalink / raw)
  To: Thierry Reding; +Cc: linux-tegra, dri-devel

On Tue, Jun 04, 2019 at 05:31:50PM +0200, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> When deferring probe, avoid logging a confusing error message. While at
> it, make the error message more informational.
> 
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
>  drivers/gpu/host1x/dev.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
> index c55e2d634887..5a3f797240d4 100644
> --- a/drivers/gpu/host1x/dev.c
> +++ b/drivers/gpu/host1x/dev.c
> @@ -247,8 +247,11 @@ static int host1x_probe(struct platform_device *pdev)
>  
>  	host->clk = devm_clk_get(&pdev->dev, NULL);
>  	if (IS_ERR(host->clk)) {
> -		dev_err(&pdev->dev, "failed to get clock\n");
>  		err = PTR_ERR(host->clk);
> +
> +		if (err != -EPROBE_DEFER)
> +			dev_err(&pdev->dev, "failed to get clock: %d\n", err);

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> +
>  		return err;
>  	}
>  
> -- 
> 2.21.0
> 
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] gpu: host1x: Do not output error message for deferred probe
  2019-06-05 12:32       ` Thierry Reding
@ 2019-06-05 12:40         ` Dmitry Osipenko
  2019-06-05 13:04           ` Thierry Reding
  0 siblings, 1 reply; 10+ messages in thread
From: Dmitry Osipenko @ 2019-06-05 12:40 UTC (permalink / raw)
  To: Thierry Reding; +Cc: linux-tegra, dri-devel

05.06.2019 15:32, Thierry Reding пишет:
> On Wed, Jun 05, 2019 at 02:25:43PM +0300, Dmitry Osipenko wrote:
>> 05.06.2019 11:28, Thierry Reding пишет:
>>> On Tue, Jun 04, 2019 at 07:07:42PM +0300, Dmitry Osipenko wrote:
>>>> 04.06.2019 18:31, Thierry Reding пишет:
>>>>> From: Thierry Reding <treding@nvidia.com>
>>>>>
>>>>> When deferring probe, avoid logging a confusing error message. While at
>>>>> it, make the error message more informational.
>>>>>
>>>>> Signed-off-by: Thierry Reding <treding@nvidia.com>
>>>>> ---
>>>>>  drivers/gpu/host1x/dev.c | 5 ++++-
>>>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
>>>>> index c55e2d634887..5a3f797240d4 100644
>>>>> --- a/drivers/gpu/host1x/dev.c
>>>>> +++ b/drivers/gpu/host1x/dev.c
>>>>> @@ -247,8 +247,11 @@ static int host1x_probe(struct platform_device *pdev)
>>>>>  
>>>>>  	host->clk = devm_clk_get(&pdev->dev, NULL);
>>>>>  	if (IS_ERR(host->clk)) {
>>>>> -		dev_err(&pdev->dev, "failed to get clock\n");
>>>>>  		err = PTR_ERR(host->clk);
>>>>> +
>>>>> +		if (err != -EPROBE_DEFER)
>>>>> +			dev_err(&pdev->dev, "failed to get clock: %d\n", err);
>>>>> +
>>>>>  		return err;
>>>>>  	}
>>>>
>>>> The clock driver should be available at the time of host1x's probing on
>>>> all Tegra's because it is one of essential core drivers that become
>>>> available early during boot.
>>>
>>> That's the currently baked-in assumption. However, there can be any
>>> number of reasons for why the clocks may not show up as early as
>>> expected, as evidenced in the case of Tegra186.
>>>
>>>> I guess you're making this change for T186, is it because the BPMP
>>>> driver's probe getting deferred? If yes, won't it be possible to fix the
>>>> defer of the clock driver instead of making such changes in the affected
>>>> drivers?
>>>
>>> The reason why this is now happening on Tegra186 is because the BPMP is
>>> bound to an IOMMU to avoid getting faults from the new no-bypass policy
>>> that the ARM SMMU driver is implementing as of v5.2-rc1.
>>>
>>> As a result of binding to an IOMMU, the first probe of the BPMP driver
>>> will get deferred, so any driver trying to request a clock after that
>>> and before BPMP gets probed successfully the next time, any clk_get()
>>> calls will fail with -EPROBE_DEFER.
>>>
>>> This is a bit unfortunate, but like I said, these kinds of things can
>>> happen, and probe deferral was designed specifically to deal with that
>>> kind of situation so that we wouldn't have to rely on all of these
>>> built-in assumptions that occasionally break.
>>>
>>> The driver also already handles deferred probe properly. The only thing
>>> that this patch really changes is to no longer consider -EPROBE_DEFER an
>>> error. It's in fact a pretty common situation in many drivers and should
>>> not warrant a kernel log message.
>>
>> You're trying to mask symptoms instead of curing the decease and it looks
>> like the decease could be cured.
> 
> There's nothing here to cure. -EPROBE_DEFER was designed specifically to
> avoid having to play these kinds of games with initcall levels.
> 
> What this patch tries to do is just to avoid printing an error message
> for something that is not an error. -EPROBE_DEFER is totally expected to
> happen, it's normal, it's not something that we should bother users with
> because things end up sorting themselves out in the end.
> 
>> Won't something like this work for you?
> 
> I'm sure we could find a number of ways to fix this. But there's no need
> to fix this because it's not broken. What is broken is that we output an
> error message when this happens and make an elephant out of a fly.

Sure, this is absolutely not critical and deferred probe is doing its job.
But don't you agree that it's better to fix the root of the annoyance once
and for all?

Would be awesome if you could give a whirl to the patches that I'm suggesting.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] gpu: host1x: Do not output error message for deferred probe
  2019-06-05 12:40         ` Dmitry Osipenko
@ 2019-06-05 13:04           ` Thierry Reding
  2019-06-05 13:16             ` Dmitry Osipenko
  0 siblings, 1 reply; 10+ messages in thread
From: Thierry Reding @ 2019-06-05 13:04 UTC (permalink / raw)
  To: Dmitry Osipenko; +Cc: linux-tegra, dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 4557 bytes --]

On Wed, Jun 05, 2019 at 03:40:28PM +0300, Dmitry Osipenko wrote:
> 05.06.2019 15:32, Thierry Reding пишет:
> > On Wed, Jun 05, 2019 at 02:25:43PM +0300, Dmitry Osipenko wrote:
> >> 05.06.2019 11:28, Thierry Reding пишет:
> >>> On Tue, Jun 04, 2019 at 07:07:42PM +0300, Dmitry Osipenko wrote:
> >>>> 04.06.2019 18:31, Thierry Reding пишет:
> >>>>> From: Thierry Reding <treding@nvidia.com>
> >>>>>
> >>>>> When deferring probe, avoid logging a confusing error message. While at
> >>>>> it, make the error message more informational.
> >>>>>
> >>>>> Signed-off-by: Thierry Reding <treding@nvidia.com>
> >>>>> ---
> >>>>>  drivers/gpu/host1x/dev.c | 5 ++++-
> >>>>>  1 file changed, 4 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
> >>>>> index c55e2d634887..5a3f797240d4 100644
> >>>>> --- a/drivers/gpu/host1x/dev.c
> >>>>> +++ b/drivers/gpu/host1x/dev.c
> >>>>> @@ -247,8 +247,11 @@ static int host1x_probe(struct platform_device *pdev)
> >>>>>  
> >>>>>  	host->clk = devm_clk_get(&pdev->dev, NULL);
> >>>>>  	if (IS_ERR(host->clk)) {
> >>>>> -		dev_err(&pdev->dev, "failed to get clock\n");
> >>>>>  		err = PTR_ERR(host->clk);
> >>>>> +
> >>>>> +		if (err != -EPROBE_DEFER)
> >>>>> +			dev_err(&pdev->dev, "failed to get clock: %d\n", err);
> >>>>> +
> >>>>>  		return err;
> >>>>>  	}
> >>>>
> >>>> The clock driver should be available at the time of host1x's probing on
> >>>> all Tegra's because it is one of essential core drivers that become
> >>>> available early during boot.
> >>>
> >>> That's the currently baked-in assumption. However, there can be any
> >>> number of reasons for why the clocks may not show up as early as
> >>> expected, as evidenced in the case of Tegra186.
> >>>
> >>>> I guess you're making this change for T186, is it because the BPMP
> >>>> driver's probe getting deferred? If yes, won't it be possible to fix the
> >>>> defer of the clock driver instead of making such changes in the affected
> >>>> drivers?
> >>>
> >>> The reason why this is now happening on Tegra186 is because the BPMP is
> >>> bound to an IOMMU to avoid getting faults from the new no-bypass policy
> >>> that the ARM SMMU driver is implementing as of v5.2-rc1.
> >>>
> >>> As a result of binding to an IOMMU, the first probe of the BPMP driver
> >>> will get deferred, so any driver trying to request a clock after that
> >>> and before BPMP gets probed successfully the next time, any clk_get()
> >>> calls will fail with -EPROBE_DEFER.
> >>>
> >>> This is a bit unfortunate, but like I said, these kinds of things can
> >>> happen, and probe deferral was designed specifically to deal with that
> >>> kind of situation so that we wouldn't have to rely on all of these
> >>> built-in assumptions that occasionally break.
> >>>
> >>> The driver also already handles deferred probe properly. The only thing
> >>> that this patch really changes is to no longer consider -EPROBE_DEFER an
> >>> error. It's in fact a pretty common situation in many drivers and should
> >>> not warrant a kernel log message.
> >>
> >> You're trying to mask symptoms instead of curing the decease and it looks
> >> like the decease could be cured.
> > 
> > There's nothing here to cure. -EPROBE_DEFER was designed specifically to
> > avoid having to play these kinds of games with initcall levels.
> > 
> > What this patch tries to do is just to avoid printing an error message
> > for something that is not an error. -EPROBE_DEFER is totally expected to
> > happen, it's normal, it's not something that we should bother users with
> > because things end up sorting themselves out in the end.
> > 
> >> Won't something like this work for you?
> > 
> > I'm sure we could find a number of ways to fix this. But there's no need
> > to fix this because it's not broken. What is broken is that we output an
> > error message when this happens and make an elephant out of a fly.
> 
> Sure, this is absolutely not critical and deferred probe is doing its job.
> But don't you agree that it's better to fix the root of the annoyance once
> and for all?

From my point of view deferred probe is the once and for all fix. Back
before we had deferred probe, doing these kinds of initcall reordering
tricks was fairly common and while such a change may fix one setup, it
often ended up breaking others.

Sorry, this is a lesson that we already learned a couple of years ago,
no need to rehash it.

Thierry

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] gpu: host1x: Do not output error message for deferred probe
  2019-06-05 13:04           ` Thierry Reding
@ 2019-06-05 13:16             ` Dmitry Osipenko
  0 siblings, 0 replies; 10+ messages in thread
From: Dmitry Osipenko @ 2019-06-05 13:16 UTC (permalink / raw)
  To: Thierry Reding; +Cc: linux-tegra, dri-devel

05.06.2019 16:04, Thierry Reding пишет:
> On Wed, Jun 05, 2019 at 03:40:28PM +0300, Dmitry Osipenko wrote:
>> 05.06.2019 15:32, Thierry Reding пишет:
>>> On Wed, Jun 05, 2019 at 02:25:43PM +0300, Dmitry Osipenko wrote:
>>>> 05.06.2019 11:28, Thierry Reding пишет:
>>>>> On Tue, Jun 04, 2019 at 07:07:42PM +0300, Dmitry Osipenko wrote:
>>>>>> 04.06.2019 18:31, Thierry Reding пишет:
>>>>>>> From: Thierry Reding <treding@nvidia.com>
>>>>>>>
>>>>>>> When deferring probe, avoid logging a confusing error message. While at
>>>>>>> it, make the error message more informational.
>>>>>>>
>>>>>>> Signed-off-by: Thierry Reding <treding@nvidia.com>
>>>>>>> ---
>>>>>>>  drivers/gpu/host1x/dev.c | 5 ++++-
>>>>>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
>>>>>>> index c55e2d634887..5a3f797240d4 100644
>>>>>>> --- a/drivers/gpu/host1x/dev.c
>>>>>>> +++ b/drivers/gpu/host1x/dev.c
>>>>>>> @@ -247,8 +247,11 @@ static int host1x_probe(struct platform_device *pdev)
>>>>>>>  
>>>>>>>  	host->clk = devm_clk_get(&pdev->dev, NULL);
>>>>>>>  	if (IS_ERR(host->clk)) {
>>>>>>> -		dev_err(&pdev->dev, "failed to get clock\n");
>>>>>>>  		err = PTR_ERR(host->clk);
>>>>>>> +
>>>>>>> +		if (err != -EPROBE_DEFER)
>>>>>>> +			dev_err(&pdev->dev, "failed to get clock: %d\n", err);
>>>>>>> +
>>>>>>>  		return err;
>>>>>>>  	}
>>>>>>
>>>>>> The clock driver should be available at the time of host1x's probing on
>>>>>> all Tegra's because it is one of essential core drivers that become
>>>>>> available early during boot.
>>>>>
>>>>> That's the currently baked-in assumption. However, there can be any
>>>>> number of reasons for why the clocks may not show up as early as
>>>>> expected, as evidenced in the case of Tegra186.
>>>>>
>>>>>> I guess you're making this change for T186, is it because the BPMP
>>>>>> driver's probe getting deferred? If yes, won't it be possible to fix the
>>>>>> defer of the clock driver instead of making such changes in the affected
>>>>>> drivers?
>>>>>
>>>>> The reason why this is now happening on Tegra186 is because the BPMP is
>>>>> bound to an IOMMU to avoid getting faults from the new no-bypass policy
>>>>> that the ARM SMMU driver is implementing as of v5.2-rc1.
>>>>>
>>>>> As a result of binding to an IOMMU, the first probe of the BPMP driver
>>>>> will get deferred, so any driver trying to request a clock after that
>>>>> and before BPMP gets probed successfully the next time, any clk_get()
>>>>> calls will fail with -EPROBE_DEFER.
>>>>>
>>>>> This is a bit unfortunate, but like I said, these kinds of things can
>>>>> happen, and probe deferral was designed specifically to deal with that
>>>>> kind of situation so that we wouldn't have to rely on all of these
>>>>> built-in assumptions that occasionally break.
>>>>>
>>>>> The driver also already handles deferred probe properly. The only thing
>>>>> that this patch really changes is to no longer consider -EPROBE_DEFER an
>>>>> error. It's in fact a pretty common situation in many drivers and should
>>>>> not warrant a kernel log message.
>>>>
>>>> You're trying to mask symptoms instead of curing the decease and it looks
>>>> like the decease could be cured.
>>>
>>> There's nothing here to cure. -EPROBE_DEFER was designed specifically to
>>> avoid having to play these kinds of games with initcall levels.
>>>
>>> What this patch tries to do is just to avoid printing an error message
>>> for something that is not an error. -EPROBE_DEFER is totally expected to
>>> happen, it's normal, it's not something that we should bother users with
>>> because things end up sorting themselves out in the end.
>>>
>>>> Won't something like this work for you?
>>>
>>> I'm sure we could find a number of ways to fix this. But there's no need
>>> to fix this because it's not broken. What is broken is that we output an
>>> error message when this happens and make an elephant out of a fly.
>>
>> Sure, this is absolutely not critical and deferred probe is doing its job.
>> But don't you agree that it's better to fix the root of the annoyance once
>> and for all?
> 
> From my point of view deferred probe is the once and for all fix. Back
> before we had deferred probe, doing these kinds of initcall reordering
> tricks was fairly common and while such a change may fix one setup, it
> often ended up breaking others.
> 
> Sorry, this is a lesson that we already learned a couple of years ago,
> no need to rehash it.

We had a success story in similar case of pinctrl vs GPIO driver probe
order, reordering the probe order fixed a lot of deferred probings and
improved boot up time a tad. Oh, that reminded me that the Stefan's
patch was never merged [0] :(

I'm trying to help, but seems not much I could do since you're in oppose ;)

[0] https://patchwork.ozlabs.org/patch/949776/
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-06-05 13:16 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-04 15:31 [PATCH] gpu: host1x: Do not output error message for deferred probe Thierry Reding
2019-06-04 16:07 ` Dmitry Osipenko
2019-06-05  8:28   ` Thierry Reding
2019-06-05 11:25     ` Dmitry Osipenko
2019-06-05 12:32       ` Thierry Reding
2019-06-05 12:40         ` Dmitry Osipenko
2019-06-05 13:04           ` Thierry Reding
2019-06-05 13:16             ` Dmitry Osipenko
2019-06-04 18:35 ` Dmitry Osipenko
2019-06-05 12:40 ` Daniel Vetter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.