linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC][PATCH] driver core: Extend returning EPROBE_DEFER for two minutes after late_initcall
@ 2020-02-14  0:44 John Stultz
  2020-02-14  1:51 ` Rob Herring
  2020-02-14  2:19 ` Bjorn Andersson
  0 siblings, 2 replies; 7+ messages in thread
From: John Stultz @ 2020-02-14  0:44 UTC (permalink / raw)
  To: lkml
  Cc: John Stultz, Alexander Graf, Rob Herring, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Pavel Machek, Len Brown, Todd Kjos,
	Bjorn Andersson, Greg Kroah-Hartman, linux-pm

Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
at the end of initcall"), along with commit 25b4e70dcce9
("driver core: allow stopping deferred probe after init") after
late_initcall, drivers will stop getting EPROBE_DEFER, and
instead see an error causing the driver to fail to load.

That change causes trouble when trying to use many clk drivers
as modules, as the clk modules may not load until much later
after init has started. If a dependent driver loads and gets an
error instead of EPROBE_DEFER, it won't try to reload later when
the dependency is met, and will thus fail to load.

Instead of reverting that patch, this patch tries to extend the
time that EPROBE_DEFER is retruned by two minutes, to (hopefully)
ensure that everything has had a chance to load.

Specifically, on db845c, this change allows us to set
SDM_GPUCC_845, QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and
get a working system, where as without it the display will fail
to load.

Cc: Alexander Graf <agraf@suse.de>
Cc: Rob Herring <robh@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Kevin Hilman <khilman@kernel.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <len.brown@intel.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-pm@vger.kernel.org
Fixes: e01afc3250255 ("PM / Domains: Stop deferring probe at the end of initcall")
Fixes: 25b4e70dcce9 ("driver core: allow stopping deferred probe after init")
Signed-off-by: John Stultz <john.stultz@linaro.org>
---
 drivers/base/dd.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index b25bcab2a26b..35ebae8b65be 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -311,6 +311,12 @@ static void deferred_probe_timeout_work_func(struct work_struct *work)
 }
 static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_work_func);
 
+static void deferred_initcall_done_work_func(struct work_struct *work)
+{
+	initcalls_done = true;
+}
+static DECLARE_DELAYED_WORK(deferred_initcall_done_work, deferred_initcall_done_work_func);
+
 /**
  * deferred_probe_initcall() - Enable probing of deferred devices
  *
@@ -327,7 +333,7 @@ static int deferred_probe_initcall(void)
 	driver_deferred_probe_trigger();
 	/* Sort as many dependencies as possible before exiting initcalls */
 	flush_work(&deferred_probe_work);
-	initcalls_done = true;
+	schedule_delayed_work(&deferred_initcall_done_work, 120 * HZ);
 
 	/*
 	 * Trigger deferred probe again, this time we won't defer anything
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC][PATCH] driver core: Extend returning EPROBE_DEFER for two minutes after late_initcall
  2020-02-14  0:44 [RFC][PATCH] driver core: Extend returning EPROBE_DEFER for two minutes after late_initcall John Stultz
@ 2020-02-14  1:51 ` Rob Herring
  2020-02-14  2:41   ` John Stultz
  2020-02-14  2:19 ` Bjorn Andersson
  1 sibling, 1 reply; 7+ messages in thread
From: Rob Herring @ 2020-02-14  1:51 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Alexander Graf, Rafael J. Wysocki, Kevin Hilman,
	Ulf Hansson, Pavel Machek, Len Brown, Todd Kjos, Bjorn Andersson,
	Greg Kroah-Hartman, open list:THERMAL

On Thu, Feb 13, 2020 at 6:44 PM John Stultz <john.stultz@linaro.org> wrote:
>
> Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> at the end of initcall"), along with commit 25b4e70dcce9
> ("driver core: allow stopping deferred probe after init") after
> late_initcall, drivers will stop getting EPROBE_DEFER, and
> instead see an error causing the driver to fail to load.
>
> That change causes trouble when trying to use many clk drivers
> as modules, as the clk modules may not load until much later
> after init has started. If a dependent driver loads and gets an
> error instead of EPROBE_DEFER, it won't try to reload later when
> the dependency is met, and will thus fail to load.
>
> Instead of reverting that patch, this patch tries to extend the
> time that EPROBE_DEFER is retruned by two minutes, to (hopefully)
> ensure that everything has had a chance to load.

I think regulators already has some delay like this. We should use the
same timeouts.

We also have the 'deferred_probe_timeout' cmdline option. It's deemed
a debug option currently, but we could change that and change the
default.

> Specifically, on db845c, this change allows us to set
> SDM_GPUCC_845, QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and
> get a working system, where as without it the display will fail
> to load.
>
> Cc: Alexander Graf <agraf@suse.de>
> Cc: Rob Herring <robh@kernel.org>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Kevin Hilman <khilman@kernel.org>
> Cc: Ulf Hansson <ulf.hansson@linaro.org>
> Cc: Pavel Machek <pavel@ucw.cz>
> Cc: Len Brown <len.brown@intel.com>
> Cc: Todd Kjos <tkjos@google.com>
> Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: linux-pm@vger.kernel.org
> Fixes: e01afc3250255 ("PM / Domains: Stop deferring probe at the end of initcall")
> Fixes: 25b4e70dcce9 ("driver core: allow stopping deferred probe after init")

We can debate the design, but those work as designed. So Fixes?

> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
>  drivers/base/dd.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index b25bcab2a26b..35ebae8b65be 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -311,6 +311,12 @@ static void deferred_probe_timeout_work_func(struct work_struct *work)
>  }
>  static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_work_func);
>
> +static void deferred_initcall_done_work_func(struct work_struct *work)
> +{
> +       initcalls_done = true;
> +}
> +static DECLARE_DELAYED_WORK(deferred_initcall_done_work, deferred_initcall_done_work_func);
> +
>  /**
>   * deferred_probe_initcall() - Enable probing of deferred devices
>   *
> @@ -327,7 +333,7 @@ static int deferred_probe_initcall(void)
>         driver_deferred_probe_trigger();
>         /* Sort as many dependencies as possible before exiting initcalls */
>         flush_work(&deferred_probe_work);
> -       initcalls_done = true;
> +       schedule_delayed_work(&deferred_initcall_done_work, 120 * HZ);
>
>         /*
>          * Trigger deferred probe again, this time we won't defer anything
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC][PATCH] driver core: Extend returning EPROBE_DEFER for two minutes after late_initcall
  2020-02-14  0:44 [RFC][PATCH] driver core: Extend returning EPROBE_DEFER for two minutes after late_initcall John Stultz
  2020-02-14  1:51 ` Rob Herring
@ 2020-02-14  2:19 ` Bjorn Andersson
  2020-02-14  4:05   ` John Stultz
  1 sibling, 1 reply; 7+ messages in thread
From: Bjorn Andersson @ 2020-02-14  2:19 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Alexander Graf, Rob Herring, Rafael J. Wysocki,
	Kevin Hilman, Ulf Hansson, Pavel Machek, Len Brown, Todd Kjos,
	Greg Kroah-Hartman, linux-pm

On Thu 13 Feb 16:44 PST 2020, John Stultz wrote:

> Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> at the end of initcall"), along with commit 25b4e70dcce9
> ("driver core: allow stopping deferred probe after init") after
> late_initcall, drivers will stop getting EPROBE_DEFER, and
> instead see an error causing the driver to fail to load.
> 
> That change causes trouble when trying to use many clk drivers
> as modules, as the clk modules may not load until much later
> after init has started. If a dependent driver loads and gets an
> error instead of EPROBE_DEFER, it won't try to reload later when
> the dependency is met, and will thus fail to load.
> 
> Instead of reverting that patch, this patch tries to extend the
> time that EPROBE_DEFER is retruned by two minutes, to (hopefully)
> ensure that everything has had a chance to load.
> 
> Specifically, on db845c, this change allows us to set
> SDM_GPUCC_845, QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and
> get a working system, where as without it the display will fail
> to load.

The purpose of 25b4e70dcce9 ("driver core: allow stopping deferred probe
after init") is to ensure that when the kernel boots with a DeviceTree
blob that references a resource (power-domain in this case) that either
hasn't been compiled in, or simply doesn't exist yet, it should continue
to boot - under the assumption that these resources probably aren't
needed to provide a functional system.

I don't think your patch maintains this behavior, because when userspace
kicks in and load kernel modules during the first two minutes they will
all end up in the probe deferral list. Past two minutes any event that
registers a new driver (i.e. manual intervention) will kick of a new
wave of probing, which will now continue as expected, ignoring any
power-domains that is yet to be probed (either because they don't exist
or they are further down the probe deferral list).

You can improve the situation somewhat by calling
driver_deferred_probe_trigger() in your
deferred_initcall_done_work_func(), to remove the need for human
intervention. But the outcome will still depend on the order in
deferred_probe_active_list.



That said, your patch does resolve an important problem for me!

We have a number of drivers providing power-domains that are registered
subject to timing in interaction with co-processors. So with a
sufficiently small kernel (e.g. heavy use of kernel modules) it's likely
that these are registered past late_initcall.

Your extension of this to two minutes past late_initcall will for sure
be sufficient to avoid this issue.

Regards,
Bjorn

> 
> Cc: Alexander Graf <agraf@suse.de>
> Cc: Rob Herring <robh@kernel.org>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Kevin Hilman <khilman@kernel.org>
> Cc: Ulf Hansson <ulf.hansson@linaro.org>
> Cc: Pavel Machek <pavel@ucw.cz>
> Cc: Len Brown <len.brown@intel.com>
> Cc: Todd Kjos <tkjos@google.com>
> Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: linux-pm@vger.kernel.org
> Fixes: e01afc3250255 ("PM / Domains: Stop deferring probe at the end of initcall")
> Fixes: 25b4e70dcce9 ("driver core: allow stopping deferred probe after init")
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
>  drivers/base/dd.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index b25bcab2a26b..35ebae8b65be 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -311,6 +311,12 @@ static void deferred_probe_timeout_work_func(struct work_struct *work)
>  }
>  static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_work_func);
>  
> +static void deferred_initcall_done_work_func(struct work_struct *work)
> +{
> +	initcalls_done = true;
> +}
> +static DECLARE_DELAYED_WORK(deferred_initcall_done_work, deferred_initcall_done_work_func);
> +
>  /**
>   * deferred_probe_initcall() - Enable probing of deferred devices
>   *
> @@ -327,7 +333,7 @@ static int deferred_probe_initcall(void)
>  	driver_deferred_probe_trigger();
>  	/* Sort as many dependencies as possible before exiting initcalls */
>  	flush_work(&deferred_probe_work);
> -	initcalls_done = true;
> +	schedule_delayed_work(&deferred_initcall_done_work, 120 * HZ);
>  
>  	/*
>  	 * Trigger deferred probe again, this time we won't defer anything
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC][PATCH] driver core: Extend returning EPROBE_DEFER for two minutes after late_initcall
  2020-02-14  1:51 ` Rob Herring
@ 2020-02-14  2:41   ` John Stultz
  2020-02-15 21:26     ` Rob Herring
  0 siblings, 1 reply; 7+ messages in thread
From: John Stultz @ 2020-02-14  2:41 UTC (permalink / raw)
  To: Rob Herring
  Cc: lkml, Alexander Graf, Rafael J. Wysocki, Kevin Hilman,
	Ulf Hansson, Pavel Machek, Len Brown, Todd Kjos, Bjorn Andersson,
	Greg Kroah-Hartman, open list:THERMAL

On Thu, Feb 13, 2020 at 5:51 PM Rob Herring <robh@kernel.org> wrote:
> On Thu, Feb 13, 2020 at 6:44 PM John Stultz <john.stultz@linaro.org> wrote:
> > Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> > at the end of initcall"), along with commit 25b4e70dcce9
> > ("driver core: allow stopping deferred probe after init") after
> > late_initcall, drivers will stop getting EPROBE_DEFER, and
> > instead see an error causing the driver to fail to load.
> >
> > That change causes trouble when trying to use many clk drivers
> > as modules, as the clk modules may not load until much later
> > after init has started. If a dependent driver loads and gets an
> > error instead of EPROBE_DEFER, it won't try to reload later when
> > the dependency is met, and will thus fail to load.
> >
> > Instead of reverting that patch, this patch tries to extend the
> > time that EPROBE_DEFER is retruned by two minutes, to (hopefully)
> > ensure that everything has had a chance to load.
>
> I think regulators already has some delay like this. We should use the
> same timeouts.

Sounds good. My memory was a bit foggy from the time I initially
brought this up, and I looked briefly before sending this out and
didn't find the regulator change you had mentioned. If you have a
specific pointer (or patch name or something) let me know, otherwise
I'll dig around later tonight/tomorrow.

> We also have the 'deferred_probe_timeout' cmdline option. It's deemed
> a debug option currently, but we could change that and change the
> default.

I looked at that code, but couldn't really make heads or tails of it.
The initcalls_done is checked and returns before the
deferred_probe_timeout is looked at, so I was guessing the
deferred_probe_timeout was addressing a bit more subtle issue than
what I was going for. If its really the same functionality, I'm happy
to try to rework it.

> > Specifically, on db845c, this change allows us to set
> > SDM_GPUCC_845, QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and
> > get a working system, where as without it the display will fail
> > to load.
> >
> > Cc: Alexander Graf <agraf@suse.de>
> > Cc: Rob Herring <robh@kernel.org>
> > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > Cc: Kevin Hilman <khilman@kernel.org>
> > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > Cc: Pavel Machek <pavel@ucw.cz>
> > Cc: Len Brown <len.brown@intel.com>
> > Cc: Todd Kjos <tkjos@google.com>
> > Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: linux-pm@vger.kernel.org
> > Fixes: e01afc3250255 ("PM / Domains: Stop deferring probe at the end of initcall")
> > Fixes: 25b4e70dcce9 ("driver core: allow stopping deferred probe after init")
>
> We can debate the design, but those work as designed. So Fixes?
>

Well, clk module loading would have worked, and then stopped working
with e01afc3250255, so it is a regression of sorts.  And really the
tags are mostly for making sure patches get applied to trees that
backported these commits (and it's not my intention to shame a patch
as broken. :)

thanks
-john

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC][PATCH] driver core: Extend returning EPROBE_DEFER for two minutes after late_initcall
  2020-02-14  2:19 ` Bjorn Andersson
@ 2020-02-14  4:05   ` John Stultz
  2020-02-14  5:15     ` Bjorn Andersson
  0 siblings, 1 reply; 7+ messages in thread
From: John Stultz @ 2020-02-14  4:05 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: lkml, Rob Herring, Rafael J. Wysocki, Kevin Hilman, Ulf Hansson,
	Pavel Machek, Len Brown, Todd Kjos, Greg Kroah-Hartman,
	Linux PM list

On Thu, Feb 13, 2020 at 6:19 PM Bjorn Andersson
<bjorn.andersson@linaro.org> wrote:
> The purpose of 25b4e70dcce9 ("driver core: allow stopping deferred probe
> after init") is to ensure that when the kernel boots with a DeviceTree
> blob that references a resource (power-domain in this case) that either
> hasn't been compiled in, or simply doesn't exist yet, it should continue
> to boot - under the assumption that these resources probably aren't
> needed to provide a functional system.
>
> I don't think your patch maintains this behavior, because when userspace
> kicks in and load kernel modules during the first two minutes they will
> all end up in the probe deferral list. Past two minutes any event that
> registers a new driver (i.e. manual intervention) will kick of a new
> wave of probing, which will now continue as expected, ignoring any
> power-domains that is yet to be probed (either because they don't exist
> or they are further down the probe deferral list).

Hmm. I'll have to look at that again. I worry the logic is overloaded
a bit, because the logic in __driver_deferred_probe_check_state() will
only return -EPROBE_DEFER before late_initcall otherwise it returns
-ETIMEDOUT or 0.  So if we call__genpd_dev_pm_attach() after
late_initcall and the pd isn't ready, the driver probe will fail
permanently and not function.

I'd think in the case you describe (correct me if I'm misunderstanding
you), modules that load in the first two minutes would hit
EPROBE_DEFER only if a dependency is missing, and will continue to try
to probe next round. But once the two minutes are up, they will catch
ETIMEDOUT and fail permanently.

> You can improve the situation somewhat by calling
> driver_deferred_probe_trigger() in your
> deferred_initcall_done_work_func(), to remove the need for human
> intervention. But the outcome will still depend on the order in
> deferred_probe_active_list.

Ok. I'll take a look at that.

Thanks so much for the feedback!
-john

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC][PATCH] driver core: Extend returning EPROBE_DEFER for two minutes after late_initcall
  2020-02-14  4:05   ` John Stultz
@ 2020-02-14  5:15     ` Bjorn Andersson
  0 siblings, 0 replies; 7+ messages in thread
From: Bjorn Andersson @ 2020-02-14  5:15 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Rob Herring, Rafael J. Wysocki, Kevin Hilman, Ulf Hansson,
	Pavel Machek, Len Brown, Todd Kjos, Greg Kroah-Hartman,
	Linux PM list

On Thu 13 Feb 20:05 PST 2020, John Stultz wrote:

> On Thu, Feb 13, 2020 at 6:19 PM Bjorn Andersson
> <bjorn.andersson@linaro.org> wrote:
> > The purpose of 25b4e70dcce9 ("driver core: allow stopping deferred probe
> > after init") is to ensure that when the kernel boots with a DeviceTree
> > blob that references a resource (power-domain in this case) that either
> > hasn't been compiled in, or simply doesn't exist yet, it should continue
> > to boot - under the assumption that these resources probably aren't
> > needed to provide a functional system.
> >
> > I don't think your patch maintains this behavior, because when userspace
> > kicks in and load kernel modules during the first two minutes they will
> > all end up in the probe deferral list. Past two minutes any event that
> > registers a new driver (i.e. manual intervention) will kick of a new
> > wave of probing, which will now continue as expected, ignoring any
> > power-domains that is yet to be probed (either because they don't exist
> > or they are further down the probe deferral list).
> 
> Hmm. I'll have to look at that again. I worry the logic is overloaded
> a bit, because the logic in __driver_deferred_probe_check_state() will
> only return -EPROBE_DEFER before late_initcall otherwise it returns
> -ETIMEDOUT or 0.  So if we call__genpd_dev_pm_attach() after
> late_initcall and the pd isn't ready, the driver probe will fail
> permanently and not function.
> 

Correct. And the motivation for this is that if you use a dtb from the
future it might describe a power-domain provider that is not yet
implemented in the booting kernel and as such the purpose is to fail
fast - in a way that drivers can ignore, rather than probe deferring
indefinitely.

> I'd think in the case you describe (correct me if I'm misunderstanding
> you), modules that load in the first two minutes would hit
> EPROBE_DEFER only if a dependency is missing, and will continue to try
> to probe next round. But once the two minutes are up, they will catch
> ETIMEDOUT and fail permanently.
> 

This extends the time that probe deferral is functional from
late_initcall to 2 minutes from boot, which should solve all practical
problems you and I have with the current situation.

But the specific detail that your patch is missing is that drivers that
probe defer will end up on the deferral list and this list is only
processed whenever drivers are added or some driver succeeds to probe.
So before the 2 minutes the deferral dance will stop and you need one of
these events to kick off the dance again.

> > You can improve the situation somewhat by calling
> > driver_deferred_probe_trigger() in your
> > deferred_initcall_done_work_func(), to remove the need for human
> > intervention. But the outcome will still depend on the order in
> > deferred_probe_active_list.
> 
> Ok. I'll take a look at that.
> 

Cool

> Thanks so much for the feedback!

Thank you for working on this, I've spent days debugging subtle issues
because of this feature...

Regards,
Bjorn

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC][PATCH] driver core: Extend returning EPROBE_DEFER for two minutes after late_initcall
  2020-02-14  2:41   ` John Stultz
@ 2020-02-15 21:26     ` Rob Herring
  0 siblings, 0 replies; 7+ messages in thread
From: Rob Herring @ 2020-02-15 21:26 UTC (permalink / raw)
  To: John Stultz
  Cc: lkml, Alexander Graf, Rafael J. Wysocki, Kevin Hilman,
	Ulf Hansson, Pavel Machek, Len Brown, Todd Kjos, Bjorn Andersson,
	Greg Kroah-Hartman, open list:THERMAL

On Thu, Feb 13, 2020 at 8:42 PM John Stultz <john.stultz@linaro.org> wrote:
>
> On Thu, Feb 13, 2020 at 5:51 PM Rob Herring <robh@kernel.org> wrote:
> > On Thu, Feb 13, 2020 at 6:44 PM John Stultz <john.stultz@linaro.org> wrote:
> > > Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> > > at the end of initcall"), along with commit 25b4e70dcce9
> > > ("driver core: allow stopping deferred probe after init") after
> > > late_initcall, drivers will stop getting EPROBE_DEFER, and
> > > instead see an error causing the driver to fail to load.
> > >
> > > That change causes trouble when trying to use many clk drivers
> > > as modules, as the clk modules may not load until much later
> > > after init has started. If a dependent driver loads and gets an
> > > error instead of EPROBE_DEFER, it won't try to reload later when
> > > the dependency is met, and will thus fail to load.
> > >
> > > Instead of reverting that patch, this patch tries to extend the
> > > time that EPROBE_DEFER is retruned by two minutes, to (hopefully)
> > > ensure that everything has had a chance to load.
> >
> > I think regulators already has some delay like this. We should use the
> > same timeouts.
>
> Sounds good. My memory was a bit foggy from the time I initially
> brought this up, and I looked briefly before sending this out and
> didn't find the regulator change you had mentioned. If you have a
> specific pointer (or patch name or something) let me know, otherwise
> I'll dig around later tonight/tomorrow.
>
> > We also have the 'deferred_probe_timeout' cmdline option. It's deemed
> > a debug option currently, but we could change that and change the
> > default.
>
> I looked at that code, but couldn't really make heads or tails of it.
> The initcalls_done is checked and returns before the
> deferred_probe_timeout is looked at, so I was guessing the
> deferred_probe_timeout was addressing a bit more subtle issue than
> what I was going for. If its really the same functionality, I'm happy
> to try to rework it.

Subsystems have to opt-in in the current code to stop deferring at
init done. Though this was extended with
driver_deferred_probe_check_state_continue() which defers until the
timeout (commit 62a6bc3a1e4f4).

The only purpose of the timeout was to dump out what devices are still
deferred. I don't see any point in having 2 timeouts, so make
deferred_probe_timeout do what you want it to do. I think that's
mainly default to a sensible timeout value and have the subsystems you
care about honor the timeout. Perhaps we don't need both
driver_deferred_probe_check_state() and
driver_deferred_probe_check_state_continue() and can just have the
latter if all the subsystems need to honor the timeout.

I think it would be good if the console printed something periodically
about waiting for drivers. That way the system doesn't appear hung for
30 or 120 sec before giving up.

Also, there was one issue in particular that I ran into. Consoles have
to be built in and probed before init is done. So if your UART has any
dependencies, then those dependencies also have to be built-in. I
don't think anyone working on the 'everything is a module' problem has
fixed that.

> > > Specifically, on db845c, this change allows us to set
> > > SDM_GPUCC_845, QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and
> > > get a working system, where as without it the display will fail
> > > to load.
> > >
> > > Cc: Alexander Graf <agraf@suse.de>
> > > Cc: Rob Herring <robh@kernel.org>
> > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > Cc: Kevin Hilman <khilman@kernel.org>
> > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > Cc: Pavel Machek <pavel@ucw.cz>
> > > Cc: Len Brown <len.brown@intel.com>
> > > Cc: Todd Kjos <tkjos@google.com>
> > > Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
> > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > Cc: linux-pm@vger.kernel.org
> > > Fixes: e01afc3250255 ("PM / Domains: Stop deferring probe at the end of initcall")
> > > Fixes: 25b4e70dcce9 ("driver core: allow stopping deferred probe after init")
> >
> > We can debate the design, but those work as designed. So Fixes?
> >
>
> Well, clk module loading would have worked, and then stopped working
> with e01afc3250255, so it is a regression of sorts.

If there was a clock driver upstream where this worked, then I agree.

> And really the
> tags are mostly for making sure patches get applied to trees that
> backported these commits (and it's not my intention to shame a patch
> as broken. :)

But 'Fixes' implies 'apply to stable' which is not the same as 'apply
backport to a distro kernel'.

Rob

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-02-15 21:26 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-14  0:44 [RFC][PATCH] driver core: Extend returning EPROBE_DEFER for two minutes after late_initcall John Stultz
2020-02-14  1:51 ` Rob Herring
2020-02-14  2:41   ` John Stultz
2020-02-15 21:26     ` Rob Herring
2020-02-14  2:19 ` Bjorn Andersson
2020-02-14  4:05   ` John Stultz
2020-02-14  5:15     ` Bjorn Andersson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).