linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thierry Reding <thierry.reding@gmail.com>
To: John Stultz <john.stultz@linaro.org>
Cc: lkml <linux-kernel@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	Jakub Kicinski <kuba@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Rob Herring <robh@kernel.org>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Andy Shevchenko <andy.shevchenko@gmail.com>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Naresh Kamboju <naresh.kamboju@linaro.org>,
	Basil Eljuse <Basil.Eljuse@arm.com>,
	Ferry Toth <fntoth@gmail.com>, Arnd Bergmann <arnd@arndb.de>,
	Anders Roxell <anders.roxell@linaro.org>,
	netdev <netdev@vger.kernel.org>,
	Linux PM list <linux-pm@vger.kernel.org>,
	linux-tegra@vger.kernel.org, Jon Hunter <jonathanh@nvidia.com>
Subject: Re: [PATCH v3 1/3] driver core: Revert default driver_deferred_probe_timeout value to 0
Date: Fri, 7 Aug 2020 14:26:13 +0200	[thread overview]
Message-ID: <20200807122613.GA524643@ulmo> (raw)
In-Reply-To: <20200807110244.GB94549@ulmo>

[-- Attachment #1: Type: text/plain, Size: 7431 bytes --]

On Fri, Aug 07, 2020 at 01:02:44PM +0200, Thierry Reding wrote:
> On Thu, Aug 06, 2020 at 07:09:16PM -0700, John Stultz wrote:
> > On Thu, Aug 6, 2020 at 6:52 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > >
> > > On Wed, Apr 22, 2020 at 08:32:43PM +0000, John Stultz wrote:
> > > > This patch addresses a regression in 5.7-rc1+
> > > >
> > > > In commit c8c43cee29f6 ("driver core: Fix
> > > > driver_deferred_probe_check_state() logic"), we both cleaned up
> > > > the logic and also set the default driver_deferred_probe_timeout
> > > > value to 30 seconds to allow for drivers that are missing
> > > > dependencies to have some time so that the dependency may be
> > > > loaded from userland after initcalls_done is set.
> > > >
> > > > However, Yoshihiro Shimoda reported that on his device that
> > > > expects to have unmet dependencies (due to "optional links" in
> > > > its devicetree), was failing to mount the NFS root.
> > > >
> > > > In digging further, it seemed the problem was that while the
> > > > device properly probes after waiting 30 seconds for any missing
> > > > modules to load, the ip_auto_config() had already failed,
> > > > resulting in NFS to fail. This was due to ip_auto_config()
> > > > calling wait_for_device_probe() which doesn't wait for the
> > > > driver_deferred_probe_timeout to fire.
> > > >
> > > > Fixing that issue is possible, but could also introduce 30
> > > > second delays in bootups for users who don't have any
> > > > missing dependencies, which is not ideal.
> > > >
> > > > So I think the best solution to avoid any regressions is to
> > > > revert back to a default timeout value of zero, and allow
> > > > systems that need to utilize the timeout in order for userland
> > > > to load any modules that supply misisng dependencies in the dts
> > > > to specify the timeout length via the exiting documented boot
> > > > argument.
> > > >
> > > > Thanks to Geert for chasing down that ip_auto_config was why NFS
> > > > was failing in this case!
> > > >
> > > > Cc: "David S. Miller" <davem@davemloft.net>
> > > > Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
> > > > Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> > > > Cc: Jakub Kicinski <kuba@kernel.org>
> > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
> > > > Cc: Rob Herring <robh@kernel.org>
> > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > Cc: Robin Murphy <robin.murphy@arm.com>
> > > > Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
> > > > Cc: Sudeep Holla <sudeep.holla@arm.com>
> > > > Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > > Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
> > > > Cc: Basil Eljuse <Basil.Eljuse@arm.com>
> > > > Cc: Ferry Toth <fntoth@gmail.com>
> > > > Cc: Arnd Bergmann <arnd@arndb.de>
> > > > Cc: Anders Roxell <anders.roxell@linaro.org>
> > > > Cc: netdev <netdev@vger.kernel.org>
> > > > Cc: linux-pm@vger.kernel.org
> > > > Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > Fixes: c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state() logic")
> > > > Signed-off-by: John Stultz <john.stultz@linaro.org>
> > > > ---
> > > >  drivers/base/dd.c | 13 ++-----------
> > > >  1 file changed, 2 insertions(+), 11 deletions(-)
> > >
> > > Sorry for being a bit late to the party, but this breaks suspend/resume
> > > support on various Tegra devices. I've only noticed now because, well,
> > > suspend/resume have been broken for other reasons for a little while and
> > > it's taken us a bit to resolve those issues.
> > >
> > > But now that those other issues have been fixed, I've started seeing an
> > > issue where after resume from suspend some of the I2C controllers are no
> > > longer working. The reason for this is that they share pins with DP AUX
> > > controllers via the pinctrl framework. The DP AUX driver registers as
> > > part of the DRM/KMS driver, which usually happens in userspace. Since
> > > the deferred probe timeout was set to 0 by default this no longer works
> > > because no pinctrl states are assigned to the I2C controller and
> > > therefore upon resume the pins cannot be configured for I2C operation.
> > 
> > Oof. My apologies!
> > 
> > > I'm also somewhat confused by this patch and a few before because they
> > > claim that they restore previous default behaviour, but that's just not
> > > true. Originally when this timeout was introduced it was -1, which meant
> > > that there was no timeout at all and hence users had to opt-in if they
> > > wanted to use a deferred probe timeout.
> > 
> > I don't think that's quite true, since the point of my original
> > changes were to avoid troubles I was seeing with drivers not loading
> > because once the timeout fired after init, driver loading would fail
> > with ENODEV instead of returning EPROBE_DEFER. The logic that existed
> > was buggy so the timeout handling didn't really work (changing the
> > boot argument wouldn't help, because after init the logic would return
> > ENODEV before it checked the timeout value).
> > 
> > That said, looking at it now, I do realize the
> > driver_deferred_probe_check_state_continue() logic in effect never
> > returned ETIMEDOUT before was consolidated in the earlier changes, and
> > now we've backed the default timeout to 0, old user (see bec6c0ecb243)
> > will now get ETIMEDOUT where they wouldn't before.
> > 
> > So would the following fix it up for you? (sorry its whitespace corrupted)
> > 
> > diff --git a/drivers/pinctrl/devicetree.c b/drivers/pinctrl/devicetree.c
> > index c6fe7d64c913..c7448be64d07 100644
> > --- a/drivers/pinctrl/devicetree.c
> > +++ b/drivers/pinctrl/devicetree.c
> > @@ -129,9 +129,8 @@ static int dt_to_map_one_config(struct pinctrl *p,
> >                 if (!np_pctldev || of_node_is_root(np_pctldev)) {
> >                         of_node_put(np_pctldev);
> >                         ret = driver_deferred_probe_check_state(p->dev);
> > -                       /* keep deferring if modules are enabled
> > unless we've timed out */
> > -                       if (IS_ENABLED(CONFIG_MODULES) && !allow_default &&
> > -                           (ret == -ENODEV))
> > +                       /* keep deferring if modules are enabled */
> > +                       if (IS_ENABLED(CONFIG_MODULES) &&
> > !allow_default && ret < 0)
> >                                 ret = -EPROBE_DEFER;
> >                         return ret;
> >                 }
> > 
> > (you could probably argue calling driver_deferred_probe_check_state
> > checking ret at all is silly here, since EPROBE_DEFER is the only
> > option you want)
> 
> Just by looking at it I would've said, yes, this looks like it would fix
> it. However, upon giving this a try, I see the same pinmux issues as I
> was seeing before. I'm going to have to dig a little deeper to see if I
> can find out why that is, because it isn't obvious to me right away.

Oh, nevermind. I managed to somehow mess up the device tree. With that
fixed the above also restores the pinmux attachment for the I2C
controller and suspend/resume works fine.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2020-08-07 12:26 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-22 20:32 [PATCH v3 0/3] Fixes for deferred_probe_timeout cleanup John Stultz
2020-04-22 20:32 ` [PATCH v3 1/3] driver core: Revert default driver_deferred_probe_timeout value to 0 John Stultz
2020-04-23  7:23   ` Geert Uytterhoeven
     [not found]   ` <CGME20200429134605eucas1p2bd601082e7a6b8c8fdbe79c83972e2e3@eucas1p2.samsung.com>
2020-04-29 13:46     ` Marek Szyprowski
2020-04-29 13:52       ` Mark Brown
2020-04-29 16:56         ` John Stultz
2020-08-06 13:52   ` Thierry Reding
2020-08-07  2:09     ` John Stultz
2020-08-07 11:02       ` Thierry Reding
2020-08-07 12:26         ` Thierry Reding [this message]
2020-04-22 20:32 ` [PATCH v3 2/3] driver core: Use dev_warn() instead of dev_WARN() for deferred_probe_timeout warnings John Stultz
2020-04-23  9:33   ` Yoshihiro Shimoda
2020-04-22 20:32 ` [PATCH v3 3/3] driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires John Stultz
2020-04-23  7:25   ` Geert Uytterhoeven

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200807122613.GA524643@ulmo \
    --to=thierry.reding@gmail.com \
    --cc=Basil.Eljuse@arm.com \
    --cc=anders.roxell@linaro.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=andy.shevchenko@gmail.com \
    --cc=arnd@arndb.de \
    --cc=davem@davemloft.net \
    --cc=fntoth@gmail.com \
    --cc=geert@linux-m68k.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=john.stultz@linaro.org \
    --cc=jonathanh@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=kuznet@ms2.inr.ac.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=naresh.kamboju@linaro.org \
    --cc=netdev@vger.kernel.org \
    --cc=rjw@rjwysocki.net \
    --cc=robh@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=sudeep.holla@arm.com \
    --cc=yoshfuji@linux-ipv6.org \
    --cc=yoshihiro.shimoda.uh@renesas.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).