linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thierry Reding <thierry.reding@gmail.com>
To: John Stultz <john.stultz@linaro.org>
Cc: lkml <linux-kernel@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
	Jakub Kicinski <kuba@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Rob Herring <robh@kernel.org>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Andy Shevchenko <andy.shevchenko@gmail.com>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
	Naresh Kamboju <naresh.kamboju@linaro.org>,
	Basil Eljuse <Basil.Eljuse@arm.com>,
	Ferry Toth <fntoth@gmail.com>, Arnd Bergmann <arnd@arndb.de>,
	Anders Roxell <anders.roxell@linaro.org>,
	netdev <netdev@vger.kernel.org>,
	linux-pm@vger.kernel.org, linux-tegra@vger.kernel.org,
	Jon Hunter <jonathanh@nvidia.com>
Subject: Re: [PATCH v3 1/3] driver core: Revert default driver_deferred_probe_timeout value to 0
Date: Thu, 6 Aug 2020 15:52:51 +0200	[thread overview]
Message-ID: <20200806135251.GB3351349@ulmo> (raw)
In-Reply-To: <20200422203245.83244-2-john.stultz@linaro.org>

[-- Attachment #1: Type: text/plain, Size: 7524 bytes --]

On Wed, Apr 22, 2020 at 08:32:43PM +0000, John Stultz wrote:
> This patch addresses a regression in 5.7-rc1+
> 
> In commit c8c43cee29f6 ("driver core: Fix
> driver_deferred_probe_check_state() logic"), we both cleaned up
> the logic and also set the default driver_deferred_probe_timeout
> value to 30 seconds to allow for drivers that are missing
> dependencies to have some time so that the dependency may be
> loaded from userland after initcalls_done is set.
> 
> However, Yoshihiro Shimoda reported that on his device that
> expects to have unmet dependencies (due to "optional links" in
> its devicetree), was failing to mount the NFS root.
> 
> In digging further, it seemed the problem was that while the
> device properly probes after waiting 30 seconds for any missing
> modules to load, the ip_auto_config() had already failed,
> resulting in NFS to fail. This was due to ip_auto_config()
> calling wait_for_device_probe() which doesn't wait for the
> driver_deferred_probe_timeout to fire.
> 
> Fixing that issue is possible, but could also introduce 30
> second delays in bootups for users who don't have any
> missing dependencies, which is not ideal.
> 
> So I think the best solution to avoid any regressions is to
> revert back to a default timeout value of zero, and allow
> systems that need to utilize the timeout in order for userland
> to load any modules that supply misisng dependencies in the dts
> to specify the timeout length via the exiting documented boot
> argument.
> 
> Thanks to Geert for chasing down that ip_auto_config was why NFS
> was failing in this case!
> 
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
> Cc: Sudeep Holla <sudeep.holla@arm.com>
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
> Cc: Basil Eljuse <Basil.Eljuse@arm.com>
> Cc: Ferry Toth <fntoth@gmail.com>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Anders Roxell <anders.roxell@linaro.org>
> Cc: netdev <netdev@vger.kernel.org>
> Cc: linux-pm@vger.kernel.org
> Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> Fixes: c8c43cee29f6 ("driver core: Fix driver_deferred_probe_check_state() logic")
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
>  drivers/base/dd.c | 13 ++-----------
>  1 file changed, 2 insertions(+), 11 deletions(-)

Sorry for being a bit late to the party, but this breaks suspend/resume
support on various Tegra devices. I've only noticed now because, well,
suspend/resume have been broken for other reasons for a little while and
it's taken us a bit to resolve those issues.

But now that those other issues have been fixed, I've started seeing an
issue where after resume from suspend some of the I2C controllers are no
longer working. The reason for this is that they share pins with DP AUX
controllers via the pinctrl framework. The DP AUX driver registers as
part of the DRM/KMS driver, which usually happens in userspace. Since
the deferred probe timeout was set to 0 by default this no longer works
because no pinctrl states are assigned to the I2C controller and
therefore upon resume the pins cannot be configured for I2C operation.

I'm also somewhat confused by this patch and a few before because they
claim that they restore previous default behaviour, but that's just not
true. Originally when this timeout was introduced it was -1, which meant
that there was no timeout at all and hence users had to opt-in if they
wanted to use a deferred probe timeout.

But now after this series the default is for there to be a very short
timeout, which in turn causes existing use-cases to potentially break.
I'm also going to suggest here that in most cases a driver will require
the resources that it asks for, so the case that Yoshihiro described and
that this patch is meant to fix sounds to me like it's the odd one out
rather than the other way around.

But I realize that that's not very constructive. So perhaps we can find
some other way for drivers to advertise that their dependencies are
optional? I came up with the below patch, which restores suspend/resume
on Tegra and could be used in conjunction with a patch that opts into
this behaviour for the problematic driver in Yoshihiro's case to make
this again work for everyone.

--- >8 ---
From a95f8f41b8a32dee3434db4f0515af7376d1873a Mon Sep 17 00:00:00 2001
From: Thierry Reding <treding@nvidia.com>
Date: Thu, 6 Aug 2020 14:51:59 +0200
Subject: [PATCH] driver core: Do not ignore dependencies by default

Many drivers do require the resources that they ask for and timing out
may not always be an option. While there is a way to allow probing to
continue to be deferred for some time after the system has booted, the
fact that this is controlled via a command-line parameter is undesired
because it require manual intervention, whereas in can be avoid in the
majority of cases.

Instead of requiring users to edit the kernel command-line, add a way
for drivers to specify whether or not their dependencies are optional
so that they can continue deferring probe indefinitely.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/base/dd.c             | 2 +-
 include/linux/device/driver.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 857b0a928e8d..11e747070eae 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -279,7 +279,7 @@ int driver_deferred_probe_check_state(struct device *dev)
 		return -ENODEV;
 	}
 
-	if (!driver_deferred_probe_timeout && initcalls_done) {
+	if (dev->driver->ignore_dependencies && !driver_deferred_probe_timeout && initcalls_done) {
 		dev_warn(dev, "deferred probe timeout, ignoring dependency\n");
 		return -ETIMEDOUT;
 	}
diff --git a/include/linux/device/driver.h b/include/linux/device/driver.h
index ee7ba5b5417e..6994455e8a2e 100644
--- a/include/linux/device/driver.h
+++ b/include/linux/device/driver.h
@@ -100,6 +100,7 @@ struct device_driver {
 	const char		*mod_name;	/* used for built-in modules */
 
 	bool suppress_bind_attrs;	/* disables bind/unbind via sysfs */
+	bool ignore_dependencies;	/* ignores dependencies */
 	enum probe_type probe_type;
 
 	const struct of_device_id	*of_match_table;
-- 
2.27.0
--- >8 ---

Although, thinking about it a bit more it sounds to me like an even
better approach would be to make this part of the API where a resource
is requested. There are in fact already APIs that can request optional
resources (such as regulator_get_optional()), so I think it would make
more sense for any driver that can live without a resource to request
it with an optional flag, which in turn could then trigger this code
path for the deferred probe timeout. For anyone that really needs the
resources that they request, they really shouldn't have to jump through
hoops to get there.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  parent reply	other threads:[~2020-08-06 17:02 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-22 20:32 [PATCH v3 0/3] Fixes for deferred_probe_timeout cleanup John Stultz
2020-04-22 20:32 ` [PATCH v3 1/3] driver core: Revert default driver_deferred_probe_timeout value to 0 John Stultz
2020-04-23  7:23   ` Geert Uytterhoeven
     [not found]   ` <CGME20200429134605eucas1p2bd601082e7a6b8c8fdbe79c83972e2e3@eucas1p2.samsung.com>
2020-04-29 13:46     ` Marek Szyprowski
2020-04-29 13:52       ` Mark Brown
2020-04-29 16:56         ` John Stultz
2020-08-06 13:52   ` Thierry Reding [this message]
2020-08-07  2:09     ` John Stultz
2020-08-07 11:02       ` Thierry Reding
2020-08-07 12:26         ` Thierry Reding
2020-04-22 20:32 ` [PATCH v3 2/3] driver core: Use dev_warn() instead of dev_WARN() for deferred_probe_timeout warnings John Stultz
2020-04-23  9:33   ` Yoshihiro Shimoda
2020-04-22 20:32 ` [PATCH v3 3/3] driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires John Stultz
2020-04-23  7:25   ` Geert Uytterhoeven

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200806135251.GB3351349@ulmo \
    --to=thierry.reding@gmail.com \
    --cc=Basil.Eljuse@arm.com \
    --cc=anders.roxell@linaro.org \
    --cc=andriy.shevchenko@linux.intel.com \
    --cc=andy.shevchenko@gmail.com \
    --cc=arnd@arndb.de \
    --cc=davem@davemloft.net \
    --cc=fntoth@gmail.com \
    --cc=geert@linux-m68k.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=john.stultz@linaro.org \
    --cc=jonathanh@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=kuznet@ms2.inr.ac.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=naresh.kamboju@linaro.org \
    --cc=netdev@vger.kernel.org \
    --cc=rjw@rjwysocki.net \
    --cc=robh@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=sudeep.holla@arm.com \
    --cc=yoshfuji@linux-ipv6.org \
    --cc=yoshihiro.shimoda.uh@renesas.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).