All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-04-29 22:09 ` Saravana Kannan via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan @ 2022-04-29 22:09 UTC (permalink / raw)
  To: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki
  Cc: Saravana Kannan, Rafael J. Wysocki, Rob Herring, Linus Walleij,
	Will Deacon, Ulf Hansson, Kevin Hilman, Thierry Reding,
	Mark Brown, Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel

The deferred probe timer that's used for this currently starts at
late_initcall and runs for driver_deferred_probe_timeout seconds. The
assumption being that all available drivers would be loaded and
registered before the timer expires. This means, the
driver_deferred_probe_timeout has to be pretty large for it to cover the
worst case. But if we set the default value for it to cover the worst
case, it would significantly slow down the average case. For this
reason, the default value is set to 0.

Also, with CONFIG_MODULES=y and the current default values of
driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
drivers will cause their consumer devices to always defer their probes.
This is because device links created by fw_devlink defer the probe even
before the consumer driver's probe() is called.

Instead of a fixed timeout, if we extend an unexpired deferred probe
timer on every successful driver registration, with the expectation more
modules would be loaded in the near future, then the default value of
driver_deferred_probe_timeout only needs to be as long as the worst case
time difference between two consecutive module loads.

So let's implement that and set the default value to 10 seconds when
CONFIG_MODULES=y.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Rob Herring <robh@kernel.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Kevin Hilman <khilman@kernel.org>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Cc: linux-gpio@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 .../admin-guide/kernel-parameters.txt         |  6 ++++--
 drivers/base/base.h                           |  1 +
 drivers/base/dd.c                             | 19 +++++++++++++++++++
 drivers/base/driver.c                         |  1 +
 4 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 7123524a86b8..fcc1dfc877a9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -939,8 +939,10 @@
 			[KNL] Debugging option to set a timeout in seconds for
 			deferred probe to give up waiting on dependencies to
 			probe. Only specific dependencies (subsystems or
-			drivers) that have opted in will be ignored. A timeout of 0
-			will timeout at the end of initcalls. This option will also
+			drivers) that have opted in will be ignored. A timeout
+			of 0 will timeout at the end of initcalls. If the time
+			out hasn't expired, it'll be restarted by each
+			successful driver registration. This option will also
 			dump out devices still on the deferred probe list after
 			retrying.
 
diff --git a/drivers/base/base.h b/drivers/base/base.h
index 2882af26392a..ab71403d102f 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -159,6 +159,7 @@ extern char *make_class_name(const char *name, struct kobject *kobj);
 extern int devres_release_all(struct device *dev);
 extern void device_block_probing(void);
 extern void device_unblock_probing(void);
+extern void deferred_probe_extend_timeout(void);
 
 /* /sys/devices directory */
 extern struct kset *devices_kset;
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index f47cab21430f..603379b5f9dd 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -255,7 +255,12 @@ static int deferred_devs_show(struct seq_file *s, void *data)
 }
 DEFINE_SHOW_ATTRIBUTE(deferred_devs);
 
+#ifdef CONFIG_MODULES
+int driver_deferred_probe_timeout = 10;
+#else
 int driver_deferred_probe_timeout;
+#endif
+
 EXPORT_SYMBOL_GPL(driver_deferred_probe_timeout);
 static DECLARE_WAIT_QUEUE_HEAD(probe_timeout_waitqueue);
 
@@ -315,6 +320,20 @@ static void deferred_probe_timeout_work_func(struct work_struct *work)
 }
 static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_work_func);
 
+void deferred_probe_extend_timeout(void)
+{
+	/*
+	 * If the work hasn't been queued yet or if the work expired, don't
+	 * start a new one.
+	 */
+	if (cancel_delayed_work(&deferred_probe_timeout_work)) {
+		schedule_delayed_work(&deferred_probe_timeout_work,
+				driver_deferred_probe_timeout * HZ);
+		pr_debug("Extended deferred probe timeout by %d secs\n",
+					driver_deferred_probe_timeout);
+	}
+}
+
 /**
  * deferred_probe_initcall() - Enable probing of deferred devices
  *
diff --git a/drivers/base/driver.c b/drivers/base/driver.c
index 8c0d33e182fd..77a77b2095cd 100644
--- a/drivers/base/driver.c
+++ b/drivers/base/driver.c
@@ -177,6 +177,7 @@ int driver_register(struct device_driver *drv)
 		return ret;
 	}
 	kobject_uevent(&drv->p->kobj, KOBJ_ADD);
+	deferred_probe_extend_timeout();
 
 	return ret;
 }
-- 
2.36.0.464.gb9c8b46e94-goog


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-04-29 22:09 ` Saravana Kannan via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan via iommu @ 2022-04-29 22:09 UTC (permalink / raw)
  To: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki
  Cc: Rob Herring, Ulf Hansson, linux-doc, Saravana Kannan,
	Kevin Hilman, kernel-team, Linus Walleij, linux-pm,
	Rafael J. Wysocki, linux-kernel, Paul Kocialkowski, linux-gpio,
	iommu, Mark Brown, Geert Uytterhoeven, Pavel Machek,
	Thierry Reding, Will Deacon

The deferred probe timer that's used for this currently starts at
late_initcall and runs for driver_deferred_probe_timeout seconds. The
assumption being that all available drivers would be loaded and
registered before the timer expires. This means, the
driver_deferred_probe_timeout has to be pretty large for it to cover the
worst case. But if we set the default value for it to cover the worst
case, it would significantly slow down the average case. For this
reason, the default value is set to 0.

Also, with CONFIG_MODULES=y and the current default values of
driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
drivers will cause their consumer devices to always defer their probes.
This is because device links created by fw_devlink defer the probe even
before the consumer driver's probe() is called.

Instead of a fixed timeout, if we extend an unexpired deferred probe
timer on every successful driver registration, with the expectation more
modules would be loaded in the near future, then the default value of
driver_deferred_probe_timeout only needs to be as long as the worst case
time difference between two consecutive module loads.

So let's implement that and set the default value to 10 seconds when
CONFIG_MODULES=y.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Rob Herring <robh@kernel.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Kevin Hilman <khilman@kernel.org>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Cc: linux-gpio@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 .../admin-guide/kernel-parameters.txt         |  6 ++++--
 drivers/base/base.h                           |  1 +
 drivers/base/dd.c                             | 19 +++++++++++++++++++
 drivers/base/driver.c                         |  1 +
 4 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 7123524a86b8..fcc1dfc877a9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -939,8 +939,10 @@
 			[KNL] Debugging option to set a timeout in seconds for
 			deferred probe to give up waiting on dependencies to
 			probe. Only specific dependencies (subsystems or
-			drivers) that have opted in will be ignored. A timeout of 0
-			will timeout at the end of initcalls. This option will also
+			drivers) that have opted in will be ignored. A timeout
+			of 0 will timeout at the end of initcalls. If the time
+			out hasn't expired, it'll be restarted by each
+			successful driver registration. This option will also
 			dump out devices still on the deferred probe list after
 			retrying.
 
diff --git a/drivers/base/base.h b/drivers/base/base.h
index 2882af26392a..ab71403d102f 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -159,6 +159,7 @@ extern char *make_class_name(const char *name, struct kobject *kobj);
 extern int devres_release_all(struct device *dev);
 extern void device_block_probing(void);
 extern void device_unblock_probing(void);
+extern void deferred_probe_extend_timeout(void);
 
 /* /sys/devices directory */
 extern struct kset *devices_kset;
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index f47cab21430f..603379b5f9dd 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -255,7 +255,12 @@ static int deferred_devs_show(struct seq_file *s, void *data)
 }
 DEFINE_SHOW_ATTRIBUTE(deferred_devs);
 
+#ifdef CONFIG_MODULES
+int driver_deferred_probe_timeout = 10;
+#else
 int driver_deferred_probe_timeout;
+#endif
+
 EXPORT_SYMBOL_GPL(driver_deferred_probe_timeout);
 static DECLARE_WAIT_QUEUE_HEAD(probe_timeout_waitqueue);
 
@@ -315,6 +320,20 @@ static void deferred_probe_timeout_work_func(struct work_struct *work)
 }
 static DECLARE_DELAYED_WORK(deferred_probe_timeout_work, deferred_probe_timeout_work_func);
 
+void deferred_probe_extend_timeout(void)
+{
+	/*
+	 * If the work hasn't been queued yet or if the work expired, don't
+	 * start a new one.
+	 */
+	if (cancel_delayed_work(&deferred_probe_timeout_work)) {
+		schedule_delayed_work(&deferred_probe_timeout_work,
+				driver_deferred_probe_timeout * HZ);
+		pr_debug("Extended deferred probe timeout by %d secs\n",
+					driver_deferred_probe_timeout);
+	}
+}
+
 /**
  * deferred_probe_initcall() - Enable probing of deferred devices
  *
diff --git a/drivers/base/driver.c b/drivers/base/driver.c
index 8c0d33e182fd..77a77b2095cd 100644
--- a/drivers/base/driver.c
+++ b/drivers/base/driver.c
@@ -177,6 +177,7 @@ int driver_register(struct device_driver *drv)
 		return ret;
 	}
 	kobject_uevent(&drv->p->kobj, KOBJ_ADD);
+	deferred_probe_extend_timeout();
 
 	return ret;
 }
-- 
2.36.0.464.gb9c8b46e94-goog

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-04-29 22:09 ` Saravana Kannan via iommu
@ 2022-05-13 12:48   ` Mark Brown
  -1 siblings, 0 replies; 38+ messages in thread
From: Mark Brown @ 2022-05-13 12:48 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Rafael J. Wysocki, Rob Herring, Linus Walleij, Will Deacon,
	Ulf Hansson, Kevin Hilman, Thierry Reding, Pavel Machek,
	Geert Uytterhoeven, Yoshihiro Shimoda, Paul Kocialkowski,
	linux-gpio, linux-pm, iommu, kernel-team, linux-doc,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 653 bytes --]

On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> The deferred probe timer that's used for this currently starts at
> late_initcall and runs for driver_deferred_probe_timeout seconds. The
> assumption being that all available drivers would be loaded and
> registered before the timer expires. This means, the
> driver_deferred_probe_timeout has to be pretty large for it to cover the
> worst case. But if we set the default value for it to cover the worst
> case, it would significantly slow down the average case. For this
> reason, the default value is set to 0.

This makes sense to me.

Reviewed-by: Mark Brown <broonie@kernel.org>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-13 12:48   ` Mark Brown
  0 siblings, 0 replies; 38+ messages in thread
From: Mark Brown @ 2022-05-13 12:48 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rob Herring, Ulf Hansson, kernel-team, linux-doc,
	Rafael J. Wysocki, Greg Kroah-Hartman, Linus Walleij,
	Jonathan Corbet, Rafael J. Wysocki, linux-kernel,
	Paul Kocialkowski, Kevin Hilman, iommu, Geert Uytterhoeven,
	Pavel Machek, linux-gpio, linux-pm, Thierry Reding, Will Deacon


[-- Attachment #1.1: Type: text/plain, Size: 653 bytes --]

On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> The deferred probe timer that's used for this currently starts at
> late_initcall and runs for driver_deferred_probe_timeout seconds. The
> assumption being that all available drivers would be loaded and
> registered before the timer expires. This means, the
> driver_deferred_probe_timeout has to be pretty large for it to cover the
> worst case. But if we set the default value for it to cover the worst
> case, it would significantly slow down the average case. For this
> reason, the default value is set to 0.

This makes sense to me.

Reviewed-by: Mark Brown <broonie@kernel.org>

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-04-29 22:09 ` Saravana Kannan via iommu
@ 2022-05-13 13:58   ` Rob Herring
  -1 siblings, 0 replies; 38+ messages in thread
From: Rob Herring @ 2022-05-13 13:58 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Rafael J. Wysocki, Linus Walleij, Will Deacon, Ulf Hansson,
	Kevin Hilman, Thierry Reding, Mark Brown, Pavel Machek,
	Geert Uytterhoeven, Yoshihiro Shimoda, Paul Kocialkowski,
	open list:GPIO SUBSYSTEM, open list:THERMAL, Linux IOMMU,
	Android Kernel Team, Linux Doc Mailing List, linux-kernel

On Fri, Apr 29, 2022 at 5:09 PM Saravana Kannan <saravanak@google.com> wrote:
>
> The deferred probe timer that's used for this currently starts at
> late_initcall and runs for driver_deferred_probe_timeout seconds. The
> assumption being that all available drivers would be loaded and
> registered before the timer expires. This means, the
> driver_deferred_probe_timeout has to be pretty large for it to cover the
> worst case. But if we set the default value for it to cover the worst
> case, it would significantly slow down the average case. For this
> reason, the default value is set to 0.
>
> Also, with CONFIG_MODULES=y and the current default values of
> driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> drivers will cause their consumer devices to always defer their probes.
> This is because device links created by fw_devlink defer the probe even
> before the consumer driver's probe() is called.
>
> Instead of a fixed timeout, if we extend an unexpired deferred probe
> timer on every successful driver registration, with the expectation more
> modules would be loaded in the near future, then the default value of
> driver_deferred_probe_timeout only needs to be as long as the worst case
> time difference between two consecutive module loads.
>
> So let's implement that and set the default value to 10 seconds when
> CONFIG_MODULES=y.

We had to revert a non-zero timeout before (issue with NFS root IIRC).
Does fw_devlink=on somehow fix that?

Rob

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-13 13:58   ` Rob Herring
  0 siblings, 0 replies; 38+ messages in thread
From: Rob Herring @ 2022-05-13 13:58 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Ulf Hansson, Android Kernel Team, Linux Doc Mailing List,
	Rafael J. Wysocki, Greg Kroah-Hartman, Linus Walleij,
	Jonathan Corbet, Rafael J. Wysocki, linux-kernel,
	Paul Kocialkowski, Kevin Hilman, Linux IOMMU, Mark Brown,
	Geert Uytterhoeven, Pavel Machek, open list:GPIO SUBSYSTEM,
	open list:THERMAL, Thierry Reding, Will Deacon

On Fri, Apr 29, 2022 at 5:09 PM Saravana Kannan <saravanak@google.com> wrote:
>
> The deferred probe timer that's used for this currently starts at
> late_initcall and runs for driver_deferred_probe_timeout seconds. The
> assumption being that all available drivers would be loaded and
> registered before the timer expires. This means, the
> driver_deferred_probe_timeout has to be pretty large for it to cover the
> worst case. But if we set the default value for it to cover the worst
> case, it would significantly slow down the average case. For this
> reason, the default value is set to 0.
>
> Also, with CONFIG_MODULES=y and the current default values of
> driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> drivers will cause their consumer devices to always defer their probes.
> This is because device links created by fw_devlink defer the probe even
> before the consumer driver's probe() is called.
>
> Instead of a fixed timeout, if we extend an unexpired deferred probe
> timer on every successful driver registration, with the expectation more
> modules would be loaded in the near future, then the default value of
> driver_deferred_probe_timeout only needs to be as long as the worst case
> time difference between two consecutive module loads.
>
> So let's implement that and set the default value to 10 seconds when
> CONFIG_MODULES=y.

We had to revert a non-zero timeout before (issue with NFS root IIRC).
Does fw_devlink=on somehow fix that?

Rob
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-13 13:58   ` Rob Herring
@ 2022-05-13 17:25     ` Saravana Kannan via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan @ 2022-05-13 17:25 UTC (permalink / raw)
  To: Rob Herring
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Rafael J. Wysocki, Linus Walleij, Will Deacon, Ulf Hansson,
	Kevin Hilman, Thierry Reding, Mark Brown, Pavel Machek,
	Geert Uytterhoeven, Yoshihiro Shimoda, Paul Kocialkowski,
	open list:GPIO SUBSYSTEM, open list:THERMAL, Linux IOMMU,
	Android Kernel Team, Linux Doc Mailing List, linux-kernel

On Fri, May 13, 2022 at 6:58 AM Rob Herring <robh@kernel.org> wrote:
>
> On Fri, Apr 29, 2022 at 5:09 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > The deferred probe timer that's used for this currently starts at
> > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > assumption being that all available drivers would be loaded and
> > registered before the timer expires. This means, the
> > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > worst case. But if we set the default value for it to cover the worst
> > case, it would significantly slow down the average case. For this
> > reason, the default value is set to 0.
> >
> > Also, with CONFIG_MODULES=y and the current default values of
> > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > drivers will cause their consumer devices to always defer their probes.
> > This is because device links created by fw_devlink defer the probe even
> > before the consumer driver's probe() is called.
> >
> > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > timer on every successful driver registration, with the expectation more
> > modules would be loaded in the near future, then the default value of
> > driver_deferred_probe_timeout only needs to be as long as the worst case
> > time difference between two consecutive module loads.
> >
> > So let's implement that and set the default value to 10 seconds when
> > CONFIG_MODULES=y.
>
> We had to revert a non-zero timeout before (issue with NFS root IIRC).
> Does fw_devlink=on somehow fix that?

If it's the one where ip autoconfig was timing out, then John Stultz
fixed it by fixing wait_for_device_probe().
https://lore.kernel.org/all/20200422203245.83244-4-john.stultz@linaro.org/

If you are referring to some other issue, then I'd need more details.

-Saravana

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-13 17:25     ` Saravana Kannan via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan via iommu @ 2022-05-13 17:25 UTC (permalink / raw)
  To: Rob Herring
  Cc: Ulf Hansson, Android Kernel Team, Linux Doc Mailing List,
	Rafael J. Wysocki, Greg Kroah-Hartman, Linus Walleij,
	Jonathan Corbet, Rafael J. Wysocki, linux-kernel,
	Paul Kocialkowski, Kevin Hilman, Linux IOMMU, Mark Brown,
	Geert Uytterhoeven, Pavel Machek, open list:GPIO SUBSYSTEM,
	open list:THERMAL, Thierry Reding, Will Deacon

On Fri, May 13, 2022 at 6:58 AM Rob Herring <robh@kernel.org> wrote:
>
> On Fri, Apr 29, 2022 at 5:09 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > The deferred probe timer that's used for this currently starts at
> > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > assumption being that all available drivers would be loaded and
> > registered before the timer expires. This means, the
> > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > worst case. But if we set the default value for it to cover the worst
> > case, it would significantly slow down the average case. For this
> > reason, the default value is set to 0.
> >
> > Also, with CONFIG_MODULES=y and the current default values of
> > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > drivers will cause their consumer devices to always defer their probes.
> > This is because device links created by fw_devlink defer the probe even
> > before the consumer driver's probe() is called.
> >
> > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > timer on every successful driver registration, with the expectation more
> > modules would be loaded in the near future, then the default value of
> > driver_deferred_probe_timeout only needs to be as long as the worst case
> > time difference between two consecutive module loads.
> >
> > So let's implement that and set the default value to 10 seconds when
> > CONFIG_MODULES=y.
>
> We had to revert a non-zero timeout before (issue with NFS root IIRC).
> Does fw_devlink=on somehow fix that?

If it's the one where ip autoconfig was timing out, then John Stultz
fixed it by fixing wait_for_device_probe().
https://lore.kernel.org/all/20200422203245.83244-4-john.stultz@linaro.org/

If you are referring to some other issue, then I'd need more details.

-Saravana
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-13 17:25     ` Saravana Kannan via iommu
@ 2022-05-16 13:49       ` Rob Herring
  -1 siblings, 0 replies; 38+ messages in thread
From: Rob Herring @ 2022-05-16 13:49 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Ulf Hansson, Android Kernel Team, Linux Doc Mailing List,
	Rafael J. Wysocki, Greg Kroah-Hartman, Linus Walleij,
	Jonathan Corbet, Rafael J. Wysocki, linux-kernel,
	Paul Kocialkowski, Kevin Hilman, Linux IOMMU, Mark Brown,
	Geert Uytterhoeven, Pavel Machek, open list:GPIO SUBSYSTEM,
	open list:THERMAL, Thierry Reding, Will Deacon

On Fri, May 13, 2022 at 12:26 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Fri, May 13, 2022 at 6:58 AM Rob Herring <robh@kernel.org> wrote:
> >
> > On Fri, Apr 29, 2022 at 5:09 PM Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > The deferred probe timer that's used for this currently starts at
> > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > assumption being that all available drivers would be loaded and
> > > registered before the timer expires. This means, the
> > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > worst case. But if we set the default value for it to cover the worst
> > > case, it would significantly slow down the average case. For this
> > > reason, the default value is set to 0.
> > >
> > > Also, with CONFIG_MODULES=y and the current default values of
> > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > drivers will cause their consumer devices to always defer their probes.
> > > This is because device links created by fw_devlink defer the probe even
> > > before the consumer driver's probe() is called.
> > >
> > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > timer on every successful driver registration, with the expectation more
> > > modules would be loaded in the near future, then the default value of
> > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > time difference between two consecutive module loads.
> > >
> > > So let's implement that and set the default value to 10 seconds when
> > > CONFIG_MODULES=y.
> >
> > We had to revert a non-zero timeout before (issue with NFS root IIRC).
> > Does fw_devlink=on somehow fix that?
>
> If it's the one where ip autoconfig was timing out, then John Stultz
> fixed it by fixing wait_for_device_probe().
> https://lore.kernel.org/all/20200422203245.83244-4-john.stultz@linaro.org/

Yeah, that was it.

Acked-by: Rob Herring <robh@kernel.org>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-16 13:49       ` Rob Herring
  0 siblings, 0 replies; 38+ messages in thread
From: Rob Herring @ 2022-05-16 13:49 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Rafael J. Wysocki, Linus Walleij, Will Deacon, Ulf Hansson,
	Kevin Hilman, Thierry Reding, Mark Brown, Pavel Machek,
	Geert Uytterhoeven, Yoshihiro Shimoda, Paul Kocialkowski,
	open list:GPIO SUBSYSTEM, open list:THERMAL, Linux IOMMU,
	Android Kernel Team, Linux Doc Mailing List, linux-kernel

On Fri, May 13, 2022 at 12:26 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Fri, May 13, 2022 at 6:58 AM Rob Herring <robh@kernel.org> wrote:
> >
> > On Fri, Apr 29, 2022 at 5:09 PM Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > The deferred probe timer that's used for this currently starts at
> > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > assumption being that all available drivers would be loaded and
> > > registered before the timer expires. This means, the
> > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > worst case. But if we set the default value for it to cover the worst
> > > case, it would significantly slow down the average case. For this
> > > reason, the default value is set to 0.
> > >
> > > Also, with CONFIG_MODULES=y and the current default values of
> > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > drivers will cause their consumer devices to always defer their probes.
> > > This is because device links created by fw_devlink defer the probe even
> > > before the consumer driver's probe() is called.
> > >
> > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > timer on every successful driver registration, with the expectation more
> > > modules would be loaded in the near future, then the default value of
> > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > time difference between two consecutive module loads.
> > >
> > > So let's implement that and set the default value to 10 seconds when
> > > CONFIG_MODULES=y.
> >
> > We had to revert a non-zero timeout before (issue with NFS root IIRC).
> > Does fw_devlink=on somehow fix that?
>
> If it's the one where ip autoconfig was timing out, then John Stultz
> fixed it by fixing wait_for_device_probe().
> https://lore.kernel.org/all/20200422203245.83244-4-john.stultz@linaro.org/

Yeah, that was it.

Acked-by: Rob Herring <robh@kernel.org>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-04-29 22:09 ` Saravana Kannan via iommu
@ 2022-05-20 23:30   ` Nathan Chancellor
  -1 siblings, 0 replies; 38+ messages in thread
From: Nathan Chancellor @ 2022-05-20 23:30 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Rafael J. Wysocki, Rob Herring, Linus Walleij, Will Deacon,
	Ulf Hansson, Kevin Hilman, Thierry Reding, Mark Brown,
	Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel, linux-s390

Hi Saravana,

On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> The deferred probe timer that's used for this currently starts at
> late_initcall and runs for driver_deferred_probe_timeout seconds. The
> assumption being that all available drivers would be loaded and
> registered before the timer expires. This means, the
> driver_deferred_probe_timeout has to be pretty large for it to cover the
> worst case. But if we set the default value for it to cover the worst
> case, it would significantly slow down the average case. For this
> reason, the default value is set to 0.
> 
> Also, with CONFIG_MODULES=y and the current default values of
> driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> drivers will cause their consumer devices to always defer their probes.
> This is because device links created by fw_devlink defer the probe even
> before the consumer driver's probe() is called.
> 
> Instead of a fixed timeout, if we extend an unexpired deferred probe
> timer on every successful driver registration, with the expectation more
> modules would be loaded in the near future, then the default value of
> driver_deferred_probe_timeout only needs to be as long as the worst case
> time difference between two consecutive module loads.
> 
> So let's implement that and set the default value to 10 seconds when
> CONFIG_MODULES=y.
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Ulf Hansson <ulf.hansson@linaro.org>
> Cc: Kevin Hilman <khilman@kernel.org>
> Cc: Thierry Reding <treding@nvidia.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Pavel Machek <pavel@ucw.cz>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> Cc: linux-gpio@vger.kernel.org
> Cc: linux-pm@vger.kernel.org
> Cc: iommu@lists.linux-foundation.org
> Signed-off-by: Saravana Kannan <saravanak@google.com>

I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
timeout on driver registration") in next-20220520 (bisect log below).

$ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage

$ timeout --foreground 15m stdbuf -oL -eL \
qemu-system-s390x \
-initrd ... \
-M s390-ccw-virtio \
-display none \
-kernel arch/s390/boot/bzImage \
-m 512m \
-nodefaults \
-serial mon:stdio
...
[    2.077303] In-situ OAM (IOAM) with IPv6
[    2.077639] NET: Registered PF_PACKET protocol family
[    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[    2.078795] Key type dns_resolver registered
[    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
[    2.081494] Discipline DIAG cannot be used without z/VM
[  260.626363] random: crng init done
qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)

We have a simple rootfs available if necessary:

https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst

If there is any other information I can provide, please let me know!

Cheers,
Nathan

# bad: [18ecd30af1a8402c162cca1bd58771c0e5be7815] Add linux-next specific files for 20220520
# good: [b015dcd62b86d298829990f8261d5d154b8d7af5] Merge tag 'for-5.18/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
git bisect start '18ecd30af1a8402c162cca1bd58771c0e5be7815' 'b015dcd62b86d298829990f8261d5d154b8d7af5'
# good: [f9b63740b666dd9887eb0282d21b5f65bb0cadd0] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
git bisect good f9b63740b666dd9887eb0282d21b5f65bb0cadd0
# good: [1f5eb3e76303572f0318e8c50da51c516580aa03] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
git bisect good 1f5eb3e76303572f0318e8c50da51c516580aa03
# bad: [4c1d9cc0363691893ef94fa0d798faca013e27d3] Merge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
git bisect bad 4c1d9cc0363691893ef94fa0d798faca013e27d3
# bad: [dcb68304485c0ba5f84f1a54687c751b68263d93] Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial.git
git bisect bad dcb68304485c0ba5f84f1a54687c751b68263d93
# good: [61271996dc46aecb40fd26f89a4ec0a6bd8f3a8f] Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm.git
git bisect good 61271996dc46aecb40fd26f89a4ec0a6bd8f3a8f
# good: [d4db45a71f56032b552e161968bb0e5fdd2767f8] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds.git
git bisect good d4db45a71f56032b552e161968bb0e5fdd2767f8
# good: [d090c7a2ab84663185e4abda21d7d83880937c8a] USB / dwc3: Fix a checkpatch warning in core.c
git bisect good d090c7a2ab84663185e4abda21d7d83880937c8a
# bad: [b232b02bf3c205b13a26dcec08e53baddd8e59ed] driver core: fix deadlock in __device_attach
git bisect bad b232b02bf3c205b13a26dcec08e53baddd8e59ed
# good: [4c32174a24759d5ac6dc42b508fcec2afb8b9602] Documentation: dd: Use ReST lists for return values of driver_deferred_probe_check_state()
git bisect good 4c32174a24759d5ac6dc42b508fcec2afb8b9602
# good: [38ea74eb8fc1b82b39e13a6527095a0036539117] rpmsg: use local 'dev' variable
git bisect good 38ea74eb8fc1b82b39e13a6527095a0036539117
# good: [1f7ff11ca68f464b6a9a71b8fbe9e5219e7cac57] driver core: location: Add "back" as a possible output for panel
git bisect good 1f7ff11ca68f464b6a9a71b8fbe9e5219e7cac57
# good: [6ee60e9c9f2f83ad218159af6a175c57a395ae69] MAINTAINERS: add Russ Weight as a firmware loader maintainer
git bisect good 6ee60e9c9f2f83ad218159af6a175c57a395ae69
# bad: [15f214f9bdb7c1f560b4bf863c5a72ff53b442a4] topology: Remove unused cpu_cluster_mask()
git bisect bad 15f214f9bdb7c1f560b4bf863c5a72ff53b442a4
# bad: [2b28a1a84a0eb3412bad1a2d5cce2bb4addec626] driver core: Extend deferred probe timeout on driver registration
git bisect bad 2b28a1a84a0eb3412bad1a2d5cce2bb4addec626
# first bad commit: [2b28a1a84a0eb3412bad1a2d5cce2bb4addec626] driver core: Extend deferred probe timeout on driver registration

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-20 23:30   ` Nathan Chancellor
  0 siblings, 0 replies; 38+ messages in thread
From: Nathan Chancellor @ 2022-05-20 23:30 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Ulf Hansson, linux-doc, Linus Walleij, Pavel Machek, linux-s390,
	Will Deacon, Rob Herring, Jonathan Corbet, Rafael J. Wysocki,
	Geert Uytterhoeven, Thierry Reding, kernel-team, linux-pm,
	linux-gpio, Mark Brown, Greg Kroah-Hartman, Kevin Hilman,
	Rafael J. Wysocki, linux-kernel, Paul Kocialkowski, iommu

Hi Saravana,

On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> The deferred probe timer that's used for this currently starts at
> late_initcall and runs for driver_deferred_probe_timeout seconds. The
> assumption being that all available drivers would be loaded and
> registered before the timer expires. This means, the
> driver_deferred_probe_timeout has to be pretty large for it to cover the
> worst case. But if we set the default value for it to cover the worst
> case, it would significantly slow down the average case. For this
> reason, the default value is set to 0.
> 
> Also, with CONFIG_MODULES=y and the current default values of
> driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> drivers will cause their consumer devices to always defer their probes.
> This is because device links created by fw_devlink defer the probe even
> before the consumer driver's probe() is called.
> 
> Instead of a fixed timeout, if we extend an unexpired deferred probe
> timer on every successful driver registration, with the expectation more
> modules would be loaded in the near future, then the default value of
> driver_deferred_probe_timeout only needs to be as long as the worst case
> time difference between two consecutive module loads.
> 
> So let's implement that and set the default value to 10 seconds when
> CONFIG_MODULES=y.
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Ulf Hansson <ulf.hansson@linaro.org>
> Cc: Kevin Hilman <khilman@kernel.org>
> Cc: Thierry Reding <treding@nvidia.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Pavel Machek <pavel@ucw.cz>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> Cc: linux-gpio@vger.kernel.org
> Cc: linux-pm@vger.kernel.org
> Cc: iommu@lists.linux-foundation.org
> Signed-off-by: Saravana Kannan <saravanak@google.com>

I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
timeout on driver registration") in next-20220520 (bisect log below).

$ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage

$ timeout --foreground 15m stdbuf -oL -eL \
qemu-system-s390x \
-initrd ... \
-M s390-ccw-virtio \
-display none \
-kernel arch/s390/boot/bzImage \
-m 512m \
-nodefaults \
-serial mon:stdio
...
[    2.077303] In-situ OAM (IOAM) with IPv6
[    2.077639] NET: Registered PF_PACKET protocol family
[    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[    2.078795] Key type dns_resolver registered
[    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
[    2.081494] Discipline DIAG cannot be used without z/VM
[  260.626363] random: crng init done
qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)

We have a simple rootfs available if necessary:

https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst

If there is any other information I can provide, please let me know!

Cheers,
Nathan

# bad: [18ecd30af1a8402c162cca1bd58771c0e5be7815] Add linux-next specific files for 20220520
# good: [b015dcd62b86d298829990f8261d5d154b8d7af5] Merge tag 'for-5.18/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
git bisect start '18ecd30af1a8402c162cca1bd58771c0e5be7815' 'b015dcd62b86d298829990f8261d5d154b8d7af5'
# good: [f9b63740b666dd9887eb0282d21b5f65bb0cadd0] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
git bisect good f9b63740b666dd9887eb0282d21b5f65bb0cadd0
# good: [1f5eb3e76303572f0318e8c50da51c516580aa03] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
git bisect good 1f5eb3e76303572f0318e8c50da51c516580aa03
# bad: [4c1d9cc0363691893ef94fa0d798faca013e27d3] Merge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
git bisect bad 4c1d9cc0363691893ef94fa0d798faca013e27d3
# bad: [dcb68304485c0ba5f84f1a54687c751b68263d93] Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial.git
git bisect bad dcb68304485c0ba5f84f1a54687c751b68263d93
# good: [61271996dc46aecb40fd26f89a4ec0a6bd8f3a8f] Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm.git
git bisect good 61271996dc46aecb40fd26f89a4ec0a6bd8f3a8f
# good: [d4db45a71f56032b552e161968bb0e5fdd2767f8] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds.git
git bisect good d4db45a71f56032b552e161968bb0e5fdd2767f8
# good: [d090c7a2ab84663185e4abda21d7d83880937c8a] USB / dwc3: Fix a checkpatch warning in core.c
git bisect good d090c7a2ab84663185e4abda21d7d83880937c8a
# bad: [b232b02bf3c205b13a26dcec08e53baddd8e59ed] driver core: fix deadlock in __device_attach
git bisect bad b232b02bf3c205b13a26dcec08e53baddd8e59ed
# good: [4c32174a24759d5ac6dc42b508fcec2afb8b9602] Documentation: dd: Use ReST lists for return values of driver_deferred_probe_check_state()
git bisect good 4c32174a24759d5ac6dc42b508fcec2afb8b9602
# good: [38ea74eb8fc1b82b39e13a6527095a0036539117] rpmsg: use local 'dev' variable
git bisect good 38ea74eb8fc1b82b39e13a6527095a0036539117
# good: [1f7ff11ca68f464b6a9a71b8fbe9e5219e7cac57] driver core: location: Add "back" as a possible output for panel
git bisect good 1f7ff11ca68f464b6a9a71b8fbe9e5219e7cac57
# good: [6ee60e9c9f2f83ad218159af6a175c57a395ae69] MAINTAINERS: add Russ Weight as a firmware loader maintainer
git bisect good 6ee60e9c9f2f83ad218159af6a175c57a395ae69
# bad: [15f214f9bdb7c1f560b4bf863c5a72ff53b442a4] topology: Remove unused cpu_cluster_mask()
git bisect bad 15f214f9bdb7c1f560b4bf863c5a72ff53b442a4
# bad: [2b28a1a84a0eb3412bad1a2d5cce2bb4addec626] driver core: Extend deferred probe timeout on driver registration
git bisect bad 2b28a1a84a0eb3412bad1a2d5cce2bb4addec626
# first bad commit: [2b28a1a84a0eb3412bad1a2d5cce2bb4addec626] driver core: Extend deferred probe timeout on driver registration
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-20 23:30   ` Nathan Chancellor
@ 2022-05-20 23:49     ` Saravana Kannan via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan @ 2022-05-20 23:49 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Rafael J. Wysocki, Rob Herring, Linus Walleij, Will Deacon,
	Ulf Hansson, Kevin Hilman, Thierry Reding, Mark Brown,
	Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel, linux-s390

On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> Hi Saravana,
>
> On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > The deferred probe timer that's used for this currently starts at
> > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > assumption being that all available drivers would be loaded and
> > registered before the timer expires. This means, the
> > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > worst case. But if we set the default value for it to cover the worst
> > case, it would significantly slow down the average case. For this
> > reason, the default value is set to 0.
> >
> > Also, with CONFIG_MODULES=y and the current default values of
> > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > drivers will cause their consumer devices to always defer their probes.
> > This is because device links created by fw_devlink defer the probe even
> > before the consumer driver's probe() is called.
> >
> > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > timer on every successful driver registration, with the expectation more
> > modules would be loaded in the near future, then the default value of
> > driver_deferred_probe_timeout only needs to be as long as the worst case
> > time difference between two consecutive module loads.
> >
> > So let's implement that and set the default value to 10 seconds when
> > CONFIG_MODULES=y.
> >
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > Cc: Rob Herring <robh@kernel.org>
> > Cc: Linus Walleij <linus.walleij@linaro.org>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > Cc: Kevin Hilman <khilman@kernel.org>
> > Cc: Thierry Reding <treding@nvidia.com>
> > Cc: Mark Brown <broonie@kernel.org>
> > Cc: Pavel Machek <pavel@ucw.cz>
> > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > Cc: linux-gpio@vger.kernel.org
> > Cc: linux-pm@vger.kernel.org
> > Cc: iommu@lists.linux-foundation.org
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
>
> I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> timeout on driver registration") in next-20220520 (bisect log below).
>
> $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
>
> $ timeout --foreground 15m stdbuf -oL -eL \
> qemu-system-s390x \
> -initrd ... \
> -M s390-ccw-virtio \
> -display none \
> -kernel arch/s390/boot/bzImage \
> -m 512m \
> -nodefaults \
> -serial mon:stdio
> ...
> [    2.077303] In-situ OAM (IOAM) with IPv6
> [    2.077639] NET: Registered PF_PACKET protocol family
> [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> [    2.078795] Key type dns_resolver registered
> [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> [    2.081494] Discipline DIAG cannot be used without z/VM
> [  260.626363] random: crng init done
> qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
>
> We have a simple rootfs available if necessary:
>
> https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
>
> If there is any other information I can provide, please let me know!

Hmm... strange. Can you please try the following command line options
and tell me which of these has the issue and which don't?
1) deferred_probe_timeout=0
2) deferred_probe_timeout=1
3) deferred_probe_timeout=300

That should help me narrow down where the error might be.

Thanks,
Saravana

>
> Cheers,
> Nathan
>
> # bad: [18ecd30af1a8402c162cca1bd58771c0e5be7815] Add linux-next specific files for 20220520
> # good: [b015dcd62b86d298829990f8261d5d154b8d7af5] Merge tag 'for-5.18/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
> git bisect start '18ecd30af1a8402c162cca1bd58771c0e5be7815' 'b015dcd62b86d298829990f8261d5d154b8d7af5'
> # good: [f9b63740b666dd9887eb0282d21b5f65bb0cadd0] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
> git bisect good f9b63740b666dd9887eb0282d21b5f65bb0cadd0
> # good: [1f5eb3e76303572f0318e8c50da51c516580aa03] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> git bisect good 1f5eb3e76303572f0318e8c50da51c516580aa03
> # bad: [4c1d9cc0363691893ef94fa0d798faca013e27d3] Merge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
> git bisect bad 4c1d9cc0363691893ef94fa0d798faca013e27d3
> # bad: [dcb68304485c0ba5f84f1a54687c751b68263d93] Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial.git
> git bisect bad dcb68304485c0ba5f84f1a54687c751b68263d93
> # good: [61271996dc46aecb40fd26f89a4ec0a6bd8f3a8f] Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm.git
> git bisect good 61271996dc46aecb40fd26f89a4ec0a6bd8f3a8f
> # good: [d4db45a71f56032b552e161968bb0e5fdd2767f8] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds.git
> git bisect good d4db45a71f56032b552e161968bb0e5fdd2767f8
> # good: [d090c7a2ab84663185e4abda21d7d83880937c8a] USB / dwc3: Fix a checkpatch warning in core.c
> git bisect good d090c7a2ab84663185e4abda21d7d83880937c8a
> # bad: [b232b02bf3c205b13a26dcec08e53baddd8e59ed] driver core: fix deadlock in __device_attach
> git bisect bad b232b02bf3c205b13a26dcec08e53baddd8e59ed
> # good: [4c32174a24759d5ac6dc42b508fcec2afb8b9602] Documentation: dd: Use ReST lists for return values of driver_deferred_probe_check_state()
> git bisect good 4c32174a24759d5ac6dc42b508fcec2afb8b9602
> # good: [38ea74eb8fc1b82b39e13a6527095a0036539117] rpmsg: use local 'dev' variable
> git bisect good 38ea74eb8fc1b82b39e13a6527095a0036539117
> # good: [1f7ff11ca68f464b6a9a71b8fbe9e5219e7cac57] driver core: location: Add "back" as a possible output for panel
> git bisect good 1f7ff11ca68f464b6a9a71b8fbe9e5219e7cac57
> # good: [6ee60e9c9f2f83ad218159af6a175c57a395ae69] MAINTAINERS: add Russ Weight as a firmware loader maintainer
> git bisect good 6ee60e9c9f2f83ad218159af6a175c57a395ae69
> # bad: [15f214f9bdb7c1f560b4bf863c5a72ff53b442a4] topology: Remove unused cpu_cluster_mask()
> git bisect bad 15f214f9bdb7c1f560b4bf863c5a72ff53b442a4
> # bad: [2b28a1a84a0eb3412bad1a2d5cce2bb4addec626] driver core: Extend deferred probe timeout on driver registration
> git bisect bad 2b28a1a84a0eb3412bad1a2d5cce2bb4addec626
> # first bad commit: [2b28a1a84a0eb3412bad1a2d5cce2bb4addec626] driver core: Extend deferred probe timeout on driver registration

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-20 23:49     ` Saravana Kannan via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan via iommu @ 2022-05-20 23:49 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Ulf Hansson, linux-doc, Linus Walleij, Pavel Machek, linux-s390,
	Will Deacon, Rob Herring, Jonathan Corbet, Rafael J. Wysocki,
	Geert Uytterhoeven, Thierry Reding, kernel-team, linux-pm,
	linux-gpio, Mark Brown, Greg Kroah-Hartman, Kevin Hilman,
	Rafael J. Wysocki, linux-kernel, Paul Kocialkowski, iommu

On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> Hi Saravana,
>
> On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > The deferred probe timer that's used for this currently starts at
> > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > assumption being that all available drivers would be loaded and
> > registered before the timer expires. This means, the
> > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > worst case. But if we set the default value for it to cover the worst
> > case, it would significantly slow down the average case. For this
> > reason, the default value is set to 0.
> >
> > Also, with CONFIG_MODULES=y and the current default values of
> > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > drivers will cause their consumer devices to always defer their probes.
> > This is because device links created by fw_devlink defer the probe even
> > before the consumer driver's probe() is called.
> >
> > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > timer on every successful driver registration, with the expectation more
> > modules would be loaded in the near future, then the default value of
> > driver_deferred_probe_timeout only needs to be as long as the worst case
> > time difference between two consecutive module loads.
> >
> > So let's implement that and set the default value to 10 seconds when
> > CONFIG_MODULES=y.
> >
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > Cc: Rob Herring <robh@kernel.org>
> > Cc: Linus Walleij <linus.walleij@linaro.org>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > Cc: Kevin Hilman <khilman@kernel.org>
> > Cc: Thierry Reding <treding@nvidia.com>
> > Cc: Mark Brown <broonie@kernel.org>
> > Cc: Pavel Machek <pavel@ucw.cz>
> > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > Cc: linux-gpio@vger.kernel.org
> > Cc: linux-pm@vger.kernel.org
> > Cc: iommu@lists.linux-foundation.org
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
>
> I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> timeout on driver registration") in next-20220520 (bisect log below).
>
> $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
>
> $ timeout --foreground 15m stdbuf -oL -eL \
> qemu-system-s390x \
> -initrd ... \
> -M s390-ccw-virtio \
> -display none \
> -kernel arch/s390/boot/bzImage \
> -m 512m \
> -nodefaults \
> -serial mon:stdio
> ...
> [    2.077303] In-situ OAM (IOAM) with IPv6
> [    2.077639] NET: Registered PF_PACKET protocol family
> [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> [    2.078795] Key type dns_resolver registered
> [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> [    2.081494] Discipline DIAG cannot be used without z/VM
> [  260.626363] random: crng init done
> qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
>
> We have a simple rootfs available if necessary:
>
> https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
>
> If there is any other information I can provide, please let me know!

Hmm... strange. Can you please try the following command line options
and tell me which of these has the issue and which don't?
1) deferred_probe_timeout=0
2) deferred_probe_timeout=1
3) deferred_probe_timeout=300

That should help me narrow down where the error might be.

Thanks,
Saravana

>
> Cheers,
> Nathan
>
> # bad: [18ecd30af1a8402c162cca1bd58771c0e5be7815] Add linux-next specific files for 20220520
> # good: [b015dcd62b86d298829990f8261d5d154b8d7af5] Merge tag 'for-5.18/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
> git bisect start '18ecd30af1a8402c162cca1bd58771c0e5be7815' 'b015dcd62b86d298829990f8261d5d154b8d7af5'
> # good: [f9b63740b666dd9887eb0282d21b5f65bb0cadd0] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
> git bisect good f9b63740b666dd9887eb0282d21b5f65bb0cadd0
> # good: [1f5eb3e76303572f0318e8c50da51c516580aa03] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> git bisect good 1f5eb3e76303572f0318e8c50da51c516580aa03
> # bad: [4c1d9cc0363691893ef94fa0d798faca013e27d3] Merge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
> git bisect bad 4c1d9cc0363691893ef94fa0d798faca013e27d3
> # bad: [dcb68304485c0ba5f84f1a54687c751b68263d93] Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial.git
> git bisect bad dcb68304485c0ba5f84f1a54687c751b68263d93
> # good: [61271996dc46aecb40fd26f89a4ec0a6bd8f3a8f] Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm.git
> git bisect good 61271996dc46aecb40fd26f89a4ec0a6bd8f3a8f
> # good: [d4db45a71f56032b552e161968bb0e5fdd2767f8] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds.git
> git bisect good d4db45a71f56032b552e161968bb0e5fdd2767f8
> # good: [d090c7a2ab84663185e4abda21d7d83880937c8a] USB / dwc3: Fix a checkpatch warning in core.c
> git bisect good d090c7a2ab84663185e4abda21d7d83880937c8a
> # bad: [b232b02bf3c205b13a26dcec08e53baddd8e59ed] driver core: fix deadlock in __device_attach
> git bisect bad b232b02bf3c205b13a26dcec08e53baddd8e59ed
> # good: [4c32174a24759d5ac6dc42b508fcec2afb8b9602] Documentation: dd: Use ReST lists for return values of driver_deferred_probe_check_state()
> git bisect good 4c32174a24759d5ac6dc42b508fcec2afb8b9602
> # good: [38ea74eb8fc1b82b39e13a6527095a0036539117] rpmsg: use local 'dev' variable
> git bisect good 38ea74eb8fc1b82b39e13a6527095a0036539117
> # good: [1f7ff11ca68f464b6a9a71b8fbe9e5219e7cac57] driver core: location: Add "back" as a possible output for panel
> git bisect good 1f7ff11ca68f464b6a9a71b8fbe9e5219e7cac57
> # good: [6ee60e9c9f2f83ad218159af6a175c57a395ae69] MAINTAINERS: add Russ Weight as a firmware loader maintainer
> git bisect good 6ee60e9c9f2f83ad218159af6a175c57a395ae69
> # bad: [15f214f9bdb7c1f560b4bf863c5a72ff53b442a4] topology: Remove unused cpu_cluster_mask()
> git bisect bad 15f214f9bdb7c1f560b4bf863c5a72ff53b442a4
> # bad: [2b28a1a84a0eb3412bad1a2d5cce2bb4addec626] driver core: Extend deferred probe timeout on driver registration
> git bisect bad 2b28a1a84a0eb3412bad1a2d5cce2bb4addec626
> # first bad commit: [2b28a1a84a0eb3412bad1a2d5cce2bb4addec626] driver core: Extend deferred probe timeout on driver registration
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-20 23:49     ` Saravana Kannan via iommu
@ 2022-05-21  0:04       ` Nathan Chancellor
  -1 siblings, 0 replies; 38+ messages in thread
From: Nathan Chancellor @ 2022-05-21  0:04 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Rafael J. Wysocki, Rob Herring, Linus Walleij, Will Deacon,
	Ulf Hansson, Kevin Hilman, Thierry Reding, Mark Brown,
	Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel, linux-s390

On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > Hi Saravana,
> >
> > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > The deferred probe timer that's used for this currently starts at
> > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > assumption being that all available drivers would be loaded and
> > > registered before the timer expires. This means, the
> > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > worst case. But if we set the default value for it to cover the worst
> > > case, it would significantly slow down the average case. For this
> > > reason, the default value is set to 0.
> > >
> > > Also, with CONFIG_MODULES=y and the current default values of
> > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > drivers will cause their consumer devices to always defer their probes.
> > > This is because device links created by fw_devlink defer the probe even
> > > before the consumer driver's probe() is called.
> > >
> > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > timer on every successful driver registration, with the expectation more
> > > modules would be loaded in the near future, then the default value of
> > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > time difference between two consecutive module loads.
> > >
> > > So let's implement that and set the default value to 10 seconds when
> > > CONFIG_MODULES=y.
> > >
> > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > Cc: Rob Herring <robh@kernel.org>
> > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > Cc: Will Deacon <will@kernel.org>
> > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > Cc: Kevin Hilman <khilman@kernel.org>
> > > Cc: Thierry Reding <treding@nvidia.com>
> > > Cc: Mark Brown <broonie@kernel.org>
> > > Cc: Pavel Machek <pavel@ucw.cz>
> > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > Cc: linux-gpio@vger.kernel.org
> > > Cc: linux-pm@vger.kernel.org
> > > Cc: iommu@lists.linux-foundation.org
> > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> >
> > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > timeout on driver registration") in next-20220520 (bisect log below).
> >
> > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> >
> > $ timeout --foreground 15m stdbuf -oL -eL \
> > qemu-system-s390x \
> > -initrd ... \
> > -M s390-ccw-virtio \
> > -display none \
> > -kernel arch/s390/boot/bzImage \
> > -m 512m \
> > -nodefaults \
> > -serial mon:stdio
> > ...
> > [    2.077303] In-situ OAM (IOAM) with IPv6
> > [    2.077639] NET: Registered PF_PACKET protocol family
> > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > [    2.078795] Key type dns_resolver registered
> > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > [    2.081494] Discipline DIAG cannot be used without z/VM
> > [  260.626363] random: crng init done
> > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> >
> > We have a simple rootfs available if necessary:
> >
> > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> >
> > If there is any other information I can provide, please let me know!
> 
> Hmm... strange. Can you please try the following command line options
> and tell me which of these has the issue and which don't?

Sure thing!

> 1) deferred_probe_timeout=0

No issue.

> 2) deferred_probe_timeout=1
> 3) deferred_probe_timeout=300

Both of these appear to hang in the same way, I let each sit for five
minutes.

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-21  0:04       ` Nathan Chancellor
  0 siblings, 0 replies; 38+ messages in thread
From: Nathan Chancellor @ 2022-05-21  0:04 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Ulf Hansson, linux-doc, Linus Walleij, Pavel Machek, linux-s390,
	Will Deacon, Rob Herring, Jonathan Corbet, Rafael J. Wysocki,
	Geert Uytterhoeven, Thierry Reding, kernel-team, linux-pm,
	linux-gpio, Mark Brown, Greg Kroah-Hartman, Kevin Hilman,
	Rafael J. Wysocki, linux-kernel, Paul Kocialkowski, iommu

On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > Hi Saravana,
> >
> > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > The deferred probe timer that's used for this currently starts at
> > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > assumption being that all available drivers would be loaded and
> > > registered before the timer expires. This means, the
> > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > worst case. But if we set the default value for it to cover the worst
> > > case, it would significantly slow down the average case. For this
> > > reason, the default value is set to 0.
> > >
> > > Also, with CONFIG_MODULES=y and the current default values of
> > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > drivers will cause their consumer devices to always defer their probes.
> > > This is because device links created by fw_devlink defer the probe even
> > > before the consumer driver's probe() is called.
> > >
> > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > timer on every successful driver registration, with the expectation more
> > > modules would be loaded in the near future, then the default value of
> > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > time difference between two consecutive module loads.
> > >
> > > So let's implement that and set the default value to 10 seconds when
> > > CONFIG_MODULES=y.
> > >
> > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > Cc: Rob Herring <robh@kernel.org>
> > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > Cc: Will Deacon <will@kernel.org>
> > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > Cc: Kevin Hilman <khilman@kernel.org>
> > > Cc: Thierry Reding <treding@nvidia.com>
> > > Cc: Mark Brown <broonie@kernel.org>
> > > Cc: Pavel Machek <pavel@ucw.cz>
> > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > Cc: linux-gpio@vger.kernel.org
> > > Cc: linux-pm@vger.kernel.org
> > > Cc: iommu@lists.linux-foundation.org
> > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> >
> > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > timeout on driver registration") in next-20220520 (bisect log below).
> >
> > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> >
> > $ timeout --foreground 15m stdbuf -oL -eL \
> > qemu-system-s390x \
> > -initrd ... \
> > -M s390-ccw-virtio \
> > -display none \
> > -kernel arch/s390/boot/bzImage \
> > -m 512m \
> > -nodefaults \
> > -serial mon:stdio
> > ...
> > [    2.077303] In-situ OAM (IOAM) with IPv6
> > [    2.077639] NET: Registered PF_PACKET protocol family
> > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > [    2.078795] Key type dns_resolver registered
> > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > [    2.081494] Discipline DIAG cannot be used without z/VM
> > [  260.626363] random: crng init done
> > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> >
> > We have a simple rootfs available if necessary:
> >
> > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> >
> > If there is any other information I can provide, please let me know!
> 
> Hmm... strange. Can you please try the following command line options
> and tell me which of these has the issue and which don't?

Sure thing!

> 1) deferred_probe_timeout=0

No issue.

> 2) deferred_probe_timeout=1
> 3) deferred_probe_timeout=300

Both of these appear to hang in the same way, I let each sit for five
minutes.

Cheers,
Nathan
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-21  0:04       ` Nathan Chancellor
@ 2022-05-21  0:15         ` Saravana Kannan via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan @ 2022-05-21  0:15 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Rafael J. Wysocki, Rob Herring, Linus Walleij, Will Deacon,
	Ulf Hansson, Kevin Hilman, Thierry Reding, Mark Brown,
	Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel, linux-s390

On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > >
> > > Hi Saravana,
> > >
> > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > The deferred probe timer that's used for this currently starts at
> > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > assumption being that all available drivers would be loaded and
> > > > registered before the timer expires. This means, the
> > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > worst case. But if we set the default value for it to cover the worst
> > > > case, it would significantly slow down the average case. For this
> > > > reason, the default value is set to 0.
> > > >
> > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > drivers will cause their consumer devices to always defer their probes.
> > > > This is because device links created by fw_devlink defer the probe even
> > > > before the consumer driver's probe() is called.
> > > >
> > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > timer on every successful driver registration, with the expectation more
> > > > modules would be loaded in the near future, then the default value of
> > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > time difference between two consecutive module loads.
> > > >
> > > > So let's implement that and set the default value to 10 seconds when
> > > > CONFIG_MODULES=y.
> > > >
> > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > Cc: Rob Herring <robh@kernel.org>
> > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > Cc: Will Deacon <will@kernel.org>
> > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > Cc: Mark Brown <broonie@kernel.org>
> > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > Cc: linux-gpio@vger.kernel.org
> > > > Cc: linux-pm@vger.kernel.org
> > > > Cc: iommu@lists.linux-foundation.org
> > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > >
> > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > timeout on driver registration") in next-20220520 (bisect log below).
> > >
> > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > >
> > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > qemu-system-s390x \
> > > -initrd ... \
> > > -M s390-ccw-virtio \
> > > -display none \
> > > -kernel arch/s390/boot/bzImage \
> > > -m 512m \
> > > -nodefaults \
> > > -serial mon:stdio
> > > ...
> > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > [    2.078795] Key type dns_resolver registered
> > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > [  260.626363] random: crng init done
> > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > >
> > > We have a simple rootfs available if necessary:
> > >
> > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > >
> > > If there is any other information I can provide, please let me know!
> >
> > Hmm... strange. Can you please try the following command line options
> > and tell me which of these has the issue and which don't?
>
> Sure thing!
>
> > 1) deferred_probe_timeout=0
>
> No issue.
>
> > 2) deferred_probe_timeout=1
> > 3) deferred_probe_timeout=300
>
> Both of these appear to hang in the same way, I let each sit for five
> minutes.

Strange that a sufficiently large timeout isn't helping. Is it trying
to boot off a network mount? I'll continue looking into this next
week.

-Saravana

>
> Cheers,
> Nathan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-21  0:15         ` Saravana Kannan via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan via iommu @ 2022-05-21  0:15 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Ulf Hansson, linux-doc, Linus Walleij, Pavel Machek, linux-s390,
	Will Deacon, Rob Herring, Jonathan Corbet, Rafael J. Wysocki,
	Geert Uytterhoeven, Thierry Reding, kernel-team, linux-pm,
	linux-gpio, Mark Brown, Greg Kroah-Hartman, Kevin Hilman,
	Rafael J. Wysocki, linux-kernel, Paul Kocialkowski, iommu

On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > >
> > > Hi Saravana,
> > >
> > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > The deferred probe timer that's used for this currently starts at
> > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > assumption being that all available drivers would be loaded and
> > > > registered before the timer expires. This means, the
> > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > worst case. But if we set the default value for it to cover the worst
> > > > case, it would significantly slow down the average case. For this
> > > > reason, the default value is set to 0.
> > > >
> > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > drivers will cause their consumer devices to always defer their probes.
> > > > This is because device links created by fw_devlink defer the probe even
> > > > before the consumer driver's probe() is called.
> > > >
> > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > timer on every successful driver registration, with the expectation more
> > > > modules would be loaded in the near future, then the default value of
> > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > time difference between two consecutive module loads.
> > > >
> > > > So let's implement that and set the default value to 10 seconds when
> > > > CONFIG_MODULES=y.
> > > >
> > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > Cc: Rob Herring <robh@kernel.org>
> > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > Cc: Will Deacon <will@kernel.org>
> > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > Cc: Mark Brown <broonie@kernel.org>
> > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > Cc: linux-gpio@vger.kernel.org
> > > > Cc: linux-pm@vger.kernel.org
> > > > Cc: iommu@lists.linux-foundation.org
> > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > >
> > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > timeout on driver registration") in next-20220520 (bisect log below).
> > >
> > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > >
> > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > qemu-system-s390x \
> > > -initrd ... \
> > > -M s390-ccw-virtio \
> > > -display none \
> > > -kernel arch/s390/boot/bzImage \
> > > -m 512m \
> > > -nodefaults \
> > > -serial mon:stdio
> > > ...
> > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > [    2.078795] Key type dns_resolver registered
> > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > [  260.626363] random: crng init done
> > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > >
> > > We have a simple rootfs available if necessary:
> > >
> > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > >
> > > If there is any other information I can provide, please let me know!
> >
> > Hmm... strange. Can you please try the following command line options
> > and tell me which of these has the issue and which don't?
>
> Sure thing!
>
> > 1) deferred_probe_timeout=0
>
> No issue.
>
> > 2) deferred_probe_timeout=1
> > 3) deferred_probe_timeout=300
>
> Both of these appear to hang in the same way, I let each sit for five
> minutes.

Strange that a sufficiently large timeout isn't helping. Is it trying
to boot off a network mount? I'll continue looking into this next
week.

-Saravana

>
> Cheers,
> Nathan
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-21  0:15         ` Saravana Kannan via iommu
@ 2022-05-23 15:17           ` Nathan Chancellor
  -1 siblings, 0 replies; 38+ messages in thread
From: Nathan Chancellor @ 2022-05-23 15:17 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Rafael J. Wysocki, Rob Herring, Linus Walleij, Will Deacon,
	Ulf Hansson, Kevin Hilman, Thierry Reding, Mark Brown,
	Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel, linux-s390

On Fri, May 20, 2022 at 05:15:55PM -0700, Saravana Kannan wrote:
> On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > >
> > > > Hi Saravana,
> > > >
> > > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > > The deferred probe timer that's used for this currently starts at
> > > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > > assumption being that all available drivers would be loaded and
> > > > > registered before the timer expires. This means, the
> > > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > > worst case. But if we set the default value for it to cover the worst
> > > > > case, it would significantly slow down the average case. For this
> > > > > reason, the default value is set to 0.
> > > > >
> > > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > > drivers will cause their consumer devices to always defer their probes.
> > > > > This is because device links created by fw_devlink defer the probe even
> > > > > before the consumer driver's probe() is called.
> > > > >
> > > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > > timer on every successful driver registration, with the expectation more
> > > > > modules would be loaded in the near future, then the default value of
> > > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > > time difference between two consecutive module loads.
> > > > >
> > > > > So let's implement that and set the default value to 10 seconds when
> > > > > CONFIG_MODULES=y.
> > > > >
> > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > > Cc: Rob Herring <robh@kernel.org>
> > > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > > Cc: Will Deacon <will@kernel.org>
> > > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > > Cc: Mark Brown <broonie@kernel.org>
> > > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > Cc: linux-gpio@vger.kernel.org
> > > > > Cc: linux-pm@vger.kernel.org
> > > > > Cc: iommu@lists.linux-foundation.org
> > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > >
> > > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > > timeout on driver registration") in next-20220520 (bisect log below).
> > > >
> > > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > > >
> > > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > > qemu-system-s390x \
> > > > -initrd ... \
> > > > -M s390-ccw-virtio \
> > > > -display none \
> > > > -kernel arch/s390/boot/bzImage \
> > > > -m 512m \
> > > > -nodefaults \
> > > > -serial mon:stdio
> > > > ...
> > > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > > [    2.078795] Key type dns_resolver registered
> > > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > > [  260.626363] random: crng init done
> > > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > > >
> > > > We have a simple rootfs available if necessary:
> > > >
> > > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > > >
> > > > If there is any other information I can provide, please let me know!
> > >
> > > Hmm... strange. Can you please try the following command line options
> > > and tell me which of these has the issue and which don't?
> >
> > Sure thing!
> >
> > > 1) deferred_probe_timeout=0
> >
> > No issue.
> >
> > > 2) deferred_probe_timeout=1
> > > 3) deferred_probe_timeout=300
> >
> > Both of these appear to hang in the same way, I let each sit for five
> > minutes.
> 
> Strange that a sufficiently large timeout isn't helping. Is it trying
> to boot off a network mount? I'll continue looking into this next
> week.

I don't think so, it seems like doing that requires some extra flags
that we do not have:

https://wiki.qemu.org/Features/S390xNetworkBoot

If you need any additional information or want something tested, please
let me know!

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-23 15:17           ` Nathan Chancellor
  0 siblings, 0 replies; 38+ messages in thread
From: Nathan Chancellor @ 2022-05-23 15:17 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Ulf Hansson, linux-doc, Linus Walleij, Pavel Machek, linux-s390,
	Will Deacon, Rob Herring, Jonathan Corbet, Rafael J. Wysocki,
	Geert Uytterhoeven, Thierry Reding, kernel-team, linux-pm,
	linux-gpio, Mark Brown, Greg Kroah-Hartman, Kevin Hilman,
	Rafael J. Wysocki, linux-kernel, Paul Kocialkowski, iommu

On Fri, May 20, 2022 at 05:15:55PM -0700, Saravana Kannan wrote:
> On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > >
> > > > Hi Saravana,
> > > >
> > > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > > The deferred probe timer that's used for this currently starts at
> > > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > > assumption being that all available drivers would be loaded and
> > > > > registered before the timer expires. This means, the
> > > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > > worst case. But if we set the default value for it to cover the worst
> > > > > case, it would significantly slow down the average case. For this
> > > > > reason, the default value is set to 0.
> > > > >
> > > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > > drivers will cause their consumer devices to always defer their probes.
> > > > > This is because device links created by fw_devlink defer the probe even
> > > > > before the consumer driver's probe() is called.
> > > > >
> > > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > > timer on every successful driver registration, with the expectation more
> > > > > modules would be loaded in the near future, then the default value of
> > > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > > time difference between two consecutive module loads.
> > > > >
> > > > > So let's implement that and set the default value to 10 seconds when
> > > > > CONFIG_MODULES=y.
> > > > >
> > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > > Cc: Rob Herring <robh@kernel.org>
> > > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > > Cc: Will Deacon <will@kernel.org>
> > > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > > Cc: Mark Brown <broonie@kernel.org>
> > > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > Cc: linux-gpio@vger.kernel.org
> > > > > Cc: linux-pm@vger.kernel.org
> > > > > Cc: iommu@lists.linux-foundation.org
> > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > >
> > > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > > timeout on driver registration") in next-20220520 (bisect log below).
> > > >
> > > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > > >
> > > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > > qemu-system-s390x \
> > > > -initrd ... \
> > > > -M s390-ccw-virtio \
> > > > -display none \
> > > > -kernel arch/s390/boot/bzImage \
> > > > -m 512m \
> > > > -nodefaults \
> > > > -serial mon:stdio
> > > > ...
> > > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > > [    2.078795] Key type dns_resolver registered
> > > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > > [  260.626363] random: crng init done
> > > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > > >
> > > > We have a simple rootfs available if necessary:
> > > >
> > > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > > >
> > > > If there is any other information I can provide, please let me know!
> > >
> > > Hmm... strange. Can you please try the following command line options
> > > and tell me which of these has the issue and which don't?
> >
> > Sure thing!
> >
> > > 1) deferred_probe_timeout=0
> >
> > No issue.
> >
> > > 2) deferred_probe_timeout=1
> > > 3) deferred_probe_timeout=300
> >
> > Both of these appear to hang in the same way, I let each sit for five
> > minutes.
> 
> Strange that a sufficiently large timeout isn't helping. Is it trying
> to boot off a network mount? I'll continue looking into this next
> week.

I don't think so, it seems like doing that requires some extra flags
that we do not have:

https://wiki.qemu.org/Features/S390xNetworkBoot

If you need any additional information or want something tested, please
let me know!

Cheers,
Nathan
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-23 15:17           ` Nathan Chancellor
@ 2022-05-23 20:04             ` Saravana Kannan via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan @ 2022-05-23 20:04 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Rafael J. Wysocki, Rob Herring, Linus Walleij, Will Deacon,
	Ulf Hansson, Kevin Hilman, Thierry Reding, Mark Brown,
	Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel, linux-s390

On Mon, May 23, 2022 at 8:17 AM Nathan Chancellor <nathan@kernel.org> wrote:
>
> On Fri, May 20, 2022 at 05:15:55PM -0700, Saravana Kannan wrote:
> > On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > >
> > > On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > > > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > > >
> > > > > Hi Saravana,
> > > > >
> > > > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > > > The deferred probe timer that's used for this currently starts at
> > > > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > > > assumption being that all available drivers would be loaded and
> > > > > > registered before the timer expires. This means, the
> > > > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > > > worst case. But if we set the default value for it to cover the worst
> > > > > > case, it would significantly slow down the average case. For this
> > > > > > reason, the default value is set to 0.
> > > > > >
> > > > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > > > drivers will cause their consumer devices to always defer their probes.
> > > > > > This is because device links created by fw_devlink defer the probe even
> > > > > > before the consumer driver's probe() is called.
> > > > > >
> > > > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > > > timer on every successful driver registration, with the expectation more
> > > > > > modules would be loaded in the near future, then the default value of
> > > > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > > > time difference between two consecutive module loads.
> > > > > >
> > > > > > So let's implement that and set the default value to 10 seconds when
> > > > > > CONFIG_MODULES=y.
> > > > > >
> > > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > > > Cc: Rob Herring <robh@kernel.org>
> > > > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > > > Cc: Will Deacon <will@kernel.org>
> > > > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > > > Cc: Mark Brown <broonie@kernel.org>
> > > > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > > Cc: linux-gpio@vger.kernel.org
> > > > > > Cc: linux-pm@vger.kernel.org
> > > > > > Cc: iommu@lists.linux-foundation.org
> > > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > > >
> > > > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > > > timeout on driver registration") in next-20220520 (bisect log below).
> > > > >
> > > > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > > > >
> > > > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > > > qemu-system-s390x \
> > > > > -initrd ... \
> > > > > -M s390-ccw-virtio \
> > > > > -display none \
> > > > > -kernel arch/s390/boot/bzImage \
> > > > > -m 512m \
> > > > > -nodefaults \
> > > > > -serial mon:stdio
> > > > > ...
> > > > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > > > [    2.078795] Key type dns_resolver registered
> > > > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > > > [  260.626363] random: crng init done
> > > > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > > > >
> > > > > We have a simple rootfs available if necessary:
> > > > >
> > > > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > > > >
> > > > > If there is any other information I can provide, please let me know!
> > > >
> > > > Hmm... strange. Can you please try the following command line options
> > > > and tell me which of these has the issue and which don't?
> > >
> > > Sure thing!
> > >
> > > > 1) deferred_probe_timeout=0
> > >
> > > No issue.
> > >
> > > > 2) deferred_probe_timeout=1
> > > > 3) deferred_probe_timeout=300
> > >
> > > Both of these appear to hang in the same way, I let each sit for five
> > > minutes.
> >
> > Strange that a sufficiently large timeout isn't helping. Is it trying
> > to boot off a network mount? I'll continue looking into this next
> > week.
>
> I don't think so, it seems like doing that requires some extra flags
> that we do not have:
>
> https://wiki.qemu.org/Features/S390xNetworkBoot
>
> If you need any additional information or want something tested, please
> let me know!

I'll try to get qemu going on my end, but I'm not too confident I'll
be able to get to it in a timely fashion. So if you can help figure
out where this boot process is hanging, that'd be very much
appreciated.

Couple of suggestions for debugging:

Can you add a log to "wait_for_device_probe()" and see if that's
getting called right before the boot process hangs? If it does, can
you get a stacktrace (I just add a WARN_ON(1) when I need a stack
trace)? It's unlikely this is the case because
deferred_probe_timeout=1 still causes an issue for you, but I'd be
good to rule out.

Let's try to rule out if deferred_probe_extend_timeout() is causing
some issues. So, without my patch, what happens if you set:
deferred_probe_timeout=1
deferred_probe_timeout=300

If deferred_probe_timeout=1 causes an issue even without my patch,
then in addition, can you try commenting out the call to
fw_devlink_drivers_done() inside deferred_probe_timeout_work_func()
and try again?

Thanks a lot for reporting and helping debug this issue.

-Saravana

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-23 20:04             ` Saravana Kannan via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan via iommu @ 2022-05-23 20:04 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Ulf Hansson, linux-doc, Linus Walleij, Pavel Machek, linux-s390,
	Will Deacon, Rob Herring, Jonathan Corbet, Rafael J. Wysocki,
	Geert Uytterhoeven, Thierry Reding, kernel-team, linux-pm,
	linux-gpio, Mark Brown, Greg Kroah-Hartman, Kevin Hilman,
	Rafael J. Wysocki, linux-kernel, Paul Kocialkowski, iommu

On Mon, May 23, 2022 at 8:17 AM Nathan Chancellor <nathan@kernel.org> wrote:
>
> On Fri, May 20, 2022 at 05:15:55PM -0700, Saravana Kannan wrote:
> > On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > >
> > > On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > > > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > > >
> > > > > Hi Saravana,
> > > > >
> > > > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > > > The deferred probe timer that's used for this currently starts at
> > > > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > > > assumption being that all available drivers would be loaded and
> > > > > > registered before the timer expires. This means, the
> > > > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > > > worst case. But if we set the default value for it to cover the worst
> > > > > > case, it would significantly slow down the average case. For this
> > > > > > reason, the default value is set to 0.
> > > > > >
> > > > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > > > drivers will cause their consumer devices to always defer their probes.
> > > > > > This is because device links created by fw_devlink defer the probe even
> > > > > > before the consumer driver's probe() is called.
> > > > > >
> > > > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > > > timer on every successful driver registration, with the expectation more
> > > > > > modules would be loaded in the near future, then the default value of
> > > > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > > > time difference between two consecutive module loads.
> > > > > >
> > > > > > So let's implement that and set the default value to 10 seconds when
> > > > > > CONFIG_MODULES=y.
> > > > > >
> > > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > > > Cc: Rob Herring <robh@kernel.org>
> > > > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > > > Cc: Will Deacon <will@kernel.org>
> > > > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > > > Cc: Mark Brown <broonie@kernel.org>
> > > > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > > Cc: linux-gpio@vger.kernel.org
> > > > > > Cc: linux-pm@vger.kernel.org
> > > > > > Cc: iommu@lists.linux-foundation.org
> > > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > > >
> > > > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > > > timeout on driver registration") in next-20220520 (bisect log below).
> > > > >
> > > > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > > > >
> > > > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > > > qemu-system-s390x \
> > > > > -initrd ... \
> > > > > -M s390-ccw-virtio \
> > > > > -display none \
> > > > > -kernel arch/s390/boot/bzImage \
> > > > > -m 512m \
> > > > > -nodefaults \
> > > > > -serial mon:stdio
> > > > > ...
> > > > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > > > [    2.078795] Key type dns_resolver registered
> > > > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > > > [  260.626363] random: crng init done
> > > > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > > > >
> > > > > We have a simple rootfs available if necessary:
> > > > >
> > > > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > > > >
> > > > > If there is any other information I can provide, please let me know!
> > > >
> > > > Hmm... strange. Can you please try the following command line options
> > > > and tell me which of these has the issue and which don't?
> > >
> > > Sure thing!
> > >
> > > > 1) deferred_probe_timeout=0
> > >
> > > No issue.
> > >
> > > > 2) deferred_probe_timeout=1
> > > > 3) deferred_probe_timeout=300
> > >
> > > Both of these appear to hang in the same way, I let each sit for five
> > > minutes.
> >
> > Strange that a sufficiently large timeout isn't helping. Is it trying
> > to boot off a network mount? I'll continue looking into this next
> > week.
>
> I don't think so, it seems like doing that requires some extra flags
> that we do not have:
>
> https://wiki.qemu.org/Features/S390xNetworkBoot
>
> If you need any additional information or want something tested, please
> let me know!

I'll try to get qemu going on my end, but I'm not too confident I'll
be able to get to it in a timely fashion. So if you can help figure
out where this boot process is hanging, that'd be very much
appreciated.

Couple of suggestions for debugging:

Can you add a log to "wait_for_device_probe()" and see if that's
getting called right before the boot process hangs? If it does, can
you get a stacktrace (I just add a WARN_ON(1) when I need a stack
trace)? It's unlikely this is the case because
deferred_probe_timeout=1 still causes an issue for you, but I'd be
good to rule out.

Let's try to rule out if deferred_probe_extend_timeout() is causing
some issues. So, without my patch, what happens if you set:
deferred_probe_timeout=1
deferred_probe_timeout=300

If deferred_probe_timeout=1 causes an issue even without my patch,
then in addition, can you try commenting out the call to
fw_devlink_drivers_done() inside deferred_probe_timeout_work_func()
and try again?

Thanks a lot for reporting and helping debug this issue.

-Saravana
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-23 20:04             ` Saravana Kannan via iommu
@ 2022-05-23 22:14               ` Nathan Chancellor
  -1 siblings, 0 replies; 38+ messages in thread
From: Nathan Chancellor @ 2022-05-23 22:14 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Rafael J. Wysocki, Rob Herring, Linus Walleij, Will Deacon,
	Ulf Hansson, Kevin Hilman, Thierry Reding, Mark Brown,
	Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel, linux-s390

On Mon, May 23, 2022 at 01:04:03PM -0700, Saravana Kannan wrote:
> On Mon, May 23, 2022 at 8:17 AM Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > On Fri, May 20, 2022 at 05:15:55PM -0700, Saravana Kannan wrote:
> > > On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > >
> > > > On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > > > > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > > > >
> > > > > > Hi Saravana,
> > > > > >
> > > > > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > > > > The deferred probe timer that's used for this currently starts at
> > > > > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > > > > assumption being that all available drivers would be loaded and
> > > > > > > registered before the timer expires. This means, the
> > > > > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > > > > worst case. But if we set the default value for it to cover the worst
> > > > > > > case, it would significantly slow down the average case. For this
> > > > > > > reason, the default value is set to 0.
> > > > > > >
> > > > > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > > > > drivers will cause their consumer devices to always defer their probes.
> > > > > > > This is because device links created by fw_devlink defer the probe even
> > > > > > > before the consumer driver's probe() is called.
> > > > > > >
> > > > > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > > > > timer on every successful driver registration, with the expectation more
> > > > > > > modules would be loaded in the near future, then the default value of
> > > > > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > > > > time difference between two consecutive module loads.
> > > > > > >
> > > > > > > So let's implement that and set the default value to 10 seconds when
> > > > > > > CONFIG_MODULES=y.
> > > > > > >
> > > > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > > > > Cc: Rob Herring <robh@kernel.org>
> > > > > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > > > > Cc: Will Deacon <will@kernel.org>
> > > > > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > > > > Cc: Mark Brown <broonie@kernel.org>
> > > > > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > > > Cc: linux-gpio@vger.kernel.org
> > > > > > > Cc: linux-pm@vger.kernel.org
> > > > > > > Cc: iommu@lists.linux-foundation.org
> > > > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > > > >
> > > > > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > > > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > > > > timeout on driver registration") in next-20220520 (bisect log below).
> > > > > >
> > > > > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > > > > >
> > > > > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > > > > qemu-system-s390x \
> > > > > > -initrd ... \
> > > > > > -M s390-ccw-virtio \
> > > > > > -display none \
> > > > > > -kernel arch/s390/boot/bzImage \
> > > > > > -m 512m \
> > > > > > -nodefaults \
> > > > > > -serial mon:stdio
> > > > > > ...
> > > > > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > > > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > > > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > > > > [    2.078795] Key type dns_resolver registered
> > > > > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > > > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > > > > [  260.626363] random: crng init done
> > > > > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > > > > >
> > > > > > We have a simple rootfs available if necessary:
> > > > > >
> > > > > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > > > > >
> > > > > > If there is any other information I can provide, please let me know!
> > > > >
> > > > > Hmm... strange. Can you please try the following command line options
> > > > > and tell me which of these has the issue and which don't?
> > > >
> > > > Sure thing!
> > > >
> > > > > 1) deferred_probe_timeout=0
> > > >
> > > > No issue.
> > > >
> > > > > 2) deferred_probe_timeout=1
> > > > > 3) deferred_probe_timeout=300
> > > >
> > > > Both of these appear to hang in the same way, I let each sit for five
> > > > minutes.
> > >
> > > Strange that a sufficiently large timeout isn't helping. Is it trying
> > > to boot off a network mount? I'll continue looking into this next
> > > week.
> >
> > I don't think so, it seems like doing that requires some extra flags
> > that we do not have:
> >
> > https://wiki.qemu.org/Features/S390xNetworkBoot
> >
> > If you need any additional information or want something tested, please
> > let me know!
> 
> I'll try to get qemu going on my end, but I'm not too confident I'll
> be able to get to it in a timely fashion. So if you can help figure
> out where this boot process is hanging, that'd be very much
> appreciated.

Sure thing! Information included below, I am more than happy to continue
to test and debug as you need.

> Couple of suggestions for debugging:
> 
> Can you add a log to "wait_for_device_probe()" and see if that's
> getting called right before the boot process hangs? If it does, can
> you get a stacktrace (I just add a WARN_ON(1) when I need a stack
> trace)? It's unlikely this is the case because
> deferred_probe_timeout=1 still causes an issue for you, but I'd be
> good to rule out.

If I add a pr_info() call at the top of wait_for_device_probe(), I see
it right before the process hangs. Adding WARN_ON(1) right below that
reveals dasd_eckd_init() in drivers/s390/block/dasd_eckd.c calls
wait_for_device_probe():

[    4.610397] ------------[ cut here ]------------
[    4.610520] WARNING: CPU: 0 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x28/0x110
[    4.611134] Modules linked in:
[    4.611593] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.18.0-next-20220523-dirty #1
[    4.611830] Hardware name: QEMU 8561 QEMU (KVM/Linux)
[    4.612017] Krnl PSW : 0704c00180000000 0000000000ce4b3c (wait_for_device_probe+0x2c/0x110)
[    4.612258]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[    4.612387] Krnl GPRS: 80000000fffff071 0000000000000027 000000000000000c 00000000017f91d8
[    4.612457]            00000000fffff071 00000000017f9218 0000000001a655a0 0000000000000006
[    4.612521]            0000000000000002 0000000001965810 00000000019d51a0 0000000000000000
[    4.612585]            0000000002218000 000000000125bcc8 0000000000ce4b38 000003800000bc80
[    4.614814] Krnl Code: 0000000000ce4b2c: e3e0f0980024        stg     %r14,152(%r15)
[    4.614814]            0000000000ce4b32: c0e5ffff94cb        brasl   %r14,0000000000cd74c8
[    4.614814]           #0000000000ce4b38: af000000            mc      0,0
[    4.614814]           >0000000000ce4b3c: c0100054d1fa        larl    %r1,000000000177ef30
[    4.614814]            0000000000ce4b42: e31010000012        lt      %r1,0(%r1)
[    4.614814]            0000000000ce4b48: a784002d            brc     8,0000000000ce4ba2
[    4.614814]            0000000000ce4b4c: d727f0a0f0a0        xc      160(40,%r15),160(%r15)
[    4.614814]            0000000000ce4b52: 41b0f0a0            la      %r11,160(%r15)
[    4.615698] Call Trace:
[    4.616559]  [<0000000000ce4b3c>] wait_for_device_probe+0x2c/0x110
[    4.616744] ([<0000000000ce4b38>] wait_for_device_probe+0x28/0x110)
[    4.616841]  [<000000000196593e>] dasd_eckd_init+0x12e/0x178
[    4.616913]  [<0000000000100936>] do_one_initcall+0x46/0x1e8
[    4.616983]  [<0000000001920706>] do_initcalls+0x126/0x150
[    4.617046]  [<000000000192095e>] kernel_init_freeable+0x1ae/0x1f0
[    4.617110]  [<0000000000ce85a6>] kernel_init+0x2e/0x168
[    4.617171]  [<0000000000103320>] __ret_from_fork+0x40/0x58
[    4.617233]  [<0000000000cf5eaa>] ret_from_fork+0xa/0x40
[    4.617352] Last Breaking-Event-Address:
[    4.617393]  [<0000000000e0e098>] __s390_indirect_jump_r14+0x0/0xc
[    4.617481] ---[ end trace 0000000000000000 ]---

> Let's try to rule out if deferred_probe_extend_timeout() is causing
> some issues. So, without my patch, what happens if you set:
> deferred_probe_timeout=1
> deferred_probe_timeout=300

At commit 6ee60e9c9f2f ("MAINTAINERS: add Russ Weight as a firmware
loader maintainer"), both deferred_probe_timeout=1 and
deferred_probe_timeout=300 hang the boot.

> If deferred_probe_timeout=1 causes an issue even without my patch,
> then in addition, can you try commenting out the call to
> fw_devlink_drivers_done() inside deferred_probe_timeout_work_func()
> and try again?

Sure, that does not appear to make a difference with
deferred_probe_timeout=1.

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-23 22:14               ` Nathan Chancellor
  0 siblings, 0 replies; 38+ messages in thread
From: Nathan Chancellor @ 2022-05-23 22:14 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Ulf Hansson, linux-doc, Linus Walleij, Pavel Machek, linux-s390,
	Will Deacon, Rob Herring, Jonathan Corbet, Rafael J. Wysocki,
	Geert Uytterhoeven, Thierry Reding, kernel-team, linux-pm,
	linux-gpio, Mark Brown, Greg Kroah-Hartman, Kevin Hilman,
	Rafael J. Wysocki, linux-kernel, Paul Kocialkowski, iommu

On Mon, May 23, 2022 at 01:04:03PM -0700, Saravana Kannan wrote:
> On Mon, May 23, 2022 at 8:17 AM Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > On Fri, May 20, 2022 at 05:15:55PM -0700, Saravana Kannan wrote:
> > > On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > >
> > > > On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > > > > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > > > >
> > > > > > Hi Saravana,
> > > > > >
> > > > > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > > > > The deferred probe timer that's used for this currently starts at
> > > > > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > > > > assumption being that all available drivers would be loaded and
> > > > > > > registered before the timer expires. This means, the
> > > > > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > > > > worst case. But if we set the default value for it to cover the worst
> > > > > > > case, it would significantly slow down the average case. For this
> > > > > > > reason, the default value is set to 0.
> > > > > > >
> > > > > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > > > > drivers will cause their consumer devices to always defer their probes.
> > > > > > > This is because device links created by fw_devlink defer the probe even
> > > > > > > before the consumer driver's probe() is called.
> > > > > > >
> > > > > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > > > > timer on every successful driver registration, with the expectation more
> > > > > > > modules would be loaded in the near future, then the default value of
> > > > > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > > > > time difference between two consecutive module loads.
> > > > > > >
> > > > > > > So let's implement that and set the default value to 10 seconds when
> > > > > > > CONFIG_MODULES=y.
> > > > > > >
> > > > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > > > > Cc: Rob Herring <robh@kernel.org>
> > > > > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > > > > Cc: Will Deacon <will@kernel.org>
> > > > > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > > > > Cc: Mark Brown <broonie@kernel.org>
> > > > > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > > > Cc: linux-gpio@vger.kernel.org
> > > > > > > Cc: linux-pm@vger.kernel.org
> > > > > > > Cc: iommu@lists.linux-foundation.org
> > > > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > > > >
> > > > > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > > > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > > > > timeout on driver registration") in next-20220520 (bisect log below).
> > > > > >
> > > > > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > > > > >
> > > > > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > > > > qemu-system-s390x \
> > > > > > -initrd ... \
> > > > > > -M s390-ccw-virtio \
> > > > > > -display none \
> > > > > > -kernel arch/s390/boot/bzImage \
> > > > > > -m 512m \
> > > > > > -nodefaults \
> > > > > > -serial mon:stdio
> > > > > > ...
> > > > > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > > > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > > > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > > > > [    2.078795] Key type dns_resolver registered
> > > > > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > > > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > > > > [  260.626363] random: crng init done
> > > > > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > > > > >
> > > > > > We have a simple rootfs available if necessary:
> > > > > >
> > > > > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > > > > >
> > > > > > If there is any other information I can provide, please let me know!
> > > > >
> > > > > Hmm... strange. Can you please try the following command line options
> > > > > and tell me which of these has the issue and which don't?
> > > >
> > > > Sure thing!
> > > >
> > > > > 1) deferred_probe_timeout=0
> > > >
> > > > No issue.
> > > >
> > > > > 2) deferred_probe_timeout=1
> > > > > 3) deferred_probe_timeout=300
> > > >
> > > > Both of these appear to hang in the same way, I let each sit for five
> > > > minutes.
> > >
> > > Strange that a sufficiently large timeout isn't helping. Is it trying
> > > to boot off a network mount? I'll continue looking into this next
> > > week.
> >
> > I don't think so, it seems like doing that requires some extra flags
> > that we do not have:
> >
> > https://wiki.qemu.org/Features/S390xNetworkBoot
> >
> > If you need any additional information or want something tested, please
> > let me know!
> 
> I'll try to get qemu going on my end, but I'm not too confident I'll
> be able to get to it in a timely fashion. So if you can help figure
> out where this boot process is hanging, that'd be very much
> appreciated.

Sure thing! Information included below, I am more than happy to continue
to test and debug as you need.

> Couple of suggestions for debugging:
> 
> Can you add a log to "wait_for_device_probe()" and see if that's
> getting called right before the boot process hangs? If it does, can
> you get a stacktrace (I just add a WARN_ON(1) when I need a stack
> trace)? It's unlikely this is the case because
> deferred_probe_timeout=1 still causes an issue for you, but I'd be
> good to rule out.

If I add a pr_info() call at the top of wait_for_device_probe(), I see
it right before the process hangs. Adding WARN_ON(1) right below that
reveals dasd_eckd_init() in drivers/s390/block/dasd_eckd.c calls
wait_for_device_probe():

[    4.610397] ------------[ cut here ]------------
[    4.610520] WARNING: CPU: 0 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x28/0x110
[    4.611134] Modules linked in:
[    4.611593] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.18.0-next-20220523-dirty #1
[    4.611830] Hardware name: QEMU 8561 QEMU (KVM/Linux)
[    4.612017] Krnl PSW : 0704c00180000000 0000000000ce4b3c (wait_for_device_probe+0x2c/0x110)
[    4.612258]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[    4.612387] Krnl GPRS: 80000000fffff071 0000000000000027 000000000000000c 00000000017f91d8
[    4.612457]            00000000fffff071 00000000017f9218 0000000001a655a0 0000000000000006
[    4.612521]            0000000000000002 0000000001965810 00000000019d51a0 0000000000000000
[    4.612585]            0000000002218000 000000000125bcc8 0000000000ce4b38 000003800000bc80
[    4.614814] Krnl Code: 0000000000ce4b2c: e3e0f0980024        stg     %r14,152(%r15)
[    4.614814]            0000000000ce4b32: c0e5ffff94cb        brasl   %r14,0000000000cd74c8
[    4.614814]           #0000000000ce4b38: af000000            mc      0,0
[    4.614814]           >0000000000ce4b3c: c0100054d1fa        larl    %r1,000000000177ef30
[    4.614814]            0000000000ce4b42: e31010000012        lt      %r1,0(%r1)
[    4.614814]            0000000000ce4b48: a784002d            brc     8,0000000000ce4ba2
[    4.614814]            0000000000ce4b4c: d727f0a0f0a0        xc      160(40,%r15),160(%r15)
[    4.614814]            0000000000ce4b52: 41b0f0a0            la      %r11,160(%r15)
[    4.615698] Call Trace:
[    4.616559]  [<0000000000ce4b3c>] wait_for_device_probe+0x2c/0x110
[    4.616744] ([<0000000000ce4b38>] wait_for_device_probe+0x28/0x110)
[    4.616841]  [<000000000196593e>] dasd_eckd_init+0x12e/0x178
[    4.616913]  [<0000000000100936>] do_one_initcall+0x46/0x1e8
[    4.616983]  [<0000000001920706>] do_initcalls+0x126/0x150
[    4.617046]  [<000000000192095e>] kernel_init_freeable+0x1ae/0x1f0
[    4.617110]  [<0000000000ce85a6>] kernel_init+0x2e/0x168
[    4.617171]  [<0000000000103320>] __ret_from_fork+0x40/0x58
[    4.617233]  [<0000000000cf5eaa>] ret_from_fork+0xa/0x40
[    4.617352] Last Breaking-Event-Address:
[    4.617393]  [<0000000000e0e098>] __s390_indirect_jump_r14+0x0/0xc
[    4.617481] ---[ end trace 0000000000000000 ]---

> Let's try to rule out if deferred_probe_extend_timeout() is causing
> some issues. So, without my patch, what happens if you set:
> deferred_probe_timeout=1
> deferred_probe_timeout=300

At commit 6ee60e9c9f2f ("MAINTAINERS: add Russ Weight as a firmware
loader maintainer"), both deferred_probe_timeout=1 and
deferred_probe_timeout=300 hang the boot.

> If deferred_probe_timeout=1 causes an issue even without my patch,
> then in addition, can you try commenting out the call to
> fw_devlink_drivers_done() inside deferred_probe_timeout_work_func()
> and try again?

Sure, that does not appear to make a difference with
deferred_probe_timeout=1.

Cheers,
Nathan
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-23 22:14               ` Nathan Chancellor
@ 2022-05-24  3:43                 ` Saravana Kannan via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan @ 2022-05-24  3:43 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Jonathan Corbet, Greg Kroah-Hartman, Rafael J. Wysocki,
	Rafael J. Wysocki, Rob Herring, Linus Walleij, Will Deacon,
	Ulf Hansson, Kevin Hilman, Thierry Reding, Mark Brown,
	Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel, linux-s390, John Stultz

On Mon, May 23, 2022 at 3:14 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> On Mon, May 23, 2022 at 01:04:03PM -0700, Saravana Kannan wrote:
> > On Mon, May 23, 2022 at 8:17 AM Nathan Chancellor <nathan@kernel.org> wrote:
> > >
> > > On Fri, May 20, 2022 at 05:15:55PM -0700, Saravana Kannan wrote:
> > > > On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > > >
> > > > > On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > > > > > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > > > > >
> > > > > > > Hi Saravana,
> > > > > > >
> > > > > > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > > > > > The deferred probe timer that's used for this currently starts at
> > > > > > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > > > > > assumption being that all available drivers would be loaded and
> > > > > > > > registered before the timer expires. This means, the
> > > > > > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > > > > > worst case. But if we set the default value for it to cover the worst
> > > > > > > > case, it would significantly slow down the average case. For this
> > > > > > > > reason, the default value is set to 0.
> > > > > > > >
> > > > > > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > > > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > > > > > drivers will cause their consumer devices to always defer their probes.
> > > > > > > > This is because device links created by fw_devlink defer the probe even
> > > > > > > > before the consumer driver's probe() is called.
> > > > > > > >
> > > > > > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > > > > > timer on every successful driver registration, with the expectation more
> > > > > > > > modules would be loaded in the near future, then the default value of
> > > > > > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > > > > > time difference between two consecutive module loads.
> > > > > > > >
> > > > > > > > So let's implement that and set the default value to 10 seconds when
> > > > > > > > CONFIG_MODULES=y.
> > > > > > > >
> > > > > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > > > > > Cc: Rob Herring <robh@kernel.org>
> > > > > > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > > > > > Cc: Will Deacon <will@kernel.org>
> > > > > > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > > > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > > > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > > > > > Cc: Mark Brown <broonie@kernel.org>
> > > > > > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > > > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > > > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > > > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > > > > Cc: linux-gpio@vger.kernel.org
> > > > > > > > Cc: linux-pm@vger.kernel.org
> > > > > > > > Cc: iommu@lists.linux-foundation.org
> > > > > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > > > > >
> > > > > > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > > > > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > > > > > timeout on driver registration") in next-20220520 (bisect log below).
> > > > > > >
> > > > > > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > > > > > >
> > > > > > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > > > > > qemu-system-s390x \
> > > > > > > -initrd ... \
> > > > > > > -M s390-ccw-virtio \
> > > > > > > -display none \
> > > > > > > -kernel arch/s390/boot/bzImage \
> > > > > > > -m 512m \
> > > > > > > -nodefaults \
> > > > > > > -serial mon:stdio
> > > > > > > ...
> > > > > > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > > > > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > > > > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > > > > > [    2.078795] Key type dns_resolver registered
> > > > > > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > > > > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > > > > > [  260.626363] random: crng init done
> > > > > > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > > > > > >
> > > > > > > We have a simple rootfs available if necessary:
> > > > > > >
> > > > > > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > > > > > >
> > > > > > > If there is any other information I can provide, please let me know!
> > > > > >
> > > > > > Hmm... strange. Can you please try the following command line options
> > > > > > and tell me which of these has the issue and which don't?
> > > > >
> > > > > Sure thing!
> > > > >
> > > > > > 1) deferred_probe_timeout=0
> > > > >
> > > > > No issue.
> > > > >
> > > > > > 2) deferred_probe_timeout=1
> > > > > > 3) deferred_probe_timeout=300
> > > > >
> > > > > Both of these appear to hang in the same way, I let each sit for five
> > > > > minutes.
> > > >
> > > > Strange that a sufficiently large timeout isn't helping. Is it trying
> > > > to boot off a network mount? I'll continue looking into this next
> > > > week.
> > >
> > > I don't think so, it seems like doing that requires some extra flags
> > > that we do not have:
> > >
> > > https://wiki.qemu.org/Features/S390xNetworkBoot
> > >
> > > If you need any additional information or want something tested, please
> > > let me know!
> >
> > I'll try to get qemu going on my end, but I'm not too confident I'll
> > be able to get to it in a timely fashion. So if you can help figure
> > out where this boot process is hanging, that'd be very much
> > appreciated.
>
> Sure thing! Information included below, I am more than happy to continue
> to test and debug as you need.
>
> > Couple of suggestions for debugging:
> >
> > Can you add a log to "wait_for_device_probe()" and see if that's
> > getting called right before the boot process hangs? If it does, can
> > you get a stacktrace (I just add a WARN_ON(1) when I need a stack
> > trace)? It's unlikely this is the case because
> > deferred_probe_timeout=1 still causes an issue for you, but I'd be
> > good to rule out.
>
> If I add a pr_info() call at the top of wait_for_device_probe(), I see
> it right before the process hangs. Adding WARN_ON(1) right below that
> reveals dasd_eckd_init() in drivers/s390/block/dasd_eckd.c calls
> wait_for_device_probe():
>
> [    4.610397] ------------[ cut here ]------------
> [    4.610520] WARNING: CPU: 0 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x28/0x110
> [    4.611134] Modules linked in:
> [    4.611593] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.18.0-next-20220523-dirty #1
> [    4.611830] Hardware name: QEMU 8561 QEMU (KVM/Linux)
> [    4.612017] Krnl PSW : 0704c00180000000 0000000000ce4b3c (wait_for_device_probe+0x2c/0x110)
> [    4.612258]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
> [    4.612387] Krnl GPRS: 80000000fffff071 0000000000000027 000000000000000c 00000000017f91d8
> [    4.612457]            00000000fffff071 00000000017f9218 0000000001a655a0 0000000000000006
> [    4.612521]            0000000000000002 0000000001965810 00000000019d51a0 0000000000000000
> [    4.612585]            0000000002218000 000000000125bcc8 0000000000ce4b38 000003800000bc80
> [    4.614814] Krnl Code: 0000000000ce4b2c: e3e0f0980024        stg     %r14,152(%r15)
> [    4.614814]            0000000000ce4b32: c0e5ffff94cb        brasl   %r14,0000000000cd74c8
> [    4.614814]           #0000000000ce4b38: af000000            mc      0,0
> [    4.614814]           >0000000000ce4b3c: c0100054d1fa        larl    %r1,000000000177ef30
> [    4.614814]            0000000000ce4b42: e31010000012        lt      %r1,0(%r1)
> [    4.614814]            0000000000ce4b48: a784002d            brc     8,0000000000ce4ba2
> [    4.614814]            0000000000ce4b4c: d727f0a0f0a0        xc      160(40,%r15),160(%r15)
> [    4.614814]            0000000000ce4b52: 41b0f0a0            la      %r11,160(%r15)
> [    4.615698] Call Trace:
> [    4.616559]  [<0000000000ce4b3c>] wait_for_device_probe+0x2c/0x110
> [    4.616744] ([<0000000000ce4b38>] wait_for_device_probe+0x28/0x110)
> [    4.616841]  [<000000000196593e>] dasd_eckd_init+0x12e/0x178
> [    4.616913]  [<0000000000100936>] do_one_initcall+0x46/0x1e8
> [    4.616983]  [<0000000001920706>] do_initcalls+0x126/0x150
> [    4.617046]  [<000000000192095e>] kernel_init_freeable+0x1ae/0x1f0
> [    4.617110]  [<0000000000ce85a6>] kernel_init+0x2e/0x168
> [    4.617171]  [<0000000000103320>] __ret_from_fork+0x40/0x58
> [    4.617233]  [<0000000000cf5eaa>] ret_from_fork+0xa/0x40
> [    4.617352] Last Breaking-Event-Address:
> [    4.617393]  [<0000000000e0e098>] __s390_indirect_jump_r14+0x0/0xc
> [    4.617481] ---[ end trace 0000000000000000 ]---
>
> > Let's try to rule out if deferred_probe_extend_timeout() is causing
> > some issues. So, without my patch, what happens if you set:
> > deferred_probe_timeout=1
> > deferred_probe_timeout=300
>
> At commit 6ee60e9c9f2f ("MAINTAINERS: add Russ Weight as a firmware
> loader maintainer"), both deferred_probe_timeout=1 and
> deferred_probe_timeout=300 hang the boot.

Thanks for all the help. I think I know what's going on.

If you revert the following commit, then you'll see that your device
no longer hangs with my changes.
35a672363ab3 driver core: Ensure wait_for_device_probe() waits until
the deferred_probe_timeout fires

That commit was made to make sure NFS mounts continue to work even
when deferred_probe_timeout is set to a value greater than the IP auto
config timeout (I think that's 12 seconds). While that definitely
helps the NFS mount case, unfortunately wait_for_device_probe() is
used in some initcalls that happen before late_initcall(). This causes
a deadlock -- earlier initcalls wait for
late_initcall(deferred_probe_initcall) to run but we can't get to
late_initcall() if earlier initcalls don't finish.

Even if wait_for_device_probe() is used only in late_initcalls(), it
can still hang the kernel boot based on the link order of the .c
files as that determined the call order between late_initcalls().

It looks like all the existing users of wait_for_device_probe() seem
to try and "wait for currently probing devices to finish probing", but
the commit mentioned above changes the semantics to "wait until every
device that'll eventually probe finishes probing". I think the
solution is to revert the commit and then do this additional waiting
just in ip_auto_config(). I'll need to think about this a bit before
I'm sure the solution makes sense.

Thanks for all the help so far.

-Saravana




-Saravana



>
> > If deferred_probe_timeout=1 causes an issue even without my patch,
> > then in addition, can you try commenting out the call to
> > fw_devlink_drivers_done() inside deferred_probe_timeout_work_func()
> > and try again?
>
> Sure, that does not appear to make a difference with
> deferred_probe_timeout=1.
>
> Cheers,
> Nathan

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-24  3:43                 ` Saravana Kannan via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan via iommu @ 2022-05-24  3:43 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: Ulf Hansson, linux-doc, Linus Walleij, John Stultz, Pavel Machek,
	linux-s390, Will Deacon, Rob Herring, Jonathan Corbet,
	Rafael J. Wysocki, Geert Uytterhoeven, Thierry Reding,
	kernel-team, linux-pm, linux-gpio, Mark Brown,
	Greg Kroah-Hartman, Kevin Hilman, Rafael J. Wysocki,
	linux-kernel, Paul Kocialkowski, iommu

On Mon, May 23, 2022 at 3:14 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> On Mon, May 23, 2022 at 01:04:03PM -0700, Saravana Kannan wrote:
> > On Mon, May 23, 2022 at 8:17 AM Nathan Chancellor <nathan@kernel.org> wrote:
> > >
> > > On Fri, May 20, 2022 at 05:15:55PM -0700, Saravana Kannan wrote:
> > > > On Fri, May 20, 2022 at 5:04 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > > >
> > > > > On Fri, May 20, 2022 at 04:49:48PM -0700, Saravana Kannan wrote:
> > > > > > On Fri, May 20, 2022 at 4:30 PM Nathan Chancellor <nathan@kernel.org> wrote:
> > > > > > >
> > > > > > > Hi Saravana,
> > > > > > >
> > > > > > > On Fri, Apr 29, 2022 at 03:09:32PM -0700, Saravana Kannan wrote:
> > > > > > > > The deferred probe timer that's used for this currently starts at
> > > > > > > > late_initcall and runs for driver_deferred_probe_timeout seconds. The
> > > > > > > > assumption being that all available drivers would be loaded and
> > > > > > > > registered before the timer expires. This means, the
> > > > > > > > driver_deferred_probe_timeout has to be pretty large for it to cover the
> > > > > > > > worst case. But if we set the default value for it to cover the worst
> > > > > > > > case, it would significantly slow down the average case. For this
> > > > > > > > reason, the default value is set to 0.
> > > > > > > >
> > > > > > > > Also, with CONFIG_MODULES=y and the current default values of
> > > > > > > > driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
> > > > > > > > drivers will cause their consumer devices to always defer their probes.
> > > > > > > > This is because device links created by fw_devlink defer the probe even
> > > > > > > > before the consumer driver's probe() is called.
> > > > > > > >
> > > > > > > > Instead of a fixed timeout, if we extend an unexpired deferred probe
> > > > > > > > timer on every successful driver registration, with the expectation more
> > > > > > > > modules would be loaded in the near future, then the default value of
> > > > > > > > driver_deferred_probe_timeout only needs to be as long as the worst case
> > > > > > > > time difference between two consecutive module loads.
> > > > > > > >
> > > > > > > > So let's implement that and set the default value to 10 seconds when
> > > > > > > > CONFIG_MODULES=y.
> > > > > > > >
> > > > > > > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > > > > > > Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> > > > > > > > Cc: Rob Herring <robh@kernel.org>
> > > > > > > > Cc: Linus Walleij <linus.walleij@linaro.org>
> > > > > > > > Cc: Will Deacon <will@kernel.org>
> > > > > > > > Cc: Ulf Hansson <ulf.hansson@linaro.org>
> > > > > > > > Cc: Kevin Hilman <khilman@kernel.org>
> > > > > > > > Cc: Thierry Reding <treding@nvidia.com>
> > > > > > > > Cc: Mark Brown <broonie@kernel.org>
> > > > > > > > Cc: Pavel Machek <pavel@ucw.cz>
> > > > > > > > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > > > > > > > Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > > > > > > > Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > > > > > Cc: linux-gpio@vger.kernel.org
> > > > > > > > Cc: linux-pm@vger.kernel.org
> > > > > > > > Cc: iommu@lists.linux-foundation.org
> > > > > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > > > > >
> > > > > > > I bisected a boot hang with ARCH=s390 defconfig in QEMU down to this
> > > > > > > change as commit 2b28a1a84a0e ("driver core: Extend deferred probe
> > > > > > > timeout on driver registration") in next-20220520 (bisect log below).
> > > > > > >
> > > > > > > $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390x-linux-gnu- defconfig bzImage
> > > > > > >
> > > > > > > $ timeout --foreground 15m stdbuf -oL -eL \
> > > > > > > qemu-system-s390x \
> > > > > > > -initrd ... \
> > > > > > > -M s390-ccw-virtio \
> > > > > > > -display none \
> > > > > > > -kernel arch/s390/boot/bzImage \
> > > > > > > -m 512m \
> > > > > > > -nodefaults \
> > > > > > > -serial mon:stdio
> > > > > > > ...
> > > > > > > [    2.077303] In-situ OAM (IOAM) with IPv6
> > > > > > > [    2.077639] NET: Registered PF_PACKET protocol family
> > > > > > > [    2.078063] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
> > > > > > > [    2.078795] Key type dns_resolver registered
> > > > > > > [    2.079317] cio: Channel measurement facility initialized using format extended (mode autodetected)
> > > > > > > [    2.081494] Discipline DIAG cannot be used without z/VM
> > > > > > > [  260.626363] random: crng init done
> > > > > > > qemu-system-s390x: terminating on signal 15 from pid 3815762 (timeout)
> > > > > > >
> > > > > > > We have a simple rootfs available if necessary:
> > > > > > >
> > > > > > > https://github.com/ClangBuiltLinux/boot-utils/raw/bc0d17785eb67f1edd0ee0a134970a807895f741/images/s390/rootfs.cpio.zst
> > > > > > >
> > > > > > > If there is any other information I can provide, please let me know!
> > > > > >
> > > > > > Hmm... strange. Can you please try the following command line options
> > > > > > and tell me which of these has the issue and which don't?
> > > > >
> > > > > Sure thing!
> > > > >
> > > > > > 1) deferred_probe_timeout=0
> > > > >
> > > > > No issue.
> > > > >
> > > > > > 2) deferred_probe_timeout=1
> > > > > > 3) deferred_probe_timeout=300
> > > > >
> > > > > Both of these appear to hang in the same way, I let each sit for five
> > > > > minutes.
> > > >
> > > > Strange that a sufficiently large timeout isn't helping. Is it trying
> > > > to boot off a network mount? I'll continue looking into this next
> > > > week.
> > >
> > > I don't think so, it seems like doing that requires some extra flags
> > > that we do not have:
> > >
> > > https://wiki.qemu.org/Features/S390xNetworkBoot
> > >
> > > If you need any additional information or want something tested, please
> > > let me know!
> >
> > I'll try to get qemu going on my end, but I'm not too confident I'll
> > be able to get to it in a timely fashion. So if you can help figure
> > out where this boot process is hanging, that'd be very much
> > appreciated.
>
> Sure thing! Information included below, I am more than happy to continue
> to test and debug as you need.
>
> > Couple of suggestions for debugging:
> >
> > Can you add a log to "wait_for_device_probe()" and see if that's
> > getting called right before the boot process hangs? If it does, can
> > you get a stacktrace (I just add a WARN_ON(1) when I need a stack
> > trace)? It's unlikely this is the case because
> > deferred_probe_timeout=1 still causes an issue for you, but I'd be
> > good to rule out.
>
> If I add a pr_info() call at the top of wait_for_device_probe(), I see
> it right before the process hangs. Adding WARN_ON(1) right below that
> reveals dasd_eckd_init() in drivers/s390/block/dasd_eckd.c calls
> wait_for_device_probe():
>
> [    4.610397] ------------[ cut here ]------------
> [    4.610520] WARNING: CPU: 0 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x28/0x110
> [    4.611134] Modules linked in:
> [    4.611593] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.18.0-next-20220523-dirty #1
> [    4.611830] Hardware name: QEMU 8561 QEMU (KVM/Linux)
> [    4.612017] Krnl PSW : 0704c00180000000 0000000000ce4b3c (wait_for_device_probe+0x2c/0x110)
> [    4.612258]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
> [    4.612387] Krnl GPRS: 80000000fffff071 0000000000000027 000000000000000c 00000000017f91d8
> [    4.612457]            00000000fffff071 00000000017f9218 0000000001a655a0 0000000000000006
> [    4.612521]            0000000000000002 0000000001965810 00000000019d51a0 0000000000000000
> [    4.612585]            0000000002218000 000000000125bcc8 0000000000ce4b38 000003800000bc80
> [    4.614814] Krnl Code: 0000000000ce4b2c: e3e0f0980024        stg     %r14,152(%r15)
> [    4.614814]            0000000000ce4b32: c0e5ffff94cb        brasl   %r14,0000000000cd74c8
> [    4.614814]           #0000000000ce4b38: af000000            mc      0,0
> [    4.614814]           >0000000000ce4b3c: c0100054d1fa        larl    %r1,000000000177ef30
> [    4.614814]            0000000000ce4b42: e31010000012        lt      %r1,0(%r1)
> [    4.614814]            0000000000ce4b48: a784002d            brc     8,0000000000ce4ba2
> [    4.614814]            0000000000ce4b4c: d727f0a0f0a0        xc      160(40,%r15),160(%r15)
> [    4.614814]            0000000000ce4b52: 41b0f0a0            la      %r11,160(%r15)
> [    4.615698] Call Trace:
> [    4.616559]  [<0000000000ce4b3c>] wait_for_device_probe+0x2c/0x110
> [    4.616744] ([<0000000000ce4b38>] wait_for_device_probe+0x28/0x110)
> [    4.616841]  [<000000000196593e>] dasd_eckd_init+0x12e/0x178
> [    4.616913]  [<0000000000100936>] do_one_initcall+0x46/0x1e8
> [    4.616983]  [<0000000001920706>] do_initcalls+0x126/0x150
> [    4.617046]  [<000000000192095e>] kernel_init_freeable+0x1ae/0x1f0
> [    4.617110]  [<0000000000ce85a6>] kernel_init+0x2e/0x168
> [    4.617171]  [<0000000000103320>] __ret_from_fork+0x40/0x58
> [    4.617233]  [<0000000000cf5eaa>] ret_from_fork+0xa/0x40
> [    4.617352] Last Breaking-Event-Address:
> [    4.617393]  [<0000000000e0e098>] __s390_indirect_jump_r14+0x0/0xc
> [    4.617481] ---[ end trace 0000000000000000 ]---
>
> > Let's try to rule out if deferred_probe_extend_timeout() is causing
> > some issues. So, without my patch, what happens if you set:
> > deferred_probe_timeout=1
> > deferred_probe_timeout=300
>
> At commit 6ee60e9c9f2f ("MAINTAINERS: add Russ Weight as a firmware
> loader maintainer"), both deferred_probe_timeout=1 and
> deferred_probe_timeout=300 hang the boot.

Thanks for all the help. I think I know what's going on.

If you revert the following commit, then you'll see that your device
no longer hangs with my changes.
35a672363ab3 driver core: Ensure wait_for_device_probe() waits until
the deferred_probe_timeout fires

That commit was made to make sure NFS mounts continue to work even
when deferred_probe_timeout is set to a value greater than the IP auto
config timeout (I think that's 12 seconds). While that definitely
helps the NFS mount case, unfortunately wait_for_device_probe() is
used in some initcalls that happen before late_initcall(). This causes
a deadlock -- earlier initcalls wait for
late_initcall(deferred_probe_initcall) to run but we can't get to
late_initcall() if earlier initcalls don't finish.

Even if wait_for_device_probe() is used only in late_initcalls(), it
can still hang the kernel boot based on the link order of the .c
files as that determined the call order between late_initcalls().

It looks like all the existing users of wait_for_device_probe() seem
to try and "wait for currently probing devices to finish probing", but
the commit mentioned above changes the semantics to "wait until every
device that'll eventually probe finishes probing". I think the
solution is to revert the commit and then do this additional waiting
just in ip_auto_config(). I'll need to think about this a bit before
I'm sure the solution makes sense.

Thanks for all the help so far.

-Saravana




-Saravana



>
> > If deferred_probe_timeout=1 causes an issue even without my patch,
> > then in addition, can you try commenting out the call to
> > fw_devlink_drivers_done() inside deferred_probe_timeout_work_func()
> > and try again?
>
> Sure, that does not appear to make a difference with
> deferred_probe_timeout=1.
>
> Cheers,
> Nathan
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-24  3:43                 ` Saravana Kannan via iommu
@ 2022-05-24 16:41                   ` Sebastian Andrzej Siewior
  -1 siblings, 0 replies; 38+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-05-24 16:41 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Nathan Chancellor, Jonathan Corbet, Greg Kroah-Hartman,
	Rafael J. Wysocki, Rafael J. Wysocki, Rob Herring, Linus Walleij,
	Will Deacon, Ulf Hansson, Kevin Hilman, Thierry Reding,
	Mark Brown, Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel, linux-s390, John Stultz

On 2022-05-23 20:43:06 [-0700], Saravana Kannan wrote:
…
> Thanks for all the help. I think I know what's going on.

I, too got here because my boot recently was extended by 10 seconds and
bisected to that commit in question.

> If you revert the following commit, then you'll see that your device
> no longer hangs with my changes.
> 35a672363ab3 driver core: Ensure wait_for_device_probe() waits until
> the deferred_probe_timeout fires

Removing probe_timeout_waitqueue (as suggested) or setting the timeout
to 0 avoids the delay.

> -Saravana

Sebastian

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-24 16:41                   ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 38+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-05-24 16:41 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Ulf Hansson, linux-doc, Linus Walleij, John Stultz, Pavel Machek,
	linux-s390, Will Deacon, Rob Herring, Jonathan Corbet,
	Rafael J. Wysocki, Geert Uytterhoeven, Thierry Reding,
	kernel-team, linux-pm, Nathan Chancellor, linux-gpio, Mark Brown,
	Greg Kroah-Hartman, Kevin Hilman, Rafael J. Wysocki,
	linux-kernel, Paul Kocialkowski, iommu

On 2022-05-23 20:43:06 [-0700], Saravana Kannan wrote:
…
> Thanks for all the help. I think I know what's going on.

I, too got here because my boot recently was extended by 10 seconds and
bisected to that commit in question.

> If you revert the following commit, then you'll see that your device
> no longer hangs with my changes.
> 35a672363ab3 driver core: Ensure wait_for_device_probe() waits until
> the deferred_probe_timeout fires

Removing probe_timeout_waitqueue (as suggested) or setting the timeout
to 0 avoids the delay.

> -Saravana

Sebastian
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-24 16:41                   ` Sebastian Andrzej Siewior
@ 2022-05-24 17:46                     ` Saravana Kannan via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan @ 2022-05-24 17:46 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Nathan Chancellor, Jonathan Corbet, Greg Kroah-Hartman,
	Rafael J. Wysocki, Rafael J. Wysocki, Rob Herring, Linus Walleij,
	Will Deacon, Ulf Hansson, Kevin Hilman, Thierry Reding,
	Mark Brown, Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel, linux-s390, John Stultz

On Tue, May 24, 2022 at 9:41 AM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2022-05-23 20:43:06 [-0700], Saravana Kannan wrote:
> …
> > Thanks for all the help. I think I know what's going on.
>
> I, too got here because my boot recently was extended by 10 seconds and
> bisected to that commit in question.
>
> > If you revert the following commit, then you'll see that your device
> > no longer hangs with my changes.
> > 35a672363ab3 driver core: Ensure wait_for_device_probe() waits until
> > the deferred_probe_timeout fires
>
> Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> to 0 avoids the delay.

In your case, I think it might be working as intended? Curious, what
was the call stack in your case where it was blocked?

-Saravana

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-24 17:46                     ` Saravana Kannan via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan via iommu @ 2022-05-24 17:46 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Ulf Hansson, linux-doc, Linus Walleij, John Stultz, Pavel Machek,
	linux-s390, Will Deacon, Rob Herring, Jonathan Corbet,
	Rafael J. Wysocki, Geert Uytterhoeven, Thierry Reding,
	kernel-team, linux-pm, Nathan Chancellor, linux-gpio, Mark Brown,
	Greg Kroah-Hartman, Kevin Hilman, Rafael J. Wysocki,
	linux-kernel, Paul Kocialkowski, iommu

On Tue, May 24, 2022 at 9:41 AM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2022-05-23 20:43:06 [-0700], Saravana Kannan wrote:
> …
> > Thanks for all the help. I think I know what's going on.
>
> I, too got here because my boot recently was extended by 10 seconds and
> bisected to that commit in question.
>
> > If you revert the following commit, then you'll see that your device
> > no longer hangs with my changes.
> > 35a672363ab3 driver core: Ensure wait_for_device_probe() waits until
> > the deferred_probe_timeout fires
>
> Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> to 0 avoids the delay.

In your case, I think it might be working as intended? Curious, what
was the call stack in your case where it was blocked?

-Saravana
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-24 17:46                     ` Saravana Kannan via iommu
@ 2022-05-25  7:11                       ` Sebastian Andrzej Siewior
  -1 siblings, 0 replies; 38+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-05-25  7:11 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Ulf Hansson, linux-doc, Linus Walleij, John Stultz, Pavel Machek,
	linux-s390, Will Deacon, Rob Herring, Jonathan Corbet,
	Rafael J. Wysocki, Geert Uytterhoeven, Thierry Reding,
	kernel-team, linux-pm, Nathan Chancellor, linux-gpio, Mark Brown,
	Greg Kroah-Hartman, Kevin Hilman, Rafael J. Wysocki,
	linux-kernel, Paul Kocialkowski, iommu

On 2022-05-24 10:46:49 [-0700], Saravana Kannan wrote:
> > Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> > to 0 avoids the delay.
> 
> In your case, I think it might be working as intended? Curious, what
> was the call stack in your case where it was blocked?

Why is then there 10sec delay during boot? The backtrace is
|------------[ cut here ]------------
|WARNING: CPU: 4 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x30/0x110
|Modules linked in:
|CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc5+ #154
|RIP: 0010:wait_for_device_probe+0x30/0x110
|Call Trace:
| <TASK>
| prepare_namespace+0x2b/0x160
| kernel_init_freeable+0x2b3/0x2dd
| kernel_init+0x11/0x110
| ret_from_fork+0x22/0x30
| </TASK>

Looking closer, it can't access init. This in particular box boots
directly the kernel without an initramfs so the kernel later mounts
/dev/sda1 and everything is good.  So that seems to be the reason…
My other machine with an initramfs does not show this problem.

> -Saravana

Sebastian
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-25  7:11                       ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 38+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-05-25  7:11 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Nathan Chancellor, Jonathan Corbet, Greg Kroah-Hartman,
	Rafael J. Wysocki, Rafael J. Wysocki, Rob Herring, Linus Walleij,
	Will Deacon, Ulf Hansson, Kevin Hilman, Thierry Reding,
	Mark Brown, Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel, linux-s390, John Stultz

On 2022-05-24 10:46:49 [-0700], Saravana Kannan wrote:
> > Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> > to 0 avoids the delay.
> 
> In your case, I think it might be working as intended? Curious, what
> was the call stack in your case where it was blocked?

Why is then there 10sec delay during boot? The backtrace is
|------------[ cut here ]------------
|WARNING: CPU: 4 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x30/0x110
|Modules linked in:
|CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc5+ #154
|RIP: 0010:wait_for_device_probe+0x30/0x110
|Call Trace:
| <TASK>
| prepare_namespace+0x2b/0x160
| kernel_init_freeable+0x2b3/0x2dd
| kernel_init+0x11/0x110
| ret_from_fork+0x22/0x30
| </TASK>

Looking closer, it can't access init. This in particular box boots
directly the kernel without an initramfs so the kernel later mounts
/dev/sda1 and everything is good.  So that seems to be the reason…
My other machine with an initramfs does not show this problem.

> -Saravana

Sebastian

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-25  7:11                       ` Sebastian Andrzej Siewior
@ 2022-05-25 19:49                         ` Saravana Kannan via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan @ 2022-05-25 19:49 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Nathan Chancellor, Jonathan Corbet, Greg Kroah-Hartman,
	Rafael J. Wysocki, Rafael J. Wysocki, Rob Herring, Linus Walleij,
	Will Deacon, Ulf Hansson, Kevin Hilman, Thierry Reding,
	Mark Brown, Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel, linux-s390, John Stultz

On Wed, May 25, 2022 at 12:12 AM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2022-05-24 10:46:49 [-0700], Saravana Kannan wrote:
> > > Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> > > to 0 avoids the delay.
> >
> > In your case, I think it might be working as intended? Curious, what
> > was the call stack in your case where it was blocked?
>
> Why is then there 10sec delay during boot? The backtrace is
> |------------[ cut here ]------------
> |WARNING: CPU: 4 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x30/0x110
> |Modules linked in:
> |CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc5+ #154
> |RIP: 0010:wait_for_device_probe+0x30/0x110
> |Call Trace:
> | <TASK>
> | prepare_namespace+0x2b/0x160
> | kernel_init_freeable+0x2b3/0x2dd
> | kernel_init+0x11/0x110
> | ret_from_fork+0x22/0x30
> | </TASK>
>
> Looking closer, it can't access init. This in particular box boots
> directly the kernel without an initramfs so the kernel later mounts
> /dev/sda1 and everything is good.  So that seems to be the reason…

Hmmm... that part shouldn't matter. As long as you are hitting the
same code path. My guess is one of them has CONFIG_MODULES enabled and
the other doesn't.

In either case, I think the patch needs to be reverted (I'll send out
one soon), but that'll also mean I need to revert part of my patch
(sets the timeout back to 0) or I need to fix this case:
https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/

I'll try to do the latter if I can get something reasonable soon.
Otherwise, I'll just do the revert + partial revert.

-Saravana


> My other machine with an initramfs does not show this problem.
>
> > -Saravana
>
> Sebastian

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-25 19:49                         ` Saravana Kannan via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan via iommu @ 2022-05-25 19:49 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Ulf Hansson, linux-doc, Linus Walleij, John Stultz, Pavel Machek,
	linux-s390, Will Deacon, Rob Herring, Jonathan Corbet,
	Rafael J. Wysocki, Geert Uytterhoeven, Thierry Reding,
	kernel-team, linux-pm, Nathan Chancellor, linux-gpio, Mark Brown,
	Greg Kroah-Hartman, Kevin Hilman, Rafael J. Wysocki,
	linux-kernel, Paul Kocialkowski, iommu

On Wed, May 25, 2022 at 12:12 AM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2022-05-24 10:46:49 [-0700], Saravana Kannan wrote:
> > > Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> > > to 0 avoids the delay.
> >
> > In your case, I think it might be working as intended? Curious, what
> > was the call stack in your case where it was blocked?
>
> Why is then there 10sec delay during boot? The backtrace is
> |------------[ cut here ]------------
> |WARNING: CPU: 4 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x30/0x110
> |Modules linked in:
> |CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc5+ #154
> |RIP: 0010:wait_for_device_probe+0x30/0x110
> |Call Trace:
> | <TASK>
> | prepare_namespace+0x2b/0x160
> | kernel_init_freeable+0x2b3/0x2dd
> | kernel_init+0x11/0x110
> | ret_from_fork+0x22/0x30
> | </TASK>
>
> Looking closer, it can't access init. This in particular box boots
> directly the kernel without an initramfs so the kernel later mounts
> /dev/sda1 and everything is good.  So that seems to be the reason…

Hmmm... that part shouldn't matter. As long as you are hitting the
same code path. My guess is one of them has CONFIG_MODULES enabled and
the other doesn't.

In either case, I think the patch needs to be reverted (I'll send out
one soon), but that'll also mean I need to revert part of my patch
(sets the timeout back to 0) or I need to fix this case:
https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/

I'll try to do the latter if I can get something reasonable soon.
Otherwise, I'll just do the revert + partial revert.

-Saravana


> My other machine with an initramfs does not show this problem.
>
> > -Saravana
>
> Sebastian
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-25 19:49                         ` Saravana Kannan via iommu
@ 2022-05-29  8:34                           ` Niklas Cassel via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: Niklas Cassel @ 2022-05-29  8:34 UTC (permalink / raw)
  To: Saravana Kannan, Greg Kroah-Hartman, linux-block, linux-nvme
  Cc: Sebastian Andrzej Siewior, Nathan Chancellor, Jonathan Corbet,
	Rafael J. Wysocki, Rafael J. Wysocki, Rob Herring, Linus Walleij,
	Will Deacon, Ulf Hansson, Kevin Hilman, Thierry Reding,
	Mark Brown, Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel, linux-s390, John Stultz

On Wed, May 25, 2022 at 12:49:00PM -0700, Saravana Kannan wrote:
> On Wed, May 25, 2022 at 12:12 AM Sebastian Andrzej Siewior
> <bigeasy@linutronix.de> wrote:
> >
> > On 2022-05-24 10:46:49 [-0700], Saravana Kannan wrote:
> > > > Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> > > > to 0 avoids the delay.
> > >
> > > In your case, I think it might be working as intended? Curious, what
> > > was the call stack in your case where it was blocked?
> >
> > Why is then there 10sec delay during boot? The backtrace is
> > |------------[ cut here ]------------
> > |WARNING: CPU: 4 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x30/0x110
> > |Modules linked in:
> > |CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc5+ #154
> > |RIP: 0010:wait_for_device_probe+0x30/0x110
> > |Call Trace:
> > | <TASK>
> > | prepare_namespace+0x2b/0x160
> > | kernel_init_freeable+0x2b3/0x2dd
> > | kernel_init+0x11/0x110
> > | ret_from_fork+0x22/0x30
> > | </TASK>
> >
> > Looking closer, it can't access init. This in particular box boots
> > directly the kernel without an initramfs so the kernel later mounts
> > /dev/sda1 and everything is good.  So that seems to be the reason…
>

Hello there,

My (QEMU) boot times were recently extended by 10 seconds.
Looking at the timestamps, it looks like nothing is being done for 10 whole
seconds.

A git bisect landed me at the patch in $subject:
2b28a1a84a0e ("driver core: Extend deferred probe timeout on driver registration")

Adding a WARN_ON(1) in wait_for_device_probe(), as requested by the patch
author from the others seeing a regression with this patch, gives two different
stacktraces during boot:

[    0.459633] printk: console [netcon0] enabled
[    0.459636] printk: console [netcon0] printing thread started
[    0.459637] netconsole: network logging started
[    0.459896] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[    0.460230] kworker/u8:6 (105) used greatest stack depth: 14744 bytes left
[    0.461031] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[    0.461077] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[    0.461085] cfg80211: failed to load regulatory.db
[    0.461113] ALSA device list:
[    0.461116]   No soundcards found.
[    0.461614] ------------[ cut here ]------------
[    0.461615] WARNING: CPU: 2 PID: 1 at drivers/base/dd.c:741 wait_for_device_probe+0x1a/0x160
[    0.485809] Modules linked in:
[    0.486089] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.18.0-next-20220526-00004-g74f936013b08-dirty #20
[    0.486842] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[    0.487707] RIP: 0010:wait_for_device_probe+0x1a/0x160
[    0.488103] Code: 00 e8 fa e4 b5 ff 8b 44 24 04 48 83 c4 08 5b c3 0f 1f 44 00 00 53 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 <0f> 0b e8 1f ac 57 00 8b 15 f1 b3 24 01 85 d2 75 3d 48 c7 c7 60 2f
[    0.489539] RSP: 0000:ffff9c7900013ed8 EFLAGS: 00010246
[    0.489965] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000d02
[    0.490597] RDX: 0000000000000cc2 RSI: 0000000000000000 RDI: 000000000002e990
[    0.491181] RBP: 0000000000000214 R08: 000000000000000f R09: 0000000000000064
[    0.491788] R10: ffff9c7900013c6c R11: 0000000000000000 R12: ffff8964c0343640
[    0.492384] R13: ffffffff9e51791c R14: 0000000000000000 R15: 0000000000000000
[    0.492960] FS:  0000000000000000(0000) GS:ffff896637d00000(0000) knlGS:0000000000000000
[    0.493658] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.494501] CR2: 0000000000000000 CR3: 00000001ed20c001 CR4: 0000000000370ee0
[    0.495621] Call Trace:
[    0.496059]  <TASK>
[    0.496266]  ? init_eaccess+0x3b/0x76
[    0.496657]  prepare_namespace+0x30/0x16a
[    0.497016]  kernel_init_freeable+0x207/0x212
[    0.497407]  ? rest_init+0xc0/0xc0
[    0.497714]  kernel_init+0x16/0x120
[    0.498250]  ret_from_fork+0x1f/0x30
[    0.498898]  </TASK>
[    0.499307] ---[ end trace 0000000000000000 ]---
[    0.748413] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    0.749053] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    0.749461] ata2.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
[    0.749470] ata2.00: 732 sectors, multi 16: LBA48 NCQ (depth 32)
[    0.749479] ata2.00: applying bridge limits
[    0.750915] ata4: SATA link down (SStatus 0 SControl 300)
[    0.752110] ata5: SATA link down (SStatus 0 SControl 300)
[    0.753424] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    0.754877] ata3: SATA link down (SStatus 0 SControl 300)
[    0.755342] ata1.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
[    0.755377] ata1.00: 268435456 sectors, multi 16: LBA48 NCQ (depth 32)
[    0.755387] ata1.00: applying bridge limits
[    0.755486] ata6.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
[    0.755492] ata6.00: 8388608 sectors, multi 16: LBA48 NCQ (depth 32)
[    0.755500] ata6.00: applying bridge limits
[    0.757330] ata1.00: configured for UDMA/100
[    0.757441] ata6.00: configured for UDMA/100
[    0.757505] ata2.00: configured for UDMA/100
[    0.758015] scsi 0:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
[    0.760542] sd 0:0:0:0: Attached scsi generic sg0 type 0
[    0.760669] sd 0:0:0:0: [sda] 268435456 512-byte logical blocks: (137 GB/128 GiB)
[    0.760778] sd 0:0:0:0: [sda] Write Protect is off
[    0.760787] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    0.760847] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    0.760981] sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
[    0.761319] scsi 1:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
[    0.762808] sd 1:0:0:0: Attached scsi generic sg1 type 0
[    0.763004] sd 1:0:0:0: [sdb] 732 512-byte logical blocks: (375 kB/366 KiB)
[    0.763649] sd 1:0:0:0: [sdb] Write Protect is off
[    0.763661] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[    0.763965] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    0.764026] sd 1:0:0:0: [sdb] Preferred minimum I/O size 512 bytes
[    0.765923] scsi 5:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
[    0.767987] sd 5:0:0:0: Attached scsi generic sg2 type 0
[    0.768626] sd 5:0:0:0: [sdc] 8388608 512-byte logical blocks: (4.29 GB/4.00 GiB)
[    0.769909] sd 5:0:0:0: [sdc] Write Protect is off
[    0.769920] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[    0.770515] sd 1:0:0:0: [sdb] Attached SCSI disk
[    0.770900] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    0.771782] sd 5:0:0:0: [sdc] Preferred minimum I/O size 512 bytes
[    0.773900]  sda: sda1 sda14 sda15
[    0.774499] sd 5:0:0:0: [sdc] Attached SCSI disk
[    0.774907] sd 0:0:0:0: [sda] Attached SCSI disk
[    1.084011] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
[   10.887350] md: Waiting for all devices to be available before autodetect
[   10.887395] md: If you don't use raid, use raid=noautodetect
[   10.887445] ------------[ cut here ]------------
[   10.887448] WARNING: CPU: 1 PID: 1 at drivers/base/dd.c:741 wait_for_device_probe+0x1a/0x160
[   10.893987] Modules linked in:
[   10.894314] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W         5.18.0-next-20220526-00004-g74f936013b08-dirty #20
[   10.895389] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   10.896590] RIP: 0010:wait_for_device_probe+0x1a/0x160
[   10.897215] Code: 00 e8 fa e4 b5 ff 8b 44 24 04 48 83 c4 08 5b c3 0f 1f 44 00 00 53 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 <0f> 0b e8 1f ac 57 00 8b 15 f1 b3 24 01 85 d2 75 3d 48 c7 c7 60 2f
[   10.899269] RSP: 0000:ffff9c7900013ec8 EFLAGS: 00010246
[   10.899762] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000000
[   10.900424] RDX: 0000000000000000 RSI: ffffffff9d998089 RDI: 00000000ffffffff
[   10.901060] RBP: 0000000000000214 R08: 80000000ffffe1ca R09: ffff9c7900013ea8
[   10.901839] R10: 3fffffffffffffff R11: 0000000000000000 R12: ffff8964c0343640
[   10.902615] R13: ffffffff9e51791c R14: 0000000000000000 R15: 0000000000000000
[   10.903371] FS:  0000000000000000(0000) GS:ffff896637c80000(0000) knlGS:0000000000000000
[   10.904188] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   10.904721] CR2: 0000000000000000 CR3: 00000001ed20c001 CR4: 0000000000370ee0
[   10.905340] Call Trace:
[   10.905558]  <TASK>
[   10.905739]  md_run_setup+0x3c/0x6a
[   10.906036]  prepare_namespace+0x35/0x16a
[   10.906391]  kernel_init_freeable+0x207/0x212
[   10.906748]  ? rest_init+0xc0/0xc0
[   10.907021]  kernel_init+0x16/0x120
[   10.907312]  ret_from_fork+0x1f/0x30
[   10.907617]  </TASK>
[   10.907796] ---[ end trace 0000000000000000 ]---
[   10.908159] md: Autodetecting RAID arrays.
[   10.908160] md: autorun ...
[   10.908160] md: ... autorun DONE.
[   10.911301] EXT4-fs (sda1): INFO: recovery required on readonly filesystem
[   10.911303] EXT4-fs (sda1): write access will be enabled during recovery
[   11.038001] EXT4-fs (sda1): orphan cleanup on readonly fs
[   11.038466] EXT4-fs (sda1): 2 orphan inodes deleted
[   11.038475] EXT4-fs (sda1): recovery complete
[   11.053033] EXT4-fs (sda1): mounted filesystem with ordered data mode. Quota mode: none.
[   11.053074] VFS: Mounted root (ext4 filesystem) readonly on device 8:1.
[   11.053175] devtmpfs: mounted
[   11.061921] Freeing unused kernel image (initmem) memory: 1680K
[   11.061930] Write protecting the kernel read-only data: 24576k
[   11.071108] Freeing unused kernel image (text/rodata gap) memory: 2032K
[   11.073596] Freeing unused kernel image (rodata/data gap) memory: 924K
[   11.073609] Run /sbin/init as init process


Kind regards,
Niklas

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-29  8:34                           ` Niklas Cassel via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: Niklas Cassel via iommu @ 2022-05-29  8:34 UTC (permalink / raw)
  To: Saravana Kannan, Greg Kroah-Hartman, linux-block, linux-nvme
  Cc: Ulf Hansson, Rafael J. Wysocki, Sebastian Andrzej Siewior,
	John Stultz, Pavel Machek, Will Deacon, Rob Herring,
	Jonathan Corbet, linux-doc, Geert Uytterhoeven, Thierry Reding,
	kernel-team, Linus Walleij, linux-pm, Nathan Chancellor,
	linux-gpio, Mark Brown, linux-s390, Kevin Hilman,
	Rafael J. Wysocki, linux-kernel, Paul Kocialkowski, iommu

On Wed, May 25, 2022 at 12:49:00PM -0700, Saravana Kannan wrote:
> On Wed, May 25, 2022 at 12:12 AM Sebastian Andrzej Siewior
> <bigeasy@linutronix.de> wrote:
> >
> > On 2022-05-24 10:46:49 [-0700], Saravana Kannan wrote:
> > > > Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> > > > to 0 avoids the delay.
> > >
> > > In your case, I think it might be working as intended? Curious, what
> > > was the call stack in your case where it was blocked?
> >
> > Why is then there 10sec delay during boot? The backtrace is
> > |------------[ cut here ]------------
> > |WARNING: CPU: 4 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x30/0x110
> > |Modules linked in:
> > |CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc5+ #154
> > |RIP: 0010:wait_for_device_probe+0x30/0x110
> > |Call Trace:
> > | <TASK>
> > | prepare_namespace+0x2b/0x160
> > | kernel_init_freeable+0x2b3/0x2dd
> > | kernel_init+0x11/0x110
> > | ret_from_fork+0x22/0x30
> > | </TASK>
> >
> > Looking closer, it can't access init. This in particular box boots
> > directly the kernel without an initramfs so the kernel later mounts
> > /dev/sda1 and everything is good.  So that seems to be the reason…
>

Hello there,

My (QEMU) boot times were recently extended by 10 seconds.
Looking at the timestamps, it looks like nothing is being done for 10 whole
seconds.

A git bisect landed me at the patch in $subject:
2b28a1a84a0e ("driver core: Extend deferred probe timeout on driver registration")

Adding a WARN_ON(1) in wait_for_device_probe(), as requested by the patch
author from the others seeing a regression with this patch, gives two different
stacktraces during boot:

[    0.459633] printk: console [netcon0] enabled
[    0.459636] printk: console [netcon0] printing thread started
[    0.459637] netconsole: network logging started
[    0.459896] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[    0.460230] kworker/u8:6 (105) used greatest stack depth: 14744 bytes left
[    0.461031] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[    0.461077] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[    0.461085] cfg80211: failed to load regulatory.db
[    0.461113] ALSA device list:
[    0.461116]   No soundcards found.
[    0.461614] ------------[ cut here ]------------
[    0.461615] WARNING: CPU: 2 PID: 1 at drivers/base/dd.c:741 wait_for_device_probe+0x1a/0x160
[    0.485809] Modules linked in:
[    0.486089] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.18.0-next-20220526-00004-g74f936013b08-dirty #20
[    0.486842] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[    0.487707] RIP: 0010:wait_for_device_probe+0x1a/0x160
[    0.488103] Code: 00 e8 fa e4 b5 ff 8b 44 24 04 48 83 c4 08 5b c3 0f 1f 44 00 00 53 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 <0f> 0b e8 1f ac 57 00 8b 15 f1 b3 24 01 85 d2 75 3d 48 c7 c7 60 2f
[    0.489539] RSP: 0000:ffff9c7900013ed8 EFLAGS: 00010246
[    0.489965] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000d02
[    0.490597] RDX: 0000000000000cc2 RSI: 0000000000000000 RDI: 000000000002e990
[    0.491181] RBP: 0000000000000214 R08: 000000000000000f R09: 0000000000000064
[    0.491788] R10: ffff9c7900013c6c R11: 0000000000000000 R12: ffff8964c0343640
[    0.492384] R13: ffffffff9e51791c R14: 0000000000000000 R15: 0000000000000000
[    0.492960] FS:  0000000000000000(0000) GS:ffff896637d00000(0000) knlGS:0000000000000000
[    0.493658] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.494501] CR2: 0000000000000000 CR3: 00000001ed20c001 CR4: 0000000000370ee0
[    0.495621] Call Trace:
[    0.496059]  <TASK>
[    0.496266]  ? init_eaccess+0x3b/0x76
[    0.496657]  prepare_namespace+0x30/0x16a
[    0.497016]  kernel_init_freeable+0x207/0x212
[    0.497407]  ? rest_init+0xc0/0xc0
[    0.497714]  kernel_init+0x16/0x120
[    0.498250]  ret_from_fork+0x1f/0x30
[    0.498898]  </TASK>
[    0.499307] ---[ end trace 0000000000000000 ]---
[    0.748413] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    0.749053] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    0.749461] ata2.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
[    0.749470] ata2.00: 732 sectors, multi 16: LBA48 NCQ (depth 32)
[    0.749479] ata2.00: applying bridge limits
[    0.750915] ata4: SATA link down (SStatus 0 SControl 300)
[    0.752110] ata5: SATA link down (SStatus 0 SControl 300)
[    0.753424] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    0.754877] ata3: SATA link down (SStatus 0 SControl 300)
[    0.755342] ata1.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
[    0.755377] ata1.00: 268435456 sectors, multi 16: LBA48 NCQ (depth 32)
[    0.755387] ata1.00: applying bridge limits
[    0.755486] ata6.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
[    0.755492] ata6.00: 8388608 sectors, multi 16: LBA48 NCQ (depth 32)
[    0.755500] ata6.00: applying bridge limits
[    0.757330] ata1.00: configured for UDMA/100
[    0.757441] ata6.00: configured for UDMA/100
[    0.757505] ata2.00: configured for UDMA/100
[    0.758015] scsi 0:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
[    0.760542] sd 0:0:0:0: Attached scsi generic sg0 type 0
[    0.760669] sd 0:0:0:0: [sda] 268435456 512-byte logical blocks: (137 GB/128 GiB)
[    0.760778] sd 0:0:0:0: [sda] Write Protect is off
[    0.760787] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    0.760847] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    0.760981] sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
[    0.761319] scsi 1:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
[    0.762808] sd 1:0:0:0: Attached scsi generic sg1 type 0
[    0.763004] sd 1:0:0:0: [sdb] 732 512-byte logical blocks: (375 kB/366 KiB)
[    0.763649] sd 1:0:0:0: [sdb] Write Protect is off
[    0.763661] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[    0.763965] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    0.764026] sd 1:0:0:0: [sdb] Preferred minimum I/O size 512 bytes
[    0.765923] scsi 5:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
[    0.767987] sd 5:0:0:0: Attached scsi generic sg2 type 0
[    0.768626] sd 5:0:0:0: [sdc] 8388608 512-byte logical blocks: (4.29 GB/4.00 GiB)
[    0.769909] sd 5:0:0:0: [sdc] Write Protect is off
[    0.769920] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[    0.770515] sd 1:0:0:0: [sdb] Attached SCSI disk
[    0.770900] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    0.771782] sd 5:0:0:0: [sdc] Preferred minimum I/O size 512 bytes
[    0.773900]  sda: sda1 sda14 sda15
[    0.774499] sd 5:0:0:0: [sdc] Attached SCSI disk
[    0.774907] sd 0:0:0:0: [sda] Attached SCSI disk
[    1.084011] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
[   10.887350] md: Waiting for all devices to be available before autodetect
[   10.887395] md: If you don't use raid, use raid=noautodetect
[   10.887445] ------------[ cut here ]------------
[   10.887448] WARNING: CPU: 1 PID: 1 at drivers/base/dd.c:741 wait_for_device_probe+0x1a/0x160
[   10.893987] Modules linked in:
[   10.894314] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W         5.18.0-next-20220526-00004-g74f936013b08-dirty #20
[   10.895389] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   10.896590] RIP: 0010:wait_for_device_probe+0x1a/0x160
[   10.897215] Code: 00 e8 fa e4 b5 ff 8b 44 24 04 48 83 c4 08 5b c3 0f 1f 44 00 00 53 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 <0f> 0b e8 1f ac 57 00 8b 15 f1 b3 24 01 85 d2 75 3d 48 c7 c7 60 2f
[   10.899269] RSP: 0000:ffff9c7900013ec8 EFLAGS: 00010246
[   10.899762] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000000
[   10.900424] RDX: 0000000000000000 RSI: ffffffff9d998089 RDI: 00000000ffffffff
[   10.901060] RBP: 0000000000000214 R08: 80000000ffffe1ca R09: ffff9c7900013ea8
[   10.901839] R10: 3fffffffffffffff R11: 0000000000000000 R12: ffff8964c0343640
[   10.902615] R13: ffffffff9e51791c R14: 0000000000000000 R15: 0000000000000000
[   10.903371] FS:  0000000000000000(0000) GS:ffff896637c80000(0000) knlGS:0000000000000000
[   10.904188] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   10.904721] CR2: 0000000000000000 CR3: 00000001ed20c001 CR4: 0000000000370ee0
[   10.905340] Call Trace:
[   10.905558]  <TASK>
[   10.905739]  md_run_setup+0x3c/0x6a
[   10.906036]  prepare_namespace+0x35/0x16a
[   10.906391]  kernel_init_freeable+0x207/0x212
[   10.906748]  ? rest_init+0xc0/0xc0
[   10.907021]  kernel_init+0x16/0x120
[   10.907312]  ret_from_fork+0x1f/0x30
[   10.907617]  </TASK>
[   10.907796] ---[ end trace 0000000000000000 ]---
[   10.908159] md: Autodetecting RAID arrays.
[   10.908160] md: autorun ...
[   10.908160] md: ... autorun DONE.
[   10.911301] EXT4-fs (sda1): INFO: recovery required on readonly filesystem
[   10.911303] EXT4-fs (sda1): write access will be enabled during recovery
[   11.038001] EXT4-fs (sda1): orphan cleanup on readonly fs
[   11.038466] EXT4-fs (sda1): 2 orphan inodes deleted
[   11.038475] EXT4-fs (sda1): recovery complete
[   11.053033] EXT4-fs (sda1): mounted filesystem with ordered data mode. Quota mode: none.
[   11.053074] VFS: Mounted root (ext4 filesystem) readonly on device 8:1.
[   11.053175] devtmpfs: mounted
[   11.061921] Freeing unused kernel image (initmem) memory: 1680K
[   11.061930] Write protecting the kernel read-only data: 24576k
[   11.071108] Freeing unused kernel image (text/rodata gap) memory: 2032K
[   11.073596] Freeing unused kernel image (rodata/data gap) memory: 924K
[   11.073609] Run /sbin/init as init process


Kind regards,
Niklas
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
  2022-05-29  8:34                           ` Niklas Cassel via iommu
@ 2022-05-30  4:59                             ` Saravana Kannan via iommu
  -1 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan @ 2022-05-30  4:59 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: Greg Kroah-Hartman, linux-block, linux-nvme,
	Sebastian Andrzej Siewior, Nathan Chancellor, Jonathan Corbet,
	Rafael J. Wysocki, Rafael J. Wysocki, Rob Herring, Linus Walleij,
	Will Deacon, Ulf Hansson, Kevin Hilman, Thierry Reding,
	Mark Brown, Pavel Machek, Geert Uytterhoeven, Yoshihiro Shimoda,
	Paul Kocialkowski, linux-gpio, linux-pm, iommu, kernel-team,
	linux-doc, linux-kernel, linux-s390, John Stultz

On Sun, May 29, 2022 at 1:34 AM 'Niklas Cassel' via kernel-team
<kernel-team@android.com> wrote:
>
> On Wed, May 25, 2022 at 12:49:00PM -0700, Saravana Kannan wrote:
> > On Wed, May 25, 2022 at 12:12 AM Sebastian Andrzej Siewior
> > <bigeasy@linutronix.de> wrote:
> > >
> > > On 2022-05-24 10:46:49 [-0700], Saravana Kannan wrote:
> > > > > Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> > > > > to 0 avoids the delay.
> > > >
> > > > In your case, I think it might be working as intended? Curious, what
> > > > was the call stack in your case where it was blocked?
> > >
> > > Why is then there 10sec delay during boot? The backtrace is
> > > |------------[ cut here ]------------
> > > |WARNING: CPU: 4 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x30/0x110
> > > |Modules linked in:
> > > |CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc5+ #154
> > > |RIP: 0010:wait_for_device_probe+0x30/0x110
> > > |Call Trace:
> > > | <TASK>
> > > | prepare_namespace+0x2b/0x160
> > > | kernel_init_freeable+0x2b3/0x2dd
> > > | kernel_init+0x11/0x110
> > > | ret_from_fork+0x22/0x30
> > > | </TASK>
> > >
> > > Looking closer, it can't access init. This in particular box boots
> > > directly the kernel without an initramfs so the kernel later mounts
> > > /dev/sda1 and everything is good.  So that seems to be the reason…
> >
>
> Hello there,
>
> My (QEMU) boot times were recently extended by 10 seconds.
> Looking at the timestamps, it looks like nothing is being done for 10 whole
> seconds.
>
> A git bisect landed me at the patch in $subject:
> 2b28a1a84a0e ("driver core: Extend deferred probe timeout on driver registration")
>
> Adding a WARN_ON(1) in wait_for_device_probe(), as requested by the patch
> author from the others seeing a regression with this patch, gives two different
> stacktraces during boot:

Thanks for the report. My patch was exposing an existing issue. I
already sent a fix for that a few days ago. Waiting for it to get
picked up.

-Saravana

>
> [    0.459633] printk: console [netcon0] enabled
> [    0.459636] printk: console [netcon0] printing thread started
> [    0.459637] netconsole: network logging started
> [    0.459896] cfg80211: Loading compiled-in X.509 certificates for regulatory database
> [    0.460230] kworker/u8:6 (105) used greatest stack depth: 14744 bytes left
> [    0.461031] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
> [    0.461077] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
> [    0.461085] cfg80211: failed to load regulatory.db
> [    0.461113] ALSA device list:
> [    0.461116]   No soundcards found.
> [    0.461614] ------------[ cut here ]------------
> [    0.461615] WARNING: CPU: 2 PID: 1 at drivers/base/dd.c:741 wait_for_device_probe+0x1a/0x160
> [    0.485809] Modules linked in:
> [    0.486089] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.18.0-next-20220526-00004-g74f936013b08-dirty #20
> [    0.486842] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [    0.487707] RIP: 0010:wait_for_device_probe+0x1a/0x160
> [    0.488103] Code: 00 e8 fa e4 b5 ff 8b 44 24 04 48 83 c4 08 5b c3 0f 1f 44 00 00 53 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 <0f> 0b e8 1f ac 57 00 8b 15 f1 b3 24 01 85 d2 75 3d 48 c7 c7 60 2f
> [    0.489539] RSP: 0000:ffff9c7900013ed8 EFLAGS: 00010246
> [    0.489965] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000d02
> [    0.490597] RDX: 0000000000000cc2 RSI: 0000000000000000 RDI: 000000000002e990
> [    0.491181] RBP: 0000000000000214 R08: 000000000000000f R09: 0000000000000064
> [    0.491788] R10: ffff9c7900013c6c R11: 0000000000000000 R12: ffff8964c0343640
> [    0.492384] R13: ffffffff9e51791c R14: 0000000000000000 R15: 0000000000000000
> [    0.492960] FS:  0000000000000000(0000) GS:ffff896637d00000(0000) knlGS:0000000000000000
> [    0.493658] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.494501] CR2: 0000000000000000 CR3: 00000001ed20c001 CR4: 0000000000370ee0
> [    0.495621] Call Trace:
> [    0.496059]  <TASK>
> [    0.496266]  ? init_eaccess+0x3b/0x76
> [    0.496657]  prepare_namespace+0x30/0x16a
> [    0.497016]  kernel_init_freeable+0x207/0x212
> [    0.497407]  ? rest_init+0xc0/0xc0
> [    0.497714]  kernel_init+0x16/0x120
> [    0.498250]  ret_from_fork+0x1f/0x30
> [    0.498898]  </TASK>
> [    0.499307] ---[ end trace 0000000000000000 ]---
> [    0.748413] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [    0.749053] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [    0.749461] ata2.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
> [    0.749470] ata2.00: 732 sectors, multi 16: LBA48 NCQ (depth 32)
> [    0.749479] ata2.00: applying bridge limits
> [    0.750915] ata4: SATA link down (SStatus 0 SControl 300)
> [    0.752110] ata5: SATA link down (SStatus 0 SControl 300)
> [    0.753424] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [    0.754877] ata3: SATA link down (SStatus 0 SControl 300)
> [    0.755342] ata1.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
> [    0.755377] ata1.00: 268435456 sectors, multi 16: LBA48 NCQ (depth 32)
> [    0.755387] ata1.00: applying bridge limits
> [    0.755486] ata6.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
> [    0.755492] ata6.00: 8388608 sectors, multi 16: LBA48 NCQ (depth 32)
> [    0.755500] ata6.00: applying bridge limits
> [    0.757330] ata1.00: configured for UDMA/100
> [    0.757441] ata6.00: configured for UDMA/100
> [    0.757505] ata2.00: configured for UDMA/100
> [    0.758015] scsi 0:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
> [    0.760542] sd 0:0:0:0: Attached scsi generic sg0 type 0
> [    0.760669] sd 0:0:0:0: [sda] 268435456 512-byte logical blocks: (137 GB/128 GiB)
> [    0.760778] sd 0:0:0:0: [sda] Write Protect is off
> [    0.760787] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> [    0.760847] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [    0.760981] sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
> [    0.761319] scsi 1:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
> [    0.762808] sd 1:0:0:0: Attached scsi generic sg1 type 0
> [    0.763004] sd 1:0:0:0: [sdb] 732 512-byte logical blocks: (375 kB/366 KiB)
> [    0.763649] sd 1:0:0:0: [sdb] Write Protect is off
> [    0.763661] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> [    0.763965] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [    0.764026] sd 1:0:0:0: [sdb] Preferred minimum I/O size 512 bytes
> [    0.765923] scsi 5:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
> [    0.767987] sd 5:0:0:0: Attached scsi generic sg2 type 0
> [    0.768626] sd 5:0:0:0: [sdc] 8388608 512-byte logical blocks: (4.29 GB/4.00 GiB)
> [    0.769909] sd 5:0:0:0: [sdc] Write Protect is off
> [    0.769920] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> [    0.770515] sd 1:0:0:0: [sdb] Attached SCSI disk
> [    0.770900] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [    0.771782] sd 5:0:0:0: [sdc] Preferred minimum I/O size 512 bytes
> [    0.773900]  sda: sda1 sda14 sda15
> [    0.774499] sd 5:0:0:0: [sdc] Attached SCSI disk
> [    0.774907] sd 0:0:0:0: [sda] Attached SCSI disk
> [    1.084011] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
> [   10.887350] md: Waiting for all devices to be available before autodetect
> [   10.887395] md: If you don't use raid, use raid=noautodetect
> [   10.887445] ------------[ cut here ]------------
> [   10.887448] WARNING: CPU: 1 PID: 1 at drivers/base/dd.c:741 wait_for_device_probe+0x1a/0x160
> [   10.893987] Modules linked in:
> [   10.894314] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W         5.18.0-next-20220526-00004-g74f936013b08-dirty #20
> [   10.895389] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [   10.896590] RIP: 0010:wait_for_device_probe+0x1a/0x160
> [   10.897215] Code: 00 e8 fa e4 b5 ff 8b 44 24 04 48 83 c4 08 5b c3 0f 1f 44 00 00 53 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 <0f> 0b e8 1f ac 57 00 8b 15 f1 b3 24 01 85 d2 75 3d 48 c7 c7 60 2f
> [   10.899269] RSP: 0000:ffff9c7900013ec8 EFLAGS: 00010246
> [   10.899762] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000000
> [   10.900424] RDX: 0000000000000000 RSI: ffffffff9d998089 RDI: 00000000ffffffff
> [   10.901060] RBP: 0000000000000214 R08: 80000000ffffe1ca R09: ffff9c7900013ea8
> [   10.901839] R10: 3fffffffffffffff R11: 0000000000000000 R12: ffff8964c0343640
> [   10.902615] R13: ffffffff9e51791c R14: 0000000000000000 R15: 0000000000000000
> [   10.903371] FS:  0000000000000000(0000) GS:ffff896637c80000(0000) knlGS:0000000000000000
> [   10.904188] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   10.904721] CR2: 0000000000000000 CR3: 00000001ed20c001 CR4: 0000000000370ee0
> [   10.905340] Call Trace:
> [   10.905558]  <TASK>
> [   10.905739]  md_run_setup+0x3c/0x6a
> [   10.906036]  prepare_namespace+0x35/0x16a
> [   10.906391]  kernel_init_freeable+0x207/0x212
> [   10.906748]  ? rest_init+0xc0/0xc0
> [   10.907021]  kernel_init+0x16/0x120
> [   10.907312]  ret_from_fork+0x1f/0x30
> [   10.907617]  </TASK>
> [   10.907796] ---[ end trace 0000000000000000 ]---
> [   10.908159] md: Autodetecting RAID arrays.
> [   10.908160] md: autorun ...
> [   10.908160] md: ... autorun DONE.
> [   10.911301] EXT4-fs (sda1): INFO: recovery required on readonly filesystem
> [   10.911303] EXT4-fs (sda1): write access will be enabled during recovery
> [   11.038001] EXT4-fs (sda1): orphan cleanup on readonly fs
> [   11.038466] EXT4-fs (sda1): 2 orphan inodes deleted
> [   11.038475] EXT4-fs (sda1): recovery complete
> [   11.053033] EXT4-fs (sda1): mounted filesystem with ordered data mode. Quota mode: none.
> [   11.053074] VFS: Mounted root (ext4 filesystem) readonly on device 8:1.
> [   11.053175] devtmpfs: mounted
> [   11.061921] Freeing unused kernel image (initmem) memory: 1680K
> [   11.061930] Write protecting the kernel read-only data: 24576k
> [   11.071108] Freeing unused kernel image (text/rodata gap) memory: 2032K
> [   11.073596] Freeing unused kernel image (rodata/data gap) memory: 924K
> [   11.073609] Run /sbin/init as init process
>
>
> Kind regards,
> Niklas
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v1] driver core: Extend deferred probe timeout on driver registration
@ 2022-05-30  4:59                             ` Saravana Kannan via iommu
  0 siblings, 0 replies; 38+ messages in thread
From: Saravana Kannan via iommu @ 2022-05-30  4:59 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: Ulf Hansson, Rafael J. Wysocki, Linus Walleij, linux-nvme,
	John Stultz, Pavel Machek, linux-s390, Will Deacon, Rob Herring,
	Jonathan Corbet, linux-doc, Geert Uytterhoeven, Thierry Reding,
	kernel-team, Sebastian Andrzej Siewior, linux-pm,
	Nathan Chancellor, linux-gpio, Mark Brown, linux-block,
	Greg Kroah-Hartman, Kevin Hilman, Rafael J. Wysocki,
	linux-kernel, Paul Kocialkowski, iommu

On Sun, May 29, 2022 at 1:34 AM 'Niklas Cassel' via kernel-team
<kernel-team@android.com> wrote:
>
> On Wed, May 25, 2022 at 12:49:00PM -0700, Saravana Kannan wrote:
> > On Wed, May 25, 2022 at 12:12 AM Sebastian Andrzej Siewior
> > <bigeasy@linutronix.de> wrote:
> > >
> > > On 2022-05-24 10:46:49 [-0700], Saravana Kannan wrote:
> > > > > Removing probe_timeout_waitqueue (as suggested) or setting the timeout
> > > > > to 0 avoids the delay.
> > > >
> > > > In your case, I think it might be working as intended? Curious, what
> > > > was the call stack in your case where it was blocked?
> > >
> > > Why is then there 10sec delay during boot? The backtrace is
> > > |------------[ cut here ]------------
> > > |WARNING: CPU: 4 PID: 1 at drivers/base/dd.c:742 wait_for_device_probe+0x30/0x110
> > > |Modules linked in:
> > > |CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc5+ #154
> > > |RIP: 0010:wait_for_device_probe+0x30/0x110
> > > |Call Trace:
> > > | <TASK>
> > > | prepare_namespace+0x2b/0x160
> > > | kernel_init_freeable+0x2b3/0x2dd
> > > | kernel_init+0x11/0x110
> > > | ret_from_fork+0x22/0x30
> > > | </TASK>
> > >
> > > Looking closer, it can't access init. This in particular box boots
> > > directly the kernel without an initramfs so the kernel later mounts
> > > /dev/sda1 and everything is good.  So that seems to be the reason…
> >
>
> Hello there,
>
> My (QEMU) boot times were recently extended by 10 seconds.
> Looking at the timestamps, it looks like nothing is being done for 10 whole
> seconds.
>
> A git bisect landed me at the patch in $subject:
> 2b28a1a84a0e ("driver core: Extend deferred probe timeout on driver registration")
>
> Adding a WARN_ON(1) in wait_for_device_probe(), as requested by the patch
> author from the others seeing a regression with this patch, gives two different
> stacktraces during boot:

Thanks for the report. My patch was exposing an existing issue. I
already sent a fix for that a few days ago. Waiting for it to get
picked up.

-Saravana

>
> [    0.459633] printk: console [netcon0] enabled
> [    0.459636] printk: console [netcon0] printing thread started
> [    0.459637] netconsole: network logging started
> [    0.459896] cfg80211: Loading compiled-in X.509 certificates for regulatory database
> [    0.460230] kworker/u8:6 (105) used greatest stack depth: 14744 bytes left
> [    0.461031] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
> [    0.461077] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
> [    0.461085] cfg80211: failed to load regulatory.db
> [    0.461113] ALSA device list:
> [    0.461116]   No soundcards found.
> [    0.461614] ------------[ cut here ]------------
> [    0.461615] WARNING: CPU: 2 PID: 1 at drivers/base/dd.c:741 wait_for_device_probe+0x1a/0x160
> [    0.485809] Modules linked in:
> [    0.486089] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.18.0-next-20220526-00004-g74f936013b08-dirty #20
> [    0.486842] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [    0.487707] RIP: 0010:wait_for_device_probe+0x1a/0x160
> [    0.488103] Code: 00 e8 fa e4 b5 ff 8b 44 24 04 48 83 c4 08 5b c3 0f 1f 44 00 00 53 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 <0f> 0b e8 1f ac 57 00 8b 15 f1 b3 24 01 85 d2 75 3d 48 c7 c7 60 2f
> [    0.489539] RSP: 0000:ffff9c7900013ed8 EFLAGS: 00010246
> [    0.489965] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000d02
> [    0.490597] RDX: 0000000000000cc2 RSI: 0000000000000000 RDI: 000000000002e990
> [    0.491181] RBP: 0000000000000214 R08: 000000000000000f R09: 0000000000000064
> [    0.491788] R10: ffff9c7900013c6c R11: 0000000000000000 R12: ffff8964c0343640
> [    0.492384] R13: ffffffff9e51791c R14: 0000000000000000 R15: 0000000000000000
> [    0.492960] FS:  0000000000000000(0000) GS:ffff896637d00000(0000) knlGS:0000000000000000
> [    0.493658] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.494501] CR2: 0000000000000000 CR3: 00000001ed20c001 CR4: 0000000000370ee0
> [    0.495621] Call Trace:
> [    0.496059]  <TASK>
> [    0.496266]  ? init_eaccess+0x3b/0x76
> [    0.496657]  prepare_namespace+0x30/0x16a
> [    0.497016]  kernel_init_freeable+0x207/0x212
> [    0.497407]  ? rest_init+0xc0/0xc0
> [    0.497714]  kernel_init+0x16/0x120
> [    0.498250]  ret_from_fork+0x1f/0x30
> [    0.498898]  </TASK>
> [    0.499307] ---[ end trace 0000000000000000 ]---
> [    0.748413] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [    0.749053] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [    0.749461] ata2.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
> [    0.749470] ata2.00: 732 sectors, multi 16: LBA48 NCQ (depth 32)
> [    0.749479] ata2.00: applying bridge limits
> [    0.750915] ata4: SATA link down (SStatus 0 SControl 300)
> [    0.752110] ata5: SATA link down (SStatus 0 SControl 300)
> [    0.753424] ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> [    0.754877] ata3: SATA link down (SStatus 0 SControl 300)
> [    0.755342] ata1.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
> [    0.755377] ata1.00: 268435456 sectors, multi 16: LBA48 NCQ (depth 32)
> [    0.755387] ata1.00: applying bridge limits
> [    0.755486] ata6.00: ATA-7: QEMU HARDDISK, version, max UDMA/100
> [    0.755492] ata6.00: 8388608 sectors, multi 16: LBA48 NCQ (depth 32)
> [    0.755500] ata6.00: applying bridge limits
> [    0.757330] ata1.00: configured for UDMA/100
> [    0.757441] ata6.00: configured for UDMA/100
> [    0.757505] ata2.00: configured for UDMA/100
> [    0.758015] scsi 0:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
> [    0.760542] sd 0:0:0:0: Attached scsi generic sg0 type 0
> [    0.760669] sd 0:0:0:0: [sda] 268435456 512-byte logical blocks: (137 GB/128 GiB)
> [    0.760778] sd 0:0:0:0: [sda] Write Protect is off
> [    0.760787] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> [    0.760847] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [    0.760981] sd 0:0:0:0: [sda] Preferred minimum I/O size 512 bytes
> [    0.761319] scsi 1:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
> [    0.762808] sd 1:0:0:0: Attached scsi generic sg1 type 0
> [    0.763004] sd 1:0:0:0: [sdb] 732 512-byte logical blocks: (375 kB/366 KiB)
> [    0.763649] sd 1:0:0:0: [sdb] Write Protect is off
> [    0.763661] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> [    0.763965] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [    0.764026] sd 1:0:0:0: [sdb] Preferred minimum I/O size 512 bytes
> [    0.765923] scsi 5:0:0:0: Direct-Access     ATA      QEMU HARDDISK    ion  PQ: 0 ANSI: 5
> [    0.767987] sd 5:0:0:0: Attached scsi generic sg2 type 0
> [    0.768626] sd 5:0:0:0: [sdc] 8388608 512-byte logical blocks: (4.29 GB/4.00 GiB)
> [    0.769909] sd 5:0:0:0: [sdc] Write Protect is off
> [    0.769920] sd 5:0:0:0: [sdc] Mode Sense: 00 3a 00 00
> [    0.770515] sd 1:0:0:0: [sdb] Attached SCSI disk
> [    0.770900] sd 5:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [    0.771782] sd 5:0:0:0: [sdc] Preferred minimum I/O size 512 bytes
> [    0.773900]  sda: sda1 sda14 sda15
> [    0.774499] sd 5:0:0:0: [sdc] Attached SCSI disk
> [    0.774907] sd 0:0:0:0: [sda] Attached SCSI disk
> [    1.084011] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
> [   10.887350] md: Waiting for all devices to be available before autodetect
> [   10.887395] md: If you don't use raid, use raid=noautodetect
> [   10.887445] ------------[ cut here ]------------
> [   10.887448] WARNING: CPU: 1 PID: 1 at drivers/base/dd.c:741 wait_for_device_probe+0x1a/0x160
> [   10.893987] Modules linked in:
> [   10.894314] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W         5.18.0-next-20220526-00004-g74f936013b08-dirty #20
> [   10.895389] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [   10.896590] RIP: 0010:wait_for_device_probe+0x1a/0x160
> [   10.897215] Code: 00 e8 fa e4 b5 ff 8b 44 24 04 48 83 c4 08 5b c3 0f 1f 44 00 00 53 48 83 ec 30 65 48 8b 04 25 28 00 00 00 48 89 44 24 28 31 c0 <0f> 0b e8 1f ac 57 00 8b 15 f1 b3 24 01 85 d2 75 3d 48 c7 c7 60 2f
> [   10.899269] RSP: 0000:ffff9c7900013ec8 EFLAGS: 00010246
> [   10.899762] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000000
> [   10.900424] RDX: 0000000000000000 RSI: ffffffff9d998089 RDI: 00000000ffffffff
> [   10.901060] RBP: 0000000000000214 R08: 80000000ffffe1ca R09: ffff9c7900013ea8
> [   10.901839] R10: 3fffffffffffffff R11: 0000000000000000 R12: ffff8964c0343640
> [   10.902615] R13: ffffffff9e51791c R14: 0000000000000000 R15: 0000000000000000
> [   10.903371] FS:  0000000000000000(0000) GS:ffff896637c80000(0000) knlGS:0000000000000000
> [   10.904188] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   10.904721] CR2: 0000000000000000 CR3: 00000001ed20c001 CR4: 0000000000370ee0
> [   10.905340] Call Trace:
> [   10.905558]  <TASK>
> [   10.905739]  md_run_setup+0x3c/0x6a
> [   10.906036]  prepare_namespace+0x35/0x16a
> [   10.906391]  kernel_init_freeable+0x207/0x212
> [   10.906748]  ? rest_init+0xc0/0xc0
> [   10.907021]  kernel_init+0x16/0x120
> [   10.907312]  ret_from_fork+0x1f/0x30
> [   10.907617]  </TASK>
> [   10.907796] ---[ end trace 0000000000000000 ]---
> [   10.908159] md: Autodetecting RAID arrays.
> [   10.908160] md: autorun ...
> [   10.908160] md: ... autorun DONE.
> [   10.911301] EXT4-fs (sda1): INFO: recovery required on readonly filesystem
> [   10.911303] EXT4-fs (sda1): write access will be enabled during recovery
> [   11.038001] EXT4-fs (sda1): orphan cleanup on readonly fs
> [   11.038466] EXT4-fs (sda1): 2 orphan inodes deleted
> [   11.038475] EXT4-fs (sda1): recovery complete
> [   11.053033] EXT4-fs (sda1): mounted filesystem with ordered data mode. Quota mode: none.
> [   11.053074] VFS: Mounted root (ext4 filesystem) readonly on device 8:1.
> [   11.053175] devtmpfs: mounted
> [   11.061921] Freeing unused kernel image (initmem) memory: 1680K
> [   11.061930] Write protecting the kernel read-only data: 24576k
> [   11.071108] Freeing unused kernel image (text/rodata gap) memory: 2032K
> [   11.073596] Freeing unused kernel image (rodata/data gap) memory: 924K
> [   11.073609] Run /sbin/init as init process
>
>
> Kind regards,
> Niklas
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2022-05-30  6:08 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-29 22:09 [PATCH v1] driver core: Extend deferred probe timeout on driver registration Saravana Kannan
2022-04-29 22:09 ` Saravana Kannan via iommu
2022-05-13 12:48 ` Mark Brown
2022-05-13 12:48   ` Mark Brown
2022-05-13 13:58 ` Rob Herring
2022-05-13 13:58   ` Rob Herring
2022-05-13 17:25   ` Saravana Kannan
2022-05-13 17:25     ` Saravana Kannan via iommu
2022-05-16 13:49     ` Rob Herring
2022-05-16 13:49       ` Rob Herring
2022-05-20 23:30 ` Nathan Chancellor
2022-05-20 23:30   ` Nathan Chancellor
2022-05-20 23:49   ` Saravana Kannan
2022-05-20 23:49     ` Saravana Kannan via iommu
2022-05-21  0:04     ` Nathan Chancellor
2022-05-21  0:04       ` Nathan Chancellor
2022-05-21  0:15       ` Saravana Kannan
2022-05-21  0:15         ` Saravana Kannan via iommu
2022-05-23 15:17         ` Nathan Chancellor
2022-05-23 15:17           ` Nathan Chancellor
2022-05-23 20:04           ` Saravana Kannan
2022-05-23 20:04             ` Saravana Kannan via iommu
2022-05-23 22:14             ` Nathan Chancellor
2022-05-23 22:14               ` Nathan Chancellor
2022-05-24  3:43               ` Saravana Kannan
2022-05-24  3:43                 ` Saravana Kannan via iommu
2022-05-24 16:41                 ` Sebastian Andrzej Siewior
2022-05-24 16:41                   ` Sebastian Andrzej Siewior
2022-05-24 17:46                   ` Saravana Kannan
2022-05-24 17:46                     ` Saravana Kannan via iommu
2022-05-25  7:11                     ` Sebastian Andrzej Siewior
2022-05-25  7:11                       ` Sebastian Andrzej Siewior
2022-05-25 19:49                       ` Saravana Kannan
2022-05-25 19:49                         ` Saravana Kannan via iommu
2022-05-29  8:34                         ` Niklas Cassel
2022-05-29  8:34                           ` Niklas Cassel via iommu
2022-05-30  4:59                           ` Saravana Kannan
2022-05-30  4:59                             ` Saravana Kannan via iommu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.