All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] Make fw_devlink=on more forgiving
@ 2021-02-02  4:33 Saravana Kannan
  2021-02-02  4:33 ` [PATCH v2 1/3] driver core: fw_devlink: Detect supplier devices that will never be added Saravana Kannan
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Saravana Kannan @ 2021-02-02  4:33 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Marek Szyprowski,
	Geert Uytterhoeven, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Martin Kaiser, Rob Herring, Frank Rowand,
	Len Brown
  Cc: Saravana Kannan, linux-kernel, devicetree, linux-acpi, kernel-team

This patch series solves two general issues with fw_devlink=on

Patch 1/3 and 3/3 addresses the issue of firmware nodes that look like
they'll have struct devices created for them, but will never actually
have struct devices added for them. For example, DT nodes with a
compatible property that don't have devices added for them.

Patch 2/2 address (for static kernels) the issue of optional suppliers
that'll never have a driver registered for them. So, if the device could
have probed with fw_devlink=permissive with a static kernel, this patch
should allow those devices to probe with a fw_devlink=on. This doesn't
solve it for the case where modules are enabled because there's no way
to tell if a driver will never be registered or it's just about to be
registered. I have some other ideas for that, but it'll have to come
later thinking about it a bit.

Marek, Geert,

I don't expect v2 to do any better for your cases.

This series not making any difference for Marek is still a mystery to
me. I guess one of the consumers doesn't take too well to its probe (and
it's consumers' probe) being delayed till late_initcall(). I'll continue
looking into it.

Marc,

This v2 should do better than v1 with gpiolib stub driver reverted. I
forgot to take care of the case where more suppliers could link after I
went and deleted some of the links. v2 handles that now.

Tudor,

You should still make the clock driver fix (because it's a bug), but I
think this series will fix your issue too (even without the clock driver
fix). Can you please give this a shot?

Martin,

If you tested this series, can you please give a Tested-by?

Thanks,
Saravana

v1 -> v2:
Patch 1: Added a flag to fwnodes that aren't devices.
Patch 3: New patch to ise the flag set in patch 1 to not create bad links.

Saravana Kannan (3):
  driver core: fw_devlink: Detect supplier devices that will never be
    added
  driver core: fw_devlink: Handle missing drivers for optional suppliers
  of: property: Don't add links to absent suppliers

 drivers/base/base.h    |   2 +
 drivers/base/core.c    | 135 +++++++++++++++++++++++++++++++++++------
 drivers/base/dd.c      |   5 ++
 drivers/of/property.c  |   4 +-
 include/linux/fwnode.h |   2 +
 5 files changed, 127 insertions(+), 21 deletions(-)

-- 
2.30.0.365.g02bc693789-goog


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 1/3] driver core: fw_devlink: Detect supplier devices that will never be added
  2021-02-02  4:33 [PATCH v2 0/3] Make fw_devlink=on more forgiving Saravana Kannan
@ 2021-02-02  4:33 ` Saravana Kannan
  2021-02-02 14:12   ` Rafael J. Wysocki
  2021-02-02  4:33 ` [PATCH v2 2/3] driver core: fw_devlink: Handle missing drivers for optional suppliers Saravana Kannan
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Saravana Kannan @ 2021-02-02  4:33 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Marek Szyprowski,
	Geert Uytterhoeven, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Martin Kaiser, Rob Herring, Frank Rowand,
	Len Brown, Saravana Kannan
  Cc: linux-kernel, devicetree, linux-acpi, kernel-team

During the initial parsing of firmware by fw_devlink, fw_devlink might
infer that some supplier firmware nodes would get populated as devices.
But the inference is not always correct. This patch tries to logically
detect and fix such mistakes as boot progresses or more devices probe.

fw_devlink makes a fundamental assumption that once a device binds to a
driver, it will populate (i.e: add as struct devices) all the child
firmware nodes that could be populated as devices (if they aren't
populated already).

So, whenever a device probes, we check all its child firmware nodes. If
a child firmware node has a corresponding device populated, we don't
modify the child node or its descendants. However, if a child firmware
node has not been populated as a device, we delete all the fwnode links
where the child node or its descendants are suppliers. This ensures that
no other device is blocked on a firmware node that will never be
populated as a device. We also mark such fwnodes as NOT_DEVICE, so that
no new fwnode links are created with these nodes as suppliers.

Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/base/core.c    | 31 ++++++++++++++++++++++++++++---
 include/linux/fwnode.h |  2 ++
 2 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 484a942884ba..c95b1daabac7 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -148,6 +148,21 @@ void fwnode_links_purge(struct fwnode_handle *fwnode)
 	fwnode_links_purge_consumers(fwnode);
 }
 
+static void fw_devlink_purge_absent_suppliers(struct fwnode_handle *fwnode)
+{
+	struct fwnode_handle *child;
+
+	/* Don't purge consumer links of an added child */
+	if (fwnode->dev)
+		return;
+
+	fwnode->flags |= FWNODE_FLAG_NOT_DEVICE;
+	fwnode_links_purge_consumers(fwnode);
+
+	fwnode_for_each_available_child_node(fwnode, child)
+		fw_devlink_purge_absent_suppliers(child);
+}
+
 #ifdef CONFIG_SRCU
 static DEFINE_MUTEX(device_links_lock);
 DEFINE_STATIC_SRCU(device_links_srcu);
@@ -1154,12 +1169,22 @@ void device_links_driver_bound(struct device *dev)
 	LIST_HEAD(sync_list);
 
 	/*
-	 * If a device probes successfully, it's expected to have created all
+	 * If a device binds successfully, it's expected to have created all
 	 * the device links it needs to or make new device links as it needs
-	 * them. So, it no longer needs to wait on any suppliers.
+	 * them. So, fw_devlink no longer needs to create device links to any
+	 * of the device's suppliers.
+	 *
+	 * Also, if a child firmware node of this bound device is not added as
+	 * a device by now, assume it is never going to be added and make sure
+	 * other devices don't defer probe indefinitely by waiting for such a
+	 * child device.
 	 */
-	if (dev->fwnode && dev->fwnode->dev == dev)
+	if (dev->fwnode && dev->fwnode->dev == dev) {
+		struct fwnode_handle *child;
 		fwnode_links_purge_suppliers(dev->fwnode);
+		fwnode_for_each_available_child_node(dev->fwnode, child)
+			fw_devlink_purge_absent_suppliers(child);
+	}
 	device_remove_file(dev, &dev_attr_waiting_for_supplier);
 
 	device_links_write_lock();
diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
index fde4ad97564c..21082f11473f 100644
--- a/include/linux/fwnode.h
+++ b/include/linux/fwnode.h
@@ -19,8 +19,10 @@ struct device;
  * fwnode link flags
  *
  * LINKS_ADDED: The fwnode has already be parsed to add fwnode links.
+ * NOT_DEVICE: The fwnode will never be populated as a struct device.
  */
 #define FWNODE_FLAG_LINKS_ADDED		BIT(0)
+#define FWNODE_FLAG_NOT_DEVICE		BIT(1)
 
 struct fwnode_handle {
 	struct fwnode_handle *secondary;
-- 
2.30.0.365.g02bc693789-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 2/3] driver core: fw_devlink: Handle missing drivers for optional suppliers
  2021-02-02  4:33 [PATCH v2 0/3] Make fw_devlink=on more forgiving Saravana Kannan
  2021-02-02  4:33 ` [PATCH v2 1/3] driver core: fw_devlink: Detect supplier devices that will never be added Saravana Kannan
@ 2021-02-02  4:33 ` Saravana Kannan
  2021-02-02 14:34   ` Rafael J. Wysocki
  2021-02-02  4:33 ` [PATCH v2 3/3] of: property: Don't add links to absent suppliers Saravana Kannan
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Saravana Kannan @ 2021-02-02  4:33 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Marek Szyprowski,
	Geert Uytterhoeven, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Martin Kaiser, Rob Herring, Frank Rowand,
	Len Brown, Saravana Kannan
  Cc: linux-kernel, devicetree, linux-acpi, kernel-team

After a deferred probe attempt has exhaused all the devices that can be
bound, any device that remains unbound has one/both of these conditions
true:

(1) It is waiting on its supplier to bind
(2) It does not have a matching driver

So, to make fw_devlink=on more forgiving of missing drivers for optional
suppliers, after we've done a full deferred probe attempt, this patch
deletes all device links created by fw_devlink where the supplier hasn't
probed yet and the supplier itself is not waiting on any of its
suppliers. This allows consumers to probe during another deferred probe
attempt if they were waiting on optional suppliers.

When modules are enabled, we can't differentiate between a driver
that'll never be registered vs a driver that'll be registered soon by
loading a module. So, this patch doesn't do anything for the case where
modules are enabled.

Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/base/base.h |   2 +
 drivers/base/core.c | 104 ++++++++++++++++++++++++++++++++++++--------
 drivers/base/dd.c   |   5 +++
 3 files changed, 94 insertions(+), 17 deletions(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index f5600a83124f..34befe9475cb 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -186,6 +186,8 @@ extern void device_links_no_driver(struct device *dev);
 extern bool device_links_busy(struct device *dev);
 extern void device_links_unbind_consumers(struct device *dev);
 
+bool fw_devlink_deferred_probe_retry(void);
+
 /* device pm support */
 void device_pm_move_to_tail(struct device *dev);
 
diff --git a/drivers/base/core.c b/drivers/base/core.c
index c95b1daabac7..5e53fc6a21ea 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -50,6 +50,7 @@ static LIST_HEAD(deferred_sync);
 static unsigned int defer_sync_state_count = 1;
 static DEFINE_MUTEX(fwnode_link_lock);
 static bool fw_devlink_is_permissive(void);
+static bool fw_devlink_def_probe_retry;
 
 /**
  * fwnode_link_add - Create a link between two fwnode_handles.
@@ -881,6 +882,13 @@ static void device_link_put_kref(struct device_link *link)
 		WARN(1, "Unable to drop a managed device link reference\n");
 }
 
+static void device_link_drop_managed(struct device_link *link)
+{
+	link->flags &= ~DL_FLAG_MANAGED;
+	WRITE_ONCE(link->status, DL_STATE_NONE);
+	kref_put(&link->kref, __device_link_del);
+}
+
 /**
  * device_link_del - Delete a stateless link between two devices.
  * @link: Device link to delete.
@@ -943,6 +951,29 @@ static void device_links_missing_supplier(struct device *dev)
 	}
 }
 
+/**
+ * device_links_probe_blocked_by - Return first supplier blocking probe
+ * @dev: Consumer device.
+ *
+ * Checks if the probe of @dev is blocked by a supplier without a driver. If
+ * yes, return that supplier dev. Otherwise, return NULL.
+ */
+static struct device *device_links_probe_blocked_by(struct device *dev)
+{
+	struct device_link *link;
+
+	list_for_each_entry(link, &dev->links.suppliers, c_node) {
+		if (!(link->flags & DL_FLAG_MANAGED) ||
+		    link->flags & DL_FLAG_SYNC_STATE_ONLY)
+			continue;
+
+		if (link->status != DL_STATE_AVAILABLE)
+			return link->supplier;
+	}
+
+	return NULL;
+}
+
 /**
  * device_links_check_suppliers - Check presence of supplier drivers.
  * @dev: Consumer device.
@@ -961,7 +992,7 @@ static void device_links_missing_supplier(struct device *dev)
  */
 int device_links_check_suppliers(struct device *dev)
 {
-	struct device_link *link;
+	struct device_link *link, *tmp;
 	int ret = 0;
 
 	/*
@@ -982,19 +1013,47 @@ int device_links_check_suppliers(struct device *dev)
 
 	device_links_write_lock();
 
-	list_for_each_entry(link, &dev->links.suppliers, c_node) {
+	list_for_each_entry_safe(link, tmp, &dev->links.suppliers, c_node) {
 		if (!(link->flags & DL_FLAG_MANAGED))
 			continue;
 
-		if (link->status != DL_STATE_AVAILABLE &&
-		    !(link->flags & DL_FLAG_SYNC_STATE_ONLY)) {
-			device_links_missing_supplier(dev);
-			dev_dbg(dev, "probe deferral - supplier %s not ready\n",
-				dev_name(link->supplier));
-			ret = -EPROBE_DEFER;
-			break;
+
+		if (link->status == DL_STATE_AVAILABLE ||
+		    link->flags & DL_FLAG_SYNC_STATE_ONLY) {
+			WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE);
+			continue;
+		}
+
+		/*
+		 * After a deferred probe attempt has exhaused all the devices
+		 * that can be bound, any device that remains unbound has
+		 * one/both of these conditions true:
+		 *
+		 * (1) It is waiting on its supplier to bind
+		 * (2) It does not have a matching driver
+		 *
+		 * If this device is waiting on a supplier to bind to a driver,
+		 * we make sure condition (1) above is not true for the
+		 * supplier. In which case, condition (2) has to be true for
+		 * the supplier. That is, the supplier doesn't have a matching
+		 * driver.
+		 *
+		 * When we find such a supplier, we delete the device link if
+		 * it was created by fw_devlink. This it to allow the consumer
+		 * to probe in case the supplier is an optional.
+		 */
+		if (fw_devlink_def_probe_retry &&
+		    link->flags & DL_FLAG_INFERRED &&
+		    !device_links_probe_blocked_by(link->supplier)) {
+			device_link_drop_managed(link);
+			continue;
 		}
-		WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE);
+
+		device_links_missing_supplier(dev);
+		dev_dbg(dev, "probe deferral - supplier %s not ready\n",
+			dev_name(link->supplier));
+		ret = -EPROBE_DEFER;
+		break;
 	}
 	dev->links.status = DL_DEV_PROBING;
 
@@ -1132,13 +1191,6 @@ static void __device_links_supplier_defer_sync(struct device *sup)
 		list_add_tail(&sup->links.defer_sync, &deferred_sync);
 }
 
-static void device_link_drop_managed(struct device_link *link)
-{
-	link->flags &= ~DL_FLAG_MANAGED;
-	WRITE_ONCE(link->status, DL_STATE_NONE);
-	kref_put(&link->kref, __device_link_del);
-}
-
 static ssize_t waiting_for_supplier_show(struct device *dev,
 					 struct device_attribute *attr,
 					 char *buf)
@@ -1597,6 +1649,24 @@ static int fw_devlink_relax_cycle(struct device *con, void *sup)
 	return ret;
 }
 
+/** fw_devlink_deferred_probe_retry - Set up fw_devlink for probe retries
+ *
+ * This function requests fw_devlink to set itself up for a deferred probe
+ * retry. This allows fw_devlink to ignore device links it created to
+ * suppliers that'll never probe. This is necessary in case some of the
+ * suppliers are optional and their consumers can probe without them.
+ *
+ * Returns true if deferred probe retry is likely to make any difference.
+ */
+bool fw_devlink_deferred_probe_retry(void)
+{
+	if (IS_ENABLED(CONFIG_MODULES))
+		return false;
+
+	fw_devlink_def_probe_retry = true;
+	return fw_devlink_get_flags() && !fw_devlink_is_permissive();
+}
+
 /**
  * fw_devlink_create_devlink - Create a device link from a consumer to fwnode
  * @con - Consumer device for the device link
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 9179825ff646..11325df2327f 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -317,6 +317,11 @@ static int deferred_probe_initcall(void)
 	driver_deferred_probe_trigger();
 	/* Sort as many dependencies as possible before exiting initcalls */
 	flush_work(&deferred_probe_work);
+
+	if (fw_devlink_deferred_probe_retry()) {
+		driver_deferred_probe_trigger();
+		flush_work(&deferred_probe_work);
+	}
 	initcalls_done = true;
 
 	/*
-- 
2.30.0.365.g02bc693789-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 3/3] of: property: Don't add links to absent suppliers
  2021-02-02  4:33 [PATCH v2 0/3] Make fw_devlink=on more forgiving Saravana Kannan
  2021-02-02  4:33 ` [PATCH v2 1/3] driver core: fw_devlink: Detect supplier devices that will never be added Saravana Kannan
  2021-02-02  4:33 ` [PATCH v2 2/3] driver core: fw_devlink: Handle missing drivers for optional suppliers Saravana Kannan
@ 2021-02-02  4:33 ` Saravana Kannan
  2021-02-04 19:17   ` Saravana Kannan
  2021-02-02 16:52 ` [PATCH v2 0/3] Make fw_devlink=on more forgiving Tudor.Ambarus
  2021-02-02 21:22 ` Martin Kaiser
  4 siblings, 1 reply; 22+ messages in thread
From: Saravana Kannan @ 2021-02-02  4:33 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Marek Szyprowski,
	Geert Uytterhoeven, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Martin Kaiser, Rob Herring, Frank Rowand,
	Len Brown, Saravana Kannan
  Cc: linux-kernel, devicetree, linux-acpi, kernel-team

If driver core marks a firmware node as not a device, don't add fwnode
links where it's a supplier.

Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/of/property.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/of/property.c b/drivers/of/property.c
index 6287c6d60bb7..53d163c8d39b 100644
--- a/drivers/of/property.c
+++ b/drivers/of/property.c
@@ -1103,7 +1103,9 @@ static int of_link_to_phandle(struct device_node *con_np,
 	 * created for them.
 	 */
 	sup_dev = get_dev_from_fwnode(&sup_np->fwnode);
-	if (!sup_dev && of_node_check_flag(sup_np, OF_POPULATED)) {
+	if (!sup_dev &&
+	    (of_node_check_flag(sup_np, OF_POPULATED) ||
+	     sup_np->fwnode.flags & FWNODE_FLAG_NOT_DEVICE)) {
 		pr_debug("Not linking %pOFP to %pOFP - No struct device\n",
 			 con_np, sup_np);
 		of_node_put(sup_np);
-- 
2.30.0.365.g02bc693789-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/3] driver core: fw_devlink: Detect supplier devices that will never be added
  2021-02-02  4:33 ` [PATCH v2 1/3] driver core: fw_devlink: Detect supplier devices that will never be added Saravana Kannan
@ 2021-02-02 14:12   ` Rafael J. Wysocki
  2021-02-02 19:35     ` Saravana Kannan
  0 siblings, 1 reply; 22+ messages in thread
From: Rafael J. Wysocki @ 2021-02-02 14:12 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Marek Szyprowski,
	Geert Uytterhoeven, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Martin Kaiser, Rob Herring, Frank Rowand,
	Len Brown, Linux Kernel Mailing List, devicetree,
	ACPI Devel Maling List, Cc: Android Kernel

On Tue, Feb 2, 2021 at 5:33 AM Saravana Kannan <saravanak@google.com> wrote:
>
> During the initial parsing of firmware by fw_devlink, fw_devlink might
> infer that some supplier firmware nodes would get populated as devices.
> But the inference is not always correct. This patch tries to logically
> detect and fix such mistakes as boot progresses or more devices probe.
>
> fw_devlink makes a fundamental assumption that once a device binds to a
> driver, it will populate (i.e: add as struct devices) all the child
> firmware nodes that could be populated as devices (if they aren't
> populated already).
>
> So, whenever a device probes, we check all its child firmware nodes. If
> a child firmware node has a corresponding device populated, we don't
> modify the child node or its descendants. However, if a child firmware
> node has not been populated as a device, we delete all the fwnode links
> where the child node or its descendants are suppliers. This ensures that
> no other device is blocked on a firmware node that will never be
> populated as a device. We also mark such fwnodes as NOT_DEVICE, so that
> no new fwnode links are created with these nodes as suppliers.
>
> Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
> Signed-off-by: Saravana Kannan <saravanak@google.com>

Still ACKed.

> ---
>  drivers/base/core.c    | 31 ++++++++++++++++++++++++++++---
>  include/linux/fwnode.h |  2 ++
>  2 files changed, 30 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 484a942884ba..c95b1daabac7 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -148,6 +148,21 @@ void fwnode_links_purge(struct fwnode_handle *fwnode)
>         fwnode_links_purge_consumers(fwnode);
>  }
>
> +static void fw_devlink_purge_absent_suppliers(struct fwnode_handle *fwnode)
> +{
> +       struct fwnode_handle *child;
> +
> +       /* Don't purge consumer links of an added child */
> +       if (fwnode->dev)
> +               return;
> +
> +       fwnode->flags |= FWNODE_FLAG_NOT_DEVICE;
> +       fwnode_links_purge_consumers(fwnode);
> +
> +       fwnode_for_each_available_child_node(fwnode, child)
> +               fw_devlink_purge_absent_suppliers(child);
> +}
> +
>  #ifdef CONFIG_SRCU
>  static DEFINE_MUTEX(device_links_lock);
>  DEFINE_STATIC_SRCU(device_links_srcu);
> @@ -1154,12 +1169,22 @@ void device_links_driver_bound(struct device *dev)
>         LIST_HEAD(sync_list);
>
>         /*
> -        * If a device probes successfully, it's expected to have created all
> +        * If a device binds successfully, it's expected to have created all
>          * the device links it needs to or make new device links as it needs
> -        * them. So, it no longer needs to wait on any suppliers.
> +        * them. So, fw_devlink no longer needs to create device links to any
> +        * of the device's suppliers.
> +        *
> +        * Also, if a child firmware node of this bound device is not added as
> +        * a device by now, assume it is never going to be added and make sure
> +        * other devices don't defer probe indefinitely by waiting for such a
> +        * child device.
>          */
> -       if (dev->fwnode && dev->fwnode->dev == dev)
> +       if (dev->fwnode && dev->fwnode->dev == dev) {
> +               struct fwnode_handle *child;
>                 fwnode_links_purge_suppliers(dev->fwnode);
> +               fwnode_for_each_available_child_node(dev->fwnode, child)
> +                       fw_devlink_purge_absent_suppliers(child);
> +       }
>         device_remove_file(dev, &dev_attr_waiting_for_supplier);
>
>         device_links_write_lock();
> diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
> index fde4ad97564c..21082f11473f 100644
> --- a/include/linux/fwnode.h
> +++ b/include/linux/fwnode.h
> @@ -19,8 +19,10 @@ struct device;
>   * fwnode link flags
>   *
>   * LINKS_ADDED: The fwnode has already be parsed to add fwnode links.
> + * NOT_DEVICE: The fwnode will never be populated as a struct device.
>   */
>  #define FWNODE_FLAG_LINKS_ADDED                BIT(0)
> +#define FWNODE_FLAG_NOT_DEVICE         BIT(1)
>
>  struct fwnode_handle {
>         struct fwnode_handle *secondary;
> --
> 2.30.0.365.g02bc693789-goog
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 2/3] driver core: fw_devlink: Handle missing drivers for optional suppliers
  2021-02-02  4:33 ` [PATCH v2 2/3] driver core: fw_devlink: Handle missing drivers for optional suppliers Saravana Kannan
@ 2021-02-02 14:34   ` Rafael J. Wysocki
  2021-02-02 19:46     ` Saravana Kannan
  0 siblings, 1 reply; 22+ messages in thread
From: Rafael J. Wysocki @ 2021-02-02 14:34 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Marek Szyprowski,
	Geert Uytterhoeven, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Martin Kaiser, Rob Herring, Frank Rowand,
	Len Brown, Linux Kernel Mailing List, devicetree,
	ACPI Devel Maling List, Cc: Android Kernel

On Tue, Feb 2, 2021 at 5:33 AM Saravana Kannan <saravanak@google.com> wrote:
>
> After a deferred probe attempt has exhaused all the devices that can be
> bound, any device that remains unbound has one/both of these conditions
> true:
>
> (1) It is waiting on its supplier to bind
> (2) It does not have a matching driver
>
> So, to make fw_devlink=on more forgiving of missing drivers for optional
> suppliers, after we've done a full deferred probe attempt, this patch
> deletes all device links created by fw_devlink where the supplier hasn't
> probed yet and the supplier itself is not waiting on any of its
> suppliers. This allows consumers to probe during another deferred probe
> attempt if they were waiting on optional suppliers.
>
> When modules are enabled, we can't differentiate between a driver
> that'll never be registered vs a driver that'll be registered soon by
> loading a module. So, this patch doesn't do anything for the case where
> modules are enabled.
>
> Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
> Signed-off-by: Saravana Kannan <saravanak@google.com>
> ---
>  drivers/base/base.h |   2 +
>  drivers/base/core.c | 104 ++++++++++++++++++++++++++++++++++++--------
>  drivers/base/dd.c   |   5 +++
>  3 files changed, 94 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/base/base.h b/drivers/base/base.h
> index f5600a83124f..34befe9475cb 100644
> --- a/drivers/base/base.h
> +++ b/drivers/base/base.h
> @@ -186,6 +186,8 @@ extern void device_links_no_driver(struct device *dev);
>  extern bool device_links_busy(struct device *dev);
>  extern void device_links_unbind_consumers(struct device *dev);
>
> +bool fw_devlink_deferred_probe_retry(void);
> +
>  /* device pm support */
>  void device_pm_move_to_tail(struct device *dev);
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index c95b1daabac7..5e53fc6a21ea 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -50,6 +50,7 @@ static LIST_HEAD(deferred_sync);
>  static unsigned int defer_sync_state_count = 1;
>  static DEFINE_MUTEX(fwnode_link_lock);
>  static bool fw_devlink_is_permissive(void);
> +static bool fw_devlink_def_probe_retry;
>
>  /**
>   * fwnode_link_add - Create a link between two fwnode_handles.
> @@ -881,6 +882,13 @@ static void device_link_put_kref(struct device_link *link)
>                 WARN(1, "Unable to drop a managed device link reference\n");
>  }
>
> +static void device_link_drop_managed(struct device_link *link)
> +{
> +       link->flags &= ~DL_FLAG_MANAGED;
> +       WRITE_ONCE(link->status, DL_STATE_NONE);
> +       kref_put(&link->kref, __device_link_del);
> +}
> +
>  /**
>   * device_link_del - Delete a stateless link between two devices.
>   * @link: Device link to delete.
> @@ -943,6 +951,29 @@ static void device_links_missing_supplier(struct device *dev)
>         }
>  }
>
> +/**
> + * device_links_probe_blocked_by - Return first supplier blocking probe
> + * @dev: Consumer device.
> + *
> + * Checks if the probe of @dev is blocked by a supplier without a driver. If
> + * yes, return that supplier dev. Otherwise, return NULL.
> + */
> +static struct device *device_links_probe_blocked_by(struct device *dev)
> +{
> +       struct device_link *link;
> +
> +       list_for_each_entry(link, &dev->links.suppliers, c_node) {
> +               if (!(link->flags & DL_FLAG_MANAGED) ||
> +                   link->flags & DL_FLAG_SYNC_STATE_ONLY)
> +                       continue;
> +
> +               if (link->status != DL_STATE_AVAILABLE)
> +                       return link->supplier;
> +       }
> +
> +       return NULL;

This is slightly confusing, because you don't actually use the
returned device pointer, but simply check it against NULL.

AFAICS this function can return bool and I'd call it
device_links_probe_blocked().

> +}
> +
>  /**
>   * device_links_check_suppliers - Check presence of supplier drivers.
>   * @dev: Consumer device.
> @@ -961,7 +992,7 @@ static void device_links_missing_supplier(struct device *dev)
>   */
>  int device_links_check_suppliers(struct device *dev)
>  {
> -       struct device_link *link;
> +       struct device_link *link, *tmp;
>         int ret = 0;
>
>         /*
> @@ -982,19 +1013,47 @@ int device_links_check_suppliers(struct device *dev)
>
>         device_links_write_lock();
>
> -       list_for_each_entry(link, &dev->links.suppliers, c_node) {
> +       list_for_each_entry_safe(link, tmp, &dev->links.suppliers, c_node) {
>                 if (!(link->flags & DL_FLAG_MANAGED))
>                         continue;
>
> -               if (link->status != DL_STATE_AVAILABLE &&
> -                   !(link->flags & DL_FLAG_SYNC_STATE_ONLY)) {
> -                       device_links_missing_supplier(dev);
> -                       dev_dbg(dev, "probe deferral - supplier %s not ready\n",
> -                               dev_name(link->supplier));
> -                       ret = -EPROBE_DEFER;
> -                       break;
> +
> +               if (link->status == DL_STATE_AVAILABLE ||
> +                   link->flags & DL_FLAG_SYNC_STATE_ONLY) {
> +                       WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE);
> +                       continue;
> +               }
> +
> +               /*
> +                * After a deferred probe attempt has exhaused all the devices
> +                * that can be bound, any device that remains unbound has
> +                * one/both of these conditions true:
> +                *
> +                * (1) It is waiting on its supplier to bind
> +                * (2) It does not have a matching driver
> +                *
> +                * If this device is waiting on a supplier to bind to a driver,
> +                * we make sure condition (1) above is not true for the
> +                * supplier. In which case, condition (2) has to be true for
> +                * the supplier. That is, the supplier doesn't have a matching
> +                * driver.
> +                *
> +                * When we find such a supplier, we delete the device link if
> +                * it was created by fw_devlink. This it to allow the consumer
> +                * to probe in case the supplier is an optional.
> +                */
> +               if (fw_devlink_def_probe_retry &&

I would put a IS_ENABLED(CONFIG_MODULES) check here to let the
compiler optimize out the code depending on it and make it clear that
this is a NOP if there are modules.

> +                   link->flags & DL_FLAG_INFERRED &&
> +                   !device_links_probe_blocked_by(link->supplier)) {
> +                       device_link_drop_managed(link);
> +                       continue;
>                 }
> -               WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE);
> +
> +               device_links_missing_supplier(dev);
> +               dev_dbg(dev, "probe deferral - supplier %s not ready\n",
> +                       dev_name(link->supplier));
> +               ret = -EPROBE_DEFER;
> +               break;
>         }
>         dev->links.status = DL_DEV_PROBING;
>
> @@ -1132,13 +1191,6 @@ static void __device_links_supplier_defer_sync(struct device *sup)
>                 list_add_tail(&sup->links.defer_sync, &deferred_sync);
>  }
>
> -static void device_link_drop_managed(struct device_link *link)
> -{
> -       link->flags &= ~DL_FLAG_MANAGED;
> -       WRITE_ONCE(link->status, DL_STATE_NONE);
> -       kref_put(&link->kref, __device_link_del);
> -}
> -
>  static ssize_t waiting_for_supplier_show(struct device *dev,
>                                          struct device_attribute *attr,
>                                          char *buf)
> @@ -1597,6 +1649,24 @@ static int fw_devlink_relax_cycle(struct device *con, void *sup)
>         return ret;
>  }
>
> +/** fw_devlink_deferred_probe_retry - Set up fw_devlink for probe retries

Kerneldoc format mistake.

> + *
> + * This function requests fw_devlink to set itself up for a deferred probe
> + * retry. This allows fw_devlink to ignore device links it created to
> + * suppliers that'll never probe. This is necessary in case some of the
> + * suppliers are optional and their consumers can probe without them.
> + *
> + * Returns true if deferred probe retry is likely to make any difference.
> + */
> +bool fw_devlink_deferred_probe_retry(void)
> +{
> +       if (IS_ENABLED(CONFIG_MODULES))
> +               return false;

To make the above more visible, I'd fold this function into the caller.

> +
> +       fw_devlink_def_probe_retry = true;
> +       return fw_devlink_get_flags() && !fw_devlink_is_permissive();
> +}
> +
>  /**
>   * fw_devlink_create_devlink - Create a device link from a consumer to fwnode
>   * @con - Consumer device for the device link
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 9179825ff646..11325df2327f 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -317,6 +317,11 @@ static int deferred_probe_initcall(void)
>         driver_deferred_probe_trigger();
>         /* Sort as many dependencies as possible before exiting initcalls */
>         flush_work(&deferred_probe_work);
> +
> +       if (fw_devlink_deferred_probe_retry()) {
> +               driver_deferred_probe_trigger();
> +               flush_work(&deferred_probe_work);
> +       }
>         initcalls_done = true;
>
>         /*
> --

Overall, the "let's do nothing if modules are not enabled" approach is
a bit disappointing, because what if somebody builds all of the
drivers needed for boot in and enables modules anyway, for example to
allow USB drivers to be probed dynamically?

They surely can be forgiven for expecting to get the right ordering
then, like in the "no modules" case ...

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/3] Make fw_devlink=on more forgiving
  2021-02-02  4:33 [PATCH v2 0/3] Make fw_devlink=on more forgiving Saravana Kannan
                   ` (2 preceding siblings ...)
  2021-02-02  4:33 ` [PATCH v2 3/3] of: property: Don't add links to absent suppliers Saravana Kannan
@ 2021-02-02 16:52 ` Tudor.Ambarus
  2021-02-02 17:41   ` Rob Herring
  2021-02-02 21:22 ` Martin Kaiser
  4 siblings, 1 reply; 22+ messages in thread
From: Tudor.Ambarus @ 2021-02-02 16:52 UTC (permalink / raw)
  To: saravanak, gregkh, rafael, m.szyprowski, geert, maz,
	linus.walleij, bgolaszewski, martin, robh+dt, frowand.list, lenb
  Cc: linux-kernel, devicetree, linux-acpi, kernel-team

Hi, Saravana,

On 2/2/21 6:33 AM, Saravana Kannan wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> 
> This patch series solves two general issues with fw_devlink=on
> 
> Patch 1/3 and 3/3 addresses the issue of firmware nodes that look like
> they'll have struct devices created for them, but will never actually
> have struct devices added for them. For example, DT nodes with a
> compatible property that don't have devices added for them.
> 
> Patch 2/2 address (for static kernels) the issue of optional suppliers
> that'll never have a driver registered for them. So, if the device could
> have probed with fw_devlink=permissive with a static kernel, this patch
> should allow those devices to probe with a fw_devlink=on. This doesn't
> solve it for the case where modules are enabled because there's no way
> to tell if a driver will never be registered or it's just about to be
> registered. I have some other ideas for that, but it'll have to come
> later thinking about it a bit.
> 
> Marek, Geert,
> 
> I don't expect v2 to do any better for your cases.
> 
> This series not making any difference for Marek is still a mystery to
> me. I guess one of the consumers doesn't take too well to its probe (and
> it's consumers' probe) being delayed till late_initcall(). I'll continue
> looking into it.
> 
> Marc,
> 
> This v2 should do better than v1 with gpiolib stub driver reverted. I
> forgot to take care of the case where more suppliers could link after I
> went and deleted some of the links. v2 handles that now.
> 
> Tudor,
> 
> You should still make the clock driver fix (because it's a bug), but I
> think this series will fix your issue too (even without the clock driver
> fix). Can you please give this a shot?
> 

Did the following tests (using sama5_defconfig and at91-sama5d2_xplained.dts):
1/ modular kernel with your v2 on top of next-20210202, and without the clock
driver fix: the problem persists.

2/ static kernel with your v2 on top of next-20210202, and without the clock
driver fix: the problem persists. Comparing to the previous test, I see that
the links to pmc are dropped. I can see the following only with early printk
enabled:
platform fc008000.serial: Dropping the link to f0014000.pmc
But later on, the serial still gets deferred waiting for the dma controller
this time:
platform f8020000.serial: probe deferral - supplier f0010000.dma-controller not ready
I'll check what happens in the dma-controller.

3/ modular kernel with your v2 and the clock driver fix on top of next-20210202:
I can boot the board and I have access to the console. I still have some
probe deferrals that I need to check:
root@sama5d2-xplained-sd:~# dmesg | grep -i defer
[    4.335625] platform b0000000.sdio-host: probe deferral - wait for supplier pmic@5b
[    4.474559] platform fc030000.adc: probe deferral - wait for supplier pmic@5b
[    4.884315] calling  deferred_probe_initcall+0x0/0xd8 @ 1
[    4.889206] platform fc030000.adc: probe deferral - wait for supplier pmic@5b
[    5.139447] initcall deferred_probe_initcall+0x0/0xd8 returned 0 after 245132 usecs
root@sama5d2-xplained-sd:~# dmesg | grep -i linked
[    4.916342] platform fc030000.adc: Linked as a consumer to 0-005b
[    4.921298] platform b0000000.sdio-host: Linked as a consumer to 0-005b
[    4.926817] i2c 0-005b: Linked as a sync state only consumer to fc038000.pinctrl
[    4.990246] platform act8945a-charger: Linked as a consumer to fc038000.pinctrl
[    5.078488] at24 1-0054: Linked as a consumer to regulator.0
[    5.111533] at91-sama5d2_adc fc030000.adc: Linked as a consumer to regulator.5
[    5.118969] sdhci-at91 b0000000.sdio-host: Linked as a consumer to regulator.3
root@sama5d2-xplained-sd:~# dmesg | grep -i drop
[    5.014026] act8945a 0-005b: Dropping the link to fc038000.pinctrl

4/ static kernel with your v2 and the clock driver fix on top of next-20210202:
I can boot the board and I have access to the console. The probe deferrals are the
same as in test 3/.

Cheers,
ta

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/3] Make fw_devlink=on more forgiving
  2021-02-02 16:52 ` [PATCH v2 0/3] Make fw_devlink=on more forgiving Tudor.Ambarus
@ 2021-02-02 17:41   ` Rob Herring
  2021-02-02 18:28     ` Saravana Kannan
  0 siblings, 1 reply; 22+ messages in thread
From: Rob Herring @ 2021-02-02 17:41 UTC (permalink / raw)
  To: Tudor Ambarus, Saravana Kannan
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Marek Szyprowski,
	Geert Uytterhoeven, Marc Zyngier, Linus Walleij,
	Bartosz Golaszewski, Martin Kaiser, Frank Rowand, Len Brown,
	linux-kernel, devicetree, open list:ACPI FOR ARM64 (ACPI/arm64),
	Android Kernel Team

On Tue, Feb 2, 2021 at 10:52 AM <Tudor.Ambarus@microchip.com> wrote:
>
> Hi, Saravana,
>
> On 2/2/21 6:33 AM, Saravana Kannan wrote:
> > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> >
> > This patch series solves two general issues with fw_devlink=on
> >
> > Patch 1/3 and 3/3 addresses the issue of firmware nodes that look like
> > they'll have struct devices created for them, but will never actually
> > have struct devices added for them. For example, DT nodes with a
> > compatible property that don't have devices added for them.
> >
> > Patch 2/2 address (for static kernels) the issue of optional suppliers
> > that'll never have a driver registered for them. So, if the device could
> > have probed with fw_devlink=permissive with a static kernel, this patch
> > should allow those devices to probe with a fw_devlink=on. This doesn't
> > solve it for the case where modules are enabled because there's no way
> > to tell if a driver will never be registered or it's just about to be
> > registered. I have some other ideas for that, but it'll have to come
> > later thinking about it a bit.
> >
> > Marek, Geert,
> >
> > I don't expect v2 to do any better for your cases.
> >
> > This series not making any difference for Marek is still a mystery to
> > me. I guess one of the consumers doesn't take too well to its probe (and
> > it's consumers' probe) being delayed till late_initcall(). I'll continue
> > looking into it.
> >
> > Marc,
> >
> > This v2 should do better than v1 with gpiolib stub driver reverted. I
> > forgot to take care of the case where more suppliers could link after I
> > went and deleted some of the links. v2 handles that now.
> >
> > Tudor,
> >
> > You should still make the clock driver fix (because it's a bug), but I
> > think this series will fix your issue too (even without the clock driver
> > fix). Can you please give this a shot?
> >
>
> Did the following tests (using sama5_defconfig and at91-sama5d2_xplained.dts):
> 1/ modular kernel with your v2 on top of next-20210202, and without the clock
> driver fix: the problem persists.
>
> 2/ static kernel with your v2 on top of next-20210202, and without the clock
> driver fix: the problem persists. Comparing to the previous test, I see that
> the links to pmc are dropped. I can see the following only with early printk
> enabled:
> platform fc008000.serial: Dropping the link to f0014000.pmc
> But later on, the serial still gets deferred waiting for the dma controller
> this time:
> platform f8020000.serial: probe deferral - supplier f0010000.dma-controller not ready
> I'll check what happens in the dma-controller.

Not sure if it's the case here, but some serial drivers use DMA only
when available and decide that on open() rather than probe. How is
devlinks going to deal with that?

Rob

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/3] Make fw_devlink=on more forgiving
  2021-02-02 17:41   ` Rob Herring
@ 2021-02-02 18:28     ` Saravana Kannan
  0 siblings, 0 replies; 22+ messages in thread
From: Saravana Kannan @ 2021-02-02 18:28 UTC (permalink / raw)
  To: Rob Herring
  Cc: Tudor Ambarus, Greg Kroah-Hartman, Rafael J. Wysocki,
	Marek Szyprowski, Geert Uytterhoeven, Marc Zyngier,
	Linus Walleij, Bartosz Golaszewski, Martin Kaiser, Frank Rowand,
	Len Brown, linux-kernel,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	open list:ACPI FOR ARM64 (ACPI/arm64),
	Android Kernel Team

On Tue, Feb 2, 2021 at 9:41 AM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Tue, Feb 2, 2021 at 10:52 AM <Tudor.Ambarus@microchip.com> wrote:
> >
> > Hi, Saravana,
> >
> > On 2/2/21 6:33 AM, Saravana Kannan wrote:
> > > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> > >
> > > This patch series solves two general issues with fw_devlink=on
> > >
> > > Patch 1/3 and 3/3 addresses the issue of firmware nodes that look like
> > > they'll have struct devices created for them, but will never actually
> > > have struct devices added for them. For example, DT nodes with a
> > > compatible property that don't have devices added for them.
> > >
> > > Patch 2/2 address (for static kernels) the issue of optional suppliers
> > > that'll never have a driver registered for them. So, if the device could
> > > have probed with fw_devlink=permissive with a static kernel, this patch
> > > should allow those devices to probe with a fw_devlink=on. This doesn't
> > > solve it for the case where modules are enabled because there's no way
> > > to tell if a driver will never be registered or it's just about to be
> > > registered. I have some other ideas for that, but it'll have to come
> > > later thinking about it a bit.
> > >
> > > Marek, Geert,
> > >
> > > I don't expect v2 to do any better for your cases.
> > >
> > > This series not making any difference for Marek is still a mystery to
> > > me. I guess one of the consumers doesn't take too well to its probe (and
> > > it's consumers' probe) being delayed till late_initcall(). I'll continue
> > > looking into it.
> > >
> > > Marc,
> > >
> > > This v2 should do better than v1 with gpiolib stub driver reverted. I
> > > forgot to take care of the case where more suppliers could link after I
> > > went and deleted some of the links. v2 handles that now.
> > >
> > > Tudor,
> > >
> > > You should still make the clock driver fix (because it's a bug), but I
> > > think this series will fix your issue too (even without the clock driver
> > > fix). Can you please give this a shot?
> > >
> >
> > Did the following tests (using sama5_defconfig and at91-sama5d2_xplained.dts):
> > 1/ modular kernel with your v2 on top of next-20210202, and without the clock
> > driver fix: the problem persists.
> >
> > 2/ static kernel with your v2 on top of next-20210202, and without the clock
> > driver fix: the problem persists. Comparing to the previous test, I see that
> > the links to pmc are dropped. I can see the following only with early printk
> > enabled:
> > platform fc008000.serial: Dropping the link to f0014000.pmc
> > But later on, the serial still gets deferred waiting for the dma controller
> > this time:
> > platform f8020000.serial: probe deferral - supplier f0010000.dma-controller not ready
> > I'll check what happens in the dma-controller.
>
> Not sure if it's the case here, but some serial drivers use DMA only
> when available and decide that on open() rather than probe. How is
> devlinks going to deal with that?

That's kinda what I'm trying to work out here :) but in a more generic fashion.

-Saravana

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 1/3] driver core: fw_devlink: Detect supplier devices that will never be added
  2021-02-02 14:12   ` Rafael J. Wysocki
@ 2021-02-02 19:35     ` Saravana Kannan
  0 siblings, 0 replies; 22+ messages in thread
From: Saravana Kannan @ 2021-02-02 19:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, Marek Szyprowski, Geert Uytterhoeven,
	Marc Zyngier, Tudor Ambarus, Linus Walleij, Bartosz Golaszewski,
	Martin Kaiser, Rob Herring, Frank Rowand, Len Brown,
	Linux Kernel Mailing List, devicetree, ACPI Devel Maling List,
	Cc: Android Kernel

On Tue, Feb 2, 2021 at 6:12 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Tue, Feb 2, 2021 at 5:33 AM Saravana Kannan <saravanak@google.com> wrote:
> >
> > During the initial parsing of firmware by fw_devlink, fw_devlink might
> > infer that some supplier firmware nodes would get populated as devices.
> > But the inference is not always correct. This patch tries to logically
> > detect and fix such mistakes as boot progresses or more devices probe.
> >
> > fw_devlink makes a fundamental assumption that once a device binds to a
> > driver, it will populate (i.e: add as struct devices) all the child
> > firmware nodes that could be populated as devices (if they aren't
> > populated already).
> >
> > So, whenever a device probes, we check all its child firmware nodes. If
> > a child firmware node has a corresponding device populated, we don't
> > modify the child node or its descendants. However, if a child firmware
> > node has not been populated as a device, we delete all the fwnode links
> > where the child node or its descendants are suppliers. This ensures that
> > no other device is blocked on a firmware node that will never be
> > populated as a device. We also mark such fwnodes as NOT_DEVICE, so that
> > no new fwnode links are created with these nodes as suppliers.
> >
> > Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
>
> Still ACKed.

Thanks. I didn't want to add your Ack when I made changes since your Ack.

-Saravana

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 2/3] driver core: fw_devlink: Handle missing drivers for optional suppliers
  2021-02-02 14:34   ` Rafael J. Wysocki
@ 2021-02-02 19:46     ` Saravana Kannan
  2021-02-04 18:41       ` Rafael J. Wysocki
  0 siblings, 1 reply; 22+ messages in thread
From: Saravana Kannan @ 2021-02-02 19:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, Marek Szyprowski, Geert Uytterhoeven,
	Marc Zyngier, Tudor Ambarus, Linus Walleij, Bartosz Golaszewski,
	Martin Kaiser, Rob Herring, Frank Rowand, Len Brown,
	Linux Kernel Mailing List, devicetree, ACPI Devel Maling List,
	Cc: Android Kernel

On Tue, Feb 2, 2021 at 6:34 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Tue, Feb 2, 2021 at 5:33 AM Saravana Kannan <saravanak@google.com> wrote:
> >
> > After a deferred probe attempt has exhaused all the devices that can be
> > bound, any device that remains unbound has one/both of these conditions
> > true:
> >
> > (1) It is waiting on its supplier to bind
> > (2) It does not have a matching driver
> >
> > So, to make fw_devlink=on more forgiving of missing drivers for optional
> > suppliers, after we've done a full deferred probe attempt, this patch
> > deletes all device links created by fw_devlink where the supplier hasn't
> > probed yet and the supplier itself is not waiting on any of its
> > suppliers. This allows consumers to probe during another deferred probe
> > attempt if they were waiting on optional suppliers.
> >
> > When modules are enabled, we can't differentiate between a driver
> > that'll never be registered vs a driver that'll be registered soon by
> > loading a module. So, this patch doesn't do anything for the case where
> > modules are enabled.
> >
> > Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > ---
> >  drivers/base/base.h |   2 +
> >  drivers/base/core.c | 104 ++++++++++++++++++++++++++++++++++++--------
> >  drivers/base/dd.c   |   5 +++
> >  3 files changed, 94 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/base/base.h b/drivers/base/base.h
> > index f5600a83124f..34befe9475cb 100644
> > --- a/drivers/base/base.h
> > +++ b/drivers/base/base.h
> > @@ -186,6 +186,8 @@ extern void device_links_no_driver(struct device *dev);
> >  extern bool device_links_busy(struct device *dev);
> >  extern void device_links_unbind_consumers(struct device *dev);
> >
> > +bool fw_devlink_deferred_probe_retry(void);
> > +
> >  /* device pm support */
> >  void device_pm_move_to_tail(struct device *dev);
> >
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index c95b1daabac7..5e53fc6a21ea 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -50,6 +50,7 @@ static LIST_HEAD(deferred_sync);
> >  static unsigned int defer_sync_state_count = 1;
> >  static DEFINE_MUTEX(fwnode_link_lock);
> >  static bool fw_devlink_is_permissive(void);
> > +static bool fw_devlink_def_probe_retry;
> >
> >  /**
> >   * fwnode_link_add - Create a link between two fwnode_handles.
> > @@ -881,6 +882,13 @@ static void device_link_put_kref(struct device_link *link)
> >                 WARN(1, "Unable to drop a managed device link reference\n");
> >  }
> >
> > +static void device_link_drop_managed(struct device_link *link)
> > +{
> > +       link->flags &= ~DL_FLAG_MANAGED;
> > +       WRITE_ONCE(link->status, DL_STATE_NONE);
> > +       kref_put(&link->kref, __device_link_del);
> > +}
> > +
> >  /**
> >   * device_link_del - Delete a stateless link between two devices.
> >   * @link: Device link to delete.
> > @@ -943,6 +951,29 @@ static void device_links_missing_supplier(struct device *dev)
> >         }
> >  }
> >
> > +/**
> > + * device_links_probe_blocked_by - Return first supplier blocking probe
> > + * @dev: Consumer device.
> > + *
> > + * Checks if the probe of @dev is blocked by a supplier without a driver. If
> > + * yes, return that supplier dev. Otherwise, return NULL.
> > + */
> > +static struct device *device_links_probe_blocked_by(struct device *dev)
> > +{
> > +       struct device_link *link;
> > +
> > +       list_for_each_entry(link, &dev->links.suppliers, c_node) {
> > +               if (!(link->flags & DL_FLAG_MANAGED) ||
> > +                   link->flags & DL_FLAG_SYNC_STATE_ONLY)
> > +                       continue;
> > +
> > +               if (link->status != DL_STATE_AVAILABLE)
> > +                       return link->supplier;
> > +       }
> > +
> > +       return NULL;
>
> This is slightly confusing, because you don't actually use the
> returned device pointer, but simply check it against NULL.
>
> AFAICS this function can return bool and I'd call it
> device_links_probe_blocked().

Yeah, I was writing it this way because I had other future uses for
this. But yeah, I'll just make it a bool and change it later when I
need to.

>
> > +}
> > +
> >  /**
> >   * device_links_check_suppliers - Check presence of supplier drivers.
> >   * @dev: Consumer device.
> > @@ -961,7 +992,7 @@ static void device_links_missing_supplier(struct device *dev)
> >   */
> >  int device_links_check_suppliers(struct device *dev)
> >  {
> > -       struct device_link *link;
> > +       struct device_link *link, *tmp;
> >         int ret = 0;
> >
> >         /*
> > @@ -982,19 +1013,47 @@ int device_links_check_suppliers(struct device *dev)
> >
> >         device_links_write_lock();
> >
> > -       list_for_each_entry(link, &dev->links.suppliers, c_node) {
> > +       list_for_each_entry_safe(link, tmp, &dev->links.suppliers, c_node) {
> >                 if (!(link->flags & DL_FLAG_MANAGED))
> >                         continue;
> >
> > -               if (link->status != DL_STATE_AVAILABLE &&
> > -                   !(link->flags & DL_FLAG_SYNC_STATE_ONLY)) {
> > -                       device_links_missing_supplier(dev);
> > -                       dev_dbg(dev, "probe deferral - supplier %s not ready\n",
> > -                               dev_name(link->supplier));
> > -                       ret = -EPROBE_DEFER;
> > -                       break;
> > +
> > +               if (link->status == DL_STATE_AVAILABLE ||
> > +                   link->flags & DL_FLAG_SYNC_STATE_ONLY) {
> > +                       WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE);
> > +                       continue;
> > +               }
> > +
> > +               /*
> > +                * After a deferred probe attempt has exhaused all the devices
> > +                * that can be bound, any device that remains unbound has
> > +                * one/both of these conditions true:
> > +                *
> > +                * (1) It is waiting on its supplier to bind
> > +                * (2) It does not have a matching driver
> > +                *
> > +                * If this device is waiting on a supplier to bind to a driver,
> > +                * we make sure condition (1) above is not true for the
> > +                * supplier. In which case, condition (2) has to be true for
> > +                * the supplier. That is, the supplier doesn't have a matching
> > +                * driver.
> > +                *
> > +                * When we find such a supplier, we delete the device link if
> > +                * it was created by fw_devlink. This it to allow the consumer
> > +                * to probe in case the supplier is an optional.
> > +                */
> > +               if (fw_devlink_def_probe_retry &&
>
> I would put a IS_ENABLED(CONFIG_MODULES) check here to let the
> compiler optimize out the code depending on it and make it clear that
> this is a NOP if there are modules.

Will do.

>
> > +                   link->flags & DL_FLAG_INFERRED &&
> > +                   !device_links_probe_blocked_by(link->supplier)) {
> > +                       device_link_drop_managed(link);
> > +                       continue;
> >                 }
> > -               WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE);
> > +
> > +               device_links_missing_supplier(dev);
> > +               dev_dbg(dev, "probe deferral - supplier %s not ready\n",
> > +                       dev_name(link->supplier));
> > +               ret = -EPROBE_DEFER;
> > +               break;
> >         }
> >         dev->links.status = DL_DEV_PROBING;
> >
> > @@ -1132,13 +1191,6 @@ static void __device_links_supplier_defer_sync(struct device *sup)
> >                 list_add_tail(&sup->links.defer_sync, &deferred_sync);
> >  }
> >
> > -static void device_link_drop_managed(struct device_link *link)
> > -{
> > -       link->flags &= ~DL_FLAG_MANAGED;
> > -       WRITE_ONCE(link->status, DL_STATE_NONE);
> > -       kref_put(&link->kref, __device_link_del);
> > -}
> > -
> >  static ssize_t waiting_for_supplier_show(struct device *dev,
> >                                          struct device_attribute *attr,
> >                                          char *buf)
> > @@ -1597,6 +1649,24 @@ static int fw_devlink_relax_cycle(struct device *con, void *sup)
> >         return ret;
> >  }
> >
> > +/** fw_devlink_deferred_probe_retry - Set up fw_devlink for probe retries
>
> Kerneldoc format mistake.

Ack

>
> > + *
> > + * This function requests fw_devlink to set itself up for a deferred probe
> > + * retry. This allows fw_devlink to ignore device links it created to
> > + * suppliers that'll never probe. This is necessary in case some of the
> > + * suppliers are optional and their consumers can probe without them.
> > + *
> > + * Returns true if deferred probe retry is likely to make any difference.
> > + */
> > +bool fw_devlink_deferred_probe_retry(void)
> > +{
> > +       if (IS_ENABLED(CONFIG_MODULES))
> > +               return false;
>
> To make the above more visible, I'd fold this function into the caller.

I had written it this way because I'm thinking of adding a timeout
heuristic for MODULES in here. I can move it to the caller if you feel
strongly about it.

>
> > +
> > +       fw_devlink_def_probe_retry = true;
> > +       return fw_devlink_get_flags() && !fw_devlink_is_permissive();
> > +}
> > +
> >  /**
> >   * fw_devlink_create_devlink - Create a device link from a consumer to fwnode
> >   * @con - Consumer device for the device link
> > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > index 9179825ff646..11325df2327f 100644
> > --- a/drivers/base/dd.c
> > +++ b/drivers/base/dd.c
> > @@ -317,6 +317,11 @@ static int deferred_probe_initcall(void)
> >         driver_deferred_probe_trigger();
> >         /* Sort as many dependencies as possible before exiting initcalls */
> >         flush_work(&deferred_probe_work);
> > +
> > +       if (fw_devlink_deferred_probe_retry()) {
> > +               driver_deferred_probe_trigger();
> > +               flush_work(&deferred_probe_work);
> > +       }
> >         initcalls_done = true;
> >
> >         /*
> > --
>
> Overall, the "let's do nothing if modules are not enabled" approach is
> a bit disappointing, because what if somebody builds all of the
> drivers needed for boot in and enables modules anyway, for example to
> allow USB drivers to be probed dynamically?

Yeah, I'm disappointed too :( But I'm trying to get it to work for
!MODULES so that we can enable fw_devlink=on by default at least for
!MODULES to make sure drivers don't introduce more issues going
forward. And then I plan to continue working on making it work
correctly for MODULES case too.

Getting fw_devlink=on to work perfectly for MODULES and !MODULES is
not a problem at all. But it needs fixing a bunch of drivers (mostly
simple fixes like setting the right flag, handling deferred probes
correctly, etc), but I'm hitting a catch-22 here. I can't find the
drivers without setting fw_devlink=on by default. But if I did that,
it's going to break a bunch of boards.

What's your thought on leaving fw_devlink=on by default on 5.12 and
fixing drivers as issues are reported? If that's a no, do you have any
other ideas on how to deal with this catch-22?

Thanks,
Saravana

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/3] Make fw_devlink=on more forgiving
  2021-02-02  4:33 [PATCH v2 0/3] Make fw_devlink=on more forgiving Saravana Kannan
                   ` (3 preceding siblings ...)
  2021-02-02 16:52 ` [PATCH v2 0/3] Make fw_devlink=on more forgiving Tudor.Ambarus
@ 2021-02-02 21:22 ` Martin Kaiser
  2021-02-02 22:43   ` Saravana Kannan
  4 siblings, 1 reply; 22+ messages in thread
From: Martin Kaiser @ 2021-02-02 21:22 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Marek Szyprowski,
	Geert Uytterhoeven, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Rob Herring, Frank Rowand, Len Brown,
	linux-kernel, devicetree, linux-acpi, kernel-team

Hi Saravana,

Thus wrote Saravana Kannan (saravanak@google.com):

> Martin,

> If you tested this series, can you please give a Tested-by?

I tested this v2 series on top of next-20210202 (without the fsl,avic
patch).

If modules are enabled, the kernel doesn't boot on my imx25 board. This
is expected, I guess.

With modules disabled, the kernel boots but probe fails for some
(non-mainline) drivers in my tree. All of those drivers have a gpio in
their device-tree node, such as

my_driver {
   gpio_test1 = <&gpio1 0 0>;
   ...
};

with gpio1 from arch/arm/boot/dts/imx25.dtsi.

The probe function calls

of_get_named_gpio(np, "gpio_test1", 0);

to get the gpio. This fails with -EINVAL.

Best regards,
Martin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/3] Make fw_devlink=on more forgiving
  2021-02-02 21:22 ` Martin Kaiser
@ 2021-02-02 22:43   ` Saravana Kannan
  2021-02-03  7:54     ` Geert Uytterhoeven
  2021-02-03 21:57     ` Martin Kaiser
  0 siblings, 2 replies; 22+ messages in thread
From: Saravana Kannan @ 2021-02-02 22:43 UTC (permalink / raw)
  To: Martin Kaiser
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Marek Szyprowski,
	Geert Uytterhoeven, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Rob Herring, Frank Rowand, Len Brown, LKML,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Android Kernel Team

On Tue, Feb 2, 2021 at 1:22 PM Martin Kaiser <martin@kaiser.cx> wrote:
>
> Hi Saravana,
>
> Thus wrote Saravana Kannan (saravanak@google.com):
>
> > Martin,
>
> > If you tested this series, can you please give a Tested-by?
>
> I tested this v2 series on top of next-20210202 (without the fsl,avic
> patch).
>
> If modules are enabled, the kernel doesn't boot on my imx25 board. This
> is expected, I guess.
>
> With modules disabled, the kernel boots but probe fails for some
> (non-mainline) drivers in my tree.

Thanks Martin!

> All of those drivers have a gpio in
> their device-tree node, such as
>
> my_driver {
>    gpio_test1 = <&gpio1 0 0>;
>    ...
> };
>
> with gpio1 from arch/arm/boot/dts/imx25.dtsi.
>
> The probe function calls
>
> of_get_named_gpio(np, "gpio_test1", 0);
>
> to get the gpio. This fails with -EINVAL.

And you didn't see this issue with the fsl,avic patch?

The property you are using is not a standard GPIO binding (-gpios,
gpio, gpios) and I'm not surprised it's not working. The gpio1 is
probably getting probe deferred and ends up running after "my_driver".

-Saravana

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/3] Make fw_devlink=on more forgiving
  2021-02-02 22:43   ` Saravana Kannan
@ 2021-02-03  7:54     ` Geert Uytterhoeven
  2021-02-03  8:11       ` Saravana Kannan
  2021-02-03 22:02       ` Martin Kaiser
  2021-02-03 21:57     ` Martin Kaiser
  1 sibling, 2 replies; 22+ messages in thread
From: Geert Uytterhoeven @ 2021-02-03  7:54 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Martin Kaiser, Greg Kroah-Hartman, Rafael J. Wysocki,
	Marek Szyprowski, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Rob Herring, Frank Rowand, Len Brown, LKML,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Android Kernel Team

On Tue, Feb 2, 2021 at 11:44 PM Saravana Kannan <saravanak@google.com> wrote:
> On Tue, Feb 2, 2021 at 1:22 PM Martin Kaiser <martin@kaiser.cx> wrote:
> > Thus wrote Saravana Kannan (saravanak@google.com):
> > All of those drivers have a gpio in
> > their device-tree node, such as
> >
> > my_driver {
> >    gpio_test1 = <&gpio1 0 0>;
> >    ...
> > };
> >
> > with gpio1 from arch/arm/boot/dts/imx25.dtsi.
> >
> > The probe function calls
> >
> > of_get_named_gpio(np, "gpio_test1", 0);
> >
> > to get the gpio. This fails with -EINVAL.
>
> And you didn't see this issue with the fsl,avic patch?
>
> The property you are using is not a standard GPIO binding (-gpios,
> gpio, gpios) and I'm not surprised it's not working. The gpio1 is
> probably getting probe deferred and ends up running after "my_driver".

So my_driver doesn't support deferred probe, as of_get_named_gpio()
returns -EINVAL instead of -EPROBE_DEFER?
Converting my_driver from of_get_named_gpio() to the gpiod_*() API
should at least make the driver support probe deferral, after which I
expect it to start working again on reprobe?

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/3] Make fw_devlink=on more forgiving
  2021-02-03  7:54     ` Geert Uytterhoeven
@ 2021-02-03  8:11       ` Saravana Kannan
  2021-02-03  8:15         ` Geert Uytterhoeven
  2021-02-03 22:02       ` Martin Kaiser
  1 sibling, 1 reply; 22+ messages in thread
From: Saravana Kannan @ 2021-02-03  8:11 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Martin Kaiser, Greg Kroah-Hartman, Rafael J. Wysocki,
	Marek Szyprowski, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Rob Herring, Frank Rowand, Len Brown, LKML,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Android Kernel Team

On Tue, Feb 2, 2021 at 11:55 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> On Tue, Feb 2, 2021 at 11:44 PM Saravana Kannan <saravanak@google.com> wrote:
> > On Tue, Feb 2, 2021 at 1:22 PM Martin Kaiser <martin@kaiser.cx> wrote:
> > > Thus wrote Saravana Kannan (saravanak@google.com):
> > > All of those drivers have a gpio in
> > > their device-tree node, such as
> > >
> > > my_driver {
> > >    gpio_test1 = <&gpio1 0 0>;
> > >    ...
> > > };
> > >
> > > with gpio1 from arch/arm/boot/dts/imx25.dtsi.
> > >
> > > The probe function calls
> > >
> > > of_get_named_gpio(np, "gpio_test1", 0);
> > >
> > > to get the gpio. This fails with -EINVAL.
> >
> > And you didn't see this issue with the fsl,avic patch?
> >
> > The property you are using is not a standard GPIO binding (-gpios,
> > gpio, gpios) and I'm not surprised it's not working. The gpio1 is
> > probably getting probe deferred and ends up running after "my_driver".
>
> So my_driver doesn't support deferred probe, as of_get_named_gpio()
> returns -EINVAL instead of -EPROBE_DEFER?
> Converting my_driver from of_get_named_gpio() to the gpiod_*() API
> should at least make the driver support probe deferral, after which I
> expect it to start working again on reprobe?

The way I understood the API/example, you can't just change the code
and have it work. The DT itself isn't using standard bindings. And we
can't make kernel changes that assume the DT has been changed to match
the code. So, the best we could do is have of_get_named_gpio() return
-EPROBE_DEFER if it doesn't find the GPIO -- assuming that doesn't
break other users. Or have this specific driver remap the -EINVAL to
-EPROBE_DEFER.

-Saravana

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/3] Make fw_devlink=on more forgiving
  2021-02-03  8:11       ` Saravana Kannan
@ 2021-02-03  8:15         ` Geert Uytterhoeven
  0 siblings, 0 replies; 22+ messages in thread
From: Geert Uytterhoeven @ 2021-02-03  8:15 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Martin Kaiser, Greg Kroah-Hartman, Rafael J. Wysocki,
	Marek Szyprowski, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Rob Herring, Frank Rowand, Len Brown, LKML,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Android Kernel Team

Hi Saravana,

On Wed, Feb 3, 2021 at 9:11 AM Saravana Kannan <saravanak@google.com> wrote:
> On Tue, Feb 2, 2021 at 11:55 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> > On Tue, Feb 2, 2021 at 11:44 PM Saravana Kannan <saravanak@google.com> wrote:
> > > On Tue, Feb 2, 2021 at 1:22 PM Martin Kaiser <martin@kaiser.cx> wrote:
> > > > Thus wrote Saravana Kannan (saravanak@google.com):
> > > > All of those drivers have a gpio in
> > > > their device-tree node, such as
> > > >
> > > > my_driver {
> > > >    gpio_test1 = <&gpio1 0 0>;
> > > >    ...
> > > > };
> > > >
> > > > with gpio1 from arch/arm/boot/dts/imx25.dtsi.
> > > >
> > > > The probe function calls
> > > >
> > > > of_get_named_gpio(np, "gpio_test1", 0);
> > > >
> > > > to get the gpio. This fails with -EINVAL.
> > >
> > > And you didn't see this issue with the fsl,avic patch?
> > >
> > > The property you are using is not a standard GPIO binding (-gpios,
> > > gpio, gpios) and I'm not surprised it's not working. The gpio1 is
> > > probably getting probe deferred and ends up running after "my_driver".
> >
> > So my_driver doesn't support deferred probe, as of_get_named_gpio()
> > returns -EINVAL instead of -EPROBE_DEFER?
> > Converting my_driver from of_get_named_gpio() to the gpiod_*() API
> > should at least make the driver support probe deferral, after which I
> > expect it to start working again on reprobe?
>
> The way I understood the API/example, you can't just change the code
> and have it work. The DT itself isn't using standard bindings. And we

Oh, right.

> can't make kernel changes that assume the DT has been changed to match
> the code. So, the best we could do is have of_get_named_gpio() return
> -EPROBE_DEFER if it doesn't find the GPIO -- assuming that doesn't
> break other users. Or have this specific driver remap the -EINVAL to
> -EPROBE_DEFER.

The latter would hide real errors, too, and would cause futile reprobes.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/3] Make fw_devlink=on more forgiving
  2021-02-02 22:43   ` Saravana Kannan
  2021-02-03  7:54     ` Geert Uytterhoeven
@ 2021-02-03 21:57     ` Martin Kaiser
  2021-02-03 22:03       ` Saravana Kannan
  1 sibling, 1 reply; 22+ messages in thread
From: Martin Kaiser @ 2021-02-03 21:57 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Marek Szyprowski,
	Geert Uytterhoeven, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Rob Herring, Frank Rowand, Len Brown, LKML,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Android Kernel Team

Thus wrote Saravana Kannan (saravanak@google.com):

> > With modules disabled, the kernel boots but probe fails for some
> > (non-mainline) drivers in my tree.

> Thanks Martin!

> > All of those drivers have a gpio in
> > their device-tree node, such as

> > my_driver {
> >    gpio_test1 = <&gpio1 0 0>;
> >    ...
> > };

> > with gpio1 from arch/arm/boot/dts/imx25.dtsi.

> > The probe function calls

> > of_get_named_gpio(np, "gpio_test1", 0);

> > to get the gpio. This fails with -EINVAL.

> And you didn't see this issue with the fsl,avic patch?

No. With the fsl,avic patch in place, all drivers are probed correctly.

> The property you are using is not a standard GPIO binding (-gpios,
> gpio, gpios) and I'm not surprised it's not working.

I know that I should be using the gpiod API as suggested by Geert.

BTW is this definition ok? Could its driver be converted to using the
gpiod api?

rtc: rtc {
   compatible = "moxa,moxart-rtc";
   gpio-rtc-sclk = <&gpio 5 0>;
...


> The gpio1 is probably getting probe deferred and ends up running after
> "my_driver".

I added a debug print in the probe function. It turned out that the
driver for gpio1 is probed for the first time after my_driver.

I removed the interrupt-controller property for gpio2 for testing. gpio2
was then probed much earlier.

Best regards,
Martin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/3] Make fw_devlink=on more forgiving
  2021-02-03  7:54     ` Geert Uytterhoeven
  2021-02-03  8:11       ` Saravana Kannan
@ 2021-02-03 22:02       ` Martin Kaiser
  1 sibling, 0 replies; 22+ messages in thread
From: Martin Kaiser @ 2021-02-03 22:02 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Saravana Kannan, Greg Kroah-Hartman, Rafael J. Wysocki,
	Marek Szyprowski, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Rob Herring, Frank Rowand, Len Brown, LKML,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Android Kernel Team

Thus wrote Geert Uytterhoeven (geert@linux-m68k.org):

> > The property you are using is not a standard GPIO binding (-gpios,
> > gpio, gpios) and I'm not surprised it's not working. The gpio1 is
> > probably getting probe deferred and ends up running after "my_driver".

> So my_driver doesn't support deferred probe,

I know that the gpio definition in the device-tree is non-standard (and
should have been done differently).

Apart from this, the driver uses module_platform_driver_probe. My
understanding is that this prevents probe deferral.

Does this mean that from now on, a driver which requests a gpio must not
use module_platform_driver_probe?

Thanks,
Martin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 0/3] Make fw_devlink=on more forgiving
  2021-02-03 21:57     ` Martin Kaiser
@ 2021-02-03 22:03       ` Saravana Kannan
  0 siblings, 0 replies; 22+ messages in thread
From: Saravana Kannan @ 2021-02-03 22:03 UTC (permalink / raw)
  To: Martin Kaiser
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, Marek Szyprowski,
	Geert Uytterhoeven, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Rob Herring, Frank Rowand, Len Brown, LKML,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Android Kernel Team

On Wed, Feb 3, 2021 at 1:58 PM Martin Kaiser <martin@kaiser.cx> wrote:
>
> Thus wrote Saravana Kannan (saravanak@google.com):
>
> > > With modules disabled, the kernel boots but probe fails for some
> > > (non-mainline) drivers in my tree.
>
> > Thanks Martin!
>
> > > All of those drivers have a gpio in
> > > their device-tree node, such as
>
> > > my_driver {
> > >    gpio_test1 = <&gpio1 0 0>;
> > >    ...
> > > };
>
> > > with gpio1 from arch/arm/boot/dts/imx25.dtsi.
>
> > > The probe function calls
>
> > > of_get_named_gpio(np, "gpio_test1", 0);
>
> > > to get the gpio. This fails with -EINVAL.
>
> > And you didn't see this issue with the fsl,avic patch?
>
> No. With the fsl,avic patch in place, all drivers are probed correctly.
>
> > The property you are using is not a standard GPIO binding (-gpios,
> > gpio, gpios) and I'm not surprised it's not working.
>
> I know that I should be using the gpiod API as suggested by Geert.
>
> BTW is this definition ok? Could its driver be converted to using the
> gpiod api?
>
> rtc: rtc {
>    compatible = "moxa,moxart-rtc";
>    gpio-rtc-sclk = <&gpio 5 0>;
> ...

The correct non-deprecated binding AFAIK is something-gpios. Not
gpio-something. And then you can use different APIs to get the GPIO (I
forget what it's called).

-Saravana

>
>
> > The gpio1 is probably getting probe deferred and ends up running after
> > "my_driver".
>
> I added a debug print in the probe function. It turned out that the
> driver for gpio1 is probed for the first time after my_driver.
>
> I removed the interrupt-controller property for gpio2 for testing. gpio2
> was then probed much earlier.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 2/3] driver core: fw_devlink: Handle missing drivers for optional suppliers
  2021-02-02 19:46     ` Saravana Kannan
@ 2021-02-04 18:41       ` Rafael J. Wysocki
  2021-02-04 19:03         ` Saravana Kannan
  0 siblings, 1 reply; 22+ messages in thread
From: Rafael J. Wysocki @ 2021-02-04 18:41 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rafael J. Wysocki, Greg Kroah-Hartman, Marek Szyprowski,
	Geert Uytterhoeven, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Martin Kaiser, Rob Herring, Frank Rowand,
	Len Brown, Linux Kernel Mailing List, devicetree,
	ACPI Devel Maling List, Cc: Android Kernel

On Tue, Feb 2, 2021 at 8:47 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Tue, Feb 2, 2021 at 6:34 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Tue, Feb 2, 2021 at 5:33 AM Saravana Kannan <saravanak@google.com> wrote:
> > >

[cut]

> >
> > > + *
> > > + * This function requests fw_devlink to set itself up for a deferred probe
> > > + * retry. This allows fw_devlink to ignore device links it created to
> > > + * suppliers that'll never probe. This is necessary in case some of the
> > > + * suppliers are optional and their consumers can probe without them.
> > > + *
> > > + * Returns true if deferred probe retry is likely to make any difference.
> > > + */
> > > +bool fw_devlink_deferred_probe_retry(void)
> > > +{
> > > +       if (IS_ENABLED(CONFIG_MODULES))
> > > +               return false;
> >
> > To make the above more visible, I'd fold this function into the caller.
>
> I had written it this way because I'm thinking of adding a timeout
> heuristic for MODULES in here. I can move it to the caller if you feel
> strongly about it.

Not really strongly, but then moving it back when you need doesn't
sound particularly troublesome to me. :-)

> >
> > > +
> > > +       fw_devlink_def_probe_retry = true;
> > > +       return fw_devlink_get_flags() && !fw_devlink_is_permissive();
> > > +}
> > > +
> > >  /**
> > >   * fw_devlink_create_devlink - Create a device link from a consumer to fwnode
> > >   * @con - Consumer device for the device link
> > > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > > index 9179825ff646..11325df2327f 100644
> > > --- a/drivers/base/dd.c
> > > +++ b/drivers/base/dd.c
> > > @@ -317,6 +317,11 @@ static int deferred_probe_initcall(void)
> > >         driver_deferred_probe_trigger();
> > >         /* Sort as many dependencies as possible before exiting initcalls */
> > >         flush_work(&deferred_probe_work);
> > > +
> > > +       if (fw_devlink_deferred_probe_retry()) {
> > > +               driver_deferred_probe_trigger();
> > > +               flush_work(&deferred_probe_work);
> > > +       }
> > >         initcalls_done = true;
> > >
> > >         /*
> > > --
> >
> > Overall, the "let's do nothing if modules are not enabled" approach is
> > a bit disappointing, because what if somebody builds all of the
> > drivers needed for boot in and enables modules anyway, for example to
> > allow USB drivers to be probed dynamically?
>
> Yeah, I'm disappointed too :( But I'm trying to get it to work for
> !MODULES so that we can enable fw_devlink=on by default at least for
> !MODULES to make sure drivers don't introduce more issues going
> forward. And then I plan to continue working on making it work
> correctly for MODULES case too.
>
> Getting fw_devlink=on to work perfectly for MODULES and !MODULES is
> not a problem at all. But it needs fixing a bunch of drivers (mostly
> simple fixes like setting the right flag, handling deferred probes
> correctly, etc), but I'm hitting a catch-22 here. I can't find the
> drivers without setting fw_devlink=on by default. But if I did that,
> it's going to break a bunch of boards.
>
> What's your thought on leaving fw_devlink=on by default on 5.12 and
> fixing drivers as issues are reported?

If there are any issues known today that need to be addressed, I'd fix
them first and then try to enable fw_devlink=on maybe just for
!MODULES to start with.

> If that's a no, do you have any other ideas on how to deal with this catch-22?

Try to enable, fix issues as they show up in linux-next.  If there are
still outstanding issues before the next release, back off and try in
the next cycle.  Repeat.

This doesn't sound particularly attractive, but I don't have any
better idea, sorry.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 2/3] driver core: fw_devlink: Handle missing drivers for optional suppliers
  2021-02-04 18:41       ` Rafael J. Wysocki
@ 2021-02-04 19:03         ` Saravana Kannan
  0 siblings, 0 replies; 22+ messages in thread
From: Saravana Kannan @ 2021-02-04 19:03 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Greg Kroah-Hartman, Marek Szyprowski, Geert Uytterhoeven,
	Marc Zyngier, Tudor Ambarus, Linus Walleij, Bartosz Golaszewski,
	Martin Kaiser, Rob Herring, Frank Rowand, Len Brown,
	Linux Kernel Mailing List, devicetree, ACPI Devel Maling List,
	Cc: Android Kernel

On Thu, Feb 4, 2021 at 10:41 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Tue, Feb 2, 2021 at 8:47 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > On Tue, Feb 2, 2021 at 6:34 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > >
> > > On Tue, Feb 2, 2021 at 5:33 AM Saravana Kannan <saravanak@google.com> wrote:
> > > >
>
> [cut]
>
> > >
> > > > + *
> > > > + * This function requests fw_devlink to set itself up for a deferred probe
> > > > + * retry. This allows fw_devlink to ignore device links it created to
> > > > + * suppliers that'll never probe. This is necessary in case some of the
> > > > + * suppliers are optional and their consumers can probe without them.
> > > > + *
> > > > + * Returns true if deferred probe retry is likely to make any difference.
> > > > + */
> > > > +bool fw_devlink_deferred_probe_retry(void)
> > > > +{
> > > > +       if (IS_ENABLED(CONFIG_MODULES))
> > > > +               return false;
> > >
> > > To make the above more visible, I'd fold this function into the caller.
> >
> > I had written it this way because I'm thinking of adding a timeout
> > heuristic for MODULES in here. I can move it to the caller if you feel
> > strongly about it.
>
> Not really strongly, but then moving it back when you need doesn't
> sound particularly troublesome to me. :-)

Ok, will move it. I'm also rewriting this patch. So we'll see where this lands.

>
> > >
> > > > +
> > > > +       fw_devlink_def_probe_retry = true;
> > > > +       return fw_devlink_get_flags() && !fw_devlink_is_permissive();
> > > > +}
> > > > +
> > > >  /**
> > > >   * fw_devlink_create_devlink - Create a device link from a consumer to fwnode
> > > >   * @con - Consumer device for the device link
> > > > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > > > index 9179825ff646..11325df2327f 100644
> > > > --- a/drivers/base/dd.c
> > > > +++ b/drivers/base/dd.c
> > > > @@ -317,6 +317,11 @@ static int deferred_probe_initcall(void)
> > > >         driver_deferred_probe_trigger();
> > > >         /* Sort as many dependencies as possible before exiting initcalls */
> > > >         flush_work(&deferred_probe_work);
> > > > +
> > > > +       if (fw_devlink_deferred_probe_retry()) {
> > > > +               driver_deferred_probe_trigger();
> > > > +               flush_work(&deferred_probe_work);
> > > > +       }
> > > >         initcalls_done = true;
> > > >
> > > >         /*
> > > > --
> > >
> > > Overall, the "let's do nothing if modules are not enabled" approach is
> > > a bit disappointing, because what if somebody builds all of the
> > > drivers needed for boot in and enables modules anyway, for example to
> > > allow USB drivers to be probed dynamically?
> >
> > Yeah, I'm disappointed too :( But I'm trying to get it to work for
> > !MODULES so that we can enable fw_devlink=on by default at least for
> > !MODULES to make sure drivers don't introduce more issues going
> > forward. And then I plan to continue working on making it work
> > correctly for MODULES case too.
> >
> > Getting fw_devlink=on to work perfectly for MODULES and !MODULES is
> > not a problem at all. But it needs fixing a bunch of drivers (mostly
> > simple fixes like setting the right flag, handling deferred probes
> > correctly, etc), but I'm hitting a catch-22 here. I can't find the
> > drivers without setting fw_devlink=on by default. But if I did that,
> > it's going to break a bunch of boards.
> >
> > What's your thought on leaving fw_devlink=on by default on 5.12 and
> > fixing drivers as issues are reported?
>
> If there are any issues known today that need to be addressed, I'd fix
> them first and then try to enable fw_devlink=on maybe just for
> !MODULES to start with.

Yeah, that's what I'm thinking of for now.

> > If that's a no, do you have any other ideas on how to deal with this catch-22?
>
> Try to enable, fix issues as they show up in linux-next.  If there are
> still outstanding issues before the next release, back off and try in
> the next cycle.  Repeat.

If it's just dealing with outstanding issues that are reported, I'm
hoping I can do that. The biggest headache right now is dealing with
devices that have drivers that directly parse the fwnode AND still
have a struct device. So the struct device remains unbound even if the
driver has initialized the device.

>
> This doesn't sound particularly attractive, but I don't have any
> better idea, sorry.

:'( Yeah, another approach I'm thinking of is to have a separate
"strict mode" for fw_devlink=on or above. Where it'll try it's best
till kernel late init and then fallback to permissive. But it's
becoming a headache to deal with some corner cases.

-Saravana

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 3/3] of: property: Don't add links to absent suppliers
  2021-02-02  4:33 ` [PATCH v2 3/3] of: property: Don't add links to absent suppliers Saravana Kannan
@ 2021-02-04 19:17   ` Saravana Kannan
  0 siblings, 0 replies; 22+ messages in thread
From: Saravana Kannan @ 2021-02-04 19:17 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Marek Szyprowski,
	Geert Uytterhoeven, Marc Zyngier, Tudor Ambarus, Linus Walleij,
	Bartosz Golaszewski, Martin Kaiser, Rob Herring, Frank Rowand,
	Len Brown, Saravana Kannan
  Cc: LKML, open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	ACPI Devel Maling List, Android Kernel Team

On Mon, Feb 1, 2021 at 8:33 PM Saravana Kannan <saravanak@google.com> wrote:
>
> If driver core marks a firmware node as not a device, don't add fwnode
> links where it's a supplier.
>
> Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
> Signed-off-by: Saravana Kannan <saravanak@google.com>
> ---
>  drivers/of/property.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/of/property.c b/drivers/of/property.c
> index 6287c6d60bb7..53d163c8d39b 100644
> --- a/drivers/of/property.c
> +++ b/drivers/of/property.c
> @@ -1103,7 +1103,9 @@ static int of_link_to_phandle(struct device_node *con_np,
>          * created for them.
>          */
>         sup_dev = get_dev_from_fwnode(&sup_np->fwnode);
> -       if (!sup_dev && of_node_check_flag(sup_np, OF_POPULATED)) {
> +       if (!sup_dev &&
> +           (of_node_check_flag(sup_np, OF_POPULATED) ||
> +            sup_np->fwnode.flags & FWNODE_FLAG_NOT_DEVICE)) {
>                 pr_debug("Not linking %pOFP to %pOFP - No struct device\n",
>                          con_np, sup_np);
>                 of_node_put(sup_np);
> --
> 2.30.0.365.g02bc693789-goog
>

Rob, Can I get an Ack or reviewed-by for this? I want to land this
patch and another one on driver-core.

Thanks,
Saravana

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-02-04 19:20 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-02  4:33 [PATCH v2 0/3] Make fw_devlink=on more forgiving Saravana Kannan
2021-02-02  4:33 ` [PATCH v2 1/3] driver core: fw_devlink: Detect supplier devices that will never be added Saravana Kannan
2021-02-02 14:12   ` Rafael J. Wysocki
2021-02-02 19:35     ` Saravana Kannan
2021-02-02  4:33 ` [PATCH v2 2/3] driver core: fw_devlink: Handle missing drivers for optional suppliers Saravana Kannan
2021-02-02 14:34   ` Rafael J. Wysocki
2021-02-02 19:46     ` Saravana Kannan
2021-02-04 18:41       ` Rafael J. Wysocki
2021-02-04 19:03         ` Saravana Kannan
2021-02-02  4:33 ` [PATCH v2 3/3] of: property: Don't add links to absent suppliers Saravana Kannan
2021-02-04 19:17   ` Saravana Kannan
2021-02-02 16:52 ` [PATCH v2 0/3] Make fw_devlink=on more forgiving Tudor.Ambarus
2021-02-02 17:41   ` Rob Herring
2021-02-02 18:28     ` Saravana Kannan
2021-02-02 21:22 ` Martin Kaiser
2021-02-02 22:43   ` Saravana Kannan
2021-02-03  7:54     ` Geert Uytterhoeven
2021-02-03  8:11       ` Saravana Kannan
2021-02-03  8:15         ` Geert Uytterhoeven
2021-02-03 22:02       ` Martin Kaiser
2021-02-03 21:57     ` Martin Kaiser
2021-02-03 22:03       ` Saravana Kannan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.