linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/5] fw_devlink improvements
@ 2021-09-14  4:39 Saravana Kannan
  2021-09-14  4:39 ` [PATCH v1 1/5] driver core: fw_devlink: Improve handling of cyclic dependencies Saravana Kannan
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Saravana Kannan @ 2021-09-14  4:39 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Saravana Kannan
  Cc: John Stultz, Marek Szyprowski, Rob Herring, Geert Uytterhoeven,
	Andrew Lunn, Vladimir Oltean, kernel-team, linux-kernel

Greg,

Patches ready for picking up:
Patch 1 fixes a bug in fw_devlink.
Patch 2-4 are meant to make debugging easier

Patches that need a bit of discussion:
Patch 5 is an RFC for fw_devlink.debug cmdline param

CCin'g a bunch of folks who've had to deal with debugging fw_devlink to
comment on the need/lack of need for Patch 5:
Cc: John Stultz <john.stultz@linaro.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Vladimir Oltean <olteanv@gmail.com>

Thanks,
Saravana

Saravana Kannan (5):
  driver core: fw_devlink: Improve handling of cyclic dependencies
  driver core: Set deferred probe reason when deferred by driver core
  driver core: Create __fwnode_link_del() helper function
  driver core: Add debug logs when fwnode links are added/deleted
  driver core: Add fw_devlink.debug command line boolean parameter

 drivers/base/base.h |   9 ++++
 drivers/base/core.c | 118 +++++++++++++++++++++++++++++++++-----------
 2 files changed, 99 insertions(+), 28 deletions(-)

-- 
2.33.0.309.g3052b89438-goog


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v1 1/5] driver core: fw_devlink: Improve handling of cyclic dependencies
  2021-09-14  4:39 [PATCH v1 0/5] fw_devlink improvements Saravana Kannan
@ 2021-09-14  4:39 ` Saravana Kannan
  2021-09-14  6:16   ` Marek Szyprowski
  2021-09-14 12:35   ` Rob Herring
  2021-09-14  4:39 ` [PATCH v1 2/5] driver core: Set deferred probe reason when deferred by driver core Saravana Kannan
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 21+ messages in thread
From: Saravana Kannan @ 2021-09-14  4:39 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Saravana Kannan
  Cc: John Stultz, Marek Szyprowski, Rob Herring, Geert Uytterhoeven,
	Andrew Lunn, Vladimir Oltean, kernel-team, linux-kernel

When we have a dependency of the form:

Device-A -> Device-C
	Device-B

Device-C -> Device-B

Where,
* Indentation denotes "child of" parent in previous line.
* X -> Y denotes X is consumer of Y based on firmware (Eg: DT).

We have cyclic dependency: device-A -> device-C -> device-B -> device-A

fw_devlink current treats device-C -> device-B dependency as an invalid
dependency and doesn't enforce it but leaves the rest of the
dependencies as is.

While the current behavior is necessary, it is not sufficient if the
false dependency in this example is actually device-A -> device-C. When
this is the case, device-C will correctly probe defer waiting for
device-B to be added, but device-A will be incorrectly probe deferred by
fw_devlink waiting on device-C to probe successfully. Due to this, none
of the devices in the cycle will end up probing.

To fix this, we need to go relax all the dependencies in the cycle like
we already do in the other instances where fw_devlink detects cycles.
A real world example of this was reported[1] and analyzed[2].

[1] - https://lore.kernel.org/lkml/0a2c4106-7f48-2bb5-048e-8c001a7c3fda@samsung.com/
[2] - https://lore.kernel.org/lkml/CAGETcx8peaew90SWiux=TyvuGgvTQOmO4BFALz7aj0Za5QdNFQ@mail.gmail.com/
Fixes: f9aa460672c9 ("driver core: Refactor fw_devlink feature")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/base/core.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index e65dd803a453..316df6027093 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1772,14 +1772,21 @@ static int fw_devlink_create_devlink(struct device *con,
 	 * be broken by applying logic. Check for these types of cycles and
 	 * break them so that devices in the cycle probe properly.
 	 *
-	 * If the supplier's parent is dependent on the consumer, then
-	 * the consumer-supplier dependency is a false dependency. So,
-	 * treat it as an invalid link.
+	 * If the supplier's parent is dependent on the consumer, then the
+	 * consumer and supplier have a cyclic dependency. Since fw_devlink
+	 * can't tell which of the inferred dependencies are incorrect, don't
+	 * enforce probe ordering between any of the devices in this cyclic
+	 * dependency. Do this by relaxing all the fw_devlink device links in
+	 * this cycle and by treating the fwnode link between the consumer and
+	 * the supplier as an invalid dependency.
 	 */
 	sup_dev = fwnode_get_next_parent_dev(sup_handle);
 	if (sup_dev && device_is_dependent(con, sup_dev)) {
-		dev_dbg(con, "Not linking to %pfwP - False link\n",
-			sup_handle);
+		dev_info(con, "Fixing up cyclic dependency with %pfwP (%s)\n",
+			 sup_handle, dev_name(sup_dev));
+		device_links_write_lock();
+		fw_devlink_relax_cycle(con, sup_dev);
+		device_links_write_unlock();
 		ret = -EINVAL;
 	} else {
 		/*
-- 
2.33.0.309.g3052b89438-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v1 2/5] driver core: Set deferred probe reason when deferred by driver core
  2021-09-14  4:39 [PATCH v1 0/5] fw_devlink improvements Saravana Kannan
  2021-09-14  4:39 ` [PATCH v1 1/5] driver core: fw_devlink: Improve handling of cyclic dependencies Saravana Kannan
@ 2021-09-14  4:39 ` Saravana Kannan
  2021-09-14  7:01   ` Geert Uytterhoeven
  2021-09-14  4:39 ` [PATCH v1 3/5] driver core: Create __fwnode_link_del() helper function Saravana Kannan
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Saravana Kannan @ 2021-09-14  4:39 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Saravana Kannan
  Cc: John Stultz, Marek Szyprowski, Rob Herring, Geert Uytterhoeven,
	Andrew Lunn, Vladimir Oltean, kernel-team, linux-kernel

When the driver core defers the probe of a device, set the deferred
probe reason so that it's easier to debug. The deferred probe reason is
available in debugfs under devices_deferred.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/base/core.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 316df6027093..41f456904272 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -955,6 +955,29 @@ static void device_links_missing_supplier(struct device *dev)
 	}
 }
 
+/**
+ * dev_set_def_probe_reason - Set the deferred probe reason for a device
+ * @dev: the pointer to the struct device
+ * @fmt: printf-style format string
+ * @...: arguments as specified in the format string
+ *
+ * This is a more caller-friendly version of device_set_deferred_probe_reason()
+ * that takes variable argument inputs similar to dev_info().
+ */
+static void dev_set_def_probe_reason(const struct device *dev, const char *fmt, ...)
+{
+	struct va_format vaf;
+	va_list args;
+
+	va_start(args, fmt);
+	vaf.fmt = fmt;
+	vaf.va = &args;
+
+	device_set_deferred_probe_reason(dev, &vaf);
+
+	va_end(args);
+}
+
 /**
  * device_links_check_suppliers - Check presence of supplier drivers.
  * @dev: Consumer device.
@@ -975,6 +998,7 @@ int device_links_check_suppliers(struct device *dev)
 {
 	struct device_link *link;
 	int ret = 0;
+	struct fwnode_handle *sup_fw;
 
 	/*
 	 * Device waiting for supplier to become available is not allowed to
@@ -983,10 +1007,13 @@ int device_links_check_suppliers(struct device *dev)
 	mutex_lock(&fwnode_link_lock);
 	if (dev->fwnode && !list_empty(&dev->fwnode->suppliers) &&
 	    !fw_devlink_is_permissive()) {
+		sup_fw = list_first_entry(&dev->fwnode->suppliers,
+					  struct fwnode_link,
+					  c_hook)->supplier;
 		dev_dbg(dev, "probe deferral - wait for supplier %pfwP\n",
-			list_first_entry(&dev->fwnode->suppliers,
-			struct fwnode_link,
-			c_hook)->supplier);
+			sup_fw);
+		dev_set_def_probe_reason(dev,
+			"wait for supplier %pfwP\n", sup_fw);
 		mutex_unlock(&fwnode_link_lock);
 		return -EPROBE_DEFER;
 	}
@@ -1003,6 +1030,9 @@ int device_links_check_suppliers(struct device *dev)
 			device_links_missing_supplier(dev);
 			dev_dbg(dev, "probe deferral - supplier %s not ready\n",
 				dev_name(link->supplier));
+			dev_set_def_probe_reason(dev,
+				"supplier %s not ready\n",
+				dev_name(link->supplier));
 			ret = -EPROBE_DEFER;
 			break;
 		}
-- 
2.33.0.309.g3052b89438-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v1 3/5] driver core: Create __fwnode_link_del() helper function
  2021-09-14  4:39 [PATCH v1 0/5] fw_devlink improvements Saravana Kannan
  2021-09-14  4:39 ` [PATCH v1 1/5] driver core: fw_devlink: Improve handling of cyclic dependencies Saravana Kannan
  2021-09-14  4:39 ` [PATCH v1 2/5] driver core: Set deferred probe reason when deferred by driver core Saravana Kannan
@ 2021-09-14  4:39 ` Saravana Kannan
  2021-09-14  7:04   ` Geert Uytterhoeven
  2021-09-14  4:39 ` [PATCH v1 4/5] driver core: Add debug logs when fwnode links are added/deleted Saravana Kannan
  2021-09-14  4:39 ` [PATCH v1 5/5] driver core: Add fw_devlink.debug command line boolean parameter Saravana Kannan
  4 siblings, 1 reply; 21+ messages in thread
From: Saravana Kannan @ 2021-09-14  4:39 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Saravana Kannan
  Cc: John Stultz, Marek Szyprowski, Rob Herring, Geert Uytterhoeven,
	Andrew Lunn, Vladimir Oltean, kernel-team, linux-kernel

The same code is repeated in multiple locations. Create a helper
function for it.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/base/core.c | 35 +++++++++++++++++++----------------
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 41f456904272..516a5c498b88 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -101,6 +101,19 @@ int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
 	return ret;
 }
 
+/**
+ * __fwnode_link_del - Delete a link between two fwnode_handles.
+ * @link: the fwnode_link to be deleted
+ *
+ * The fwnode_link_lock needs to be held when this function is called.
+ */
+static void __fwnode_link_del(struct fwnode_link *link)
+{
+	list_del(&link->s_hook);
+	list_del(&link->c_hook);
+	kfree(link);
+}
+
 /**
  * fwnode_links_purge_suppliers - Delete all supplier links of fwnode_handle.
  * @fwnode: fwnode whose supplier links need to be deleted
@@ -112,11 +125,8 @@ static void fwnode_links_purge_suppliers(struct fwnode_handle *fwnode)
 	struct fwnode_link *link, *tmp;
 
 	mutex_lock(&fwnode_link_lock);
-	list_for_each_entry_safe(link, tmp, &fwnode->suppliers, c_hook) {
-		list_del(&link->s_hook);
-		list_del(&link->c_hook);
-		kfree(link);
-	}
+	list_for_each_entry_safe(link, tmp, &fwnode->suppliers, c_hook)
+		__fwnode_link_del(link);
 	mutex_unlock(&fwnode_link_lock);
 }
 
@@ -131,11 +141,8 @@ static void fwnode_links_purge_consumers(struct fwnode_handle *fwnode)
 	struct fwnode_link *link, *tmp;
 
 	mutex_lock(&fwnode_link_lock);
-	list_for_each_entry_safe(link, tmp, &fwnode->consumers, s_hook) {
-		list_del(&link->s_hook);
-		list_del(&link->c_hook);
-		kfree(link);
-	}
+	list_for_each_entry_safe(link, tmp, &fwnode->consumers, s_hook)
+		__fwnode_link_del(link);
 	mutex_unlock(&fwnode_link_lock);
 }
 
@@ -1895,9 +1902,7 @@ static void __fw_devlink_link_to_consumers(struct device *dev)
 		if (!own_link || ret == -EAGAIN)
 			continue;
 
-		list_del(&link->s_hook);
-		list_del(&link->c_hook);
-		kfree(link);
+		__fwnode_link_del(link);
 	}
 }
 
@@ -1949,9 +1954,7 @@ static void __fw_devlink_link_to_suppliers(struct device *dev,
 		if (!own_link || ret == -EAGAIN)
 			continue;
 
-		list_del(&link->s_hook);
-		list_del(&link->c_hook);
-		kfree(link);
+		__fwnode_link_del(link);
 
 		/* If no device link was created, nothing more to do. */
 		if (ret)
-- 
2.33.0.309.g3052b89438-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v1 4/5] driver core: Add debug logs when fwnode links are added/deleted
  2021-09-14  4:39 [PATCH v1 0/5] fw_devlink improvements Saravana Kannan
                   ` (2 preceding siblings ...)
  2021-09-14  4:39 ` [PATCH v1 3/5] driver core: Create __fwnode_link_del() helper function Saravana Kannan
@ 2021-09-14  4:39 ` Saravana Kannan
  2021-09-14  7:09   ` Geert Uytterhoeven
  2021-09-14  4:39 ` [PATCH v1 5/5] driver core: Add fw_devlink.debug command line boolean parameter Saravana Kannan
  4 siblings, 1 reply; 21+ messages in thread
From: Saravana Kannan @ 2021-09-14  4:39 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Saravana Kannan
  Cc: John Stultz, Marek Szyprowski, Rob Herring, Geert Uytterhoeven,
	Andrew Lunn, Vladimir Oltean, kernel-team, linux-kernel

This will help with debugging fw_devlink issues.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/base/core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 516a5c498b88..b10c425f4b89 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -95,6 +95,8 @@ int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
 
 	list_add(&link->s_hook, &sup->consumers);
 	list_add(&link->c_hook, &con->suppliers);
+	pr_debug("%pfwP Linked as a fwnode consumer to %pfwP\n",
+		 con, sup);
 out:
 	mutex_unlock(&fwnode_link_lock);
 
@@ -109,6 +111,8 @@ int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
  */
 static void __fwnode_link_del(struct fwnode_link *link)
 {
+	pr_debug("%pfwP Dropping the fwnode link to %pfwP\n",
+		 link->consumer, link->supplier);
 	list_del(&link->s_hook);
 	list_del(&link->c_hook);
 	kfree(link);
-- 
2.33.0.309.g3052b89438-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v1 5/5] driver core: Add fw_devlink.debug command line boolean parameter
  2021-09-14  4:39 [PATCH v1 0/5] fw_devlink improvements Saravana Kannan
                   ` (3 preceding siblings ...)
  2021-09-14  4:39 ` [PATCH v1 4/5] driver core: Add debug logs when fwnode links are added/deleted Saravana Kannan
@ 2021-09-14  4:39 ` Saravana Kannan
  2021-09-14 15:10   ` Andrew Lunn
  4 siblings, 1 reply; 21+ messages in thread
From: Saravana Kannan @ 2021-09-14  4:39 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Rafael J. Wysocki, Saravana Kannan
  Cc: John Stultz, Marek Szyprowski, Rob Herring, Geert Uytterhoeven,
	Andrew Lunn, Vladimir Oltean, kernel-team, linux-kernel

When the parameter is set, it enables all the debug logs that would be
useful for debugging fw_devlink issues.

I'll add the documentation if we agree that we should add this param.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/base/base.h |  9 +++++++++
 drivers/base/core.c | 34 ++++++++++++++++++++++++++--------
 2 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index 2882af26392a..e0744c08ccbe 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -191,6 +191,15 @@ extern void device_links_no_driver(struct device *dev);
 extern bool device_links_busy(struct device *dev);
 extern void device_links_unbind_consumers(struct device *dev);
 extern void fw_devlink_drivers_done(void);
+extern bool fw_devlink_debug;
+
+#define fw_devlink_dbg(dev, fmt, ...)				\
+do {								\
+	if (fw_devlink_debug)					\
+		dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__);	\
+	else							\
+		dev_dbg(dev, dev_fmt(fmt), ##__VA_ARGS__);	\
+} while (0)
 
 /* device pm support */
 void device_pm_move_to_tail(struct device *dev);
diff --git a/drivers/base/core.c b/drivers/base/core.c
index b10c425f4b89..d3be785820ca 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -95,8 +95,12 @@ int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
 
 	list_add(&link->s_hook, &sup->consumers);
 	list_add(&link->c_hook, &con->suppliers);
-	pr_debug("%pfwP Linked as a fwnode consumer to %pfwP\n",
-		 con, sup);
+	if (fw_devlink_debug)
+		pr_info("%pfwP Linked as a fwnode consumer to %pfwP\n",
+			con, sup);
+	else
+		pr_debug("%pfwP Linked as a fwnode consumer to %pfwP\n",
+			 con, sup);
 out:
 	mutex_unlock(&fwnode_link_lock);
 
@@ -111,8 +115,12 @@ int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
  */
 static void __fwnode_link_del(struct fwnode_link *link)
 {
-	pr_debug("%pfwP Dropping the fwnode link to %pfwP\n",
-		 link->consumer, link->supplier);
+	if (fw_devlink_debug)
+		pr_info("%pfwP Dropping the fwnode link to %pfwP\n",
+			link->consumer, link->supplier);
+	else
+		pr_debug("%pfwP Dropping the fwnode link to %pfwP\n",
+			 link->consumer, link->supplier);
 	list_del(&link->s_hook);
 	list_del(&link->c_hook);
 	kfree(link);
@@ -852,7 +860,7 @@ struct device_link *device_link_add(struct device *consumer,
 	list_add_tail_rcu(&link->c_node, &consumer->links.suppliers);
 
 	if (flags & DL_FLAG_SYNC_STATE_ONLY) {
-		dev_dbg(consumer,
+		fw_devlink_dbg(consumer,
 			"Linked as a sync state only consumer to %s\n",
 			dev_name(supplier));
 		goto out;
@@ -868,7 +876,8 @@ struct device_link *device_link_add(struct device *consumer,
 	 */
 	device_reorder_to_tail(consumer, NULL);
 
-	dev_dbg(consumer, "Linked as a consumer to %s\n", dev_name(supplier));
+	fw_devlink_dbg(consumer, "Linked as a consumer to %s\n",
+		       dev_name(supplier));
 
 out:
 	device_pm_unlock();
@@ -1021,7 +1030,8 @@ int device_links_check_suppliers(struct device *dev)
 		sup_fw = list_first_entry(&dev->fwnode->suppliers,
 					  struct fwnode_link,
 					  c_hook)->supplier;
-		dev_dbg(dev, "probe deferral - wait for supplier %pfwP\n",
+		fw_devlink_dbg(dev,
+			"probe deferral - wait for supplier %pfwP\n",
 			sup_fw);
 		dev_set_def_probe_reason(dev,
 			"wait for supplier %pfwP\n", sup_fw);
@@ -1039,7 +1049,8 @@ int device_links_check_suppliers(struct device *dev)
 		if (link->status != DL_STATE_AVAILABLE &&
 		    !(link->flags & DL_FLAG_SYNC_STATE_ONLY)) {
 			device_links_missing_supplier(dev);
-			dev_dbg(dev, "probe deferral - supplier %s not ready\n",
+			fw_devlink_dbg(dev,
+				"probe deferral - supplier %s not ready\n",
 				dev_name(link->supplier));
 			dev_set_def_probe_reason(dev,
 				"supplier %s not ready\n",
@@ -1616,6 +1627,13 @@ static int __init fw_devlink_strict_setup(char *arg)
 }
 early_param("fw_devlink.strict", fw_devlink_strict_setup);
 
+bool fw_devlink_debug;
+static int __init fw_devlink_debug_setup(char *arg)
+{
+	return strtobool(arg, &fw_devlink_debug);
+}
+early_param("fw_devlink.debug", fw_devlink_debug_setup);
+
 u32 fw_devlink_get_flags(void)
 {
 	return fw_devlink_flags;
-- 
2.33.0.309.g3052b89438-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 1/5] driver core: fw_devlink: Improve handling of cyclic dependencies
  2021-09-14  4:39 ` [PATCH v1 1/5] driver core: fw_devlink: Improve handling of cyclic dependencies Saravana Kannan
@ 2021-09-14  6:16   ` Marek Szyprowski
  2021-09-14  8:01     ` Saravana Kannan
  2021-09-14 12:35   ` Rob Herring
  1 sibling, 1 reply; 21+ messages in thread
From: Marek Szyprowski @ 2021-09-14  6:16 UTC (permalink / raw)
  To: Saravana Kannan, Greg Kroah-Hartman, Rafael J. Wysocki
  Cc: John Stultz, Rob Herring, Geert Uytterhoeven, Andrew Lunn,
	Vladimir Oltean, kernel-team, linux-kernel

On 14.09.2021 06:39, Saravana Kannan wrote:
> When we have a dependency of the form:
>
> Device-A -> Device-C
> 	Device-B
>
> Device-C -> Device-B
>
> Where,
> * Indentation denotes "child of" parent in previous line.
> * X -> Y denotes X is consumer of Y based on firmware (Eg: DT).
>
> We have cyclic dependency: device-A -> device-C -> device-B -> device-A
>
> fw_devlink current treats device-C -> device-B dependency as an invalid
> dependency and doesn't enforce it but leaves the rest of the
> dependencies as is.
>
> While the current behavior is necessary, it is not sufficient if the
> false dependency in this example is actually device-A -> device-C. When
> this is the case, device-C will correctly probe defer waiting for
> device-B to be added, but device-A will be incorrectly probe deferred by
> fw_devlink waiting on device-C to probe successfully. Due to this, none
> of the devices in the cycle will end up probing.
>
> To fix this, we need to go relax all the dependencies in the cycle like
> we already do in the other instances where fw_devlink detects cycles.
> A real world example of this was reported[1] and analyzed[2].
>
> [1] - https://lore.kernel.org/lkml/0a2c4106-7f48-2bb5-048e-8c001a7c3fda@samsung.com/
> [2] - https://lore.kernel.org/lkml/CAGETcx8peaew90SWiux=TyvuGgvTQOmO4BFALz7aj0Za5QdNFQ@mail.gmail.com/
> Fixes: f9aa460672c9 ("driver core: Refactor fw_devlink feature")
> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Signed-off-by: Saravana Kannan <saravanak@google.com>
> ---
>   drivers/base/core.c | 17 ++++++++++++-----
>   1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index e65dd803a453..316df6027093 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -1772,14 +1772,21 @@ static int fw_devlink_create_devlink(struct device *con,
>   	 * be broken by applying logic. Check for these types of cycles and
>   	 * break them so that devices in the cycle probe properly.
>   	 *
> -	 * If the supplier's parent is dependent on the consumer, then
> -	 * the consumer-supplier dependency is a false dependency. So,
> -	 * treat it as an invalid link.
> +	 * If the supplier's parent is dependent on the consumer, then the
> +	 * consumer and supplier have a cyclic dependency. Since fw_devlink
> +	 * can't tell which of the inferred dependencies are incorrect, don't
> +	 * enforce probe ordering between any of the devices in this cyclic
> +	 * dependency. Do this by relaxing all the fw_devlink device links in
> +	 * this cycle and by treating the fwnode link between the consumer and
> +	 * the supplier as an invalid dependency.
>   	 */
>   	sup_dev = fwnode_get_next_parent_dev(sup_handle);
>   	if (sup_dev && device_is_dependent(con, sup_dev)) {
> -		dev_dbg(con, "Not linking to %pfwP - False link\n",
> -			sup_handle);
> +		dev_info(con, "Fixing up cyclic dependency with %pfwP (%s)\n",
> +			 sup_handle, dev_name(sup_dev));
> +		device_links_write_lock();
> +		fw_devlink_relax_cycle(con, sup_dev);
> +		device_links_write_unlock();
>   		ret = -EINVAL;
>   	} else {
>   		/*

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 2/5] driver core: Set deferred probe reason when deferred by driver core
  2021-09-14  4:39 ` [PATCH v1 2/5] driver core: Set deferred probe reason when deferred by driver core Saravana Kannan
@ 2021-09-14  7:01   ` Geert Uytterhoeven
  2021-09-14  7:58     ` Saravana Kannan
  0 siblings, 1 reply; 21+ messages in thread
From: Geert Uytterhoeven @ 2021-09-14  7:01 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, John Stultz,
	Marek Szyprowski, Rob Herring, Andrew Lunn, Vladimir Oltean,
	Android Kernel Team, Linux Kernel Mailing List

Hi Saravana,

On Tue, Sep 14, 2021 at 6:39 AM Saravana Kannan <saravanak@google.com> wrote:
> When the driver core defers the probe of a device, set the deferred
> probe reason so that it's easier to debug. The deferred probe reason is
> available in debugfs under devices_deferred.
>
> Signed-off-by: Saravana Kannan <saravanak@google.com>

Thanks for your patch!

> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -955,6 +955,29 @@ static void device_links_missing_supplier(struct device *dev)
>         }
>  }
>
> +/**
> + * dev_set_def_probe_reason - Set the deferred probe reason for a device
> + * @dev: the pointer to the struct device
> + * @fmt: printf-style format string
> + * @...: arguments as specified in the format string
> + *
> + * This is a more caller-friendly version of device_set_deferred_probe_reason()
> + * that takes variable argument inputs similar to dev_info().
> + */
> +static void dev_set_def_probe_reason(const struct device *dev, const char *fmt, ...)

So this is indeed similar to device_set_deferred_probe_reason(),
but the function's name is completely different, unlike e.g.
(v)printf()?

> +{
> +       struct va_format vaf;
> +       va_list args;
> +
> +       va_start(args, fmt);
> +       vaf.fmt = fmt;
> +       vaf.va = &args;
> +
> +       device_set_deferred_probe_reason(dev, &vaf);
> +
> +       va_end(args);
> +}

I think you can just make this a macro wrapper calling
dev_err_probe(dev, -EPROBE_DEFER, fmt, ...).
Or open-code that below.

> +
>  /**
>   * device_links_check_suppliers - Check presence of supplier drivers.
>   * @dev: Consumer device.
> @@ -975,6 +998,7 @@ int device_links_check_suppliers(struct device *dev)
>  {
>         struct device_link *link;
>         int ret = 0;
> +       struct fwnode_handle *sup_fw;
>
>         /*
>          * Device waiting for supplier to become available is not allowed to
> @@ -983,10 +1007,13 @@ int device_links_check_suppliers(struct device *dev)
>         mutex_lock(&fwnode_link_lock);
>         if (dev->fwnode && !list_empty(&dev->fwnode->suppliers) &&
>             !fw_devlink_is_permissive()) {
> +               sup_fw = list_first_entry(&dev->fwnode->suppliers,
> +                                         struct fwnode_link,
> +                                         c_hook)->supplier;
>                 dev_dbg(dev, "probe deferral - wait for supplier %pfwP\n",
> -                       list_first_entry(&dev->fwnode->suppliers,
> -                       struct fwnode_link,
> -                       c_hook)->supplier);
> +                       sup_fw);
> +               dev_set_def_probe_reason(dev,
> +                       "wait for supplier %pfwP\n", sup_fw);

dev_err_probe() would replace both the dev_dbg() and the
dev_set_def_probe_reason().

>                 mutex_unlock(&fwnode_link_lock);
>                 return -EPROBE_DEFER;
>         }
> @@ -1003,6 +1030,9 @@ int device_links_check_suppliers(struct device *dev)
>                         device_links_missing_supplier(dev);
>                         dev_dbg(dev, "probe deferral - supplier %s not ready\n",
>                                 dev_name(link->supplier));
> +                       dev_set_def_probe_reason(dev,
> +                               "supplier %s not ready\n",
> +                               dev_name(link->supplier));

Likewise.

>                         ret = -EPROBE_DEFER;
>                         break;
>                 }

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 3/5] driver core: Create __fwnode_link_del() helper function
  2021-09-14  4:39 ` [PATCH v1 3/5] driver core: Create __fwnode_link_del() helper function Saravana Kannan
@ 2021-09-14  7:04   ` Geert Uytterhoeven
  2021-09-14  7:46     ` Saravana Kannan
  0 siblings, 1 reply; 21+ messages in thread
From: Geert Uytterhoeven @ 2021-09-14  7:04 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, John Stultz,
	Marek Szyprowski, Rob Herring, Andrew Lunn, Vladimir Oltean,
	Android Kernel Team, Linux Kernel Mailing List

Hi Saravana,

On Tue, Sep 14, 2021 at 6:39 AM Saravana Kannan <saravanak@google.com> wrote:
> The same code is repeated in multiple locations. Create a helper
> function for it.
>
> Signed-off-by: Saravana Kannan <saravanak@google.com>

Thanks for your patch!

> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -101,6 +101,19 @@ int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
>         return ret;
>  }
>
> +/**
> + * __fwnode_link_del - Delete a link between two fwnode_handles.
> + * @link: the fwnode_link to be deleted
> + *
> + * The fwnode_link_lock needs to be held when this function is called.
> + */
> +static void __fwnode_link_del(struct fwnode_link *link)

Why the double underscore?

> +{
> +       list_del(&link->s_hook);
> +       list_del(&link->c_hook);
> +       kfree(link);
> +}

Apart from that:
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 4/5] driver core: Add debug logs when fwnode links are added/deleted
  2021-09-14  4:39 ` [PATCH v1 4/5] driver core: Add debug logs when fwnode links are added/deleted Saravana Kannan
@ 2021-09-14  7:09   ` Geert Uytterhoeven
  0 siblings, 0 replies; 21+ messages in thread
From: Geert Uytterhoeven @ 2021-09-14  7:09 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, John Stultz,
	Marek Szyprowski, Rob Herring, Andrew Lunn, Vladimir Oltean,
	Android Kernel Team, Linux Kernel Mailing List

On Tue, Sep 14, 2021 at 6:39 AM Saravana Kannan <saravanak@google.com> wrote:
> This will help with debugging fw_devlink issues.
>
> Signed-off-by: Saravana Kannan <saravanak@google.com>

Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 3/5] driver core: Create __fwnode_link_del() helper function
  2021-09-14  7:04   ` Geert Uytterhoeven
@ 2021-09-14  7:46     ` Saravana Kannan
  0 siblings, 0 replies; 21+ messages in thread
From: Saravana Kannan @ 2021-09-14  7:46 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, John Stultz,
	Marek Szyprowski, Rob Herring, Andrew Lunn, Vladimir Oltean,
	Android Kernel Team, Linux Kernel Mailing List

On Tue, Sep 14, 2021 at 12:05 AM Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
>
> Hi Saravana,
>
> On Tue, Sep 14, 2021 at 6:39 AM Saravana Kannan <saravanak@google.com> wrote:
> > The same code is repeated in multiple locations. Create a helper
> > function for it.
> >
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
>
> Thanks for your patch!
>
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -101,6 +101,19 @@ int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
> >         return ret;
> >  }
> >
> > +/**
> > + * __fwnode_link_del - Delete a link between two fwnode_handles.
> > + * @link: the fwnode_link to be deleted
> > + *
> > + * The fwnode_link_lock needs to be held when this function is called.
> > + */
> > +static void __fwnode_link_del(struct fwnode_link *link)
>
> Why the double underscore?

Because unlike fwnode_link_add(), this one needs the lock to be held.

-Saravana

>
> > +{
> > +       list_del(&link->s_hook);
> > +       list_del(&link->c_hook);
> > +       kfree(link);
> > +}
>
> Apart from that:
> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
>
> Gr{oetje,eeting}s,
>
>                         Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 2/5] driver core: Set deferred probe reason when deferred by driver core
  2021-09-14  7:01   ` Geert Uytterhoeven
@ 2021-09-14  7:58     ` Saravana Kannan
  2021-09-15  5:41       ` Saravana Kannan
  0 siblings, 1 reply; 21+ messages in thread
From: Saravana Kannan @ 2021-09-14  7:58 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, John Stultz,
	Marek Szyprowski, Rob Herring, Andrew Lunn, Vladimir Oltean,
	Android Kernel Team, Linux Kernel Mailing List

On Tue, Sep 14, 2021 at 12:01 AM Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
>
> Hi Saravana,
>
> On Tue, Sep 14, 2021 at 6:39 AM Saravana Kannan <saravanak@google.com> wrote:
> > When the driver core defers the probe of a device, set the deferred
> > probe reason so that it's easier to debug. The deferred probe reason is
> > available in debugfs under devices_deferred.
> >
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
>
> Thanks for your patch!

Thanks for the reviews!

>
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -955,6 +955,29 @@ static void device_links_missing_supplier(struct device *dev)
> >         }
> >  }
> >
> > +/**
> > + * dev_set_def_probe_reason - Set the deferred probe reason for a device
> > + * @dev: the pointer to the struct device
> > + * @fmt: printf-style format string
> > + * @...: arguments as specified in the format string
> > + *
> > + * This is a more caller-friendly version of device_set_deferred_probe_reason()
> > + * that takes variable argument inputs similar to dev_info().
> > + */
> > +static void dev_set_def_probe_reason(const struct device *dev, const char *fmt, ...)
>
> So this is indeed similar to device_set_deferred_probe_reason(),
> but the function's name is completely different, unlike e.g.
> (v)printf()?

Yes.

>
> > +{
> > +       struct va_format vaf;
> > +       va_list args;
> > +
> > +       va_start(args, fmt);
> > +       vaf.fmt = fmt;
> > +       vaf.va = &args;
> > +
> > +       device_set_deferred_probe_reason(dev, &vaf);
> > +
> > +       va_end(args);
> > +}
>
> I think you can just make this a macro wrapper calling
> dev_err_probe(dev, -EPROBE_DEFER, fmt, ...).
> Or open-code that below.

Good point. I think I can make it be a wrapper macro.

>
> > +
> >  /**
> >   * device_links_check_suppliers - Check presence of supplier drivers.
> >   * @dev: Consumer device.
> > @@ -975,6 +998,7 @@ int device_links_check_suppliers(struct device *dev)
> >  {
> >         struct device_link *link;
> >         int ret = 0;
> > +       struct fwnode_handle *sup_fw;
> >
> >         /*
> >          * Device waiting for supplier to become available is not allowed to
> > @@ -983,10 +1007,13 @@ int device_links_check_suppliers(struct device *dev)
> >         mutex_lock(&fwnode_link_lock);
> >         if (dev->fwnode && !list_empty(&dev->fwnode->suppliers) &&
> >             !fw_devlink_is_permissive()) {
> > +               sup_fw = list_first_entry(&dev->fwnode->suppliers,
> > +                                         struct fwnode_link,
> > +                                         c_hook)->supplier;
> >                 dev_dbg(dev, "probe deferral - wait for supplier %pfwP\n",
> > -                       list_first_entry(&dev->fwnode->suppliers,
> > -                       struct fwnode_link,
> > -                       c_hook)->supplier);
> > +                       sup_fw);
> > +               dev_set_def_probe_reason(dev,
> > +                       "wait for supplier %pfwP\n", sup_fw);
>
> dev_err_probe() would replace both the dev_dbg() and the
> dev_set_def_probe_reason().

I intentionally didn't use dev_err_probe() because:

1. I wanted the messages to be a bit different -- not have the "probe
deferral" in the deferred_devices file but have it in the dmesg logs
so that the log is clearer.
2. And more importantly, I'm working on a patch (could take a few
weeks) that'll make this place return -EPROBE_DEFER vs -ENODEV (or
whatever) for different situations. And using dev_err_probe() with
-ENODEV would cause it to print the error (when I don't want it to).
And always doing dev_err_probe(dev, -EPROBE_DEFER,...) while returning
-ENODEV would be confusing.

>
> >                 mutex_unlock(&fwnode_link_lock);
> >                 return -EPROBE_DEFER;
> >         }
> > @@ -1003,6 +1030,9 @@ int device_links_check_suppliers(struct device *dev)
> >                         device_links_missing_supplier(dev);
> >                         dev_dbg(dev, "probe deferral - supplier %s not ready\n",
> >                                 dev_name(link->supplier));
> > +                       dev_set_def_probe_reason(dev,
> > +                               "supplier %s not ready\n",
> > +                               dev_name(link->supplier));
>
> Likewise.

Same reason as above.

I mainly added you for comments on 5/5. Hopefully you'll have some
comments on that too by the time I'm up tomorrow :)

-Saravana

>
> >                         ret = -EPROBE_DEFER;
> >                         break;
> >                 }
>
> Gr{oetje,eeting}s,
>
>                         Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 1/5] driver core: fw_devlink: Improve handling of cyclic dependencies
  2021-09-14  6:16   ` Marek Szyprowski
@ 2021-09-14  8:01     ` Saravana Kannan
  0 siblings, 0 replies; 21+ messages in thread
From: Saravana Kannan @ 2021-09-14  8:01 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, John Stultz, Rob Herring,
	Geert Uytterhoeven, Andrew Lunn, Vladimir Oltean, kernel-team,
	linux-kernel

On Mon, Sep 13, 2021 at 11:16 PM Marek Szyprowski
<m.szyprowski@samsung.com> wrote:
>
> On 14.09.2021 06:39, Saravana Kannan wrote:
> > When we have a dependency of the form:
> >
> > Device-A -> Device-C
> >       Device-B
> >
> > Device-C -> Device-B
> >
> > Where,
> > * Indentation denotes "child of" parent in previous line.
> > * X -> Y denotes X is consumer of Y based on firmware (Eg: DT).
> >
> > We have cyclic dependency: device-A -> device-C -> device-B -> device-A
> >
> > fw_devlink current treats device-C -> device-B dependency as an invalid
> > dependency and doesn't enforce it but leaves the rest of the
> > dependencies as is.
> >
> > While the current behavior is necessary, it is not sufficient if the
> > false dependency in this example is actually device-A -> device-C. When
> > this is the case, device-C will correctly probe defer waiting for
> > device-B to be added, but device-A will be incorrectly probe deferred by
> > fw_devlink waiting on device-C to probe successfully. Due to this, none
> > of the devices in the cycle will end up probing.
> >
> > To fix this, we need to go relax all the dependencies in the cycle like
> > we already do in the other instances where fw_devlink detects cycles.
> > A real world example of this was reported[1] and analyzed[2].
> >
> > [1] - https://lore.kernel.org/lkml/0a2c4106-7f48-2bb5-048e-8c001a7c3fda@samsung.com/
> > [2] - https://lore.kernel.org/lkml/CAGETcx8peaew90SWiux=TyvuGgvTQOmO4BFALz7aj0Za5QdNFQ@mail.gmail.com/
> > Fixes: f9aa460672c9 ("driver core: Refactor fw_devlink feature")
> > Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>

Thanks!

-Saravana

> > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > ---
> >   drivers/base/core.c | 17 ++++++++++++-----
> >   1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index e65dd803a453..316df6027093 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -1772,14 +1772,21 @@ static int fw_devlink_create_devlink(struct device *con,
> >        * be broken by applying logic. Check for these types of cycles and
> >        * break them so that devices in the cycle probe properly.
> >        *
> > -      * If the supplier's parent is dependent on the consumer, then
> > -      * the consumer-supplier dependency is a false dependency. So,
> > -      * treat it as an invalid link.
> > +      * If the supplier's parent is dependent on the consumer, then the
> > +      * consumer and supplier have a cyclic dependency. Since fw_devlink
> > +      * can't tell which of the inferred dependencies are incorrect, don't
> > +      * enforce probe ordering between any of the devices in this cyclic
> > +      * dependency. Do this by relaxing all the fw_devlink device links in
> > +      * this cycle and by treating the fwnode link between the consumer and
> > +      * the supplier as an invalid dependency.
> >        */
> >       sup_dev = fwnode_get_next_parent_dev(sup_handle);
> >       if (sup_dev && device_is_dependent(con, sup_dev)) {
> > -             dev_dbg(con, "Not linking to %pfwP - False link\n",
> > -                     sup_handle);
> > +             dev_info(con, "Fixing up cyclic dependency with %pfwP (%s)\n",
> > +                      sup_handle, dev_name(sup_dev));
> > +             device_links_write_lock();
> > +             fw_devlink_relax_cycle(con, sup_dev);
> > +             device_links_write_unlock();
> >               ret = -EINVAL;
> >       } else {
> >               /*
>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 1/5] driver core: fw_devlink: Improve handling of cyclic dependencies
  2021-09-14  4:39 ` [PATCH v1 1/5] driver core: fw_devlink: Improve handling of cyclic dependencies Saravana Kannan
  2021-09-14  6:16   ` Marek Szyprowski
@ 2021-09-14 12:35   ` Rob Herring
  2021-09-14 16:23     ` Saravana Kannan
  1 sibling, 1 reply; 21+ messages in thread
From: Rob Herring @ 2021-09-14 12:35 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, John Stultz,
	Marek Szyprowski, Geert Uytterhoeven, Andrew Lunn,
	Vladimir Oltean, Android Kernel Team, linux-kernel

On Mon, Sep 13, 2021 at 11:39 PM Saravana Kannan <saravanak@google.com> wrote:
>
> When we have a dependency of the form:
>
> Device-A -> Device-C
>         Device-B
>
> Device-C -> Device-B
>
> Where,
> * Indentation denotes "child of" parent in previous line.
> * X -> Y denotes X is consumer of Y based on firmware (Eg: DT).
>
> We have cyclic dependency: device-A -> device-C -> device-B -> device-A
>
> fw_devlink current treats device-C -> device-B dependency as an invalid
> dependency and doesn't enforce it but leaves the rest of the
> dependencies as is.
>
> While the current behavior is necessary, it is not sufficient if the
> false dependency in this example is actually device-A -> device-C. When
> this is the case, device-C will correctly probe defer waiting for
> device-B to be added, but device-A will be incorrectly probe deferred by
> fw_devlink waiting on device-C to probe successfully. Due to this, none
> of the devices in the cycle will end up probing.
>
> To fix this, we need to go relax all the dependencies in the cycle like
> we already do in the other instances where fw_devlink detects cycles.
> A real world example of this was reported[1] and analyzed[2].
>
> [1] - https://lore.kernel.org/lkml/0a2c4106-7f48-2bb5-048e-8c001a7c3fda@samsung.com/
> [2] - https://lore.kernel.org/lkml/CAGETcx8peaew90SWiux=TyvuGgvTQOmO4BFALz7aj0Za5QdNFQ@mail.gmail.com/
> Fixes: f9aa460672c9 ("driver core: Refactor fw_devlink feature")
> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
> Signed-off-by: Saravana Kannan <saravanak@google.com>
> ---
>  drivers/base/core.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)

Am I supposed to apply this? What about the revert and mdio-parent-bus
support you mentioned? Those are needed too? Please send me a series
with what I should apply for 5.15, not fixes and new features
combined.

Rob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 5/5] driver core: Add fw_devlink.debug command line boolean parameter
  2021-09-14  4:39 ` [PATCH v1 5/5] driver core: Add fw_devlink.debug command line boolean parameter Saravana Kannan
@ 2021-09-14 15:10   ` Andrew Lunn
  2021-09-14 16:27     ` Saravana Kannan
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Lunn @ 2021-09-14 15:10 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, John Stultz,
	Marek Szyprowski, Rob Herring, Geert Uytterhoeven,
	Vladimir Oltean, kernel-team, linux-kernel

On Mon, Sep 13, 2021 at 09:39:27PM -0700, Saravana Kannan wrote:
> When the parameter is set, it enables all the debug logs that would be
> useful for debugging fw_devlink issues.
> 
> I'll add the documentation if we agree that we should add this param.

https://www.kernel.org/doc/html/latest/admin-guide/dynamic-debug-howto.html

It is better to make use of existing infrastructure, maybe with
extensions if needed.

	   Andrew

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 1/5] driver core: fw_devlink: Improve handling of cyclic dependencies
  2021-09-14 12:35   ` Rob Herring
@ 2021-09-14 16:23     ` Saravana Kannan
  0 siblings, 0 replies; 21+ messages in thread
From: Saravana Kannan @ 2021-09-14 16:23 UTC (permalink / raw)
  To: Rob Herring
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, John Stultz,
	Marek Szyprowski, Geert Uytterhoeven, Andrew Lunn,
	Vladimir Oltean, Android Kernel Team, linux-kernel

On Tue, Sep 14, 2021 at 5:35 AM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Mon, Sep 13, 2021 at 11:39 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > When we have a dependency of the form:
> >
> > Device-A -> Device-C
> >         Device-B
> >
> > Device-C -> Device-B
> >
> > Where,
> > * Indentation denotes "child of" parent in previous line.
> > * X -> Y denotes X is consumer of Y based on firmware (Eg: DT).
> >
> > We have cyclic dependency: device-A -> device-C -> device-B -> device-A
> >
> > fw_devlink current treats device-C -> device-B dependency as an invalid
> > dependency and doesn't enforce it but leaves the rest of the
> > dependencies as is.
> >
> > While the current behavior is necessary, it is not sufficient if the
> > false dependency in this example is actually device-A -> device-C. When
> > this is the case, device-C will correctly probe defer waiting for
> > device-B to be added, but device-A will be incorrectly probe deferred by
> > fw_devlink waiting on device-C to probe successfully. Due to this, none
> > of the devices in the cycle will end up probing.
> >
> > To fix this, we need to go relax all the dependencies in the cycle like
> > we already do in the other instances where fw_devlink detects cycles.
> > A real world example of this was reported[1] and analyzed[2].
> >
> > [1] - https://lore.kernel.org/lkml/0a2c4106-7f48-2bb5-048e-8c001a7c3fda@samsung.com/
> > [2] - https://lore.kernel.org/lkml/CAGETcx8peaew90SWiux=TyvuGgvTQOmO4BFALz7aj0Za5QdNFQ@mail.gmail.com/
> > Fixes: f9aa460672c9 ("driver core: Refactor fw_devlink feature")
> > Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > ---
> >  drivers/base/core.c | 17 ++++++++++++-----
> >  1 file changed, 12 insertions(+), 5 deletions(-)
>
> Am I supposed to apply this? What about the revert and mdio-parent-bus
> support you mentioned? Those are needed too? Please send me a series
> with what I should apply for 5.15, not fixes and new features
> combined.

You don't need to pick up any of this. I just added you for 5/5 and
you've seen how a lot of the debugging goes.

-Saravana

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 5/5] driver core: Add fw_devlink.debug command line boolean parameter
  2021-09-14 15:10   ` Andrew Lunn
@ 2021-09-14 16:27     ` Saravana Kannan
  2021-09-14 16:43       ` Andrew Lunn
  0 siblings, 1 reply; 21+ messages in thread
From: Saravana Kannan @ 2021-09-14 16:27 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, John Stultz,
	Marek Szyprowski, Rob Herring, Geert Uytterhoeven,
	Vladimir Oltean, kernel-team, linux-kernel

On Tue, Sep 14, 2021 at 8:10 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Mon, Sep 13, 2021 at 09:39:27PM -0700, Saravana Kannan wrote:
> > When the parameter is set, it enables all the debug logs that would be
> > useful for debugging fw_devlink issues.
> >
> > I'll add the documentation if we agree that we should add this param.
>
> https://www.kernel.org/doc/html/latest/admin-guide/dynamic-debug-howto.html
>
> It is better to make use of existing infrastructure, maybe with
> extensions if needed.

I did think of this, but the problem is that dynamic debug logs can
get compiled out (when DYNAMIC_DEBUG isn't set). I think debugging of
fw_devlink needs to be possible without having to recompile the
kernel. So in a sense similar to how initcall_debug works today.

-Saravana

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 5/5] driver core: Add fw_devlink.debug command line boolean parameter
  2021-09-14 16:27     ` Saravana Kannan
@ 2021-09-14 16:43       ` Andrew Lunn
  2021-09-14 16:52         ` Rob Herring
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Lunn @ 2021-09-14 16:43 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, John Stultz,
	Marek Szyprowski, Rob Herring, Geert Uytterhoeven,
	Vladimir Oltean, kernel-team, linux-kernel

> I did think of this, but the problem is that dynamic debug logs can
> get compiled out (when DYNAMIC_DEBUG isn't set). I think debugging of
> fw_devlink needs to be possible without having to recompile the
> kernel. So in a sense similar to how initcall_debug works today.

My off the shelf Debian kernel has it enabled. Maybe you can check
other mainline distributions and see if it is enabled by default.

You are also on a slippery path. You argue this is needed all the time
and add a custom knob. Somebody else adds a new feature which they
also argue always needs bug, and add there own custom knob. We soon
have lots of custom knobs, each doing it slightly differently, in
different places. Chaos. So you need a really good argument why your
code really is special.

I would suggest you start with dynamic debug, and collect some
statistics of how often you need to ask people to recompile their
kernel.

	Andrew

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 5/5] driver core: Add fw_devlink.debug command line boolean parameter
  2021-09-14 16:43       ` Andrew Lunn
@ 2021-09-14 16:52         ` Rob Herring
  2021-09-14 17:31           ` Saravana Kannan
  0 siblings, 1 reply; 21+ messages in thread
From: Rob Herring @ 2021-09-14 16:52 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Saravana Kannan, Greg Kroah-Hartman, Rafael J. Wysocki,
	John Stultz, Marek Szyprowski, Geert Uytterhoeven,
	Vladimir Oltean, Android Kernel Team, linux-kernel

On Tue, Sep 14, 2021 at 11:44 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > I did think of this, but the problem is that dynamic debug logs can
> > get compiled out (when DYNAMIC_DEBUG isn't set). I think debugging of
> > fw_devlink needs to be possible without having to recompile the
> > kernel. So in a sense similar to how initcall_debug works today.
>
> My off the shelf Debian kernel has it enabled. Maybe you can check
> other mainline distributions and see if it is enabled by default.

Right, I would expect users that can't rebuild their kernel easily
would have it enabled.

> You are also on a slippery path. You argue this is needed all the time
> and add a custom knob. Somebody else adds a new feature which they
> also argue always needs bug, and add there own custom knob. We soon
> have lots of custom knobs, each doing it slightly differently, in
> different places. Chaos. So you need a really good argument why your
> code really is special.
>
> I would suggest you start with dynamic debug, and collect some
> statistics of how often you need to ask people to recompile their
> kernel.

I agree.

What would be nice is documenting what needs to be set for devlink.
What I used was just 'dyndbg="file drivers/base/core.c +p"'

Rob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 5/5] driver core: Add fw_devlink.debug command line boolean parameter
  2021-09-14 16:52         ` Rob Herring
@ 2021-09-14 17:31           ` Saravana Kannan
  0 siblings, 0 replies; 21+ messages in thread
From: Saravana Kannan @ 2021-09-14 17:31 UTC (permalink / raw)
  To: Rob Herring
  Cc: Andrew Lunn, Greg Kroah-Hartman, Rafael J. Wysocki, John Stultz,
	Marek Szyprowski, Geert Uytterhoeven, Vladimir Oltean,
	Android Kernel Team, linux-kernel

On Tue, Sep 14, 2021 at 9:52 AM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Tue, Sep 14, 2021 at 11:44 AM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > I did think of this, but the problem is that dynamic debug logs can
> > > get compiled out (when DYNAMIC_DEBUG isn't set). I think debugging of
> > > fw_devlink needs to be possible without having to recompile the
> > > kernel. So in a sense similar to how initcall_debug works today.
> >
> > My off the shelf Debian kernel has it enabled. Maybe you can check
> > other mainline distributions and see if it is enabled by default.
>
> Right, I would expect users that can't rebuild their kernel easily
> would have it enabled.
>
> > You are also on a slippery path. You argue this is needed all the time
> > and add a custom knob. Somebody else adds a new feature which they
> > also argue always needs bug, and add there own custom knob. We soon
> > have lots of custom knobs, each doing it slightly differently, in
> > different places. Chaos. So you need a really good argument why your
> > code really is special.
> >
> > I would suggest you start with dynamic debug, and collect some
> > statistics of how often you need to ask people to recompile their
> > kernel.
>
> I agree.

Fair enough -- Andrew/Rob.

>
> What would be nice is documenting what needs to be set for devlink.
> What I used was just 'dyndbg="file drivers/base/core.c +p"'

Yeah, I was thinking if we went with dynamic debug I'd have something
I'll copy paste as a reply to any fw_devlink debug email thread.

Where would be a good place to document this?

I was actually thinking of making fw_devlink.debug=1 be an alias for
something like:
dyndbg="func device_link_add +p; func device_links_check_suppliers +p"....

So that people don't have to remember what to copy paste into their
command line as things change.

-Saravana

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v1 2/5] driver core: Set deferred probe reason when deferred by driver core
  2021-09-14  7:58     ` Saravana Kannan
@ 2021-09-15  5:41       ` Saravana Kannan
  0 siblings, 0 replies; 21+ messages in thread
From: Saravana Kannan @ 2021-09-15  5:41 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Greg Kroah-Hartman, Rafael J. Wysocki, John Stultz,
	Marek Szyprowski, Rob Herring, Andrew Lunn, Vladimir Oltean,
	Android Kernel Team, Linux Kernel Mailing List

On Tue, Sep 14, 2021 at 12:58 AM Saravana Kannan <saravanak@google.com> wrote:
>
> On Tue, Sep 14, 2021 at 12:01 AM Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
> >
> > Hi Saravana,
> >
> > On Tue, Sep 14, 2021 at 6:39 AM Saravana Kannan <saravanak@google.com> wrote:
> > > When the driver core defers the probe of a device, set the deferred
> > > probe reason so that it's easier to debug. The deferred probe reason is
> > > available in debugfs under devices_deferred.
> > >
> > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> >
> > Thanks for your patch!
>
> Thanks for the reviews!
>
> >
> > > --- a/drivers/base/core.c
> > > +++ b/drivers/base/core.c
> > > @@ -955,6 +955,29 @@ static void device_links_missing_supplier(struct device *dev)
> > >         }
> > >  }
> > >
> > > +/**
> > > + * dev_set_def_probe_reason - Set the deferred probe reason for a device
> > > + * @dev: the pointer to the struct device
> > > + * @fmt: printf-style format string
> > > + * @...: arguments as specified in the format string
> > > + *
> > > + * This is a more caller-friendly version of device_set_deferred_probe_reason()
> > > + * that takes variable argument inputs similar to dev_info().
> > > + */
> > > +static void dev_set_def_probe_reason(const struct device *dev, const char *fmt, ...)
> >
> > So this is indeed similar to device_set_deferred_probe_reason(),
> > but the function's name is completely different, unlike e.g.
> > (v)printf()?
>
> Yes.
>
> >
> > > +{
> > > +       struct va_format vaf;
> > > +       va_list args;
> > > +
> > > +       va_start(args, fmt);
> > > +       vaf.fmt = fmt;
> > > +       vaf.va = &args;
> > > +
> > > +       device_set_deferred_probe_reason(dev, &vaf);
> > > +
> > > +       va_end(args);
> > > +}
> >
> > I think you can just make this a macro wrapper calling
> > dev_err_probe(dev, -EPROBE_DEFER, fmt, ...).
> > Or open-code that below.
>
> Good point. I think I can make it be a wrapper macro.
>
> >
> > > +
> > >  /**
> > >   * device_links_check_suppliers - Check presence of supplier drivers.
> > >   * @dev: Consumer device.
> > > @@ -975,6 +998,7 @@ int device_links_check_suppliers(struct device *dev)
> > >  {
> > >         struct device_link *link;
> > >         int ret = 0;
> > > +       struct fwnode_handle *sup_fw;
> > >
> > >         /*
> > >          * Device waiting for supplier to become available is not allowed to
> > > @@ -983,10 +1007,13 @@ int device_links_check_suppliers(struct device *dev)
> > >         mutex_lock(&fwnode_link_lock);
> > >         if (dev->fwnode && !list_empty(&dev->fwnode->suppliers) &&
> > >             !fw_devlink_is_permissive()) {
> > > +               sup_fw = list_first_entry(&dev->fwnode->suppliers,
> > > +                                         struct fwnode_link,
> > > +                                         c_hook)->supplier;
> > >                 dev_dbg(dev, "probe deferral - wait for supplier %pfwP\n",
> > > -                       list_first_entry(&dev->fwnode->suppliers,
> > > -                       struct fwnode_link,
> > > -                       c_hook)->supplier);
> > > +                       sup_fw);
> > > +               dev_set_def_probe_reason(dev,
> > > +                       "wait for supplier %pfwP\n", sup_fw);
> >
> > dev_err_probe() would replace both the dev_dbg() and the
> > dev_set_def_probe_reason().
>
> I intentionally didn't use dev_err_probe() because:
>
> 1. I wanted the messages to be a bit different -- not have the "probe
> deferral" in the deferred_devices file but have it in the dmesg logs
> so that the log is clearer.

Nevermind, I see that dev_err_probe() prints the log with
-EPROBE_DEFER but sets the reason without that string. Which is kinda
similar to what I'm trying to do here. So I'll go with that
suggestion.

> 2. And more importantly, I'm working on a patch (could take a few
> weeks) that'll make this place return -EPROBE_DEFER vs -ENODEV (or
> whatever) for different situations. And using dev_err_probe() with
> -ENODEV would cause it to print the error (when I don't want it to).
> And always doing dev_err_probe(dev, -EPROBE_DEFER,...) while returning
> -ENODEV would be confusing.

I'll deal with this when I send out that patch.

Thanks for the review.

-Saravana

>
> >
> > >                 mutex_unlock(&fwnode_link_lock);
> > >                 return -EPROBE_DEFER;
> > >         }
> > > @@ -1003,6 +1030,9 @@ int device_links_check_suppliers(struct device *dev)
> > >                         device_links_missing_supplier(dev);
> > >                         dev_dbg(dev, "probe deferral - supplier %s not ready\n",
> > >                                 dev_name(link->supplier));
> > > +                       dev_set_def_probe_reason(dev,
> > > +                               "supplier %s not ready\n",
> > > +                               dev_name(link->supplier));
> >
> > Likewise.
>
> Same reason as above.
>
> I mainly added you for comments on 5/5. Hopefully you'll have some
> comments on that too by the time I'm up tomorrow :)
>
> -Saravana
>
> >
> > >                         ret = -EPROBE_DEFER;
> > >                         break;
> > >                 }
> >
> > Gr{oetje,eeting}s,
> >
> >                         Geert
> >
> > --
> > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> >
> > In personal conversations with technical people, I call myself a hacker. But
> > when I'm talking to journalists I just say "programmer" or something like that.
> >                                 -- Linus Torvalds

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-09-15  5:42 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-14  4:39 [PATCH v1 0/5] fw_devlink improvements Saravana Kannan
2021-09-14  4:39 ` [PATCH v1 1/5] driver core: fw_devlink: Improve handling of cyclic dependencies Saravana Kannan
2021-09-14  6:16   ` Marek Szyprowski
2021-09-14  8:01     ` Saravana Kannan
2021-09-14 12:35   ` Rob Herring
2021-09-14 16:23     ` Saravana Kannan
2021-09-14  4:39 ` [PATCH v1 2/5] driver core: Set deferred probe reason when deferred by driver core Saravana Kannan
2021-09-14  7:01   ` Geert Uytterhoeven
2021-09-14  7:58     ` Saravana Kannan
2021-09-15  5:41       ` Saravana Kannan
2021-09-14  4:39 ` [PATCH v1 3/5] driver core: Create __fwnode_link_del() helper function Saravana Kannan
2021-09-14  7:04   ` Geert Uytterhoeven
2021-09-14  7:46     ` Saravana Kannan
2021-09-14  4:39 ` [PATCH v1 4/5] driver core: Add debug logs when fwnode links are added/deleted Saravana Kannan
2021-09-14  7:09   ` Geert Uytterhoeven
2021-09-14  4:39 ` [PATCH v1 5/5] driver core: Add fw_devlink.debug command line boolean parameter Saravana Kannan
2021-09-14 15:10   ` Andrew Lunn
2021-09-14 16:27     ` Saravana Kannan
2021-09-14 16:43       ` Andrew Lunn
2021-09-14 16:52         ` Rob Herring
2021-09-14 17:31           ` Saravana Kannan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).