linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] Improve remote-endpoint parsing
@ 2024-02-07  1:17 Saravana Kannan
  2024-02-07  1:18 ` [PATCH v2 1/3] of: property: Improve finding the consumer of a remote-endpoint property Saravana Kannan
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Saravana Kannan @ 2024-02-07  1:17 UTC (permalink / raw)
  To: Rob Herring, Frank Rowand, Greg Kroah-Hartman, Saravana Kannan
  Cc: Xu Yang, kernel-team, devicetree, linux-kernel

Some changes to do a more accurate parsing of remote-endpoints. Making
fw_devlink a tiny bit more efficient and the debug logs a bit cleaner when
trying to debug fw_devlink.

Rob,

Can we get this into 6.8-rcX please?

Thanks,
Saravana

v1->v2:
- Switched from fwnode_graph_get_port_parent() to of_graph_get_port_parent()
- Added Patch 3

Saravana Kannan (3):
  of: property: Improve finding the consumer of a remote-endpoint
    property
  of: property: Improve finding the supplier of a remote-endpoint
    property
  of: property: Add in-ports/out-ports support to
    of_graph_get_port_parent()

 drivers/of/property.c | 63 +++++++++++++++++--------------------------
 1 file changed, 24 insertions(+), 39 deletions(-)

-- 
2.43.0.594.gd9cf4e227d-goog


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 1/3] of: property: Improve finding the consumer of a remote-endpoint property
  2024-02-07  1:17 [PATCH v2 0/3] Improve remote-endpoint parsing Saravana Kannan
@ 2024-02-07  1:18 ` Saravana Kannan
  2024-02-07  1:18 ` [PATCH v2 2/3] of: property: Improve finding the supplier " Saravana Kannan
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: Saravana Kannan @ 2024-02-07  1:18 UTC (permalink / raw)
  To: Rob Herring, Frank Rowand, Greg Kroah-Hartman, Saravana Kannan
  Cc: Xu Yang, kernel-team, devicetree, linux-kernel

We have a more accurate function to find the right consumer of a
remote-endpoint property instead of searching for a parent with
compatible string property. So, use that instead. While at it, make the
code to find the consumer a bit more flexible and based on the property
being parsed.

Fixes: f7514a663016 ("of: property: fw_devlink: Add support for remote-endpoint")
Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/of/property.c | 47 +++++++++----------------------------------
 1 file changed, 10 insertions(+), 37 deletions(-)

diff --git a/drivers/of/property.c b/drivers/of/property.c
index 641a40cf5cf3..da70aaa62ca3 100644
--- a/drivers/of/property.c
+++ b/drivers/of/property.c
@@ -1063,36 +1063,6 @@ of_fwnode_device_get_match_data(const struct fwnode_handle *fwnode,
 	return of_device_get_match_data(dev);
 }
 
-static struct device_node *of_get_compat_node(struct device_node *np)
-{
-	of_node_get(np);
-
-	while (np) {
-		if (!of_device_is_available(np)) {
-			of_node_put(np);
-			np = NULL;
-		}
-
-		if (of_property_present(np, "compatible"))
-			break;
-
-		np = of_get_next_parent(np);
-	}
-
-	return np;
-}
-
-static struct device_node *of_get_compat_node_parent(struct device_node *np)
-{
-	struct device_node *parent, *node;
-
-	parent = of_get_parent(np);
-	node = of_get_compat_node(parent);
-	of_node_put(parent);
-
-	return node;
-}
-
 static void of_link_to_phandle(struct device_node *con_np,
 			      struct device_node *sup_np)
 {
@@ -1222,10 +1192,10 @@ static struct device_node *parse_##fname(struct device_node *np,	     \
  *  parse_prop.prop_name: Name of property holding a phandle value
  *  parse_prop.index: For properties holding a list of phandles, this is the
  *		      index into the list
+ * @get_con_dev: If the consumer node containing the property is never converted
+ *		 to a struct device, implement this ops so fw_devlink can use it
+ *		 to find the true consumer.
  * @optional: Describes whether a supplier is mandatory or not
- * @node_not_dev: The consumer node containing the property is never converted
- *		  to a struct device. Instead, parse ancestor nodes for the
- *		  compatible property to find a node corresponding to a device.
  *
  * Returns:
  * parse_prop() return values are
@@ -1236,8 +1206,8 @@ static struct device_node *parse_##fname(struct device_node *np,	     \
 struct supplier_bindings {
 	struct device_node *(*parse_prop)(struct device_node *np,
 					  const char *prop_name, int index);
+	struct device_node *(*get_con_dev)(struct device_node *np);
 	bool optional;
-	bool node_not_dev;
 };
 
 DEFINE_SIMPLE_PROP(clocks, "clocks", "#clock-cells")
@@ -1352,7 +1322,10 @@ static const struct supplier_bindings of_supplier_bindings[] = {
 	{ .parse_prop = parse_pinctrl6, },
 	{ .parse_prop = parse_pinctrl7, },
 	{ .parse_prop = parse_pinctrl8, },
-	{ .parse_prop = parse_remote_endpoint, .node_not_dev = true, },
+	{
+		.parse_prop = parse_remote_endpoint,
+		.get_con_dev = of_graph_get_port_parent,
+	},
 	{ .parse_prop = parse_pwms, },
 	{ .parse_prop = parse_resets, },
 	{ .parse_prop = parse_leds, },
@@ -1403,8 +1376,8 @@ static int of_link_property(struct device_node *con_np, const char *prop_name)
 		while ((phandle = s->parse_prop(con_np, prop_name, i))) {
 			struct device_node *con_dev_np;
 
-			con_dev_np = s->node_not_dev
-					? of_get_compat_node_parent(con_np)
+			con_dev_np = s->get_con_dev
+					? s->get_con_dev(con_np)
 					: of_node_get(con_np);
 			matched = true;
 			i++;
-- 
2.43.0.594.gd9cf4e227d-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 2/3] of: property: Improve finding the supplier of a remote-endpoint property
  2024-02-07  1:17 [PATCH v2 0/3] Improve remote-endpoint parsing Saravana Kannan
  2024-02-07  1:18 ` [PATCH v2 1/3] of: property: Improve finding the consumer of a remote-endpoint property Saravana Kannan
@ 2024-02-07  1:18 ` Saravana Kannan
  2024-02-23 16:18   ` Luca Ceresoli
  2024-02-07  1:18 ` [PATCH v2 3/3] of: property: Add in-ports/out-ports support to of_graph_get_port_parent() Saravana Kannan
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Saravana Kannan @ 2024-02-07  1:18 UTC (permalink / raw)
  To: Rob Herring, Frank Rowand, Greg Kroah-Hartman, Saravana Kannan
  Cc: Xu Yang, kernel-team, devicetree, linux-kernel

After commit 4a032827daa8 ("of: property: Simplify of_link_to_phandle()"),
remote-endpoint properties created a fwnode link from the consumer device
to the supplier endpoint. This is a tiny bit inefficient (not buggy) when
trying to create device links or detecting cycles. So, improve this the
same way we improved finding the consumer of a remote-endpoint property.

Fixes: 4a032827daa8 ("of: property: Simplify of_link_to_phandle()")
Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/of/property.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/of/property.c b/drivers/of/property.c
index da70aaa62ca3..7bb2d8e290de 100644
--- a/drivers/of/property.c
+++ b/drivers/of/property.c
@@ -1232,7 +1232,6 @@ DEFINE_SIMPLE_PROP(pinctrl5, "pinctrl-5", NULL)
 DEFINE_SIMPLE_PROP(pinctrl6, "pinctrl-6", NULL)
 DEFINE_SIMPLE_PROP(pinctrl7, "pinctrl-7", NULL)
 DEFINE_SIMPLE_PROP(pinctrl8, "pinctrl-8", NULL)
-DEFINE_SIMPLE_PROP(remote_endpoint, "remote-endpoint", NULL)
 DEFINE_SIMPLE_PROP(pwms, "pwms", "#pwm-cells")
 DEFINE_SIMPLE_PROP(resets, "resets", "#reset-cells")
 DEFINE_SIMPLE_PROP(leds, "leds", NULL)
@@ -1298,6 +1297,17 @@ static struct device_node *parse_interrupts(struct device_node *np,
 	return of_irq_parse_one(np, index, &sup_args) ? NULL : sup_args.np;
 }
 
+static struct device_node *parse_remote_endpoint(struct device_node *np,
+						 const char *prop_name,
+						 int index)
+{
+	/* Return NULL for index > 0 to signify end of remote-endpoints. */
+	if (!index || strcmp(prop_name, "remote-endpoint"))
+		return NULL;
+
+	return of_graph_get_remote_port_parent(np);
+}
+
 static const struct supplier_bindings of_supplier_bindings[] = {
 	{ .parse_prop = parse_clocks, },
 	{ .parse_prop = parse_interconnects, },
-- 
2.43.0.594.gd9cf4e227d-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 3/3] of: property: Add in-ports/out-ports support to of_graph_get_port_parent()
  2024-02-07  1:17 [PATCH v2 0/3] Improve remote-endpoint parsing Saravana Kannan
  2024-02-07  1:18 ` [PATCH v2 1/3] of: property: Improve finding the consumer of a remote-endpoint property Saravana Kannan
  2024-02-07  1:18 ` [PATCH v2 2/3] of: property: Improve finding the supplier " Saravana Kannan
@ 2024-02-07  1:18 ` Saravana Kannan
  2024-02-21  0:47   ` Saravana Kannan
  2024-02-07  8:10 ` [PATCH v2 0/3] Improve remote-endpoint parsing Rob Herring
  2024-02-09 10:31 ` Rob Herring
  4 siblings, 1 reply; 21+ messages in thread
From: Saravana Kannan @ 2024-02-07  1:18 UTC (permalink / raw)
  To: Rob Herring, Frank Rowand, Greg Kroah-Hartman, Saravana Kannan
  Cc: Xu Yang, kernel-team, devicetree, linux-kernel

Similar to the existing "ports" node name, coresight device tree bindings
have added "in-ports" and "out-ports" as standard node names for a
collection of ports.

Add support for these name to of_graph_get_port_parent() so that
remote-endpoint parsing can find the correct parent node for these
coresight ports too.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/of/property.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/of/property.c b/drivers/of/property.c
index 7bb2d8e290de..39a3ee1dfb58 100644
--- a/drivers/of/property.c
+++ b/drivers/of/property.c
@@ -763,7 +763,9 @@ struct device_node *of_graph_get_port_parent(struct device_node *node)
 	/* Walk 3 levels up only if there is 'ports' node. */
 	for (depth = 3; depth && node; depth--) {
 		node = of_get_next_parent(node);
-		if (depth == 2 && !of_node_name_eq(node, "ports"))
+		if (depth == 2 && !of_node_name_eq(node, "ports") &&
+		    !of_node_name_eq(node, "in-ports") &&
+		    !of_node_name_eq(node, "out-ports"))
 			break;
 	}
 	return node;
-- 
2.43.0.594.gd9cf4e227d-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 0/3] Improve remote-endpoint parsing
  2024-02-07  1:17 [PATCH v2 0/3] Improve remote-endpoint parsing Saravana Kannan
                   ` (2 preceding siblings ...)
  2024-02-07  1:18 ` [PATCH v2 3/3] of: property: Add in-ports/out-ports support to of_graph_get_port_parent() Saravana Kannan
@ 2024-02-07  8:10 ` Rob Herring
  2024-02-07 21:07   ` Saravana Kannan
  2024-02-09 10:31 ` Rob Herring
  4 siblings, 1 reply; 21+ messages in thread
From: Rob Herring @ 2024-02-07  8:10 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Frank Rowand, Greg Kroah-Hartman, Xu Yang, kernel-team,
	devicetree, linux-kernel

On Wed, Feb 7, 2024 at 1:18 AM Saravana Kannan <saravanak@google.com> wrote:
>
> Some changes to do a more accurate parsing of remote-endpoints. Making
> fw_devlink a tiny bit more efficient and the debug logs a bit cleaner when
> trying to debug fw_devlink.
>
> Rob,
>
> Can we get this into 6.8-rcX please?

I'm failing to see how this is 6.8 material?

Rob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 0/3] Improve remote-endpoint parsing
  2024-02-07  8:10 ` [PATCH v2 0/3] Improve remote-endpoint parsing Rob Herring
@ 2024-02-07 21:07   ` Saravana Kannan
  0 siblings, 0 replies; 21+ messages in thread
From: Saravana Kannan @ 2024-02-07 21:07 UTC (permalink / raw)
  To: Rob Herring
  Cc: Frank Rowand, Greg Kroah-Hartman, Xu Yang, kernel-team,
	devicetree, linux-kernel

On Wed, Feb 7, 2024 at 12:11 AM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Wed, Feb 7, 2024 at 1:18 AM Saravana Kannan <saravanak@google.com> wrote:
> >
> > Some changes to do a more accurate parsing of remote-endpoints. Making
> > fw_devlink a tiny bit more efficient and the debug logs a bit cleaner when
> > trying to debug fw_devlink.
> >
> > Rob,
> >
> > Can we get this into 6.8-rcX please?
>
> I'm failing to see how this is 6.8 material?

It's a bug fix that's making the remote-endpoint tracking more
accurate. I thought stuff like this would be okay for rcX. But if you
want to hold it for later, that's ok too.

-Saravana

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 0/3] Improve remote-endpoint parsing
  2024-02-07  1:17 [PATCH v2 0/3] Improve remote-endpoint parsing Saravana Kannan
                   ` (3 preceding siblings ...)
  2024-02-07  8:10 ` [PATCH v2 0/3] Improve remote-endpoint parsing Rob Herring
@ 2024-02-09 10:31 ` Rob Herring
  4 siblings, 0 replies; 21+ messages in thread
From: Rob Herring @ 2024-02-09 10:31 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rob Herring, Frank Rowand, devicetree, linux-kernel, kernel-team,
	Greg Kroah-Hartman, Xu Yang


On Tue, 06 Feb 2024 17:17:59 -0800, Saravana Kannan wrote:
> Some changes to do a more accurate parsing of remote-endpoints. Making
> fw_devlink a tiny bit more efficient and the debug logs a bit cleaner when
> trying to debug fw_devlink.
> 
> Rob,
> 
> Can we get this into 6.8-rcX please?
> 
> Thanks,
> Saravana
> 
> v1->v2:
> - Switched from fwnode_graph_get_port_parent() to of_graph_get_port_parent()
> - Added Patch 3
> 
> Saravana Kannan (3):
>   of: property: Improve finding the consumer of a remote-endpoint
>     property
>   of: property: Improve finding the supplier of a remote-endpoint
>     property
>   of: property: Add in-ports/out-ports support to
>     of_graph_get_port_parent()
> 
>  drivers/of/property.c | 63 +++++++++++++++++--------------------------
>  1 file changed, 24 insertions(+), 39 deletions(-)
> 
> --
> 2.43.0.594.gd9cf4e227d-goog
> 
> 
> 

Applied, thanks!


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 3/3] of: property: Add in-ports/out-ports support to of_graph_get_port_parent()
  2024-02-07  1:18 ` [PATCH v2 3/3] of: property: Add in-ports/out-ports support to of_graph_get_port_parent() Saravana Kannan
@ 2024-02-21  0:47   ` Saravana Kannan
  2024-02-21  7:00     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 21+ messages in thread
From: Saravana Kannan @ 2024-02-21  0:47 UTC (permalink / raw)
  To: Rob Herring, Frank Rowand, Greg Kroah-Hartman, Saravana Kannan, stable
  Cc: Xu Yang, kernel-team, devicetree, linux-kernel

On Tue, Feb 6, 2024 at 5:18 PM Saravana Kannan <saravanak@google.com> wrote:
>
> Similar to the existing "ports" node name, coresight device tree bindings
> have added "in-ports" and "out-ports" as standard node names for a
> collection of ports.
>
> Add support for these name to of_graph_get_port_parent() so that
> remote-endpoint parsing can find the correct parent node for these
> coresight ports too.
>
> Signed-off-by: Saravana Kannan <saravanak@google.com>

Greg,

I saw that you pulled the previous 2 patches in this series to 6.1,
6.6 and 6.7 kernel branches. I really should have added both of those
Fixes tag to this patch too.

Can you please pull in the patch to those stable branches too?

Thanks,
Saravana

> ---
>  drivers/of/property.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/of/property.c b/drivers/of/property.c
> index 7bb2d8e290de..39a3ee1dfb58 100644
> --- a/drivers/of/property.c
> +++ b/drivers/of/property.c
> @@ -763,7 +763,9 @@ struct device_node *of_graph_get_port_parent(struct device_node *node)
>         /* Walk 3 levels up only if there is 'ports' node. */
>         for (depth = 3; depth && node; depth--) {
>                 node = of_get_next_parent(node);
> -               if (depth == 2 && !of_node_name_eq(node, "ports"))
> +               if (depth == 2 && !of_node_name_eq(node, "ports") &&
> +                   !of_node_name_eq(node, "in-ports") &&
> +                   !of_node_name_eq(node, "out-ports"))
>                         break;
>         }
>         return node;
> --
> 2.43.0.594.gd9cf4e227d-goog
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 3/3] of: property: Add in-ports/out-ports support to of_graph_get_port_parent()
  2024-02-21  0:47   ` Saravana Kannan
@ 2024-02-21  7:00     ` Greg Kroah-Hartman
  2024-02-21  8:38       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 21+ messages in thread
From: Greg Kroah-Hartman @ 2024-02-21  7:00 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rob Herring, Frank Rowand, stable, Xu Yang, kernel-team,
	devicetree, linux-kernel

On Tue, Feb 20, 2024 at 04:47:35PM -0800, Saravana Kannan wrote:
> On Tue, Feb 6, 2024 at 5:18 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > Similar to the existing "ports" node name, coresight device tree bindings
> > have added "in-ports" and "out-ports" as standard node names for a
> > collection of ports.
> >
> > Add support for these name to of_graph_get_port_parent() so that
> > remote-endpoint parsing can find the correct parent node for these
> > coresight ports too.
> >
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
> 
> Greg,
> 
> I saw that you pulled the previous 2 patches in this series to 6.1,
> 6.6 and 6.7 kernel branches. I really should have added both of those
> Fixes tag to this patch too.
> 
> Can you please pull in the patch to those stable branches too?

Sure, what's the git id?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 3/3] of: property: Add in-ports/out-ports support to of_graph_get_port_parent()
  2024-02-21  7:00     ` Greg Kroah-Hartman
@ 2024-02-21  8:38       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 21+ messages in thread
From: Greg Kroah-Hartman @ 2024-02-21  8:38 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rob Herring, Frank Rowand, stable, Xu Yang, kernel-team,
	devicetree, linux-kernel

On Wed, Feb 21, 2024 at 08:00:00AM +0100, Greg Kroah-Hartman wrote:
> On Tue, Feb 20, 2024 at 04:47:35PM -0800, Saravana Kannan wrote:
> > On Tue, Feb 6, 2024 at 5:18 PM Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > Similar to the existing "ports" node name, coresight device tree bindings
> > > have added "in-ports" and "out-ports" as standard node names for a
> > > collection of ports.
> > >
> > > Add support for these name to of_graph_get_port_parent() so that
> > > remote-endpoint parsing can find the correct parent node for these
> > > coresight ports too.
> > >
> > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > 
> > Greg,
> > 
> > I saw that you pulled the previous 2 patches in this series to 6.1,
> > 6.6 and 6.7 kernel branches. I really should have added both of those
> > Fixes tag to this patch too.
> > 
> > Can you please pull in the patch to those stable branches too?
> 
> Sure, what's the git id?

Nevermind, I found it...

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/3] of: property: Improve finding the supplier of a remote-endpoint property
  2024-02-07  1:18 ` [PATCH v2 2/3] of: property: Improve finding the supplier " Saravana Kannan
@ 2024-02-23 16:18   ` Luca Ceresoli
  2024-02-24  1:35     ` Saravana Kannan
  2024-02-28 22:01     ` Rob Herring
  0 siblings, 2 replies; 21+ messages in thread
From: Luca Ceresoli @ 2024-02-23 16:18 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rob Herring, Frank Rowand, Greg Kroah-Hartman, Xu Yang,
	kernel-team, devicetree, linux-kernel, Hervé Codina

Hello Saravana,

[+cc Hervé Codina]

On Tue,  6 Feb 2024 17:18:01 -0800
Saravana Kannan <saravanak@google.com> wrote:

> After commit 4a032827daa8 ("of: property: Simplify of_link_to_phandle()"),
> remote-endpoint properties created a fwnode link from the consumer device
> to the supplier endpoint. This is a tiny bit inefficient (not buggy) when
> trying to create device links or detecting cycles. So, improve this the
> same way we improved finding the consumer of a remote-endpoint property.
> 
> Fixes: 4a032827daa8 ("of: property: Simplify of_link_to_phandle()")
> Signed-off-by: Saravana Kannan <saravanak@google.com>

After rebasing my own branch on v6.8-rc5 from v6.8-rc1 I started
getting unexpected warnings during device tree overlay removal. After a
somewhat painful bisection I identified this patch as the one that
triggers it all.

> ---
> --- a/drivers/of/property.c
> +++ b/drivers/of/property.c
> @@ -1232,7 +1232,6 @@ DEFINE_SIMPLE_PROP(pinctrl5, "pinctrl-5", NULL)
>  DEFINE_SIMPLE_PROP(pinctrl6, "pinctrl-6", NULL)
>  DEFINE_SIMPLE_PROP(pinctrl7, "pinctrl-7", NULL)
>  DEFINE_SIMPLE_PROP(pinctrl8, "pinctrl-8", NULL)
> -DEFINE_SIMPLE_PROP(remote_endpoint, "remote-endpoint", NULL)
>  DEFINE_SIMPLE_PROP(pwms, "pwms", "#pwm-cells")
>  DEFINE_SIMPLE_PROP(resets, "resets", "#reset-cells")
>  DEFINE_SIMPLE_PROP(leds, "leds", NULL)
> @@ -1298,6 +1297,17 @@ static struct device_node *parse_interrupts(struct device_node *np,
>  	return of_irq_parse_one(np, index, &sup_args) ? NULL : sup_args.np;
>  }
>  
> +static struct device_node *parse_remote_endpoint(struct device_node *np,
> +						 const char *prop_name,
> +						 int index)
> +{
> +	/* Return NULL for index > 0 to signify end of remote-endpoints. */
> +	if (!index || strcmp(prop_name, "remote-endpoint"))

There seem to be a bug here: "!index" should be "index > 0", as the
comment suggests. Otherwise NULL is always returned.

I am going to send a quick patch for that, but haven't done so yet
because it still won't solve the problem, so I wanted to open the topic
here without further delay.

Even with the 'index > 0' fix I'm still getting pretty much the same:

[   34.836781] ------------[ cut here ]------------
[   34.841401] WARNING: CPU: 2 PID: 204 at drivers/base/devres.c:1064 devm_kfree+0x8c/0xfc
...
[   35.024751] Call trace:
[   35.027199]  devm_kfree+0x8c/0xfc
[   35.030520]  devm_drm_panel_bridge_release+0x54/0x64 [drm_kms_helper]
[   35.036990]  devres_release_group+0xe0/0x164
[   35.041264]  i2c_device_remove+0x38/0x9c
[   35.045196]  device_remove+0x4c/0x80
[   35.048774]  device_release_driver_internal+0x1d4/0x230
[   35.054003]  device_release_driver+0x18/0x24
[   35.058279]  bus_remove_device+0xcc/0x10c
[   35.062292]  device_del+0x15c/0x41c
[   35.065786]  device_unregister+0x18/0x34
[   35.069714]  i2c_unregister_device+0x54/0x88
[   35.073988]  of_i2c_notify+0x98/0x224
[   35.077656]  blocking_notifier_call_chain+0x6c/0xa0
[   35.082543]  __of_changeset_entry_notify+0x100/0x16c
[   35.087515]  __of_changeset_revert_notify+0x44/0x78
[   35.092398]  of_overlay_remove+0x114/0x1c4
...

By comparing the two versions I found that before removing the overlay:

 * in the "working" case (with this patch reverted) I have:

   # ls /sys/class/devlink/ | grep 002c
   platform:hpbr--i2c:13-002c
   platform:panel-dsi-lvds--i2c:13-002c
   platform:regulator-sys-1v8--i2c:13-002c
   regulator:regulator.31--i2c:13-002c
   #

 * in the "broken" case (v6.8-rc5 + s/!index/index > 0/ as mentioned):

   # ls /sys/class/devlink/ | grep 002c
   platform:hpbr--i2c:13-002c
   platform:regulator-sys-1v8--i2c:13-002c
   regulator:regulator.30--i2c:13-002c
   # 

So in the latter case the panel-dsi-lvds--i2c:13-002c link is missing.
I think it gets created but later on removed. Here's a snippet of the
kernel log when that happens:

[    9.578279] ----- cycle: start -----
[    9.578283] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
[    9.578308] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
[    9.578329] ----- cycle: end -----
[    9.578334] platform panel-dsi-lvds: Fixed dependency cycle(s) with /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
...
[    9.590620] /panel-dsi-lvds Dropping the fwnode link to /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
...
[    9.597280] ----- cycle: start -----
[    9.597283] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
[    9.602781] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
[    9.607581] ----- cycle: end -----
[    9.607585] i2c 13-002c: Fixed dependency cycle(s) with /panel-dsi-lvds
[    9.614217] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_add
...
[    9.614277] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /panel-dsi-lvds
[    9.614369] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /regulator-dock-sys-1v8
...
[    9.739840] panel-simple panel-dsi-lvds: Dropping the link to 13-002c
[    9.739846] device: 'i2c:13-002c--platform:panel-dsi-lvds': device_unregister
[   10.247037] sn65dsi83 13-002c: Dropping the link to panel-dsi-lvds
[   10.247049] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_unregister

And here's the relevant portion of my device tree overlay:

--------------------8<--------------------

/dts-v1/;
/plugin/;

&{/}
{
        panel_dsi_lvds: panel-dsi-lvds {
                compatible = "auo,g133han01.1";

                ports {
                        #address-cells = <1>;
                        #size-cells = <0>;
                        port@0{
                                reg = <0>;
                                dual-lvds-odd-pixels;
                                panel_dsi_lvds_in0: endpoint {
                                        remote-endpoint = <&sn65dsi84_out0>;
                                };
                        };

                        port@1{
                                reg = <1>;
                                dual-lvds-even-pixels;
                                panel_dsi_lvds_in1: endpoint {
                                        remote-endpoint = <&sn65dsi84_out1>;
                                };
                        };
                };
        };
};

&i2c5_ch3 {
        dsi-lvds-bridge@2c {
                compatible = "ti,sn65dsi84";
                reg = <0x2c>;
                vcc-supply = <&reg_sys_1v8>;

                ports {
                        #address-cells = <1>;
                        #size-cells = <0>;

                        port@0 {
                                reg = <0>;

                                sn65dsi84_from_bridge: endpoint {
                                        remote-endpoint = <&hpbr_source>;
                                        data-lanes = <1 2 3 4>;
                                };
                        };
                        port@2 {
                                reg = <2>;

                                sn65dsi84_out0: endpoint {
                                        remote-endpoint = <&panel_dsi_lvds_in0>;
                                };
                        };
                        port@3 {
                                reg = <3>;

                                sn65dsi84_out1: endpoint {
                                        remote-endpoint = <&panel_dsi_lvds_in1>;
                                };
                        };
                };
        };
};

--------------------8<--------------------

That's all I could get at this point. Any clues for further
investigation?

Best regards,
Luca

-- 
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/3] of: property: Improve finding the supplier of a remote-endpoint property
  2024-02-23 16:18   ` Luca Ceresoli
@ 2024-02-24  1:35     ` Saravana Kannan
  2024-02-26 11:52       ` [REGRESSION] " Luca Ceresoli
  2024-02-28 22:01     ` Rob Herring
  1 sibling, 1 reply; 21+ messages in thread
From: Saravana Kannan @ 2024-02-24  1:35 UTC (permalink / raw)
  To: Luca Ceresoli
  Cc: Rob Herring, Frank Rowand, Greg Kroah-Hartman, Xu Yang,
	kernel-team, devicetree, linux-kernel, Hervé Codina,
	Geert Uytterhoeven

On Fri, Feb 23, 2024 at 8:18 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
>
> Hello Saravana,
>
> [+cc Hervé Codina]
>
> On Tue,  6 Feb 2024 17:18:01 -0800
> Saravana Kannan <saravanak@google.com> wrote:
>
> > After commit 4a032827daa8 ("of: property: Simplify of_link_to_phandle()"),
> > remote-endpoint properties created a fwnode link from the consumer device
> > to the supplier endpoint. This is a tiny bit inefficient (not buggy) when
> > trying to create device links or detecting cycles. So, improve this the
> > same way we improved finding the consumer of a remote-endpoint property.
> >
> > Fixes: 4a032827daa8 ("of: property: Simplify of_link_to_phandle()")
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
>
> After rebasing my own branch on v6.8-rc5 from v6.8-rc1 I started
> getting unexpected warnings during device tree overlay removal. After a
> somewhat painful bisection I identified this patch as the one that
> triggers it all.

Thanks for the report.

>
> > ---
> > --- a/drivers/of/property.c
> > +++ b/drivers/of/property.c
> > @@ -1232,7 +1232,6 @@ DEFINE_SIMPLE_PROP(pinctrl5, "pinctrl-5", NULL)
> >  DEFINE_SIMPLE_PROP(pinctrl6, "pinctrl-6", NULL)
> >  DEFINE_SIMPLE_PROP(pinctrl7, "pinctrl-7", NULL)
> >  DEFINE_SIMPLE_PROP(pinctrl8, "pinctrl-8", NULL)
> > -DEFINE_SIMPLE_PROP(remote_endpoint, "remote-endpoint", NULL)
> >  DEFINE_SIMPLE_PROP(pwms, "pwms", "#pwm-cells")
> >  DEFINE_SIMPLE_PROP(resets, "resets", "#reset-cells")
> >  DEFINE_SIMPLE_PROP(leds, "leds", NULL)
> > @@ -1298,6 +1297,17 @@ static struct device_node *parse_interrupts(struct device_node *np,
> >       return of_irq_parse_one(np, index, &sup_args) ? NULL : sup_args.np;
> >  }
> >
> > +static struct device_node *parse_remote_endpoint(struct device_node *np,
> > +                                              const char *prop_name,
> > +                                              int index)
> > +{
> > +     /* Return NULL for index > 0 to signify end of remote-endpoints. */
> > +     if (!index || strcmp(prop_name, "remote-endpoint"))
>
> There seem to be a bug here: "!index" should be "index > 0", as the
> comment suggests. Otherwise NULL is always returned.

Ah crap, I think you are right. It should have been "index". Not
"!index". But I tested this! Sigh. I probably screwed up my testing.

Please send out a Fix for this.

Geert, we got excited too soon. :(

> I am going to send a quick patch for that, but haven't done so yet
> because it still won't solve the problem, so I wanted to open the topic
> here without further delay.
>
> Even with the 'index > 0' fix I'm still getting pretty much the same:

This part is confusing though. If I read your DT correctly, there's a
cycle between platform:panel-dsi-lvds and i2c:13-002c. And fw_devlink
should not be enforcing any ordering between those devices ever.

I'm surprised that in your "working" case, fw_devlink didn't detect
any cycle. It should have. If there's any debugging to do, that's the
one we need to debug.

>
> [   34.836781] ------------[ cut here ]------------
> [   34.841401] WARNING: CPU: 2 PID: 204 at drivers/base/devres.c:1064 devm_kfree+0x8c/0xfc
> ...
> [   35.024751] Call trace:
> [   35.027199]  devm_kfree+0x8c/0xfc
> [   35.030520]  devm_drm_panel_bridge_release+0x54/0x64 [drm_kms_helper]
> [   35.036990]  devres_release_group+0xe0/0x164
> [   35.041264]  i2c_device_remove+0x38/0x9c
> [   35.045196]  device_remove+0x4c/0x80
> [   35.048774]  device_release_driver_internal+0x1d4/0x230
> [   35.054003]  device_release_driver+0x18/0x24
> [   35.058279]  bus_remove_device+0xcc/0x10c
> [   35.062292]  device_del+0x15c/0x41c
> [   35.065786]  device_unregister+0x18/0x34
> [   35.069714]  i2c_unregister_device+0x54/0x88
> [   35.073988]  of_i2c_notify+0x98/0x224
> [   35.077656]  blocking_notifier_call_chain+0x6c/0xa0
> [   35.082543]  __of_changeset_entry_notify+0x100/0x16c
> [   35.087515]  __of_changeset_revert_notify+0x44/0x78
> [   35.092398]  of_overlay_remove+0x114/0x1c4
> ...
>
> By comparing the two versions I found that before removing the overlay:
>
>  * in the "working" case (with this patch reverted) I have:
>
>    # ls /sys/class/devlink/ | grep 002c
>    platform:hpbr--i2c:13-002c
>    platform:panel-dsi-lvds--i2c:13-002c

Can you check the "status" and "sync_state_only" file in this folder
and tell me what it says?

Since these devices have a cyclic dependency between them, it should
have been something other than "not tracked" and "sync_state_only"
should be "1". But my guess is you'll see "active" and "0".

>    platform:regulator-sys-1v8--i2c:13-002c
>    regulator:regulator.31--i2c:13-002c
>    #
>
>  * in the "broken" case (v6.8-rc5 + s/!index/index > 0/ as mentioned):
>
>    # ls /sys/class/devlink/ | grep 002c
>    platform:hpbr--i2c:13-002c
>    platform:regulator-sys-1v8--i2c:13-002c
>    regulator:regulator.30--i2c:13-002c
>    #
>
> So in the latter case the panel-dsi-lvds--i2c:13-002c link is missing.
> I think it gets created but later on removed. Here's a snippet of the
> kernel log when that happens:
>
> [    9.578279] ----- cycle: start -----
> [    9.578283] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> [    9.578308] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> [    9.578329] ----- cycle: end -----
> [    9.578334] platform panel-dsi-lvds: Fixed dependency cycle(s) with /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> ...

Somewhere in this area, I'm thinking you'll also see "device:
'i2c:13-002c--platform:panel-dsi-lvds': device_add" do you not? And if
you enabled device link logs, you'll see that it was "sync state only"
link.

> [    9.590620] /panel-dsi-lvds Dropping the fwnode link to /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> ...
> [    9.597280] ----- cycle: start -----
> [    9.597283] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> [    9.602781] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> [    9.607581] ----- cycle: end -----
> [    9.607585] i2c 13-002c: Fixed dependency cycle(s) with /panel-dsi-lvds
> [    9.614217] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_add
> ...
> [    9.614277] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /panel-dsi-lvds
> [    9.614369] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /regulator-dock-sys-1v8
> ...
> [    9.739840] panel-simple panel-dsi-lvds: Dropping the link to 13-002c
> [    9.739846] device: 'i2c:13-002c--platform:panel-dsi-lvds': device_unregister

Oh yeah, see. The "device_add" I expected earlier is getting removed here.

> [   10.247037] sn65dsi83 13-002c: Dropping the link to panel-dsi-lvds
> [   10.247049] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_unregister
>
> And here's the relevant portion of my device tree overlay:
>
> --------------------8<--------------------
>

I think the eventual fix would be this series + adding a
"post-init-providers" property to the device that's supposed to probe
first and point it to the device that's supposed to probe next. Do
this at the device node level, not the endpoint level.
https://lore.kernel.org/lkml/20240221233026.2915061-1-saravanak@google.com/


-Saravana

> /dts-v1/;
> /plugin/;
>
> &{/}
> {
>         panel_dsi_lvds: panel-dsi-lvds {
>                 compatible = "auo,g133han01.1";
>
>                 ports {
>                         #address-cells = <1>;
>                         #size-cells = <0>;
>                         port@0{
>                                 reg = <0>;
>                                 dual-lvds-odd-pixels;
>                                 panel_dsi_lvds_in0: endpoint {
>                                         remote-endpoint = <&sn65dsi84_out0>;
>                                 };
>                         };
>
>                         port@1{
>                                 reg = <1>;
>                                 dual-lvds-even-pixels;
>                                 panel_dsi_lvds_in1: endpoint {
>                                         remote-endpoint = <&sn65dsi84_out1>;
>                                 };
>                         };
>                 };
>         };
> };
>
> &i2c5_ch3 {
>         dsi-lvds-bridge@2c {
>                 compatible = "ti,sn65dsi84";
>                 reg = <0x2c>;
>                 vcc-supply = <&reg_sys_1v8>;
>
>                 ports {
>                         #address-cells = <1>;
>                         #size-cells = <0>;
>
>                         port@0 {
>                                 reg = <0>;
>
>                                 sn65dsi84_from_bridge: endpoint {
>                                         remote-endpoint = <&hpbr_source>;
>                                         data-lanes = <1 2 3 4>;
>                                 };
>                         };
>                         port@2 {
>                                 reg = <2>;
>
>                                 sn65dsi84_out0: endpoint {
>                                         remote-endpoint = <&panel_dsi_lvds_in0>;
>                                 };
>                         };
>                         port@3 {
>                                 reg = <3>;
>
>                                 sn65dsi84_out1: endpoint {
>                                         remote-endpoint = <&panel_dsi_lvds_in1>;
>                                 };
>                         };
>                 };
>         };
> };
>
> --------------------8<--------------------
>
> That's all I could get at this point. Any clues for further
> investigation?
>
> Best regards,
> Luca
>
> --
> Luca Ceresoli, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [REGRESSION] Re: [PATCH v2 2/3] of: property: Improve finding the supplier of a remote-endpoint property
  2024-02-24  1:35     ` Saravana Kannan
@ 2024-02-26 11:52       ` Luca Ceresoli
  2024-02-28 21:56         ` Rob Herring
  0 siblings, 1 reply; 21+ messages in thread
From: Luca Ceresoli @ 2024-02-26 11:52 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rob Herring, Frank Rowand, Greg Kroah-Hartman, Xu Yang,
	kernel-team, devicetree, linux-kernel, Hervé Codina,
	Geert Uytterhoeven

Hello Saravana,

On Fri, 23 Feb 2024 17:35:24 -0800
Saravana Kannan <saravanak@google.com> wrote:

> On Fri, Feb 23, 2024 at 8:18 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
> >
> > Hello Saravana,
> >
> > [+cc Hervé Codina]
> >
> > On Tue,  6 Feb 2024 17:18:01 -0800
> > Saravana Kannan <saravanak@google.com> wrote:
> >  
> > > After commit 4a032827daa8 ("of: property: Simplify of_link_to_phandle()"),
> > > remote-endpoint properties created a fwnode link from the consumer device
> > > to the supplier endpoint. This is a tiny bit inefficient (not buggy) when
> > > trying to create device links or detecting cycles. So, improve this the
> > > same way we improved finding the consumer of a remote-endpoint property.
> > >
> > > Fixes: 4a032827daa8 ("of: property: Simplify of_link_to_phandle()")
> > > Signed-off-by: Saravana Kannan <saravanak@google.com>  
> >
> > After rebasing my own branch on v6.8-rc5 from v6.8-rc1 I started
> > getting unexpected warnings during device tree overlay removal. After a
> > somewhat painful bisection I identified this patch as the one that
> > triggers it all.  
> 
> Thanks for the report.
> 
> >  
> > > ---
> > > --- a/drivers/of/property.c
> > > +++ b/drivers/of/property.c
> > > @@ -1232,7 +1232,6 @@ DEFINE_SIMPLE_PROP(pinctrl5, "pinctrl-5", NULL)
> > >  DEFINE_SIMPLE_PROP(pinctrl6, "pinctrl-6", NULL)
> > >  DEFINE_SIMPLE_PROP(pinctrl7, "pinctrl-7", NULL)
> > >  DEFINE_SIMPLE_PROP(pinctrl8, "pinctrl-8", NULL)
> > > -DEFINE_SIMPLE_PROP(remote_endpoint, "remote-endpoint", NULL)
> > >  DEFINE_SIMPLE_PROP(pwms, "pwms", "#pwm-cells")
> > >  DEFINE_SIMPLE_PROP(resets, "resets", "#reset-cells")
> > >  DEFINE_SIMPLE_PROP(leds, "leds", NULL)
> > > @@ -1298,6 +1297,17 @@ static struct device_node *parse_interrupts(struct device_node *np,
> > >       return of_irq_parse_one(np, index, &sup_args) ? NULL : sup_args.np;
> > >  }
> > >
> > > +static struct device_node *parse_remote_endpoint(struct device_node *np,
> > > +                                              const char *prop_name,
> > > +                                              int index)
> > > +{
> > > +     /* Return NULL for index > 0 to signify end of remote-endpoints. */
> > > +     if (!index || strcmp(prop_name, "remote-endpoint"))  
> >
> > There seem to be a bug here: "!index" should be "index > 0", as the
> > comment suggests. Otherwise NULL is always returned.  
> 
> Ah crap, I think you are right. It should have been "index". Not
> "!index". But I tested this! Sigh. I probably screwed up my testing.
> 
> Please send out a Fix for this.
> 
> Geert, we got excited too soon. :(
> 
> > I am going to send a quick patch for that, but haven't done so yet
> > because it still won't solve the problem, so I wanted to open the topic
> > here without further delay.
> >
> > Even with the 'index > 0' fix I'm still getting pretty much the same:  
> 
> This part is confusing though. If I read your DT correctly, there's a
> cycle between platform:panel-dsi-lvds and i2c:13-002c. And fw_devlink
> should not be enforcing any ordering between those devices ever.
> 
> I'm surprised that in your "working" case, fw_devlink didn't detect
> any cycle. It should have. If there's any debugging to do, that's the
> one we need to debug.
> 
> >
> > [   34.836781] ------------[ cut here ]------------
> > [   34.841401] WARNING: CPU: 2 PID: 204 at drivers/base/devres.c:1064 devm_kfree+0x8c/0xfc
> > ...
> > [   35.024751] Call trace:
> > [   35.027199]  devm_kfree+0x8c/0xfc
> > [   35.030520]  devm_drm_panel_bridge_release+0x54/0x64 [drm_kms_helper]
> > [   35.036990]  devres_release_group+0xe0/0x164
> > [   35.041264]  i2c_device_remove+0x38/0x9c
> > [   35.045196]  device_remove+0x4c/0x80
> > [   35.048774]  device_release_driver_internal+0x1d4/0x230
> > [   35.054003]  device_release_driver+0x18/0x24
> > [   35.058279]  bus_remove_device+0xcc/0x10c
> > [   35.062292]  device_del+0x15c/0x41c
> > [   35.065786]  device_unregister+0x18/0x34
> > [   35.069714]  i2c_unregister_device+0x54/0x88
> > [   35.073988]  of_i2c_notify+0x98/0x224
> > [   35.077656]  blocking_notifier_call_chain+0x6c/0xa0
> > [   35.082543]  __of_changeset_entry_notify+0x100/0x16c
> > [   35.087515]  __of_changeset_revert_notify+0x44/0x78
> > [   35.092398]  of_overlay_remove+0x114/0x1c4
> > ...
> >
> > By comparing the two versions I found that before removing the overlay:
> >
> >  * in the "working" case (with this patch reverted) I have:
> >
> >    # ls /sys/class/devlink/ | grep 002c
> >    platform:hpbr--i2c:13-002c
> >    platform:panel-dsi-lvds--i2c:13-002c  
> 
> Can you check the "status" and "sync_state_only" file in this folder
> and tell me what it says?
> 
> Since these devices have a cyclic dependency between them, it should
> have been something other than "not tracked" and "sync_state_only"
> should be "1". But my guess is you'll see "active" and "0".
> 
> >    platform:regulator-sys-1v8--i2c:13-002c
> >    regulator:regulator.31--i2c:13-002c
> >    #
> >
> >  * in the "broken" case (v6.8-rc5 + s/!index/index > 0/ as mentioned):
> >
> >    # ls /sys/class/devlink/ | grep 002c
> >    platform:hpbr--i2c:13-002c
> >    platform:regulator-sys-1v8--i2c:13-002c
> >    regulator:regulator.30--i2c:13-002c
> >    #
> >
> > So in the latter case the panel-dsi-lvds--i2c:13-002c link is missing.
> > I think it gets created but later on removed. Here's a snippet of the
> > kernel log when that happens:
> >
> > [    9.578279] ----- cycle: start -----
> > [    9.578283] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > [    9.578308] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > [    9.578329] ----- cycle: end -----
> > [    9.578334] platform panel-dsi-lvds: Fixed dependency cycle(s) with /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > ...  
> 
> Somewhere in this area, I'm thinking you'll also see "device:
> 'i2c:13-002c--platform:panel-dsi-lvds': device_add" do you not? And if
> you enabled device link logs, you'll see that it was "sync state only"
> link.
> 
> > [    9.590620] /panel-dsi-lvds Dropping the fwnode link to /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > ...
> > [    9.597280] ----- cycle: start -----
> > [    9.597283] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > [    9.602781] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > [    9.607581] ----- cycle: end -----
> > [    9.607585] i2c 13-002c: Fixed dependency cycle(s) with /panel-dsi-lvds
> > [    9.614217] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_add
> > ...
> > [    9.614277] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /panel-dsi-lvds
> > [    9.614369] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /regulator-dock-sys-1v8
> > ...
> > [    9.739840] panel-simple panel-dsi-lvds: Dropping the link to 13-002c
> > [    9.739846] device: 'i2c:13-002c--platform:panel-dsi-lvds': device_unregister  
> 
> Oh yeah, see. The "device_add" I expected earlier is getting removed here.
> 
> > [   10.247037] sn65dsi83 13-002c: Dropping the link to panel-dsi-lvds
> > [   10.247049] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_unregister
> >
> > And here's the relevant portion of my device tree overlay:
> >
> > --------------------8<--------------------
> >  
> 
> I think the eventual fix would be this series + adding a
> "post-init-providers" property to the device that's supposed to probe
> first and point it to the device that's supposed to probe next. Do
> this at the device node level, not the endpoint level.
> https://lore.kernel.org/lkml/20240221233026.2915061-1-saravanak@google.com/

I'm certainly going to look at this series in more detail and at the
debugging you asked for, however I'm afraid I won't have access to the
hardware this week and it's not going to be a quick task anyway.

So in this moment I think it's quite clear that this specific patch
creates a regression and there is no clear fix that is reasonably
likely to get merged before 6.8.

I propose reverting this patch immediately, unless you have a better
short-term solution.

Luca

-- 
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [REGRESSION] Re: [PATCH v2 2/3] of: property: Improve finding the supplier of a remote-endpoint property
  2024-02-26 11:52       ` [REGRESSION] " Luca Ceresoli
@ 2024-02-28 21:56         ` Rob Herring
  2024-02-28 23:58           ` Saravana Kannan
  0 siblings, 1 reply; 21+ messages in thread
From: Rob Herring @ 2024-02-28 21:56 UTC (permalink / raw)
  To: Luca Ceresoli
  Cc: Saravana Kannan, Frank Rowand, Greg Kroah-Hartman, Xu Yang,
	kernel-team, devicetree, linux-kernel, Hervé Codina,
	Geert Uytterhoeven

On Mon, Feb 26, 2024 at 5:52 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
>
> Hello Saravana,
>
> On Fri, 23 Feb 2024 17:35:24 -0800
> Saravana Kannan <saravanak@google.com> wrote:
>
> > On Fri, Feb 23, 2024 at 8:18 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
> > >
> > > Hello Saravana,
> > >
> > > [+cc Hervé Codina]
> > >
> > > On Tue,  6 Feb 2024 17:18:01 -0800
> > > Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > > After commit 4a032827daa8 ("of: property: Simplify of_link_to_phandle()"),
> > > > remote-endpoint properties created a fwnode link from the consumer device
> > > > to the supplier endpoint. This is a tiny bit inefficient (not buggy) when
> > > > trying to create device links or detecting cycles. So, improve this the
> > > > same way we improved finding the consumer of a remote-endpoint property.
> > > >
> > > > Fixes: 4a032827daa8 ("of: property: Simplify of_link_to_phandle()")
> > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > >
> > > After rebasing my own branch on v6.8-rc5 from v6.8-rc1 I started
> > > getting unexpected warnings during device tree overlay removal. After a
> > > somewhat painful bisection I identified this patch as the one that
> > > triggers it all.
> >
> > Thanks for the report.
> >
> > >
> > > > ---
> > > > --- a/drivers/of/property.c
> > > > +++ b/drivers/of/property.c
> > > > @@ -1232,7 +1232,6 @@ DEFINE_SIMPLE_PROP(pinctrl5, "pinctrl-5", NULL)
> > > >  DEFINE_SIMPLE_PROP(pinctrl6, "pinctrl-6", NULL)
> > > >  DEFINE_SIMPLE_PROP(pinctrl7, "pinctrl-7", NULL)
> > > >  DEFINE_SIMPLE_PROP(pinctrl8, "pinctrl-8", NULL)
> > > > -DEFINE_SIMPLE_PROP(remote_endpoint, "remote-endpoint", NULL)
> > > >  DEFINE_SIMPLE_PROP(pwms, "pwms", "#pwm-cells")
> > > >  DEFINE_SIMPLE_PROP(resets, "resets", "#reset-cells")
> > > >  DEFINE_SIMPLE_PROP(leds, "leds", NULL)
> > > > @@ -1298,6 +1297,17 @@ static struct device_node *parse_interrupts(struct device_node *np,
> > > >       return of_irq_parse_one(np, index, &sup_args) ? NULL : sup_args.np;
> > > >  }
> > > >
> > > > +static struct device_node *parse_remote_endpoint(struct device_node *np,
> > > > +                                              const char *prop_name,
> > > > +                                              int index)
> > > > +{
> > > > +     /* Return NULL for index > 0 to signify end of remote-endpoints. */
> > > > +     if (!index || strcmp(prop_name, "remote-endpoint"))
> > >
> > > There seem to be a bug here: "!index" should be "index > 0", as the
> > > comment suggests. Otherwise NULL is always returned.
> >
> > Ah crap, I think you are right. It should have been "index". Not
> > "!index". But I tested this! Sigh. I probably screwed up my testing.
> >
> > Please send out a Fix for this.
> >
> > Geert, we got excited too soon. :(
> >
> > > I am going to send a quick patch for that, but haven't done so yet
> > > because it still won't solve the problem, so I wanted to open the topic
> > > here without further delay.
> > >
> > > Even with the 'index > 0' fix I'm still getting pretty much the same:
> >
> > This part is confusing though. If I read your DT correctly, there's a
> > cycle between platform:panel-dsi-lvds and i2c:13-002c. And fw_devlink
> > should not be enforcing any ordering between those devices ever.
> >
> > I'm surprised that in your "working" case, fw_devlink didn't detect
> > any cycle. It should have. If there's any debugging to do, that's the
> > one we need to debug.
> >
> > >
> > > [   34.836781] ------------[ cut here ]------------
> > > [   34.841401] WARNING: CPU: 2 PID: 204 at drivers/base/devres.c:1064 devm_kfree+0x8c/0xfc
> > > ...
> > > [   35.024751] Call trace:
> > > [   35.027199]  devm_kfree+0x8c/0xfc
> > > [   35.030520]  devm_drm_panel_bridge_release+0x54/0x64 [drm_kms_helper]
> > > [   35.036990]  devres_release_group+0xe0/0x164
> > > [   35.041264]  i2c_device_remove+0x38/0x9c
> > > [   35.045196]  device_remove+0x4c/0x80
> > > [   35.048774]  device_release_driver_internal+0x1d4/0x230
> > > [   35.054003]  device_release_driver+0x18/0x24
> > > [   35.058279]  bus_remove_device+0xcc/0x10c
> > > [   35.062292]  device_del+0x15c/0x41c
> > > [   35.065786]  device_unregister+0x18/0x34
> > > [   35.069714]  i2c_unregister_device+0x54/0x88
> > > [   35.073988]  of_i2c_notify+0x98/0x224
> > > [   35.077656]  blocking_notifier_call_chain+0x6c/0xa0
> > > [   35.082543]  __of_changeset_entry_notify+0x100/0x16c
> > > [   35.087515]  __of_changeset_revert_notify+0x44/0x78
> > > [   35.092398]  of_overlay_remove+0x114/0x1c4
> > > ...
> > >
> > > By comparing the two versions I found that before removing the overlay:
> > >
> > >  * in the "working" case (with this patch reverted) I have:
> > >
> > >    # ls /sys/class/devlink/ | grep 002c
> > >    platform:hpbr--i2c:13-002c
> > >    platform:panel-dsi-lvds--i2c:13-002c
> >
> > Can you check the "status" and "sync_state_only" file in this folder
> > and tell me what it says?
> >
> > Since these devices have a cyclic dependency between them, it should
> > have been something other than "not tracked" and "sync_state_only"
> > should be "1". But my guess is you'll see "active" and "0".
> >
> > >    platform:regulator-sys-1v8--i2c:13-002c
> > >    regulator:regulator.31--i2c:13-002c
> > >    #
> > >
> > >  * in the "broken" case (v6.8-rc5 + s/!index/index > 0/ as mentioned):
> > >
> > >    # ls /sys/class/devlink/ | grep 002c
> > >    platform:hpbr--i2c:13-002c
> > >    platform:regulator-sys-1v8--i2c:13-002c
> > >    regulator:regulator.30--i2c:13-002c
> > >    #
> > >
> > > So in the latter case the panel-dsi-lvds--i2c:13-002c link is missing.
> > > I think it gets created but later on removed. Here's a snippet of the
> > > kernel log when that happens:
> > >
> > > [    9.578279] ----- cycle: start -----
> > > [    9.578283] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > > [    9.578308] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > [    9.578329] ----- cycle: end -----
> > > [    9.578334] platform panel-dsi-lvds: Fixed dependency cycle(s) with /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > ...
> >
> > Somewhere in this area, I'm thinking you'll also see "device:
> > 'i2c:13-002c--platform:panel-dsi-lvds': device_add" do you not? And if
> > you enabled device link logs, you'll see that it was "sync state only"
> > link.
> >
> > > [    9.590620] /panel-dsi-lvds Dropping the fwnode link to /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > ...
> > > [    9.597280] ----- cycle: start -----
> > > [    9.597283] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > [    9.602781] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > > [    9.607581] ----- cycle: end -----
> > > [    9.607585] i2c 13-002c: Fixed dependency cycle(s) with /panel-dsi-lvds
> > > [    9.614217] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_add
> > > ...
> > > [    9.614277] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /panel-dsi-lvds
> > > [    9.614369] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /regulator-dock-sys-1v8
> > > ...
> > > [    9.739840] panel-simple panel-dsi-lvds: Dropping the link to 13-002c
> > > [    9.739846] device: 'i2c:13-002c--platform:panel-dsi-lvds': device_unregister
> >
> > Oh yeah, see. The "device_add" I expected earlier is getting removed here.
> >
> > > [   10.247037] sn65dsi83 13-002c: Dropping the link to panel-dsi-lvds
> > > [   10.247049] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_unregister
> > >
> > > And here's the relevant portion of my device tree overlay:
> > >
> > > --------------------8<--------------------
> > >
> >
> > I think the eventual fix would be this series + adding a
> > "post-init-providers" property to the device that's supposed to probe
> > first and point it to the device that's supposed to probe next. Do
> > this at the device node level, not the endpoint level.
> > https://lore.kernel.org/lkml/20240221233026.2915061-1-saravanak@google.com/
>
> I'm certainly going to look at this series in more detail and at the
> debugging you asked for, however I'm afraid I won't have access to the
> hardware this week and it's not going to be a quick task anyway.
>
> So in this moment I think it's quite clear that this specific patch
> creates a regression and there is no clear fix that is reasonably
> likely to get merged before 6.8.
>
> I propose reverting this patch immediately, unless you have a better
> short-term solution.

It's just this one of the 3 patches that needs reverting?

Rob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/3] of: property: Improve finding the supplier of a remote-endpoint property
  2024-02-23 16:18   ` Luca Ceresoli
  2024-02-24  1:35     ` Saravana Kannan
@ 2024-02-28 22:01     ` Rob Herring
  1 sibling, 0 replies; 21+ messages in thread
From: Rob Herring @ 2024-02-28 22:01 UTC (permalink / raw)
  To: Luca Ceresoli
  Cc: Saravana Kannan, Frank Rowand, Greg Kroah-Hartman, Xu Yang,
	kernel-team, devicetree, linux-kernel, Hervé Codina

On Fri, Feb 23, 2024 at 10:18 AM Luca Ceresoli
<luca.ceresoli@bootlin.com> wrote:
>
> Hello Saravana,
>
> [+cc Hervé Codina]
>
> On Tue,  6 Feb 2024 17:18:01 -0800
> Saravana Kannan <saravanak@google.com> wrote:
>
> > After commit 4a032827daa8 ("of: property: Simplify of_link_to_phandle()"),
> > remote-endpoint properties created a fwnode link from the consumer device
> > to the supplier endpoint. This is a tiny bit inefficient (not buggy) when
> > trying to create device links or detecting cycles. So, improve this the
> > same way we improved finding the consumer of a remote-endpoint property.
> >
> > Fixes: 4a032827daa8 ("of: property: Simplify of_link_to_phandle()")
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
>
> After rebasing my own branch on v6.8-rc5 from v6.8-rc1 I started
> getting unexpected warnings during device tree overlay removal. After a
> somewhat painful bisection I identified this patch as the one that
> triggers it all.

How are you applying the overlay?

Rob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [REGRESSION] Re: [PATCH v2 2/3] of: property: Improve finding the supplier of a remote-endpoint property
  2024-02-28 21:56         ` Rob Herring
@ 2024-02-28 23:58           ` Saravana Kannan
  2024-02-29  0:26             ` Rob Herring
  0 siblings, 1 reply; 21+ messages in thread
From: Saravana Kannan @ 2024-02-28 23:58 UTC (permalink / raw)
  To: Rob Herring
  Cc: Luca Ceresoli, Frank Rowand, Greg Kroah-Hartman, Xu Yang,
	kernel-team, devicetree, linux-kernel, Hervé Codina,
	Geert Uytterhoeven

On Wed, Feb 28, 2024 at 1:56 PM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Mon, Feb 26, 2024 at 5:52 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
> >
> > Hello Saravana,
> >
> > On Fri, 23 Feb 2024 17:35:24 -0800
> > Saravana Kannan <saravanak@google.com> wrote:
> >
> > > On Fri, Feb 23, 2024 at 8:18 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
> > > >
> > > > Hello Saravana,
> > > >
> > > > [+cc Hervé Codina]
> > > >
> > > > On Tue,  6 Feb 2024 17:18:01 -0800
> > > > Saravana Kannan <saravanak@google.com> wrote:
> > > >
> > > > > After commit 4a032827daa8 ("of: property: Simplify of_link_to_phandle()"),
> > > > > remote-endpoint properties created a fwnode link from the consumer device
> > > > > to the supplier endpoint. This is a tiny bit inefficient (not buggy) when
> > > > > trying to create device links or detecting cycles. So, improve this the
> > > > > same way we improved finding the consumer of a remote-endpoint property.
> > > > >
> > > > > Fixes: 4a032827daa8 ("of: property: Simplify of_link_to_phandle()")
> > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > >
> > > > After rebasing my own branch on v6.8-rc5 from v6.8-rc1 I started
> > > > getting unexpected warnings during device tree overlay removal. After a
> > > > somewhat painful bisection I identified this patch as the one that
> > > > triggers it all.
> > >
> > > Thanks for the report.
> > >
> > > >
> > > > > ---
> > > > > --- a/drivers/of/property.c
> > > > > +++ b/drivers/of/property.c
> > > > > @@ -1232,7 +1232,6 @@ DEFINE_SIMPLE_PROP(pinctrl5, "pinctrl-5", NULL)
> > > > >  DEFINE_SIMPLE_PROP(pinctrl6, "pinctrl-6", NULL)
> > > > >  DEFINE_SIMPLE_PROP(pinctrl7, "pinctrl-7", NULL)
> > > > >  DEFINE_SIMPLE_PROP(pinctrl8, "pinctrl-8", NULL)
> > > > > -DEFINE_SIMPLE_PROP(remote_endpoint, "remote-endpoint", NULL)
> > > > >  DEFINE_SIMPLE_PROP(pwms, "pwms", "#pwm-cells")
> > > > >  DEFINE_SIMPLE_PROP(resets, "resets", "#reset-cells")
> > > > >  DEFINE_SIMPLE_PROP(leds, "leds", NULL)
> > > > > @@ -1298,6 +1297,17 @@ static struct device_node *parse_interrupts(struct device_node *np,
> > > > >       return of_irq_parse_one(np, index, &sup_args) ? NULL : sup_args.np;
> > > > >  }
> > > > >
> > > > > +static struct device_node *parse_remote_endpoint(struct device_node *np,
> > > > > +                                              const char *prop_name,
> > > > > +                                              int index)
> > > > > +{
> > > > > +     /* Return NULL for index > 0 to signify end of remote-endpoints. */
> > > > > +     if (!index || strcmp(prop_name, "remote-endpoint"))
> > > >
> > > > There seem to be a bug here: "!index" should be "index > 0", as the
> > > > comment suggests. Otherwise NULL is always returned.
> > >
> > > Ah crap, I think you are right. It should have been "index". Not
> > > "!index". But I tested this! Sigh. I probably screwed up my testing.
> > >
> > > Please send out a Fix for this.
> > >
> > > Geert, we got excited too soon. :(
> > >
> > > > I am going to send a quick patch for that, but haven't done so yet
> > > > because it still won't solve the problem, so I wanted to open the topic
> > > > here without further delay.
> > > >
> > > > Even with the 'index > 0' fix I'm still getting pretty much the same:
> > >
> > > This part is confusing though. If I read your DT correctly, there's a
> > > cycle between platform:panel-dsi-lvds and i2c:13-002c. And fw_devlink
> > > should not be enforcing any ordering between those devices ever.
> > >
> > > I'm surprised that in your "working" case, fw_devlink didn't detect
> > > any cycle. It should have. If there's any debugging to do, that's the
> > > one we need to debug.
> > >
> > > >
> > > > [   34.836781] ------------[ cut here ]------------
> > > > [   34.841401] WARNING: CPU: 2 PID: 204 at drivers/base/devres.c:1064 devm_kfree+0x8c/0xfc
> > > > ...
> > > > [   35.024751] Call trace:
> > > > [   35.027199]  devm_kfree+0x8c/0xfc
> > > > [   35.030520]  devm_drm_panel_bridge_release+0x54/0x64 [drm_kms_helper]
> > > > [   35.036990]  devres_release_group+0xe0/0x164
> > > > [   35.041264]  i2c_device_remove+0x38/0x9c
> > > > [   35.045196]  device_remove+0x4c/0x80
> > > > [   35.048774]  device_release_driver_internal+0x1d4/0x230
> > > > [   35.054003]  device_release_driver+0x18/0x24
> > > > [   35.058279]  bus_remove_device+0xcc/0x10c
> > > > [   35.062292]  device_del+0x15c/0x41c
> > > > [   35.065786]  device_unregister+0x18/0x34
> > > > [   35.069714]  i2c_unregister_device+0x54/0x88
> > > > [   35.073988]  of_i2c_notify+0x98/0x224
> > > > [   35.077656]  blocking_notifier_call_chain+0x6c/0xa0
> > > > [   35.082543]  __of_changeset_entry_notify+0x100/0x16c
> > > > [   35.087515]  __of_changeset_revert_notify+0x44/0x78
> > > > [   35.092398]  of_overlay_remove+0x114/0x1c4
> > > > ...
> > > >
> > > > By comparing the two versions I found that before removing the overlay:
> > > >
> > > >  * in the "working" case (with this patch reverted) I have:
> > > >
> > > >    # ls /sys/class/devlink/ | grep 002c
> > > >    platform:hpbr--i2c:13-002c
> > > >    platform:panel-dsi-lvds--i2c:13-002c
> > >
> > > Can you check the "status" and "sync_state_only" file in this folder
> > > and tell me what it says?
> > >
> > > Since these devices have a cyclic dependency between them, it should
> > > have been something other than "not tracked" and "sync_state_only"
> > > should be "1". But my guess is you'll see "active" and "0".
> > >
> > > >    platform:regulator-sys-1v8--i2c:13-002c
> > > >    regulator:regulator.31--i2c:13-002c
> > > >    #
> > > >
> > > >  * in the "broken" case (v6.8-rc5 + s/!index/index > 0/ as mentioned):
> > > >
> > > >    # ls /sys/class/devlink/ | grep 002c
> > > >    platform:hpbr--i2c:13-002c
> > > >    platform:regulator-sys-1v8--i2c:13-002c
> > > >    regulator:regulator.30--i2c:13-002c
> > > >    #
> > > >
> > > > So in the latter case the panel-dsi-lvds--i2c:13-002c link is missing.
> > > > I think it gets created but later on removed. Here's a snippet of the
> > > > kernel log when that happens:
> > > >
> > > > [    9.578279] ----- cycle: start -----
> > > > [    9.578283] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > > > [    9.578308] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > [    9.578329] ----- cycle: end -----
> > > > [    9.578334] platform panel-dsi-lvds: Fixed dependency cycle(s) with /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > ...
> > >
> > > Somewhere in this area, I'm thinking you'll also see "device:
> > > 'i2c:13-002c--platform:panel-dsi-lvds': device_add" do you not? And if
> > > you enabled device link logs, you'll see that it was "sync state only"
> > > link.
> > >
> > > > [    9.590620] /panel-dsi-lvds Dropping the fwnode link to /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > ...
> > > > [    9.597280] ----- cycle: start -----
> > > > [    9.597283] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > [    9.602781] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > > > [    9.607581] ----- cycle: end -----
> > > > [    9.607585] i2c 13-002c: Fixed dependency cycle(s) with /panel-dsi-lvds
> > > > [    9.614217] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_add
> > > > ...
> > > > [    9.614277] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /panel-dsi-lvds
> > > > [    9.614369] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /regulator-dock-sys-1v8
> > > > ...
> > > > [    9.739840] panel-simple panel-dsi-lvds: Dropping the link to 13-002c
> > > > [    9.739846] device: 'i2c:13-002c--platform:panel-dsi-lvds': device_unregister
> > >
> > > Oh yeah, see. The "device_add" I expected earlier is getting removed here.
> > >
> > > > [   10.247037] sn65dsi83 13-002c: Dropping the link to panel-dsi-lvds
> > > > [   10.247049] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_unregister
> > > >
> > > > And here's the relevant portion of my device tree overlay:
> > > >
> > > > --------------------8<--------------------
> > > >
> > >
> > > I think the eventual fix would be this series + adding a
> > > "post-init-providers" property to the device that's supposed to probe
> > > first and point it to the device that's supposed to probe next. Do
> > > this at the device node level, not the endpoint level.
> > > https://lore.kernel.org/lkml/20240221233026.2915061-1-saravanak@google.com/
> >
> > I'm certainly going to look at this series in more detail and at the
> > debugging you asked for, however I'm afraid I won't have access to the
> > hardware this week and it's not going to be a quick task anyway.
> >
> > So in this moment I think it's quite clear that this specific patch
> > creates a regression and there is no clear fix that is reasonably
> > likely to get merged before 6.8.
> >
> > I propose reverting this patch immediately, unless you have a better
> > short-term solution.
>
> It's just this one of the 3 patches that needs reverting?

I sent a fix. With the fix, it's just exposing a bug elsewhere.

-Saravana

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [REGRESSION] Re: [PATCH v2 2/3] of: property: Improve finding the supplier of a remote-endpoint property
  2024-02-28 23:58           ` Saravana Kannan
@ 2024-02-29  0:26             ` Rob Herring
  2024-02-29  9:34               ` Luca Ceresoli
  0 siblings, 1 reply; 21+ messages in thread
From: Rob Herring @ 2024-02-29  0:26 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Luca Ceresoli, Frank Rowand, Greg Kroah-Hartman, Xu Yang,
	kernel-team, devicetree, linux-kernel, Hervé Codina,
	Geert Uytterhoeven

On Wed, Feb 28, 2024 at 5:58 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Wed, Feb 28, 2024 at 1:56 PM Rob Herring <robh+dt@kernel.org> wrote:
> >
> > On Mon, Feb 26, 2024 at 5:52 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
> > >
> > > Hello Saravana,
> > >
> > > On Fri, 23 Feb 2024 17:35:24 -0800
> > > Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > > On Fri, Feb 23, 2024 at 8:18 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
> > > > >
> > > > > Hello Saravana,
> > > > >
> > > > > [+cc Hervé Codina]
> > > > >
> > > > > On Tue,  6 Feb 2024 17:18:01 -0800
> > > > > Saravana Kannan <saravanak@google.com> wrote:
> > > > >
> > > > > > After commit 4a032827daa8 ("of: property: Simplify of_link_to_phandle()"),
> > > > > > remote-endpoint properties created a fwnode link from the consumer device
> > > > > > to the supplier endpoint. This is a tiny bit inefficient (not buggy) when
> > > > > > trying to create device links or detecting cycles. So, improve this the
> > > > > > same way we improved finding the consumer of a remote-endpoint property.
> > > > > >
> > > > > > Fixes: 4a032827daa8 ("of: property: Simplify of_link_to_phandle()")
> > > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > > >
> > > > > After rebasing my own branch on v6.8-rc5 from v6.8-rc1 I started
> > > > > getting unexpected warnings during device tree overlay removal. After a
> > > > > somewhat painful bisection I identified this patch as the one that
> > > > > triggers it all.
> > > >
> > > > Thanks for the report.
> > > >
> > > > >
> > > > > > ---
> > > > > > --- a/drivers/of/property.c
> > > > > > +++ b/drivers/of/property.c
> > > > > > @@ -1232,7 +1232,6 @@ DEFINE_SIMPLE_PROP(pinctrl5, "pinctrl-5", NULL)
> > > > > >  DEFINE_SIMPLE_PROP(pinctrl6, "pinctrl-6", NULL)
> > > > > >  DEFINE_SIMPLE_PROP(pinctrl7, "pinctrl-7", NULL)
> > > > > >  DEFINE_SIMPLE_PROP(pinctrl8, "pinctrl-8", NULL)
> > > > > > -DEFINE_SIMPLE_PROP(remote_endpoint, "remote-endpoint", NULL)
> > > > > >  DEFINE_SIMPLE_PROP(pwms, "pwms", "#pwm-cells")
> > > > > >  DEFINE_SIMPLE_PROP(resets, "resets", "#reset-cells")
> > > > > >  DEFINE_SIMPLE_PROP(leds, "leds", NULL)
> > > > > > @@ -1298,6 +1297,17 @@ static struct device_node *parse_interrupts(struct device_node *np,
> > > > > >       return of_irq_parse_one(np, index, &sup_args) ? NULL : sup_args.np;
> > > > > >  }
> > > > > >
> > > > > > +static struct device_node *parse_remote_endpoint(struct device_node *np,
> > > > > > +                                              const char *prop_name,
> > > > > > +                                              int index)
> > > > > > +{
> > > > > > +     /* Return NULL for index > 0 to signify end of remote-endpoints. */
> > > > > > +     if (!index || strcmp(prop_name, "remote-endpoint"))
> > > > >
> > > > > There seem to be a bug here: "!index" should be "index > 0", as the
> > > > > comment suggests. Otherwise NULL is always returned.
> > > >
> > > > Ah crap, I think you are right. It should have been "index". Not
> > > > "!index". But I tested this! Sigh. I probably screwed up my testing.
> > > >
> > > > Please send out a Fix for this.
> > > >
> > > > Geert, we got excited too soon. :(
> > > >
> > > > > I am going to send a quick patch for that, but haven't done so yet
> > > > > because it still won't solve the problem, so I wanted to open the topic
> > > > > here without further delay.
> > > > >
> > > > > Even with the 'index > 0' fix I'm still getting pretty much the same:
> > > >
> > > > This part is confusing though. If I read your DT correctly, there's a
> > > > cycle between platform:panel-dsi-lvds and i2c:13-002c. And fw_devlink
> > > > should not be enforcing any ordering between those devices ever.
> > > >
> > > > I'm surprised that in your "working" case, fw_devlink didn't detect
> > > > any cycle. It should have. If there's any debugging to do, that's the
> > > > one we need to debug.
> > > >
> > > > >
> > > > > [   34.836781] ------------[ cut here ]------------
> > > > > [   34.841401] WARNING: CPU: 2 PID: 204 at drivers/base/devres.c:1064 devm_kfree+0x8c/0xfc
> > > > > ...
> > > > > [   35.024751] Call trace:
> > > > > [   35.027199]  devm_kfree+0x8c/0xfc
> > > > > [   35.030520]  devm_drm_panel_bridge_release+0x54/0x64 [drm_kms_helper]
> > > > > [   35.036990]  devres_release_group+0xe0/0x164
> > > > > [   35.041264]  i2c_device_remove+0x38/0x9c
> > > > > [   35.045196]  device_remove+0x4c/0x80
> > > > > [   35.048774]  device_release_driver_internal+0x1d4/0x230
> > > > > [   35.054003]  device_release_driver+0x18/0x24
> > > > > [   35.058279]  bus_remove_device+0xcc/0x10c
> > > > > [   35.062292]  device_del+0x15c/0x41c
> > > > > [   35.065786]  device_unregister+0x18/0x34
> > > > > [   35.069714]  i2c_unregister_device+0x54/0x88
> > > > > [   35.073988]  of_i2c_notify+0x98/0x224
> > > > > [   35.077656]  blocking_notifier_call_chain+0x6c/0xa0
> > > > > [   35.082543]  __of_changeset_entry_notify+0x100/0x16c
> > > > > [   35.087515]  __of_changeset_revert_notify+0x44/0x78
> > > > > [   35.092398]  of_overlay_remove+0x114/0x1c4
> > > > > ...
> > > > >
> > > > > By comparing the two versions I found that before removing the overlay:
> > > > >
> > > > >  * in the "working" case (with this patch reverted) I have:
> > > > >
> > > > >    # ls /sys/class/devlink/ | grep 002c
> > > > >    platform:hpbr--i2c:13-002c
> > > > >    platform:panel-dsi-lvds--i2c:13-002c
> > > >
> > > > Can you check the "status" and "sync_state_only" file in this folder
> > > > and tell me what it says?
> > > >
> > > > Since these devices have a cyclic dependency between them, it should
> > > > have been something other than "not tracked" and "sync_state_only"
> > > > should be "1". But my guess is you'll see "active" and "0".
> > > >
> > > > >    platform:regulator-sys-1v8--i2c:13-002c
> > > > >    regulator:regulator.31--i2c:13-002c
> > > > >    #
> > > > >
> > > > >  * in the "broken" case (v6.8-rc5 + s/!index/index > 0/ as mentioned):
> > > > >
> > > > >    # ls /sys/class/devlink/ | grep 002c
> > > > >    platform:hpbr--i2c:13-002c
> > > > >    platform:regulator-sys-1v8--i2c:13-002c
> > > > >    regulator:regulator.30--i2c:13-002c
> > > > >    #
> > > > >
> > > > > So in the latter case the panel-dsi-lvds--i2c:13-002c link is missing.
> > > > > I think it gets created but later on removed. Here's a snippet of the
> > > > > kernel log when that happens:
> > > > >
> > > > > [    9.578279] ----- cycle: start -----
> > > > > [    9.578283] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > > > > [    9.578308] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > [    9.578329] ----- cycle: end -----
> > > > > [    9.578334] platform panel-dsi-lvds: Fixed dependency cycle(s) with /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > ...
> > > >
> > > > Somewhere in this area, I'm thinking you'll also see "device:
> > > > 'i2c:13-002c--platform:panel-dsi-lvds': device_add" do you not? And if
> > > > you enabled device link logs, you'll see that it was "sync state only"
> > > > link.
> > > >
> > > > > [    9.590620] /panel-dsi-lvds Dropping the fwnode link to /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > ...
> > > > > [    9.597280] ----- cycle: start -----
> > > > > [    9.597283] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > [    9.602781] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > > > > [    9.607581] ----- cycle: end -----
> > > > > [    9.607585] i2c 13-002c: Fixed dependency cycle(s) with /panel-dsi-lvds
> > > > > [    9.614217] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_add
> > > > > ...
> > > > > [    9.614277] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /panel-dsi-lvds
> > > > > [    9.614369] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /regulator-dock-sys-1v8
> > > > > ...
> > > > > [    9.739840] panel-simple panel-dsi-lvds: Dropping the link to 13-002c
> > > > > [    9.739846] device: 'i2c:13-002c--platform:panel-dsi-lvds': device_unregister
> > > >
> > > > Oh yeah, see. The "device_add" I expected earlier is getting removed here.
> > > >
> > > > > [   10.247037] sn65dsi83 13-002c: Dropping the link to panel-dsi-lvds
> > > > > [   10.247049] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_unregister
> > > > >
> > > > > And here's the relevant portion of my device tree overlay:
> > > > >
> > > > > --------------------8<--------------------
> > > > >
> > > >
> > > > I think the eventual fix would be this series + adding a
> > > > "post-init-providers" property to the device that's supposed to probe
> > > > first and point it to the device that's supposed to probe next. Do
> > > > this at the device node level, not the endpoint level.
> > > > https://lore.kernel.org/lkml/20240221233026.2915061-1-saravanak@google.com/
> > >
> > > I'm certainly going to look at this series in more detail and at the
> > > debugging you asked for, however I'm afraid I won't have access to the
> > > hardware this week and it's not going to be a quick task anyway.
> > >
> > > So in this moment I think it's quite clear that this specific patch
> > > creates a regression and there is no clear fix that is reasonably
> > > likely to get merged before 6.8.
> > >
> > > I propose reverting this patch immediately, unless you have a better
> > > short-term solution.
> >
> > It's just this one of the 3 patches that needs reverting?
>
> I sent a fix. With the fix, it's just exposing a bug elsewhere.

That's not telling me what to do...

You say apply the fix. Luca says revert. I say I wish I made this 6.9
material. Which is it?

If the overlay applying depends on out of tree code (likely as there
are limited ways to apply an overlay in mainline), then I don't really
care if there is still a regression.

Rob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [REGRESSION] Re: [PATCH v2 2/3] of: property: Improve finding the supplier of a remote-endpoint property
  2024-02-29  0:26             ` Rob Herring
@ 2024-02-29  9:34               ` Luca Ceresoli
  2024-02-29 22:10                 ` Rob Herring
  0 siblings, 1 reply; 21+ messages in thread
From: Luca Ceresoli @ 2024-02-29  9:34 UTC (permalink / raw)
  To: Rob Herring
  Cc: Saravana Kannan, Frank Rowand, Greg Kroah-Hartman, Xu Yang,
	kernel-team, devicetree, linux-kernel, Hervé Codina,
	Geert Uytterhoeven

Hi Rob, Saravana,

On Wed, 28 Feb 2024 18:26:36 -0600
Rob Herring <robh+dt@kernel.org> wrote:

> On Wed, Feb 28, 2024 at 5:58 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > On Wed, Feb 28, 2024 at 1:56 PM Rob Herring <robh+dt@kernel.org> wrote:  
> > >
> > > On Mon, Feb 26, 2024 at 5:52 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:  
> > > >
> > > > Hello Saravana,
> > > >
> > > > On Fri, 23 Feb 2024 17:35:24 -0800
> > > > Saravana Kannan <saravanak@google.com> wrote:
> > > >  
> > > > > On Fri, Feb 23, 2024 at 8:18 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:  
> > > > > >
> > > > > > Hello Saravana,
> > > > > >
> > > > > > [+cc Hervé Codina]
> > > > > >
> > > > > > On Tue,  6 Feb 2024 17:18:01 -0800
> > > > > > Saravana Kannan <saravanak@google.com> wrote:
> > > > > >  
> > > > > > > After commit 4a032827daa8 ("of: property: Simplify of_link_to_phandle()"),
> > > > > > > remote-endpoint properties created a fwnode link from the consumer device
> > > > > > > to the supplier endpoint. This is a tiny bit inefficient (not buggy) when
> > > > > > > trying to create device links or detecting cycles. So, improve this the
> > > > > > > same way we improved finding the consumer of a remote-endpoint property.
> > > > > > >
> > > > > > > Fixes: 4a032827daa8 ("of: property: Simplify of_link_to_phandle()")
> > > > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>  
> > > > > >
> > > > > > After rebasing my own branch on v6.8-rc5 from v6.8-rc1 I started
> > > > > > getting unexpected warnings during device tree overlay removal. After a
> > > > > > somewhat painful bisection I identified this patch as the one that
> > > > > > triggers it all.  
> > > > >
> > > > > Thanks for the report.
> > > > >  
> > > > > >  
> > > > > > > ---
> > > > > > > --- a/drivers/of/property.c
> > > > > > > +++ b/drivers/of/property.c
> > > > > > > @@ -1232,7 +1232,6 @@ DEFINE_SIMPLE_PROP(pinctrl5, "pinctrl-5", NULL)
> > > > > > >  DEFINE_SIMPLE_PROP(pinctrl6, "pinctrl-6", NULL)
> > > > > > >  DEFINE_SIMPLE_PROP(pinctrl7, "pinctrl-7", NULL)
> > > > > > >  DEFINE_SIMPLE_PROP(pinctrl8, "pinctrl-8", NULL)
> > > > > > > -DEFINE_SIMPLE_PROP(remote_endpoint, "remote-endpoint", NULL)
> > > > > > >  DEFINE_SIMPLE_PROP(pwms, "pwms", "#pwm-cells")
> > > > > > >  DEFINE_SIMPLE_PROP(resets, "resets", "#reset-cells")
> > > > > > >  DEFINE_SIMPLE_PROP(leds, "leds", NULL)
> > > > > > > @@ -1298,6 +1297,17 @@ static struct device_node *parse_interrupts(struct device_node *np,
> > > > > > >       return of_irq_parse_one(np, index, &sup_args) ? NULL : sup_args.np;
> > > > > > >  }
> > > > > > >
> > > > > > > +static struct device_node *parse_remote_endpoint(struct device_node *np,
> > > > > > > +                                              const char *prop_name,
> > > > > > > +                                              int index)
> > > > > > > +{
> > > > > > > +     /* Return NULL for index > 0 to signify end of remote-endpoints. */
> > > > > > > +     if (!index || strcmp(prop_name, "remote-endpoint"))  
> > > > > >
> > > > > > There seem to be a bug here: "!index" should be "index > 0", as the
> > > > > > comment suggests. Otherwise NULL is always returned.  
> > > > >
> > > > > Ah crap, I think you are right. It should have been "index". Not
> > > > > "!index". But I tested this! Sigh. I probably screwed up my testing.
> > > > >
> > > > > Please send out a Fix for this.
> > > > >
> > > > > Geert, we got excited too soon. :(
> > > > >  
> > > > > > I am going to send a quick patch for that, but haven't done so yet
> > > > > > because it still won't solve the problem, so I wanted to open the topic
> > > > > > here without further delay.
> > > > > >
> > > > > > Even with the 'index > 0' fix I'm still getting pretty much the same:  
> > > > >
> > > > > This part is confusing though. If I read your DT correctly, there's a
> > > > > cycle between platform:panel-dsi-lvds and i2c:13-002c. And fw_devlink
> > > > > should not be enforcing any ordering between those devices ever.
> > > > >
> > > > > I'm surprised that in your "working" case, fw_devlink didn't detect
> > > > > any cycle. It should have. If there's any debugging to do, that's the
> > > > > one we need to debug.
> > > > >  
> > > > > >
> > > > > > [   34.836781] ------------[ cut here ]------------
> > > > > > [   34.841401] WARNING: CPU: 2 PID: 204 at drivers/base/devres.c:1064 devm_kfree+0x8c/0xfc
> > > > > > ...
> > > > > > [   35.024751] Call trace:
> > > > > > [   35.027199]  devm_kfree+0x8c/0xfc
> > > > > > [   35.030520]  devm_drm_panel_bridge_release+0x54/0x64 [drm_kms_helper]
> > > > > > [   35.036990]  devres_release_group+0xe0/0x164
> > > > > > [   35.041264]  i2c_device_remove+0x38/0x9c
> > > > > > [   35.045196]  device_remove+0x4c/0x80
> > > > > > [   35.048774]  device_release_driver_internal+0x1d4/0x230
> > > > > > [   35.054003]  device_release_driver+0x18/0x24
> > > > > > [   35.058279]  bus_remove_device+0xcc/0x10c
> > > > > > [   35.062292]  device_del+0x15c/0x41c
> > > > > > [   35.065786]  device_unregister+0x18/0x34
> > > > > > [   35.069714]  i2c_unregister_device+0x54/0x88
> > > > > > [   35.073988]  of_i2c_notify+0x98/0x224
> > > > > > [   35.077656]  blocking_notifier_call_chain+0x6c/0xa0
> > > > > > [   35.082543]  __of_changeset_entry_notify+0x100/0x16c
> > > > > > [   35.087515]  __of_changeset_revert_notify+0x44/0x78
> > > > > > [   35.092398]  of_overlay_remove+0x114/0x1c4
> > > > > > ...
> > > > > >
> > > > > > By comparing the two versions I found that before removing the overlay:
> > > > > >
> > > > > >  * in the "working" case (with this patch reverted) I have:
> > > > > >
> > > > > >    # ls /sys/class/devlink/ | grep 002c
> > > > > >    platform:hpbr--i2c:13-002c
> > > > > >    platform:panel-dsi-lvds--i2c:13-002c  
> > > > >
> > > > > Can you check the "status" and "sync_state_only" file in this folder
> > > > > and tell me what it says?
> > > > >
> > > > > Since these devices have a cyclic dependency between them, it should
> > > > > have been something other than "not tracked" and "sync_state_only"
> > > > > should be "1". But my guess is you'll see "active" and "0".
> > > > >  
> > > > > >    platform:regulator-sys-1v8--i2c:13-002c
> > > > > >    regulator:regulator.31--i2c:13-002c
> > > > > >    #
> > > > > >
> > > > > >  * in the "broken" case (v6.8-rc5 + s/!index/index > 0/ as mentioned):
> > > > > >
> > > > > >    # ls /sys/class/devlink/ | grep 002c
> > > > > >    platform:hpbr--i2c:13-002c
> > > > > >    platform:regulator-sys-1v8--i2c:13-002c
> > > > > >    regulator:regulator.30--i2c:13-002c
> > > > > >    #
> > > > > >
> > > > > > So in the latter case the panel-dsi-lvds--i2c:13-002c link is missing.
> > > > > > I think it gets created but later on removed. Here's a snippet of the
> > > > > > kernel log when that happens:
> > > > > >
> > > > > > [    9.578279] ----- cycle: start -----
> > > > > > [    9.578283] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > > > > > [    9.578308] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > > [    9.578329] ----- cycle: end -----
> > > > > > [    9.578334] platform panel-dsi-lvds: Fixed dependency cycle(s) with /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > > ...  
> > > > >
> > > > > Somewhere in this area, I'm thinking you'll also see "device:
> > > > > 'i2c:13-002c--platform:panel-dsi-lvds': device_add" do you not? And if
> > > > > you enabled device link logs, you'll see that it was "sync state only"
> > > > > link.
> > > > >  
> > > > > > [    9.590620] /panel-dsi-lvds Dropping the fwnode link to /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > > ...
> > > > > > [    9.597280] ----- cycle: start -----
> > > > > > [    9.597283] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > > [    9.602781] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > > > > > [    9.607581] ----- cycle: end -----
> > > > > > [    9.607585] i2c 13-002c: Fixed dependency cycle(s) with /panel-dsi-lvds
> > > > > > [    9.614217] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_add
> > > > > > ...
> > > > > > [    9.614277] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /panel-dsi-lvds
> > > > > > [    9.614369] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /regulator-dock-sys-1v8
> > > > > > ...
> > > > > > [    9.739840] panel-simple panel-dsi-lvds: Dropping the link to 13-002c
> > > > > > [    9.739846] device: 'i2c:13-002c--platform:panel-dsi-lvds': device_unregister  
> > > > >
> > > > > Oh yeah, see. The "device_add" I expected earlier is getting removed here.
> > > > >  
> > > > > > [   10.247037] sn65dsi83 13-002c: Dropping the link to panel-dsi-lvds
> > > > > > [   10.247049] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_unregister
> > > > > >
> > > > > > And here's the relevant portion of my device tree overlay:
> > > > > >
> > > > > > --------------------8<--------------------
> > > > > >  
> > > > >
> > > > > I think the eventual fix would be this series + adding a
> > > > > "post-init-providers" property to the device that's supposed to probe
> > > > > first and point it to the device that's supposed to probe next. Do
> > > > > this at the device node level, not the endpoint level.
> > > > > https://lore.kernel.org/lkml/20240221233026.2915061-1-saravanak@google.com/  
> > > >
> > > > I'm certainly going to look at this series in more detail and at the
> > > > debugging you asked for, however I'm afraid I won't have access to the
> > > > hardware this week and it's not going to be a quick task anyway.
> > > >
> > > > So in this moment I think it's quite clear that this specific patch
> > > > creates a regression and there is no clear fix that is reasonably
> > > > likely to get merged before 6.8.
> > > >
> > > > I propose reverting this patch immediately, unless you have a better
> > > > short-term solution.  
> > >
> > > It's just this one of the 3 patches that needs reverting?  

Just this patch. I reverted only this and the issue disappeared.

> > I sent a fix. With the fix, it's just exposing a bug elsewhere. 

Exactly, this patch has two issues and only the easy one has a fix [0]
currently as far as I know.

> You say apply the fix. Luca says revert. I say I wish I made this 6.9
> material. Which is it?
> 
> If the overlay applying depends on out of tree code (likely as there
> are limited ways to apply an overlay in mainline), then I don't really
> care if there is still a regression.

Obviously, to load and unload the overlays I'm using code not yet
in mainline. It is using of_overlay_fdt_apply() and of_overlay_remove()
via a driver underdevelopment that is similar to the one Hervé and
Lizhi Hou are working on [1][2].

I see the point that "we are not breaking existing use cases as no code
is (un)loading overlays except unittest", sure.

As I see it, we have a feature in the kernel that is not used, but it
will be, eventually: there are use cases, development is progressing and
patches are being sent actively. My opinion is that we should not
put additional known obstacles that will make it even harder than it
already is.

[0]
https://lore.kernel.org/linux-devicetree/20240224052436.3552333-1-saravanak@google.com/
[1]
https://lore.kernel.org/all/1692120000-46900-1-git-send-email-lizhi.hou@amd.com/
[2]
https://lore.kernel.org/lkml/20220427094502.456111-1-clement.leger@bootlin.com/

Luca

-- 
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [REGRESSION] Re: [PATCH v2 2/3] of: property: Improve finding the supplier of a remote-endpoint property
  2024-02-29  9:34               ` Luca Ceresoli
@ 2024-02-29 22:10                 ` Rob Herring
  2024-02-29 22:54                   ` Saravana Kannan
  2024-03-01  9:59                   ` Luca Ceresoli
  0 siblings, 2 replies; 21+ messages in thread
From: Rob Herring @ 2024-02-29 22:10 UTC (permalink / raw)
  To: Luca Ceresoli
  Cc: Saravana Kannan, Frank Rowand, Greg Kroah-Hartman, Xu Yang,
	kernel-team, devicetree, linux-kernel, Hervé Codina,
	Geert Uytterhoeven

On Thu, Feb 29, 2024 at 3:34 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
>
> Hi Rob, Saravana,
>
> On Wed, 28 Feb 2024 18:26:36 -0600
> Rob Herring <robh+dt@kernel.org> wrote:
>
> > On Wed, Feb 28, 2024 at 5:58 PM Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > On Wed, Feb 28, 2024 at 1:56 PM Rob Herring <robh+dt@kernel.org> wrote:
> > > >
> > > > On Mon, Feb 26, 2024 at 5:52 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
> > > > >
> > > > > Hello Saravana,
> > > > >
> > > > > On Fri, 23 Feb 2024 17:35:24 -0800
> > > > > Saravana Kannan <saravanak@google.com> wrote:
> > > > >
> > > > > > On Fri, Feb 23, 2024 at 8:18 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
> > > > > > >
> > > > > > > Hello Saravana,
> > > > > > >
> > > > > > > [+cc Hervé Codina]
> > > > > > >
> > > > > > > On Tue,  6 Feb 2024 17:18:01 -0800
> > > > > > > Saravana Kannan <saravanak@google.com> wrote:
> > > > > > >
> > > > > > > > After commit 4a032827daa8 ("of: property: Simplify of_link_to_phandle()"),
> > > > > > > > remote-endpoint properties created a fwnode link from the consumer device
> > > > > > > > to the supplier endpoint. This is a tiny bit inefficient (not buggy) when
> > > > > > > > trying to create device links or detecting cycles. So, improve this the
> > > > > > > > same way we improved finding the consumer of a remote-endpoint property.
> > > > > > > >
> > > > > > > > Fixes: 4a032827daa8 ("of: property: Simplify of_link_to_phandle()")
> > > > > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > > > > >
> > > > > > > After rebasing my own branch on v6.8-rc5 from v6.8-rc1 I started
> > > > > > > getting unexpected warnings during device tree overlay removal. After a
> > > > > > > somewhat painful bisection I identified this patch as the one that
> > > > > > > triggers it all.
> > > > > >
> > > > > > Thanks for the report.
> > > > > >
> > > > > > >
> > > > > > > > ---
> > > > > > > > --- a/drivers/of/property.c
> > > > > > > > +++ b/drivers/of/property.c
> > > > > > > > @@ -1232,7 +1232,6 @@ DEFINE_SIMPLE_PROP(pinctrl5, "pinctrl-5", NULL)
> > > > > > > >  DEFINE_SIMPLE_PROP(pinctrl6, "pinctrl-6", NULL)
> > > > > > > >  DEFINE_SIMPLE_PROP(pinctrl7, "pinctrl-7", NULL)
> > > > > > > >  DEFINE_SIMPLE_PROP(pinctrl8, "pinctrl-8", NULL)
> > > > > > > > -DEFINE_SIMPLE_PROP(remote_endpoint, "remote-endpoint", NULL)
> > > > > > > >  DEFINE_SIMPLE_PROP(pwms, "pwms", "#pwm-cells")
> > > > > > > >  DEFINE_SIMPLE_PROP(resets, "resets", "#reset-cells")
> > > > > > > >  DEFINE_SIMPLE_PROP(leds, "leds", NULL)
> > > > > > > > @@ -1298,6 +1297,17 @@ static struct device_node *parse_interrupts(struct device_node *np,
> > > > > > > >       return of_irq_parse_one(np, index, &sup_args) ? NULL : sup_args.np;
> > > > > > > >  }
> > > > > > > >
> > > > > > > > +static struct device_node *parse_remote_endpoint(struct device_node *np,
> > > > > > > > +                                              const char *prop_name,
> > > > > > > > +                                              int index)
> > > > > > > > +{
> > > > > > > > +     /* Return NULL for index > 0 to signify end of remote-endpoints. */
> > > > > > > > +     if (!index || strcmp(prop_name, "remote-endpoint"))
> > > > > > >
> > > > > > > There seem to be a bug here: "!index" should be "index > 0", as the
> > > > > > > comment suggests. Otherwise NULL is always returned.
> > > > > >
> > > > > > Ah crap, I think you are right. It should have been "index". Not
> > > > > > "!index". But I tested this! Sigh. I probably screwed up my testing.
> > > > > >
> > > > > > Please send out a Fix for this.
> > > > > >
> > > > > > Geert, we got excited too soon. :(
> > > > > >
> > > > > > > I am going to send a quick patch for that, but haven't done so yet
> > > > > > > because it still won't solve the problem, so I wanted to open the topic
> > > > > > > here without further delay.
> > > > > > >
> > > > > > > Even with the 'index > 0' fix I'm still getting pretty much the same:
> > > > > >
> > > > > > This part is confusing though. If I read your DT correctly, there's a
> > > > > > cycle between platform:panel-dsi-lvds and i2c:13-002c. And fw_devlink
> > > > > > should not be enforcing any ordering between those devices ever.
> > > > > >
> > > > > > I'm surprised that in your "working" case, fw_devlink didn't detect
> > > > > > any cycle. It should have. If there's any debugging to do, that's the
> > > > > > one we need to debug.
> > > > > >
> > > > > > >
> > > > > > > [   34.836781] ------------[ cut here ]------------
> > > > > > > [   34.841401] WARNING: CPU: 2 PID: 204 at drivers/base/devres.c:1064 devm_kfree+0x8c/0xfc
> > > > > > > ...
> > > > > > > [   35.024751] Call trace:
> > > > > > > [   35.027199]  devm_kfree+0x8c/0xfc
> > > > > > > [   35.030520]  devm_drm_panel_bridge_release+0x54/0x64 [drm_kms_helper]
> > > > > > > [   35.036990]  devres_release_group+0xe0/0x164
> > > > > > > [   35.041264]  i2c_device_remove+0x38/0x9c
> > > > > > > [   35.045196]  device_remove+0x4c/0x80
> > > > > > > [   35.048774]  device_release_driver_internal+0x1d4/0x230
> > > > > > > [   35.054003]  device_release_driver+0x18/0x24
> > > > > > > [   35.058279]  bus_remove_device+0xcc/0x10c
> > > > > > > [   35.062292]  device_del+0x15c/0x41c
> > > > > > > [   35.065786]  device_unregister+0x18/0x34
> > > > > > > [   35.069714]  i2c_unregister_device+0x54/0x88
> > > > > > > [   35.073988]  of_i2c_notify+0x98/0x224
> > > > > > > [   35.077656]  blocking_notifier_call_chain+0x6c/0xa0
> > > > > > > [   35.082543]  __of_changeset_entry_notify+0x100/0x16c
> > > > > > > [   35.087515]  __of_changeset_revert_notify+0x44/0x78
> > > > > > > [   35.092398]  of_overlay_remove+0x114/0x1c4
> > > > > > > ...
> > > > > > >
> > > > > > > By comparing the two versions I found that before removing the overlay:
> > > > > > >
> > > > > > >  * in the "working" case (with this patch reverted) I have:
> > > > > > >
> > > > > > >    # ls /sys/class/devlink/ | grep 002c
> > > > > > >    platform:hpbr--i2c:13-002c
> > > > > > >    platform:panel-dsi-lvds--i2c:13-002c
> > > > > >
> > > > > > Can you check the "status" and "sync_state_only" file in this folder
> > > > > > and tell me what it says?
> > > > > >
> > > > > > Since these devices have a cyclic dependency between them, it should
> > > > > > have been something other than "not tracked" and "sync_state_only"
> > > > > > should be "1". But my guess is you'll see "active" and "0".
> > > > > >
> > > > > > >    platform:regulator-sys-1v8--i2c:13-002c
> > > > > > >    regulator:regulator.31--i2c:13-002c
> > > > > > >    #
> > > > > > >
> > > > > > >  * in the "broken" case (v6.8-rc5 + s/!index/index > 0/ as mentioned):
> > > > > > >
> > > > > > >    # ls /sys/class/devlink/ | grep 002c
> > > > > > >    platform:hpbr--i2c:13-002c
> > > > > > >    platform:regulator-sys-1v8--i2c:13-002c
> > > > > > >    regulator:regulator.30--i2c:13-002c
> > > > > > >    #
> > > > > > >
> > > > > > > So in the latter case the panel-dsi-lvds--i2c:13-002c link is missing.
> > > > > > > I think it gets created but later on removed. Here's a snippet of the
> > > > > > > kernel log when that happens:
> > > > > > >
> > > > > > > [    9.578279] ----- cycle: start -----
> > > > > > > [    9.578283] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > > > > > > [    9.578308] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > > > [    9.578329] ----- cycle: end -----
> > > > > > > [    9.578334] platform panel-dsi-lvds: Fixed dependency cycle(s) with /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > > > ...
> > > > > >
> > > > > > Somewhere in this area, I'm thinking you'll also see "device:
> > > > > > 'i2c:13-002c--platform:panel-dsi-lvds': device_add" do you not? And if
> > > > > > you enabled device link logs, you'll see that it was "sync state only"
> > > > > > link.
> > > > > >
> > > > > > > [    9.590620] /panel-dsi-lvds Dropping the fwnode link to /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > > > ...
> > > > > > > [    9.597280] ----- cycle: start -----
> > > > > > > [    9.597283] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > > > [    9.602781] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > > > > > > [    9.607581] ----- cycle: end -----
> > > > > > > [    9.607585] i2c 13-002c: Fixed dependency cycle(s) with /panel-dsi-lvds
> > > > > > > [    9.614217] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_add
> > > > > > > ...
> > > > > > > [    9.614277] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /panel-dsi-lvds
> > > > > > > [    9.614369] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /regulator-dock-sys-1v8
> > > > > > > ...
> > > > > > > [    9.739840] panel-simple panel-dsi-lvds: Dropping the link to 13-002c
> > > > > > > [    9.739846] device: 'i2c:13-002c--platform:panel-dsi-lvds': device_unregister
> > > > > >
> > > > > > Oh yeah, see. The "device_add" I expected earlier is getting removed here.
> > > > > >
> > > > > > > [   10.247037] sn65dsi83 13-002c: Dropping the link to panel-dsi-lvds
> > > > > > > [   10.247049] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_unregister
> > > > > > >
> > > > > > > And here's the relevant portion of my device tree overlay:
> > > > > > >
> > > > > > > --------------------8<--------------------
> > > > > > >
> > > > > >
> > > > > > I think the eventual fix would be this series + adding a
> > > > > > "post-init-providers" property to the device that's supposed to probe
> > > > > > first and point it to the device that's supposed to probe next. Do
> > > > > > this at the device node level, not the endpoint level.
> > > > > > https://lore.kernel.org/lkml/20240221233026.2915061-1-saravanak@google.com/
> > > > >
> > > > > I'm certainly going to look at this series in more detail and at the
> > > > > debugging you asked for, however I'm afraid I won't have access to the
> > > > > hardware this week and it's not going to be a quick task anyway.
> > > > >
> > > > > So in this moment I think it's quite clear that this specific patch
> > > > > creates a regression and there is no clear fix that is reasonably
> > > > > likely to get merged before 6.8.
> > > > >
> > > > > I propose reverting this patch immediately, unless you have a better
> > > > > short-term solution.
> > > >
> > > > It's just this one of the 3 patches that needs reverting?
>
> Just this patch. I reverted only this and the issue disappeared.
>
> > > I sent a fix. With the fix, it's just exposing a bug elsewhere.
>
> Exactly, this patch has two issues and only the easy one has a fix [0]
> currently as far as I know.
>
> > You say apply the fix. Luca says revert. I say I wish I made this 6.9
> > material. Which is it?
> >
> > If the overlay applying depends on out of tree code (likely as there
> > are limited ways to apply an overlay in mainline), then I don't really
> > care if there is still a regression.
>
> Obviously, to load and unload the overlays I'm using code not yet
> in mainline. It is using of_overlay_fdt_apply() and of_overlay_remove()
> via a driver underdevelopment that is similar to the one Hervé and
> Lizhi Hou are working on [1][2].
>
> I see the point that "we are not breaking existing use cases as no code
> is (un)loading overlays except unittest", sure.
>
> As I see it, we have a feature in the kernel that is not used, but it
> will be, eventually: there are use cases, development is progressing and
> patches are being sent actively. My opinion is that we should not
> put additional known obstacles that will make it even harder than it
> already is.

Well, I don't care to do extra work of applying things and then have
to turn right around fix or revert them. It happens enough as-is with
just mainline. And no one wants to step up and fix the problems with
overlays, but are fine just carrying their out of tree patches. What's
one more. This is the 2nd case of overlay problems with out of tree
users *today*! Some days I'm tempted to just remove overlay support
altogether given the only way to apply them is unittest.

Given Geert is having issues too, I guess I'm going to revert.

Rob

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [REGRESSION] Re: [PATCH v2 2/3] of: property: Improve finding the supplier of a remote-endpoint property
  2024-02-29 22:10                 ` Rob Herring
@ 2024-02-29 22:54                   ` Saravana Kannan
  2024-03-01  9:59                   ` Luca Ceresoli
  1 sibling, 0 replies; 21+ messages in thread
From: Saravana Kannan @ 2024-02-29 22:54 UTC (permalink / raw)
  To: Rob Herring
  Cc: Luca Ceresoli, Frank Rowand, Greg Kroah-Hartman, Xu Yang,
	kernel-team, devicetree, linux-kernel, Hervé Codina,
	Geert Uytterhoeven

On Thu, Feb 29, 2024 at 2:10 PM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Thu, Feb 29, 2024 at 3:34 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
> >
> > Hi Rob, Saravana,
> >
> > On Wed, 28 Feb 2024 18:26:36 -0600
> > Rob Herring <robh+dt@kernel.org> wrote:
> >
> > > On Wed, Feb 28, 2024 at 5:58 PM Saravana Kannan <saravanak@google.com> wrote:
> > > >
> > > > On Wed, Feb 28, 2024 at 1:56 PM Rob Herring <robh+dt@kernel.org> wrote:
> > > > >
> > > > > On Mon, Feb 26, 2024 at 5:52 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
> > > > > >
> > > > > > Hello Saravana,
> > > > > >
> > > > > > On Fri, 23 Feb 2024 17:35:24 -0800
> > > > > > Saravana Kannan <saravanak@google.com> wrote:
> > > > > >
> > > > > > > On Fri, Feb 23, 2024 at 8:18 AM Luca Ceresoli <luca.ceresoli@bootlin.com> wrote:
> > > > > > > >
> > > > > > > > Hello Saravana,
> > > > > > > >
> > > > > > > > [+cc Hervé Codina]
> > > > > > > >
> > > > > > > > On Tue,  6 Feb 2024 17:18:01 -0800
> > > > > > > > Saravana Kannan <saravanak@google.com> wrote:
> > > > > > > >
> > > > > > > > > After commit 4a032827daa8 ("of: property: Simplify of_link_to_phandle()"),
> > > > > > > > > remote-endpoint properties created a fwnode link from the consumer device
> > > > > > > > > to the supplier endpoint. This is a tiny bit inefficient (not buggy) when
> > > > > > > > > trying to create device links or detecting cycles. So, improve this the
> > > > > > > > > same way we improved finding the consumer of a remote-endpoint property.
> > > > > > > > >
> > > > > > > > > Fixes: 4a032827daa8 ("of: property: Simplify of_link_to_phandle()")
> > > > > > > > > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > > > > > > >
> > > > > > > > After rebasing my own branch on v6.8-rc5 from v6.8-rc1 I started
> > > > > > > > getting unexpected warnings during device tree overlay removal. After a
> > > > > > > > somewhat painful bisection I identified this patch as the one that
> > > > > > > > triggers it all.
> > > > > > >
> > > > > > > Thanks for the report.
> > > > > > >
> > > > > > > >
> > > > > > > > > ---
> > > > > > > > > --- a/drivers/of/property.c
> > > > > > > > > +++ b/drivers/of/property.c
> > > > > > > > > @@ -1232,7 +1232,6 @@ DEFINE_SIMPLE_PROP(pinctrl5, "pinctrl-5", NULL)
> > > > > > > > >  DEFINE_SIMPLE_PROP(pinctrl6, "pinctrl-6", NULL)
> > > > > > > > >  DEFINE_SIMPLE_PROP(pinctrl7, "pinctrl-7", NULL)
> > > > > > > > >  DEFINE_SIMPLE_PROP(pinctrl8, "pinctrl-8", NULL)
> > > > > > > > > -DEFINE_SIMPLE_PROP(remote_endpoint, "remote-endpoint", NULL)
> > > > > > > > >  DEFINE_SIMPLE_PROP(pwms, "pwms", "#pwm-cells")
> > > > > > > > >  DEFINE_SIMPLE_PROP(resets, "resets", "#reset-cells")
> > > > > > > > >  DEFINE_SIMPLE_PROP(leds, "leds", NULL)
> > > > > > > > > @@ -1298,6 +1297,17 @@ static struct device_node *parse_interrupts(struct device_node *np,
> > > > > > > > >       return of_irq_parse_one(np, index, &sup_args) ? NULL : sup_args.np;
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > > +static struct device_node *parse_remote_endpoint(struct device_node *np,
> > > > > > > > > +                                              const char *prop_name,
> > > > > > > > > +                                              int index)
> > > > > > > > > +{
> > > > > > > > > +     /* Return NULL for index > 0 to signify end of remote-endpoints. */
> > > > > > > > > +     if (!index || strcmp(prop_name, "remote-endpoint"))
> > > > > > > >
> > > > > > > > There seem to be a bug here: "!index" should be "index > 0", as the
> > > > > > > > comment suggests. Otherwise NULL is always returned.
> > > > > > >
> > > > > > > Ah crap, I think you are right. It should have been "index". Not
> > > > > > > "!index". But I tested this! Sigh. I probably screwed up my testing.
> > > > > > >
> > > > > > > Please send out a Fix for this.
> > > > > > >
> > > > > > > Geert, we got excited too soon. :(
> > > > > > >
> > > > > > > > I am going to send a quick patch for that, but haven't done so yet
> > > > > > > > because it still won't solve the problem, so I wanted to open the topic
> > > > > > > > here without further delay.
> > > > > > > >
> > > > > > > > Even with the 'index > 0' fix I'm still getting pretty much the same:
> > > > > > >
> > > > > > > This part is confusing though. If I read your DT correctly, there's a
> > > > > > > cycle between platform:panel-dsi-lvds and i2c:13-002c. And fw_devlink
> > > > > > > should not be enforcing any ordering between those devices ever.
> > > > > > >
> > > > > > > I'm surprised that in your "working" case, fw_devlink didn't detect
> > > > > > > any cycle. It should have. If there's any debugging to do, that's the
> > > > > > > one we need to debug.
> > > > > > >
> > > > > > > >
> > > > > > > > [   34.836781] ------------[ cut here ]------------
> > > > > > > > [   34.841401] WARNING: CPU: 2 PID: 204 at drivers/base/devres.c:1064 devm_kfree+0x8c/0xfc
> > > > > > > > ...
> > > > > > > > [   35.024751] Call trace:
> > > > > > > > [   35.027199]  devm_kfree+0x8c/0xfc
> > > > > > > > [   35.030520]  devm_drm_panel_bridge_release+0x54/0x64 [drm_kms_helper]
> > > > > > > > [   35.036990]  devres_release_group+0xe0/0x164
> > > > > > > > [   35.041264]  i2c_device_remove+0x38/0x9c
> > > > > > > > [   35.045196]  device_remove+0x4c/0x80
> > > > > > > > [   35.048774]  device_release_driver_internal+0x1d4/0x230
> > > > > > > > [   35.054003]  device_release_driver+0x18/0x24
> > > > > > > > [   35.058279]  bus_remove_device+0xcc/0x10c
> > > > > > > > [   35.062292]  device_del+0x15c/0x41c
> > > > > > > > [   35.065786]  device_unregister+0x18/0x34
> > > > > > > > [   35.069714]  i2c_unregister_device+0x54/0x88
> > > > > > > > [   35.073988]  of_i2c_notify+0x98/0x224
> > > > > > > > [   35.077656]  blocking_notifier_call_chain+0x6c/0xa0
> > > > > > > > [   35.082543]  __of_changeset_entry_notify+0x100/0x16c
> > > > > > > > [   35.087515]  __of_changeset_revert_notify+0x44/0x78
> > > > > > > > [   35.092398]  of_overlay_remove+0x114/0x1c4
> > > > > > > > ...
> > > > > > > >
> > > > > > > > By comparing the two versions I found that before removing the overlay:
> > > > > > > >
> > > > > > > >  * in the "working" case (with this patch reverted) I have:
> > > > > > > >
> > > > > > > >    # ls /sys/class/devlink/ | grep 002c
> > > > > > > >    platform:hpbr--i2c:13-002c
> > > > > > > >    platform:panel-dsi-lvds--i2c:13-002c
> > > > > > >
> > > > > > > Can you check the "status" and "sync_state_only" file in this folder
> > > > > > > and tell me what it says?
> > > > > > >
> > > > > > > Since these devices have a cyclic dependency between them, it should
> > > > > > > have been something other than "not tracked" and "sync_state_only"
> > > > > > > should be "1". But my guess is you'll see "active" and "0".
> > > > > > >
> > > > > > > >    platform:regulator-sys-1v8--i2c:13-002c
> > > > > > > >    regulator:regulator.31--i2c:13-002c
> > > > > > > >    #
> > > > > > > >
> > > > > > > >  * in the "broken" case (v6.8-rc5 + s/!index/index > 0/ as mentioned):
> > > > > > > >
> > > > > > > >    # ls /sys/class/devlink/ | grep 002c
> > > > > > > >    platform:hpbr--i2c:13-002c
> > > > > > > >    platform:regulator-sys-1v8--i2c:13-002c
> > > > > > > >    regulator:regulator.30--i2c:13-002c
> > > > > > > >    #
> > > > > > > >
> > > > > > > > So in the latter case the panel-dsi-lvds--i2c:13-002c link is missing.
> > > > > > > > I think it gets created but later on removed. Here's a snippet of the
> > > > > > > > kernel log when that happens:
> > > > > > > >
> > > > > > > > [    9.578279] ----- cycle: start -----
> > > > > > > > [    9.578283] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > > > > > > > [    9.578308] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > > > > [    9.578329] ----- cycle: end -----
> > > > > > > > [    9.578334] platform panel-dsi-lvds: Fixed dependency cycle(s) with /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > > > > ...
> > > > > > >
> > > > > > > Somewhere in this area, I'm thinking you'll also see "device:
> > > > > > > 'i2c:13-002c--platform:panel-dsi-lvds': device_add" do you not? And if
> > > > > > > you enabled device link logs, you'll see that it was "sync state only"
> > > > > > > link.
> > > > > > >
> > > > > > > > [    9.590620] /panel-dsi-lvds Dropping the fwnode link to /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > > > > ...
> > > > > > > > [    9.597280] ----- cycle: start -----
> > > > > > > > [    9.597283] /panel-dsi-lvds: cycle: depends on /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c
> > > > > > > > [    9.602781] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c: cycle: depends on /panel-dsi-lvds
> > > > > > > > [    9.607581] ----- cycle: end -----
> > > > > > > > [    9.607585] i2c 13-002c: Fixed dependency cycle(s) with /panel-dsi-lvds
> > > > > > > > [    9.614217] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_add
> > > > > > > > ...
> > > > > > > > [    9.614277] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /panel-dsi-lvds
> > > > > > > > [    9.614369] /soc@0/bus@30800000/i2c@30ad0000/i2cmux@70/i2c@3/dsi-lvds-bridge@2c Dropping the fwnode link to /regulator-dock-sys-1v8
> > > > > > > > ...
> > > > > > > > [    9.739840] panel-simple panel-dsi-lvds: Dropping the link to 13-002c
> > > > > > > > [    9.739846] device: 'i2c:13-002c--platform:panel-dsi-lvds': device_unregister
> > > > > > >
> > > > > > > Oh yeah, see. The "device_add" I expected earlier is getting removed here.
> > > > > > >
> > > > > > > > [   10.247037] sn65dsi83 13-002c: Dropping the link to panel-dsi-lvds
> > > > > > > > [   10.247049] device: 'platform:panel-dsi-lvds--i2c:13-002c': device_unregister
> > > > > > > >
> > > > > > > > And here's the relevant portion of my device tree overlay:
> > > > > > > >
> > > > > > > > --------------------8<--------------------
> > > > > > > >
> > > > > > >
> > > > > > > I think the eventual fix would be this series + adding a
> > > > > > > "post-init-providers" property to the device that's supposed to probe
> > > > > > > first and point it to the device that's supposed to probe next. Do
> > > > > > > this at the device node level, not the endpoint level.
> > > > > > > https://lore.kernel.org/lkml/20240221233026.2915061-1-saravanak@google.com/
> > > > > >
> > > > > > I'm certainly going to look at this series in more detail and at the
> > > > > > debugging you asked for, however I'm afraid I won't have access to the
> > > > > > hardware this week and it's not going to be a quick task anyway.
> > > > > >
> > > > > > So in this moment I think it's quite clear that this specific patch
> > > > > > creates a regression and there is no clear fix that is reasonably
> > > > > > likely to get merged before 6.8.
> > > > > >
> > > > > > I propose reverting this patch immediately, unless you have a better
> > > > > > short-term solution.
> > > > >
> > > > > It's just this one of the 3 patches that needs reverting?
> >
> > Just this patch. I reverted only this and the issue disappeared.
> >
> > > > I sent a fix. With the fix, it's just exposing a bug elsewhere.
> >
> > Exactly, this patch has two issues and only the easy one has a fix [0]
> > currently as far as I know.
> >
> > > You say apply the fix. Luca says revert. I say I wish I made this 6.9
> > > material. Which is it?
> > >
> > > If the overlay applying depends on out of tree code (likely as there
> > > are limited ways to apply an overlay in mainline), then I don't really
> > > care if there is still a regression.
> >
> > Obviously, to load and unload the overlays I'm using code not yet
> > in mainline. It is using of_overlay_fdt_apply() and of_overlay_remove()
> > via a driver underdevelopment that is similar to the one Hervé and
> > Lizhi Hou are working on [1][2].
> >
> > I see the point that "we are not breaking existing use cases as no code
> > is (un)loading overlays except unittest", sure.
> >
> > As I see it, we have a feature in the kernel that is not used, but it
> > will be, eventually: there are use cases, development is progressing and
> > patches are being sent actively. My opinion is that we should not
> > put additional known obstacles that will make it even harder than it
> > already is.
>
> Well, I don't care to do extra work of applying things and then have
> to turn right around fix or revert them. It happens enough as-is with
> just mainline. And no one wants to step up and fix the problems with
> overlays, but are fine just carrying their out of tree patches. What's
> one more. This is the 2nd case of overlay problems with out of tree
> users *today*! Some days I'm tempted to just remove overlay support
> altogether given the only way to apply them is unittest.

Rob,

Sorry I couldn't reply yesterday. And sorry for getting this into 6.8
and causing headaches for you.

With [1], there are no more bugs to fix in fw_devlink wrt
remote-endpoints for sure.
[1] - https://lore.kernel.org/lkml/20240224052436.3552333-1-saravanak@google.com/

It's solely exposing a bug in another driver. If this was upstream
code, I might have been okay with reverting things just to make their
bug for now. I didn't realize this was downstream stuff until you
asked/Luca confirmed. We definitely shouldn't revert anything. Luca
can take my pointers and debug their driver and I'm happy to help
debug this further.

Also, post-init-providers should definitely help in this case. So,
Luca can use that once we land it.

> Given Geert is having issues too, I guess I'm going to revert.

It's just extra/explicit logging because as the original series was
meant to do, it improves remote-enpoint parsing.

-Saravana

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [REGRESSION] Re: [PATCH v2 2/3] of: property: Improve finding the supplier of a remote-endpoint property
  2024-02-29 22:10                 ` Rob Herring
  2024-02-29 22:54                   ` Saravana Kannan
@ 2024-03-01  9:59                   ` Luca Ceresoli
  1 sibling, 0 replies; 21+ messages in thread
From: Luca Ceresoli @ 2024-03-01  9:59 UTC (permalink / raw)
  To: Rob Herring
  Cc: Saravana Kannan, Frank Rowand, Greg Kroah-Hartman, Xu Yang,
	kernel-team, devicetree, linux-kernel, Hervé Codina,
	Geert Uytterhoeven

Hello Rob,

On Thu, 29 Feb 2024 16:10:38 -0600
Rob Herring <robh+dt@kernel.org> wrote:
[...]
> > > > > It's just this one of the 3 patches that needs reverting?  
> >
> > Just this patch. I reverted only this and the issue disappeared.
> >  
> > > > I sent a fix. With the fix, it's just exposing a bug elsewhere.  
> >
> > Exactly, this patch has two issues and only the easy one has a fix [0]
> > currently as far as I know.
> >  
> > > You say apply the fix. Luca says revert. I say I wish I made this 6.9
> > > material. Which is it?
> > >
> > > If the overlay applying depends on out of tree code (likely as there
> > > are limited ways to apply an overlay in mainline), then I don't really
> > > care if there is still a regression.  
> >
> > Obviously, to load and unload the overlays I'm using code not yet
> > in mainline. It is using of_overlay_fdt_apply() and of_overlay_remove()
> > via a driver underdevelopment that is similar to the one Hervé and
> > Lizhi Hou are working on [1][2].
> >
> > I see the point that "we are not breaking existing use cases as no code
> > is (un)loading overlays except unittest", sure.
> >
> > As I see it, we have a feature in the kernel that is not used, but it
> > will be, eventually: there are use cases, development is progressing and
> > patches are being sent actively. My opinion is that we should not
> > put additional known obstacles that will make it even harder than it
> > already is.  
> 
> Well, I don't care to do extra work of applying things and then have
> to turn right around fix or revert them. It happens enough as-is with
> just mainline. And no one wants to step up and fix the problems with
> overlays, but are fine just carrying their out of tree patches. What's
> one more. This is the 2nd case of overlay problems with out of tree
> users *today*! Some days I'm tempted to just remove overlay support
> altogether given the only way to apply them is unittest.

Thanks for taking time to understand the situation.

Just to clarify my position: together with Hervé we are not just
carrying out of tree code, we are actively developing code that uses
overlay load/unload at runtime and we will send it to mainline as soon
as it is ready.

As part of this process, Hervé has already sent patches to fix various
problems that happen when overlays are loaded and especially unloaded:
https://lore.kernel.org/all/20240229105204.720717-1-herve.codina@bootlin.com/
https://lore.kernel.org/all/20240227113426.253232-1-herve.codina@bootlin.com/
https://lore.kernel.org/all/20240220133950.138452-1-herve.codina@bootlin.com/
https://lore.kernel.org/all/20240220111044.133776-1-herve.codina@bootlin.com/

Best regards,
Luca

-- 
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2024-03-01  9:59 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-07  1:17 [PATCH v2 0/3] Improve remote-endpoint parsing Saravana Kannan
2024-02-07  1:18 ` [PATCH v2 1/3] of: property: Improve finding the consumer of a remote-endpoint property Saravana Kannan
2024-02-07  1:18 ` [PATCH v2 2/3] of: property: Improve finding the supplier " Saravana Kannan
2024-02-23 16:18   ` Luca Ceresoli
2024-02-24  1:35     ` Saravana Kannan
2024-02-26 11:52       ` [REGRESSION] " Luca Ceresoli
2024-02-28 21:56         ` Rob Herring
2024-02-28 23:58           ` Saravana Kannan
2024-02-29  0:26             ` Rob Herring
2024-02-29  9:34               ` Luca Ceresoli
2024-02-29 22:10                 ` Rob Herring
2024-02-29 22:54                   ` Saravana Kannan
2024-03-01  9:59                   ` Luca Ceresoli
2024-02-28 22:01     ` Rob Herring
2024-02-07  1:18 ` [PATCH v2 3/3] of: property: Add in-ports/out-ports support to of_graph_get_port_parent() Saravana Kannan
2024-02-21  0:47   ` Saravana Kannan
2024-02-21  7:00     ` Greg Kroah-Hartman
2024-02-21  8:38       ` Greg Kroah-Hartman
2024-02-07  8:10 ` [PATCH v2 0/3] Improve remote-endpoint parsing Rob Herring
2024-02-07 21:07   ` Saravana Kannan
2024-02-09 10:31 ` Rob Herring

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).