[4/4] of: property: Avoid linking devices with circular dependencies

Message ID 20200415150550.28156-5-nsaenzjulienne@suse.de
State Superseded
Series
  • of: property: fw_devlink misc fixes

Commit Message

Nicolas Saenz Julienne April 15, 2020, 3:05 p.m. UTC
When creating a consumer/supplier relationship between devices, it's
essential to make sure they aren't supplying each other, which would
create a circular dependency.

Introduce a new function to check whether such a circular dependency
exists between two device nodes and use it in of_link_to_phandle().

Fixes: a3e1d1a7f5fc ("of: property: Add functional dependency link from DT bindings")
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
---

NOTE:
 I feel of_link_is_circular() is a little dense, and could benefit from
 some abstraction/refactoring. That said, I'd rather get some feedback,
 before spending time on it.

 drivers/of/property.c | 50 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

Comments

Saravana Kannan April 15, 2020, 6:52 p.m. UTC | #1
On Wed, Apr 15, 2020 at 8:06 AM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>
> When creating a consumer/supplier relationship between devices it's
> essential to make sure they aren't supplying each other creating a
> circular dependency.

Kinda correct. But fw_devlink is not just about optimizing probing.
It's also about ensuring sync_state() callbacks work correctly when
drivers are built as modules. And for that to work, circular
"SYNC_STATE_ONLY" device links are allowed. I've explained it in a bit
more detail here [1].

> Introduce a new function to check if such circular dependency exists
> between two device nodes and use it in of_link_to_phandle().
>
> Fixes: a3e1d1a7f5fc ("of: property: Add functional dependency link from DT bindings")
> Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
> ---
>
> NOTE:
>  I feel of_link_is_circular() is a little dense, and could benefit from
>  some abstraction/refactoring. That said, I'd rather get some feedback,
>  before spending time on it.

Good call :)

>  drivers/of/property.c | 50 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 50 insertions(+)
>
> diff --git a/drivers/of/property.c b/drivers/of/property.c
> index 2c7978ef22be1..74a5190408c3b 100644
> --- a/drivers/of/property.c
> +++ b/drivers/of/property.c
> @@ -1171,6 +1171,44 @@ static const struct supplier_bindings of_supplier_bindings[] = {
>         {}
>  };
>
> +/**
> + * of_link_is_circular - Make sure potential link isn't circular
> + *
> + * @sup_np: Supplier device
> + * @con_np: Consumer device
> + *
> + * This function checks if @sup_np's properties contain a reference to @con_np.
> + *
> + * Will return true if there's a circular dependency and false otherwise.
> + */
> +static bool of_link_is_circular(struct device_node *sup_np,
> +                               struct device_node *con_np)
> +{
> +       const struct supplier_bindings *s = of_supplier_bindings;
> +       struct device_node *tmp;
> +       bool matched = false;
> +       struct property *p;
> +       int i = 0;
> +
> +       for_each_property_of_node(sup_np, p) {
> +               while (!matched && s->parse_prop) {
> +                       while ((tmp = s->parse_prop(sup_np, p->name, i))) {
> +                               matched = true;
> +                               i++;
> +
> +                               if (tmp == con_np)
> +                                       return true;
> +                       }
> +                       i = 0;
> +                       s++;
> +               }
> +               s = of_supplier_bindings;
> +               matched = false;
> +       }
> +
> +       return false;
> +}

This only catches circular links made out of 2 devices. If we really
needed such a function that worked correctly to catch bigger
"circles", you'd need to recurse and it'll get super wasteful and
ugly.

Thankfully, device_link_add() already checks for circular dependencies
when we need it and it's much cheaper because the links are at a
device level and not examined at a property level.
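
To illustrate the point about longer cycles, here is a minimal userspace
sketch (not kernel code) of a device-level check that walks the dependency
graph recursively. The names dep_graph, reaches() and
link_would_be_circular() are hypothetical and exist only for this example;
the sketch does not claim to mirror how device_link_add() does its check.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_DEVS 8

/* Hypothetical toy graph: consumes[c][s] is true when device c consumes from s. */
struct dep_graph {
	int ndevs;
	bool consumes[MAX_DEVS][MAX_DEVS];
};

/* Depth-first walk: can 'from' reach 'target' by following supplier edges? */
static bool reaches(const struct dep_graph *g, int from, int target,
		    bool *visited)
{
	if (from == target)
		return true;
	if (visited[from])
		return false;
	visited[from] = true;

	for (int s = 0; s < g->ndevs; s++)
		if (g->consumes[from][s] && reaches(g, s, target, visited))
			return true;

	return false;
}

/* Would adding "con consumes from sup" close a cycle of any length? */
static bool link_would_be_circular(const struct dep_graph *g, int con, int sup)
{
	bool visited[MAX_DEVS] = { false };

	assert(con >= 0 && con < g->ndevs && sup >= 0 && sup < g->ndevs);

	/* A cycle appears if sup already depends, directly or not, on con. */
	return reaches(g, sup, con, visited);
}
```

With links 0 -> 1 and 1 -> 2 already in place, asking whether 2 may consume
from 0 reports a cycle, which a check limited to two devices would miss.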

Is this a real problem you are hitting with the Raspberry Pi 4? If
so, can you give an example in its DT where you are hitting this?

I'll have to NACK this patch for reasons mentioned above and in [1].
However, I think I have a solution that should work for what I'm
guessing is your real problem. But let me see the description of the
real scenario before I claim to have a solution.

-Saravana

[1] - https://lore.kernel.org/lkml/20191028220027.251605-1-saravanak@google.com/

Nicolas Saenz Julienne April 16, 2020, 4:01 p.m. UTC | #2
On Wed, 2020-04-15 at 11:52 -0700, Saravana Kannan wrote:
> On Wed, Apr 15, 2020 at 8:06 AM Nicolas Saenz Julienne
> <nsaenzjulienne@suse.de> wrote:
> > When creating a consumer/supplier relationship between devices it's
> > essential to make sure they aren't supplying each other creating a
> > circular dependency.
> 
> Kinda correct. But fw_devlink is not just about optimizing probing.
> It's also about ensuring sync_state() callbacks work correctly when
> drivers are built as modules. And for that to work, circular
> "SYNC_STATE_ONLY" device links are allowed. I've explained it in a bit
> more detail here [1].

Understood.

[...]

> This only catches circular links made out of 2 devices. If we really
> needed such a function that worked correctly to catch bigger
> "circles", you'd need to recurse and it'll get super wasteful and
> ugly.

Yeah, I was kind of expecting this reply :).

> Thankfully, device_link_add() already checks for circular dependencies
> when we need it and it's much cheaper because the links are at a
> device level and not examined at a property level.
> 
> Is this a real problem you are hitting with the Raspberry Pi 4's? If
> so can you give an example in its DT where you are hitting this?

So the DT bit that triggered this whole series is in
'arch/arm/boot/dts/bcm283x.dtsi'. Namely the interaction between
'cprman@7e101000' and 'dsi@7e209000'. Both are clock providers and both are
clock consumers of each other.

Well, I had a second, deeper look at the issue. Here is how the circular
dependency breaks the boot process (A being soc, B being cprman and C being
dsi):

Device node A
	Device node B -> C
	Device node C -> B

The probe sequence is the following (with DL_FLAG_AUTOPROBE_CONSUMER):
1. A device is added, the rest of devices are siblings, nothing is done
2. B device is added, C device doesn't exist, B is added to
   'wait_for_suppliers' list with 'need_for_probe' flag set.
3. C device is added, B is picked up from 'wait_for_suppliers' list, device
   link created with B consuming from C.
4. C is then parsed, and an attempt is made to link it to B, this time with
   C as the consumer. This fails after testing for circular deps (by
   device_is_dependent()) during device_link_add(). This leaves C in the
   'wait_for_suppliers' list *forever*, as every further attempt at
   add_link() on C will fail.

-> Ultimately this prevents C from ever being probed, which also prevents B
   from being probed. That isn't good, as B is the main clock provider of the
   system.

Note that B can live without C. I think some clock re-parenting will not be
accessible, but that's all.
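
The probe sequence above (steps 2 to 4) can be modelled with a small
userspace sketch. This is not kernel code: toy_state, toy_add_link() and
toy_device_add() are made-up names, and the circular check is reduced to
"refuse a link if the reverse link already exists", which is an assumption
that only covers this two-device case.

```c
#include <assert.h>
#include <stdbool.h>

enum { DEV_B, DEV_C, NDEVS };

struct toy_state {
	bool added[NDEVS];       /* device has been registered */
	bool waiting[NDEVS];     /* on the toy 'wait_for_suppliers' list */
	bool link[NDEVS][NDEVS]; /* link[c][s]: c consumes from s */
};

/* Try to link con -> sup; fail if sup is missing or the reverse link exists. */
static bool toy_add_link(struct toy_state *st, int con, int sup)
{
	if (!st->added[sup])
		return false;
	if (st->link[sup][con])
		return false; /* toy stand-in for the circular-dependency check */

	st->link[con][sup] = true;
	return true;
}

/* Register a device, then retry linking every waiting device to its supplier. */
static void toy_device_add(struct toy_state *st, int dev,
			   const int supplier_of[NDEVS])
{
	assert(dev >= 0 && dev < NDEVS);

	st->added[dev] = true;
	st->waiting[dev] = true;

	for (int d = 0; d < NDEVS; d++)
		if (st->waiting[d] && toy_add_link(st, d, supplier_of[d]))
			st->waiting[d] = false;
}
```

Adding B and then C, with each listed as the other's supplier, leaves B
linked as a consumer of C while C stays on the waiting list; re-running the
retry loop never changes that, because the reverse link is always refused.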

> I'll have to NACK this patch for reasons mentioned above and in [1].
> However, I think I have a solution that should work for what I'm
> guessing is your real problem. But let me see the description of the
> real scenario before I claim to have a solution.

My intuition would be, upon getting a circular dep from device_is_dependent()
with DL_FLAG_AUTOPROBE_CONSUMER, to switch need_for_probe to false on both
devices.
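
As a rough illustration of that idea, here is a userspace sketch (not a
kernel patch). struct toy_dev, toy_on_circular_link() and toy_can_probe()
are hypothetical; the only assumption carried over from the description
above is that a device on the waiting list with need_for_probe set is held
back from probing.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_dev {
	bool waiting;        /* still on the toy 'wait_for_suppliers' list */
	bool need_for_probe; /* probing is held back while this is set */
};

/* Hypothetical handler, called when linking con -> sup is refused as circular. */
static void toy_on_circular_link(struct toy_dev *con, struct toy_dev *sup)
{
	assert(con && sup);

	/* Stop treating the missing link as a reason to hold back probing. */
	con->need_for_probe = false;
	sup->need_for_probe = false;
}

/* A device may attempt to probe unless it is waiting with need_for_probe set. */
static bool toy_can_probe(const struct toy_dev *dev)
{
	return !(dev->waiting && dev->need_for_probe);
}
```

In this model C remains on the waiting list, but it is no longer held back
from probing once the circular link has been detected.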

Regards,
Nicolas

Saravana Kannan April 16, 2020, 8:57 p.m. UTC | #3
On Thu, Apr 16, 2020 at 9:01 AM Nicolas Saenz Julienne
<nsaenzjulienne@suse.de> wrote:
>
> On Wed, 2020-04-15 at 11:52 -0700, Saravana Kannan wrote:
> > On Wed, Apr 15, 2020 at 8:06 AM Nicolas Saenz Julienne
> > <nsaenzjulienne@suse.de> wrote:
> > > When creating a consumer/supplier relationship between devices it's
> > > essential to make sure they aren't supplying each other creating a
> > > circular dependency.
> >
> > Kinda correct. But fw_devlink is not just about optimizing probing.
> > It's also about ensuring sync_state() callbacks work correctly when
> > drivers are built as modules. And for that to work, circular
> > "SYNC_STATE_ONLY" device links are allowed. I've explained it in a bit
> > more detail here [1].
>
> Understood.
>
> [...]
>
> > This only catches circular links made out of 2 devices. If we really
> > needed such a function that worked correctly to catch bigger
> > "circles", you'd need to recurse and it'll get super wasteful and
> > ugly.
>
> Yeah, I was kind of expecting this reply :).
>
> > Thankfully, device_link_add() already checks for circular dependencies
> > when we need it and it's much cheaper because the links are at a
> > device level and not examined at a property level.
> >
> > Is this a real problem you are hitting with the Raspberry Pi 4's? If
> > so can you give an example in its DT where you are hitting this?
>
> So the DT bit that triggered all this series is in
> 'arch/arm/boot/dts/bcm283x.dtsi'. Namely the interaction between
> 'cprman@7e101000' and 'dsi@7e209000.' Both are clock providers and both are
> clock consumers of each other.
>
> Well I had a second deeper look at the issue, here is how the circular
> dependency breaks the boot process (A being soc, B being cprman and C being
> dsi):
>
> Device node A
>         Device node B -> C
>         Device node C -> B
>
> The probe sequence is the following (with DL_FLAG_AUTOPROBE_CONSUMER):
> 1. A device is added, the rest of devices are siblings, nothing is done
> 2. B device is added, C device doesn't exist, B is added to
>    'wait_for_suppliers' list with 'need_for_probe' flag set.
> 3. C device is added, B is picked up from 'wait_for_suppliers' list, device
>    link created with B consuming from C.
> 4. C is then parsed, and tried to be linked with B as a consumer this time.
>    This fails after testing for circular deps (by device_is_dependent()) during
>    device_link_add(). This leaves C in the 'wait_for_suppliers' list *for ever*
>    as every further attempt at add_link() on C will fail.
>
> -> Ultimately this prevents C for ever being probed, which also prevents B from
>    being probed. Which isn't good as B is the main clock provider of the system.
>
> Note that B can live without C. I think some clock re-parenting will not be
> accessible, but that's all.
>
> > I'll have to NACK this patch for reasons mentioned above and in [1].
> > However, I think I have a solution that should work for what I'm
> > guessing is your real problem. But let me see the description of the
> > real scenario before I claim to have a solution.
>
> My intuition would be, upon getting a circular dep from device_is_dependent()
> with DL_FLAG_AUTOPROBE_CONSUMER to switch need_for_probe to false on both
> devices.

The problem with that is the devices will start trying to probe and
then defer due to other suppliers that are needed for probing but
haven't been linked yet. So it'll go a bit against what you are trying
to do. Also it doesn't solve the problem of already created links that
are wrong.

I'll send out a patch in reply to your email. I've been meaning to
send that outside of this discussion. It doesn't cover all cases of
cycles, but it'll cover most cases and I think it should fix your case
too.

For a more comprehensive fix, I'd like to do something like what I
explain here [1]. That should be doable for your driver too if you
want to try that approach. But I haven't heard Rob/Frank's opinion on
that.

-Saravana
[1] - https://lore.kernel.org/lkml/CAGETcx_2vdjSWc3BBN-N2WrtJP90ZnH-2vE=2iVuHuaE1YmMWQ@mail.gmail.com/

Patch

diff --git a/drivers/of/property.c b/drivers/of/property.c
index 2c7978ef22be1..74a5190408c3b 100644
--- a/drivers/of/property.c
+++ b/drivers/of/property.c
@@ -1171,6 +1171,44 @@  static const struct supplier_bindings of_supplier_bindings[] = {
 	{}
 };
 
+/**
+ * of_link_is_circular - Make sure potential link isn't circular
+ *
+ * @sup_np: Supplier device
+ * @con_np: Consumer device
+ *
+ * This function checks if @sup_np's properties contain a reference to @con_np.
+ *
+ * Will return true if there's a circular dependency and false otherwise.
+ */
+static bool of_link_is_circular(struct device_node *sup_np,
+				struct device_node *con_np)
+{
+	const struct supplier_bindings *s = of_supplier_bindings;
+	struct device_node *tmp;
+	bool matched = false;
+	struct property *p;
+	int i = 0;
+
+	for_each_property_of_node(sup_np, p) {
+		while (!matched && s->parse_prop) {
+			while ((tmp = s->parse_prop(sup_np, p->name, i))) {
+				matched = true;
+				i++;
+
+				if (tmp == con_np)
+					return true;
+			}
+			i = 0;
+			s++;
+		}
+		s = of_supplier_bindings;
+		matched = false;
+	}
+
+	return false;
+}
+
 /**
  * of_link_to_phandle - Add device link to supplier from supplier phandle
  * @dev: consumer device
@@ -1216,6 +1254,18 @@  static int of_link_to_phandle(struct device *dev, struct device_node *sup_np,
 		return -ENODEV;
 	}
 
+	/*
+	 * It is possible for consumer device nodes to also supply the device
+	 * node they are consuming from, creating an unwarranted circular
+	 * dependency.
+	 */
+	if (of_link_is_circular(sup_np, dev->of_node)) {
+		dev_dbg(dev, "Not linking to %pOFP - Circular dependency\n",
+			sup_np);
+		of_node_put(sup_np);
+		return -ENODEV;
+	}
+
 	/*
 	 * Don't allow linking a device node as a consumer of one of its
 	 * descendant nodes. By definition, a child node can't be a functional