[v2,2/3] of/platform: Add functional dependency link from DT bindings
diff mbox series

Message ID 20190628022202.118166-3-saravanak@google.com
State Superseded
Headers show
Series
  • Solve postboot supplier cleanup and optimize probe ordering
Related show

Commit Message

Saravana Kannan June 28, 2019, 2:22 a.m. UTC
Add device-links to track functional dependencies between devices
after they are created (but before they are probed) by looking at
their common DT bindings like clocks, interconnects, etc.

Automatically adding device-links to track functional dependencies at
the framework level provides the following benefits:

- Optimizes device probe order and avoids the useless work of
  attempting probes of devices that will not probe successfully
  (because their suppliers aren't present or haven't probed yet).

  For example, in a commonly available mobile SoC, registering just
  one consumer device's driver at an initcall level earlier than the
  supplier device's driver causes 11 failed probe attempts before the
  consumer device probes successfully. This was with a kernel with all
  the drivers statically compiled in. This problem gets a lot worse if
  all the drivers are loaded as modules without direct symbol
  dependencies.

- Supplier devices like clock providers, interconnect providers, etc
  need to keep the resources they provide active and at a particular
  state(s) during boot up even if their current set of consumers don't
  request the resource to be active. This is because the rest of the
  consumers might not have probed yet and turning off the resource
  before all the consumers have probed could lead to a hang or
  undesired user experience.

  Some frameworks (Eg: regulator) handle this today by turning off
  "unused" resources at late_initcall_sync and hoping all the devices
  have probed by then. This is not a valid assumption for systems with
  loadable modules. Other frameworks (Eg: clock) just don't handle
  this due to the lack of a clear signal for when they can turn off
  resources. This leads to downstream hacks to handle cases like this
  that can easily be solved in the upstream kernel.

  By linking devices before they are probed, we give suppliers a clear
  count of the number of dependent consumers. Once all of the
  consumers are active, the suppliers can turn off the unused
  resources without making assumptions about the number of consumers.

By default we just add device-links to track "driver presence" (probe
succeeded) of the supplier device. If any other functionality provided
by device-links are needed, it is left to the consumer/supplier
devices to change the link when they probe.

Signed-off-by: Saravana Kannan <saravanak@google.com>
---
 drivers/of/Kconfig    |  9 ++++++
 drivers/of/platform.c | 73 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 82 insertions(+)

Comments

David Collins June 29, 2019, 12:55 a.m. UTC | #1
Hello Saravana,

On 6/27/19 7:22 PM, Saravana Kannan wrote:
> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> index 04ad312fd85b..8d690fa0f47c 100644
> --- a/drivers/of/platform.c
> +++ b/drivers/of/platform.c
> @@ -61,6 +61,72 @@ struct platform_device *of_find_device_by_node(struct device_node *np)
>  EXPORT_SYMBOL(of_find_device_by_node);
>  
>  #ifdef CONFIG_OF_ADDRESS
> +static int of_link_binding(struct device *dev, char *binding, char *cell)
> +{
> +	struct of_phandle_args sup_args;
> +	struct platform_device *sup_dev;
> +	unsigned int i = 0, links = 0;
> +	u32 dl_flags = DL_FLAG_AUTOPROBE_CONSUMER;
> +
> +	while (!of_parse_phandle_with_args(dev->of_node, binding, cell, i,
> +					   &sup_args)) {
> +		i++;
> +		sup_dev = of_find_device_by_node(sup_args.np);
> +		if (!sup_dev)
> +			continue;

This check means that a required dependency link between a consumer and
supplier will not be added in the case that the consumer device is created
before the supply device.  If the supplier device is created and
immediately bound to its driver after late_initcall_sync(), then it is
possible for the sync_state() callback of the supplier to be called before
the consumer gets a chance to probe since its link was never captured.

of_platform_default_populate() below will only create devices for the
first level DT nodes directly under "/".  Suppliers DT nodes can exist as
second level nodes under a first level bus node (e.g. I2C, SPMI, RPMh,
etc).  Thus, it is quite likely that not all supplier devices will have
been created when device_link_check_waiting_consumers() is called.

As far as I can tell, this effectively breaks the sync_state()
functionality (and thus proxy un-voting built on top of it) when using
kernel modules for both the supplier and consumer drivers which are probed
after late_initcall_sync().  I'm not sure how this can be avoided given
that the linking is done between devices in the process of sequentially
adding devices.  Perhaps linking between device nodes instead of devices
might be able to overcome this issue.


> +		if (device_link_add(dev, &sup_dev->dev, dl_flags))
> +			links++;
> +		put_device(&sup_dev->dev);
> +	}
> +	if (links < i)
> +		return -ENODEV;
> +	return 0;
> +}
> +
> +/*
> + * List of bindings and their cell names (use NULL if no cell names) from which
> + * device links need to be created.
> + */
> +static char *link_bindings[] = {
> +#ifdef CONFIG_OF_DEVLINKS
> +	"clocks", "#clock-cells",
> +	"interconnects", "#interconnect-cells",
> +#endif
> +};

This list and helper function above are missing support for regulator
<arbitrary-consumer-name>-supply properties.  We require this support on
QTI boards in order to handle regulator proxy un-voting when booting with
kernel modules.  Are you planning to add this support in a follow-on
version of this patch or in an additional patch?

Note that handling regulator supply properties will be very challenging
for at least these reasons:

1. There is not a consistent DT property name used for regulator supplies.

2. The device node referenced in a regulator supply phandle is usually not
the device node which correspond to the device pointer for the supplier.
This is because a single regulator supplier device node (which will have
an associated device pointer) typically has a subnode for each of the
regulators it supports.  Consumers then use phandles for the subnodes.

3. The specification of parent supplies for regulators frequently results
in *-supply properties in a node pointing to child subnodes of that node.
 See [1] for an example.  Special care would need to be taken to avoid
trying to mark a regulator supplier as a supplier to itself as well as to
avoid blocking its own probing due to an unlinked supply dependency.

4. Not all DT properties of the form "*-supply" are regulator supplies.
(Note, this case has been discussed, but I was not able to locate an
example of it.)


Clocks also have a problem.  A recent patch [2] allows clock provider
parent clocks to be specified via DT.  This could lead to cases of
circular "clocks" property dependencies where there are two clock supplier
devices A and B with A having some clocks with B clock parents along with
B having some clocks with A clock parents.  If "clocks" properties are
followed, then neither device would ever be able to probe.

This does not present a problem without this patch series because the
clock framework supports late binding of parents specifically to avoid
issues with clocks not registering in perfectly topological order of
parent dependencies.


> +
> +static int of_link_to_suppliers(struct device *dev)
> +{
> +	unsigned int i = 0;
> +	bool done = true;
> +
> +	if (unlikely(!dev->of_node))
> +		return 0;
> +
> +	for (i = 0; i < ARRAY_SIZE(link_bindings) / 2; i++)
> +		if (of_link_binding(dev, link_bindings[i * 2],
> +					link_bindings[i * 2 + 1]))
> +			done = false;
> +
> +	if (!done)
> +		return -ENODEV;
> +	return 0;
> +}
> +
> +static void link_waiting_consumers_func(struct work_struct *work)
> +{
> +	device_link_check_waiting_consumers(of_link_to_suppliers);
> +}
> +static DECLARE_WORK(link_waiting_consumers_work, link_waiting_consumers_func);
> +
> +static bool link_waiting_consumers_enable;
> +static void link_waiting_consumers_trigger(void)
> +{
> +	if (!link_waiting_consumers_enable)
> +		return;
> +
> +	schedule_work(&link_waiting_consumers_work);
> +}
> +
>  /*
>   * The following routines scan a subtree and registers a device for
>   * each applicable node.
> @@ -192,10 +258,13 @@ static struct platform_device *of_platform_device_create_pdata(
>  	dev->dev.platform_data = platform_data;
>  	of_msi_configure(&dev->dev, dev->dev.of_node);
>  
> +	if (of_link_to_suppliers(&dev->dev))
> +		device_link_wait_for_supplier(&dev->dev);
>  	if (of_device_add(dev) != 0) {
>  		platform_device_put(dev);
>  		goto err_clear_flag;
>  	}
> +	link_waiting_consumers_trigger();
>  
>  	return dev;
>  
> @@ -541,6 +610,10 @@ static int __init of_platform_default_populate_init(void)
>  	/* Populate everything else. */
>  	of_platform_default_populate(NULL, NULL, NULL);
>  
> +	/* Make the device-links between suppliers and consumers */
> +	link_waiting_consumers_enable = true;
> +	device_link_check_waiting_consumers(of_link_to_suppliers);
> +
>  	return 0;
>  }
>  arch_initcall_sync(of_platform_default_populate_init);
> 

Thanks,
David

[1]:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/qcom/sdm845-mtp.dts?h=v5.2-rc5#n73

[2]:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fc0c209c147f35ed2648adda09db39fcad89e334
Saravana Kannan July 1, 2019, 10:09 p.m. UTC | #2
On Fri, Jun 28, 2019 at 5:55 PM David Collins <collinsd@codeaurora.org> wrote:
>
> Hello Saravana,
>
> On 6/27/19 7:22 PM, Saravana Kannan wrote:
> > diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> > index 04ad312fd85b..8d690fa0f47c 100644
> > --- a/drivers/of/platform.c
> > +++ b/drivers/of/platform.c
> > @@ -61,6 +61,72 @@ struct platform_device *of_find_device_by_node(struct device_node *np)
> >  EXPORT_SYMBOL(of_find_device_by_node);
> >
> >  #ifdef CONFIG_OF_ADDRESS
> > +static int of_link_binding(struct device *dev, char *binding, char *cell)
> > +{
> > +     struct of_phandle_args sup_args;
> > +     struct platform_device *sup_dev;
> > +     unsigned int i = 0, links = 0;
> > +     u32 dl_flags = DL_FLAG_AUTOPROBE_CONSUMER;
> > +
> > +     while (!of_parse_phandle_with_args(dev->of_node, binding, cell, i,
> > +                                        &sup_args)) {
> > +             i++;
> > +             sup_dev = of_find_device_by_node(sup_args.np);
> > +             if (!sup_dev)
> > +                     continue;
>
> This check means that a required dependency link between a consumer and
> supplier will not be added in the case that the consumer device is created
> before the supply device.  If the supplier device is created and
> immediately bound to its driver after late_initcall_sync(), then it is
> possible for the sync_state() callback of the supplier to be called before
> the consumer gets a chance to probe since its link was never captured.

Yeah, I was aware of this but wasn't sure how likely this case was. I
didn't want to go down the rabbit hole of handling every corner case
perfectly before seeing how the general idea was received by the
maintainers. Also, was waiting to see if someone complained about it
before trying to fix it.

> of_platform_default_populate() below will only create devices for the
> first level DT nodes directly under "/".  Suppliers DT nodes can exist as
> second level nodes under a first level bus node (e.g. I2C, SPMI, RPMh,
> etc).  Thus, it is quite likely that not all supplier devices will have
> been created when device_link_check_waiting_consumers() is called.

Yeah, those are all good example of when this could be an issue.

> As far as I can tell, this effectively breaks the sync_state()
> functionality (and thus proxy un-voting built on top of it) when using
> kernel modules for both the supplier and consumer drivers which are probed
> after late_initcall_sync().  I'm not sure how this can be avoided given
> that the linking is done between devices in the process of sequentially
> adding devices.  Perhaps linking between device nodes instead of devices
> might be able to overcome this issue.

I'm not sure linking struct device_node would be useful here. There
are different and simpler ways of fixing it. Working on them right now
(v3 patch series). Thanks for bringing up the good examples.

>
>
> > +             if (device_link_add(dev, &sup_dev->dev, dl_flags))
> > +                     links++;
> > +             put_device(&sup_dev->dev);
> > +     }
> > +     if (links < i)
> > +             return -ENODEV;
> > +     return 0;
> > +}
> > +
> > +/*
> > + * List of bindings and their cell names (use NULL if no cell names) from which
> > + * device links need to be created.
> > + */
> > +static char *link_bindings[] = {
> > +#ifdef CONFIG_OF_DEVLINKS
> > +     "clocks", "#clock-cells",
> > +     "interconnects", "#interconnect-cells",
> > +#endif
> > +};
>
> This list and helper function above are missing support for regulator
> <arbitrary-consumer-name>-supply properties.  We require this support on
> QTI boards in order to handle regulator proxy un-voting when booting with
> kernel modules.  Are you planning to add this support in a follow-on
> version of this patch or in an additional patch?

Yes, I intentionally left out regulators here because it's a huge can
of worms. But keep in mind, that even without adding regulator DT
binding handling here, you could still switch to sync_state callback
and be no worse than you are today. Once regulator supplier-consumer
linking is added/improved, the QTI boards would start working with
modules.

As for how regulators supplier-consumer linking is handled, I think
that's the one we need to discuss and figure out. But I don't think
the regulator binding necessarily has to be handled in this patch
series. I'm sure in general the number of bindings we support could be
improved over time.

>
> Note that handling regulator supply properties will be very challenging
> for at least these reasons:
>
> 1. There is not a consistent DT property name used for regulator supplies.

Yup. Maybe we can add a new regulator binding format with a more
consistent name (like clocks and interconnects) and deprecate the
older ones? Seems like a need binding clean up in general.

> 2. The device node referenced in a regulator supply phandle is usually not
> the device node which correspond to the device pointer for the supplier.
> This is because a single regulator supplier device node (which will have
> an associated device pointer) typically has a subnode for each of the
> regulators it supports.  Consumers then use phandles for the subnodes.

If I'm not mistaken, looks like this can be multiple sub-nodes deep
too. One option is to walk up the phandle till we find a compatible
string and then find the device for that node?

> 3. The specification of parent supplies for regulators frequently results
> in *-supply properties in a node pointing to child subnodes of that node.
>  See [1] for an example.  Special care would need to be taken to avoid
> trying to mark a regulator supplier as a supplier to itself as well as to
> avoid blocking its own probing due to an unlinked supply dependency.

Sigh... as if it's not already complicated enough. Anyway,
device_link_add() already has a bunch of check to avoid creating
cyclic dependencies, etc. So, I'd expect this to be handled already.
At worst case, we might need to add a few more checks there. But that
hopefully shouldn't be an issue.

> 4. Not all DT properties of the form "*-supply" are regulator supplies.
> (Note, this case has been discussed, but I was not able to locate an
> example of it.)

Yup and I hate this part. Not sure what to say.

> Clocks also have a problem.  A recent patch [2] allows clock provider
> parent clocks to be specified via DT.  This could lead to cases of
> circular "clocks" property dependencies where there are two clock supplier
> devices A and B with A having some clocks with B clock parents along with
> B having some clocks with A clock parents.  If "clocks" properties are
> followed, then neither device would ever be able to probe.

Interconnects have a similar problem too because every interconnect
lists all the other interconnects it's connected to. Even if that's
magically addressed correctly, interconnect consumers still have a
problem because "interconnect" DT binding only lists phandles of the
source and destination interconnect and not all the interconnect along
the way. So they will be missing dependencies.

In general I agree with your points about clocks. I've brought this up
multiple times, but the maintainers insists I first implement parsing
existing DT bindings. So, I've done that. Lets see what they have to
say now.

But I have a few more ideas for handling circular dependencies without
adding new DT bindings that might work (will send out as part of v3
patch series) but interconnects are still an issue.

> This does not present a problem without this patch series because the
> clock framework supports late binding of parents specifically to avoid
> issues with clocks not registering in perfectly topological order of
> parent dependencies.

That's why I added the OF_DEVLINKS config. As of v2, you simply can't
use it for SoC/boards with cyclic clock dependencies. But again,
sync_state is no worse that what's there today. And it'll only improve
over time.

-Saravana

> [2]:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fc0c209c147f35ed2648adda09db39fcad89e334
>
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project

Patch
diff mbox series

diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
index 37c2ccbefecd..7c7fa7394b4c 100644
--- a/drivers/of/Kconfig
+++ b/drivers/of/Kconfig
@@ -103,4 +103,13 @@  config OF_OVERLAY
 config OF_NUMA
 	bool
 
+config OF_DEVLINKS
+	bool "Device links from DT bindings"
+	help
+	  Common DT bindings like clocks, interconnects, etc represent a
+	  consumer device's dependency on suppliers devices. This option
+	  creates device links from these common bindings so that consumers are
+	  probed only after all their suppliers are active and suppliers can
+	  tell when all their consumers are active.
+
 endif # OF
diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index 04ad312fd85b..8d690fa0f47c 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -61,6 +61,72 @@  struct platform_device *of_find_device_by_node(struct device_node *np)
 EXPORT_SYMBOL(of_find_device_by_node);
 
 #ifdef CONFIG_OF_ADDRESS
+static int of_link_binding(struct device *dev, char *binding, char *cell)
+{
+	struct of_phandle_args sup_args;
+	struct platform_device *sup_dev;
+	unsigned int i = 0, links = 0;
+	u32 dl_flags = DL_FLAG_AUTOPROBE_CONSUMER;
+
+	while (!of_parse_phandle_with_args(dev->of_node, binding, cell, i,
+					   &sup_args)) {
+		i++;
+		sup_dev = of_find_device_by_node(sup_args.np);
+		if (!sup_dev)
+			continue;
+		if (device_link_add(dev, &sup_dev->dev, dl_flags))
+			links++;
+		put_device(&sup_dev->dev);
+	}
+	if (links < i)
+		return -ENODEV;
+	return 0;
+}
+
+/*
+ * List of bindings and their cell names (use NULL if no cell names) from which
+ * device links need to be created.
+ */
+static char *link_bindings[] = {
+#ifdef CONFIG_OF_DEVLINKS
+	"clocks", "#clock-cells",
+	"interconnects", "#interconnect-cells",
+#endif
+};
+
+static int of_link_to_suppliers(struct device *dev)
+{
+	unsigned int i = 0;
+	bool done = true;
+
+	if (unlikely(!dev->of_node))
+		return 0;
+
+	for (i = 0; i < ARRAY_SIZE(link_bindings) / 2; i++)
+		if (of_link_binding(dev, link_bindings[i * 2],
+					link_bindings[i * 2 + 1]))
+			done = false;
+
+	if (!done)
+		return -ENODEV;
+	return 0;
+}
+
+static void link_waiting_consumers_func(struct work_struct *work)
+{
+	device_link_check_waiting_consumers(of_link_to_suppliers);
+}
+static DECLARE_WORK(link_waiting_consumers_work, link_waiting_consumers_func);
+
+static bool link_waiting_consumers_enable;
+static void link_waiting_consumers_trigger(void)
+{
+	if (!link_waiting_consumers_enable)
+		return;
+
+	schedule_work(&link_waiting_consumers_work);
+}
+
 /*
  * The following routines scan a subtree and registers a device for
  * each applicable node.
@@ -192,10 +258,13 @@  static struct platform_device *of_platform_device_create_pdata(
 	dev->dev.platform_data = platform_data;
 	of_msi_configure(&dev->dev, dev->dev.of_node);
 
+	if (of_link_to_suppliers(&dev->dev))
+		device_link_wait_for_supplier(&dev->dev);
 	if (of_device_add(dev) != 0) {
 		platform_device_put(dev);
 		goto err_clear_flag;
 	}
+	link_waiting_consumers_trigger();
 
 	return dev;
 
@@ -541,6 +610,10 @@  static int __init of_platform_default_populate_init(void)
 	/* Populate everything else. */
 	of_platform_default_populate(NULL, NULL, NULL);
 
+	/* Make the device-links between suppliers and consumers */
+	link_waiting_consumers_enable = true;
+	device_link_check_waiting_consumers(of_link_to_suppliers);
+
 	return 0;
 }
 arch_initcall_sync(of_platform_default_populate_init);