Add a generic "depends-on" property that allows specifying mandatory functional dependencies between devices. Add device-links after the devices are created (but before they are probed) by looking at this "depends-on" property. This property is used instead of existing DT properties that specify phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This is because not all resources referred to by existing DT properties are mandatory functional dependencies. Some devices/drivers might be able to operate with reduced functionality when some of the resources aren't available. For example, a device could operate in polling mode if no IRQ is available, a device could skip doing power management if clock or voltage control isn't available and they are left on, etc. So, adding mandatory functional dependency links between devices by looking at referred phandles in DT properties won't work as it would prevent probing devices that could be probed. By having an explicit depends-on property, we can handle these cases correctly. Having functional dependencies explicitly called out in DT and automatically added before the devices are probed, provides the following benefits: - Optimizes device probe order and avoids the useless work of attempting probes of devices that will not probe successfully (because their suppliers aren't present or haven't probed yet). For example, in a commonly available mobile SoC, registering just one consumer device's driver at an initcall level earlier than the supplier device's driver causes 11 failed probe attempts before the consumer device probes successfully. This was with a kernel with all the drivers statically compiled in. This problem gets a lot worse if all the drivers are loaded as modules without direct symbol dependencies. - Supplier devices like clock providers, regulators providers, etc need to keep the resources they provide active and at a particular state(s) during boot up even if their current set of consumers don't request the resource to be active. This is because the rest of the consumers might not have probed yet and turning off the resource before all the consumers have probed could lead to a hang or undesired user experience. Some frameworks (Eg: regulator) handle this today by turning off "unused" resources at late_initcall_sync and hoping all the devices have probed by then. This is not a valid assumption for systems with loadable modules. Other frameworks (Eg: clock) just don't handle this due to the lack of a clear signal for when they can turn off resources. This leads to downstream hacks to handle cases like this that can easily be solved in the upstream kernel. By linking devices before they are probed, we give suppliers a clear count of the number of dependent consumers. Once all of the consumers are active, the suppliers can turn off the unused resources without making assumptions about the number of consumers. By default we just add device-links to track "driver presence" (probe succeeded) of the supplier device. If any other functionality provided by device-links are needed, it is left to the consumer/supplier devices to change the link when they probe. Saravana Kannan (5): of/platform: Speed up of_find_device_by_node() driver core: Add device links support for pending links to suppliers dt-bindings: Add depends-on property of/platform: Add functional dependency link from "depends-on" property driver core: Add sync_state driver/bus callback .../devicetree/bindings/depends-on.txt | 26 +++++ drivers/base/core.c | 106 ++++++++++++++++++ drivers/of/platform.c | 75 ++++++++++++- include/linux/device.h | 24 ++++ include/linux/of.h | 3 + 5 files changed, 233 insertions(+), 1 deletion(-) create mode 100644 Documentation/devicetree/bindings/depends-on.txt -- 2.22.0.rc1.257.g3120a18244-goog
Add a pointer from device tree node to the device created from it. This allows us to find the device corresponding to a device tree node without having to loop through all the platform devices. However, fallback to looping through the platform devices to handle any devices that might set their own of_node. Signed-off-by: Saravana Kannan <saravanak@google.com> --- drivers/of/platform.c | 20 +++++++++++++++++++- include/linux/of.h | 3 +++ 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/drivers/of/platform.c b/drivers/of/platform.c index 04ad312fd85b..1115a8d80a33 100644 --- a/drivers/of/platform.c +++ b/drivers/of/platform.c @@ -42,6 +42,8 @@ static int of_dev_node_match(struct device *dev, void *data) return dev->of_node == data; } +static DEFINE_SPINLOCK(of_dev_lock); + /** * of_find_device_by_node - Find the platform_device associated with a node * @np: Pointer to device tree node @@ -55,7 +57,18 @@ struct platform_device *of_find_device_by_node(struct device_node *np) { struct device *dev; - dev = bus_find_device(&platform_bus_type, NULL, np, of_dev_node_match); + /* + * Spinlock needed to make sure np->dev doesn't get freed between NULL + * check inside and kref count increment inside get_device(). This is + * achieved by grabbing the spinlock before setting np->dev = NULL in + * of_platform_device_destroy(). + */ + spin_lock(&of_dev_lock); + dev = get_device(np->dev); + spin_unlock(&of_dev_lock); + if (!dev) + dev = bus_find_device(&platform_bus_type, NULL, np, + of_dev_node_match); return dev ? to_platform_device(dev) : NULL; } EXPORT_SYMBOL(of_find_device_by_node); @@ -196,6 +209,7 @@ static struct platform_device *of_platform_device_create_pdata( platform_device_put(dev); goto err_clear_flag; } + np->dev = &dev->dev; return dev; @@ -556,6 +570,10 @@ int of_platform_device_destroy(struct device *dev, void *data) if (of_node_check_flag(dev->of_node, OF_POPULATED_BUS)) device_for_each_child(dev, NULL, of_platform_device_destroy); + /* Spinlock is needed for of_find_device_by_node() to work */ + spin_lock(&of_dev_lock); + dev->of_node->dev = NULL; + spin_unlock(&of_dev_lock); of_node_clear_flag(dev->of_node, OF_POPULATED); of_node_clear_flag(dev->of_node, OF_POPULATED_BUS); diff --git a/include/linux/of.h b/include/linux/of.h index 0cf857012f11..f2b4912cbca1 100644 --- a/include/linux/of.h +++ b/include/linux/of.h @@ -48,6 +48,8 @@ struct property { struct of_irq_controller; #endif +struct device; + struct device_node { const char *name; phandle phandle; @@ -68,6 +70,7 @@ struct device_node { unsigned int unique_id; struct of_irq_controller *irq_trans; #endif + struct device *dev; /* Device created from this node */ }; #define MAX_PHANDLE_ARGS 16 -- 2.22.0.rc1.257.g3120a18244-goog
When consumer devices are added, they might not have a supplier device to link to despite needing mandatory resources/functionality from one or more suppliers. Add a waiting_for_suppliers list to track such consumers and add helper functions to manage the list. Marking/unmarking a consumer device as waiting for suppliers is generally expected to be done by the entity that's creating the device. Signed-off-by: Saravana Kannan <saravanak@google.com> --- drivers/base/core.c | 67 ++++++++++++++++++++++++++++++++++++++++++ include/linux/device.h | 5 ++++ 2 files changed, 72 insertions(+) diff --git a/drivers/base/core.c b/drivers/base/core.c index fd7511e04e62..9ab6782dda1c 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -44,6 +44,8 @@ early_param("sysfs.deprecated", sysfs_deprecated_setup); #endif /* Device links support. */ +static LIST_HEAD(wait_for_suppliers); +static DEFINE_MUTEX(wfs_lock); #ifdef CONFIG_SRCU static DEFINE_MUTEX(device_links_lock); @@ -401,6 +403,53 @@ struct device_link *device_link_add(struct device *consumer, } EXPORT_SYMBOL_GPL(device_link_add); +/** + * device_link_wait_for_supplier - Mark device as waiting for supplier + * @consumer: Consumer device + * + * Marks the consumer device as waiting for suppliers to become available. The + * consumer device will never be probed until it's unmarked as waiting for + * suppliers. The caller is responsible for adding the link to the supplier + * once the supplier device is present. + * + * This function is NOT meant to be called from the probe function of the + * consumer but rather from code that creates the consumer device. + */ +void device_link_wait_for_supplier(struct device *consumer) +{ + mutex_lock(&wfs_lock); + list_add_tail(&consumer->links.needs_suppliers, &wait_for_suppliers); + mutex_unlock(&wfs_lock); +} + +/** + * device_link_check_waiting_consumers - Try to unmark waiting consumers + * @add_suppliers: Callback function to add suppliers to waiting consumer + * + * Loops through all consumers waiting on suppliers and tries to add all their + * supplier links. If that succeeds, the consumer device is unmarked as waiting + * for suppliers. Otherwise, they are left marked as waiting on suppliers, + * + * The add_suppliers callback is expected to return 0 if it has found and added + * all the supplier links for the consumer device. It should return an error if + * it isn't able to do so. + * + * The caller of device_link_wait_for_supplier() is expected to call this once + * it's aware of potential suppliers becoming available. + */ +void device_link_check_waiting_consumers( + int (*add_suppliers)(struct device *consumer)) +{ + struct device *dev, *tmp; + + mutex_lock(&wfs_lock); + list_for_each_entry_safe(dev, tmp, &wait_for_suppliers, + links.needs_suppliers) + if (!add_suppliers(dev)) + list_del_init(&dev->links.needs_suppliers); + mutex_unlock(&wfs_lock); +} + static void device_link_free(struct device_link *link) { while (refcount_dec_not_one(&link->rpm_active)) @@ -535,6 +584,19 @@ int device_links_check_suppliers(struct device *dev) struct device_link *link; int ret = 0; + /* + * If a device is waiting for one or more suppliers (in + * wait_for_suppliers list), it is not ready to probe yet. So just + * return -EPROBE_DEFER without having to check the links with existing + * suppliers. + */ + mutex_lock(&wfs_lock); + if (!list_empty(&dev->links.needs_suppliers)) { + mutex_unlock(&wfs_lock); + return -EPROBE_DEFER; + } + mutex_unlock(&wfs_lock); + device_links_write_lock(); list_for_each_entry(link, &dev->links.suppliers, c_node) { @@ -812,6 +874,10 @@ static void device_links_purge(struct device *dev) { struct device_link *link, *ln; + mutex_lock(&wfs_lock); + list_del(&dev->links.needs_suppliers); + mutex_unlock(&wfs_lock); + /* * Delete all of the remaining links from this device to any other * devices (either consumers or suppliers). @@ -1673,6 +1739,7 @@ void device_initialize(struct device *dev) #endif INIT_LIST_HEAD(&dev->links.consumers); INIT_LIST_HEAD(&dev->links.suppliers); + INIT_LIST_HEAD(&dev->links.needs_suppliers); dev->links.status = DL_DEV_NO_DRIVER; } EXPORT_SYMBOL_GPL(device_initialize); diff --git a/include/linux/device.h b/include/linux/device.h index e85264fb6616..4e71e5386aae 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -887,11 +887,13 @@ enum dl_dev_state { * struct dev_links_info - Device data related to device links. * @suppliers: List of links to supplier devices. * @consumers: List of links to consumer devices. + * @needs_suppliers: Hook to global list of devices waiting for suppliers. * @status: Driver status information. */ struct dev_links_info { struct list_head suppliers; struct list_head consumers; + struct list_head needs_suppliers; enum dl_dev_state status; }; @@ -1395,6 +1397,9 @@ struct device_link *device_link_add(struct device *consumer, struct device *supplier, u32 flags); void device_link_del(struct device_link *link); void device_link_remove(void *consumer, struct device *supplier); +void device_link_wait_for_supplier(struct device *consumer); +void device_link_check_waiting_consumers( + int (*add_suppliers)(struct device *consumer)); #ifndef dev_fmt #define dev_fmt(fmt) fmt -- 2.22.0.rc1.257.g3120a18244-goog
The depends-on property is used to list the mandatory functional dependencies of a consumer device on zero or more supplier devices. Signed-off-by: Saravana Kannan <saravanak@google.com> --- .../devicetree/bindings/depends-on.txt | 26 +++++++++++++++++++ 1 file changed, 26 insertions(+) create mode 100644 Documentation/devicetree/bindings/depends-on.txt diff --git a/Documentation/devicetree/bindings/depends-on.txt b/Documentation/devicetree/bindings/depends-on.txt new file mode 100644 index 000000000000..1cbddd11cf17 --- /dev/null +++ b/Documentation/devicetree/bindings/depends-on.txt @@ -0,0 +1,26 @@ +Functional dependency linking +============================= + +Apart from parent-child relationships, devices (consumers) often have +functional dependencies on other devices (suppliers). Common examples of +suppliers are clock, regulators, pinctrl, etc. However not all of them are +dependencies with well defined devicetree bindings. Also, not all functional +dependencies are mandatory as the device might be able to operate in a limited +mode without some of the dependencies. + +The depends-on property allows marking these mandatory functional dependencies +between one or more devices. The depends-on property is used by the consumer +device to point to all its mandatory supplier devices. + +Optional properties: +- depends-on: A list of phandles to mandatory suppliers of the device. + + +Examples: +Here, the device is dependent on only 2 of the 3 clock providers +dev@40031000 { + compatible = "name"; + reg = <0x40031000 0x1000>; + clocks = <&osc_1 1>, <&osc_2 7> <&osc_3 5>; + depends-on = <&osc_1>, <&osc_3>; +}; -- 2.22.0.rc1.257.g3120a18244-goog
Add device-links after the devices are created (but before they are probed) by looking at the "depends-on" DT property that allows specifying mandatory functional dependencies between devices. This property is used instead of existing DT properties that specify phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This is because not all resources referred to by existing DT properties are mandatory functional dependencies. Some devices/drivers might be able to operate with reduced functionality when some of the resources aren't available. For example, a device could operate in polling mode if no IRQ is available, a device could skip doing power management if clock or voltage control isn't available and they are left on, etc. Automatically adding device-links for functional dependencies at the framework level provides the following benefits: - Optimizes device probe order and avoids the useless work of attempting probes of devices that will not probe successfully (because their suppliers aren't present or haven't probed yet). For example, in a commonly available mobile SoC, registering just one consumer device's driver at an initcall level earlier than the supplier device's driver causes 11 failed probe attempts before the consumer device probes successfully. This was with a kernel with all the drivers statically compiled in. This problem gets a lot worse if all the drivers are loaded as modules without direct symbol dependencies. - Supplier devices like clock providers, regulators providers, etc need to keep the resources they provide active and at a particular state(s) during boot up even if their current set of consumers don't request the resource to be active. This is because the rest of the consumers might not have probed yet and turning off the resource before all the consumers have probed could lead to a hang or undesired user experience. Some frameworks (Eg: regulator) handle this today by turning off "unused" resources at late_initcall_sync and hoping all the devices have probed by then. This is not a valid assumption for systems with loadable modules. Other frameworks (Eg: clock) just don't handle this due to the lack of a clear signal for when they can turn off resources. This leads to downstream hacks to handle cases like this that can easily be solved in the upstream kernel. By linking devices before they are probed, we give suppliers a clear count of the number of dependent consumers. Once all of the consumers are active, the suppliers can turn off the unused resources without making assumptions about the number of consumers. By default we just add device-links to track "driver presence" (probe succeeded) of the supplier device. If any other functionality provided by device-links are needed, it is left to the consumer/supplier devices to change the link when they probe. Signed-off-by: Saravana Kannan <saravanak@google.com> --- drivers/of/platform.c | 46 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/drivers/of/platform.c b/drivers/of/platform.c index 1115a8d80a33..da1aa52b310a 100644 --- a/drivers/of/platform.c +++ b/drivers/of/platform.c @@ -74,6 +74,45 @@ struct platform_device *of_find_device_by_node(struct device_node *np) EXPORT_SYMBOL(of_find_device_by_node); #ifdef CONFIG_OF_ADDRESS +static int of_link_to_suppliers(struct device *dev) +{ + struct device_node *sup_node; + struct platform_device *sup_dev; + unsigned int i = 0, links = 0; + u32 dl_flags = DL_FLAG_AUTOPROBE_CONSUMER; + + if (unlikely(!dev->of_node)) + return 0; + + while ((sup_node = of_parse_phandle(dev->of_node, "depends-on", i))) { + i++; + sup_dev = of_find_device_by_node(sup_node); + if (!sup_dev) + continue; + if (device_link_add(dev, &sup_dev->dev, dl_flags)) + links++; + put_device(&sup_dev->dev); + } + if (links < i) + return -ENODEV; + return 0; +} + +static void link_waiting_consumers_func(struct work_struct *work) +{ + device_link_check_waiting_consumers(of_link_to_suppliers); +} +static DECLARE_WORK(link_waiting_consumers_work, link_waiting_consumers_func); + +static bool link_waiting_consumers_enable; +static void link_waiting_consumers_trigger(void) +{ + if (!link_waiting_consumers_enable) + return; + + schedule_work(&link_waiting_consumers_work); +} + /* * The following routines scan a subtree and registers a device for * each applicable node. @@ -205,11 +244,14 @@ static struct platform_device *of_platform_device_create_pdata( dev->dev.platform_data = platform_data; of_msi_configure(&dev->dev, dev->dev.of_node); + if (of_link_to_suppliers(&dev->dev)) + device_link_wait_for_supplier(&dev->dev); if (of_device_add(dev) != 0) { platform_device_put(dev); goto err_clear_flag; } np->dev = &dev->dev; + link_waiting_consumers_trigger(); return dev; @@ -555,6 +597,10 @@ static int __init of_platform_default_populate_init(void) /* Populate everything else. */ of_platform_default_populate(NULL, NULL, NULL); + /* Make the device-links between suppliers and consumers */ + link_waiting_consumers_enable = true; + device_link_check_waiting_consumers(of_link_to_suppliers); + return 0; } arch_initcall_sync(of_platform_default_populate_init); -- 2.22.0.rc1.257.g3120a18244-goog
This sync_state driver/bus callback is called once all the consumers of a supplier have probed successfully. This allows the supplier device's driver/bus to sync the supplier device's state to the software state with the guarantee that all the consumers are actively managing the resources provided by the supplier device. To maintain backwards compatibility and ease transition from existing frameworks and resource cleanup schemes, late_initcall_sync is the earliest when the sync_state callback might be called. There is no upper bound on the time by which the sync_state callback has to be called. This is because if a consumer device never probes, the supplier has to maintain its resources in the state left by the bootloader. For example, if the bootloader leaves the display backlight at a fixed voltage and the backlight driver is never probed, you don't want the backlight to ever be turned off after boot up. Signed-off-by: Saravana Kannan <saravanak@google.com> --- drivers/base/core.c | 39 +++++++++++++++++++++++++++++++++++++++ drivers/of/platform.c | 9 +++++++++ include/linux/device.h | 19 +++++++++++++++++++ 3 files changed, 67 insertions(+) diff --git a/drivers/base/core.c b/drivers/base/core.c index 9ab6782dda1c..7a8777a33e8c 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -46,6 +46,7 @@ early_param("sysfs.deprecated", sysfs_deprecated_setup); /* Device links support. */ static LIST_HEAD(wait_for_suppliers); static DEFINE_MUTEX(wfs_lock); +static bool supplier_sync_state_enabled; #ifdef CONFIG_SRCU static DEFINE_MUTEX(device_links_lock); @@ -616,6 +617,41 @@ int device_links_check_suppliers(struct device *dev) return ret; } +static void __device_links_supplier_sync_state(struct device *dev) +{ + struct device_link *link; + + if (dev->state_synced) + return; + + list_for_each_entry(link, &dev->links.consumers, s_node) { + if (link->flags & DL_FLAG_STATELESS) + continue; + if (link->status != DL_STATE_ACTIVE) + return; + } + + if (dev->bus->sync_state) + dev->bus->sync_state(dev); + else if (dev->driver && dev->driver->sync_state) + dev->driver->sync_state(dev); + + dev->state_synced = true; +} + +int device_links_supplier_sync_state(struct device *dev, void *data) +{ + device_links_write_lock(); + __device_links_supplier_sync_state(dev); + device_links_write_unlock(); + return 0; +} + +void device_links_supplier_sync_state_enable(void) +{ + supplier_sync_state_enabled = true; +} + /** * device_links_driver_bound - Update device links after probing its driver. * @dev: Device to update the links for. @@ -660,6 +696,9 @@ void device_links_driver_bound(struct device *dev) WARN_ON(link->status != DL_STATE_CONSUMER_PROBE); WRITE_ONCE(link->status, DL_STATE_ACTIVE); + + if (supplier_sync_state_enabled) + __device_links_supplier_sync_state(link->supplier); } dev->links.status = DL_DEV_DRIVER_BOUND; diff --git a/drivers/of/platform.c b/drivers/of/platform.c index da1aa52b310a..b5cce0c2496a 100644 --- a/drivers/of/platform.c +++ b/drivers/of/platform.c @@ -604,6 +604,15 @@ static int __init of_platform_default_populate_init(void) return 0; } arch_initcall_sync(of_platform_default_populate_init); + +static int __init of_platform_sync_state_init(void) +{ + device_links_supplier_sync_state_enable(); + bus_for_each_dev(&platform_bus_type, NULL, NULL, + device_links_supplier_sync_state); + return 0; +} +late_initcall_sync(of_platform_sync_state_init); #endif int of_platform_device_destroy(struct device *dev, void *data) diff --git a/include/linux/device.h b/include/linux/device.h index 4e71e5386aae..f6d5bba098df 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -79,6 +79,13 @@ extern void bus_remove_file(struct bus_type *, struct bus_attribute *); * that generate uevents to add the environment variables. * @probe: Called when a new device or driver add to this bus, and callback * the specific driver's probe to initial the matched device. + * @sync_state: Called to sync device state to software state after all the + * state tracking consumers linked to this device (present at + * the time of late_initcall) have successfully bound to a + * driver. If the device has no consumers, this function will + * be called at late_initcall_sync level. If the device has + * consumers that are never bound to a driver, this function + * will never get called until they do. * @remove: Called when a device removed from this bus. * @shutdown: Called at shut-down time to quiesce the device. * @@ -122,6 +129,7 @@ struct bus_type { int (*match)(struct device *dev, struct device_driver *drv); int (*uevent)(struct device *dev, struct kobj_uevent_env *env); int (*probe)(struct device *dev); + void (*sync_state)(struct device *dev); int (*remove)(struct device *dev); void (*shutdown)(struct device *dev); @@ -251,6 +259,13 @@ enum probe_type { * @probe: Called to query the existence of a specific device, * whether this driver can work with it, and bind the driver * to a specific device. + * @sync_state: Called to sync device state to software state after all the + * state tracking consumers linked to this device (present at + * the time of late_initcall) have successfully bound to a + * driver. If the device has no consumers, this function will + * be called at late_initcall_sync level. If the device has + * consumers that are never bound to a driver, this function + * will never get called until they do. * @remove: Called when the device is removed from the system to * unbind a device from this driver. * @shutdown: Called at shut-down time to quiesce the device. @@ -288,6 +303,7 @@ struct device_driver { const struct acpi_device_id *acpi_match_table; int (*probe) (struct device *dev); + void (*sync_state)(struct device *dev); int (*remove) (struct device *dev); void (*shutdown) (struct device *dev); int (*suspend) (struct device *dev, pm_message_t state); @@ -1058,6 +1074,7 @@ struct device { bool offline_disabled:1; bool offline:1; bool of_node_reused:1; + bool state_synced:1; #if defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE) || \ defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU) || \ defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL) @@ -1400,6 +1417,8 @@ void device_link_remove(void *consumer, struct device *supplier); void device_link_wait_for_supplier(struct device *consumer); void device_link_check_waiting_consumers( int (*add_suppliers)(struct device *consumer)); +int device_links_supplier_sync_state(struct device *dev, void *data); +void device_links_supplier_sync_state_enable(void); #ifndef dev_fmt #define dev_fmt(fmt) fmt -- 2.22.0.rc1.257.g3120a18244-goog
On Thu, May 23, 2019 at 06:01:13PM -0700, Saravana Kannan wrote:
> When consumer devices are added, they might not have a supplier device
> to link to despite needing mandatory resources/functionality from one
> or more suppliers. Add a waiting_for_suppliers list to track such
> consumers and add helper functions to manage the list.
>
> Marking/unmarking a consumer device as waiting for suppliers is
> generally expected to be done by the entity that's creating the
> device.
>
> Signed-off-by: Saravana Kannan <saravanak@google.com>
> ---
> drivers/base/core.c | 67 ++++++++++++++++++++++++++++++++++++++++++
> include/linux/device.h | 5 ++++
> 2 files changed, 72 insertions(+)
This looks fine to me, I'll let the dt people argue if the new entry is
valid or not :)
Feel free to add:
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
to this if the dt people want to take it through their tree. Or, I
guess I should really take it through mine...
thanks,
greg k-h
On Thu, May 23, 2019 at 06:01:16PM -0700, Saravana Kannan wrote:
> This sync_state driver/bus callback is called once all the consumers
> of a supplier have probed successfully.
>
> This allows the supplier device's driver/bus to sync the supplier
> device's state to the software state with the guarantee that all the
> consumers are actively managing the resources provided by the supplier
> device.
>
> To maintain backwards compatibility and ease transition from existing
> frameworks and resource cleanup schemes, late_initcall_sync is the
> earliest when the sync_state callback might be called.
>
> There is no upper bound on the time by which the sync_state callback
> has to be called. This is because if a consumer device never probes,
> the supplier has to maintain its resources in the state left by the
> bootloader. For example, if the bootloader leaves the display
> backlight at a fixed voltage and the backlight driver is never probed,
> you don't want the backlight to ever be turned off after boot up.
>
> Signed-off-by: Saravana Kannan <saravanak@google.com>
> ---
> drivers/base/core.c | 39 +++++++++++++++++++++++++++++++++++++++
> drivers/of/platform.c | 9 +++++++++
> include/linux/device.h | 19 +++++++++++++++++++
> 3 files changed, 67 insertions(+)
Also:
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
thanks,
greg k-h
On Thu, May 23, 2019 at 06:01:11PM -0700, Saravana Kannan wrote:
> Add a generic "depends-on" property that allows specifying mandatory
> functional dependencies between devices. Add device-links after the
> devices are created (but before they are probed) by looking at this
> "depends-on" property.
>
> This property is used instead of existing DT properties that specify
> phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This
> is because not all resources referred to by existing DT properties are
> mandatory functional dependencies. Some devices/drivers might be able
> to operate with reduced functionality when some of the resources
> aren't available. For example, a device could operate in polling mode
> if no IRQ is available, a device could skip doing power management if
> clock or voltage control isn't available and they are left on, etc.
>
> So, adding mandatory functional dependency links between devices by
> looking at referred phandles in DT properties won't work as it would
> prevent probing devices that could be probed. By having an explicit
> depends-on property, we can handle these cases correctly.
>
> Having functional dependencies explicitly called out in DT and
> automatically added before the devices are probed, provides the
> following benefits:
>
> - Optimizes device probe order and avoids the useless work of
> attempting probes of devices that will not probe successfully
> (because their suppliers aren't present or haven't probed yet).
>
> For example, in a commonly available mobile SoC, registering just
> one consumer device's driver at an initcall level earlier than the
> supplier device's driver causes 11 failed probe attempts before the
> consumer device probes successfully. This was with a kernel with all
> the drivers statically compiled in. This problem gets a lot worse if
> all the drivers are loaded as modules without direct symbol
> dependencies.
>
> - Supplier devices like clock providers, regulators providers, etc
> need to keep the resources they provide active and at a particular
> state(s) during boot up even if their current set of consumers don't
> request the resource to be active. This is because the rest of the
> consumers might not have probed yet and turning off the resource
> before all the consumers have probed could lead to a hang or
> undesired user experience.
>
> Some frameworks (Eg: regulator) handle this today by turning off
> "unused" resources at late_initcall_sync and hoping all the devices
> have probed by then. This is not a valid assumption for systems with
> loadable modules. Other frameworks (Eg: clock) just don't handle
> this due to the lack of a clear signal for when they can turn off
> resources. This leads to downstream hacks to handle cases like this
> that can easily be solved in the upstream kernel.
>
> By linking devices before they are probed, we give suppliers a clear
> count of the number of dependent consumers. Once all of the
> consumers are active, the suppliers can turn off the unused
> resources without making assumptions about the number of consumers.
>
> By default we just add device-links to track "driver presence" (probe
> succeeded) of the supplier device. If any other functionality provided
> by device-links are needed, it is left to the consumer/supplier
> devices to change the link when they probe.
Somewhere in this wall of text you need to say:
MAKES DEVICES BOOT FASTER!
right? :)
So in short, this solves the issue of deferred probing with systems with
loads of modules for platform devices and device tree, in that now you
have a chance to probe devices in the correct order saving loads of busy
loops.
A good thing, I like this, very nice work, all of these are:
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
but odds are I'll take this through my tree, so I'll add my s-o-b then.
But only after the DT people agree on the new entry.
thanks,
greg k-h
On Thu, May 23, 2019 at 8:01 PM Saravana Kannan <saravanak@google.com> wrote: > > Add a generic "depends-on" property that allows specifying mandatory > functional dependencies between devices. Add device-links after the > devices are created (but before they are probed) by looking at this > "depends-on" property. The DT already has dependency information. A node with 'clocks' property has its dependency right there. We should use that. We don't need to duplicate the information. > This property is used instead of existing DT properties that specify > phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This > is because not all resources referred to by existing DT properties are > mandatory functional dependencies. Some devices/drivers might be able > to operate with reduced functionality when some of the resources > aren't available. For example, a device could operate in polling mode > if no IRQ is available, a device could skip doing power management if > clock or voltage control isn't available and they are left on, etc. Yeah, but none of these examples are typically what you'd want to happen. These cases are a property of the OS, not the DT. For example, until recently, If you added pinctrl bindings to your DT, the kernel would no longer boot because it would be looking for pinctrl driver. That's wrong because the DT should not be coupled to the OS like that. Adding this property will cause the same problem. > So, adding mandatory functional dependency links between devices by > looking at referred phandles in DT properties won't work as it would > prevent probing devices that could be probed. By having an explicit > depends-on property, we can handle these cases correctly. > > Having functional dependencies explicitly called out in DT and > automatically added before the devices are probed, provides the > following benefits: > > - Optimizes device probe order and avoids the useless work of > attempting probes of devices that will not probe successfully > (because their suppliers aren't present or haven't probed yet). > > For example, in a commonly available mobile SoC, registering just > one consumer device's driver at an initcall level earlier than the > supplier device's driver causes 11 failed probe attempts before the > consumer device probes successfully. This was with a kernel with all > the drivers statically compiled in. This problem gets a lot worse if > all the drivers are loaded as modules without direct symbol > dependencies. Do you have data on how much time is spent. Past 'smarter probing' attempts have not shown a significant difference. > - Supplier devices like clock providers, regulators providers, etc > need to keep the resources they provide active and at a particular > state(s) during boot up even if their current set of consumers don't > request the resource to be active. This is because the rest of the > consumers might not have probed yet and turning off the resource > before all the consumers have probed could lead to a hang or > undesired user experience. We already know generally what devices are dependencies because you just listed them. Why don't we make the kernel smarter by instantiating these core devices/drivers first instead of relying on initcall and link order. > Some frameworks (Eg: regulator) handle this today by turning off > "unused" resources at late_initcall_sync and hoping all the devices > have probed by then. This is not a valid assumption for systems with > loadable modules. Other frameworks (Eg: clock) just don't handle > this due to the lack of a clear signal for when they can turn off > resources. This leads to downstream hacks to handle cases like this > that can easily be solved in the upstream kernel. IMO, we should get rid of this auto disabling. Rob
On Thu, May 23, 2019 at 06:01:14PM -0700, Saravana Kannan wrote: > The depends-on property is used to list the mandatory functional > dependencies of a consumer device on zero or more supplier devices. > > Signed-off-by: Saravana Kannan <saravanak@google.com> > --- > .../devicetree/bindings/depends-on.txt | 26 +++++++++++++++++++ > 1 file changed, 26 insertions(+) > create mode 100644 Documentation/devicetree/bindings/depends-on.txt > > diff --git a/Documentation/devicetree/bindings/depends-on.txt b/Documentation/devicetree/bindings/depends-on.txt > new file mode 100644 > index 000000000000..1cbddd11cf17 > --- /dev/null > +++ b/Documentation/devicetree/bindings/depends-on.txt > @@ -0,0 +1,26 @@ > +Functional dependency linking > +============================= > + > +Apart from parent-child relationships, devices (consumers) often have > +functional dependencies on other devices (suppliers). Common examples of > +suppliers are clock, regulators, pinctrl, etc. However not all of them are > +dependencies with well defined devicetree bindings. For clocks, regualtors, and pinctrl, that dependency is already implicit in the consumer node's properties. We should be able to derive those dependencies within the kernel. Can you give an example of where a dependency is not implicit in an existing binding? > Also, not all functional > +dependencies are mandatory as the device might be able to operate in a limited > +mode without some of the dependencies. Whether something is a mandatory dependency will depend on the driver and dynamic runtime details more than it will depend on the hardware. For example, assume I have an IP block that functions as both a clocksource and a watchdog that can reset the system, with those two functions derived from separate input clocks. I could use the device as just a clocksource, or as just a watchdog, and neither feature in isolation is necessarily mandatory for the device to be somewhat useful to the OS. We need better ways of dynamically providing and managing this information. For example, if a driver could register its dynamic dependencies at probe (or some new pre-probe callback), we'd be able to notify it immediately when its dependencies are available. Thanks, Mark.
On 5/23/19 6:01 PM, Saravana Kannan wrote: > Add a generic "depends-on" property that allows specifying mandatory > functional dependencies between devices. Add device-links after the > devices are created (but before they are probed) by looking at this > "depends-on" property. > > This property is used instead of existing DT properties that specify > phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This > is because not all resources referred to by existing DT properties are > mandatory functional dependencies. Some devices/drivers might be able> to operate with reduced functionality when some of the resources > aren't available. For example, a device could operate in polling mode > if no IRQ is available, a device could skip doing power management if > clock or voltage control isn't available and they are left on, etc. > > So, adding mandatory functional dependency links between devices by > looking at referred phandles in DT properties won't work as it would > prevent probing devices that could be probed. By having an explicit > depends-on property, we can handle these cases correctly. Trying to wrap my brain around the concept, this series seems to be adding the ability to declare that an apparent dependency (eg an IRQ specified by a phandle) is _not_ actually a dependency. The phandle already implies the dependency. Creating a separate depends-on property provides a method of ignoring the implied dependencies. This is not just hardware description. It is instead a combination of hardware functionality and driver functionality. An example provided in the second paragraph of the email I am replying to suggests a device could operate in polling mode if no IRQ is available. Using this example, the devicetree does not know whether the driver requires the IRQ (currently an implied dependency since the IRQ phandle exists). My understanding of this example is that the device node would _not_ have a depends-on property for the IRQ phandle so the IRQ would be optional. But this is an attribute of the driver, not the hardware. This is also configuration, declaring whether the system is willing to accept polling mode instead of interrupt mode. Devicetree is not the proper place for driver description or for configuration. Another flaw with this method is that existing device trees will be broken after the kernel is modified, because existing device trees do not have the depends-on property. This breaks the devicetree compatibility rules. > Having functional dependencies explicitly called out in DT and > automatically added before the devices are probed, provides the > following benefits: > > - Optimizes device probe order and avoids the useless work of > attempting probes of devices that will not probe successfully > (because their suppliers aren't present or haven't probed yet). > > For example, in a commonly available mobile SoC, registering just > one consumer device's driver at an initcall level earlier than the > supplier device's driver causes 11 failed probe attempts before the > consumer device probes successfully. This was with a kernel with all > the drivers statically compiled in. This problem gets a lot worse if > all the drivers are loaded as modules without direct symbol > dependencies. > > - Supplier devices like clock providers, regulators providers, etc > need to keep the resources they provide active and at a particular > state(s) during boot up even if their current set of consumers don't > request the resource to be active. This is because the rest of the > consumers might not have probed yet and turning off the resource > before all the consumers have probed could lead to a hang or > undesired user experience. > > Some frameworks (Eg: regulator) handle this today by turning off > "unused" resources at late_initcall_sync and hoping all the devices > have probed by then. This is not a valid assumption for systems with > loadable modules. Other frameworks (Eg: clock) just don't handle > this due to the lack of a clear signal for when they can turn off > resources. This leads to downstream hacks to handle cases like this > that can easily be solved in the upstream kernel. > > By linking devices before they are probed, we give suppliers a clear By linking devices to suppliers before they are probed, we give suppliers a clear > count of the number of dependent consumers. Once all of the > consumers are active, the suppliers can turn off the unused > resources without making assumptions about the number of consumers. > > By default we just add device-links to track "driver presence" (probe > succeeded) of the supplier device. If any other functionality provided > by device-links are needed, it is left to the consumer/supplier > devices to change the link when they probe. > > > Saravana Kannan (5): > of/platform: Speed up of_find_device_by_node() > driver core: Add device links support for pending links to suppliers > dt-bindings: Add depends-on property > of/platform: Add functional dependency link from "depends-on" property > driver core: Add sync_state driver/bus callback > > .../devicetree/bindings/depends-on.txt | 26 +++++ > drivers/base/core.c | 106 ++++++++++++++++++ > drivers/of/platform.c | 75 ++++++++++++- > include/linux/device.h | 24 ++++ > include/linux/of.h | 3 + > 5 files changed, 233 insertions(+), 1 deletion(-) > create mode 100644 Documentation/devicetree/bindings/depends-on.txt > The above issues make this specific implementation not acceptable. I like the analysis of the problem areas, and I like the concepts of trying to solve not only probe ordering, but also the problem of when to turn off resources that will not be needed. But at the moment, I don't have a suggestion of a way to implement a solution. -Frank
Hi Sarvana, I'm not reviewing patches 1-5 in any detail, given my reply to patch 0. But I had already skimmed through this patch before I received the email for patch 0, so I want to make one generic comment below, to give some feedback as you continue thinking through possible implementations to solve the underlying problems. On 5/23/19 6:01 PM, Saravana Kannan wrote: > Add a pointer from device tree node to the device created from it. > This allows us to find the device corresponding to a device tree node > without having to loop through all the platform devices. > > However, fallback to looping through the platform devices to handle > any devices that might set their own of_node. > > Signed-off-by: Saravana Kannan <saravanak@google.com> > --- > drivers/of/platform.c | 20 +++++++++++++++++++- > include/linux/of.h | 3 +++ > 2 files changed, 22 insertions(+), 1 deletion(-) > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c > index 04ad312fd85b..1115a8d80a33 100644 > --- a/drivers/of/platform.c > +++ b/drivers/of/platform.c > @@ -42,6 +42,8 @@ static int of_dev_node_match(struct device *dev, void *data) > return dev->of_node == data; > } > > +static DEFINE_SPINLOCK(of_dev_lock); > + > /** > * of_find_device_by_node - Find the platform_device associated with a node > * @np: Pointer to device tree node > @@ -55,7 +57,18 @@ struct platform_device *of_find_device_by_node(struct device_node *np) > { > struct device *dev; > > - dev = bus_find_device(&platform_bus_type, NULL, np, of_dev_node_match); > + /* > + * Spinlock needed to make sure np->dev doesn't get freed between NULL > + * check inside and kref count increment inside get_device(). This is > + * achieved by grabbing the spinlock before setting np->dev = NULL in > + * of_platform_device_destroy(). > + */ > + spin_lock(&of_dev_lock); > + dev = get_device(np->dev); > + spin_unlock(&of_dev_lock); > + if (!dev) > + dev = bus_find_device(&platform_bus_type, NULL, np, > + of_dev_node_match); > return dev ? to_platform_device(dev) : NULL; > } > EXPORT_SYMBOL(of_find_device_by_node); > @@ -196,6 +209,7 @@ static struct platform_device *of_platform_device_create_pdata( > platform_device_put(dev); > goto err_clear_flag; > } > + np->dev = &dev->dev; > > return dev; > > @@ -556,6 +570,10 @@ int of_platform_device_destroy(struct device *dev, void *data) > if (of_node_check_flag(dev->of_node, OF_POPULATED_BUS)) > device_for_each_child(dev, NULL, of_platform_device_destroy); > > + /* Spinlock is needed for of_find_device_by_node() to work */ > + spin_lock(&of_dev_lock); > + dev->of_node->dev = NULL; > + spin_unlock(&of_dev_lock); > of_node_clear_flag(dev->of_node, OF_POPULATED); > of_node_clear_flag(dev->of_node, OF_POPULATED_BUS); > > diff --git a/include/linux/of.h b/include/linux/of.h > index 0cf857012f11..f2b4912cbca1 100644 > --- a/include/linux/of.h > +++ b/include/linux/of.h > @@ -48,6 +48,8 @@ struct property { > struct of_irq_controller; > #endif > > +struct device; > + > struct device_node { > const char *name; > phandle phandle; > @@ -68,6 +70,7 @@ struct device_node { > unsigned int unique_id; > struct of_irq_controller *irq_trans; > #endif > + struct device *dev; /* Device created from this node */ We have actively been working on shrinking the size of struct device_node, as part of reducing the devicetree memory usage. As such, we need strong justification for adding anything to this struct. For example, proof that there is a performance problem that can only be solved by increasing the memory usage. -Frank > }; > > #define MAX_PHANDLE_ARGS 16 >
On Fri, May 24, 2019 at 10:56 AM Frank Rowand <frowand.list@gmail.com> wrote: > > Hi Sarvana, > > I'm not reviewing patches 1-5 in any detail, given my reply to patch 0. > > But I had already skimmed through this patch before I received the > email for patch 0, so I want to make one generic comment below, > to give some feedback as you continue thinking through possible > implementations to solve the underlying problems. Appreciate the feedback Frank! > > > On 5/23/19 6:01 PM, Saravana Kannan wrote: > > Add a pointer from device tree node to the device created from it. > > This allows us to find the device corresponding to a device tree node > > without having to loop through all the platform devices. > > > > However, fallback to looping through the platform devices to handle > > any devices that might set their own of_node. > > > > Signed-off-by: Saravana Kannan <saravanak@google.com> > > --- > > drivers/of/platform.c | 20 +++++++++++++++++++- > > include/linux/of.h | 3 +++ > > 2 files changed, 22 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c > > index 04ad312fd85b..1115a8d80a33 100644 > > --- a/drivers/of/platform.c > > +++ b/drivers/of/platform.c > > @@ -42,6 +42,8 @@ static int of_dev_node_match(struct device *dev, void *data) > > return dev->of_node == data; > > } > > > > +static DEFINE_SPINLOCK(of_dev_lock); > > + > > /** > > * of_find_device_by_node - Find the platform_device associated with a node > > * @np: Pointer to device tree node > > @@ -55,7 +57,18 @@ struct platform_device *of_find_device_by_node(struct device_node *np) > > { > > struct device *dev; > > > > - dev = bus_find_device(&platform_bus_type, NULL, np, of_dev_node_match); > > + /* > > + * Spinlock needed to make sure np->dev doesn't get freed between NULL > > + * check inside and kref count increment inside get_device(). This is > > + * achieved by grabbing the spinlock before setting np->dev = NULL in > > + * of_platform_device_destroy(). > > + */ > > + spin_lock(&of_dev_lock); > > + dev = get_device(np->dev); > > + spin_unlock(&of_dev_lock); > > + if (!dev) > > + dev = bus_find_device(&platform_bus_type, NULL, np, > > + of_dev_node_match); > > return dev ? to_platform_device(dev) : NULL; > > } > > EXPORT_SYMBOL(of_find_device_by_node); > > @@ -196,6 +209,7 @@ static struct platform_device *of_platform_device_create_pdata( > > platform_device_put(dev); > > goto err_clear_flag; > > } > > + np->dev = &dev->dev; > > > > return dev; > > > > @@ -556,6 +570,10 @@ int of_platform_device_destroy(struct device *dev, void *data) > > if (of_node_check_flag(dev->of_node, OF_POPULATED_BUS)) > > device_for_each_child(dev, NULL, of_platform_device_destroy); > > > > + /* Spinlock is needed for of_find_device_by_node() to work */ > > + spin_lock(&of_dev_lock); > > + dev->of_node->dev = NULL; > > + spin_unlock(&of_dev_lock); > > of_node_clear_flag(dev->of_node, OF_POPULATED); > > of_node_clear_flag(dev->of_node, OF_POPULATED_BUS); > > > > diff --git a/include/linux/of.h b/include/linux/of.h > > index 0cf857012f11..f2b4912cbca1 100644 > > --- a/include/linux/of.h > > +++ b/include/linux/of.h > > @@ -48,6 +48,8 @@ struct property { > > struct of_irq_controller; > > #endif > > > > +struct device; > > + > > struct device_node { > > const char *name; > > phandle phandle; > > @@ -68,6 +70,7 @@ struct device_node { > > unsigned int unique_id; > > struct of_irq_controller *irq_trans; > > #endif > > + struct device *dev; /* Device created from this node */ > > We have actively been working on shrinking the size of struct device_node, > as part of reducing the devicetree memory usage. As such, we need strong > justification for adding anything to this struct. For example, proof that > there is a performance problem that can only be solved by increasing the > memory usage. I didn't mean for people to focus on the deferred probe optimization. In reality that was just a added side benefit of this series. The main problem to solve is that of suppliers having to know when all their consumers are up and managing the resources actively, especially in a system with loadable modules where we can't depend on the driver to notify the supplier because the consumer driver module might not be available or loaded until much later. Having said that, I'm not saying we should go around and waste space willy-nilly. But, isn't the memory usage going to increase based on the number of DT nodes present in DT? I'd think as the number of DT nodes increase it's more likely for those devices have more memory? So at least in this specific case I think adding the field is justified. Also, right now the look up is O(n) complexity and if we are trying to add device links to most of the devices, that whole process becomes O(n^2). Having this field makes the look up a O(1) and the entire linking process a O(n) process. I think the memory usage increase is worth the efficiency improvement. And if people are still strongly against it, we could make this a config option. -Saravana
Before I address all the comments, a friendly reminder: Whatever solution we come up with needs to work on a system with loadable modules and shouldn't depend on userspace for correctness. On Fri, May 24, 2019 at 6:04 AM Rob Herring <robh+dt@kernel.org> wrote: > > On Thu, May 23, 2019 at 8:01 PM Saravana Kannan <saravanak@google.com> wrote: > > > > Add a generic "depends-on" property that allows specifying mandatory > > functional dependencies between devices. Add device-links after the > > devices are created (but before they are probed) by looking at this > > "depends-on" property. > > The DT already has dependency information. A node with 'clocks' > property has its dependency right there. We should use that. We don't > need to duplicate the information. So, are you saying all clock bindings are mandatory for a device for all possible users of DT + Linux? Do you think if we have a patch that makes all clock bindings mandatory dependencies, no one would object to that? > > This property is used instead of existing DT properties that specify > > phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This > > is because not all resources referred to by existing DT properties are > > mandatory functional dependencies. Some devices/drivers might be able > > to operate with reduced functionality when some of the resources > > aren't available. For example, a device could operate in polling mode > > if no IRQ is available, a device could skip doing power management if > > clock or voltage control isn't available and they are left on, etc. > > Yeah, but none of these examples are typically what you'd want to > happen. These cases are a property of the OS, not the DT. For example, > until recently, If you added pinctrl bindings to your DT, the kernel > would no longer boot because it would be looking for pinctrl driver. > That's wrong because the DT should not be coupled to the OS like that. > Adding this property will cause the same problem. Isn't the a perfect example of the pinctrl being an optional dependency in that specific case? The kernel still booted if pinctrl wasn't available? I don't agree that the dependency is purely a property of the OS. If there's no clock to clock the hardware core, then the hardware just can't work. There's no question about that. However, there can be clock bindings that aren't mandatory for functionality but are needed just for performance/power control. Another perfect example are clock providers. Clock providers often get input clocks from multiple other clock providers and even have cyclic clock bindings. But only some of them are mandatory for the clock provider to work. For example, clock provider A has input clocks from clock providers B and C, but it only needs B to function (provides root clock to all clocks). Not having C would only affect 4 (out of 100s of clocks) from clock provider A and those 4 are clocks depend on an input clock from C (basically clock from C going to A to have some clock gates and dividers added and sent back to C). This isn't even a made up scenario -- there are SoCs that actually have this. The OS could still choose to not probe the device unless full functionality is available or it could assume all clocks are left on by the bootloader and provide basic functionality. THAT would be the property of the OS. But that doesn't remove the fact that some of the resources are absolutely mandatory for the hardware to function. I'm proposing the depends-on to capture the true hardware dependency -- not what the SW chooses to do with it. > > So, adding mandatory functional dependency links between devices by > > looking at referred phandles in DT properties won't work as it would > > prevent probing devices that could be probed. By having an explicit > > depends-on property, we can handle these cases correctly. > > > > Having functional dependencies explicitly called out in DT and > > automatically added before the devices are probed, provides the > > following benefits: > > > > - Optimizes device probe order and avoids the useless work of > > attempting probes of devices that will not probe successfully > > (because their suppliers aren't present or haven't probed yet). > > > > For example, in a commonly available mobile SoC, registering just > > one consumer device's driver at an initcall level earlier than the > > supplier device's driver causes 11 failed probe attempts before the > > consumer device probes successfully. This was with a kernel with all > > the drivers statically compiled in. This problem gets a lot worse if > > all the drivers are loaded as modules without direct symbol > > dependencies. > > Do you have data on how much time is spent. Past 'smarter probing' > attempts have not shown a significant difference. "avoids the useless work attempting probes of devices that will not probe successfully" -- I never claimed to save boot up time. Your argument about having to save wall clock time is a moot point as a ton of kernel features that optimize code won't save wall clock time (the CPU would just run faster to make up for the inefficiency). Those features just make the kernel less resource hungry and more efficient. I'd understand your argument if this patch series is insanely complex -- but that's not the case here. > > - Supplier devices like clock providers, regulators providers, etc > > need to keep the resources they provide active and at a particular > > state(s) during boot up even if their current set of consumers don't > > request the resource to be active. This is because the rest of the > > consumers might not have probed yet and turning off the resource > > before all the consumers have probed could lead to a hang or > > undesired user experience. > > We already know generally what devices are dependencies because you > just listed them. Why don't we make the kernel smarter by > instantiating these core devices/drivers first instead of relying on > initcall and link order. That's what this patch series is -- it makes the kernel smarter by just using the data from DT instead of relying on manual tweaking of initcall and link order. > > Some frameworks (Eg: regulator) handle this today by turning off > > "unused" resources at late_initcall_sync and hoping all the devices > > have probed by then. This is not a valid assumption for systems with > > loadable modules. Other frameworks (Eg: clock) just don't handle > > this due to the lack of a clear signal for when they can turn off > > resources. This leads to downstream hacks to handle cases like this > > that can easily be solved in the upstream kernel. > > IMO, we should get rid of this auto disabling. Well, you need to back that opinion with reasoning. IMO we should disable unused resources so that we don't waste power -- especially on devices operating on batteries. Also, I explicitly said "need to keep the resources they provide active and at a particular state(s) during boot up". So it's not even about auto disabling. For example, in the case of a voltage regulator supplying multiple devices, if the first device probes and says it only need the lowest voltage level, you can't just drop the voltage. Because the other devices in the same voltage rail haven't probed yet and you can crash the system if you just drop the voltage. You need to wait for all the devices to be probed and then you can let the voltage regulator operate normally. And you can't depend on late_initcall because it falls apart on systems with modules. -Saravana
On Fri, May 24, 2019 at 10:49 AM Frank Rowand <frowand.list@gmail.com> wrote: > > On 5/23/19 6:01 PM, Saravana Kannan wrote: > > Add a generic "depends-on" property that allows specifying mandatory > > functional dependencies between devices. Add device-links after the > > devices are created (but before they are probed) by looking at this > > "depends-on" property. > > > > This property is used instead of existing DT properties that specify > > phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This > > is because not all resources referred to by existing DT properties are > > mandatory functional dependencies. Some devices/drivers might be able> to operate with reduced functionality when some of the resources > > aren't available. For example, a device could operate in polling mode > > if no IRQ is available, a device could skip doing power management if > > clock or voltage control isn't available and they are left on, etc. > > > > So, adding mandatory functional dependency links between devices by > > looking at referred phandles in DT properties won't work as it would > > prevent probing devices that could be probed. By having an explicit > > depends-on property, we can handle these cases correctly. > > Trying to wrap my brain around the concept, this series seems to be > adding the ability to declare that an apparent dependency (eg an IRQ > specified by a phandle) is _not_ actually a dependency. The current implementation completely ignores existing bindings for dependencies and so does the current tip of the kernel. So it's not really overriding anything. However, if I change the implementation so that depends-on becomes the source of truth if it exists and falls back to existing common bindings if "depends-on" isn't present -- then depends-on would truly be overriding existing bindings for dependencies. It depends on how we want to define the DT property. > The phandle already implies the dependency. Sure, it might imply, but it's not always true. > Creating a separate > depends-on property provides a method of ignoring the implied > dependencies. implied != true > This is not just hardware description. It is instead a combination > of hardware functionality and driver functionality. An example > provided in the second paragraph of the email I am replying to > suggests a device could operate in polling mode if no IRQ is > available. Using this example, the devicetree does not know > whether the driver requires the IRQ (currently an implied > dependency since the IRQ phandle exists). My understanding > of this example is that the device node would _not_ have a > depends-on property for the IRQ phandle so the IRQ would be > optional. But this is an attribute of the driver, not the > hardware. Not really. The interrupt could be for "SD card plugged in". That's never a mandatory dependency for the SD card controller to work. So the IRQ provider won't be a "depends-on" in this case. But if there is no power supply or clock for the SD card controller, it isn't going to work -- so they'd be listed in the "depends-on". So, this is still defining the hardware and not the OS. > This is also configuration, declaring whether the > system is willing to accept polling mode instead of interrupt > mode. Whether the driver will choose to operate without the IRQ is up to it. The OS could also assume the power supply is never turned off and still try to use the device. Depending on the hardware configuration, that might or might not work. > Devicetree is not the proper place for driver description or > for configuration. But depends-on isn't describing the driver configuration though. Overall, the clock provider example I gave in another reply is a much better example. If you just assume implied dependencies are mandatory dependencies, some devices will never be probe because the kernel is using them incorrectly (they aren't meant to list mandatory dependencies). > Another flaw with this method is that existing device trees > will be broken after the kernel is modified, because existing > device trees do not have the depends-on property. This breaks > the devicetree compatibility rules. This is 100% not true with the current implementation. I actually tested this. This is fully backwards compatible. That's another reason for adding depends-on and going by just what it says. The existing bindings were never meant to describe only mandatory dependencies. So using them as such is what would break backwards compatibility. > > Having functional dependencies explicitly called out in DT and > > automatically added before the devices are probed, provides the > > following benefits: > > > > - Optimizes device probe order and avoids the useless work of > > attempting probes of devices that will not probe successfully > > (because their suppliers aren't present or haven't probed yet). > > > > For example, in a commonly available mobile SoC, registering just > > one consumer device's driver at an initcall level earlier than the > > supplier device's driver causes 11 failed probe attempts before the > > consumer device probes successfully. This was with a kernel with all > > the drivers statically compiled in. This problem gets a lot worse if > > all the drivers are loaded as modules without direct symbol > > dependencies. > > > > - Supplier devices like clock providers, regulators providers, etc > > need to keep the resources they provide active and at a particular > > state(s) during boot up even if their current set of consumers don't > > request the resource to be active. This is because the rest of the > > consumers might not have probed yet and turning off the resource > > before all the consumers have probed could lead to a hang or > > undesired user experience. > > > > Some frameworks (Eg: regulator) handle this today by turning off > > "unused" resources at late_initcall_sync and hoping all the devices > > have probed by then. This is not a valid assumption for systems with > > loadable modules. Other frameworks (Eg: clock) just don't handle > > this due to the lack of a clear signal for when they can turn off > > resources. This leads to downstream hacks to handle cases like this > > that can easily be solved in the upstream kernel. > > > > By linking devices before they are probed, we give suppliers a clear > > By linking devices to suppliers before they are probed, we give suppliers a clear Ack > > count of the number of dependent consumers. Once all of the > > consumers are active, the suppliers can turn off the unused > > resources without making assumptions about the number of consumers. > > > > By default we just add device-links to track "driver presence" (probe > > succeeded) of the supplier device. If any other functionality provided > > by device-links are needed, it is left to the consumer/supplier > > devices to change the link when they probe. > > > > > > Saravana Kannan (5): > > of/platform: Speed up of_find_device_by_node() > > driver core: Add device links support for pending links to suppliers > > dt-bindings: Add depends-on property > > of/platform: Add functional dependency link from "depends-on" property > > driver core: Add sync_state driver/bus callback > > > > .../devicetree/bindings/depends-on.txt | 26 +++++ > > drivers/base/core.c | 106 ++++++++++++++++++ > > drivers/of/platform.c | 75 ++++++++++++- > > include/linux/device.h | 24 ++++ > > include/linux/of.h | 3 + > > 5 files changed, 233 insertions(+), 1 deletion(-) > > create mode 100644 Documentation/devicetree/bindings/depends-on.txt > > > > The above issues make this specific implementation not acceptable. > I like the analysis of the problem areas, and I like the concepts of > trying to solve not only probe ordering, but also the problem of > when to turn off resources that will not be needed. Beating a dead horse here, but I want to make sure I get this into as many minds as possible: It is NOT just about turning off resources. It's about the kernel taking full control of resources (allowing the full range of voltages, clock frequencies, bus configurations, etc) and syncing the HW state to the SW state as determined by the consumers. > But at the > moment, I don't have a suggestion of a way to implement a solution. The problem of syncing resources to SW state after boot up completed has been broken for a long time in the kernel. It's high time we fix it. I'm open to other suggestions, but we can't just say "we don't have a solution". For example, we can have a kernel command line argument that'll use all common implicit bindings as mandatory dependencies and allow "depends-on" to override them for the few cases where the implicit dependencies don't match mandatory dependencies. Then: - The kernel will be 100% backwards compatible with existing DT if the command line arg isn't provided. - New DT + old kernel will be no worse than today because old kernel doesn't do any dependency tracking. - Command line arg + new kernel + hardware where all implicit dependencies are actually mandatory dependencies == things work better. - Command line arg + new kernel + hardware where all implicit dependencies don't match mandatory dependencies + depends-on for those exception case == things work better. Would that be an acceptable use of "depends-on" to track mandatory dependencies? -Saravana
On Fri, May 24, 2019 at 8:01 AM Mark Rutland <mark.rutland@arm.com> wrote: > > On Thu, May 23, 2019 at 06:01:14PM -0700, Saravana Kannan wrote: > > The depends-on property is used to list the mandatory functional > > dependencies of a consumer device on zero or more supplier devices. > > > > Signed-off-by: Saravana Kannan <saravanak@google.com> > > --- > > .../devicetree/bindings/depends-on.txt | 26 +++++++++++++++++++ > > 1 file changed, 26 insertions(+) > > create mode 100644 Documentation/devicetree/bindings/depends-on.txt > > > > diff --git a/Documentation/devicetree/bindings/depends-on.txt b/Documentation/devicetree/bindings/depends-on.txt > > new file mode 100644 > > index 000000000000..1cbddd11cf17 > > --- /dev/null > > +++ b/Documentation/devicetree/bindings/depends-on.txt > > @@ -0,0 +1,26 @@ > > +Functional dependency linking > > +============================= > > + > > +Apart from parent-child relationships, devices (consumers) often have > > +functional dependencies on other devices (suppliers). Common examples of > > +suppliers are clock, regulators, pinctrl, etc. However not all of them are > > +dependencies with well defined devicetree bindings. > > For clocks, regualtors, and pinctrl, that dependency is already implicit > in the consumer node's properties. We should be able to derive those > dependencies within the kernel. > > Can you give an example of where a dependency is not implicit in an > existing binding? I already gave the IRQ example. But if that's not good I replied to other emails with a clock providers example that's based on real hardware. > > Also, not all functional > > +dependencies are mandatory as the device might be able to operate in a limited > > +mode without some of the dependencies. > > Whether something is a mandatory dependency will depend on the driver > and dynamic runtime details more than it will depend on the hardware. > > For example, assume I have an IP block that functions as both a > clocksource and a watchdog that can reset the system, with those two > functions derived from separate input clocks. > > I could use the device as just a clocksource, or as just a watchdog, and > neither feature in isolation is necessarily mandatory for the device to > be somewhat useful to the OS. Aren't you talking about the supplier here? I don't see any issues so far. You could have consumers that try to use both the features of this IP you mention (although I'm hard pressed to see how a watchdog is a mandatory dependency but let's assume it is). So lets say consumer A adds depends-on to your IP block for the clock and consumer B adds depends-on your IP block for the watchdog. If your driver is incomplete and provides only the watchdog feature, then consumer A and B will attempt to probe once your device probes, but consumer A will never probe successfully because it'll keep getting -EPROBE_DEFER or an error. Your driver will never get a sync_state() callback and that's fine because you don't know how to use or turn off the clock source. If the situation is reversed and your driver provides only the clock feature, then consumer B will never probe because it'll never be able to "get()" or whatever it tries to do with the watchdog feature. And your driver will never get a sync_state() callback and you can never turn off your clock. But that's the best you'll get till you send a patch to add the watchdog support to your driver :) > We need better ways of dynamically providing and managing this > information. For example, if a driver could register its dynamic > dependencies at probe (or some new pre-probe callback), we'd be able to > notify it immediately when its dependencies are available. We can't depend on the drivers to notify the core framework because that doesn't work on a system with modules. The example I gave in the commit text for the last patch is a good one. If your driver is supplying power to the screen backlight and the backlight module is never loaded, you can never turn off the supply to the backlight ever. That use case has to work and it won't work if you depend on the backlight driver to tell you what it depends on. -Saravana
On 5/24/19 11:21 AM, Saravana Kannan wrote: > On Fri, May 24, 2019 at 10:56 AM Frank Rowand <frowand.list@gmail.com> wrote: >> >> Hi Sarvana, >> >> I'm not reviewing patches 1-5 in any detail, given my reply to patch 0. >> >> But I had already skimmed through this patch before I received the >> email for patch 0, so I want to make one generic comment below, >> to give some feedback as you continue thinking through possible >> implementations to solve the underlying problems. > > Appreciate the feedback Frank! > >> >> >> On 5/23/19 6:01 PM, Saravana Kannan wrote: >>> Add a pointer from device tree node to the device created from it. >>> This allows us to find the device corresponding to a device tree node >>> without having to loop through all the platform devices. >>> >>> However, fallback to looping through the platform devices to handle >>> any devices that might set their own of_node. >>> >>> Signed-off-by: Saravana Kannan <saravanak@google.com> >>> --- >>> drivers/of/platform.c | 20 +++++++++++++++++++- >>> include/linux/of.h | 3 +++ >>> 2 files changed, 22 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c >>> index 04ad312fd85b..1115a8d80a33 100644 >>> --- a/drivers/of/platform.c >>> +++ b/drivers/of/platform.c >>> @@ -42,6 +42,8 @@ static int of_dev_node_match(struct device *dev, void *data) >>> return dev->of_node == data; >>> } >>> >>> +static DEFINE_SPINLOCK(of_dev_lock); >>> + >>> /** >>> * of_find_device_by_node - Find the platform_device associated with a node >>> * @np: Pointer to device tree node >>> @@ -55,7 +57,18 @@ struct platform_device *of_find_device_by_node(struct device_node *np) >>> { >>> struct device *dev; >>> >>> - dev = bus_find_device(&platform_bus_type, NULL, np, of_dev_node_match); >>> + /* >>> + * Spinlock needed to make sure np->dev doesn't get freed between NULL >>> + * check inside and kref count increment inside get_device(). This is >>> + * achieved by grabbing the spinlock before setting np->dev = NULL in >>> + * of_platform_device_destroy(). >>> + */ >>> + spin_lock(&of_dev_lock); >>> + dev = get_device(np->dev); >>> + spin_unlock(&of_dev_lock); >>> + if (!dev) >>> + dev = bus_find_device(&platform_bus_type, NULL, np, >>> + of_dev_node_match); >>> return dev ? to_platform_device(dev) : NULL; >>> } >>> EXPORT_SYMBOL(of_find_device_by_node); >>> @@ -196,6 +209,7 @@ static struct platform_device *of_platform_device_create_pdata( >>> platform_device_put(dev); >>> goto err_clear_flag; >>> } >>> + np->dev = &dev->dev; >>> >>> return dev; >>> >>> @@ -556,6 +570,10 @@ int of_platform_device_destroy(struct device *dev, void *data) >>> if (of_node_check_flag(dev->of_node, OF_POPULATED_BUS)) >>> device_for_each_child(dev, NULL, of_platform_device_destroy); >>> >>> + /* Spinlock is needed for of_find_device_by_node() to work */ >>> + spin_lock(&of_dev_lock); >>> + dev->of_node->dev = NULL; >>> + spin_unlock(&of_dev_lock); >>> of_node_clear_flag(dev->of_node, OF_POPULATED); >>> of_node_clear_flag(dev->of_node, OF_POPULATED_BUS); >>> >>> diff --git a/include/linux/of.h b/include/linux/of.h >>> index 0cf857012f11..f2b4912cbca1 100644 >>> --- a/include/linux/of.h >>> +++ b/include/linux/of.h >>> @@ -48,6 +48,8 @@ struct property { >>> struct of_irq_controller; >>> #endif >>> >>> +struct device; >>> + >>> struct device_node { >>> const char *name; >>> phandle phandle; >>> @@ -68,6 +70,7 @@ struct device_node { >>> unsigned int unique_id; >>> struct of_irq_controller *irq_trans; >>> #endif >>> + struct device *dev; /* Device created from this node */ >> >> We have actively been working on shrinking the size of struct device_node, >> as part of reducing the devicetree memory usage. As such, we need strong >> justification for adding anything to this struct. For example, proof that >> there is a performance problem that can only be solved by increasing the >> memory usage. > > I didn't mean for people to focus on the deferred probe optimization. I was speaking specifically of the of_find_device_by_node() optimization. I did not chase any further back in the call chain to see how that would impact anything else. My comments stand, whether this patch is meant to optimize deferred probe optimization or to optimize something else. > In reality that was just a added side benefit of this series. The main > problem to solve is that of suppliers having to know when all their > consumers are up and managing the resources actively, especially in a > system with loadable modules where we can't depend on the driver to > notify the supplier because the consumer driver module might not be > available or loaded until much later. > > Having said that, I'm not saying we should go around and waste space > willy-nilly. But, isn't the memory usage going to increase based on > the number of DT nodes present in DT? I'd think as the number of DT > nodes increase it's more likely for those devices have more memory? So > at least in this specific case I think adding the field is justified. Struct device_node is the nodes of the in kernel devicetree data. This patch adds a field to every single node of the devicetree. The patch series is also adding a new property, of varying size, to some nodes. This also results in additional memory usage by devicetree. Arguing that a more complex system is likely to have more memory is likely true, but beside the point. Minimizing devicetree memory used on less complex systems is also one of our goals. > Also, right now the look up is O(n) complexity and if we are trying to > add device links to most of the devices, that whole process becomes > O(n^2). Having this field makes the look up a O(1) and the entire > linking process a O(n) process. I think the memory usage increase is > worth the efficiency improvement. Hand waving about O(n) and O(1) and O(n2) is not sufficient. We require actual measurements that show O(n2) (when the existing algorithm is such) is a performance problem and that the proposed change to the algorithm results in a specific change in the performance. The devicetree maintainers have decided that memory use is important and to be minimized, and the burden of proof that performance is an issue lies on the submitter of a patch that improves performance at the cost of memory. Most devicetree boot time cpu overhead only affects the boot event. Added memory use persists for the entire booted lifetime of the system. That is not to say that we never increase memory use to improve boot performance. We have done so when the measured performance issue and measured performance improvement justified the change. > > And if people are still strongly against it, we could make this a config option. > > -Saravana >
Hi Saravana, On 5/24/19 2:53 PM, Saravana Kannan wrote: > On Fri, May 24, 2019 at 10:49 AM Frank Rowand <frowand.list@gmail.com> wrote: >> >> On 5/23/19 6:01 PM, Saravana Kannan wrote: < snip > > > -Saravana > There were several different topics in your email. I am going to do separate replies for different topics so that each topic is contained in a single sub-thread instead of possibly having a many topic sub-thread where any of the topics might get lost. If I drop any key context out of any of the replies, please feel free to add it back in. -Frank
On 5/24/19 2:53 PM, Saravana Kannan wrote: > On Fri, May 24, 2019 at 10:49 AM Frank Rowand <frowand.list@gmail.com> wrote: >> >> On 5/23/19 6:01 PM, Saravana Kannan wrote: < snip > >> Another flaw with this method is that existing device trees >> will be broken after the kernel is modified, because existing >> device trees do not have the depends-on property. This breaks >> the devicetree compatibility rules. > > This is 100% not true with the current implementation. I actually > tested this. This is fully backwards compatible. That's another reason > for adding depends-on and going by just what it says. The existing > bindings were never meant to describe only mandatory dependencies. So > using them as such is what would break backwards compatibility. Are you saying that an existing, already compiled, devicetree (an FDT) can be used to boot a new kernel that has implemented this patch set? The new kernel will boot with the existing FDT that does not have any depends-on properties? -Frank
On 5/24/19 5:22 PM, Frank Rowand wrote: > On 5/24/19 2:53 PM, Saravana Kannan wrote: >> On Fri, May 24, 2019 at 10:49 AM Frank Rowand <frowand.list@gmail.com> wrote: >>> >>> On 5/23/19 6:01 PM, Saravana Kannan wrote: > > < snip > > >>> Another flaw with this method is that existing device trees >>> will be broken after the kernel is modified, because existing >>> device trees do not have the depends-on property. This breaks >>> the devicetree compatibility rules. >> >> This is 100% not true with the current implementation. I actually >> tested this. This is fully backwards compatible. That's another reason >> for adding depends-on and going by just what it says. The existing >> bindings were never meant to describe only mandatory dependencies. So >> using them as such is what would break backwards compatibility. > > Are you saying that an existing, already compiled, devicetree (an FDT) > can be used to boot a new kernel that has implemented this patch set? > > The new kernel will boot with the existing FDT that does not have > any depends-on properties? I overlooked something you said in the email I replied to. You said: "that depends-on becomes the source of truth if it exists and falls back to existing common bindings if "depends-on" isn't present" Let me go back to look at the patch series to see how it falls back to the existing bindings. > > -Frank >
Ugh... mobile app is sending HTML emails. Replying again. On Fri, May 24, 2019 at 5:25 PM Frank Rowand <frowand.list@gmail.com> wrote: > > On 5/24/19 5:22 PM, Frank Rowand wrote: > > On 5/24/19 2:53 PM, Saravana Kannan wrote: > >> On Fri, May 24, 2019 at 10:49 AM Frank Rowand <frowand.list@gmail.com> wrote: > >>> > >>> On 5/23/19 6:01 PM, Saravana Kannan wrote: > > > > < snip > > > > >>> Another flaw with this method is that existing device trees > >>> will be broken after the kernel is modified, because existing > >>> device trees do not have the depends-on property. This breaks > >>> the devicetree compatibility rules. > >> > >> This is 100% not true with the current implementation. I actually > >> tested this. This is fully backwards compatible. That's another reason > >> for adding depends-on and going by just what it says. The existing > >> bindings were never meant to describe only mandatory dependencies. So > >> using them as such is what would break backwards compatibility. > > > > Are you saying that an existing, already compiled, devicetree (an FDT) > > can be used to boot a new kernel that has implemented this patch set? > > > > The new kernel will boot with the existing FDT that does not have > > any depends-on properties? You sent out a lot of emails on this topic. But to answer them all. The existing implementation is 100% backwards compatible. > I overlooked something you said in the email I replied to. You said: > > "that depends-on becomes the source of truth if it exists and falls > back to existing common bindings if "depends-on" isn't present" This is referring to an alternate implementation where implicit dependencies are used by the kernel but new "depends-on" property would allow overriding in cases where the implicit dependencies are wrong. But this will need a kernel command line flag to enable this feature so that we can be backwards compatible. Otherwise it won't be. > Let me go back to look at the patch series to see how it falls back > to the existing bindings. Current patch series doesn't really "fallback" but rather it only acts on this new property. Existing FDT binaries simply don't have this. So it won't have any impact on the kernel behavior. But yes, looking at the patches again will help :) -Saravana
On Thu, May 23, 2019 at 10:52 PM Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote: > > On Thu, May 23, 2019 at 06:01:11PM -0700, Saravana Kannan wrote: > > Add a generic "depends-on" property that allows specifying mandatory > > functional dependencies between devices. Add device-links after the > > devices are created (but before they are probed) by looking at this > > "depends-on" property. > > > > This property is used instead of existing DT properties that specify > > phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This > > is because not all resources referred to by existing DT properties are > > mandatory functional dependencies. Some devices/drivers might be able > > to operate with reduced functionality when some of the resources > > aren't available. For example, a device could operate in polling mode > > if no IRQ is available, a device could skip doing power management if > > clock or voltage control isn't available and they are left on, etc. > > > > So, adding mandatory functional dependency links between devices by > > looking at referred phandles in DT properties won't work as it would > > prevent probing devices that could be probed. By having an explicit > > depends-on property, we can handle these cases correctly. > > > > Having functional dependencies explicitly called out in DT and > > automatically added before the devices are probed, provides the > > following benefits: > > > > - Optimizes device probe order and avoids the useless work of > > attempting probes of devices that will not probe successfully > > (because their suppliers aren't present or haven't probed yet). > > > > For example, in a commonly available mobile SoC, registering just > > one consumer device's driver at an initcall level earlier than the > > supplier device's driver causes 11 failed probe attempts before the > > consumer device probes successfully. This was with a kernel with all > > the drivers statically compiled in. This problem gets a lot worse if > > all the drivers are loaded as modules without direct symbol > > dependencies. > > > > - Supplier devices like clock providers, regulators providers, etc > > need to keep the resources they provide active and at a particular > > state(s) during boot up even if their current set of consumers don't > > request the resource to be active. This is because the rest of the > > consumers might not have probed yet and turning off the resource > > before all the consumers have probed could lead to a hang or > > undesired user experience. > > > > Some frameworks (Eg: regulator) handle this today by turning off > > "unused" resources at late_initcall_sync and hoping all the devices > > have probed by then. This is not a valid assumption for systems with > > loadable modules. Other frameworks (Eg: clock) just don't handle > > this due to the lack of a clear signal for when they can turn off > > resources. This leads to downstream hacks to handle cases like this > > that can easily be solved in the upstream kernel. > > > > By linking devices before they are probed, we give suppliers a clear > > count of the number of dependent consumers. Once all of the > > consumers are active, the suppliers can turn off the unused > > resources without making assumptions about the number of consumers. > > > > By default we just add device-links to track "driver presence" (probe > > succeeded) of the supplier device. If any other functionality provided > > by device-links are needed, it is left to the consumer/supplier > > devices to change the link when they probe. > > Somewhere in this wall of text you need to say: > MAKES DEVICES BOOT FASTER! > right? :) I'm sure it will, but I can't easily test and measure this number because I don't have a device with 100s of devices (common in mobile SoCs) where I can load all the drivers as modules and are supported upstream. And the current ones I have mostly workaround this in their downstream tree by manually ordering with initcalls and link order. But I see the avoidance of useless probes that'll fail as more of a free side benefit and not the main goal of this patch series. Getting modules to actually work and crash the system while booting is the main goal. > So in short, this solves the issue of deferred probing with systems with > loads of modules for platform devices and device tree, in that now you > have a chance to probe devices in the correct order saving loads of busy > loops. Yes, definitely saves loads of busy work. > A good thing, I like this, very nice work, all of these are: > Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Thanks! > but odds are I'll take this through my tree, so I'll add my s-o-b then. > But only after the DT people agree on the new entry. Yup! Trying to do that. :) -Saravana
Hi Saranova, I'll try to address the other portions of this email that I <snipped> in my previous replies. On 5/24/19 2:53 PM, Saravana Kannan wrote: > On Fri, May 24, 2019 at 10:49 AM Frank Rowand <frowand.list@gmail.com> wrote: >> >> On 5/23/19 6:01 PM, Saravana Kannan wrote: >>> Add a generic "depends-on" property that allows specifying mandatory >>> functional dependencies between devices. Add device-links after the >>> devices are created (but before they are probed) by looking at this >>> "depends-on" property. >>> >>> This property is used instead of existing DT properties that specify >>> phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This >>> is because not all resources referred to by existing DT properties are >>> mandatory functional dependencies. Some devices/drivers might be able> to operate with reduced functionality when some of the resources >>> aren't available. For example, a device could operate in polling mode >>> if no IRQ is available, a device could skip doing power management if >>> clock or voltage control isn't available and they are left on, etc. >>> >>> So, adding mandatory functional dependency links between devices by >>> looking at referred phandles in DT properties won't work as it would >>> prevent probing devices that could be probed. By having an explicit >>> depends-on property, we can handle these cases correctly. >> >> Trying to wrap my brain around the concept, this series seems to be >> adding the ability to declare that an apparent dependency (eg an IRQ >> specified by a phandle) is _not_ actually a dependency. > > The current implementation completely ignores existing bindings for > dependencies and so does the current tip of the kernel. So it's not > really overriding anything. However, if I change the implementation so > that depends-on becomes the source of truth if it exists and falls > back to existing common bindings if "depends-on" isn't present -- then > depends-on would truly be overriding existing bindings for > dependencies. It depends on how we want to define the DT property. > >> The phandle already implies the dependency. > > Sure, it might imply, but it's not always true. > >> Creating a separate >> depends-on property provides a method of ignoring the implied >> dependencies. > > implied != true > >> This is not just hardware description. It is instead a combination >> of hardware functionality and driver functionality. An example >> provided in the second paragraph of the email I am replying to >> suggests a device could operate in polling mode if no IRQ is >> available. Using this example, the devicetree does not know >> whether the driver requires the IRQ (currently an implied >> dependency since the IRQ phandle exists). My understanding >> of this example is that the device node would _not_ have a >> depends-on property for the IRQ phandle so the IRQ would be >> optional. But this is an attribute of the driver, not the >> hardware. > > Not really. The interrupt could be for "SD card plugged in". That's > never a mandatory dependency for the SD card controller to work. So > the IRQ provider won't be a "depends-on" in this case. But if there is > no power supply or clock for the SD card controller, it isn't going to > work -- so they'd be listed in the "depends-on". So, this is still > defining the hardware and not the OS. Please comment on my observation that was based on an IRQ for a device will polling mode vs interrupt driven mode. You described a different case and did not address my comment. >> This is also configuration, declaring whether the >> system is willing to accept polling mode instead of interrupt >> mode. > > Whether the driver will choose to operate without the IRQ is up to it. > The OS could also assume the power supply is never turned off and > still try to use the device. Depending on the hardware configuration, > that might or might not work. > >> Devicetree is not the proper place for driver description or >> for configuration. > > But depends-on isn't describing the driver configuration though. > > Overall, the clock provider example I gave in another reply is a much > better example. If you just assume implied dependencies are mandatory > dependencies, some devices will never be probe because the kernel is > using them incorrectly (they aren't meant to list mandatory > dependencies). > >> Another flaw with this method is that existing device trees >> will be broken after the kernel is modified, because existing >> device trees do not have the depends-on property. This breaks >> the devicetree compatibility rules. > > This is 100% not true with the current implementation. I actually > tested this. This is fully backwards compatible. That's another reason > for adding depends-on and going by just what it says. The existing > bindings were never meant to describe only mandatory dependencies. So > using them as such is what would break backwards compatibility. > >>> Having functional dependencies explicitly called out in DT and >>> automatically added before the devices are probed, provides the >>> following benefits: >>> >>> - Optimizes device probe order and avoids the useless work of >>> attempting probes of devices that will not probe successfully >>> (because their suppliers aren't present or haven't probed yet). >>> >>> For example, in a commonly available mobile SoC, registering just >>> one consumer device's driver at an initcall level earlier than the >>> supplier device's driver causes 11 failed probe attempts before the >>> consumer device probes successfully. This was with a kernel with all >>> the drivers statically compiled in. This problem gets a lot worse if >>> all the drivers are loaded as modules without direct symbol >>> dependencies. >>> >>> - Supplier devices like clock providers, regulators providers, etc >>> need to keep the resources they provide active and at a particular >>> state(s) during boot up even if their current set of consumers don't >>> request the resource to be active. This is because the rest of the >>> consumers might not have probed yet and turning off the resource >>> before all the consumers have probed could lead to a hang or >>> undesired user experience. >>> >>> Some frameworks (Eg: regulator) handle this today by turning off >>> "unused" resources at late_initcall_sync and hoping all the devices >>> have probed by then. This is not a valid assumption for systems with >>> loadable modules. Other frameworks (Eg: clock) just don't handle >>> this due to the lack of a clear signal for when they can turn off >>> resources. This leads to downstream hacks to handle cases like this >>> that can easily be solved in the upstream kernel. >>> >>> By linking devices before they are probed, we give suppliers a clear >> >> By linking devices to suppliers before they are probed, we give suppliers a clear > > Ack > >>> count of the number of dependent consumers. Once all of the >>> consumers are active, the suppliers can turn off the unused >>> resources without making assumptions about the number of consumers. >>> >>> By default we just add device-links to track "driver presence" (probe >>> succeeded) of the supplier device. If any other functionality provided >>> by device-links are needed, it is left to the consumer/supplier >>> devices to change the link when they probe. >>> >>> >>> Saravana Kannan (5): >>> of/platform: Speed up of_find_device_by_node() >>> driver core: Add device links support for pending links to suppliers >>> dt-bindings: Add depends-on property >>> of/platform: Add functional dependency link from "depends-on" property >>> driver core: Add sync_state driver/bus callback >>> >>> .../devicetree/bindings/depends-on.txt | 26 +++++ >>> drivers/base/core.c | 106 ++++++++++++++++++ >>> drivers/of/platform.c | 75 ++++++++++++- >>> include/linux/device.h | 24 ++++ >>> include/linux/of.h | 3 + >>> 5 files changed, 233 insertions(+), 1 deletion(-) >>> create mode 100644 Documentation/devicetree/bindings/depends-on.txt >>> >> >> The above issues make this specific implementation not acceptable. >> I like the analysis of the problem areas, and I like the concepts of >> trying to solve not only probe ordering, but also the problem of >> when to turn off resources that will not be needed. > > Beating a dead horse here, but I want to make sure I get this into as > many minds as possible: > It is NOT just about turning off resources. It's about the kernel > taking full control of resources (allowing the full range of voltages, > clock frequencies, bus configurations, etc) and syncing the HW state > to the SW state as determined by the consumers. > >> But at the >> moment, I don't have a suggestion of a way to implement a solution. > > The problem of syncing resources to SW state after boot up completed > has been broken for a long time in the kernel. It's high time we fix > it. I'm open to other suggestions, but we can't just say "we don't > have a solution". > > For example, we can have a kernel command line argument that'll use > all common implicit bindings as mandatory dependencies and allow > "depends-on" to override them for the few cases where the implicit > dependencies don't match mandatory dependencies. Then: > - The kernel will be 100% backwards compatible with existing DT if the > command line arg isn't provided. > - New DT + old kernel will be no worse than today because old kernel > doesn't do any dependency tracking. > - Command line arg + new kernel + hardware where all implicit > dependencies are actually mandatory dependencies == things work > better. > - Command line arg + new kernel + hardware where all implicit > dependencies don't match mandatory dependencies + depends-on for those > exception case == things work better. Using a command line argument for this purpose just seems to be a hack and bad architecture. It also seems like an invitation to mis-configure a system (in other words, increases the complexity and difficulty of properly configuring and administering a system). The is not a hard no (yet), but it will take some convincing for me to accept the command line approach to add the feature, yet maintain compatibility. Please do not spend any time replying to this concern yet - we will have plenty of time to discuss later if need be. > > Would that be an acceptable use of "depends-on" to track mandatory dependencies? > > > > -Saravana > -Frank
On Fri, May 24, 2019 at 7:40 PM Frank Rowand <frowand.list@gmail.com> wrote: > > Hi Saranova, > > I'll try to address the other portions of this email that I <snipped> > in my previous replies. > > > On 5/24/19 2:53 PM, Saravana Kannan wrote: > > On Fri, May 24, 2019 at 10:49 AM Frank Rowand <frowand.list@gmail.com> wrote: > >> > >> On 5/23/19 6:01 PM, Saravana Kannan wrote: > >>> Add a generic "depends-on" property that allows specifying mandatory > >>> functional dependencies between devices. Add device-links after the > >>> devices are created (but before they are probed) by looking at this > >>> "depends-on" property. > >>> > >>> This property is used instead of existing DT properties that specify > >>> phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This > >>> is because not all resources referred to by existing DT properties are > >>> mandatory functional dependencies. Some devices/drivers might be able> to operate with reduced functionality when some of the resources > >>> aren't available. For example, a device could operate in polling mode > >>> if no IRQ is available, a device could skip doing power management if > >>> clock or voltage control isn't available and they are left on, etc. > >>> > >>> So, adding mandatory functional dependency links between devices by > >>> looking at referred phandles in DT properties won't work as it would > >>> prevent probing devices that could be probed. By having an explicit > >>> depends-on property, we can handle these cases correctly. > >> > >> Trying to wrap my brain around the concept, this series seems to be > >> adding the ability to declare that an apparent dependency (eg an IRQ > >> specified by a phandle) is _not_ actually a dependency. > > > > The current implementation completely ignores existing bindings for > > dependencies and so does the current tip of the kernel. So it's not > > really overriding anything. However, if I change the implementation so > > that depends-on becomes the source of truth if it exists and falls > > back to existing common bindings if "depends-on" isn't present -- then > > depends-on would truly be overriding existing bindings for > > dependencies. It depends on how we want to define the DT property. > > > >> The phandle already implies the dependency. > > > > Sure, it might imply, but it's not always true. > > > >> Creating a separate > >> depends-on property provides a method of ignoring the implied > >> dependencies. > > > > implied != true > > > >> This is not just hardware description. It is instead a combination > >> of hardware functionality and driver functionality. An example > >> provided in the second paragraph of the email I am replying to > >> suggests a device could operate in polling mode if no IRQ is > >> available. Using this example, the devicetree does not know > >> whether the driver requires the IRQ (currently an implied > >> dependency since the IRQ phandle exists). My understanding > >> of this example is that the device node would _not_ have a > >> depends-on property for the IRQ phandle so the IRQ would be > >> optional. But this is an attribute of the driver, not the > >> hardware. > > > > > Not really. The interrupt could be for "SD card plugged in". That's > > never a mandatory dependency for the SD card controller to work. So > > the IRQ provider won't be a "depends-on" in this case. But if there is > > no power supply or clock for the SD card controller, it isn't going to > > work -- so they'd be listed in the "depends-on". So, this is still > > defining the hardware and not the OS. > > Please comment on my observation that was based on an IRQ for a device > will polling mode vs interrupt driven mode. > You described a different > case and did not address my comment. I thought I did reply -- not sure what part you are looking for so I'll rephrase. I was just picking the SD card controller as a concrete example of device that can work with or without an interrupt. But sure, I can call it "the device". And yes, the device won't have a "depends-on" on the IRQ provider because the device can still work without a working (as in bound to driver) IRQ provider. Whether the driver insists on waiting on an IRQ provider or not is up to the driver and the depends-on property is NOT trying to dictate what the driver should do in this case. Does that answer your implied question? > > >> This is also configuration, declaring whether the > >> system is willing to accept polling mode instead of interrupt > >> mode. > > > > Whether the driver will choose to operate without the IRQ is up to it. > > The OS could also assume the power supply is never turned off and > > still try to use the device. Depending on the hardware configuration, > > that might or might not work. > > > >> Devicetree is not the proper place for driver description or > >> for configuration. > > > > But depends-on isn't describing the driver configuration though. > > > > Overall, the clock provider example I gave in another reply is a much > > better example. If you just assume implied dependencies are mandatory > > dependencies, some devices will never be probe because the kernel is > > using them incorrectly (they aren't meant to list mandatory > > dependencies). > > > >> Another flaw with this method is that existing device trees > >> will be broken after the kernel is modified, because existing > >> device trees do not have the depends-on property. This breaks > >> the devicetree compatibility rules. > > > > This is 100% not true with the current implementation. I actually > > tested this. This is fully backwards compatible. That's another reason > > for adding depends-on and going by just what it says. The existing > > bindings were never meant to describe only mandatory dependencies. So > > using them as such is what would break backwards compatibility. > > > >>> Having functional dependencies explicitly called out in DT and > >>> automatically added before the devices are probed, provides the > >>> following benefits: > >>> > >>> - Optimizes device probe order and avoids the useless work of > >>> attempting probes of devices that will not probe successfully > >>> (because their suppliers aren't present or haven't probed yet). > >>> > >>> For example, in a commonly available mobile SoC, registering just > >>> one consumer device's driver at an initcall level earlier than the > >>> supplier device's driver causes 11 failed probe attempts before the > >>> consumer device probes successfully. This was with a kernel with all > >>> the drivers statically compiled in. This problem gets a lot worse if > >>> all the drivers are loaded as modules without direct symbol > >>> dependencies. > >>> > >>> - Supplier devices like clock providers, regulators providers, etc > >>> need to keep the resources they provide active and at a particular > >>> state(s) during boot up even if their current set of consumers don't > >>> request the resource to be active. This is because the rest of the > >>> consumers might not have probed yet and turning off the resource > >>> before all the consumers have probed could lead to a hang or > >>> undesired user experience. > >>> > >>> Some frameworks (Eg: regulator) handle this today by turning off > >>> "unused" resources at late_initcall_sync and hoping all the devices > >>> have probed by then. This is not a valid assumption for systems with > >>> loadable modules. Other frameworks (Eg: clock) just don't handle > >>> this due to the lack of a clear signal for when they can turn off > >>> resources. This leads to downstream hacks to handle cases like this > >>> that can easily be solved in the upstream kernel. > >>> > >>> By linking devices before they are probed, we give suppliers a clear > >> > >> By linking devices to suppliers before they are probed, we give suppliers a clear > > > > Ack > > > >>> count of the number of dependent consumers. Once all of the > >>> consumers are active, the suppliers can turn off the unused > >>> resources without making assumptions about the number of consumers. > >>> > >>> By default we just add device-links to track "driver presence" (probe > >>> succeeded) of the supplier device. If any other functionality provided > >>> by device-links are needed, it is left to the consumer/supplier > >>> devices to change the link when they probe. > >>> > >>> > >>> Saravana Kannan (5): > >>> of/platform: Speed up of_find_device_by_node() > >>> driver core: Add device links support for pending links to suppliers > >>> dt-bindings: Add depends-on property > >>> of/platform: Add functional dependency link from "depends-on" property > >>> driver core: Add sync_state driver/bus callback > >>> > >>> .../devicetree/bindings/depends-on.txt | 26 +++++ > >>> drivers/base/core.c | 106 ++++++++++++++++++ > >>> drivers/of/platform.c | 75 ++++++++++++- > >>> include/linux/device.h | 24 ++++ > >>> include/linux/of.h | 3 + > >>> 5 files changed, 233 insertions(+), 1 deletion(-) > >>> create mode 100644 Documentation/devicetree/bindings/depends-on.txt > >>> > >> > >> The above issues make this specific implementation not acceptable. > >> I like the analysis of the problem areas, and I like the concepts of > >> trying to solve not only probe ordering, but also the problem of > >> when to turn off resources that will not be needed. > > > > Beating a dead horse here, but I want to make sure I get this into as > > many minds as possible: > > It is NOT just about turning off resources. It's about the kernel > > taking full control of resources (allowing the full range of voltages, > > clock frequencies, bus configurations, etc) and syncing the HW state > > to the SW state as determined by the consumers. > > > >> But at the > >> moment, I don't have a suggestion of a way to implement a solution. > > > > The problem of syncing resources to SW state after boot up completed > > has been broken for a long time in the kernel. It's high time we fix > > it. I'm open to other suggestions, but we can't just say "we don't > > have a solution". > > > > For example, we can have a kernel command line argument that'll use > > all common implicit bindings as mandatory dependencies and allow > > "depends-on" to override them for the few cases where the implicit > > dependencies don't match mandatory dependencies. Then: > > - The kernel will be 100% backwards compatible with existing DT if the > > command line arg isn't provided. > > - New DT + old kernel will be no worse than today because old kernel > > doesn't do any dependency tracking. > > - Command line arg + new kernel + hardware where all implicit > > dependencies are actually mandatory dependencies == things work > > better. > > - Command line arg + new kernel + hardware where all implicit > > dependencies don't match mandatory dependencies + depends-on for those > > exception case == things work better. > > Using a command line argument for this purpose just seems to be a > hack and bad architecture. I agree -- which is why the current implementation doesn't try to form mandatory dependency from implicit bindings and makes the case that all mandatory dependencies should be called out explicitly. But if people insist on that implicit bindings be used as such, then a CONFIG or commandline option would be an acceptable compromise for me. > It also seems like an invitation to mis-configure a system (in other > words, increases the complexity and difficulty of properly configuring > and administering a system). Just want to point out that, as of today, we have a broken system -- module loading is not compatible with proper handling of shared mandatory resources during boot up. To be clear, I'm not arguing for a commandline arg. > The is not a hard no (yet), but it will take some convincing for me > to accept the command line approach to add the feature, yet maintain > compatibility. Please do not spend any time replying to this concern > yet - we will have plenty of time to discuss later if need be. Sounds good. -Saravana
Sending again because email client somehow reverted to HTML.
Frank, Rob, Mark,
Gentle reminder. I've replied to your emails spread across the
different patches in the series. Hoping they address your questions
and concerns. Please let me know what you think.
Thanks,
Saravana
On Wed, May 29, 2019 at 1:00 PM Saravana Kannan <saravanak@google.com> wrote:
>
> Frank, Rob, Mark,
>
> Gentle reminder. I've replied to your emails spread across the different patches in the series. Hoping they address your questions and concerns. Please let me know what you think.
>
> Thanks,
> Saravana
>
> On Thu, May 23, 2019 at 6:01 PM Saravana Kannan <saravanak@google.com> wrote:
>>
>> Add a generic "depends-on" property that allows specifying mandatory
>> functional dependencies between devices. Add device-links after the
>> devices are created (but before they are probed) by looking at this
>> "depends-on" property.
>>
>> This property is used instead of existing DT properties that specify
>> phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This
>> is because not all resources referred to by existing DT properties are
>> mandatory functional dependencies. Some devices/drivers might be able
>> to operate with reduced functionality when some of the resources
>> aren't available. For example, a device could operate in polling mode
>> if no IRQ is available, a device could skip doing power management if
>> clock or voltage control isn't available and they are left on, etc.
>>
>> So, adding mandatory functional dependency links between devices by
>> looking at referred phandles in DT properties won't work as it would
>> prevent probing devices that could be probed. By having an explicit
>> depends-on property, we can handle these cases correctly.
>>
>> Having functional dependencies explicitly called out in DT and
>> automatically added before the devices are probed, provides the
>> following benefits:
>>
>> - Optimizes device probe order and avoids the useless work of
>> attempting probes of devices that will not probe successfully
>> (because their suppliers aren't present or haven't probed yet).
>>
>> For example, in a commonly available mobile SoC, registering just
>> one consumer device's driver at an initcall level earlier than the
>> supplier device's driver causes 11 failed probe attempts before the
>> consumer device probes successfully. This was with a kernel with all
>> the drivers statically compiled in. This problem gets a lot worse if
>> all the drivers are loaded as modules without direct symbol
>> dependencies.
>>
>> - Supplier devices like clock providers, regulators providers, etc
>> need to keep the resources they provide active and at a particular
>> state(s) during boot up even if their current set of consumers don't
>> request the resource to be active. This is because the rest of the
>> consumers might not have probed yet and turning off the resource
>> before all the consumers have probed could lead to a hang or
>> undesired user experience.
>>
>> Some frameworks (Eg: regulator) handle this today by turning off
>> "unused" resources at late_initcall_sync and hoping all the devices
>> have probed by then. This is not a valid assumption for systems with
>> loadable modules. Other frameworks (Eg: clock) just don't handle
>> this due to the lack of a clear signal for when they can turn off
>> resources. This leads to downstream hacks to handle cases like this
>> that can easily be solved in the upstream kernel.
>>
>> By linking devices before they are probed, we give suppliers a clear
>> count of the number of dependent consumers. Once all of the
>> consumers are active, the suppliers can turn off the unused
>> resources without making assumptions about the number of consumers.
>>
>> By default we just add device-links to track "driver presence" (probe
>> succeeded) of the supplier device. If any other functionality provided
>> by device-links are needed, it is left to the consumer/supplier
>> devices to change the link when they probe.
>>
>>
>> Saravana Kannan (5):
>> of/platform: Speed up of_find_device_by_node()
>> driver core: Add device links support for pending links to suppliers
>> dt-bindings: Add depends-on property
>> of/platform: Add functional dependency link from "depends-on" property
>> driver core: Add sync_state driver/bus callback
>>
>> .../devicetree/bindings/depends-on.txt | 26 +++++
>> drivers/base/core.c | 106 ++++++++++++++++++
>> drivers/of/platform.c | 75 ++++++++++++-
>> include/linux/device.h | 24 ++++
>> include/linux/of.h | 3 +
>> 5 files changed, 233 insertions(+), 1 deletion(-)
>> create mode 100644 Documentation/devicetree/bindings/depends-on.txt
>>
>> --
>> 2.22.0.rc1.257.g3120a18244-goog
>>
Hello Saravana, On 5/23/19 6:01 PM, Saravana Kannan wrote: ... > Having functional dependencies explicitly called out in DT and > automatically added before the devices are probed, provides the > following benefits: ... > - Supplier devices like clock providers, regulators providers, etc > need to keep the resources they provide active and at a particular > state(s) during boot up even if their current set of consumers don't > request the resource to be active. This is because the rest of the > consumers might not have probed yet and turning off the resource > before all the consumers have probed could lead to a hang or > undesired user experience. This benefit provided by the sync_state() callback function introduced in this series gives us a mechanism to solve a specific problem encountered on Qualcomm Technologies, Inc. (QTI) boards when booting with drivers compiled as modules. QTI boards have a regulator that powers the PHYs for display, camera, USB, UFS, and PCIe. When these boards boot up, the boot loader enables this regulator along with other resources in order to display a splash screen image. The regulator must remain enabled until the Linux display driver has probed and made a request with the regulator framework to keep the regulator enabled. If the regulator is disabled prematurely, then the screen image is corrupted and the display hardware enters a bad state. We have observed that when the camera driver probes before the display driver, it performs this sequence: regulator_enable(), camera register IO, regulator_disable(). Since it is the first consumer of the shared regulator, the regulator is physically disabled (even though display hardware still requires it to be enabled). We have solved this problem when compiling drivers statically with a downstream regulator proxy-consumer driver. This proxy-consumer is able to make an enable request for the shared regulator before any other consumer. It then removes its request at late_initcall_sync. Unfortunately, when drivers are compiled as modules instead of compiled statically into the kernel image, late_initcall_sync is not a meaningful marker of driver probe completion. This means that our existing proxy voting system will not work when drivers are compiled as modules. The sync_state() callback resolves this issue by providing a notification that is guaranteed to arrive only after all consumers of the shared regulator have probed. QTI boards have other cases of shared resources such as bus bandwidth which must remain at least at a level set by boot loaders in order to properly support hardware blocks that are enabled before the Linux kernel starts booting. Take care, David -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
Hi Saravana,
(I notice that I never seem to spell your name correctly. Apologies for that,
both past and future).
This email was never answered.
-Frank
On 5/24/19 5:12 PM, Frank Rowand wrote:
> On 5/24/19 11:21 AM, Saravana Kannan wrote:
>> On Fri, May 24, 2019 at 10:56 AM Frank Rowand <frowand.list@gmail.com> wrote:
>>>
>>> Hi Sarvana,
>>>
>>> I'm not reviewing patches 1-5 in any detail, given my reply to patch 0.
>>>
>>> But I had already skimmed through this patch before I received the
>>> email for patch 0, so I want to make one generic comment below,
>>> to give some feedback as you continue thinking through possible
>>> implementations to solve the underlying problems.
>>
>> Appreciate the feedback Frank!
>>
>>>
>>>
>>> On 5/23/19 6:01 PM, Saravana Kannan wrote:
>>>> Add a pointer from device tree node to the device created from it.
>>>> This allows us to find the device corresponding to a device tree node
>>>> without having to loop through all the platform devices.
>>>>
>>>> However, fallback to looping through the platform devices to handle
>>>> any devices that might set their own of_node.
>>>>
>>>> Signed-off-by: Saravana Kannan <saravanak@google.com>
>>>> ---
>>>> drivers/of/platform.c | 20 +++++++++++++++++++-
>>>> include/linux/of.h | 3 +++
>>>> 2 files changed, 22 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
>>>> index 04ad312fd85b..1115a8d80a33 100644
>>>> --- a/drivers/of/platform.c
>>>> +++ b/drivers/of/platform.c
>>>> @@ -42,6 +42,8 @@ static int of_dev_node_match(struct device *dev, void *data)
>>>> return dev->of_node == data;
>>>> }
>>>>
>>>> +static DEFINE_SPINLOCK(of_dev_lock);
>>>> +
>>>> /**
>>>> * of_find_device_by_node - Find the platform_device associated with a node
>>>> * @np: Pointer to device tree node
>>>> @@ -55,7 +57,18 @@ struct platform_device *of_find_device_by_node(struct device_node *np)
>>>> {
>>>> struct device *dev;
>>>>
>>>> - dev = bus_find_device(&platform_bus_type, NULL, np, of_dev_node_match);
>>>> + /*
>>>> + * Spinlock needed to make sure np->dev doesn't get freed between NULL
>>>> + * check inside and kref count increment inside get_device(). This is
>>>> + * achieved by grabbing the spinlock before setting np->dev = NULL in
>>>> + * of_platform_device_destroy().
>>>> + */
>>>> + spin_lock(&of_dev_lock);
>>>> + dev = get_device(np->dev);
>>>> + spin_unlock(&of_dev_lock);
>>>> + if (!dev)
>>>> + dev = bus_find_device(&platform_bus_type, NULL, np,
>>>> + of_dev_node_match);
>>>> return dev ? to_platform_device(dev) : NULL;
>>>> }
>>>> EXPORT_SYMBOL(of_find_device_by_node);
>>>> @@ -196,6 +209,7 @@ static struct platform_device *of_platform_device_create_pdata(
>>>> platform_device_put(dev);
>>>> goto err_clear_flag;
>>>> }
>>>> + np->dev = &dev->dev;
>>>>
>>>> return dev;
>>>>
>>>> @@ -556,6 +570,10 @@ int of_platform_device_destroy(struct device *dev, void *data)
>>>> if (of_node_check_flag(dev->of_node, OF_POPULATED_BUS))
>>>> device_for_each_child(dev, NULL, of_platform_device_destroy);
>>>>
>>>> + /* Spinlock is needed for of_find_device_by_node() to work */
>>>> + spin_lock(&of_dev_lock);
>>>> + dev->of_node->dev = NULL;
>>>> + spin_unlock(&of_dev_lock);
>>>> of_node_clear_flag(dev->of_node, OF_POPULATED);
>>>> of_node_clear_flag(dev->of_node, OF_POPULATED_BUS);
>>>>
>>>> diff --git a/include/linux/of.h b/include/linux/of.h
>>>> index 0cf857012f11..f2b4912cbca1 100644
>>>> --- a/include/linux/of.h
>>>> +++ b/include/linux/of.h
>>>> @@ -48,6 +48,8 @@ struct property {
>>>> struct of_irq_controller;
>>>> #endif
>>>>
>>>> +struct device;
>>>> +
>>>> struct device_node {
>>>> const char *name;
>>>> phandle phandle;
>>>> @@ -68,6 +70,7 @@ struct device_node {
>>>> unsigned int unique_id;
>>>> struct of_irq_controller *irq_trans;
>>>> #endif
>>>> + struct device *dev; /* Device created from this node */
>>>
>>> We have actively been working on shrinking the size of struct device_node,
>>> as part of reducing the devicetree memory usage. As such, we need strong
>>> justification for adding anything to this struct. For example, proof that
>>> there is a performance problem that can only be solved by increasing the
>>> memory usage.
>>
>> I didn't mean for people to focus on the deferred probe optimization.
>
> I was speaking specifically of the of_find_device_by_node() optimization.
> I did not chase any further back in the call chain to see how that would
> impact anything else. My comments stand, whether this patch is meant
> to optimize deferred probe optimization or to optimize something else.
>
>
>> In reality that was just a added side benefit of this series. The main
>> problem to solve is that of suppliers having to know when all their
>> consumers are up and managing the resources actively, especially in a
>> system with loadable modules where we can't depend on the driver to
>> notify the supplier because the consumer driver module might not be
>> available or loaded until much later.
>>
>> Having said that, I'm not saying we should go around and waste space
>> willy-nilly. But, isn't the memory usage going to increase based on
>> the number of DT nodes present in DT? I'd think as the number of DT
>> nodes increase it's more likely for those devices have more memory? So
>> at least in this specific case I think adding the field is justified.
>
> Struct device_node is the nodes of the in kernel devicetree data. This
> patch adds a field to every single node of the devicetree.
>
> The patch series is also adding a new property, of varying size, to
> some nodes. This also results in additional memory usage by
> devicetree.
>
> Arguing that a more complex system is likely to have more memory is
> likely true, but beside the point. Minimizing devicetree memory
> used on less complex systems is also one of our goals.
>
>
>> Also, right now the look up is O(n) complexity and if we are trying to
>> add device links to most of the devices, that whole process becomes
>> O(n^2). Having this field makes the look up a O(1) and the entire
>> linking process a O(n) process. I think the memory usage increase is
>> worth the efficiency improvement.
>
> Hand waving about O(n) and O(1) and O(n2) is not sufficient. We require
> actual measurements that show O(n2) (when the existing algorithm is such)
> is a performance problem and that the proposed change to the algorithm
> results in a specific change in the performance.
>
> The devicetree maintainers have decided that memory use is important and
> to be minimized, and the burden of proof that performance is an issue
> lies on the submitter of a patch that improves performance at the cost
> of memory.
>
> Most devicetree boot time cpu overhead only affects the boot event.
> Added memory use persists for the entire booted lifetime of the system.
>
> That is not to say that we never increase memory use to improve boot
> performance. We have done so when the measured performance issue and
> measured performance improvement justified the change.
>
>>
>> And if people are still strongly against it, we could make this a config option.
>>
>> -Saravana
>>
>
>
Hi Saravana, On 5/24/19 9:04 PM, Saravana Kannan wrote: > On Fri, May 24, 2019 at 7:40 PM Frank Rowand <frowand.list@gmail.com> wrote: >> >> Hi Saranova, >> >> I'll try to address the other portions of this email that I <snipped> >> in my previous replies. >> >> >> On 5/24/19 2:53 PM, Saravana Kannan wrote: >>> On Fri, May 24, 2019 at 10:49 AM Frank Rowand <frowand.list@gmail.com> wrote: >>>> >>>> On 5/23/19 6:01 PM, Saravana Kannan wrote: >>>>> Add a generic "depends-on" property that allows specifying mandatory >>>>> functional dependencies between devices. Add device-links after the >>>>> devices are created (but before they are probed) by looking at this >>>>> "depends-on" property. >>>>> >>>>> This property is used instead of existing DT properties that specify >>>>> phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This >>>>> is because not all resources referred to by existing DT properties are >>>>> mandatory functional dependencies. Some devices/drivers might be able>>>>> to operate with reduced functionality when some of the resources In your original email, you say this: >>>>> aren't available. For example, a device could operate in polling mode >>>>> if no IRQ is available, a device could skip doing power management if >>>>> clock or voltage control isn't available and they are left on, etc. >>>>> >>>>> So, adding mandatory functional dependency links between devices by >>>>> looking at referred phandles in DT properties won't work as it would >>>>> prevent probing devices that could be probed. By having an explicit >>>>> depends-on property, we can handle these cases correctly. >>>> >>>> Trying to wrap my brain around the concept, this series seems to be >>>> adding the ability to declare that an apparent dependency (eg an IRQ >>>> specified by a phandle) is _not_ actually a dependency. >>> >>> The current implementation completely ignores existing bindings for >>> dependencies and so does the current tip of the kernel. So it's not >>> really overriding anything. However, if I change the implementation so >>> that depends-on becomes the source of truth if it exists and falls >>> back to existing common bindings if "depends-on" isn't present -- then >>> depends-on would truly be overriding existing bindings for >>> dependencies. It depends on how we want to define the DT property. >>> >>>> The phandle already implies the dependency. >>> >>> Sure, it might imply, but it's not always true. >>> >>>> Creating a separate >>>> depends-on property provides a method of ignoring the implied >>>> dependencies. >>> >>> implied != true >>> I refer to your irq mode vs polled mode device example: >>>> This is not just hardware description. It is instead a combination >>>> of hardware functionality and driver functionality. An example >>>> provided in the second paragraph of the email I am replying to >>>> suggests a device could operate in polling mode if no IRQ is >>>> available. Using this example, the devicetree does not know >>>> whether the driver requires the IRQ (currently an implied >>>> dependency since the IRQ phandle exists). My understanding >>>> of this example is that the device node would _not_ have a >>>> depends-on property for the IRQ phandle so the IRQ would be >>>> optional. But this is an attribute of the driver, not the >>>> hardware. >>> >> You change the subject from irq mode vs polled mode device to some other type of device: >>> Not really. The interrupt could be for "SD card plugged in". That's >>> never a mandatory dependency for the SD card controller to work. So >>> the IRQ provider won't be a "depends-on" in this case. But if there is >>> no power supply or clock for the SD card controller, it isn't going to >>> work -- so they'd be listed in the "depends-on". So, this is still >>> defining the hardware and not the OS. >> I again try to get you to discuss the irq mode vs polled mode device: >> Please comment on my observation that was based on an IRQ for a device >> will polling mode vs interrupt driven mode. >> You described a different >> case and did not address my comment. > > I thought I did reply -- not sure what part you are looking for so > I'll rephrase. I was just picking the SD card controller as a concrete > example of device that can work with or without an interrupt. But > sure, I can call it "the device". > And the thread is so deeply nested that you are missing the original point that I made. > And yes, the device won't have a "depends-on" on the IRQ provider > because the device can still work without a working (as in bound to > driver) IRQ provider. Whether the driver insists on waiting on an IRQ > provider or not is up to the driver and the depends-on property is NOT > trying to dictate what the driver should do in this case. Does that > answer your implied question? If the device _must_ operate in irq mode to achieve the throughput that is _required_ for the system to be functional then that system would need a devicetree to have a "depends-on" property for the irq. But another system using the same exact hardware might be able to tolerate the reduced throughput of operating in polled mode. This second system would need a devicetree that does _not_ have a depends-on property for that same irq, as used by that same device. This then becomes configuration, not hardware description, as noted in my next paragraph: > >> >>>> This is also configuration, declaring whether the >>>> system is willing to accept polling mode instead of interrupt >>>> mode. -Frank >>> >>> Whether the driver will choose to operate without the IRQ is up to it. >>> The OS could also assume the power supply is never turned off and >>> still try to use the device. Depending on the hardware configuration, >>> that might or might not work. >>> >>>> Devicetree is not the proper place for driver description or >>>> for configuration. >>> >>> But depends-on isn't describing the driver configuration though. >>> >>> Overall, the clock provider example I gave in another reply is a much >>> better example. If you just assume implied dependencies are mandatory >>> dependencies, some devices will never be probe because the kernel is >>> using them incorrectly (they aren't meant to list mandatory >>> dependencies). >>> >>>> Another flaw with this method is that existing device trees >>>> will be broken after the kernel is modified, because existing >>>> device trees do not have the depends-on property. This breaks >>>> the devicetree compatibility rules. >>> >>> This is 100% not true with the current implementation. I actually >>> tested this. This is fully backwards compatible. That's another reason >>> for adding depends-on and going by just what it says. The existing >>> bindings were never meant to describe only mandatory dependencies. So >>> using them as such is what would break backwards compatibility. >>> >>>>> Having functional dependencies explicitly called out in DT and >>>>> automatically added before the devices are probed, provides the >>>>> following benefits: >>>>> >>>>> - Optimizes device probe order and avoids the useless work of >>>>> attempting probes of devices that will not probe successfully >>>>> (because their suppliers aren't present or haven't probed yet). >>>>> >>>>> For example, in a commonly available mobile SoC, registering just >>>>> one consumer device's driver at an initcall level earlier than the >>>>> supplier device's driver causes 11 failed probe attempts before the >>>>> consumer device probes successfully. This was with a kernel with all >>>>> the drivers statically compiled in. This problem gets a lot worse if >>>>> all the drivers are loaded as modules without direct symbol >>>>> dependencies. >>>>> >>>>> - Supplier devices like clock providers, regulators providers, etc >>>>> need to keep the resources they provide active and at a particular >>>>> state(s) during boot up even if their current set of consumers don't >>>>> request the resource to be active. This is because the rest of the >>>>> consumers might not have probed yet and turning off the resource >>>>> before all the consumers have probed could lead to a hang or >>>>> undesired user experience. >>>>> >>>>> Some frameworks (Eg: regulator) handle this today by turning off >>>>> "unused" resources at late_initcall_sync and hoping all the devices >>>>> have probed by then. This is not a valid assumption for systems with >>>>> loadable modules. Other frameworks (Eg: clock) just don't handle >>>>> this due to the lack of a clear signal for when they can turn off >>>>> resources. This leads to downstream hacks to handle cases like this >>>>> that can easily be solved in the upstream kernel. >>>>> >>>>> By linking devices before they are probed, we give suppliers a clear >>>> >>>> By linking devices to suppliers before they are probed, we give suppliers a clear >>> >>> Ack >>> >>>>> count of the number of dependent consumers. Once all of the >>>>> consumers are active, the suppliers can turn off the unused >>>>> resources without making assumptions about the number of consumers. >>>>> >>>>> By default we just add device-links to track "driver presence" (probe >>>>> succeeded) of the supplier device. If any other functionality provided >>>>> by device-links are needed, it is left to the consumer/supplier >>>>> devices to change the link when they probe. >>>>> >>>>> >>>>> Saravana Kannan (5): >>>>> of/platform: Speed up of_find_device_by_node() >>>>> driver core: Add device links support for pending links to suppliers >>>>> dt-bindings: Add depends-on property >>>>> of/platform: Add functional dependency link from "depends-on" property >>>>> driver core: Add sync_state driver/bus callback >>>>> >>>>> .../devicetree/bindings/depends-on.txt | 26 +++++ >>>>> drivers/base/core.c | 106 ++++++++++++++++++ >>>>> drivers/of/platform.c | 75 ++++++++++++- >>>>> include/linux/device.h | 24 ++++ >>>>> include/linux/of.h | 3 + >>>>> 5 files changed, 233 insertions(+), 1 deletion(-) >>>>> create mode 100644 Documentation/devicetree/bindings/depends-on.txt >>>>> >>>> >>>> The above issues make this specific implementation not acceptable. >>>> I like the analysis of the problem areas, and I like the concepts of >>>> trying to solve not only probe ordering, but also the problem of >>>> when to turn off resources that will not be needed. >>> >>> Beating a dead horse here, but I want to make sure I get this into as >>> many minds as possible: >>> It is NOT just about turning off resources. It's about the kernel >>> taking full control of resources (allowing the full range of voltages, >>> clock frequencies, bus configurations, etc) and syncing the HW state >>> to the SW state as determined by the consumers. >>> >>>> But at the >>>> moment, I don't have a suggestion of a way to implement a solution. >>> >>> The problem of syncing resources to SW state after boot up completed >>> has been broken for a long time in the kernel. It's high time we fix >>> it. I'm open to other suggestions, but we can't just say "we don't >>> have a solution". >>> >>> For example, we can have a kernel command line argument that'll use >>> all common implicit bindings as mandatory dependencies and allow >>> "depends-on" to override them for the few cases where the implicit >>> dependencies don't match mandatory dependencies. Then: >>> - The kernel will be 100% backwards compatible with existing DT if the >>> command line arg isn't provided. >>> - New DT + old kernel will be no worse than today because old kernel >>> doesn't do any dependency tracking. >>> - Command line arg + new kernel + hardware where all implicit >>> dependencies are actually mandatory dependencies == things work >>> better. >>> - Command line arg + new kernel + hardware where all implicit >>> dependencies don't match mandatory dependencies + depends-on for those >>> exception case == things work better. >> >> Using a command line argument for this purpose just seems to be a >> hack and bad architecture. > > I agree -- which is why the current implementation doesn't try to form > mandatory dependency from implicit bindings and makes the case that > all mandatory dependencies should be called out explicitly. But if > people insist on that implicit bindings be used as such, then a CONFIG > or commandline option would be an acceptable compromise for me. > >> It also seems like an invitation to mis-configure a system (in other >> words, increases the complexity and difficulty of properly configuring >> and administering a system). > > Just want to point out that, as of today, we have a broken system -- > module loading is not compatible with proper handling of shared > mandatory resources during boot up. To be clear, I'm not arguing for a > commandline arg. > >> The is not a hard no (yet), but it will take some convincing for me >> to accept the command line approach to add the feature, yet maintain >> compatibility. Please do not spend any time replying to this concern >> yet - we will have plenty of time to discuss later if need be. > > Sounds good. > > -Saravana >
On Tue, Jun 11, 2019 at 7:51 AM Frank Rowand <frowand.list@gmail.com> wrote: > > Hi Saravana, > > (I notice that I never seem to spell your name correctly. Apologies for that, > both past and future). Thanks for noticing :) One trick that might help with remembering my name is that every other letter is an "a" :) > > This email was never answered. Yeah, because this patch wasn't central to the functionality. At worse case, I drop the patch and modules would still work. While waiting for responses for the other emails -- I was working on measuring the speed up. If I just took the clock bindings (in a final solution we'll have to look up clocks, resets, regulators, interrupts, etc) in a SDM845 and made them into device links, that resulted in ~500+ searches for devices by their nodes. Looking at all 500+ of the lookups: Total time: With patch, : 2 milliseconds Without patch: 12 milliseconds Worst case single look up: With patch: 39 microseconds Without patch: 250 microseconds Median of lookup times: With patch: 208 nanoseconds Without patch: 23 microseconds Even if I assume there are 1000 device nodes, the increase in memory from one additional pointer this patch adds is ~8KB. 2GB is the low ball on the amount of memory available in a typical SDM845 device. Boot time to userspace on the device Iooked at is: 1149 ms So total boot time reduction is 0.8% Total memory increase is 0.00038% Once more device links are added I expect the boot time impact to be larger. I think a 1% reduction in boot time for 0.00038% increase in memory usage is a good trade off. That 1% faster boot time can also be approximated to 1% reduction in boot up power usage. That scaled to a million or billion devices that'll run an Android or some form of ARM Linux kernel is still a good chunk of energy savings for a small memory increase. I still stand by the usefulness of this patch, but this is not the hill I'm going to die on (so if dropping this patch is what it takes to get modules working, I'll drop it for now). -Saravana > > -Frank > > > On 5/24/19 5:12 PM, Frank Rowand wrote: > > On 5/24/19 11:21 AM, Saravana Kannan wrote: > >> On Fri, May 24, 2019 at 10:56 AM Frank Rowand <frowand.list@gmail.com> wrote: > >>> > >>> Hi Sarvana, > >>> > >>> I'm not reviewing patches 1-5 in any detail, given my reply to patch 0. > >>> > >>> But I had already skimmed through this patch before I received the > >>> email for patch 0, so I want to make one generic comment below, > >>> to give some feedback as you continue thinking through possible > >>> implementations to solve the underlying problems. > >> > >> Appreciate the feedback Frank! > >> > >>> > >>> > >>> On 5/23/19 6:01 PM, Saravana Kannan wrote: > >>>> Add a pointer from device tree node to the device created from it. > >>>> This allows us to find the device corresponding to a device tree node > >>>> without having to loop through all the platform devices. > >>>> > >>>> However, fallback to looping through the platform devices to handle > >>>> any devices that might set their own of_node. > >>>> > >>>> Signed-off-by: Saravana Kannan <saravanak@google.com> > >>>> --- > >>>> drivers/of/platform.c | 20 +++++++++++++++++++- > >>>> include/linux/of.h | 3 +++ > >>>> 2 files changed, 22 insertions(+), 1 deletion(-) > >>>> > >>>> diff --git a/drivers/of/platform.c b/drivers/of/platform.c > >>>> index 04ad312fd85b..1115a8d80a33 100644 > >>>> --- a/drivers/of/platform.c > >>>> +++ b/drivers/of/platform.c > >>>> @@ -42,6 +42,8 @@ static int of_dev_node_match(struct device *dev, void *data) > >>>> return dev->of_node == data; > >>>> } > >>>> > >>>> +static DEFINE_SPINLOCK(of_dev_lock); > >>>> + > >>>> /** > >>>> * of_find_device_by_node - Find the platform_device associated with a node > >>>> * @np: Pointer to device tree node > >>>> @@ -55,7 +57,18 @@ struct platform_device *of_find_device_by_node(struct device_node *np) > >>>> { > >>>> struct device *dev; > >>>> > >>>> - dev = bus_find_device(&platform_bus_type, NULL, np, of_dev_node_match); > >>>> + /* > >>>> + * Spinlock needed to make sure np->dev doesn't get freed between NULL > >>>> + * check inside and kref count increment inside get_device(). This is > >>>> + * achieved by grabbing the spinlock before setting np->dev = NULL in > >>>> + * of_platform_device_destroy(). > >>>> + */ > >>>> + spin_lock(&of_dev_lock); > >>>> + dev = get_device(np->dev); > >>>> + spin_unlock(&of_dev_lock); > >>>> + if (!dev) > >>>> + dev = bus_find_device(&platform_bus_type, NULL, np, > >>>> + of_dev_node_match); > >>>> return dev ? to_platform_device(dev) : NULL; > >>>> } > >>>> EXPORT_SYMBOL(of_find_device_by_node); > >>>> @@ -196,6 +209,7 @@ static struct platform_device *of_platform_device_create_pdata( > >>>> platform_device_put(dev); > >>>> goto err_clear_flag; > >>>> } > >>>> + np->dev = &dev->dev; > >>>> > >>>> return dev; > >>>> > >>>> @@ -556,6 +570,10 @@ int of_platform_device_destroy(struct device *dev, void *data) > >>>> if (of_node_check_flag(dev->of_node, OF_POPULATED_BUS)) > >>>> device_for_each_child(dev, NULL, of_platform_device_destroy); > >>>> > >>>> + /* Spinlock is needed for of_find_device_by_node() to work */ > >>>> + spin_lock(&of_dev_lock); > >>>> + dev->of_node->dev = NULL; > >>>> + spin_unlock(&of_dev_lock); > >>>> of_node_clear_flag(dev->of_node, OF_POPULATED); > >>>> of_node_clear_flag(dev->of_node, OF_POPULATED_BUS); > >>>> > >>>> diff --git a/include/linux/of.h b/include/linux/of.h > >>>> index 0cf857012f11..f2b4912cbca1 100644 > >>>> --- a/include/linux/of.h > >>>> +++ b/include/linux/of.h > >>>> @@ -48,6 +48,8 @@ struct property { > >>>> struct of_irq_controller; > >>>> #endif > >>>> > >>>> +struct device; > >>>> + > >>>> struct device_node { > >>>> const char *name; > >>>> phandle phandle; > >>>> @@ -68,6 +70,7 @@ struct device_node { > >>>> unsigned int unique_id; > >>>> struct of_irq_controller *irq_trans; > >>>> #endif > >>>> + struct device *dev; /* Device created from this node */ > >>> > >>> We have actively been working on shrinking the size of struct device_node, > >>> as part of reducing the devicetree memory usage. As such, we need strong > >>> justification for adding anything to this struct. For example, proof that > >>> there is a performance problem that can only be solved by increasing the > >>> memory usage. > >> > >> I didn't mean for people to focus on the deferred probe optimization. > > > > I was speaking specifically of the of_find_device_by_node() optimization. > > I did not chase any further back in the call chain to see how that would > > impact anything else. My comments stand, whether this patch is meant > > to optimize deferred probe optimization or to optimize something else. > > > > > >> In reality that was just a added side benefit of this series. The main > >> problem to solve is that of suppliers having to know when all their > >> consumers are up and managing the resources actively, especially in a > >> system with loadable modules where we can't depend on the driver to > >> notify the supplier because the consumer driver module might not be > >> available or loaded until much later. > >> > >> Having said that, I'm not saying we should go around and waste space > >> willy-nilly. But, isn't the memory usage going to increase based on > >> the number of DT nodes present in DT? I'd think as the number of DT > >> nodes increase it's more likely for those devices have more memory? So > >> at least in this specific case I think adding the field is justified. > > > > Struct device_node is the nodes of the in kernel devicetree data. This > > patch adds a field to every single node of the devicetree. > > > > The patch series is also adding a new property, of varying size, to > > some nodes. This also results in additional memory usage by > > devicetree. > > > > Arguing that a more complex system is likely to have more memory is > > likely true, but beside the point. Minimizing devicetree memory > > used on less complex systems is also one of our goals. > > > > > >> Also, right now the look up is O(n) complexity and if we are trying to > >> add device links to most of the devices, that whole process becomes > >> O(n^2). Having this field makes the look up a O(1) and the entire > >> linking process a O(n) process. I think the memory usage increase is > >> worth the efficiency improvement. > > > > Hand waving about O(n) and O(1) and O(n2) is not sufficient. We require > > actual measurements that show O(n2) (when the existing algorithm is such) > > is a performance problem and that the proposed change to the algorithm > > results in a specific change in the performance. > > > > The devicetree maintainers have decided that memory use is important and > > to be minimized, and the burden of proof that performance is an issue > > lies on the submitter of a patch that improves performance at the cost > > of memory. > > > > Most devicetree boot time cpu overhead only affects the boot event. > > Added memory use persists for the entire booted lifetime of the system. > > > > That is not to say that we never increase memory use to improve boot > > performance. We have done so when the measured performance issue and > > measured performance improvement justified the change. > > > >> > >> And if people are still strongly against it, we could make this a config option. > >> > >> -Saravana > >> > > > > >
On Tue, Jun 11, 2019 at 7:56 AM Frank Rowand <frowand.list@gmail.com> wrote:
>
> Hi Saravana,
>
> On 5/24/19 9:04 PM, Saravana Kannan wrote:
> > On Fri, May 24, 2019 at 7:40 PM Frank Rowand <frowand.list@gmail.com> wrote:
> >>
> >> Hi Saranova,
> >>
> >> I'll try to address the other portions of this email that I <snipped>
> >> in my previous replies.
> >>
> >>
> >> On 5/24/19 2:53 PM, Saravana Kannan wrote:
> >>> On Fri, May 24, 2019 at 10:49 AM Frank Rowand <frowand.list@gmail.com> wrote:
> >>>>
> >>>> On 5/23/19 6:01 PM, Saravana Kannan wrote:
> >>>>> Add a generic "depends-on" property that allows specifying mandatory
> >>>>> functional dependencies between devices. Add device-links after the
> >>>>> devices are created (but before they are probed) by looking at this
> >>>>> "depends-on" property.
> >>>>>
> >>>>> This property is used instead of existing DT properties that specify
> >>>>> phandles of other devices (Eg: clocks, pinctrl, regulators, etc). This
> >>>>> is because not all resources referred to by existing DT properties are
> >>>>> mandatory functional dependencies. Some devices/drivers might be able>>>>> to operate with reduced functionality when some of the resources
>
>
> In your original email, you say this:
>
> >>>>> aren't available. For example, a device could operate in polling mode
> >>>>> if no IRQ is available, a device could skip doing power management if
> >>>>> clock or voltage control isn't available and they are left on, etc.
>
>
> >>>>>
> >>>>> So, adding mandatory functional dependency links between devices by
> >>>>> looking at referred phandles in DT properties won't work as it would
> >>>>> prevent probing devices that could be probed. By having an explicit
> >>>>> depends-on property, we can handle these cases correctly.
> >>>>
> >>>> Trying to wrap my brain around the concept, this series seems to be
> >>>> adding the ability to declare that an apparent dependency (eg an IRQ
> >>>> specified by a phandle) is _not_ actually a dependency.
> >>>
> >>> The current implementation completely ignores existing bindings for
> >>> dependencies and so does the current tip of the kernel. So it's not
> >>> really overriding anything. However, if I change the implementation so
> >>> that depends-on becomes the source of truth if it exists and falls
> >>> back to existing common bindings if "depends-on" isn't present -- then
> >>> depends-on would truly be overriding existing bindings for
> >>> dependencies. It depends on how we want to define the DT property.
> >>>
> >>>> The phandle already implies the dependency.
> >>>
> >>> Sure, it might imply, but it's not always true.
> >>>
> >>>> Creating a separate
> >>>> depends-on property provides a method of ignoring the implied
> >>>> dependencies.
> >>>
> >>> implied != true
> >>>
>
> I refer to your irq mode vs polled mode device example:
>
> >>>> This is not just hardware description. It is instead a combination
> >>>> of hardware functionality and driver functionality. An example
> >>>> provided in the second paragraph of the email I am replying to
> >>>> suggests a device could operate in polling mode if no IRQ is
> >>>> available. Using this example, the devicetree does not know
> >>>> whether the driver requires the IRQ (currently an implied
> >>>> dependency since the IRQ phandle exists). My understanding
> >>>> of this example is that the device node would _not_ have a
> >>>> depends-on property for the IRQ phandle so the IRQ would be
> >>>> optional. But this is an attribute of the driver, not the
> >>>> hardware.
> >>>
> >>
>
> You change the subject from irq mode vs polled mode device to some
> other type of device:
>
> >>> Not really. The interrupt could be for "SD card plugged in". That's
> >>> never a mandatory dependency for the SD card controller to work. So
> >>> the IRQ provider won't be a "depends-on" in this case. But if there is
> >>> no power supply or clock for the SD card controller, it isn't going to
> >>> work -- so they'd be listed in the "depends-on". So, this is still
> >>> defining the hardware and not the OS.
> >>
>
> I again try to get you to discuss the irq mode vs polled mode device:
>
> >> Please comment on my observation that was based on an IRQ for a device
> >> will polling mode vs interrupt driven mode.
> >> You described a different
> >> case and did not address my comment.
> > > I thought I did reply -- not sure what part you are looking for so
> > I'll rephrase. I was just picking the SD card controller as a concrete
> > example of device that can work with or without an interrupt. But
> > sure, I can call it "the device".
> >
>
> And the thread is so deeply nested that you are missing the original
> point that I made.
>
> > And yes, the device won't have a "depends-on" on the IRQ provider
> > because the device can still work without a working (as in bound to
> > driver) IRQ provider. Whether the driver insists on waiting on an IRQ
> > provider or not is up to the driver and the depends-on property is NOT
> > trying to dictate what the driver should do in this case. Does that
> > answer your implied question?
>
> If the device _must_ operate in irq mode to achieve the throughput
> that is _required_ for the system to be functional then that system
> would need a devicetree to have a "depends-on" property for the irq.
> But another system using the same exact hardware might be able to
> tolerate the reduced throughput of operating in polled mode. This
> second system would need a devicetree that does _not_ have a
> depends-on property for that same irq, as used by that same device.
Thanks for clarifying the point. I see the difference in our view points.
The way I see it, on the system where the device must operate in IRQ
mode to meet performance requirements, the DRIVER would choose to
always -EPROBE_DEFER till it gets the IRQ. Or if it's BSD or Windows
or whatever other theoretical OS that is trying to use the same DT
blob, it'll follow whatever mechanism that OS provides for waiting for
the IRQ to become available.
On the system where not having the IRQ is okay, the DRIVER would not
EPROBE_DEFER and just go to using polling mode.
In both these cases I don't expect the depends-on to list the IRQ provider.
I have more to say but I'll say that in the RESEND thread as a reply
to your other email.
-Saravana