* [PATCH v5 0/2] Better domain idle from device wakeup patterns
@ 2020-11-06 16:48 Lina Iyer
  2020-11-06 16:48 ` [PATCH v5 1/2] PM / domains: inform PM domain of a device's next wakeup Lina Iyer
  2020-11-06 16:48 ` [PATCH v5 2/2] PM / Domains: use device's next wakeup to determine domain idle state Lina Iyer
  0 siblings, 2 replies; 12+ messages in thread
From: Lina Iyer @ 2020-11-06 16:48 UTC (permalink / raw)
  To: rjw, ulf.hansson; +Cc: linux-pm, linux-arm-msm, Lina Iyer

Changes since v4 [4]:
	- Address review comments
Changes since v3 [3]:
	- Move the next_wakeup info of the device deeper into the device's
	  domain data. This should avoid overhead for devices that do not
	  have a predictable wakeup pattern.
Changes since v2:
	- Fix unwanted change
Changes since v1 [2]:
	- Update documentation and commit text
	- Remove check for runtime PM when setting next_event
	- Fix kernel-test robot reported issue
Changes since RFC [1]:
	- Organize the code to make it cleaner
	- Fix some issues with idle state determination
	- Add documentation and update commit text

Hello,

I was looking for an option to do better power management for some
domains where the devices enter runtime PM in a predictable fashion. An
example is a display device that raises a vsync interrupt every 16 ms
for a 60 Hz panel. These are not timer interrupts, but they tend to fire
periodically to service the workload, and the devices and domains may go
back to idle soon after. Two domains are affected by this - the device's
PM domain and the CPU PM domain. As a first step, I am looking to solve
this for the device's PM domain idle state (and hopefully for the CPU PM
domains subsequently).

The PM domain could have multiple idle states and/or the enter/exit
latencies could be high. In either case, it may not always be beneficial
to power off the domain, only to turn it back on before the idle state's
residency has been satisfied.

When the device's wakeup is known, we could use it to determine whether
entering a domain idle state is worthwhile. Only the device can tell us
when the future event will be, and that could change as the usecase
changes - for example, when the panel refresh rate increases to 120 Hz.
If this information were made available to runtime PM, the domain
governor could use it to determine a suitable idle state. This is the
idea behind these patches.

Would appreciate your thoughts on this.

Thanks,
Lina

[1]. https://lore.kernel.org/linux-pm/010101746eccb270-05beb27f-e1e4-40eb-92da-ad1bb48feb41-000000@us-west-2.amazonses.com/T/
[2]. https://lore.kernel.org/linux-pm/20201012223400.23609-3-ilina@codeaurora.org/T/#u
[3]. https://lore.kernel.org/linux-pm/20201015193807.17423-1-ilina@codeaurora.org/
[4]. https://www.spinics.net/lists/linux-arm-msm/msg74322.html

Lina Iyer (2):
  PM / domains: inform PM domain of a device's next wakeup
  PM / Domains: use device's next wakeup to determine domain idle state

 drivers/base/power/domain.c          | 36 +++++++++++++
 drivers/base/power/domain_governor.c | 81 ++++++++++++++++++++++++++--
 include/linux/pm_domain.h            |  9 ++++
 3 files changed, 121 insertions(+), 5 deletions(-)

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
* [PATCH v5 1/2] PM / domains: inform PM domain of a device's next wakeup
  2020-11-06 16:48 [PATCH v5 0/2] Better domain idle from device wakeup patterns Lina Iyer
@ 2020-11-06 16:48 ` Lina Iyer
  2020-11-06 16:48 ` [PATCH v5 2/2] PM / Domains: use device's next wakeup to determine domain idle state Lina Iyer
  1 sibling, 0 replies; 12+ messages in thread
From: Lina Iyer @ 2020-11-06 16:48 UTC (permalink / raw)
  To: rjw, ulf.hansson; +Cc: linux-pm, linux-arm-msm, Lina Iyer

Some devices may have a predictable interrupt pattern while executing
usecases. An example would be the VSYNC interrupt associated with
display devices. A 60 Hz display could cause an interrupt every 16 ms.
If the device were in a PM domain, the domain would need to be powered
up for the device to resume and handle the interrupt.

Entering a domain idle state saves power only if the residency of the
idle state is met. Without knowing the idle duration of the domain, the
governor would just choose the deepest idle state that matches the QoS
requirements. The domain might be powered off just as the device is
expecting to wake up.

If devices could inform PM frameworks of their next event, the parent
PM domain's idle duration can be determined. So let's add the
dev_pm_genpd_set_next_wakeup() API for the device to inform PM domains
of the impending wakeup. This information will be used by the domain
governor to determine the best idle state given the wakeup.

Signed-off-by: Lina Iyer <ilina@codeaurora.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
---
Changes in v5:
	- Fix commit text as pointed by Ulf
	- Use -EOPNOTSUPP
Changes in v4:
	- Use PM domain data to store next_wakeup
	- Drop runtime PM documentation
Changes in v3:
	- Fix unwanted change
Changes in v2:
	- Update documentation
	- Remove runtime PM enabled check
	- Update commit text
---
 drivers/base/power/domain.c | 36 ++++++++++++++++++++++++++++++++++++
 include/linux/pm_domain.h   |  8 ++++++++
 2 files changed, 44 insertions(+)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 743268996336..34b90e77e0cd 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -408,6 +408,41 @@ int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int state)
 }
 EXPORT_SYMBOL_GPL(dev_pm_genpd_set_performance_state);

+/**
+ * dev_pm_genpd_set_next_wakeup - Notify PM framework of an impending wakeup.
+ *
+ * @dev: Device to handle
+ * @next: impending interrupt/wakeup for the device
+ *
+ * Allow devices to inform of the next wakeup. But, if the domain were already
+ * powered off, we will not wake up the domain to recompute its idle duration.
+ */
+int dev_pm_genpd_set_next_wakeup(struct device *dev, ktime_t next)
+{
+	struct generic_pm_domain *genpd;
+	struct generic_pm_domain_data *gpd_data;
+	int ret = -EINVAL;
+
+	genpd = dev_to_genpd_safe(dev);
+	if (!genpd)
+		return -ENODEV;
+
+	if (WARN_ON(!dev->power.subsys_data ||
+		    !dev->power.subsys_data->domain_data))
+		return ret;
+
+	genpd_lock(genpd);
+	gpd_data = to_gpd_data(dev->power.subsys_data->domain_data);
+	if (ktime_before(ktime_get(), next)) {
+		gpd_data->next_wakeup = next;
+		ret = 0;
+	}
+	genpd_unlock(genpd);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(dev_pm_genpd_set_next_wakeup);
+
 static int _genpd_power_on(struct generic_pm_domain *genpd, bool timed)
 {
 	unsigned int state_idx = genpd->state_idx;
@@ -1431,6 +1466,7 @@ static struct generic_pm_domain_data *genpd_alloc_dev_data(struct device *dev)
 	gpd_data->td.constraint_changed = true;
 	gpd_data->td.effective_constraint_ns = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT_NS;
 	gpd_data->nb.notifier_call = genpd_dev_pm_qos_notifier;
+	gpd_data->next_wakeup = KTIME_MAX;

 	spin_lock_irq(&dev->power.lock);

diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 1ad0ec481416..d7875bfde7f4 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -9,6 +9,7 @@
 #define _LINUX_PM_DOMAIN_H

 #include <linux/device.h>
+#include <linux/ktime.h>
 #include <linux/mutex.h>
 #include <linux/pm.h>
 #include <linux/err.h>
@@ -191,6 +192,7 @@ struct generic_pm_domain_data {
 	struct notifier_block *power_nb;
 	int cpu;
 	unsigned int performance_state;
+	ktime_t next_wakeup;
 	void *data;
 };
@@ -217,6 +219,7 @@ int pm_genpd_remove(struct generic_pm_domain *genpd);
 int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int state);
 int dev_pm_genpd_add_notifier(struct device *dev, struct notifier_block *nb);
 int dev_pm_genpd_remove_notifier(struct device *dev);
+int dev_pm_genpd_set_next_wakeup(struct device *dev, ktime_t next);

 extern struct dev_power_governor simple_qos_governor;
 extern struct dev_power_governor pm_domain_always_on_gov;
@@ -275,6 +278,11 @@ static inline int dev_pm_genpd_remove_notifier(struct device *dev)
 	return -ENOTSUPP;
 }

+static inline int dev_pm_genpd_set_next_wakeup(struct device *dev, ktime_t next)
+{
+	return -EOPNOTSUPP;
+}
+
 #define simple_qos_governor (*(struct dev_power_governor *)(NULL))
 #define pm_domain_always_on_gov (*(struct dev_power_governor *)(NULL))
 #endif
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
* [PATCH v5 2/2] PM / Domains: use device's next wakeup to determine domain idle state
  2020-11-06 16:48 [PATCH v5 0/2] Better domain idle from device wakeup patterns Lina Iyer
  2020-11-06 16:48 ` [PATCH v5 1/2] PM / domains: inform PM domain of a device's next wakeup Lina Iyer
@ 2020-11-06 16:48 ` Lina Iyer
  2020-11-09 15:26   ` Ulf Hansson
  1 sibling, 1 reply; 12+ messages in thread
From: Lina Iyer @ 2020-11-06 16:48 UTC (permalink / raw)
  To: rjw, ulf.hansson; +Cc: linux-pm, linux-arm-msm, Lina Iyer

Currently, a PM domain's idle state is determined based on whether the
QoS requirements are met. This may not save power if the idle state
residency requirements are not met.

CPU PM domains use the next timer wakeup for the CPUs in the domain to
determine the sleep duration of the domain. This is compared with the
idle state residencies to determine the optimal idle state. For other PM
domains, determining the sleep length is not that straightforward. But
if the device's next_event is available, we can use that to determine
the sleep duration of the PM domain.

Let's update the domain governor logic to check for idle state residency
based on the next wakeup of devices as well as QoS constraints.

Signed-off-by: Lina Iyer <ilina@codeaurora.org>
---
Changes in v5:
	- Minor code changes suggested by Rafael
Changes in v4:
	- Update to use next_wakeup from struct generic_pm_domain_data.
Changes in v3:
	- None
Changes in v2:
	- Fix state_idx type to hold negative value.
	- Update commit text.
---
 drivers/base/power/domain_governor.c | 81 ++++++++++++++++++++++++++--
 include/linux/pm_domain.h            |  1 +
 2 files changed, 77 insertions(+), 5 deletions(-)

diff --git a/drivers/base/power/domain_governor.c b/drivers/base/power/domain_governor.c
index 490ed7deb99a..fee050c6ea6a 100644
--- a/drivers/base/power/domain_governor.c
+++ b/drivers/base/power/domain_governor.c
@@ -117,6 +117,47 @@ static bool default_suspend_ok(struct device *dev)
 	return td->cached_suspend_ok;
 }

+static void update_domain_next_wakeup(struct generic_pm_domain *genpd, ktime_t now)
+{
+	ktime_t domain_wakeup = KTIME_MAX;
+	ktime_t next_wakeup;
+	struct pm_domain_data *pdd;
+	struct gpd_link *link;
+
+	/* Find the earliest wakeup of all devices in the domain */
+	list_for_each_entry(pdd, &genpd->dev_list, list_node) {
+		next_wakeup = to_gpd_data(pdd)->next_wakeup;
+		if (next_wakeup != KTIME_MAX && !ktime_before(next_wakeup, now))
+			if (ktime_before(next_wakeup, domain_wakeup))
+				domain_wakeup = next_wakeup;
+	}
+
+	/* Then find the earliest wakeup of all the child domains */
+	list_for_each_entry(link, &genpd->parent_links, parent_node) {
+		next_wakeup = link->child->next_wakeup;
+		if (next_wakeup != KTIME_MAX && !ktime_before(next_wakeup, now))
+			if (ktime_before(next_wakeup, domain_wakeup))
+				domain_wakeup = next_wakeup;
+	}
+
+	genpd->next_wakeup = domain_wakeup;
+}
+
+static bool next_wakeup_allows_state(struct generic_pm_domain *genpd,
+				     unsigned int state, ktime_t now)
+{
+	ktime_t domain_wakeup = genpd->next_wakeup;
+	s64 idle_time_ns, min_sleep_ns;
+
+	min_sleep_ns = genpd->states[state].power_off_latency_ns +
+		       genpd->states[state].power_on_latency_ns +
+		       genpd->states[state].residency_ns;
+
+	idle_time_ns = ktime_to_ns(ktime_sub(domain_wakeup, now));
+
+	return idle_time_ns >= min_sleep_ns;
+}
+
 static bool __default_power_down_ok(struct dev_pm_domain *pd,
 				    unsigned int state)
 {
@@ -209,8 +250,34 @@ static bool __default_power_down_ok(struct dev_pm_domain *pd,
 static bool default_power_down_ok(struct dev_pm_domain *pd)
 {
 	struct generic_pm_domain *genpd = pd_to_genpd(pd);
+	int state_idx = genpd->state_count - 1;
+	ktime_t now = ktime_get();
 	struct gpd_link *link;

+	/*
+	 * Find the next wakeup from devices that can determine their own wakeup
+	 * to find when the domain would wake up, and do it for every device down
+	 * the hierarchy. It is not worthwhile to sleep if the state's residency
+	 * cannot be met.
+	 */
+	update_domain_next_wakeup(genpd, now);
+	if (genpd->next_wakeup != KTIME_MAX) {
+		/* Let's find out the deepest domain idle state the devices prefer */
+		while (state_idx >= 0) {
+			if (next_wakeup_allows_state(genpd, state_idx, now)) {
+				genpd->max_off_time_changed = true;
+				break;
+			}
+			state_idx--;
+		}
+
+		if (state_idx < 0) {
+			state_idx = 0;
+			genpd->cached_power_down_ok = false;
+			goto done;
+		}
+	}
+
 	if (!genpd->max_off_time_changed) {
 		genpd->state_idx = genpd->cached_power_down_state_idx;
 		return genpd->cached_power_down_ok;
@@ -228,17 +295,21 @@ static bool default_power_down_ok(struct dev_pm_domain *pd)
 	genpd->max_off_time_ns = -1;
 	genpd->max_off_time_changed = false;
 	genpd->cached_power_down_ok = true;
-	genpd->state_idx = genpd->state_count - 1;

-	/* Find a state to power down to, starting from the deepest. */
-	while (!__default_power_down_ok(pd, genpd->state_idx)) {
-		if (genpd->state_idx == 0) {
+	/*
+	 * Find a state to power down to, starting from the state
+	 * determined by the next wakeup.
+	 */
+	while (!__default_power_down_ok(pd, state_idx)) {
+		if (state_idx == 0) {
 			genpd->cached_power_down_ok = false;
 			break;
 		}
-		genpd->state_idx--;
+		state_idx--;
 	}

+done:
+	genpd->state_idx = state_idx;
 	genpd->cached_power_down_state_idx = genpd->state_idx;
 	return genpd->cached_power_down_ok;
 }

diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index d7875bfde7f4..49982cd58bfd 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -130,6 +130,7 @@ struct generic_pm_domain {
 					 unsigned int state);
 	struct gpd_dev_ops dev_ops;
 	s64 max_off_time_ns;	/* Maximum allowed "suspended" time. */
+	ktime_t next_wakeup;	/* Maintained by the domain governor */
 	bool max_off_time_changed;
 	bool cached_power_down_ok;
 	bool cached_power_down_state_idx;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
* Re: [PATCH v5 2/2] PM / Domains: use device's next wakeup to determine domain idle state
  2020-11-06 16:48 ` [PATCH v5 2/2] PM / Domains: use device's next wakeup to determine domain idle state Lina Iyer
@ 2020-11-09 15:26   ` Ulf Hansson
  2020-11-09 17:41     ` Lina Iyer
  0 siblings, 1 reply; 12+ messages in thread
From: Ulf Hansson @ 2020-11-09 15:26 UTC (permalink / raw)
  To: Lina Iyer; +Cc: Rafael J. Wysocki, Linux PM, linux-arm-msm

On Fri, 6 Nov 2020 at 17:48, Lina Iyer <ilina@codeaurora.org> wrote:
>
> Currently, a PM domain's idle state is determined based on whether the
> QoS requirements are met. This may not save power if the idle state
> residency requirements are not met.
>
> CPU PM domains use the next timer wakeup for the CPUs in the domain to
> determine the sleep duration of the domain. This is compared with the
> idle state residencies to determine the optimal idle state. For other PM
> domains, determining the sleep length is not that straightforward. But
> if the device's next_event is available, we can use that to determine
> the sleep duration of the PM domain.
>
> Let's update the domain governor logic to check for idle state residency
> based on the next wakeup of devices as well as QoS constraints.
>
> Signed-off-by: Lina Iyer <ilina@codeaurora.org>
> ---
> Changes in v5:
>         - Minor code changes suggested by Rafael
> Changes in v4:
>         - Update to use next_wakeup from struct generic_pm_domain_data.
> Changes in v3:
>         - None
> Changes in v2:
>         - Fix state_idx type to hold negative value.
>         - Update commit text.
> ---
>  drivers/base/power/domain_governor.c | 81 ++++++++++++++++++++++++++--
>  include/linux/pm_domain.h            |  1 +
>  2 files changed, 77 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/base/power/domain_governor.c b/drivers/base/power/domain_governor.c
> index 490ed7deb99a..fee050c6ea6a 100644
> --- a/drivers/base/power/domain_governor.c
> +++ b/drivers/base/power/domain_governor.c
> @@ -117,6 +117,47 @@ static bool default_suspend_ok(struct device *dev)
>         return td->cached_suspend_ok;
>  }
>
> +static void update_domain_next_wakeup(struct generic_pm_domain *genpd, ktime_t now)
> +{
> +       ktime_t domain_wakeup = KTIME_MAX;
> +       ktime_t next_wakeup;
> +       struct pm_domain_data *pdd;
> +       struct gpd_link *link;
> +
> +       /* Find the earliest wakeup of all devices in the domain */
> +       list_for_each_entry(pdd, &genpd->dev_list, list_node) {
> +               next_wakeup = to_gpd_data(pdd)->next_wakeup;
> +               if (next_wakeup != KTIME_MAX && !ktime_before(next_wakeup, now))
> +                       if (ktime_before(next_wakeup, domain_wakeup))
> +                               domain_wakeup = next_wakeup;

If it turns out that one of the device's next_wakeup is before "now",
leading to ktime_before() above returning true - then I think you should
bail out, no?

At least, we shouldn't just continue and ignore this case, right?

> +       }
> +
> +       /* Then find the earliest wakeup of all the child domains */
> +       list_for_each_entry(link, &genpd->parent_links, parent_node) {
> +               next_wakeup = link->child->next_wakeup;
> +               if (next_wakeup != KTIME_MAX && !ktime_before(next_wakeup, now))
> +                       if (ktime_before(next_wakeup, domain_wakeup))
> +                               domain_wakeup = next_wakeup;
> +       }
> +
> +       genpd->next_wakeup = domain_wakeup;
> +}
> +
> +static bool next_wakeup_allows_state(struct generic_pm_domain *genpd,
> +                                    unsigned int state, ktime_t now)
> +{
> +       ktime_t domain_wakeup = genpd->next_wakeup;
> +       s64 idle_time_ns, min_sleep_ns;
> +
> +       min_sleep_ns = genpd->states[state].power_off_latency_ns +
> +                      genpd->states[state].power_on_latency_ns +
> +                      genpd->states[state].residency_ns;
> +

I don't think you should include the power_on_latency_ns here.

The validation isn't about QoS constraints, but whether we can meet
the residency time to make it worth entering the state, from an energy
point of view. Right?

> +       idle_time_ns = ktime_to_ns(ktime_sub(domain_wakeup, now));
> +
> +       return idle_time_ns >= min_sleep_ns;
> +}
> +
>  static bool __default_power_down_ok(struct dev_pm_domain *pd,
>                                     unsigned int state)
>  {
> @@ -209,8 +250,34 @@ static bool __default_power_down_ok(struct dev_pm_domain *pd,
>  static bool default_power_down_ok(struct dev_pm_domain *pd)
>  {
>         struct generic_pm_domain *genpd = pd_to_genpd(pd);
> +       int state_idx = genpd->state_count - 1;
> +       ktime_t now = ktime_get();
>         struct gpd_link *link;
>
> +       /*
> +        * Find the next wakeup from devices that can determine their own wakeup
> +        * to find when the domain would wake up, and do it for every device down
> +        * the hierarchy. It is not worthwhile to sleep if the state's residency
> +        * cannot be met.
> +        */
> +       update_domain_next_wakeup(genpd, now);
> +       if (genpd->next_wakeup != KTIME_MAX) {
> +               /* Let's find out the deepest domain idle state the devices prefer */
> +               while (state_idx >= 0) {
> +                       if (next_wakeup_allows_state(genpd, state_idx, now)) {
> +                               genpd->max_off_time_changed = true;
> +                               break;
> +                       }
> +                       state_idx--;
> +               }
> +
> +               if (state_idx < 0) {
> +                       state_idx = 0;
> +                       genpd->cached_power_down_ok = false;
> +                       goto done;
> +               }
> +       }
> +

The above would introduce unnecessary overhead, as it may become
executed in cases when it's not needed.

For example, there's no point doing the above, if the domain doesn't
specify residency values for its idle states.

Additionally, we don't need to recompute the domain's next wakeup,
unless a device has got a new next_wakeup value set for it. In this
case, we can just select a state based upon a previously computed
value, thus avoiding the recomputation.

>         if (!genpd->max_off_time_changed) {
>                 genpd->state_idx = genpd->cached_power_down_state_idx;
>                 return genpd->cached_power_down_ok;
> @@ -228,17 +295,21 @@ static bool default_power_down_ok(struct dev_pm_domain *pd)
>         genpd->max_off_time_ns = -1;
>         genpd->max_off_time_changed = false;
>         genpd->cached_power_down_ok = true;
> -       genpd->state_idx = genpd->state_count - 1;
>
> -       /* Find a state to power down to, starting from the deepest. */
> -       while (!__default_power_down_ok(pd, genpd->state_idx)) {
> -               if (genpd->state_idx == 0) {
> +       /*
> +        * Find a state to power down to, starting from the state
> +        * determined by the next wakeup.
> +        */
> +       while (!__default_power_down_ok(pd, state_idx)) {
> +               if (state_idx == 0) {
>                         genpd->cached_power_down_ok = false;
>                         break;
>                 }
> -               genpd->state_idx--;
> +               state_idx--;
>         }
>
> +done:
> +       genpd->state_idx = state_idx;
>         genpd->cached_power_down_state_idx = genpd->state_idx;
>         return genpd->cached_power_down_ok;
>  }

Another thing to consider for the above changes, is that the
cpu_power_down_ok() calls into default_power_down_ok().

Even if I am fine with the approach taken in $subject patch, I think
it's important to try to keep the path as slim as possible, as it's also
executed in the CPU idle path. For example, $subject change means also
that we end up calling ktime_get() twice in the same path, introducing
unnecessary overhead. We can do better and avoid that by restructuring
the code a bit, don't you think?

> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
> index d7875bfde7f4..49982cd58bfd 100644
> --- a/include/linux/pm_domain.h
> +++ b/include/linux/pm_domain.h
> @@ -130,6 +130,7 @@ struct generic_pm_domain {
>                                          unsigned int state);
>         struct gpd_dev_ops dev_ops;
>         s64 max_off_time_ns;    /* Maximum allowed "suspended" time. */
> +       ktime_t next_wakeup;    /* Maintained by the domain governor */
>         bool max_off_time_changed;
>         bool cached_power_down_ok;
>         bool cached_power_down_state_idx;
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>

Kind regards
Uffe
* Re: [PATCH v5 2/2] PM / Domains: use device's next wakeup to determine domain idle state
  2020-11-09 15:26   ` Ulf Hansson
@ 2020-11-09 17:41     ` Lina Iyer
  2020-11-10 10:01       ` Ulf Hansson
  0 siblings, 1 reply; 12+ messages in thread
From: Lina Iyer @ 2020-11-09 17:41 UTC (permalink / raw)
  To: Ulf Hansson; +Cc: Rafael J. Wysocki, Linux PM, linux-arm-msm

On Mon, Nov 09 2020 at 08:27 -0700, Ulf Hansson wrote:
>On Fri, 6 Nov 2020 at 17:48, Lina Iyer <ilina@codeaurora.org> wrote:
>>
[...]
>> +static void update_domain_next_wakeup(struct generic_pm_domain *genpd, ktime_t now)
>> +{
>> +       ktime_t domain_wakeup = KTIME_MAX;
>> +       ktime_t next_wakeup;
>> +       struct pm_domain_data *pdd;
>> +       struct gpd_link *link;
>> +
>> +       /* Find the earliest wakeup of all devices in the domain */
>> +       list_for_each_entry(pdd, &genpd->dev_list, list_node) {
>> +               next_wakeup = to_gpd_data(pdd)->next_wakeup;
>> +               if (next_wakeup != KTIME_MAX && !ktime_before(next_wakeup, now))
>> +                       if (ktime_before(next_wakeup, domain_wakeup))
>> +                               domain_wakeup = next_wakeup;
>
>If it turns out that one of the device's next_wakeup is before "now",
>leading to ktime_before() above returning true - then I think you should
>bail out, no?
>
>At least, we shouldn't just continue and ignore this case, right?
>
No, that could be a very common case. Drivers are not expected to clean
up the next wakeup by setting it to KTIME_MAX. The best we can do is to
make a choice with the valid information we have. This will also map to
the current behavior. Say, if all the next wakeup information provided
to the devices were in the past, we would be no worse (or better) off
than we are without this change.

>> +       }
>> +
>> +       /* Then find the earliest wakeup of all the child domains */
>> +       list_for_each_entry(link, &genpd->parent_links, parent_node) {
>> +               next_wakeup = link->child->next_wakeup;
>> +               if (next_wakeup != KTIME_MAX && !ktime_before(next_wakeup, now))
>> +                       if (ktime_before(next_wakeup, domain_wakeup))
>> +                               domain_wakeup = next_wakeup;
>> +       }
>> +
>> +       genpd->next_wakeup = domain_wakeup;
>> +}
>> +
>> +static bool next_wakeup_allows_state(struct generic_pm_domain *genpd,
>> +                                    unsigned int state, ktime_t now)
>> +{
>> +       ktime_t domain_wakeup = genpd->next_wakeup;
>> +       s64 idle_time_ns, min_sleep_ns;
>> +
>> +       min_sleep_ns = genpd->states[state].power_off_latency_ns +
>> +                      genpd->states[state].power_on_latency_ns +
>> +                      genpd->states[state].residency_ns;
>> +
>
>I don't think you should include the power_on_latency_ns here.
>
>The validation isn't about QoS constraints, but whether we can meet
>the residency time to make it worth entering the state, from an energy
>point of view. Right?
>
Fair point. I will remove the power_on_latency_ns.

>> +       idle_time_ns = ktime_to_ns(ktime_sub(domain_wakeup, now));
>> +
>> +       return idle_time_ns >= min_sleep_ns;
>> +}
>> +
>>  static bool __default_power_down_ok(struct dev_pm_domain *pd,
>>                                     unsigned int state)
>>  {
>> @@ -209,8 +250,34 @@ static bool __default_power_down_ok(struct dev_pm_domain *pd,
>>  static bool default_power_down_ok(struct dev_pm_domain *pd)
>>  {
>>         struct generic_pm_domain *genpd = pd_to_genpd(pd);
>> +       int state_idx = genpd->state_count - 1;
>> +       ktime_t now = ktime_get();
>>         struct gpd_link *link;
>>
>> +       /*
>> +        * Find the next wakeup from devices that can determine their own wakeup
>> +        * to find when the domain would wake up, and do it for every device down
>> +        * the hierarchy. It is not worthwhile to sleep if the state's residency
>> +        * cannot be met.
>> +        */
>> +       update_domain_next_wakeup(genpd, now);
>> +       if (genpd->next_wakeup != KTIME_MAX) {
>> +               /* Let's find out the deepest domain idle state the devices prefer */
>> +               while (state_idx >= 0) {
>> +                       if (next_wakeup_allows_state(genpd, state_idx, now)) {
>> +                               genpd->max_off_time_changed = true;
>> +                               break;
>> +                       }
>> +                       state_idx--;
>> +               }
>> +
>> +               if (state_idx < 0) {
>> +                       state_idx = 0;
>> +                       genpd->cached_power_down_ok = false;
>> +                       goto done;
>> +               }
>> +       }
>> +
>
>The above would introduce unnecessary overhead, as it may become
>executed in cases when it's not needed.
>
>For example, there's no point doing the above, if the domain doesn't
>specify residency values for its idle states.
>
We would still need to ensure that the next wakeup is after the
power_off_latency, if specified.

>Additionally, we don't need to recompute the domain's next wakeup,
>unless a device has got a new next_wakeup value set for it. In this
>case, we can just select a state based upon a previously computed
>value, thus avoiding the recomputation.
>
If the domain's next wakeup was in the past, then using our previously
computed state may be incorrect.

>> if (!genpd->max_off_time_changed) {
>>                 genpd->state_idx = genpd->cached_power_down_state_idx;
>>                 return genpd->cached_power_down_ok;
>> @@ -228,17 +295,21 @@ static bool default_power_down_ok(struct dev_pm_domain *pd)
>>         genpd->max_off_time_ns = -1;
>>         genpd->max_off_time_changed = false;
>>         genpd->cached_power_down_ok = true;
>> -       genpd->state_idx = genpd->state_count - 1;
>>
>> -       /* Find a state to power down to, starting from the deepest. */
>> -       while (!__default_power_down_ok(pd, genpd->state_idx)) {
>> -               if (genpd->state_idx == 0) {
>> +       /*
>> +        * Find a state to power down to, starting from the state
>> +        * determined by the next wakeup.
>> +        */
>> +       while (!__default_power_down_ok(pd, state_idx)) {
>> +               if (state_idx == 0) {
>>                         genpd->cached_power_down_ok = false;
>>                         break;
>>                 }
>> -               genpd->state_idx--;
>> +               state_idx--;
>>         }
>>
>> +done:
>> +       genpd->state_idx = state_idx;
>>         genpd->cached_power_down_state_idx = genpd->state_idx;
>>         return genpd->cached_power_down_ok;
>> }
>
>Another thing to consider for the above changes, is that the
>cpu_power_down_ok() calls into default_power_down_ok().
>
>Even if I am fine with the approach taken in $subject patch, I think
>it's important to try to keep the path as slim as possible, as it's also
>executed in the CPU idle path.
Wouldn't this be called only when the last CPU is powering down, and
only if the domain is ready to power down?

>For example, $subject change means also
>that we end up calling ktime_get() twice in the same path, introducing
>unnecessary overhead. We can do better and avoid that by restructuring
>the code a bit, don't you think?
>
Hmmm, we could. I will submit a follow-on patch to reorganize the code
so that ktime_get() would be called only once for either of the
power_down_ok callbacks.

Thanks for your review, Ulf.

--
Lina
* Re: [PATCH v5 2/2] PM / Domains: use device's next wakeup to determine domain idle state 2020-11-09 17:41 ` Lina Iyer @ 2020-11-10 10:01 ` Ulf Hansson 2020-11-11 16:27 ` Lina Iyer 0 siblings, 1 reply; 12+ messages in thread From: Ulf Hansson @ 2020-11-10 10:01 UTC (permalink / raw) To: Lina Iyer; +Cc: Rafael J. Wysocki, Linux PM, linux-arm-msm On Mon, 9 Nov 2020 at 18:41, Lina Iyer <ilina@codeaurora.org> wrote: > > On Mon, Nov 09 2020 at 08:27 -0700, Ulf Hansson wrote: > >On Fri, 6 Nov 2020 at 17:48, Lina Iyer <ilina@codeaurora.org> wrote: > >> > [...] > >> +static void update_domain_next_wakeup(struct generic_pm_domain *genpd, ktime_t now) > >> +{ > >> + ktime_t domain_wakeup = KTIME_MAX; > >> + ktime_t next_wakeup; > >> + struct pm_domain_data *pdd; > >> + struct gpd_link *link; > >> + > >> + /* Find the earliest wakeup for all devices in the domain */ > >> + list_for_each_entry(pdd, &genpd->dev_list, list_node) { > >> + next_wakeup = to_gpd_data(pdd)->next_wakeup; > >> + if (next_wakeup != KTIME_MAX && !ktime_before(next_wakeup, now)) > >> + if (ktime_before(next_wakeup, domain_wakeup)) > >> + domain_wakeup = next_wakeup; > > > >If it turns out that one of the device's next_wakeup is before "now", > >leading to ktime_before() above returns true - then I think you should > >bail out, no? > > > >At least, we shouldn't just continue and ignore this case, right? > > > No, that could be a very common case. Drivers are not expected to clean > up the next wakeup by setting it to KTIME_MAX. The best we can do is > to make a choice with the valid information we have. This will also map > to the current behavior. Say if all next wakeup information provided to > the devices were in the past, we would be no worse (or better) than what > we do without this change. Well, I don't quite agree (at least not yet), but let me elaborate, as I think we can do better without having to add complexity. 
Normally, I don't think a driver needs to clean up its device's next wakeup in between the remote wakeups, instead it should just set a new value. That's because, even if the driver acts to a remote wakeup or deals with a request entering a queue, the driver needs to runtime resume its device during this period. This prevents genpd from power off the PM domain, hence also the genpd governor from potentially looking at "invalid" wakeup information for its attached devices. Of course, I assume there are situations, where a driver actually needs to do a clean up of its device's next wakeup, but that should be less frequent and likely when a remote wakeup is disabled (for whatever reason). > > >> + } > >> + > >> + /* Then find the earliest wakeup of from all the child domains */ > >> + list_for_each_entry(link, &genpd->parent_links, parent_node) { > >> + next_wakeup = link->child->next_wakeup; > >> + if (next_wakeup != KTIME_MAX && !ktime_before(next_wakeup, now)) > >> + if (ktime_before(next_wakeup, domain_wakeup)) > >> + domain_wakeup = next_wakeup; > >> + } > >> + > >> + genpd->next_wakeup = domain_wakeup; > >> +} > >> + > >> +static bool next_wakeup_allows_state(struct generic_pm_domain *genpd, > >> + unsigned int state, ktime_t now) > >> +{ > >> + ktime_t domain_wakeup = genpd->next_wakeup; > >> + s64 idle_time_ns, min_sleep_ns; > >> + > >> + min_sleep_ns = genpd->states[state].power_off_latency_ns + > >> + genpd->states[state].power_on_latency_ns + > >> + genpd->states[state].residency_ns; > >> + > > > >I don't think you should include the power_on_latency_ns here. > > > >The validation isn't about QoS constraints, but whether we can meet > >the residency time to make it worth entering the state, from an energy > >point of view. Right? > > > Fair point. I will remove the power_on_latency_ns. 
> > >> + idle_time_ns = ktime_to_ns(ktime_sub(domain_wakeup, now)); > >> + > >> + return idle_time_ns >= min_sleep_ns; > >> +} > >> + > >> static bool __default_power_down_ok(struct dev_pm_domain *pd, > >> unsigned int state) > >> { > >> @@ -209,8 +250,34 @@ static bool __default_power_down_ok(struct dev_pm_domain *pd, > >> static bool default_power_down_ok(struct dev_pm_domain *pd) > >> { > >> struct generic_pm_domain *genpd = pd_to_genpd(pd); > >> + int state_idx = genpd->state_count - 1; > >> + ktime_t now = ktime_get(); > >> struct gpd_link *link; > >> > >> + /* > >> + * Find the next wakeup from devices that can determine their own wakeup > >> + * to find when the domain would wakeup and do it for every device down > >> + * the hierarchy. It is not worth while to sleep if the state's residency > >> + * cannot be met. > >> + */ > >> + update_domain_next_wakeup(genpd, now); > >> + if (genpd->next_wakeup != KTIME_MAX) { > >> + /* Let's find out the deepest domain idle state, the devices prefer */ > >> + while (state_idx >= 0) { > >> + if (next_wakeup_allows_state(genpd, state_idx, now)) { > >> + genpd->max_off_time_changed = true; > >> + break; > >> + } > >> + state_idx--; > >> + } > >> + > >> + if (state_idx < 0) { > >> + state_idx = 0; > >> + genpd->cached_power_down_ok = false; > >> + goto done; > >> + } > >> + } > >> + > > > >The above would introduce unnecessary overhead, as it may become > >executed in cases when it's not needed. > > > >For example, there's no point doing the above, if the domain doesn't > >specify residency values for its idle states. > > > We would still need to ensure that the next wakeup is after the > power_off_latency, if specified. Good point! Although, I would rather avoid adding the overhead, unless the residency is specified. Do you see a problem with this approach? Another option is to add a new governor, but if possible, I would like to avoid that. 
> > >Additionally, we don't need to recompute the domain's next wakeup, > >unless a device has got a new next_wakeup value set for it. In this > >case, we can just select a state based upon a previously computed > >value, thus avoiding the recomputation. > > > If the domain's next wakeup was in the past, then using our previously > computed state may be incorrect. I am not saying you should use the previously computed *idle state*, but the previously computed next wakeup. > > >> if (!genpd->max_off_time_changed) { > >> genpd->state_idx = genpd->cached_power_down_state_idx; > >> return genpd->cached_power_down_ok; > >> @@ -228,17 +295,21 @@ static bool default_power_down_ok(struct dev_pm_domain *pd) > >> genpd->max_off_time_ns = -1; > >> genpd->max_off_time_changed = false; > >> genpd->cached_power_down_ok = true; > >> - genpd->state_idx = genpd->state_count - 1; > >> > >> - /* Find a state to power down to, starting from the deepest. */ > >> - while (!__default_power_down_ok(pd, genpd->state_idx)) { > >> - if (genpd->state_idx == 0) { > >> + /* > >> + * Find a state to power down to, starting from the state > >> + * determined by the next wakeup. > >> + */ > >> + while (!__default_power_down_ok(pd, state_idx)) { > >> + if (state_idx == 0) { > >> genpd->cached_power_down_ok = false; > >> break; > >> } > >> - genpd->state_idx--; > >> + state_idx--; > >> } > >> > >> +done: > >> + genpd->state_idx = state_idx; > >> genpd->cached_power_down_state_idx = genpd->state_idx; > >> return genpd->cached_power_down_ok; > >> } > > > >Another thing to consider for the above changes is that the > >cpu_power_down_ok() calls into default_power_down_ok(). > > > >Even if I am fine with the approach taken in $subject patch, I think > >it's important to try to keep the path as slim as possible as it's also > >executed in the CPU idle path. > Wouldn't this be called only when the last CPU is powering down and only if > the domain is ready to power down?
> > >For example, $subject change means also > >that we end up calling ktime_get() twice in the same path, introducing > >unnecessary overhead. We can do better and avoid that by restructuring > >the code a bit, don't you think? > > > Hmmm, we could. I will submit a follow on patch to reorganize the code > so the ktime_get() would be called only once for either of the > power_down_ok callbacks. If possible, I would rather make it part of the series. Just fold in some "rework" patch before extending the governor, that should be possible I think. Kind regards Uffe ^ permalink raw reply [flat|nested] 12+ messages in thread
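For readers following the thread, the earliest-wakeup selection being debated here can be modeled outside the kernel. The sketch below is illustrative user-space C, not genpd code: `ktime_t` is approximated by a signed 64-bit nanosecond count, `KTIME_MAX` by `INT64_MAX`, and the function name is made up for the example. It mirrors the quoted loop's deliberate behavior of skipping, rather than bailing out on, wakeups that are already in the past:

```c
#include <stdint.h>

/* Stand-in for KTIME_MAX in this user-space sketch. */
#define SKETCH_KTIME_MAX INT64_MAX

/*
 * Pick the soonest next_wakeup that is neither "unset" (KTIME_MAX)
 * nor already in the past. Stale (past) values are ignored on purpose,
 * matching the behavior discussed in the thread.
 */
static int64_t earliest_wakeup(const int64_t *wakeups, int n, int64_t now)
{
	int64_t domain_wakeup = SKETCH_KTIME_MAX;

	for (int i = 0; i < n; i++) {
		int64_t w = wakeups[i];

		if (w != SKETCH_KTIME_MAX && w >= now && w < domain_wakeup)
			domain_wakeup = w;
	}
	return domain_wakeup;
}
```

A domain-level version would run this scan once over the attached devices and once over the child domains, as the quoted patch does.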
* Re: [PATCH v5 2/2] PM / Domains: use device's next wakeup to determine domain idle state 2020-11-10 10:01 ` Ulf Hansson @ 2020-11-11 16:27 ` Lina Iyer 2020-11-13 10:33 ` Ulf Hansson 0 siblings, 1 reply; 12+ messages in thread From: Lina Iyer @ 2020-11-11 16:27 UTC (permalink / raw) To: Ulf Hansson; +Cc: Rafael J. Wysocki, Linux PM, linux-arm-msm On Tue, Nov 10 2020 at 03:02 -0700, Ulf Hansson wrote: >On Mon, 9 Nov 2020 at 18:41, Lina Iyer <ilina@codeaurora.org> wrote: >> >> On Mon, Nov 09 2020 at 08:27 -0700, Ulf Hansson wrote: >> >On Fri, 6 Nov 2020 at 17:48, Lina Iyer <ilina@codeaurora.org> wrote: >> >> >> [...] >> >> +static void update_domain_next_wakeup(struct generic_pm_domain *genpd, ktime_t now) >> >> +{ >> >> + ktime_t domain_wakeup = KTIME_MAX; >> >> + ktime_t next_wakeup; >> >> + struct pm_domain_data *pdd; >> >> + struct gpd_link *link; >> >> + >> >> + /* Find the earliest wakeup for all devices in the domain */ >> >> + list_for_each_entry(pdd, &genpd->dev_list, list_node) { >> >> + next_wakeup = to_gpd_data(pdd)->next_wakeup; >> >> + if (next_wakeup != KTIME_MAX && !ktime_before(next_wakeup, now)) >> >> + if (ktime_before(next_wakeup, domain_wakeup)) >> >> + domain_wakeup = next_wakeup; >> > >> >If it turns out that one of the device's next_wakeup is before "now", >> >leading to ktime_before() above returns true - then I think you should >> >bail out, no? >> > >> >At least, we shouldn't just continue and ignore this case, right? >> > >> No, that could be a very common case. Drivers are not expected to clean >> up the next wakeup by setting it to KTIME_MAX. The best we can do is >> to make a choice with the valid information we have. This will also map >> to the current behavior. Say if all next wakeup information provided to >> the devices were in the past, we would be no worse (or better) than what >> we do without this change. 
> >Well, I don't quite agree (at least not yet), but let me elaborate, as >I think we can do better without having to add complexity. > >Normally, I don't think a driver needs to clean up its device's next >wakeup in between the remote wakeups, instead it should just set a new >value. > >That's because, even if the driver acts on a remote wakeup or deals >with a request entering a queue, the driver needs to runtime resume >its device during this period. This prevents genpd from powering off the >PM domain, hence also the genpd governor from potentially looking at >"invalid" wakeup information for its attached devices. > Could you elaborate a bit? Why would a remote wakeup affect the next wakeup? I'm afraid that I'm not getting the situation correctly. >Of course, I assume there are situations where a driver actually >needs to clean up its device's next wakeup, but that should be >less frequent and likely when a remote wakeup is disabled (for >whatever reason). > A common case would be that the driver does not know when the use case is being turned off and therefore may not be able to set the next wakeup to max. If the stale value continues to exist, then we may never power off the domain. >> >> + /* >> >> + * Find the next wakeup from devices that can determine their own wakeup >> >> + * to find when the domain would wakeup and do it for every device down >> >> + * the hierarchy. It is not worth while to sleep if the state's residency
>> >> + */ >> >> + update_domain_next_wakeup(genpd, now); >> >> + if (genpd->next_wakeup != KTIME_MAX) { >> >> + /* Let's find out the deepest domain idle state, the devices prefer */ >> >> + while (state_idx >= 0) { >> >> + if (next_wakeup_allows_state(genpd, state_idx, now)) { >> >> + genpd->max_off_time_changed = true; >> >> + break; >> >> + } >> >> + state_idx--; >> >> + } >> >> + >> >> + if (state_idx < 0) { >> >> + state_idx = 0; >> >> + genpd->cached_power_down_ok = false; >> >> + goto done; >> >> + } >> >> + } >> >> + >> > >> >The above would introduce unnecessary overhead, as it may become >> >executed in cases when it's not needed. >> > >> >For example, there's no point doing the above, if the domain doesn't >> >specify residency values for its idle states. >> > >> We would still need to ensure that the next wakeup is after the >> power_off_latency, if specified. > >Good point! Although, I would rather avoid adding the overhead, unless >the residency is specified. Do you see a problem with this approach? > Hmmm, no strong objections. However, we still need to run through the states to make sure the residency is not set and have a variable track that. The devices wouldn't know that and would still continue to set the next wakeup, unless we find a way to let them know we are not using this feature for the domain. >Another option is to add a new governor, but if possible, I would like >to avoid that. > Absolutely, we should avoid that. >> >> >Additionally, we don't need to recompute the domain's next wakup, >> >unless a device has got a new next_wakeup value set for it. In this >> >case, we can just select a state based upon an previously computed >> >value, thus avoiding the recomputation. >> > >> If the domain's next wakeup was in the past, then using our previously >> computed state may be incorrect. > >I am not saying you should use the previously computed *idlestate*, >but the previously computed next wakeup. 
> I guess this falls into the first discussion: should we use a next wakeup from the past? >> >> >> if (!genpd->max_off_time_changed) { >> >> genpd->state_idx = genpd->cached_power_down_state_idx; >> >> return genpd->cached_power_down_ok; >> >> @@ -228,17 +295,21 @@ static bool default_power_down_ok(struct dev_pm_domain *pd) >> >> genpd->max_off_time_ns = -1; >> >> genpd->max_off_time_changed = false; >> >> genpd->cached_power_down_ok = true; >> >> - genpd->state_idx = genpd->state_count - 1; >> >> >> >> - /* Find a state to power down to, starting from the deepest. */ >> >> - while (!__default_power_down_ok(pd, genpd->state_idx)) { >> >> - if (genpd->state_idx == 0) { >> >> + /* >> >> + * Find a state to power down to, starting from the state >> >> + * determined by the next wakeup. >> >> + */ >> >> + while (!__default_power_down_ok(pd, state_idx)) { >> >> + if (state_idx == 0) { >> >> genpd->cached_power_down_ok = false; >> >> break; >> >> } >> >> - genpd->state_idx--; >> >> + state_idx--; >> >> } >> >> >> >> +done: >> >> + genpd->state_idx = state_idx; >> >> genpd->cached_power_down_state_idx = genpd->state_idx; >> >> return genpd->cached_power_down_ok; >> >> } >> > >> >Another thing to consider for the above changes is that the >> >cpu_power_down_ok() calls into default_power_down_ok(). >> > >> >Even if I am fine with the approach taken in $subject patch, I think >> >it's important to try to keep the path as slim as possible as it's also >> >executed in the CPU idle path. >> Wouldn't this be called only when the last CPU is powering down and only if >> the domain is ready to power down? >> >> >For example, $subject change means also >> >that we end up calling ktime_get() twice in the same path, introducing >> >unnecessary overhead. We can do better and avoid that by restructuring >> >the code a bit, don't you think? >> > >> Hmmm, we could.
I will submit a follow-on patch to reorganize the code >> so the ktime_get() would be called only once for either of the >> power_down_ok callbacks. > >If possible, I would rather make it part of the series. Just fold in >some "rework" patch before extending the governor; that should be >possible, I think. > Sure. Since this would mean changing the existing default_power_down_ok(), I thought a follow-on patch would be appropriate. Thanks, Lina ^ permalink raw reply [flat|nested] 12+ messages in thread
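For reference, the deepest-state walk under review can likewise be sketched as self-contained user-space C. Names and types are assumptions of the sketch (signed 64-bit nanosecond integers instead of `ktime_t`), and, per the review outcome above, `power_on_latency_ns` is left out of the minimum-sleep calculation:

```c
#include <stdint.h>

struct sketch_state {
	int64_t power_off_latency_ns;
	int64_t residency_ns;
};

/*
 * Starting from the deepest state, return the first state whose minimum
 * sleep time (power-off latency plus residency) still fits before the
 * domain's next wakeup; -1 means not even the shallowest state fits.
 */
static int deepest_allowed_state(const struct sketch_state *states, int count,
				 int64_t domain_wakeup, int64_t now)
{
	int64_t idle_time_ns = domain_wakeup - now;

	for (int i = count - 1; i >= 0; i--) {
		int64_t min_sleep_ns = states[i].power_off_latency_ns +
				       states[i].residency_ns;

		if (idle_time_ns >= min_sleep_ns)
			return i;
	}
	return -1;
}
```

In the patch itself the "-1" case corresponds to clamping to state 0 and marking the power-down as not OK.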
* Re: [PATCH v5 2/2] PM / Domains: use device's next wakeup to determine domain idle state 2020-11-11 16:27 ` Lina Iyer @ 2020-11-13 10:33 ` Ulf Hansson 2020-11-16 15:57 ` Lina Iyer 0 siblings, 1 reply; 12+ messages in thread From: Ulf Hansson @ 2020-11-13 10:33 UTC (permalink / raw) To: Lina Iyer; +Cc: Rafael J. Wysocki, Linux PM, linux-arm-msm On Wed, 11 Nov 2020 at 17:51, Lina Iyer <ilina@codeaurora.org> wrote: > > On Tue, Nov 10 2020 at 03:02 -0700, Ulf Hansson wrote: > >On Mon, 9 Nov 2020 at 18:41, Lina Iyer <ilina@codeaurora.org> wrote: > >> > >> On Mon, Nov 09 2020 at 08:27 -0700, Ulf Hansson wrote: > >> >On Fri, 6 Nov 2020 at 17:48, Lina Iyer <ilina@codeaurora.org> wrote: > >> >> > >> [...] > >> >> +static void update_domain_next_wakeup(struct generic_pm_domain *genpd, ktime_t now) > >> >> +{ > >> >> + ktime_t domain_wakeup = KTIME_MAX; > >> >> + ktime_t next_wakeup; > >> >> + struct pm_domain_data *pdd; > >> >> + struct gpd_link *link; > >> >> + > >> >> + /* Find the earliest wakeup for all devices in the domain */ > >> >> + list_for_each_entry(pdd, &genpd->dev_list, list_node) { > >> >> + next_wakeup = to_gpd_data(pdd)->next_wakeup; > >> >> + if (next_wakeup != KTIME_MAX && !ktime_before(next_wakeup, now)) > >> >> + if (ktime_before(next_wakeup, domain_wakeup)) > >> >> + domain_wakeup = next_wakeup; > >> > > >> >If it turns out that one of the device's next_wakeup is before "now", > >> >leading to ktime_before() above returns true - then I think you should > >> >bail out, no? > >> > > >> >At least, we shouldn't just continue and ignore this case, right? > >> > > >> No, that could be a very common case. Drivers are not expected to clean > >> up the next wakeup by setting it to KTIME_MAX. The best we can do is > >> to make a choice with the valid information we have. This will also map > >> to the current behavior. 
Say if all next wakeup information provided to > >> the devices were in the past, we would be no worse (or better) than what > >> we do without this change. > > > >Well, I don't quite agree (at least not yet), but let me elaborate, as > >I think we can do better without having to add complexity. > > > >Normally, I don't think a driver needs to clean up its device's next > >wakeup in between the remote wakeups, instead it should just set a new > >value. > > > >That's because, even if the driver acts to a remote wakeup or deals > >with a request entering a queue, the driver needs to runtime resume > >its device during this period. This prevents genpd from power off the > >PM domain, hence also the genpd governor from potentially looking at > >"invalid" wakeup information for its attached devices. > > > Could you elaborate a bit? Why would a remote wakeup affect the next > wakeup. I'm afraid that I'm not getting the situation correctly. Let me try :-) A remote wakeup is a wakeup irq that is triggered when the device is in runtime suspended state. I was expecting that you would be arming a remote wakeup for the corresponding device that is attached to a genpd, when the use case becomes enabled. Additionally, to allow the driver to consume the wakeup irq, it needs to runtime resume its device (which means its PM domain via genpd must be powered on as well, if it's not on already). Therefore, during the period of when the driver consumes the wakeup irq, its device's PM domain remains powered on. When this is completed, the driver allows its device to become runtime suspended again. At some point before the device becomes runtime suspended, the driver should set a new value of the "next wakeup" for its device. > > >Of course, I assume there are situations, where a driver actually > >needs to do a clean up of its device's next wakeup, but that should be > >less frequent and likely when a remote wakeup is disabled (for > >whatever reason). 
> > > A common case would be that the driver does not know when the usecase is > being turned off and therefore may not be able to set the next wakeup to > max. If the stale value continues to exist then we may never power off > the domain. Right. But, how do you know that the use case starts and what prevents us from knowing that the use case has stopped? Maybe if you add a user of the new APIs, this would help me to understand better? > > >> >> + /* > >> >> + * Find the next wakeup from devices that can determine their own wakeup > >> >> + * to find when the domain would wakeup and do it for every device down > >> >> + * the hierarchy. It is not worth while to sleep if the state's residency > >> >> + * cannot be met. > >> >> + */ > >> >> + update_domain_next_wakeup(genpd, now); > >> >> + if (genpd->next_wakeup != KTIME_MAX) { > >> >> + /* Let's find out the deepest domain idle state, the devices prefer */ > >> >> + while (state_idx >= 0) { > >> >> + if (next_wakeup_allows_state(genpd, state_idx, now)) { > >> >> + genpd->max_off_time_changed = true; > >> >> + break; > >> >> + } > >> >> + state_idx--; > >> >> + } > >> >> + > >> >> + if (state_idx < 0) { > >> >> + state_idx = 0; > >> >> + genpd->cached_power_down_ok = false; > >> >> + goto done; > >> >> + } > >> >> + } > >> >> + > >> > > >> >The above would introduce unnecessary overhead, as it may become > >> >executed in cases when it's not needed. > >> > > >> >For example, there's no point doing the above, if the domain doesn't > >> >specify residency values for its idle states. > >> > > >> We would still need to ensure that the next wakeup is after the > >> power_off_latency, if specified. > > > >Good point! Although, I would rather avoid adding the overhead, unless > >the residency is specified. Do you see a problem with this approach? > > > Hmmm, no strong objections. However, we still need to run through the > states to make sure the residency is not set and have a variable track > that. Right. 
The important part is that we can do that once and not for every call to the governor. > The devices wouldn't know that and would still continue to set the > next wakeup, unless we find a way to let them know we are not using this > feature for the domain. Right. To allow the driver to know, we could respond with an error code from the new dev_pm_genpd_set_next_wakeup() API (from patch1), in case the genpd+governor doesn't support it. Would that be okay? Otherwise, we will have to add a separate genpd API, asking explicitly if the "next wakeup" feature is supported. > > >Another option is to add a new governor, but if possible, I would like > >to avoid that. > > > Absolutely, we should avoid that. > [...] Kind regards Uffe ^ permalink raw reply [flat|nested] 12+ messages in thread
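The "compute once, fail fast" idea agreed on here could look roughly like the following user-space sketch. Every name, and the plain `-1` standing in for an errno value, is hypothetical; only the control flow (scan the idle states once at setup, cheap flag check per call) reflects the discussion:

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_genpd {
	const int64_t *residency_ns;	/* per-state residency, 0 = unspecified */
	int state_count;
	bool honors_next_wakeup;	/* computed once at init */
};

/*
 * Scan the idle states once, when the domain is set up, and cache
 * whether any state specifies a residency. Per-call paths then only
 * test the cached flag instead of re-walking the states.
 */
static void sketch_genpd_init_flags(struct sketch_genpd *pd)
{
	pd->honors_next_wakeup = false;
	for (int i = 0; i < pd->state_count; i++)
		if (pd->residency_ns[i] > 0)
			pd->honors_next_wakeup = true;
}

/* A next-wakeup update can then return an error cheaply when unsupported. */
static int sketch_set_next_wakeup(struct sketch_genpd *pd, int64_t next)
{
	(void)next;
	return pd->honors_next_wakeup ? 0 : -1;	/* -1: stand-in for an errno */
}
```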
* Re: [PATCH v5 2/2] PM / Domains: use device's next wakeup to determine domain idle state 2020-11-13 10:33 ` Ulf Hansson @ 2020-11-16 15:57 ` Lina Iyer 2020-11-19 9:56 ` Ulf Hansson 0 siblings, 1 reply; 12+ messages in thread From: Lina Iyer @ 2020-11-16 15:57 UTC (permalink / raw) To: Ulf Hansson; +Cc: Rafael J. Wysocki, Linux PM, linux-arm-msm On Fri, Nov 13 2020 at 03:34 -0700, Ulf Hansson wrote: >On Wed, 11 Nov 2020 at 17:51, Lina Iyer <ilina@codeaurora.org> wrote: >> >> On Tue, Nov 10 2020 at 03:02 -0700, Ulf Hansson wrote: >> >On Mon, 9 Nov 2020 at 18:41, Lina Iyer <ilina@codeaurora.org> wrote: >> >> >> >> On Mon, Nov 09 2020 at 08:27 -0700, Ulf Hansson wrote: >> >> >On Fri, 6 Nov 2020 at 17:48, Lina Iyer <ilina@codeaurora.org> wrote: >> >> >> >> >> [...] >> >> >> +static void update_domain_next_wakeup(struct generic_pm_domain *genpd, ktime_t now) >> >> >> +{ >> >> >> + ktime_t domain_wakeup = KTIME_MAX; >> >> >> + ktime_t next_wakeup; >> >> >> + struct pm_domain_data *pdd; >> >> >> + struct gpd_link *link; >> >> >> + >> >> >> + /* Find the earliest wakeup for all devices in the domain */ >> >> >> + list_for_each_entry(pdd, &genpd->dev_list, list_node) { >> >> >> + next_wakeup = to_gpd_data(pdd)->next_wakeup; >> >> >> + if (next_wakeup != KTIME_MAX && !ktime_before(next_wakeup, now)) >> >> >> + if (ktime_before(next_wakeup, domain_wakeup)) >> >> >> + domain_wakeup = next_wakeup; >> >> > >> >> >If it turns out that one of the device's next_wakeup is before "now", >> >> >leading to ktime_before() above returns true - then I think you should >> >> >bail out, no? >> >> > >> >> >At least, we shouldn't just continue and ignore this case, right? >> >> > >> >> No, that could be a very common case. Drivers are not expected to clean >> >> up the next wakeup by setting it to KTIME_MAX. The best we can do is >> >> to make a choice with the valid information we have. This will also map >> >> to the current behavior. 
Say if all next wakeup information provided to >> >> the devices were in the past, we would be no worse (or better) than what >> >> we do without this change. >> > >> >Well, I don't quite agree (at least not yet), but let me elaborate, as >> >I think we can do better without having to add complexity. >> > >> >Normally, I don't think a driver needs to clean up its device's next >> >wakeup in between the remote wakeups, instead it should just set a new >> >value. >> > >> >That's because, even if the driver acts to a remote wakeup or deals >> >with a request entering a queue, the driver needs to runtime resume >> >its device during this period. This prevents genpd from power off the >> >PM domain, hence also the genpd governor from potentially looking at >> >"invalid" wakeup information for its attached devices. >> > >> Could you elaborate a bit? Why would a remote wakeup affect the next >> wakeup. I'm afraid that I'm not getting the situation correctly. > >Let me try :-) > >A remote wakeup is a wakeup irq that is triggered when the device is >in runtime suspended state. > >I was expecting that you would be arming a remote wakeup for the >corresponding device that is attached to a genpd, when the use case >becomes enabled. Additionally, to allow the driver to consume the >wakeup irq, it needs to runtime resume its device (which means its PM >domain via genpd must be powered on as well, if it's not on already). > >Therefore, during the period of when the driver consumes the wakeup >irq, its device's PM domain remains powered on. When this is >completed, the driver allows its device to become runtime suspended >again. At some point before the device becomes runtime suspended, the >driver should set a new value of the "next wakeup" for its device. > Okay, that is clear. Thanks :) Yes, we would expect the device to set up its next_wakeup before doing runtime suspend and if the next wakeup is in the past, we would possibly not have runtime suspended the device. 
That would keep the domain ON and we would not come to this governor at all. And if we are still doing it, then the device has not set the next wakeup correctly. What you are suggesting is that we should not power down the domain in such a case. This would be a really hard problem to debug when a device leaves a stale wakeup behind and we no longer power off the domain because of that. Tracking that back to the device will be a monumental effort. Ignoring the next wakeup, though, might involve a power/perf penalty (no worse than today), but we would not have a difficult problem to solve. >> >> >Of course, I assume there are situations where a driver actually >> >needs to clean up its device's next wakeup, but that should be >> >less frequent and likely when a remote wakeup is disabled (for >> >whatever reason). >> > >> A common case would be that the driver does not know when the use case is >> being turned off and therefore may not be able to set the next wakeup to >> max. If the stale value continues to exist then we may never power off >> the domain. > >Right. > >But, how do you know that the use case starts and what prevents us >from knowing that the use case has stopped? > Usually, the device is made aware of the use case when it starts, but stopping the use case might be something the device may or may not be aware of. >Maybe if you add a user of the new APIs, this would help me to >understand better? > I have been asking some folks, but let's see. [...] >> >> >For example, there's no point doing the above, if the domain doesn't >> >> >specify residency values for its idle states. >> >> > >> >> We would still need to ensure that the next wakeup is after the >> >> power_off_latency, if specified. >> > >> >Good point! Although, I would rather avoid adding the overhead, unless >> >the residency is specified. Do you see a problem with this approach? >> > >> Hmmm, no strong objections.
However, we still need to run through the >> states to make sure the residency is not set and have a variable track >> that. >Right. > >The important part is that we can do that once and not for every call >to the governor. > >> The devices wouldn't know that and would still continue to set the >> next wakeup, unless we find a way to let them know we are not using this >> feature for the domain. >Right. > >To allow the driver to know, we could respond with an error code from >the new dev_pm_genpd_set_next_wakeup() API (from patch1), in >case the genpd+governor doesn't support it. > It would be unnecessary work, every time a next wakeup is set, to iterate through the available states and figure out whether the residency has been set or not. We could probably hold that result in a variable when we set up the genpd states. Expect the next_wakeup to be set from a critical path or an interrupt handler, so we have to be quick. >Would that be okay? Otherwise we will have to add a separate genpd >API, asking explicitly if the "next wakeup" feature is supported. > Would like to avoid that as much as possible. Thanks, Lina ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5 2/2] PM / Domains: use device's next wakeup to determine domain idle state 2020-11-16 15:57 ` Lina Iyer @ 2020-11-19 9:56 ` Ulf Hansson 2020-11-19 15:47 ` Lina Iyer 2020-11-19 17:46 ` Lina Iyer 0 siblings, 2 replies; 12+ messages in thread From: Ulf Hansson @ 2020-11-19 9:56 UTC (permalink / raw) To: Lina Iyer; +Cc: Rafael J. Wysocki, Linux PM, linux-arm-msm On Mon, 16 Nov 2020 at 16:57, Lina Iyer <ilina@codeaurora.org> wrote: > > On Fri, Nov 13 2020 at 03:34 -0700, Ulf Hansson wrote: > >On Wed, 11 Nov 2020 at 17:51, Lina Iyer <ilina@codeaurora.org> wrote: > >> > >> On Tue, Nov 10 2020 at 03:02 -0700, Ulf Hansson wrote: > >> >On Mon, 9 Nov 2020 at 18:41, Lina Iyer <ilina@codeaurora.org> wrote: > >> >> > >> >> On Mon, Nov 09 2020 at 08:27 -0700, Ulf Hansson wrote: > >> >> >On Fri, 6 Nov 2020 at 17:48, Lina Iyer <ilina@codeaurora.org> wrote: > >> >> >> > >> >> [...] > >> >> >> +static void update_domain_next_wakeup(struct generic_pm_domain *genpd, ktime_t now) > >> >> >> +{ > >> >> >> + ktime_t domain_wakeup = KTIME_MAX; > >> >> >> + ktime_t next_wakeup; > >> >> >> + struct pm_domain_data *pdd; > >> >> >> + struct gpd_link *link; > >> >> >> + > >> >> >> + /* Find the earliest wakeup for all devices in the domain */ > >> >> >> + list_for_each_entry(pdd, &genpd->dev_list, list_node) { > >> >> >> + next_wakeup = to_gpd_data(pdd)->next_wakeup; > >> >> >> + if (next_wakeup != KTIME_MAX && !ktime_before(next_wakeup, now)) > >> >> >> + if (ktime_before(next_wakeup, domain_wakeup)) > >> >> >> + domain_wakeup = next_wakeup; > >> >> > > >> >> >If it turns out that one of the device's next_wakeup is before "now", > >> >> >leading to ktime_before() above returns true - then I think you should > >> >> >bail out, no? > >> >> > > >> >> >At least, we shouldn't just continue and ignore this case, right? > >> >> > > >> >> No, that could be a very common case. Drivers are not expected to clean > >> >> up the next wakeup by setting it to KTIME_MAX. 
The best we can do is > >> >> to make a choice with the valid information we have. This will also map > >> >> to the current behavior. Say if all next wakeup information provided to > >> >> the devices were in the past, we would be no worse (or better) than what > >> >> we do without this change. > >> > > >> >Well, I don't quite agree (at least not yet), but let me elaborate, as > >> >I think we can do better without having to add complexity. > >> > > >> >Normally, I don't think a driver needs to clean up its device's next > >> >wakeup in between the remote wakeups, instead it should just set a new > >> >value. > >> > > >> >That's because, even if the driver acts to a remote wakeup or deals > >> >with a request entering a queue, the driver needs to runtime resume > >> >its device during this period. This prevents genpd from power off the > >> >PM domain, hence also the genpd governor from potentially looking at > >> >"invalid" wakeup information for its attached devices. > >> > > >> Could you elaborate a bit? Why would a remote wakeup affect the next > >> wakeup. I'm afraid that I'm not getting the situation correctly. > > > >Let me try :-) > > > >A remote wakeup is a wakeup irq that is triggered when the device is > >in runtime suspended state. > > > >I was expecting that you would be arming a remote wakeup for the > >corresponding device that is attached to a genpd, when the use case > >becomes enabled. Additionally, to allow the driver to consume the > >wakeup irq, it needs to runtime resume its device (which means its PM > >domain via genpd must be powered on as well, if it's not on already). > > > >Therefore, during the period of when the driver consumes the wakeup > >irq, its device's PM domain remains powered on. When this is > >completed, the driver allows its device to become runtime suspended > >again. At some point before the device becomes runtime suspended, the > >driver should set a new value of the "next wakeup" for its device. 
> > > Okay, that is clear. Thanks :) > > Yes, we would expect the device to set up its next_wakeup before doing > runtime suspend and if the next wakeup is in the past, we would possibly > not have runtime suspended the device. That would keep the domain ON and > we would not come to this governor at all. And if we still are doing it, > then the device has not set the next wakeup correctly. Correct. > > What you are suggesting is that, we should not power down the domain in > such a case. This would be a really hard problem to debug when a device > leaves a stale wakeup behind and we no longer power off the domain > because of that. Tracking that back to the device will be a monumental > effort. Ignoring the next wakeup though might involve a power/perf > penalty (no worse than today), but we would not have a difficult problem > to solve. Hmm, you have a good point! Additionally, I guess it should be a rather seldom situation, as in principle the wakeup irq should have been triggered already. That said, I am okay to stick with your suggested approach. Although, please add a comment in the code, to make it clear that the behaviour is deliberate. Perhaps we should also clarify the function description of dev_pm_genpd_set_next_wakeup() (in patch1) to make the behaviour more clear for the user. > > >> > >> >Of course, I assume there are situations, where a driver actually > >> >needs to do a clean up of its device's next wakeup, but that should be > >> >less frequent and likely when a remote wakeup is disabled (for > >> >whatever reason). > >> > > >> A common case would be that the driver does not know when the usecase is > >> being turned off and therefore may not be able to set the next wakeup to > >> max. If the stale value continues to exist then we may never power off > >> the domain. > > > >Right. > > > >But, how do you know that the use case starts and what prevents us > >from knowing that the use case has stopped? 
> > > Usually, the device is made aware of the usecase when it starts, but > stopping the usecase might be something the device may or may not be > aware of. Okay, I see. I guess this means the remote wakeup stays armed, but there are no longer any wakeups being triggered. > > >Maybe if you add a user of the new APIs, this would help me to > >understand better? > > > I have been asking some folks, but let's see. > > [...] > > >> >> >For example, there's no point doing the above, if the domain doesn't > >> >> >specify residency values for its idle states. > >> >> > > >> >> We would still need to ensure that the next wakeup is after the > >> >> power_off_latency, if specified. > >> > > >> >Good point! Although, I would rather avoid adding the overhead, unless > >> >the residency is specified. Do you see a problem with this approach? > >> > > >> Hmmm, no strong objections. However, we still need to run through the > >> states to make sure the residency is not set and have a variable track > >> that. > > > >Right. > > > >The important part is that we can do that once and not for every call > >to the governor. > > > >> The devices wouldn't know that and would still continue to set the > >> next wakeup, unless we find a way to let them know we are not using this > >> feature for the domain. > > > >Right. > > > >To allow the driver to know, we could respond with an error code from > >the new dev_pm_genpd_set_performance_state() API (from patch1), in > >case the genpd+governor doesn't support it. > > > It would an unnecessary work everytime a next wakeup is being set to > iterate through the available states and figure out if the residency has > been set or not. We could probably hold that result in a variable when > we setup the genpd states. Expect the next_wake to be set from a > critical path or an interrupt handler, so we have to be quick. Yes, that's the idea I had in mind. Maybe it's not feasible, let's see. 
However, for sure I am looking at decreasing overhead, not at
increasing it. :-)

> >Would that be okay? Otherwise we will have to add a separate genpd
> >API, asking explicitly if the "next wakeup" feature is supported.
> >
> Would like to avoid that as much as possible.

Okay, good.

Kind regards
Uffe
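The stale-wakeup behaviour agreed on above could be sketched roughly as follows. This is a minimal, self-contained C model of the governor decision, not the actual genpd code: `domain_state`, `state_is_worthwhile`, and the `ktime_t` stand-in are all illustrative names.

```c
#include <stdbool.h>

typedef long long ktime_t;      /* nanoseconds, standing in for the kernel type */

struct domain_state {
	ktime_t residency_ns;   /* minimum time worth spending in this state */
};

/*
 * Return true if entering the idle state is worthwhile, given the
 * earliest next wakeup across the domain's devices.
 */
static bool state_is_worthwhile(const struct domain_state *st,
				ktime_t now, ktime_t next_wakeup)
{
	/*
	 * Deliberate: a next_wakeup in the past is treated as "no
	 * information" rather than blocking power-off.  Tracking a stale
	 * value back to the offending device would be very hard to debug,
	 * and the worst case is the same power/perf behaviour as today.
	 */
	if (next_wakeup <= now)
		return true;

	return (next_wakeup - now) >= st->residency_ns;
}
```

The key design point from the thread is the first branch: a stale wakeup is ignored with a comment making clear the behaviour is intentional.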
* Re: [PATCH v5 2/2] PM / Domains: use device's next wakeup to determine domain idle state
  2020-11-19  9:56 ` Ulf Hansson
@ 2020-11-19 15:47 ` Lina Iyer
  2020-11-19 17:46 ` Lina Iyer
  1 sibling, 0 replies; 12+ messages in thread
From: Lina Iyer @ 2020-11-19 15:47 UTC (permalink / raw)
To: Ulf Hansson; +Cc: Rafael J. Wysocki, Linux PM, linux-arm-msm

On Thu, Nov 19 2020 at 02:57 -0700, Ulf Hansson wrote:
>On Mon, 16 Nov 2020 at 16:57, Lina Iyer <ilina@codeaurora.org> wrote:
>>
>> On Fri, Nov 13 2020 at 03:34 -0700, Ulf Hansson wrote:
>> >On Wed, 11 Nov 2020 at 17:51, Lina Iyer <ilina@codeaurora.org> wrote:
>> >>
>> >> On Tue, Nov 10 2020 at 03:02 -0700, Ulf Hansson wrote:
>> >> >On Mon, 9 Nov 2020 at 18:41, Lina Iyer <ilina@codeaurora.org> wrote:
>> >> >>
>> >> >> On Mon, Nov 09 2020 at 08:27 -0700, Ulf Hansson wrote:
>> >> >> >On Fri, 6 Nov 2020 at 17:48, Lina Iyer <ilina@codeaurora.org> wrote:
>> >> >> >>
>> [...]
>>
>> What you are suggesting is that, we should not power down the domain in
>> such a case. This would be a really hard problem to debug when a device
>> leaves a stale wakeup behind and we no longer power off the domain
>> because of that. Tracking that back to the device will be a monumental
>> effort. Ignoring the next wakeup though might involve a power/perf
>> penalty (no worse than today), but we would not have a difficult problem
>> to solve.
>
>Hmm, you have a good point!
>
>Additionally, I guess it should be a rather seldom situation, as in
>principle the wakeup irq should have been triggered already.
>
>That said, I am okay to stick with your suggested approach.
>
>Although, please add a comment in the code, to make it clear that the
>behaviour is deliberate. Perhaps we should also clarify the function
>description of dev_pm_genpd_set_next_wakeup() (in patch1) to make the
>behaviour more clear for the user.
>
Sure, will update with comments.

Let's revisit the patch again after I repost, to make sure this is what
we want, at least for now.
Thanks for your review, Ulf.

--Lina
* Re: [PATCH v5 2/2] PM / Domains: use device's next wakeup to determine domain idle state
  2020-11-19  9:56 ` Ulf Hansson
  2020-11-19 15:47 ` Lina Iyer
@ 2020-11-19 17:46 ` Lina Iyer
  1 sibling, 0 replies; 12+ messages in thread
From: Lina Iyer @ 2020-11-19 17:46 UTC (permalink / raw)
To: Ulf Hansson; +Cc: Rafael J. Wysocki, Linux PM, linux-arm-msm

On Thu, Nov 19 2020 at 02:57 -0700, Ulf Hansson wrote:
>On Mon, 16 Nov 2020 at 16:57, Lina Iyer <ilina@codeaurora.org> wrote:
>>
>> On Fri, Nov 13 2020 at 03:34 -0700, Ulf Hansson wrote:
>> >On Wed, 11 Nov 2020 at 17:51, Lina Iyer <ilina@codeaurora.org> wrote:
>> >>
>> >> On Tue, Nov 10 2020 at 03:02 -0700, Ulf Hansson wrote:
>> >> >On Mon, 9 Nov 2020 at 18:41, Lina Iyer <ilina@codeaurora.org> wrote:
>> >> >>
>> >> >> On Mon, Nov 09 2020 at 08:27 -0700, Ulf Hansson wrote:
>> >> >> >On Fri, 6 Nov 2020 at 17:48, Lina Iyer <ilina@codeaurora.org> wrote:
>> [...]
>>
>> >> >> >For example, there's no point doing the above, if the domain doesn't
>> >> >> >specify residency values for its idle states.
>> >> >> >
>> >> >> We would still need to ensure that the next wakeup is after the
>> >> >> power_off_latency, if specified.
>> >> >
>> >> >Good point! Although, I would rather avoid adding the overhead, unless
>> >> >the residency is specified. Do you see a problem with this approach?
>> >> >
>> >> Hmmm, no strong objections. However, we still need to run through the
>> >> states to make sure the residency is not set and have a variable track
>> >> that.
>> >
>> >Right.
>> >
>> >The important part is that we can do that once and not for every call
>> >to the governor.
>> >
>> >> The devices wouldn't know that and would still continue to set the
>> >> next wakeup, unless we find a way to let them know we are not using this
>> >> feature for the domain.
>> >
>> >Right.
>> >
>> >To allow the driver to know, we could respond with an error code from
>> >the new dev_pm_genpd_set_next_wakeup() API (from patch 1), in
>> >case the genpd+governor doesn't support it.
>> >
>> It would be unnecessary work, every time a next wakeup is set, to
>> iterate through the available states and figure out whether the residency
>> has been set or not. We could probably hold that result in a variable
>> when we set up the genpd states. We expect the next_wakeup to be set from
>> a critical path or an interrupt handler, so we have to be quick.
>
>Yes, that's the idea I had in mind.
>
>Maybe it's not feasible, let's see. However, for sure I am looking at
>decreasing overhead, not at increasing it. :-)
>
Wondering what you think about a genpd flag for this purpose? The flag
may be set when the genpd is initialized with idle states that have
residency specified. In the governor, we could skip this path
completely, if the flag is not set.

--Lina

>>
>> >Would that be okay? Otherwise we will have to add a separate genpd
>> >API, asking explicitly if the "next wakeup" feature is supported.
>> >
>> Would like to avoid that as much as possible.
>
>Okay, good.
>
>Kind regards
>Uffe
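The flag-based idea proposed above could be sketched like this — compute once, at state-setup time, whether any idle state specifies a residency, and let the governor's hot path check only a boolean. This is an illustrative, self-contained C model under assumed names (`genpd_init_states`, `has_residency`, and friends), not the real genpd API.

```c
#include <stdbool.h>
#include <stddef.h>

typedef long long ktime_t;      /* nanoseconds, standing in for the kernel type */

struct genpd_state {
	ktime_t residency_ns;   /* 0 means "not specified" */
};

struct genpd {
	const struct genpd_state *states;
	size_t nr_states;
	bool has_residency;     /* computed once at init, not per call */
};

/* Done once, when the domain's idle states are set up. */
static void genpd_init_states(struct genpd *pd,
			      const struct genpd_state *states, size_t n)
{
	size_t i;

	pd->states = states;
	pd->nr_states = n;
	pd->has_residency = false;
	for (i = 0; i < n; i++) {
		if (states[i].residency_ns > 0) {
			pd->has_residency = true;
			break;
		}
	}
}

/*
 * Hot path (called with next_wakeup set from a critical path or an
 * interrupt handler): a cheap flag check, no iteration over the states.
 */
static bool genpd_consider_next_wakeup(const struct genpd *pd)
{
	return pd->has_residency;
}
```

This matches the concern in the thread: the per-call cost of scanning the states is paid once at init, and domains without residency values skip the next_wakeup path entirely.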
Thread overview: 12+ messages

2020-11-06 16:48 [PATCH v5 0/2] Better domain idle from device wakeup patterns Lina Iyer
2020-11-06 16:48 ` [PATCH v5 1/2] PM / domains: inform PM domain of a device's next wakeup Lina Iyer
2020-11-06 16:48 ` [PATCH v5 2/2] PM / Domains: use device's next wakeup to determine domain idle state Lina Iyer
2020-11-09 15:26   ` Ulf Hansson
2020-11-09 17:41     ` Lina Iyer
2020-11-10 10:01       ` Ulf Hansson
2020-11-11 16:27         ` Lina Iyer
2020-11-13 10:33           ` Ulf Hansson
2020-11-16 15:57             ` Lina Iyer
2020-11-19  9:56               ` Ulf Hansson
2020-11-19 15:47                 ` Lina Iyer
2020-11-19 17:46                 ` Lina Iyer