linux-kernel.vger.kernel.org archive mirror
* [PATCH V7 0/2] PM / Domains: Power domain performance states
@ 2017-05-17  7:02 Viresh Kumar
  2017-05-17  7:02 ` [PATCH V7 1/2] PM / Domains: Add support to select performance-state of domains Viresh Kumar
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Viresh Kumar @ 2017-05-17  7:02 UTC (permalink / raw)
  To: Rafael Wysocki, ulf.hansson, Kevin Hilman
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot,
	Stephen Boyd, Nishanth Menon, robh+dt, lina.iyer, rnayak,
	sudeep.holla, Viresh Kumar

Hi,

Here is the 7th version of the series, and it looks very different from
what was sent until now. It is almost a rewrite.


Here is a brief summary of the problem I am trying to solve.

Some platforms have the capability to configure the performance state of
their power domains. The process of configuring the performance state is
largely platform-dependent, and we may need to work with a wide range of
configurables. For some platforms, like Qcom, it can be a single
positive integer value, while in other cases it can be voltage levels,
etc.

The power-domain framework has until now been designed only for the idle
state management of devices, and this needs to change in order to reuse
the power-domain framework for active state management of the devices.


Solution:

Kevin suggested in V6 that we should wait for a while before introducing
any new binding (power-domain-opp) to the OPP table for this, and
instead translate the device requirements into a performance index from
within the power domain driver, as it already knows which devices it
supports.

If we do that, then there is also no need to represent the performance
states of the power domains in DT. Keep that in the driver as well for
now.

This simplified things a lot, and we can make things work with just two
patches. The first one updates the genpd framework to supply new APIs
and the second patch uses them from the OPP core. It's quite
straightforward and simple.

The ideal way is still to get the relation between device and domain
states via DT instead of platform code, but that can be done
incrementally later once we have some users for it. It would be much
simpler to get these two patches merged first. The code never got any
real reviews in the last 6 versions, as we were stuck on bindings :)

This is currently tested by hacking the kernel a bit with virtual
power-domains for the dual-A15 Exynos platform.

Driver code:
------------

Here is some sample power-domain driver code that I have. It only
supports a single device (the CPU) for now.

static int pd_get_performance(struct device *dev, unsigned long rate)
{
	struct device *cpu_dev = get_cpu_device(0);

	/* This sample only supports the CPU device */
	if (cpu_dev != dev)
		return -EINVAL;

	/* Map the target frequency (Hz) to a performance state index */
	if (rate <= 500000000)
		return 1;
	else if (rate <= 700000000)
		return 2;
	else if (rate <= 900000000)
		return 3;
	else if (rate <= 1200000000)
		return 4;
	else if (rate <= 1600000000)
		return 5;
	else
		return 6;
}

static int pd_set_performance(struct generic_pm_domain *domain, unsigned int state)
{
	/* Set performance of the domain in platform dependent way */

	return 0;
}


static const struct of_device_id pm_domain_of_match[] __initconst = {
	{ .compatible = "foo,genpd", },
	{ },
};

/* Single domain instance for this sample; a real driver may have several */
static struct generic_pm_domain pd;

static int __init genpd_test_init(void)
{
	struct device *dev = get_cpu_device(0);
	struct device_node *np;
	const struct of_device_id *match;

	for_each_matching_node_and_match(np, pm_domain_of_match, &match) {
		pd.name = kstrdup_const(strrchr(np->full_name, '/') + 1,
					GFP_KERNEL);
		if (!pd.name) {
			of_node_put(np);
			return -ENOMEM;
		}

		pd.get_performance_state = pd_get_performance;
		pd.set_performance_state = pd_set_performance;

		pm_genpd_init(&pd, NULL, false);
		of_genpd_add_provider_simple(np, &pd);
	}

	return dev_pm_domain_attach(dev, false);
}


Pushed here as well:

https://git.linaro.org/people/viresh.kumar/linux.git/log/?h=opp/genpd-performance-state

Rebased on: pm/linux-next +  some OPP cleanups (https://marc.info/?l=linux-kernel&m=149499607030364&w=2)

V6->V7:
- Almost a rewrite, only two patches against 9 in earlier version.
- No bindings are updated now and the domains' performance states aren't
  passed via DT for now (until we know how users are going to use them).
- We also skipped the QoS framework completely; the new APIs are
  provided directly by genpd.

V5->V6:
- Use freq/voltage in OPP table as it is for power domain and don't
  create "domain-performance-level" property
- Create new "power-domain-opp" property for the devices.
- Take care of domain providers that provide multiple domains and extend
  "operating-points-v2" property to contain a list of phandles
- Update code according to those bindings.

V4->V5:
- Only 3 patches were resent and 2 of them are Acked from Ulf.

V3->V4:
- Use OPP table for genpd devices as well.
- Add struct device to genpd, in order to reuse OPP infrastructure.
- Based over: https://marc.info/?l=linux-kernel&m=148972988002317&w=2
- Fixed examples in DT document to have voltage in target,min,max order.

V2->V3:
- Based over latest pm/linux-next
- Bindings and code are merged together
- Lots of updates in bindings
  - the performance-states node is present within the power-domain now,
    instead of its phandle.
  - performance-level property is replaced by "reg".
  - domain-performance-state property of the consumers contain an
    integer value now instead of phandle.
- Lots of updates to the code as well
  - Patch "PM / QOS: Add default case to the switch" is merged with
    other patches and the code is changed a bit as well.
  - Don't pass 'type' to dev_pm_qos_add_notifier(), rather handle all
    notifiers with a single list. A new patch is added for that.
  - The OPP framework patch can be applied now and has proper SoB from
    me.
  - Dropped "PM / domain: Save/restore performance state at runtime
    suspend/resume".
  - Drop all WARN().
  - Tested-by Rajendra Nayak.

V1->V2:
- Based over latest pm/linux-next
- It is mostly a resend of what was sent earlier, as this series hadn't
  got any reviews so far and Rafael suggested that it's better I resend
  it.
- Only the 4/6 patch got an update, which was also shared earlier as a
  reply to V1. It has several fixes for taking care of the power domain
  hierarchy, etc.

--
viresh

Viresh Kumar (2):
  PM / Domains: Add support to select performance-state of domains
  PM / OPP: Support updating performance state of device's power domains

 drivers/base/power/domain.c   | 172 ++++++++++++++++++++++++++++++++++++++++++
 drivers/base/power/opp/core.c |  48 +++++++++++-
 drivers/base/power/opp/opp.h  |   2 +
 include/linux/pm_domain.h     |  19 +++++
 4 files changed, 240 insertions(+), 1 deletion(-)

-- 
2.13.0.303.g4ebf3021692d

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH V7 1/2] PM / Domains: Add support to select performance-state of domains
  2017-05-17  7:02 [PATCH V7 0/2] PM / Domains: Power domain performance states Viresh Kumar
@ 2017-05-17  7:02 ` Viresh Kumar
  2017-06-07 12:26   ` Ulf Hansson
  2017-05-17  7:02 ` [PATCH V7 2/2] PM / OPP: Support updating performance state of device's power domains Viresh Kumar
  2017-06-05  4:48 ` [PATCH V7 0/2] PM / Domains: Power domain performance states Viresh Kumar
  2 siblings, 1 reply; 10+ messages in thread
From: Viresh Kumar @ 2017-05-17  7:02 UTC (permalink / raw)
  To: Rafael Wysocki, ulf.hansson, Kevin Hilman, Pavel Machek, Len Brown
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot,
	Stephen Boyd, Nishanth Menon, robh+dt, lina.iyer, rnayak,
	sudeep.holla, Viresh Kumar

Some platforms have the capability to configure the performance state of
their Power Domains. The performance levels are identified by positive
integer values; a lower value represents a lower performance state.

This patch adds a new genpd API: pm_genpd_update_performance_state().
The caller passes the affected device and the frequency representing its
next DVFS state.

The power domains get two new callbacks:

- get_performance_state(): This is called by the genpd core to retrieve
  the performance state (integer value) corresponding to a target
  frequency for the device. This state is used by the genpd core to find
  the highest requested state by all the devices powered by a domain.

- set_performance_state(): The highest state retrieved from above
  interface is then passed to this callback to finally program the
  performance state of the power domain.

The power domains can avoid supplying these callbacks if they don't
support setting performance states.

A power domain may have only the get_performance_state() callback if it
doesn't have the capability of changing the performance state itself,
but someone in its parent hierarchy does.

A power domain may have only set_performance_state() if it doesn't have
any devices directly below it, only subdomains, and so
get_performance_state() will never get called.

The more common case would be to have both the callbacks set.
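As an illustration, the "highest requested state" selection the genpd
core performs reduces to a max over device and subdomain requests. The
sketch below is a standalone userspace model with made-up structure
names, not the real genpd types:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the genpd bookkeeping; illustrative only. */
struct dev_request { unsigned int performance_state; };
struct subdomain_request { unsigned int performance_state; };

/*
 * The domain's target state is simply the maximum of the states
 * requested by its directly-attached devices and by its subdomains;
 * 0 means no requirement.
 */
static unsigned int aggregate_state(const struct dev_request *devs, size_t ndevs,
				    const struct subdomain_request *subs,
				    size_t nsubs)
{
	unsigned int state = 0;
	size_t i;

	for (i = 0; i < ndevs; i++)
		if (devs[i].performance_state > state)
			state = devs[i].performance_state;

	for (i = 0; i < nsubs; i++)
		if (subs[i].performance_state > state)
			state = subs[i].performance_state;

	return state;
}
```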

Another API, pm_genpd_has_performance_state(), is also added to let
other parts of the kernel check if the power domain of a device supports
performance-states or not.

Note that the same performance level as returned by the parent domain of
a device is used for every other domain in the parent hierarchy.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/domain.c | 172 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm_domain.h   |  19 +++++
 2 files changed, 191 insertions(+)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 71c95ad808d5..56b666eb1a71 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -466,6 +466,166 @@ static int genpd_dev_pm_qos_notifier(struct notifier_block *nb,
 	return NOTIFY_DONE;
 }
 
+/*
+ * Returns true if anyone in genpd's parent hierarchy has
+ * set_performance_state() set.
+ */
+static bool genpd_has_set_performance_state(struct generic_pm_domain *genpd)
+{
+	struct gpd_link *link;
+
+	if (genpd->set_performance_state)
+		return true;
+
+	list_for_each_entry(link, &genpd->slave_links, slave_node) {
+		if (genpd_has_set_performance_state(link->master))
+			return true;
+	}
+
+	return false;
+}
+
+/**
+ * pm_genpd_has_performance_state - Checks if power domain does performance
+ * state management.
+ *
+ * @dev: Device whose power domain is getting inquired.
+ */
+bool pm_genpd_has_performance_state(struct device *dev)
+{
+	struct generic_pm_domain *genpd = dev_to_genpd(dev);
+
+	/* The parent domain must have set get_performance_state() */
+	if (!IS_ERR(genpd) && genpd->get_performance_state)
+		return true;
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(pm_genpd_has_performance_state);
+
+static int genpd_update_performance_state(struct generic_pm_domain *genpd,
+					  int depth)
+{
+	struct generic_pm_domain_data *pd_data;
+	struct generic_pm_domain *subdomain;
+	struct pm_domain_data *pdd;
+	unsigned int state = 0, prev;
+	struct gpd_link *link;
+	int ret;
+
+	/* Traverse all devices within the domain */
+	list_for_each_entry(pdd, &genpd->dev_list, list_node) {
+		pd_data = to_gpd_data(pdd);
+
+		if (pd_data->performance_state > state)
+			state = pd_data->performance_state;
+	}
+
+	/* Traverse all subdomains within the domain */
+	list_for_each_entry(link, &genpd->master_links, master_node) {
+		subdomain = link->slave;
+
+		if (subdomain->performance_state > state)
+			state = subdomain->performance_state;
+	}
+
+	if (genpd->performance_state == state)
+		return 0;
+
+	/*
+	 * Not all domains support updating performance state. Move on to their
+	 * parent domains in that case.
+	 */
+	if (genpd->set_performance_state) {
+		ret = genpd->set_performance_state(genpd, state);
+		if (!ret)
+			genpd->performance_state = state;
+
+		return ret;
+	}
+
+	prev = genpd->performance_state;
+	genpd->performance_state = state;
+
+	list_for_each_entry(link, &genpd->slave_links, slave_node) {
+		struct generic_pm_domain *master = link->master;
+
+		genpd_lock_nested(master, depth + 1);
+		ret = genpd_update_performance_state(master, depth + 1);
+		genpd_unlock(master);
+
+		if (!ret)
+			continue;
+
+		/*
+		 * Preserve earlier state for all the power domains which have
+		 * been updated.
+		 */
+		genpd->performance_state = prev;
+
+		list_for_each_entry(link, &genpd->slave_links, slave_node) {
+			struct generic_pm_domain *nmaster = link->master;
+
+			if (nmaster == master)
+				break;
+
+			genpd_lock_nested(nmaster, depth + 1);
+			genpd_update_performance_state(master, depth + 1);
+			genpd_unlock(master);
+		}
+
+		return ret;
+	}
+
+	return 0;
+}
+
+/**
+ * pm_genpd_update_performance_state - Update performance state of parent power
+ * domain for the target frequency for the device.
+ *
+ * @dev: Device for which the performance-state needs to be adjusted.
+ * @rate: Device's next frequency.
+ */
+int pm_genpd_update_performance_state(struct device *dev, unsigned long rate)
+{
+	struct generic_pm_domain *genpd = ERR_PTR(-ENODATA);
+	struct generic_pm_domain_data *gpd_data;
+	int ret, prev;
+
+	spin_lock_irq(&dev->power.lock);
+	if (dev->power.subsys_data && dev->power.subsys_data->domain_data) {
+		gpd_data = to_gpd_data(dev->power.subsys_data->domain_data);
+		genpd = dev_to_genpd(dev);
+	}
+	spin_unlock_irq(&dev->power.lock);
+
+	if (IS_ERR(genpd))
+		return -ENODEV;
+
+	if (!genpd->get_performance_state) {
+		dev_err(dev, "Performance states aren't supported by power domain\n");
+		return -EINVAL;
+	}
+
+	genpd_lock(genpd);
+	ret = genpd->get_performance_state(dev, rate);
+	if (ret < 0)
+		goto unlock;
+
+	prev = gpd_data->performance_state;
+	gpd_data->performance_state = ret;
+	ret = genpd_update_performance_state(genpd, 0);
+	if (ret)
+		gpd_data->performance_state = prev;
+
+unlock:
+	genpd_unlock(genpd);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pm_genpd_update_performance_state);
+
 /**
  * genpd_power_off_work_fn - Power off PM domain whose subdomain count is 0.
  * @work: Work structure used for scheduling the execution of this function.
@@ -1156,6 +1316,7 @@ static struct generic_pm_domain_data *genpd_alloc_dev_data(struct device *dev,
 	gpd_data->td.constraint_changed = true;
 	gpd_data->td.effective_constraint_ns = -1;
 	gpd_data->nb.notifier_call = genpd_dev_pm_qos_notifier;
+	gpd_data->performance_state = 0;
 
 	spin_lock_irq(&dev->power.lock);
 
@@ -1502,6 +1663,17 @@ int pm_genpd_init(struct generic_pm_domain *genpd,
 		genpd->dev_ops.start = pm_clk_resume;
 	}
 
+	/*
+	 * If this genpd supports getting performance state, then someone in its
+	 * hierarchy must support setting it too.
+	 */
+	if (genpd->get_performance_state &&
+	    !genpd_has_set_performance_state(genpd)) {
+		pr_err("%s: genpd doesn't support updating performance state\n",
+		       genpd->name);
+		return -ENODEV;
+	}
+
 	/* Always-on domains must be powered on at initialization. */
 	if (genpd_is_always_on(genpd) && !genpd_status_on(genpd))
 		return -EINVAL;
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index b7803a251044..74502664ea33 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -63,8 +63,12 @@ struct generic_pm_domain {
 	unsigned int device_count;	/* Number of devices */
 	unsigned int suspended_count;	/* System suspend device counter */
 	unsigned int prepared_count;	/* Suspend counter of prepared devices */
+	unsigned int performance_state;	/* Max requested performance state */
 	int (*power_off)(struct generic_pm_domain *domain);
 	int (*power_on)(struct generic_pm_domain *domain);
+	int (*get_performance_state)(struct device *dev, unsigned long rate);
+	int (*set_performance_state)(struct generic_pm_domain *domain,
+				     unsigned int state);
 	struct gpd_dev_ops dev_ops;
 	s64 max_off_time_ns;	/* Maximum allowed "suspended" time. */
 	bool max_off_time_changed;
@@ -118,6 +122,7 @@ struct generic_pm_domain_data {
 	struct pm_domain_data base;
 	struct gpd_timing_data td;
 	struct notifier_block nb;
+	unsigned int performance_state;
 	void *data;
 };
 
@@ -148,6 +153,9 @@ extern int pm_genpd_remove(struct generic_pm_domain *genpd);
 
 extern struct dev_power_governor simple_qos_governor;
 extern struct dev_power_governor pm_domain_always_on_gov;
+extern bool pm_genpd_has_performance_state(struct device *dev);
+extern int pm_genpd_update_performance_state(struct device *dev,
+					     unsigned long rate);
 #else
 
 static inline struct generic_pm_domain_data *dev_gpd_data(struct device *dev)
@@ -185,6 +193,17 @@ static inline int pm_genpd_remove(struct generic_pm_domain *genpd)
 	return -ENOTSUPP;
 }
 
+static inline bool pm_genpd_has_performance_state(struct device *dev)
+{
+	return false;
+}
+
+static inline int pm_genpd_update_performance_state(struct device *dev,
+						    unsigned long rate)
+{
+	return -ENOTSUPP;
+}
+
 #define simple_qos_governor		(*(struct dev_power_governor *)(NULL))
 #define pm_domain_always_on_gov		(*(struct dev_power_governor *)(NULL))
 #endif
-- 
2.13.0.303.g4ebf3021692d


* [PATCH V7 2/2] PM / OPP: Support updating performance state of device's power domains
  2017-05-17  7:02 [PATCH V7 0/2] PM / Domains: Power domain performance states Viresh Kumar
  2017-05-17  7:02 ` [PATCH V7 1/2] PM / Domains: Add support to select performance-state of domains Viresh Kumar
@ 2017-05-17  7:02 ` Viresh Kumar
  2017-06-05  4:48 ` [PATCH V7 0/2] PM / Domains: Power domain performance states Viresh Kumar
  2 siblings, 0 replies; 10+ messages in thread
From: Viresh Kumar @ 2017-05-17  7:02 UTC (permalink / raw)
  To: Rafael Wysocki, ulf.hansson, Kevin Hilman, Viresh Kumar,
	Nishanth Menon, Stephen Boyd
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, robh+dt,
	lina.iyer, rnayak, sudeep.holla, Viresh Kumar

The genpd framework now provides an API to request a device's power
domain to update its performance state based on a particular target
frequency for the device.

Use that interface from the OPP core for devices whose power domains
support performance states.

Note that the current implementation is restricted to the case where the
device doesn't have separate regulators for itself. We shouldn't
over-engineer the code before we have a real use case for it. We can
always come back and add more code to support such cases.
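The ordering this patch relies on (raise the domain's performance state
before raising the frequency, lower it only after the frequency has been
lowered) can be sketched in standalone C. The trace structure and helper
names below are hypothetical, purely to make the ordering visible:

```c
#include <assert.h>

/* Hypothetical stand-ins that just record the order of operations. */
enum op { OP_DOMAIN, OP_CLK };

struct trace {
	enum op ops[3];
	int n;
};

static void update_domain_state(struct trace *t) { t->ops[t->n++] = OP_DOMAIN; }
static void update_clk_rate(struct trace *t)     { t->ops[t->n++] = OP_CLK; }

/*
 * Mirror of the ordering used when both a clock and a domain state must
 * change: the domain state is updated before the frequency when scaling
 * up, and after it when scaling down, so the domain always satisfies
 * the device's current frequency.
 */
static void set_rate(struct trace *t, unsigned long old_freq, unsigned long freq)
{
	if (freq > old_freq)
		update_domain_state(t);	/* scaling up: domain first */

	update_clk_rate(t);

	if (freq < old_freq)
		update_domain_state(t);	/* scaling down: domain last */
}
```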

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/opp/core.c | 48 ++++++++++++++++++++++++++++++++++++++++++-
 drivers/base/power/opp/opp.h  |  2 ++
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index 898f19ea0f60..3c1036f638f6 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -19,6 +19,7 @@
 #include <linux/slab.h>
 #include <linux/device.h>
 #include <linux/export.h>
+#include <linux/pm_domain.h>
 #include <linux/regulator/consumer.h>
 
 #include "opp.h"
@@ -535,6 +536,42 @@ _generic_set_opp_clk_only(struct device *dev, struct clk *clk,
 	return ret;
 }
 
+static inline int
+_generic_set_opp_domain(struct device *dev, struct clk *clk,
+			unsigned long old_freq, unsigned long freq)
+{
+	int ret;
+
+	/* Scaling up? Scale domain performance state before frequency */
+	if (freq > old_freq) {
+		ret = pm_genpd_update_performance_state(dev, freq);
+		if (ret)
+			return ret;
+	}
+
+	ret = _generic_set_opp_clk_only(dev, clk, old_freq, freq);
+	if (ret)
+		goto restore_domain_state;
+
+	/* Scaling down? Scale domain performance state after frequency */
+	if (freq < old_freq) {
+		ret = pm_genpd_update_performance_state(dev, freq);
+		if (ret)
+			goto restore_freq;
+	}
+
+	return 0;
+
+restore_freq:
+	if (_generic_set_opp_clk_only(dev, clk, freq, old_freq))
+		dev_err(dev, "%s: failed to restore old-freq (%lu Hz)\n",
+			__func__, old_freq);
+restore_domain_state:
+	pm_genpd_update_performance_state(dev, old_freq);
+
+	return ret;
+}
+
 static int _generic_set_opp_regulator(struct opp_table *opp_table,
 				      struct device *dev,
 				      unsigned long old_freq,
@@ -653,7 +690,14 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 
 	/* Only frequency scaling */
 	if (!opp_table->regulators) {
-		ret = _generic_set_opp_clk_only(dev, clk, old_freq, freq);
+		/*
+		 * We don't support devices with both regulator and
+		 * domain performance-state for now.
+		 */
+		if (opp_table->genpd_performance_state)
+			ret = _generic_set_opp_domain(dev, clk, old_freq, freq);
+		else
+			ret = _generic_set_opp_clk_only(dev, clk, old_freq, freq);
 	} else if (!opp_table->set_opp) {
 		ret = _generic_set_opp_regulator(opp_table, dev, old_freq, freq,
 						 IS_ERR(old_opp) ? NULL : old_opp->supplies,
@@ -755,6 +799,8 @@ static struct opp_table *_allocate_opp_table(struct device *dev)
 				ret);
 	}
 
+	opp_table->genpd_performance_state = pm_genpd_has_performance_state(dev);
+
 	BLOCKING_INIT_NOTIFIER_HEAD(&opp_table->head);
 	INIT_LIST_HEAD(&opp_table->opp_list);
 	mutex_init(&opp_table->lock);
diff --git a/drivers/base/power/opp/opp.h b/drivers/base/power/opp/opp.h
index 166eef990599..1efa253e1934 100644
--- a/drivers/base/power/opp/opp.h
+++ b/drivers/base/power/opp/opp.h
@@ -135,6 +135,7 @@ enum opp_table_access {
  * @clk: Device's clock handle
  * @regulators: Supply regulators
  * @regulator_count: Number of power supply regulators
+ * @genpd_performance_state: Whether the device's power domain supports performance states.
  * @set_opp: Platform specific set_opp callback
  * @set_opp_data: Data to be passed to set_opp callback
  * @dentry:	debugfs dentry pointer of the real device directory (not links).
@@ -170,6 +171,7 @@ struct opp_table {
 	struct clk *clk;
 	struct regulator **regulators;
 	unsigned int regulator_count;
+	bool genpd_performance_state;
 
 	int (*set_opp)(struct dev_pm_set_opp_data *data);
 	struct dev_pm_set_opp_data *set_opp_data;
-- 
2.13.0.303.g4ebf3021692d


* Re: [PATCH V7 0/2] PM / Domains: Power domain performance states
  2017-05-17  7:02 [PATCH V7 0/2] PM / Domains: Power domain performance states Viresh Kumar
  2017-05-17  7:02 ` [PATCH V7 1/2] PM / Domains: Add support to select performance-state of domains Viresh Kumar
  2017-05-17  7:02 ` [PATCH V7 2/2] PM / OPP: Support updating performance state of device's power domains Viresh Kumar
@ 2017-06-05  4:48 ` Viresh Kumar
  2 siblings, 0 replies; 10+ messages in thread
From: Viresh Kumar @ 2017-06-05  4:48 UTC (permalink / raw)
  To: Rafael Wysocki, ulf.hansson, Kevin Hilman
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot,
	Stephen Boyd, Nishanth Menon, robh+dt, lina.iyer, rnayak,
	sudeep.holla

On 17-05-17, 12:32, Viresh Kumar wrote:
> Hi,
> 
> Here is the 7th version of the series and it looks very different from
> whatever is sent until now. Almost a rewrite.

Gentle reminder as 3 weeks have passed since this was last posted :)

@Kevin/Ulf: Any inputs ?

-- 
viresh


* Re: [PATCH V7 1/2] PM / Domains: Add support to select performance-state of domains
  2017-05-17  7:02 ` [PATCH V7 1/2] PM / Domains: Add support to select performance-state of domains Viresh Kumar
@ 2017-06-07 12:26   ` Ulf Hansson
  2017-06-08  7:05     ` Viresh Kumar
  0 siblings, 1 reply; 10+ messages in thread
From: Ulf Hansson @ 2017-06-07 12:26 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Kevin Hilman, Pavel Machek, Len Brown,
	linaro-kernel, linux-pm, linux-kernel, Vincent Guittot,
	Stephen Boyd, Nishanth Menon, Rob Herring, Lina Iyer,
	Rajendra Nayak, Sudeep Holla

On 17 May 2017 at 09:02, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> Some platforms have the capability to configure the performance state of
> their Power Domains. The performance levels are identified by positive
> integer values, a lower value represents lower performance state.
>
> This patch adds a new genpd API: pm_genpd_update_performance_state().
> The caller passes the affected device and the frequency representing its
> next DVFS state.
>
> The power domains get two new callbacks:
>
> - get_performance_state(): This is called by the genpd core to retrieve
>   the performance state (integer value) corresponding to a target
>   frequency for the device. This state is used by the genpd core to find
>   the highest requested state by all the devices powered by a domain.
>
> - set_performance_state(): The highest state retrieved from above
>   interface is then passed to this callback to finally program the
>   performance state of the power domain.
>
> The power domains can avoid supplying these callbacks, if they don't
> support setting performance-states.
>
> A power domain may have only get_performance_state() callback, if it
> doesn't have the capability of changing the performance state itself but
> someone in its parent hierarchy has.
>
> A power domain may have only set_performance_state(), it doesn't have
> any direct devices below it and only subdomains. And so the
> get_performance_state() will never get called.
>
> The more common case would be to have both the callbacks set.
>
> Another API, pm_genpd_has_performance_state(), is also added to let
> other parts of the kernel check if the power domain of a device supports
> performance-states or not.
>
> Note that the same performance level as returned by the parent domain of
> a device is used for every other domain in parent hierarchy.
>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Apologies for the delay! I was kind of waiting for Kevin to jump in, as
he has already been in the loop.

Anyway, let me give you my thoughts about this, starting with some
comments, then an overall thought.

> ---
>  drivers/base/power/domain.c | 172 ++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pm_domain.h   |  19 +++++
>  2 files changed, 191 insertions(+)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 71c95ad808d5..56b666eb1a71 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -466,6 +466,166 @@ static int genpd_dev_pm_qos_notifier(struct notifier_block *nb,
>         return NOTIFY_DONE;
>  }
>
> +/*
> + * Returns true if anyone in genpd's parent hierarchy has
> + * set_performance_state() set.
> + */
> +static bool genpd_has_set_performance_state(struct generic_pm_domain *genpd)
> +{
> +       struct gpd_link *link;
> +
> +       if (genpd->set_performance_state)
> +               return true;
> +
> +       list_for_each_entry(link, &genpd->slave_links, slave_node) {
> +               if (genpd_has_set_performance_state(link->master))
> +                       return true;
> +       }
> +
> +       return false;
> +}
> +
> +/**
> + * pm_genpd_has_performance_state - Checks if power domain does performance
> + * state management.
> + *
> + * @dev: Device whose power domain is getting inquired.
> + */
> +bool pm_genpd_has_performance_state(struct device *dev)
> +{
> +       struct generic_pm_domain *genpd = dev_to_genpd(dev);
> +
> +       /* The parent domain must have set get_performance_state() */
> +       if (!IS_ERR(genpd) && genpd->get_performance_state)
> +               return true;
> +
> +       return false;
> +}
> +EXPORT_SYMBOL_GPL(pm_genpd_has_performance_state);

There is no protection from a locking point of view, and there is no
validation that the pm_domain pointer is a valid genpd.

Please study the locks that are used in genpd more carefully and try to
figure out what is needed here. I know it's a bit tricky, but the above
just isn't sufficient.

> +
> +static int genpd_update_performance_state(struct generic_pm_domain *genpd,
> +                                         int depth)
> +{
> +       struct generic_pm_domain_data *pd_data;
> +       struct generic_pm_domain *subdomain;
> +       struct pm_domain_data *pdd;
> +       unsigned int state = 0, prev;
> +       struct gpd_link *link;
> +       int ret;
> +
> +       /* Traverse all devices within the domain */
> +       list_for_each_entry(pdd, &genpd->dev_list, list_node) {
> +               pd_data = to_gpd_data(pdd);
> +
> +               if (pd_data->performance_state > state)
> +                       state = pd_data->performance_state;
> +       }
> +
> +       /* Traverse all subdomains within the domain */
> +       list_for_each_entry(link, &genpd->master_links, master_node) {
> +               subdomain = link->slave;
> +
> +               if (subdomain->performance_state > state)
> +                       state = subdomain->performance_state;

You need to take the genpd lock of the subdomain before assigning this, right?

Is there a way to do this in the opposite direction? It's important that
we walk upwards in the topology, as we need to preserve the order in
which genpd locks are taken.

The way we do it is always taking the lock of the subdomain first, then
its master, then the master's master, and so on. You can have a look at,
for example, genpd_power_on().
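That locking rule can be modeled with a standalone sketch (hypothetical
structures, not the real genpd types): walking from a subdomain to its
masters naturally produces the child-before-parent lock order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a genpd parent chain; not the real structures. */
struct pd {
	const char *name;
	struct pd *master;	/* parent domain, NULL at the top */
};

/*
 * Record the order in which locks would be taken when propagating a
 * change upward: the subdomain is always "locked" before its master,
 * which is the child-before-parent ordering that avoids deadlock when
 * two paths reach the same master.
 */
static int lock_order_upwards(struct pd *pd, struct pd **order, int max)
{
	int n = 0;

	for (; pd && n < max; pd = pd->master)
		order[n++] = pd;	/* child first, then each master in turn */

	return n;
}
```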

> +       }
> +
> +       if (genpd->performance_state == state)
> +               return 0;
> +
> +       /*
> +        * Not all domains support updating performance state. Move on to their
> +        * parent domains in that case.
> +        */
> +       if (genpd->set_performance_state) {
> +               ret = genpd->set_performance_state(genpd, state);
> +               if (!ret)
> +                       genpd->performance_state = state;
> +
> +               return ret;
> +       }
> +
> +       prev = genpd->performance_state;
> +       genpd->performance_state = state;
> +
> +       list_for_each_entry(link, &genpd->slave_links, slave_node) {
> +               struct generic_pm_domain *master = link->master;
> +
> +               genpd_lock_nested(master, depth + 1);
> +               ret = genpd_update_performance_state(master, depth + 1);
> +               genpd_unlock(master);
> +
> +               if (!ret)
> +                       continue;
> +
> +               /*
> +                * Preserve earlier state for all the power domains which have
> +                * been updated.
> +                */
> +               genpd->performance_state = prev;
> +
> +               list_for_each_entry(link, &genpd->slave_links, slave_node) {
> +                       struct generic_pm_domain *nmaster = link->master;
> +
> +                       if (nmaster == master)
> +                               break;
> +
> +                       genpd_lock_nested(nmaster, depth + 1);
> +                       genpd_update_performance_state(master, depth + 1);
> +                       genpd_unlock(master);
> +               }

I think some comments on how to walk the genpd topology to fetch the
currently selected state would be nice. I am not sure I get it by just
looking at the code.

First you are walking the master links, then the slave links. Then you
start over again with the slave links in the error path.

Normally in the error path we use
list_for_each_entry_continue_reverse() to restore data, so this looks a
bit weird to me. Are you sure the error path is correct?

> +
> +               return ret;
> +       }
> +
> +       return 0;
> +}
> +
> +/**
> + * pm_genpd_update_performance_state - Update performance state of parent power
> + * domain for the target frequency for the device.
> + *
> + * @dev: Device for which the performance-state needs to be adjusted.
> + * @rate: Device's next frequency.
> + */
> +int pm_genpd_update_performance_state(struct device *dev, unsigned long rate)
> +{
> +       struct generic_pm_domain *genpd = ERR_PTR(-ENODATA);
> +       struct generic_pm_domain_data *gpd_data;
> +       int ret, prev;
> +
> +       spin_lock_irq(&dev->power.lock);
> +       if (dev->power.subsys_data && dev->power.subsys_data->domain_data) {
> +               gpd_data = to_gpd_data(dev->power.subsys_data->domain_data);
> +               genpd = dev_to_genpd(dev);
> +       }
> +       spin_unlock_irq(&dev->power.lock);

So even if you fetch the genpd pointer here, you don't protect the
device from being detached from its genpd after this point (thus
freeing the domain data).

To take this issue even further, there is no guarantee that the genpd
doesn't get removed after this point, yet you are still operating on
the pointer...

> +
> +       if (IS_ERR(genpd))
> +               return -ENODEV;
> +
> +       if (!genpd->get_performance_state) {
> +               dev_err(dev, "Performance states aren't supported by power domain\n");
> +               return -EINVAL;
> +       }
> +
> +       genpd_lock(genpd);
> +       ret = genpd->get_performance_state(dev, rate);
> +       if (ret < 0)
> +               goto unlock;
> +
> +       prev = gpd_data->performance_state;
> +       gpd_data->performance_state = ret;
> +       ret = genpd_update_performance_state(genpd, 0);
> +       if (ret)
> +               gpd_data->performance_state = prev;
> +
> +unlock:
> +       genpd_unlock(genpd);
> +
> +       return ret;
> +}
> +EXPORT_SYMBOL_GPL(pm_genpd_update_performance_state);
> +
>  /**
>   * genpd_power_off_work_fn - Power off PM domain whose subdomain count is 0.
>   * @work: Work structure used for scheduling the execution of this function.
> @@ -1156,6 +1316,7 @@ static struct generic_pm_domain_data *genpd_alloc_dev_data(struct device *dev,
>         gpd_data->td.constraint_changed = true;
>         gpd_data->td.effective_constraint_ns = -1;
>         gpd_data->nb.notifier_call = genpd_dev_pm_qos_notifier;
> +       gpd_data->performance_state = 0;
>
>         spin_lock_irq(&dev->power.lock);
>
> @@ -1502,6 +1663,17 @@ int pm_genpd_init(struct generic_pm_domain *genpd,
>                 genpd->dev_ops.start = pm_clk_resume;
>         }
>
> +       /*
> +        * If this genpd supports getting performance state, then someone in its
> +        * hierarchy must support setting it too.
> +        */
> +       if (genpd->get_performance_state &&
> +           !genpd_has_set_performance_state(genpd)) {
> +               pr_err("%s: genpd doesn't support updating performance state\n",
> +                      genpd->name);
> +               return -ENODEV;
> +       }
> +
>         /* Always-on domains must be powered on at initialization. */
>         if (genpd_is_always_on(genpd) && !genpd_status_on(genpd))
>                 return -EINVAL;
> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
> index b7803a251044..74502664ea33 100644
> --- a/include/linux/pm_domain.h
> +++ b/include/linux/pm_domain.h
> @@ -63,8 +63,12 @@ struct generic_pm_domain {
>         unsigned int device_count;      /* Number of devices */
>         unsigned int suspended_count;   /* System suspend device counter */
>         unsigned int prepared_count;    /* Suspend counter of prepared devices */
> +       unsigned int performance_state; /* Max requested performance state */
>         int (*power_off)(struct generic_pm_domain *domain);
>         int (*power_on)(struct generic_pm_domain *domain);
> +       int (*get_performance_state)(struct device *dev, unsigned long rate);
> +       int (*set_performance_state)(struct generic_pm_domain *domain,
> +                                    unsigned int state);
>         struct gpd_dev_ops dev_ops;
>         s64 max_off_time_ns;    /* Maximum allowed "suspended" time. */
>         bool max_off_time_changed;
> @@ -118,6 +122,7 @@ struct generic_pm_domain_data {
>         struct pm_domain_data base;
>         struct gpd_timing_data td;
>         struct notifier_block nb;
> +       unsigned int performance_state;
>         void *data;
>  };
>
> @@ -148,6 +153,9 @@ extern int pm_genpd_remove(struct generic_pm_domain *genpd);
>
>  extern struct dev_power_governor simple_qos_governor;
>  extern struct dev_power_governor pm_domain_always_on_gov;
> +extern bool pm_genpd_has_performance_state(struct device *dev);
> +extern int pm_genpd_update_performance_state(struct device *dev,
> +                                            unsigned long rate);
>  #else
>
>  static inline struct generic_pm_domain_data *dev_gpd_data(struct device *dev)
> @@ -185,6 +193,17 @@ static inline int pm_genpd_remove(struct generic_pm_domain *genpd)
>         return -ENOTSUPP;
>  }
>
> +static inline bool pm_genpd_has_performance_state(struct device *dev)
> +{
> +       return false;
> +}
> +
> +static inline int pm_genpd_update_performance_state(struct device *dev,
> +                                                   unsigned long rate)
> +{
> +       return -ENOTSUPP;
> +}
> +
>  #define simple_qos_governor            (*(struct dev_power_governor *)(NULL))
>  #define pm_domain_always_on_gov                (*(struct dev_power_governor *)(NULL))
>  #endif
> --
> 2.13.0.303.g4ebf3021692d
>

As you probably figured out from my above comments, the solution here
isn't really working.

Adding these new APIs, more or less requires us to manage reference
counting on the genpd for the device. I haven't thought about that in
great detail, just that we of course need to be careful if we want to
introduce something like that, to make sure all aspects of locking
become correct.

Moreover adding reference counting touches the idea of adding APIs to
genpd, to allow users/drivers to control their genpd explicitly. This
is required to support > 1 pm_domain per device. I assume you have
been following that discussion for genpd on linux-pm as well. My point
is that, we should consider that while inventing *any* new APIs.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH V7 1/2] PM / Domains: Add support to select performance-state of domains
  2017-06-07 12:26   ` Ulf Hansson
@ 2017-06-08  7:05     ` Viresh Kumar
  2017-06-08  7:48       ` Ulf Hansson
  0 siblings, 1 reply; 10+ messages in thread
From: Viresh Kumar @ 2017-06-08  7:05 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael Wysocki, Kevin Hilman, Pavel Machek, Len Brown,
	linaro-kernel, linux-pm, linux-kernel, Vincent Guittot,
	Stephen Boyd, Nishanth Menon, Rob Herring, Lina Iyer,
	Rajendra Nayak, Sudeep Holla

On 07-06-17, 14:26, Ulf Hansson wrote:
> On 17 May 2017 at 09:02, Viresh Kumar <viresh.kumar@linaro.org> wrote:
 
> Apologies for the delay! I was kind of waiting for Kevin to jump in as
> he has already been in the loop.

Thanks for getting these reviewed eventually :)

> > +bool pm_genpd_has_performance_state(struct device *dev)
> > +{
> > +       struct generic_pm_domain *genpd = dev_to_genpd(dev);
> > +
> > +       /* The parent domain must have set get_performance_state() */
> > +       if (!IS_ERR(genpd) && genpd->get_performance_state)
> > +               return true;
> > +
> > +       return false;
> > +}
> > +EXPORT_SYMBOL_GPL(pm_genpd_has_performance_state);
> 
> There is no protection from a locking point of view

I hope you are talking about not taking genpd_lock(genpd) here? If yes, can you
please explain, with an example, what kind of race that would protect against?

As I really fail to see how this can be racy at all, since we are just checking
whether a callback is set or not.

> and there is no
> validation that the pm_domain pointer is a valid genpd.

Isn't dev_to_genpd() already doing that? And the check !IS_ERR(genpd) is all we
need. Sorry if I missed something really stupid.

> Please study the locks that are used in genpd more carefully and try to
> figure out what is needed here. I know it's a bit tricky, but the
> above just isn't sufficient.

Sure.

> > +static int genpd_update_performance_state(struct generic_pm_domain *genpd,
> > +                                         int depth)
> > +{
> > +       struct generic_pm_domain_data *pd_data;
> > +       struct generic_pm_domain *subdomain;
> > +       struct pm_domain_data *pdd;
> > +       unsigned int state = 0, prev;
> > +       struct gpd_link *link;
> > +       int ret;
> > +
> > +       /* Traverse all devices within the domain */
> > +       list_for_each_entry(pdd, &genpd->dev_list, list_node) {
> > +               pd_data = to_gpd_data(pdd);
> > +
> > +               if (pd_data->performance_state > state)
> > +                       state = pd_data->performance_state;
> > +       }
> > +
> > +       /* Traverse all subdomains within the domain */
> > +       list_for_each_entry(link, &genpd->master_links, master_node) {
> > +               subdomain = link->slave;
> > +
> > +               if (subdomain->performance_state > state)
> > +                       state = subdomain->performance_state;
> 
> You need to take the genpd lock of the subdomain before assigning this, right?

Maybe not, but there is surely one problem you highlighted very well:
subdomain->performance_state might have been updated from the same function
(in a different thread) and we may end up using that value incorrectly. Thanks
for noticing the mistake.
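
For context, the aggregation being discussed is just a max: a domain's effective state is the highest state requested by any of its devices or subdomains. A standalone sketch, with hypothetical flat arrays standing in for the dev_list and master_links walks (not the real genpd structures):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Highest performance state requested within one domain: the max over
 * the states requested by its devices and by its subdomains. Plain
 * arrays stand in for the kernel's dev_list and master_links lists.
 */
static unsigned int aggregate_state(const unsigned int *dev_states, size_t ndev,
				    const unsigned int *sub_states, size_t nsub)
{
	unsigned int state = 0;
	size_t i;

	/* Highest state requested by any device in the domain... */
	for (i = 0; i < ndev; i++)
		if (dev_states[i] > state)
			state = dev_states[i];

	/* ...or by any of its subdomains. */
	for (i = 0; i < nsub; i++)
		if (sub_states[i] > state)
			state = sub_states[i];

	return state;
}
```

The race conceded above is exactly about the sub_states inputs here: if a subdomain's state can change while the parent computes this max, the result may be stale.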

> Is there a way to do this in the opposite direction? It's important that
> we walk upwards in the topology, as we need to preserve the order for
> how genpd locks are taken.

At least I couldn't come up with a solution that is able to do this. All these
requests originate from a device, and its domain may have both subdomains and
devices. Though, we can add a constraint (see below) which can get us around
this for now at least.

> The way we do it is always taking the lock of the subdomain first,
> then its master, then the master's master and so on. You can have
> a look at, for example, genpd_power_on().

Right. Locking was surely messy here.

> > +       }
> > +
> > +       if (genpd->performance_state == state)
> > +               return 0;
> > +
> > +       /*
> > +        * Not all domains support updating performance state. Move on to their
> > +        * parent domains in that case.
> > +        */
> > +       if (genpd->set_performance_state) {
> > +               ret = genpd->set_performance_state(genpd, state);
> > +               if (!ret)
> > +                       genpd->performance_state = state;
> > +
> > +               return ret;
> > +       }
> > +
> > +       prev = genpd->performance_state;
> > +       genpd->performance_state = state;
> > +
> > +       list_for_each_entry(link, &genpd->slave_links, slave_node) {
> > +               struct generic_pm_domain *master = link->master;
> > +
> > +               genpd_lock_nested(master, depth + 1);
> > +               ret = genpd_update_performance_state(master, depth + 1);
> > +               genpd_unlock(master);
> > +
> > +               if (!ret)
> > +                       continue;
> > +
> > +               /*
> > +                * Preserve earlier state for all the power domains which have
> > +                * been updated.
> > +                */
> > +               genpd->performance_state = prev;
> > +
> > +               list_for_each_entry(link, &genpd->slave_links, slave_node) {
> > +                       struct generic_pm_domain *nmaster = link->master;
> > +
> > +                       if (nmaster == master)
> > +                               break;
> > +
> > +                       genpd_lock_nested(nmaster, depth + 1);
> > +                       genpd_update_performance_state(master, depth + 1);
> > +                       genpd_unlock(master);
> > +               }
> 
> I think some comment on how to walk the genpd topology to fetch the
> currently selected state would be nice. I am not sure I get it by just
> looking at the code.
> 
> First you are walking master links then slave links. Then you start
> over again with the slave links in the error path.

Sure, a comment might be helpful.

> > +int pm_genpd_update_performance_state(struct device *dev, unsigned long rate)
> > +{
> > +       struct generic_pm_domain *genpd = ERR_PTR(-ENODATA);
> > +       struct generic_pm_domain_data *gpd_data;
> > +       int ret, prev;
> > +
> > +       spin_lock_irq(&dev->power.lock);
> > +       if (dev->power.subsys_data && dev->power.subsys_data->domain_data) {
> > +               gpd_data = to_gpd_data(dev->power.subsys_data->domain_data);
> > +               genpd = dev_to_genpd(dev);
> > +       }
> > +       spin_unlock_irq(&dev->power.lock);
> 
> So even if you fetch the genpd pointer here, you don't protect the
> device from being detached from its genpd after this point (thus
> freeing the domain data).
> 
> To take this issue even further, there is no guarantee that the genpd
> doesn't get removed after this point, yet you are still operating on the
> pointer...

That's another blunder. I agree. Thanks.

> As you probably figured out from my above comments, the solution here
> isn't really working.
> 
> Adding these new APIs, more or less requires us to manage reference
> counting on the genpd for the device.

Hmm, I am not able to understand why we would need that with this series :(

> I haven't thought about that in
> great detail, just that we of course need to be careful if we want to
> introduce something like that, to make sure all aspects of locking
> become correct.
> 
> Moreover adding reference counting touches the idea of adding APIs to
> genpd, to allow users/drivers to control their genpd explicitly. This
> is required to support > 1 pm_domain per device. I assume you have
> been following that discussion for genpd on linux-pm as well. My point
> is that, we should consider that while inventing *any* new APIs.

Not very deeply but yeah I have seen that thread.

So, I couldn't find a way to fix the locking to avoid deadlocks, but we can
live with a constraint (if you guys think it is fine) to get this solved.

Constraint: the update-performance-state API wouldn't support power domains
that don't provide a set_performance_state() callback and have multiple parent
power domains.

Why: In order to guarantee that we read and update the performance_state of
subdomains and devices properly, we need to do that under the parent domain's
lock. And if we have multiple parent power domains, then we would be required to
get them locked at once before updating subdomain's state and that would be a
nightmare.
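
With that constraint in place, the propagation path is a simple chain: lock the subdomain, then its only parent, then the grandparent, always in the same upward order, so no two paths can ever take a pair of locks in opposite orders. A toy pthreads model of that chain (struct domain, propagate() and request_state() are made-up names for illustration; and unlike genpd, this sketch only ever raises states, it doesn't re-aggregate):

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical simplified domain: at most one parent, per the constraint. */
struct domain {
	pthread_mutex_t lock;
	struct domain *parent;
	unsigned int state;
};

/*
 * Propagate a state request upwards. The caller holds d->lock; the
 * parent's lock is taken before touching the parent, so every path
 * locks child first, then parent -- one fixed global order, hence no
 * deadlock. With two parents, two concurrent updates could take the
 * parents' locks in opposite orders, which is what the constraint
 * rules out.
 */
static void propagate(struct domain *d, unsigned int state)
{
	if (state <= d->state)
		return;
	d->state = state;
	if (d->parent) {
		pthread_mutex_lock(&d->parent->lock);
		propagate(d->parent, state);
		pthread_mutex_unlock(&d->parent->lock);
	}
}

static void request_state(struct domain *d, unsigned int state)
{
	pthread_mutex_lock(&d->lock);
	propagate(d, state);
	pthread_mutex_unlock(&d->lock);
}
```

A lower request simply stops at the first domain whose state is already high enough.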

And here is the new diff based on above (only compile tested):

---
 drivers/base/power/domain.c | 177 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm_domain.h   |  19 +++++
 2 files changed, 196 insertions(+)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 71c95ad808d5..cf090adc7967 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -466,6 +466,171 @@ static int genpd_dev_pm_qos_notifier(struct notifier_block *nb,
 	return NOTIFY_DONE;
 }
 
+/*
+ * Returns true if anyone in genpd's parent hierarchy has
+ * set_performance_state() set.
+ */
+static bool genpd_has_set_performance_state(struct generic_pm_domain *genpd)
+{
+	struct gpd_link *link;
+
+	if (genpd->set_performance_state)
+		return true;
+
+	list_for_each_entry(link, &genpd->slave_links, slave_node) {
+		if (genpd_has_set_performance_state(link->master))
+			return true;
+	}
+
+	return false;
+}
+
+/**
+ * pm_genpd_has_performance_state - Checks if power domain does performance
+ * state management.
+ *
+ * @dev: Device whose power domain is getting inquired.
+ */
+bool pm_genpd_has_performance_state(struct device *dev)
+{
+	struct generic_pm_domain *genpd = dev_to_genpd(dev);
+
+	/* The parent domain must have set get_performance_state() */
+	if (!IS_ERR(genpd) && genpd->get_performance_state)
+		return true;
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(pm_genpd_has_performance_state);
+
+/*
+ * Re-evaluate performance state of a power domain. Finds the highest requested
+ * performance state by the devices and subdomains within the power domain and
+ * then tries to change its performance state. If the power domain doesn't have
+ * a set_performance_state() callback, then we move the request to its parent
+ * power domain (with constraints explained below).
+ */
+static int genpd_update_performance_state(struct generic_pm_domain *genpd,
+					  int depth)
+{
+	struct generic_pm_domain_data *pd_data;
+	struct generic_pm_domain *master;
+	struct generic_pm_domain *subdomain;
+	struct pm_domain_data *pdd;
+	unsigned int state = 0, prev;
+	struct gpd_link *link;
+	int ret;
+
+	/* Traverse all devices within the domain */
+	list_for_each_entry(pdd, &genpd->dev_list, list_node) {
+		pd_data = to_gpd_data(pdd);
+
+		if (pd_data->performance_state > state)
+			state = pd_data->performance_state;
+	}
+
+	/* Traverse all subdomains within the domain */
+	list_for_each_entry(link, &genpd->master_links, master_node) {
+		subdomain = link->slave;
+
+		if (subdomain->performance_state > state)
+			state = subdomain->performance_state;
+	}
+
+	if (genpd->performance_state == state)
+		return 0;
+
+	/*
+	 * Not all domains support updating performance state. Move on to their
+	 * parent domains in that case.
+	 */
+	if (genpd->set_performance_state) {
+		ret = genpd->set_performance_state(genpd, state);
+		if (!ret)
+			genpd->performance_state = state;
+
+		return ret;
+	}
+
+	/*
+	 * Only support subdomains with single parent power domain to avoid
+	 * complicated locking problems. We only take locks in the upwards
+	 * direction (i.e. subdomain first, followed by parent domain). We also
+	 * need to guarantee that the performance_state field of devices and
+	 * subdomains of a domain is not updated (or read) without the domain's
+	 * lock and that wouldn't work with multiple parents of a subdomain.
+	 */
+	if (!list_is_singular(&genpd->slave_links)) {
+		pr_err("%s: Domain has multiple parents, can't update performance state\n",
+		       genpd->name);
+		return -EINVAL;
+	}
+
+	link = list_first_entry(&genpd->slave_links, struct gpd_link, slave_node);
+	master = link->master;
+
+	genpd_lock_nested(master, depth + 1);
+	prev = genpd->performance_state;
+	genpd->performance_state = state;
+
+	ret = genpd_update_performance_state(master, depth + 1);
+	if (ret)
+		genpd->performance_state = prev;
+	genpd_unlock(master);
+
+	return ret;
+}
+
+/**
+ * pm_genpd_update_performance_state - Update performance state of parent power
+ * domain for the target frequency for the device.
+ *
+ * @dev: Device for which the performance-state needs to be adjusted.
+ * @rate: Device's next frequency.
+ */
+int pm_genpd_update_performance_state(struct device *dev, unsigned long rate)
+{
+	struct generic_pm_domain *genpd = ERR_PTR(-ENODATA);
+	struct generic_pm_domain_data *gpd_data;
+	int ret, prev;
+
+	spin_lock_irq(&dev->power.lock);
+	if (dev->power.subsys_data && dev->power.subsys_data->domain_data) {
+		gpd_data = to_gpd_data(dev->power.subsys_data->domain_data);
+		genpd = dev_to_genpd(dev);
+	}
+
+	if (IS_ERR(genpd)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
+
+	if (!genpd->get_performance_state) {
+		dev_err(dev, "Performance states aren't supported by power domain\n");
+		ret = -EINVAL;
+		goto unlock;
+	}
+
+	genpd_lock(genpd);
+	ret = genpd->get_performance_state(dev, rate);
+	if (ret < 0)
+		goto unlock_genpd;
+
+	prev = gpd_data->performance_state;
+	gpd_data->performance_state = ret;
+	ret = genpd_update_performance_state(genpd, 0);
+	if (ret)
+		gpd_data->performance_state = prev;
+
+unlock_genpd:
+	genpd_unlock(genpd);
+unlock:
+	spin_unlock_irq(&dev->power.lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pm_genpd_update_performance_state);
+
 /**
  * genpd_power_off_work_fn - Power off PM domain whose subdomain count is 0.
  * @work: Work structure used for scheduling the execution of this function.
@@ -1156,6 +1321,7 @@ static struct generic_pm_domain_data *genpd_alloc_dev_data(struct device *dev,
 	gpd_data->td.constraint_changed = true;
 	gpd_data->td.effective_constraint_ns = -1;
 	gpd_data->nb.notifier_call = genpd_dev_pm_qos_notifier;
+	gpd_data->performance_state = 0;
 
 	spin_lock_irq(&dev->power.lock);
 
@@ -1502,6 +1668,17 @@ int pm_genpd_init(struct generic_pm_domain *genpd,
 		genpd->dev_ops.start = pm_clk_resume;
 	}
 
+	/*
+	 * If this genpd supports getting performance state, then someone in its
+	 * hierarchy must support setting it too.
+	 */
+	if (genpd->get_performance_state &&
+	    !genpd_has_set_performance_state(genpd)) {
+		pr_err("%s: genpd doesn't support updating performance state\n",
+		       genpd->name);
+		return -ENODEV;
+	}
+
 	/* Always-on domains must be powered on at initialization. */
 	if (genpd_is_always_on(genpd) && !genpd_status_on(genpd))
 		return -EINVAL;
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index b7803a251044..74502664ea33 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -63,8 +63,12 @@ struct generic_pm_domain {
 	unsigned int device_count;	/* Number of devices */
 	unsigned int suspended_count;	/* System suspend device counter */
 	unsigned int prepared_count;	/* Suspend counter of prepared devices */
+	unsigned int performance_state;	/* Max requested performance state */
 	int (*power_off)(struct generic_pm_domain *domain);
 	int (*power_on)(struct generic_pm_domain *domain);
+	int (*get_performance_state)(struct device *dev, unsigned long rate);
+	int (*set_performance_state)(struct generic_pm_domain *domain,
+				     unsigned int state);
 	struct gpd_dev_ops dev_ops;
 	s64 max_off_time_ns;	/* Maximum allowed "suspended" time. */
 	bool max_off_time_changed;
@@ -118,6 +122,7 @@ struct generic_pm_domain_data {
 	struct pm_domain_data base;
 	struct gpd_timing_data td;
 	struct notifier_block nb;
+	unsigned int performance_state;
 	void *data;
 };
 
@@ -148,6 +153,9 @@ extern int pm_genpd_remove(struct generic_pm_domain *genpd);
 
 extern struct dev_power_governor simple_qos_governor;
 extern struct dev_power_governor pm_domain_always_on_gov;
+extern bool pm_genpd_has_performance_state(struct device *dev);
+extern int pm_genpd_update_performance_state(struct device *dev,
+					     unsigned long rate);
 #else
 
 static inline struct generic_pm_domain_data *dev_gpd_data(struct device *dev)
@@ -185,6 +193,17 @@ static inline int pm_genpd_remove(struct generic_pm_domain *genpd)
 	return -ENOTSUPP;
 }
 
+static inline bool pm_genpd_has_performance_state(struct device *dev)
+{
+	return false;
+}
+
+static inline int pm_genpd_update_performance_state(struct device *dev,
+						    unsigned long rate)
+{
+	return -ENOTSUPP;
+}
+
 #define simple_qos_governor		(*(struct dev_power_governor *)(NULL))
 #define pm_domain_always_on_gov		(*(struct dev_power_governor *)(NULL))
 #endif


* Re: [PATCH V7 1/2] PM / Domains: Add support to select performance-state of domains
  2017-06-08  7:05     ` Viresh Kumar
@ 2017-06-08  7:48       ` Ulf Hansson
  2017-06-08  9:42         ` Viresh Kumar
  0 siblings, 1 reply; 10+ messages in thread
From: Ulf Hansson @ 2017-06-08  7:48 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Kevin Hilman, Pavel Machek, Len Brown,
	linaro-kernel, linux-pm, linux-kernel, Vincent Guittot,
	Stephen Boyd, Nishanth Menon, Rob Herring, Lina Iyer,
	Rajendra Nayak, Sudeep Holla

[...]

>> As you probably figured out from my above comments, the solution here
>> isn't really working.
>>
>> Adding these new APIs, more or less requires us to manage reference
>> counting on the genpd for the device.
>
> Hmm, I am not able to understand why we would need that with this series :(
>
>> I haven't thought about that in
>> great detail, just that we of course need to be careful if we want to
>> introduce something like that, to make sure all aspects of locking
>> become correct.
>>
>> Moreover adding reference counting touches the idea of adding APIs to
>> genpd, to allow users/drivers to control their genpd explicitly. This
>> is required to support > 1 pm_domain per device. I assume you have
>> been following that discussion for genpd on linux-pm as well. My point
>> is that, we should consider that while inventing *any* new APIs.
>
> Not very deeply but yeah I have seen that thread.
>
> So, I couldn't find a way to fix the locking to avoid deadlocks, but we can
> live with a constraint (if you guys think it is fine) to get this solved.
>
> Constraint: the update-performance-state API wouldn't support power domains
> that don't provide a set_performance_state() callback and have multiple
> parent power domains.

No thanks.

If we are going to do this, let's do it properly. To me some minor
constraints would be okay, but this is just too much.

>
> Why: In order to guarantee that we read and update the performance_state of
> subdomains and devices properly, we need to do that under the parent domain's
> lock. And if we have multiple parent power domains, then we would be required to
> get them locked at once before updating subdomain's state and that would be a
> nightmare.

It's not a nightmare, just a tricky thing to solve. :-)

[...]

Kind regards
Uffe


* Re: [PATCH V7 1/2] PM / Domains: Add support to select performance-state of domains
  2017-06-08  7:48       ` Ulf Hansson
@ 2017-06-08  9:42         ` Viresh Kumar
  2017-06-09 11:00           ` Ulf Hansson
  0 siblings, 1 reply; 10+ messages in thread
From: Viresh Kumar @ 2017-06-08  9:42 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael Wysocki, Kevin Hilman, Pavel Machek, Len Brown,
	linaro-kernel, linux-pm, linux-kernel, Vincent Guittot,
	Stephen Boyd, Nishanth Menon, Rob Herring, Lina Iyer,
	Rajendra Nayak, Sudeep Holla

On 08-06-17, 09:48, Ulf Hansson wrote:
> It's not a nightmare, just a tricky thing to solve. :-)

I may have just solved it actually :)

So the real locking problem was the case where a subdomain has multiple parent
domains: how do we access its performance_state field from all the paths that
originate from a parent's update and from the subdomain's own path? A single
performance_state field isn't going to help there, as multiple locks are
involved. I have added another per-parent-domain field, which keeps the locking
quite simple. Here is the new patch (compile tested):

---
 drivers/base/power/domain.c | 193 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm_domain.h   |  22 +++++
 2 files changed, 215 insertions(+)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 71c95ad808d5..40815974287f 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -466,6 +466,187 @@ static int genpd_dev_pm_qos_notifier(struct notifier_block *nb,
 	return NOTIFY_DONE;
 }
 
+/*
+ * Returns true if anyone in genpd's parent hierarchy has
+ * set_performance_state() set.
+ */
+static bool genpd_has_set_performance_state(struct generic_pm_domain *genpd)
+{
+	struct gpd_link *link;
+
+	if (genpd->set_performance_state)
+		return true;
+
+	list_for_each_entry(link, &genpd->slave_links, slave_node) {
+		if (genpd_has_set_performance_state(link->master))
+			return true;
+	}
+
+	return false;
+}
+
+/**
+ * pm_genpd_has_performance_state - Checks if power domain does performance
+ * state management.
+ *
+ * @dev: Device whose power domain is getting inquired.
+ */
+bool pm_genpd_has_performance_state(struct device *dev)
+{
+	struct generic_pm_domain *genpd = dev_to_genpd(dev);
+
+	/* The parent domain must have set get_performance_state() */
+	if (!IS_ERR(genpd) && genpd->get_performance_state)
+		return true;
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(pm_genpd_has_performance_state);
+
+/*
+ * Re-evaluate performance state of a power domain. Finds the highest requested
+ * performance state by the devices and subdomains within the power domain and
+ * then tries to change its performance state. If the power domain doesn't have
+ * a set_performance_state() callback, then we move the request to its parent
+ * power domain.
+ *
+ * Locking: Access (or update) to device's "pd_data->performance_state" field
+ * happens only with parent domain's lock held. Subdomains have their
+ * "genpd->performance_state" protected with their own lock (and they are the
+ * only user of this field) and their per-parent-domain
+ * "link->performance_state" field is protected with individual parent power
+ * domain's lock and is only accessed/updated with that lock held.
+ */
+static int genpd_update_performance_state(struct generic_pm_domain *genpd,
+					  int depth)
+{
+	struct generic_pm_domain_data *pd_data;
+	struct generic_pm_domain *master;
+	struct pm_domain_data *pdd;
+	unsigned int state = 0, prev;
+	struct gpd_link *link;
+	int ret;
+
+	/* Traverse all devices within the domain */
+	list_for_each_entry(pdd, &genpd->dev_list, list_node) {
+		pd_data = to_gpd_data(pdd);
+
+		if (pd_data->performance_state > state)
+			state = pd_data->performance_state;
+	}
+
+	/* Traverse all subdomains within the domain */
+	list_for_each_entry(link, &genpd->master_links, master_node) {
+		if (link->performance_state > state)
+			state = link->performance_state;
+	}
+
+	if (genpd->performance_state == state)
+		return 0;
+
+	/*
+	 * Not all domains support updating performance state. Move on to their
+	 * parent domains in that case.
+	 */
+	if (genpd->set_performance_state) {
+		ret = genpd->set_performance_state(genpd, state);
+		if (!ret)
+			genpd->performance_state = state;
+
+		return ret;
+	}
+
+	prev = genpd->performance_state;
+
+	list_for_each_entry(link, &genpd->slave_links, slave_node) {
+		master = link->master;
+
+		genpd_lock_nested(master, depth + 1);
+
+		link->performance_state = state;
+		ret = genpd_update_performance_state(master, depth + 1);
+		if (ret)
+			link->performance_state = prev;
+
+		genpd_unlock(master);
+
+		if (ret)
+			goto err;
+	}
+
+	/*
+	 * All the parent domains are updated now, let's get genpd's
+	 * performance_state in sync with them.
+	 */
+	genpd->performance_state = state;
+	return 0;
+
+err:
+	list_for_each_entry_continue_reverse(link, &genpd->slave_links,
+					     slave_node) {
+		master = link->master;
+
+		genpd_lock_nested(master, depth + 1);
+		link->performance_state = prev;
+		if (genpd_update_performance_state(master, depth + 1))
+			pr_err("%s: Failed to roll back to %d performance state\n",
+			       genpd->name, prev);
+		genpd_unlock(master);
+	}
+
+	return ret;
+}
+
+/**
+ * pm_genpd_update_performance_state - Update performance state of parent power
+ * domain for the target frequency for the device.
+ *
+ * @dev: Device for which the performance-state needs to be adjusted.
+ * @rate: Device's next frequency.
+ */
+int pm_genpd_update_performance_state(struct device *dev, unsigned long rate)
+{
+	struct generic_pm_domain *genpd = ERR_PTR(-ENODATA);
+	struct generic_pm_domain_data *gpd_data;
+	int ret, prev;
+
+	spin_lock_irq(&dev->power.lock);
+	if (dev->power.subsys_data && dev->power.subsys_data->domain_data) {
+		gpd_data = to_gpd_data(dev->power.subsys_data->domain_data);
+		genpd = dev_to_genpd(dev);
+	}
+
+	if (IS_ERR(genpd)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
+
+	if (!genpd->get_performance_state) {
+		dev_err(dev, "Performance states aren't supported by power domain\n");
+		ret = -EINVAL;
+		goto unlock;
+	}
+
+	genpd_lock(genpd);
+	ret = genpd->get_performance_state(dev, rate);
+	if (ret < 0)
+		goto unlock_genpd;
+
+	prev = gpd_data->performance_state;
+	gpd_data->performance_state = ret;
+	ret = genpd_update_performance_state(genpd, 0);
+	if (ret)
+		gpd_data->performance_state = prev;
+
+unlock_genpd:
+	genpd_unlock(genpd);
+unlock:
+	spin_unlock_irq(&dev->power.lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pm_genpd_update_performance_state);
+
 /**
  * genpd_power_off_work_fn - Power off PM domain whose subdomain count is 0.
  * @work: Work structure used for scheduling the execution of this function.
@@ -1156,6 +1337,7 @@ static struct generic_pm_domain_data *genpd_alloc_dev_data(struct device *dev,
 	gpd_data->td.constraint_changed = true;
 	gpd_data->td.effective_constraint_ns = -1;
 	gpd_data->nb.notifier_call = genpd_dev_pm_qos_notifier;
+	gpd_data->performance_state = 0;
 
 	spin_lock_irq(&dev->power.lock);
 
@@ -1502,6 +1684,17 @@ int pm_genpd_init(struct generic_pm_domain *genpd,
 		genpd->dev_ops.start = pm_clk_resume;
 	}
 
+	/*
+	 * If this genpd supports getting performance state, then someone in its
+	 * hierarchy must support setting it too.
+	 */
+	if (genpd->get_performance_state &&
+	    !genpd_has_set_performance_state(genpd)) {
+		pr_err("%s: genpd doesn't support updating performance state\n",
+		       genpd->name);
+		return -ENODEV;
+	}
+
 	/* Always-on domains must be powered on at initialization. */
 	if (genpd_is_always_on(genpd) && !genpd_status_on(genpd))
 		return -EINVAL;
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index b7803a251044..bf90177208a2 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -63,8 +63,12 @@ struct generic_pm_domain {
 	unsigned int device_count;	/* Number of devices */
 	unsigned int suspended_count;	/* System suspend device counter */
 	unsigned int prepared_count;	/* Suspend counter of prepared devices */
+	unsigned int performance_state;	/* Max requested performance state */
 	int (*power_off)(struct generic_pm_domain *domain);
 	int (*power_on)(struct generic_pm_domain *domain);
+	int (*get_performance_state)(struct device *dev, unsigned long rate);
+	int (*set_performance_state)(struct generic_pm_domain *domain,
+				     unsigned int state);
 	struct gpd_dev_ops dev_ops;
 	s64 max_off_time_ns;	/* Maximum allowed "suspended" time. */
 	bool max_off_time_changed;
@@ -99,6 +103,9 @@ struct gpd_link {
 	struct list_head master_node;
 	struct generic_pm_domain *slave;
 	struct list_head slave_node;
+
+	/* Sub-domain's per-parent domain performance state */
+	unsigned int performance_state;
 };
 
 struct gpd_timing_data {
@@ -118,6 +125,7 @@ struct generic_pm_domain_data {
 	struct pm_domain_data base;
 	struct gpd_timing_data td;
 	struct notifier_block nb;
+	unsigned int performance_state;
 	void *data;
 };
 
@@ -148,6 +156,9 @@ extern int pm_genpd_remove(struct generic_pm_domain *genpd);
 
 extern struct dev_power_governor simple_qos_governor;
 extern struct dev_power_governor pm_domain_always_on_gov;
+extern bool pm_genpd_has_performance_state(struct device *dev);
+extern int pm_genpd_update_performance_state(struct device *dev,
+					     unsigned long rate);
 #else
 
 static inline struct generic_pm_domain_data *dev_gpd_data(struct device *dev)
@@ -185,6 +196,17 @@ static inline int pm_genpd_remove(struct generic_pm_domain *genpd)
 	return -ENOTSUPP;
 }
 
+static inline bool pm_genpd_has_performance_state(struct device *dev)
+{
+	return false;
+}
+
+static inline int pm_genpd_update_performance_state(struct device *dev,
+						    unsigned long rate)
+{
+	return -ENOTSUPP;
+}
+
 #define simple_qos_governor		(*(struct dev_power_governor *)(NULL))
 #define pm_domain_always_on_gov		(*(struct dev_power_governor *)(NULL))
 #endif

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH V7 1/2] PM / Domains: Add support to select performance-state of domains
  2017-06-08  9:42         ` Viresh Kumar
@ 2017-06-09 11:00           ` Ulf Hansson
  2017-06-13 10:56             ` Viresh Kumar
  0 siblings, 1 reply; 10+ messages in thread
From: Ulf Hansson @ 2017-06-09 11:00 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Kevin Hilman, Pavel Machek, Len Brown,
	linaro-kernel, linux-pm, linux-kernel, Vincent Guittot,
	Stephen Boyd, Nishanth Menon, Rob Herring, Lina Iyer,
	Rajendra Nayak, Sudeep Holla

On 8 June 2017 at 11:42, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 08-06-17, 09:48, Ulf Hansson wrote:
>> It's not a nightmare, just a tricky thing to solve. :-)
>
> I may have just solved it actually :)

That was quick. :-)

>
> So the real locking problem was the case where a subdomain has multiple parent
> domains: how do we access its performance_state field from all the paths that
> originate from a parent's update as well as from the subdomain's own path? A
> single performance_state field isn't going to help there, as multiple locks are
> involved. I have added another per-parent-domain field, which keeps the locking
> quite simple. Here is the new patch (compile tested):

Obviously you didn't think about this long enough. Please spare me
from having to review something of this complexity just being compile
tested. I don't have unlimited bandwidth. :-)

I recommend at least some functional tests run together with lockdep tests.

>
> ---
>  drivers/base/power/domain.c | 193 ++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pm_domain.h   |  22 +++++
>  2 files changed, 215 insertions(+)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 71c95ad808d5..40815974287f 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -466,6 +466,187 @@ static int genpd_dev_pm_qos_notifier(struct notifier_block *nb,
>         return NOTIFY_DONE;
>  }
>
> +/*
> + * Returns true if anyone in genpd's parent hierarchy has
> + * set_performance_state() set.
> + */
> +static bool genpd_has_set_performance_state(struct generic_pm_domain *genpd)
> +{
> +       struct gpd_link *link;
> +
> +       if (genpd->set_performance_state)
> +               return true;
> +
> +       list_for_each_entry(link, &genpd->slave_links, slave_node) {
> +               if (genpd_has_set_performance_state(link->master))
> +                       return true;
> +       }
> +
> +       return false;
> +}
> +
> +/**
> + * pm_genpd_has_performance_state - Checks if power domain does performance
> + * state management.
> + *
> + * @dev: Device whose power domain is getting inquired.
> + */
> +bool pm_genpd_has_performance_state(struct device *dev)
> +{
> +       struct generic_pm_domain *genpd = dev_to_genpd(dev);
> +
> +       /* The parent domain must have set get_performance_state() */
> +       if (!IS_ERR(genpd) && genpd->get_performance_state)
> +               return true;
> +
> +       return false;
> +}
> +EXPORT_SYMBOL_GPL(pm_genpd_has_performance_state);

I think you didn't read my earlier comments well enough.

Who is going to call this? What prevents the genpd object from being
removed in the middle of when you are operating on the genpd pointer?
How can you even be sure the pm_domain pointer is a valid genpd
object?

[...]

Kind regards
Uffe


* Re: [PATCH V7 1/2] PM / Domains: Add support to select performance-state of domains
  2017-06-09 11:00           ` Ulf Hansson
@ 2017-06-13 10:56             ` Viresh Kumar
  0 siblings, 0 replies; 10+ messages in thread
From: Viresh Kumar @ 2017-06-13 10:56 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael Wysocki, Kevin Hilman, Pavel Machek, Len Brown,
	linaro-kernel, linux-pm, linux-kernel, Vincent Guittot,
	Stephen Boyd, Nishanth Menon, Rob Herring, Lina Iyer,
	Rajendra Nayak, Sudeep Holla

On 09-06-17, 13:00, Ulf Hansson wrote:
> On 8 June 2017 at 11:42, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > On 08-06-17, 09:48, Ulf Hansson wrote:
> >> It's not a nightmare, just a tricky thing to solve. :-)
> >
> > I may have just solved it actually :)
> 
> That was quick. :-)

:)

> > So the real locking problem was the case where a subdomain has multiple parent
> > domains: how do we access its performance_state field from all the paths that
> > originate from a parent's update as well as from the subdomain's own path? A
> > single performance_state field isn't going to help there, as multiple locks are
> > involved. I have added another per-parent-domain field, which keeps the locking
> > quite simple. Here is the new patch (compile tested):
> 
> Obviously you didn't think about this long enough.

That's a fair point, but something similar to what I have done in this
patch had been running around in my head for a couple of days, so I
went ahead and implemented it quickly.

> Please spare me
> from having to review something of this complexity just being compile
> tested. I don't have unlimited bandwidth. :-)

I agree; my aim wasn't to waste your time. I just wanted to get some
sort of Ack on the idea of re-using the per-parent link structure for
storing the performance state, and so I went ahead with just compile
tests.

> I recommend at least some functional tests run together with lockdep tests.

I have done that now (with lockdep) for multiple cases:

- Device with a parent genpd which supports setting the performance state.
- Device with a parent genpd which doesn't allow setting the performance
  state, but whose parent domains do. This focuses on the complex part
  of this patch, where the update is propagated to the parents of a
  sub-domain.

And they are all working fine now.

> > +/**
> > + * pm_genpd_has_performance_state - Checks if power domain does performance
> > + * state management.
> > + *
> > + * @dev: Device whose power domain is getting inquired.
> > + */
> > +bool pm_genpd_has_performance_state(struct device *dev)
> > +{
> > +       struct generic_pm_domain *genpd = dev_to_genpd(dev);
> > +
> > +       /* The parent domain must have set get_performance_state() */
> > +       if (!IS_ERR(genpd) && genpd->get_performance_state)
> > +               return true;
> > +
> > +       return false;
> > +}
> > +EXPORT_SYMBOL_GPL(pm_genpd_has_performance_state);
>
> I think you didn't read my earlier comments well enough.

Actually I did, and I asked you a couple of questions in my previous
email; as you hadn't replied to them, I thought you were fine with my
arguments.

Anyway, we had some chat regarding this yesterday on IRC and I have
understood what the problems can be here and so I have used your
suggestions here now.

> Who is going to call this?

Some user driver. For devices that support frequency scaling, the OPP
core (if used by the device) will call it.

For cases where there is no frequency scaling, the driver may call it
directly.

> What prevents the genpd object from being
> removed in the middle of when you are operating on the genpd pointer?

So there are two cases here, and they follow the sort of assumptions
already made in genpd.

1). Devices which have drivers for them, like MMC, etc. In such cases
  the genpd is attached to the device from the bus layer
  (platform/amba/etc), and so for as long as the driver is around we
  are guaranteed to have the genpd structure. And because the driver
  creates/removes OPP tables, we will be fine here.

2). Devices for which we don't have drivers (at least for ARM
  platforms), like CPUs. Here the platform-specific power domain
  driver would attach the genpd to the devices instead of the bus
  cores, and so the domain driver has to guarantee that the genpd
  doesn't go away for as long as some user driver (like cpufreq-dt.c)
  is using the device. This can be controlled by creating the cpufreq
  virtual device (which we create from cpufreq-dt-platdev.c today)
  from the domain driver itself.

So we perhaps don't need any refcount stuff for now at least.

> How can you even be sure the pm_domain pointer is a valid genpd
> object?

I only understood this comment yesterday, and I have now used
genpd_lookup_dev() instead of dev_to_genpd() here. The new patch, with
some minor updates from the last version, is pasted below.

I have also asked Rajendra Nayak to provide some sample code for an
actual platform, along with a platform-specific power domain driver.
He should provide that to me sometime this week. I may then post the
entire series again along with his code.

-- 
viresh

---
 drivers/base/power/domain.c | 229 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm_domain.h   |  22 +++++
 2 files changed, 251 insertions(+)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 71c95ad808d5..8c92a2f00c4e 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -466,6 +466,235 @@ static int genpd_dev_pm_qos_notifier(struct notifier_block *nb,
 	return NOTIFY_DONE;
 }
 
+/*
+ * Returns true if anyone in genpd's parent hierarchy has
+ * set_performance_state() set.
+ */
+static bool genpd_has_set_performance_state(struct generic_pm_domain *genpd)
+{
+	struct gpd_link *link;
+
+	if (genpd->set_performance_state)
+		return true;
+
+	list_for_each_entry(link, &genpd->slave_links, slave_node) {
+		if (genpd_has_set_performance_state(link->master))
+			return true;
+	}
+
+	return false;
+}
+
+/**
+ * pm_genpd_has_performance_state - Checks if power domain does performance
+ * state management.
+ *
+ * @dev: Device whose power domain is getting inquired.
+ *
+ * This must be called by the user drivers, before they start calling
+ * pm_genpd_update_performance_state(), to guarantee that all dependencies are
+ * met and the device's genpd supports performance states.
+ *
+ * It is assumed that the user driver guarantees that the genpd wouldn't be
+ * detached while this routine is getting called.
+ *
+ * Returns "true" if device's genpd supports performance states, "false"
+ * otherwise.
+ */
+bool pm_genpd_has_performance_state(struct device *dev)
+{
+	struct generic_pm_domain *genpd = genpd_lookup_dev(dev);
+
+	/* The parent domain must have set get_performance_state() */
+	if (!IS_ERR(genpd) && genpd->get_performance_state) {
+		if (genpd_has_set_performance_state(genpd))
+			return true;
+
+		/*
+		 * A genpd with ->get_performance_state() callback must also
+		 * allow setting performance state.
+		 */
+		dev_err(dev, "genpd doesn't support setting performance state\n");
+	}
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(pm_genpd_has_performance_state);
+
+/*
+ * Re-evaluate performance state of a power domain. Finds the highest requested
+ * performance state by the devices and subdomains within the power domain and
+ * then tries to change its performance state. If the power domain doesn't have
+ * a set_performance_state() callback, then we move the request to its parent
+ * power domain.
+ *
+ * Locking: Access (or update) to device's "pd_data->performance_state" field
+ * happens only with parent domain's lock held. Subdomains have their
+ * "genpd->performance_state" protected with their own lock (and they are the
+ * only user of this field) and their per-parent-domain
+ * "link->performance_state" field is protected with individual parent power
+ * domain's lock and is only accessed/updated with that lock held.
+ */
+static int genpd_update_performance_state(struct generic_pm_domain *genpd,
+					  int depth)
+{
+	struct generic_pm_domain_data *pd_data;
+	struct generic_pm_domain *master;
+	struct pm_domain_data *pdd;
+	unsigned int state = 0, prev;
+	struct gpd_link *link;
+	int ret;
+
+	/* Traverse all devices within the domain */
+	list_for_each_entry(pdd, &genpd->dev_list, list_node) {
+		pd_data = to_gpd_data(pdd);
+
+		if (pd_data->performance_state > state)
+			state = pd_data->performance_state;
+	}
+
+	/* Traverse all subdomains within the domain */
+	list_for_each_entry(link, &genpd->master_links, master_node) {
+		if (link->performance_state > state)
+			state = link->performance_state;
+	}
+
+	if (genpd->performance_state == state)
+		return 0;
+
+	if (genpd->set_performance_state) {
+		ret = genpd->set_performance_state(genpd, state);
+		if (!ret)
+			genpd->performance_state = state;
+
+		return ret;
+	}
+
+	/*
+	 * Not all domains support updating performance state. Move on to their
+	 * parent domains in that case.
+	 */
+	prev = genpd->performance_state;
+
+	list_for_each_entry(link, &genpd->slave_links, slave_node) {
+		master = link->master;
+
+		genpd_lock_nested(master, depth + 1);
+
+		link->performance_state = state;
+		ret = genpd_update_performance_state(master, depth + 1);
+		if (ret)
+			link->performance_state = prev;
+
+		genpd_unlock(master);
+
+		if (ret)
+			goto err;
+	}
+
+	/*
+	 * The parent domains are updated now, lets get genpd performance_state
+	 * in sync with those.
+	 */
+	genpd->performance_state = state;
+	return 0;
+
+err:
+	list_for_each_entry_continue_reverse(link, &genpd->slave_links,
+					     slave_node) {
+		master = link->master;
+
+		genpd_lock_nested(master, depth + 1);
+		link->performance_state = prev;
+		if (genpd_update_performance_state(master, depth + 1))
+			pr_err("%s: Failed to roll back to %d performance state\n",
+			       genpd->name, prev);
+		genpd_unlock(master);
+	}
+
+	return ret;
+}
+
+static int __dev_update_performance_state(struct device *dev, int state)
+{
+	struct generic_pm_domain_data *gpd_data;
+	int ret;
+
+	spin_lock_irq(&dev->power.lock);
+
+	if (!dev->power.subsys_data || !dev->power.subsys_data->domain_data) {
+		ret = -ENODEV;
+		goto unlock;
+	}
+
+	gpd_data = to_gpd_data(dev->power.subsys_data->domain_data);
+
+	ret = gpd_data->performance_state;
+	gpd_data->performance_state = state;
+
+unlock:
+	spin_unlock_irq(&dev->power.lock);
+
+	return ret;
+}
+
+/**
+ * pm_genpd_update_performance_state - Update the performance state of the
+ * device's parent power domain for the device's target frequency.
+ *
+ * @dev: Device for which the performance-state needs to be adjusted.
+ * @rate: Device's next frequency. This can be set to 0 when the device doesn't
+ * have any performance state constraints left (and so the device no longer
+ * participates in finding the target performance state of the genpd).
+ *
+ * This must be called by the user drivers (as many times as they want) only
+ * after pm_genpd_has_performance_state() has been called (once) and has
+ * returned "true".
+ *
+ * It is assumed that the user driver guarantees that the genpd wouldn't be
+ * detached while this routine is getting called.
+ *
+ * Returns 0 on success and negative error values on failures.
+ */
+int pm_genpd_update_performance_state(struct device *dev, unsigned long rate)
+{
+	struct generic_pm_domain *genpd = dev_to_genpd(dev);
+	int ret, state;
+
+	if (IS_ERR(genpd))
+		return -ENODEV;
+
+	genpd_lock(genpd);
+
+	if (unlikely(!genpd->get_performance_state)) {
+		dev_err(dev, "Performance states aren't supported by power domain\n");
+		ret = -EINVAL;
+		goto unlock;
+	}
+
+	state = genpd->get_performance_state(dev, rate);
+	if (state < 0) {
+		ret = state;
+		goto unlock;
+	}
+
+	state = __dev_update_performance_state(dev, state);
+	if (state < 0) {
+		ret = state;
+		goto unlock;
+	}
+
+	ret = genpd_update_performance_state(genpd, 0);
+	if (ret)
+		__dev_update_performance_state(dev, state);
+
+unlock:
+	genpd_unlock(genpd);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(pm_genpd_update_performance_state);
+
 /**
  * genpd_power_off_work_fn - Power off PM domain whose subdomain count is 0.
  * @work: Work structure used for scheduling the execution of this function.
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index b7803a251044..bf90177208a2 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -63,8 +63,12 @@ struct generic_pm_domain {
 	unsigned int device_count;	/* Number of devices */
 	unsigned int suspended_count;	/* System suspend device counter */
 	unsigned int prepared_count;	/* Suspend counter of prepared devices */
+	unsigned int performance_state;	/* Max requested performance state */
 	int (*power_off)(struct generic_pm_domain *domain);
 	int (*power_on)(struct generic_pm_domain *domain);
+	int (*get_performance_state)(struct device *dev, unsigned long rate);
+	int (*set_performance_state)(struct generic_pm_domain *domain,
+				     unsigned int state);
 	struct gpd_dev_ops dev_ops;
 	s64 max_off_time_ns;	/* Maximum allowed "suspended" time. */
 	bool max_off_time_changed;
@@ -99,6 +103,9 @@ struct gpd_link {
 	struct list_head master_node;
 	struct generic_pm_domain *slave;
 	struct list_head slave_node;
+
+	/* Sub-domain's per-parent domain performance state */
+	unsigned int performance_state;
 };
 
 struct gpd_timing_data {
@@ -118,6 +125,7 @@ struct generic_pm_domain_data {
 	struct pm_domain_data base;
 	struct gpd_timing_data td;
 	struct notifier_block nb;
+	unsigned int performance_state;
 	void *data;
 };
 
@@ -148,6 +156,9 @@ extern int pm_genpd_remove(struct generic_pm_domain *genpd);
 
 extern struct dev_power_governor simple_qos_governor;
 extern struct dev_power_governor pm_domain_always_on_gov;
+extern bool pm_genpd_has_performance_state(struct device *dev);
+extern int pm_genpd_update_performance_state(struct device *dev,
+					     unsigned long rate);
 #else
 
 static inline struct generic_pm_domain_data *dev_gpd_data(struct device *dev)
@@ -185,6 +196,17 @@ static inline int pm_genpd_remove(struct generic_pm_domain *genpd)
 	return -ENOTSUPP;
 }
 
+static inline bool pm_genpd_has_performance_state(struct device *dev)
+{
+	return false;
+}
+
+static inline int pm_genpd_update_performance_state(struct device *dev,
+						    unsigned long rate)
+{
+	return -ENOTSUPP;
+}
+
 #define simple_qos_governor		(*(struct dev_power_governor *)(NULL))
 #define pm_domain_always_on_gov		(*(struct dev_power_governor *)(NULL))
 #endif


end of thread, other threads:[~2017-06-13 10:56 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-17  7:02 [PATCH V7 0/2] PM / Domains: Power domain performance states Viresh Kumar
2017-05-17  7:02 ` [PATCH V7 1/2] PM / Domains: Add support to select performance-state of domains Viresh Kumar
2017-06-07 12:26   ` Ulf Hansson
2017-06-08  7:05     ` Viresh Kumar
2017-06-08  7:48       ` Ulf Hansson
2017-06-08  9:42         ` Viresh Kumar
2017-06-09 11:00           ` Ulf Hansson
2017-06-13 10:56             ` Viresh Kumar
2017-05-17  7:02 ` [PATCH V7 2/2] PM / OPP: Support updating performance state of device's power domains Viresh Kumar
2017-06-05  4:48 ` [PATCH V7 0/2] PM / Domains: Power domain performance states Viresh Kumar
