[PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library
@ 2018-01-09 11:02 Quentin Perret
  2018-01-09 11:02 ` [PATCH 1/2] PM / OPP: introduce an OPP power estimation helper Quentin Perret
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Quentin Perret @ 2018-01-09 11:02 UTC (permalink / raw)
  To: linux-pm
  Cc: rjw, vireshk, nm, sboyd, sudeep.holla, amit.kachhap, javi.merino,
	rui.zhang, edubezval, matthias.bgg, dietmar.eggemann,
	morten.rasmussen, patrick.bellasi, ionela.voinescu

Currently, IPA estimates the power dissipated by a CPU at each available OPP
using its capacitance (the dynamic-power-coefficient DT binding). This series
relocates this feature into the OPP library as a preparation for future
changes. More specifically:

   1. The current DT-based approach for power estimation will need deep
      changes to support SCMI-provided power values. While the thermal
      subsystem is not necessarily the best place to hide multiple power
      estimation methods, the OPP library appears to be a good candidate to
      implement the required platform abstraction.
   2. The energy models of CPUs will be needed by other clients in the future
      (such as the task scheduler or CPUFreq governors for example) in order
      to make energy-aware decisions. The relocation to the OPP library will
      enable code re-use and all clients will benefit form the platform
      abstraction mentioned previously.

Quentin Perret (2):
  PM / OPP: introduce an OPP power estimation helper
  thermal: cpu_cooling: use power models from the OPP library

 drivers/cpufreq/arm_big_little.c   |  2 ++
 drivers/cpufreq/cpufreq-dt.c       |  2 ++
 drivers/cpufreq/mediatek-cpufreq.c |  2 ++
 drivers/opp/core.c                 | 40 +++++++++++++++++++++++++
 drivers/opp/of.c                   | 61 ++++++++++++++++++++++++++++++++++++++
 drivers/opp/opp.h                  |  4 +++
 drivers/thermal/cpu_cooling.c      | 33 ++++++---------------
 include/linux/pm_opp.h             | 20 +++++++++++++
 8 files changed, 140 insertions(+), 24 deletions(-)

-- 
2.15.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/2] PM / OPP: introduce an OPP power estimation helper
  2018-01-09 11:02 [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library Quentin Perret
@ 2018-01-09 11:02 ` Quentin Perret
  2018-01-10  4:36   ` Viresh Kumar
  2018-01-09 11:02 ` [PATCH 2/2] thermal: cpu_cooling: use power models from the OPP library Quentin Perret
  2018-01-10 19:34 ` [PATCH 0/2] thermal, OPP: move the CPU power estimation to " Eduardo Valentin
  2 siblings, 1 reply; 18+ messages in thread
From: Quentin Perret @ 2018-01-09 11:02 UTC (permalink / raw)
  To: linux-pm
  Cc: rjw, vireshk, nm, sboyd, sudeep.holla, amit.kachhap, javi.merino,
	rui.zhang, edubezval, matthias.bgg, dietmar.eggemann,
	morten.rasmussen, patrick.bellasi, ionela.voinescu

The power dissipated by a CPU at a specific OPP is currently estimated by
the thermal subsystem as P = C * V^2 * f, with P the power, C the CPU's
capacitance, V the OPP's voltage and f the OPP's frequency. As this
feature can be useful for other clients requiring energy models of CPUs,
this commit introduces an equivalent power estimator directly into the OPP
library, hence enabling code re-use.

Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 drivers/opp/core.c     | 40 +++++++++++++++++++++++++++++++++
 drivers/opp/of.c       | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/opp/opp.h      |  4 ++++
 include/linux/pm_opp.h | 20 +++++++++++++++++
 4 files changed, 125 insertions(+)

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 92fa94a6dcc1..b5e1ad2d311d 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -127,6 +127,24 @@ unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
 
+/**
+ * dev_pm_opp_get_power() - Gets the estimated power corresponding to an opp
+ * @opp:	opp for which power has to be returned for
+ *
+ * Return: estimated power in mirco-watts corresponding to the opp, else
+ * return 0
+ */
+unsigned long dev_pm_opp_get_power(struct dev_pm_opp *opp)
+{
+	if (IS_ERR_OR_NULL(opp) || !opp->available) {
+		pr_err("%s: Invalid parameters\n", __func__);
+		return 0;
+	}
+
+	return opp->power_estimate_uw;
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_get_power);
+
 /**
  * dev_pm_opp_is_turbo() - Returns if opp is turbo OPP or not
  * @opp: opp for which turbo mode is being verified
@@ -148,6 +166,28 @@ bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_is_turbo);
 
+/**
+ * dev_pm_opp_has_power() - Get the status of power values for OPPs
+ * @cpu_dev:	CPU device for which we do this operation
+ *
+ * Return: True if the OPPs hold power estimates for the CPU
+ */
+bool dev_pm_opp_has_power(struct device *cpu_dev)
+{
+	struct opp_table *opp_table;
+	bool has_power;
+
+	opp_table = _find_opp_table(cpu_dev);
+	if (IS_ERR(opp_table))
+		return false;
+
+	has_power = opp_table->has_power;
+
+	dev_pm_opp_put_opp_table(opp_table);
+
+	return has_power;
+}
+
 /**
  * dev_pm_opp_get_max_clock_latency() - Get max clock latency in nanoseconds
  * @dev:	device for which we do this operation
diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index cb716aa2f44b..4a376cd00e8d 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -633,3 +633,64 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev,
 	return ret;
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_of_get_sharing_cpus);
+
+/**
+ * dev_pm_opp_of_estimate_power() - Estimates the power dissipated by @cpu_dev
+ *                                  at each OPP.
+ * @cpu_dev:	CPU device for which we do this estimation
+ *
+ * This estimates the active power consumed by a CPU at each OPP using:
+ *                         P = C * V^2 * f
+ * with P the power, C the CPU's capacitance, V the OPP's voltage and f the
+ * OPP's frequency. V and f are assumed to be known by the time this function
+ * is called and C is read from DT.
+ *
+ * Returns -EINVAL if the CPU's capacitance cannot be read from DT.
+ */
+int dev_pm_opp_of_estimate_power(struct device *cpu_dev)
+{
+	struct opp_table *opp_table;
+	unsigned long mV, uW, KHz;
+	struct device_node *np;
+	struct dev_pm_opp *opp;
+	u32 capacitance = 0;
+	int ret = 0;
+
+	opp_table = _find_opp_table(cpu_dev);
+	if (IS_ERR(opp_table)) {
+		ret = PTR_ERR(opp_table);
+		dev_dbg(cpu_dev, "%s: no OPP table (%d)\n", __func__, ret);
+		return ret;
+	}
+
+	np = of_node_get(cpu_dev->of_node);
+	if (!np) {
+		dev_err(cpu_dev, "%s: no node for cpu\n", __func__);
+		ret = -ENOENT;
+		goto out;
+	}
+
+	of_property_read_u32(np, "dynamic-power-coefficient", &capacitance);
+	of_node_put(np);
+
+	if (!capacitance) {
+		opp_table->has_power = false;
+		ret = -EINVAL;
+		goto out;
+	}
+
+	list_for_each_entry(opp, &opp_table->opp_list, node) {
+		mV = dev_pm_opp_get_voltage(opp) / 1000;
+		KHz = dev_pm_opp_get_freq(opp) / 1000;
+		uW = (unsigned long)capacitance * KHz * mV * mV;
+		uW /= 1000000000;
+		opp->power_estimate_uw = uW;
+	}
+
+	opp_table->has_power = true;
+
+out:
+	dev_pm_opp_put_opp_table(opp_table);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_of_estimate_power);
diff --git a/drivers/opp/opp.h b/drivers/opp/opp.h
index 4d00061648a3..11cb62944bea 100644
--- a/drivers/opp/opp.h
+++ b/drivers/opp/opp.h
@@ -84,6 +84,8 @@ struct dev_pm_opp {
 
 	unsigned long clock_latency_ns;
 
+	unsigned long power_estimate_uw;
+
 	struct opp_table *opp_table;
 
 	struct device_node *np;
@@ -176,6 +178,8 @@ struct opp_table {
 	unsigned int regulator_count;
 	bool genpd_performance_state;
 
+	bool has_power;
+
 	int (*set_opp)(struct dev_pm_set_opp_data *data);
 	struct dev_pm_set_opp_data *set_opp_data;
 	int (*get_pstate)(struct device *dev, unsigned long rate);
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index 6c2d2e88f066..63a0edc89c7c 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -85,8 +85,12 @@ unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp);
 
 unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp);
 
+unsigned long dev_pm_opp_get_power(struct dev_pm_opp *opp);
+
 bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp);
 
+bool dev_pm_opp_has_power(struct device *cpu_dev);
+
 int dev_pm_opp_get_opp_count(struct device *dev);
 unsigned long dev_pm_opp_get_max_clock_latency(struct device *dev);
 unsigned long dev_pm_opp_get_max_volt_latency(struct device *dev);
@@ -150,11 +154,21 @@ static inline unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
 	return 0;
 }
 
+static inline unsigned long dev_pm_opp_get_power(struct dev_pm_opp *opp)
+{
+	return 0;
+}
+
 static inline bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
 {
 	return false;
 }
 
+static inline bool dev_pm_opp_has_power(struct device *cpu_dev)
+{
+	return false;
+}
+
 static inline int dev_pm_opp_get_opp_count(struct device *dev)
 {
 	return 0;
@@ -308,6 +322,7 @@ int dev_pm_opp_of_cpumask_add_table(const struct cpumask *cpumask);
 void dev_pm_opp_of_cpumask_remove_table(const struct cpumask *cpumask);
 int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask);
 struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev);
+int dev_pm_opp_of_estimate_power(struct device *cpu_dev);
 #else
 static inline int dev_pm_opp_of_add_table(struct device *dev)
 {
@@ -336,6 +351,11 @@ static inline struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device
 {
 	return NULL;
 }
+
+static inline int dev_pm_opp_of_estimate_power(struct device *cpu_dev)
+{
+	return -EINVAL;
+}
 #endif
 
 #endif		/* __LINUX_OPP_H__ */
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/2] thermal: cpu_cooling: use power models from the OPP library
  2018-01-09 11:02 [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library Quentin Perret
  2018-01-09 11:02 ` [PATCH 1/2] PM / OPP: introduce an OPP power estimation helper Quentin Perret
@ 2018-01-09 11:02 ` Quentin Perret
  2018-01-10  4:37   ` Viresh Kumar
  2018-01-10 19:34 ` [PATCH 0/2] thermal, OPP: move the CPU power estimation to " Eduardo Valentin
  2 siblings, 1 reply; 18+ messages in thread
From: Quentin Perret @ 2018-01-09 11:02 UTC (permalink / raw)
  To: linux-pm
  Cc: rjw, vireshk, nm, sboyd, sudeep.holla, amit.kachhap, javi.merino,
	rui.zhang, edubezval, matthias.bgg, dietmar.eggemann,
	morten.rasmussen, patrick.bellasi, ionela.voinescu

Now that the OPP library features a power estimator, the existing code
in IPA can be modified to rely only on dev_pm_opp_get_power() without
having to care about the dynamic-power-coefficient DT binding.

Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 drivers/cpufreq/arm_big_little.c   |  2 ++
 drivers/cpufreq/cpufreq-dt.c       |  2 ++
 drivers/cpufreq/mediatek-cpufreq.c |  2 ++
 drivers/thermal/cpu_cooling.c      | 33 +++++++++------------------------
 4 files changed, 15 insertions(+), 24 deletions(-)

diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c
index c56b57dcfda5..3ae67c3510b6 100644
--- a/drivers/cpufreq/arm_big_little.c
+++ b/drivers/cpufreq/arm_big_little.c
@@ -497,6 +497,8 @@ static int bL_cpufreq_init(struct cpufreq_policy *policy)
 	if (is_bL_switching_enabled())
 		per_cpu(cpu_last_req_freq, policy->cpu) = clk_get_cpu_rate(policy->cpu);
 
+	dev_pm_opp_of_estimate_power(cpu_dev);
+
 	dev_info(cpu_dev, "%s: CPU %d initialized\n", __func__, policy->cpu);
 	return 0;
 }
diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index de3d104c25d7..b05b0d8eb3f2 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -284,6 +284,8 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 	policy->cpuinfo.transition_latency = transition_latency;
 	policy->dvfs_possible_from_any_cpu = true;
 
+	dev_pm_opp_of_estimate_power(cpu_dev);
+
 	return 0;
 
 out_free_cpufreq_table:
diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
index 8c04dddd3c28..c1b2fb67a73e 100644
--- a/drivers/cpufreq/mediatek-cpufreq.c
+++ b/drivers/cpufreq/mediatek-cpufreq.c
@@ -470,6 +470,8 @@ static int mtk_cpufreq_init(struct cpufreq_policy *policy)
 	policy->driver_data = info;
 	policy->clk = info->cpu_clk;
 
+	dev_pm_opp_of_estimate_power(info->cpu_dev);
+
 	return 0;
 
 out_free_cpufreq_table:
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index dfd23245f778..b961bd6a198b 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -187,7 +187,6 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
 /**
  * update_freq_table() - Update the freq table with power numbers
  * @cpufreq_cdev:	the cpufreq cooling device in which to update the table
- * @capacitance: dynamic power coefficient for these cpus
  *
  * Update the freq table with power numbers.  This table will be used in
  * cpu_power_to_freq() and cpu_freq_to_power() to convert between power and
@@ -197,8 +196,7 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
  * Return: 0 on success, -EINVAL if there are no OPPs for any CPUs,
  * or -ENOMEM if we run out of memory.
  */
-static int update_freq_table(struct cpufreq_cooling_device *cpufreq_cdev,
-			     u32 capacitance)
+static int update_freq_table(struct cpufreq_cooling_device *cpufreq_cdev)
 {
 	struct freq_table *freq_table = cpufreq_cdev->freq_table;
 	struct dev_pm_opp *opp;
@@ -227,9 +225,6 @@ static int update_freq_table(struct cpufreq_cooling_device *cpufreq_cdev,
 
 	for (i = 0; i <= cpufreq_cdev->max_level; i++) {
 		unsigned long freq = freq_table[i].frequency * 1000;
-		u32 freq_mhz = freq_table[i].frequency / 1000;
-		u64 power;
-		u32 voltage_mv;
 
 		/*
 		 * Find ceil frequency as 'freq' may be slightly lower than OPP
@@ -242,18 +237,9 @@ static int update_freq_table(struct cpufreq_cooling_device *cpufreq_cdev,
 			return -EINVAL;
 		}
 
-		voltage_mv = dev_pm_opp_get_voltage(opp) / 1000;
-		dev_pm_opp_put(opp);
-
-		/*
-		 * Do the multiplication with MHz and millivolt so as
-		 * to not overflow.
-		 */
-		power = (u64)capacitance * freq_mhz * voltage_mv * voltage_mv;
-		do_div(power, 1000000000);
-
 		/* power is stored in mW */
-		freq_table[i].power = power;
+		freq_table[i].power = dev_pm_opp_get_power(opp) / 1000;
+		dev_pm_opp_put(opp);
 	}
 
 	return 0;
@@ -606,7 +592,7 @@ static unsigned int find_next_max(struct cpufreq_frequency_table *table,
  */
 static struct thermal_cooling_device *
 __cpufreq_cooling_register(struct device_node *np,
-			struct cpufreq_policy *policy, u32 capacitance)
+			struct cpufreq_policy *policy, bool power_model)
 {
 	struct thermal_cooling_device *cdev;
 	struct cpufreq_cooling_device *cpufreq_cdev;
@@ -675,8 +661,8 @@ __cpufreq_cooling_register(struct device_node *np,
 			pr_debug("%s: freq:%u KHz\n", __func__, freq);
 	}
 
-	if (capacitance) {
-		ret = update_freq_table(cpufreq_cdev, capacitance);
+	if (power_model) {
+		ret = update_freq_table(cpufreq_cdev);
 		if (ret) {
 			cdev = ERR_PTR(ret);
 			goto remove_ida;
@@ -760,7 +746,7 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 {
 	struct device_node *np = of_get_cpu_node(policy->cpu, NULL);
 	struct thermal_cooling_device *cdev = NULL;
-	u32 capacitance = 0;
+	bool has_power;
 
 	if (!np) {
 		pr_err("cpu_cooling: OF node not available for cpu%d\n",
@@ -769,10 +755,9 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 	}
 
 	if (of_find_property(np, "#cooling-cells", NULL)) {
-		of_property_read_u32(np, "dynamic-power-coefficient",
-				     &capacitance);
+		has_power = dev_pm_opp_has_power(get_cpu_device(policy->cpu));
 
-		cdev = __cpufreq_cooling_register(np, policy, capacitance);
+		cdev = __cpufreq_cooling_register(np, policy, has_power);
 		if (IS_ERR(cdev)) {
 			pr_err("cpu_cooling: cpu%d is not running as cooling device: %ld\n",
 			       policy->cpu, PTR_ERR(cdev));
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] PM / OPP: introduce an OPP power estimation helper
  2018-01-09 11:02 ` [PATCH 1/2] PM / OPP: introduce an OPP power estimation helper Quentin Perret
@ 2018-01-10  4:36   ` Viresh Kumar
  2018-01-10 10:20     ` Quentin Perret
  0 siblings, 1 reply; 18+ messages in thread
From: Viresh Kumar @ 2018-01-10  4:36 UTC (permalink / raw)
  To: Quentin Perret
  Cc: linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla, amit.kachhap,
	javi.merino, rui.zhang, edubezval, matthias.bgg,
	dietmar.eggemann, morten.rasmussen, patrick.bellasi,
	ionela.voinescu

On 09-01-18, 11:02, Quentin Perret wrote:
> The power dissipated by a CPU at a specific OPP is currently estimated by
> the thermal subsystem as P = C * V^2 * f, with P the power, C the CPU's
> capacitance, V the OPP's voltage and f the OPP's frequency. As this
> feature can be useful for other clients requiring energy models of CPUs,
> this commit introduces an equivalent power estimator directly into the OPP
> library, hence enabling code re-use.

Okay. I am fine with the basic idea of moving this stuff into the OPP
library but will do it a bit differently.

> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> ---
>  drivers/opp/core.c     | 40 +++++++++++++++++++++++++++++++++
>  drivers/opp/of.c       | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/opp/opp.h      |  4 ++++
>  include/linux/pm_opp.h | 20 +++++++++++++++++
>  4 files changed, 125 insertions(+)
> 
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 92fa94a6dcc1..b5e1ad2d311d 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -127,6 +127,24 @@ unsigned long dev_pm_opp_get_freq(struct dev_pm_opp *opp)
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_get_freq);
>  
> +/**
> + * dev_pm_opp_get_power() - Gets the estimated power corresponding to an opp
> + * @opp:	opp for which power has to be returned for
> + *
> + * Return: estimated power in mirco-watts corresponding to the opp, else
> + * return 0
> + */
> +unsigned long dev_pm_opp_get_power(struct dev_pm_opp *opp)
> +{
> +	if (IS_ERR_OR_NULL(opp) || !opp->available) {

Drop the available check here.

> +		pr_err("%s: Invalid parameters\n", __func__);
> +		return 0;
> +	}
> +
> +	return opp->power_estimate_uw;
> +}
> +EXPORT_SYMBOL_GPL(dev_pm_opp_get_power);
> +
>  /**
>   * dev_pm_opp_is_turbo() - Returns if opp is turbo OPP or not
>   * @opp: opp for which turbo mode is being verified
> @@ -148,6 +166,28 @@ bool dev_pm_opp_is_turbo(struct dev_pm_opp *opp)
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_is_turbo);
>  
> +/**
> + * dev_pm_opp_has_power() - Get the status of power values for OPPs
> + * @cpu_dev:	CPU device for which we do this operation
> + *
> + * Return: True if the OPPs hold power estimates for the CPU
> + */
> +bool dev_pm_opp_has_power(struct device *cpu_dev)
> +{
> +	struct opp_table *opp_table;
> +	bool has_power;
> +
> +	opp_table = _find_opp_table(cpu_dev);
> +	if (IS_ERR(opp_table))
> +		return false;
> +
> +	has_power = opp_table->has_power;
> +
> +	dev_pm_opp_put_opp_table(opp_table);
> +
> +	return has_power;
> +}
> +
>  /**
>   * dev_pm_opp_get_max_clock_latency() - Get max clock latency in nanoseconds
>   * @dev:	device for which we do this operation
> diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> index cb716aa2f44b..4a376cd00e8d 100644
> --- a/drivers/opp/of.c
> +++ b/drivers/opp/of.c
> @@ -633,3 +633,64 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev,
>  	return ret;
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_of_get_sharing_cpus);
> +
> +/**
> + * dev_pm_opp_of_estimate_power() - Estimates the power dissipated by @cpu_dev
> + *                                  at each OPP.
> + * @cpu_dev:	CPU device for which we do this estimation
> + *
> + * This estimates the active power consumed by a CPU at each OPP using:
> + *                         P = C * V^2 * f
> + * with P the power, C the CPU's capacitance, V the OPP's voltage and f the
> + * OPP's frequency. V and f are assumed to be known by the time this function
> + * is called and C is read from DT.
> + *
> + * Returns -EINVAL if the CPU's capacitance cannot be read from DT.
> + */
> +int dev_pm_opp_of_estimate_power(struct device *cpu_dev)

I wouldn't add this special function, but rather fill
power_estimate_uW while creating the OPPs for the first time.

-- 
viresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] thermal: cpu_cooling: use power models from the OPP library
  2018-01-09 11:02 ` [PATCH 2/2] thermal: cpu_cooling: use power models from the OPP library Quentin Perret
@ 2018-01-10  4:37   ` Viresh Kumar
  0 siblings, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2018-01-10  4:37 UTC (permalink / raw)
  To: Quentin Perret
  Cc: linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla, amit.kachhap,
	javi.merino, rui.zhang, edubezval, matthias.bgg,
	dietmar.eggemann, morten.rasmussen, patrick.bellasi,
	ionela.voinescu

On 09-01-18, 11:02, Quentin Perret wrote:
> Now that the OPP library features a power estimator, the existing code
> in IPA can be modified to rely only on dev_pm_opp_get_power() without
> having to care about the dynamic-power-coefficient DT binding.
> 
> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> ---
>  drivers/cpufreq/arm_big_little.c   |  2 ++
>  drivers/cpufreq/cpufreq-dt.c       |  2 ++
>  drivers/cpufreq/mediatek-cpufreq.c |  2 ++
>  drivers/thermal/cpu_cooling.c      | 33 +++++++++------------------------
>  4 files changed, 15 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c
> index c56b57dcfda5..3ae67c3510b6 100644
> --- a/drivers/cpufreq/arm_big_little.c
> +++ b/drivers/cpufreq/arm_big_little.c
> @@ -497,6 +497,8 @@ static int bL_cpufreq_init(struct cpufreq_policy *policy)
>  	if (is_bL_switching_enabled())
>  		per_cpu(cpu_last_req_freq, policy->cpu) = clk_get_cpu_rate(policy->cpu);
>  
> +	dev_pm_opp_of_estimate_power(cpu_dev);
> +
>  	dev_info(cpu_dev, "%s: CPU %d initialized\n", __func__, policy->cpu);
>  	return 0;
>  }
> diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
> index de3d104c25d7..b05b0d8eb3f2 100644
> --- a/drivers/cpufreq/cpufreq-dt.c
> +++ b/drivers/cpufreq/cpufreq-dt.c
> @@ -284,6 +284,8 @@ static int cpufreq_init(struct cpufreq_policy *policy)
>  	policy->cpuinfo.transition_latency = transition_latency;
>  	policy->dvfs_possible_from_any_cpu = true;
>  
> +	dev_pm_opp_of_estimate_power(cpu_dev);
> +
>  	return 0;
>  
>  out_free_cpufreq_table:
> diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
> index 8c04dddd3c28..c1b2fb67a73e 100644
> --- a/drivers/cpufreq/mediatek-cpufreq.c
> +++ b/drivers/cpufreq/mediatek-cpufreq.c
> @@ -470,6 +470,8 @@ static int mtk_cpufreq_init(struct cpufreq_policy *policy)
>  	policy->driver_data = info;
>  	policy->clk = info->cpu_clk;
>  
> +	dev_pm_opp_of_estimate_power(info->cpu_dev);
> +
>  	return 0;
>  
>  out_free_cpufreq_table:

The above changes wouldn't be required with suggestions from 1/2.

> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index dfd23245f778..b961bd6a198b 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -187,7 +187,6 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
>  /**
>   * update_freq_table() - Update the freq table with power numbers
>   * @cpufreq_cdev:	the cpufreq cooling device in which to update the table
> - * @capacitance: dynamic power coefficient for these cpus
>   *
>   * Update the freq table with power numbers.  This table will be used in
>   * cpu_power_to_freq() and cpu_freq_to_power() to convert between power and
> @@ -197,8 +196,7 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
>   * Return: 0 on success, -EINVAL if there are no OPPs for any CPUs,
>   * or -ENOMEM if we run out of memory.
>   */
> -static int update_freq_table(struct cpufreq_cooling_device *cpufreq_cdev,
> -			     u32 capacitance)
> +static int update_freq_table(struct cpufreq_cooling_device *cpufreq_cdev)
>  {
>  	struct freq_table *freq_table = cpufreq_cdev->freq_table;
>  	struct dev_pm_opp *opp;
> @@ -227,9 +225,6 @@ static int update_freq_table(struct cpufreq_cooling_device *cpufreq_cdev,
>  
>  	for (i = 0; i <= cpufreq_cdev->max_level; i++) {
>  		unsigned long freq = freq_table[i].frequency * 1000;
> -		u32 freq_mhz = freq_table[i].frequency / 1000;
> -		u64 power;
> -		u32 voltage_mv;
>  
>  		/*
>  		 * Find ceil frequency as 'freq' may be slightly lower than OPP
> @@ -242,18 +237,9 @@ static int update_freq_table(struct cpufreq_cooling_device *cpufreq_cdev,
>  			return -EINVAL;
>  		}
>  
> -		voltage_mv = dev_pm_opp_get_voltage(opp) / 1000;
> -		dev_pm_opp_put(opp);
> -
> -		/*
> -		 * Do the multiplication with MHz and millivolt so as
> -		 * to not overflow.
> -		 */
> -		power = (u64)capacitance * freq_mhz * voltage_mv * voltage_mv;
> -		do_div(power, 1000000000);
> -
>  		/* power is stored in mW */
> -		freq_table[i].power = power;
> +		freq_table[i].power = dev_pm_opp_get_power(opp) / 1000;
> +		dev_pm_opp_put(opp);
>  	}
>  
>  	return 0;
> @@ -606,7 +592,7 @@ static unsigned int find_next_max(struct cpufreq_frequency_table *table,
>   */
>  static struct thermal_cooling_device *
>  __cpufreq_cooling_register(struct device_node *np,
> -			struct cpufreq_policy *policy, u32 capacitance)
> +			struct cpufreq_policy *policy, bool power_model)
>  {
>  	struct thermal_cooling_device *cdev;
>  	struct cpufreq_cooling_device *cpufreq_cdev;
> @@ -675,8 +661,8 @@ __cpufreq_cooling_register(struct device_node *np,
>  			pr_debug("%s: freq:%u KHz\n", __func__, freq);
>  	}
>  
> -	if (capacitance) {
> -		ret = update_freq_table(cpufreq_cdev, capacitance);
> +	if (power_model) {
> +		ret = update_freq_table(cpufreq_cdev);
>  		if (ret) {
>  			cdev = ERR_PTR(ret);
>  			goto remove_ida;
> @@ -760,7 +746,7 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
>  {
>  	struct device_node *np = of_get_cpu_node(policy->cpu, NULL);
>  	struct thermal_cooling_device *cdev = NULL;
> -	u32 capacitance = 0;
> +	bool has_power;
>  
>  	if (!np) {
>  		pr_err("cpu_cooling: OF node not available for cpu%d\n",
> @@ -769,10 +755,9 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
>  	}
>  
>  	if (of_find_property(np, "#cooling-cells", NULL)) {
> -		of_property_read_u32(np, "dynamic-power-coefficient",
> -				     &capacitance);
> +		has_power = dev_pm_opp_has_power(get_cpu_device(policy->cpu));
>  
> -		cdev = __cpufreq_cooling_register(np, policy, capacitance);
> +		cdev = __cpufreq_cooling_register(np, policy, has_power);
>  		if (IS_ERR(cdev)) {
>  			pr_err("cpu_cooling: cpu%d is not running as cooling device: %ld\n",
>  			       policy->cpu, PTR_ERR(cdev));

And this looks more or less fine to me.

-- 
viresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] PM / OPP: introduce an OPP power estimation helper
  2018-01-10  4:36   ` Viresh Kumar
@ 2018-01-10 10:20     ` Quentin Perret
  2018-01-10 10:25       ` Viresh Kumar
  0 siblings, 1 reply; 18+ messages in thread
From: Quentin Perret @ 2018-01-10 10:20 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla, amit.kachhap,
	javi.merino, rui.zhang, edubezval, matthias.bgg,
	dietmar.eggemann, morten.rasmussen, patrick.bellasi,
	ionela.voinescu

Hi Viresh,

On Wednesday 10 Jan 2018 at 10:06:25 (+0530), Viresh Kumar wrote:
> > +/**
> > + * dev_pm_opp_get_power() - Gets the estimated power corresponding to an opp
> > + * @opp:	opp for which power has to be returned for
> > + *
> > + * Return: estimated power in mirco-watts corresponding to the opp, else
> > + * return 0
> > + */
> > +unsigned long dev_pm_opp_get_power(struct dev_pm_opp *opp)
> > +{
> > +	if (IS_ERR_OR_NULL(opp) || !opp->available) {
> 
> Drop the available check here.

Sure.

> > +/**
> > + * dev_pm_opp_of_estimate_power() - Estimates the power dissipated by @cpu_dev
> > + *                                  at each OPP.
> > + * @cpu_dev:	CPU device for which we do this estimation
> > + *
> > + * This estimates the active power consumed by a CPU at each OPP using:
> > + *                         P = C * V^2 * f
> > + * with P the power, C the CPU's capacitance, V the OPP's voltage and f the
> > + * OPP's frequency. V and f are assumed to be known by the time this function
> > + * is called and C is read from DT.
> > + *
> > + * Returns -EINVAL if the CPU's capacitance cannot be read from DT.
> > + */
> > +int dev_pm_opp_of_estimate_power(struct device *cpu_dev)
> 
> I wouldn't add this special function, but rather fill
> power_estimate_uW while creating the OPPs for the first time.

So that was actually my first idea as well but I struggled to come up
with a clean implementation TBH ...

My concern was mainly to get the dynamic-power-coefficient cleanly. With
the approach you proposed, there are cases (for platforms using
dev_pm_opp_add() such as Juno for ex) where I don't see how we can avoid
to re-read the capacitance from the DT for each and every OPP that's being
added. Or we have to rely on the driver to give it to us but that's against
changes that you pushed recently I think.

Do you think reading the "dynamic-power-coefficient" value from the DT
directly in dev_pm_add_opp() (and other places to support the v2 bindings)
would be acceptable ? Did you have something different in mind ?

Thanks,
Quentin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] PM / OPP: introduce an OPP power estimation helper
  2018-01-10 10:20     ` Quentin Perret
@ 2018-01-10 10:25       ` Viresh Kumar
  2018-01-10 10:36         ` Quentin Perret
  0 siblings, 1 reply; 18+ messages in thread
From: Viresh Kumar @ 2018-01-10 10:25 UTC (permalink / raw)
  To: Quentin Perret
  Cc: linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla, amit.kachhap,
	javi.merino, rui.zhang, edubezval, matthias.bgg,
	dietmar.eggemann, morten.rasmussen, patrick.bellasi,
	ionela.voinescu

On 10-01-18, 10:20, Quentin Perret wrote:
> So that was actually my first idea as well but I struggled to come up
> with a clean implementation TBH ...
> 
> My concern was mainly to get the dynamic-power-coefficient cleanly. With
> the approach you proposed, there are cases (for platforms using
> dev_pm_opp_add() such as Juno for ex) where I don't see how we can avoid
> to re-read the capacitance from the DT for each and every OPP that's being
> added. Or we have to rely on the driver to give it to us but that's against
> changes that you pushed recently I think.
> 
> Do you think reading the "dynamic-power-coefficient" value from the DT
> directly in dev_pm_add_opp() (and other places to support the v2 bindings)
> would be acceptable ? Did you have something different in mind ?

I think you can read it from within _of_init_opp_table() only once and then just
use it everywhere. Will that work ?

-- 
viresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] PM / OPP: introduce an OPP power estimation helper
  2018-01-10 10:25       ` Viresh Kumar
@ 2018-01-10 10:36         ` Quentin Perret
  0 siblings, 0 replies; 18+ messages in thread
From: Quentin Perret @ 2018-01-10 10:36 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla, amit.kachhap,
	javi.merino, rui.zhang, edubezval, matthias.bgg,
	dietmar.eggemann, morten.rasmussen, patrick.bellasi,
	ionela.voinescu

On Wednesday 10 Jan 2018 at 15:55:52 (+0530), Viresh Kumar wrote:
> On 10-01-18, 10:20, Quentin Perret wrote:
> > So that was actually my first idea as well but I struggled to come up
> > with a clean implementation TBH ...
> > 
> > My concern was mainly to get the dynamic-power-coefficient cleanly. With
> > the approach you proposed, there are cases (for platforms using
> > dev_pm_opp_add() such as Juno for ex) where I don't see how we can avoid
> > to re-read the capacitance from the DT for each and every OPP that's being
> > added. Or we have to rely on the driver to give it to us but that's against
> > changes that you pushed recently I think.
> > 
> > Do you think reading the "dynamic-power-coefficient" value from the DT
> > directly in dev_pm_add_opp() (and other places to support the v2 bindings)
> > would be acceptable ? Did you have something different in mind ?
> 
> I think you can read it from within _of_init_opp_table() only once and then just
> use it everywhere. Will that work ?

Yes, that should definitely do the trick :) I'll do change in the v2.

Thank you very much,
Quentin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library
  2018-01-09 11:02 [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library Quentin Perret
  2018-01-09 11:02 ` [PATCH 1/2] PM / OPP: introduce an OPP power estimation helper Quentin Perret
  2018-01-09 11:02 ` [PATCH 2/2] thermal: cpu_cooling: use power models from the OPP library Quentin Perret
@ 2018-01-10 19:34 ` Eduardo Valentin
  2018-01-11  9:42   ` Viresh Kumar
  2018-01-11  9:42   ` Quentin Perret
  2 siblings, 2 replies; 18+ messages in thread
From: Eduardo Valentin @ 2018-01-10 19:34 UTC (permalink / raw)
  To: Quentin Perret
  Cc: linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla, amit.kachhap,
	javi.merino, rui.zhang, matthias.bgg, dietmar.eggemann,
	morten.rasmussen, patrick.bellasi, ionela.voinescu

On Tue, Jan 09, 2018 at 11:02:50AM +0000, Quentin Perret wrote:
> Currently, IPA estimates the power dissipated by a CPU at each available OPP
> using its capacitance (the dynamic-power-coefficient DT binding). This series
> relocates this feature into the OPP library as a preparation for future
> changes. More specifically:
> 
>    1. The current DT-based approach for power estimation will need deep
>       changes to support SCMI-provided power values. While the thermal
>       subsystem is not necessarily the best place to hide multiple power
>       estimation methods, the OPP library appears to be a good candidate to
>       implement the required platform abstraction.
>    2. The energy models of CPUs will be needed by other clients in the future
>       (such as the task scheduler or CPUFreq governors for example) in order
>       to make energy-aware decisions. The relocation to the OPP library will
>       enable code re-use and all clients will benefit form the platform
>       abstraction mentioned previously.

To be quite frank, I am happy to see this leaving thermal subsystem.
However, a few concerns with the patch set as it is. First, I am not
convinced PM OPP is the right place to put this, nor I see a good
explanation put in the patch set why it must be part of PM OPP.
Second, looks like we are following ARM "good" practice of fixing
problems of the future. I would only really sign off for this series
when we see real "other future users", otherwise we end up with the
infamous static power scenario in 2-3 years down the row. If we
currently do not have users of this IN MAINLINE KERNEL, then the series
is not for upstream.

> 
> Quentin Perret (2):
>   PM / OPP: introduce an OPP power estimation helper
>   thermal: cpu_cooling: use power models from the OPP library
> 
>  drivers/cpufreq/arm_big_little.c   |  2 ++
>  drivers/cpufreq/cpufreq-dt.c       |  2 ++
>  drivers/cpufreq/mediatek-cpufreq.c |  2 ++
>  drivers/opp/core.c                 | 40 +++++++++++++++++++++++++
>  drivers/opp/of.c                   | 61 ++++++++++++++++++++++++++++++++++++++
>  drivers/opp/opp.h                  |  4 +++
>  drivers/thermal/cpu_cooling.c      | 33 ++++++---------------
>  include/linux/pm_opp.h             | 20 +++++++++++++
>  8 files changed, 140 insertions(+), 24 deletions(-)
> 
> -- 
> 2.15.1
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library
  2018-01-10 19:34 ` [PATCH 0/2] thermal, OPP: move the CPU power estimation to " Eduardo Valentin
@ 2018-01-11  9:42   ` Viresh Kumar
  2018-01-11  9:42   ` Quentin Perret
  1 sibling, 0 replies; 18+ messages in thread
From: Viresh Kumar @ 2018-01-11  9:42 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Quentin Perret, linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla,
	amit.kachhap, javi.merino, rui.zhang, matthias.bgg,
	dietmar.eggemann, morten.rasmussen, patrick.bellasi,
	ionela.voinescu

On 10-01-18, 11:34, Eduardo Valentin wrote:
> To be quite frank, I am happy to see this leaving thermal subsystem.

:)

> However, a few concerns with the patch set as it is. First, I am not
> convinced PM OPP is the right place to put this,

I had the very same doubt in the beginning as I wasn't sure if we
should read the dynamic power coefficient from within the OPP core as
it isn't part of the OPP table in the first place.

But then I thought a bit more on where exactly should we keep per
frequency power (or per OPP power) and nothing is better than the OPP
core for that, even if we don't have any more users and so I didn't
object to the series.

> nor I see a good
> explanation put in the patch set why it must be part of PM OPP.
> Second, looks like we are following ARM "good" practice of fixing

:)

> problems of the future. I would only really sign off for this series
> when we see real "other future users", otherwise we end up with the
> infamous static power scenario in 2-3 years down the row. If we
> currently do not have users of this IN MAINLINE KERNEL, then the series
> is not for upstream.

Well we are currently using the dynamic power numbers using the
thermal callbacks for cpu_cooling driver, so it isn't that bad :)

-- 
viresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library
  2018-01-10 19:34 ` [PATCH 0/2] thermal, OPP: move the CPU power estimation to " Eduardo Valentin
  2018-01-11  9:42   ` Viresh Kumar
@ 2018-01-11  9:42   ` Quentin Perret
  2018-01-12 17:24     ` Eduardo Valentin
  1 sibling, 1 reply; 18+ messages in thread
From: Quentin Perret @ 2018-01-11  9:42 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla, amit.kachhap,
	javi.merino, rui.zhang, matthias.bgg, dietmar.eggemann,
	morten.rasmussen, patrick.bellasi, ionela.voinescu

Hi Eduardo,

On Wednesday 10 Jan 2018 at 11:34:33 (-0800), Eduardo Valentin wrote:
> On Tue, Jan 09, 2018 at 11:02:50AM +0000, Quentin Perret wrote:
> > Currently, IPA estimates the power dissipated by a CPU at each available OPP
> > using its capacitance (the dynamic-power-coefficient DT binding). This series
> > relocates this feature into the OPP library as a preparation for future
> > changes. More specifically:
> > 
> >    1. The current DT-based approach for power estimation will need deep
> >       changes to support SCMI-provided power values. While the thermal
> >       subsystem is not necessarily the best place to hide multiple power
> >       estimation methods, the OPP library appears to be a good candidate to
> >       implement the required platform abstraction.
> >    2. The energy models of CPUs will be needed by other clients in the future
> >       (such as the task scheduler or CPUFreq governors for example) in order
> >       to make energy-aware decisions. The relocation to the OPP library will
> >       enable code re-use and all clients will benefit form the platform
> >       abstraction mentioned previously.
> 
> To be quite frank, I am happy to see this leaving thermal subsystem.
> However, a few concerns with the patch set as it is. First, I am not
> convinced PM OPP is the right place to put this, nor I see a good
> explanation put in the patch set why it must be part of PM OPP.

IPA already uses the OPP library to build its power model (to retrieve
the voltage and frequency). The power model itself is a per-OPP
data-structure. As such, moving it to the OPP library felt like a
natural extension of the framework. That said, if you have any
suggestions of better places to put it, please do let me know -- I'm just
hopping to make the power model available to other clients, but it
doesn't necessarily have to be through the OPP library. In the meantime,
I'll rework the cover letter to better explain the choice of PM OPP.

> Second, looks like we are following ARM "good" practice of fixing
> problems of the future. I would only really sign off for this series
> when we see real "other future users"

The first reason behind this posting is that I actually have scheduler
patches depending on this infrastructure that I intend to send on the
list in the coming weeks. I would appreciate to keep the two postings
separated because:
   1. I'd like to avoid sending a patch-bomb to the task scheduler
      maintainers that also touches a number of different sub-systems.
      The purpose of those scheduler patches is to introduce some form
      of energy awareness inside the scheduler, and I'm trying to make
      sure this message won't be lost in the middle of too many
      infrastructure patches.
   2. This infrastructure is not tied only to the aforementioned
      scheduler patches as it can also help another user, namely the
      transition to SCMI (which is currently being discussed on the list).

> otherwise we end up with the
> infamous static power scenario in 2-3 years down the row. If we
> currently do not have users of this IN MAINLINE KERNEL, then the series
> is not for upstream.

Again, this patchset is the first step of an upstreaming work, and as such,
and I hope it can make its way to getting merged.

Thank you very much,
Quentin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library
  2018-01-11  9:42   ` Quentin Perret
@ 2018-01-12 17:24     ` Eduardo Valentin
  2018-01-12 17:44       ` Quentin Perret
  2018-01-15  4:26       ` Viresh Kumar
  0 siblings, 2 replies; 18+ messages in thread
From: Eduardo Valentin @ 2018-01-12 17:24 UTC (permalink / raw)
  To: Quentin Perret
  Cc: linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla, amit.kachhap,
	javi.merino, rui.zhang, matthias.bgg, dietmar.eggemann,
	morten.rasmussen, patrick.bellasi, ionela.voinescu

On Thu, Jan 11, 2018 at 09:42:57AM +0000, Quentin Perret wrote:
> Hi Eduardo,
> 
> On Wednesday 10 Jan 2018 at 11:34:33 (-0800), Eduardo Valentin wrote:
> > On Tue, Jan 09, 2018 at 11:02:50AM +0000, Quentin Perret wrote:
> > > Currently, IPA estimates the power dissipated by a CPU at each available OPP
> > > using its capacitance (the dynamic-power-coefficient DT binding). This series
> > > relocates this feature into the OPP library as a preparation for future
> > > changes. More specifically:
> > > 
> > >    1. The current DT-based approach for power estimation will need deep
> > >       changes to support SCMI-provided power values. While the thermal
> > >       subsystem is not necessarily the best place to hide multiple power
> > >       estimation methods, the OPP library appears to be a good candidate to
> > >       implement the required platform abstraction.
> > >    2. The energy models of CPUs will be needed by other clients in the future
> > >       (such as the task scheduler or CPUFreq governors for example) in order
> > >       to make energy-aware decisions. The relocation to the OPP library will
> > >       enable code re-use and all clients will benefit form the platform
> > >       abstraction mentioned previously.
> > 
> > To be quite frank, I am happy to see this leaving thermal subsystem.
> > However, a few concerns with the patch set as it is. First, I am not
> > convinced PM OPP is the right place to put this, nor I see a good
> > explanation put in the patch set why it must be part of PM OPP.
> 
> IPA already uses the OPP library to build its power model (to retrieve
> the voltage and frequency). The power model itself is a per-OPP
> data-structure. As such, moving it to the OPP library felt like a
> natural extension of the framework. That said, if you have any
> suggestions of better places to put it, please do let me know -- I'm just
> hopping to make the power model available to other clients, but it
> doesn't necessarily have to be through the OPP library. In the meantime,
> I'll rework the cover letter to better explain the choice of PM OPP.
> 
> > Second, looks like we are following ARM "good" practice of fixing
> > problems of the future. I would only really sign off for this series
> > when we see real "other future users"
> 
> The first reason behind this posting is that I actually have scheduler
> patches depending on this infrastructure that I intend to send on the
> list in the coming weeks. I would appreciate to keep the two postings
> separated because:
>    1. I'd like to avoid sending a patch-bomb to the task scheduler
>       maintainers that also touches a number of different sub-systems.
>       The purpose of those scheduler patches is to introduce some form
>       of energy awareness inside the scheduler, and I'm trying to make
>       sure this message won't be lost in the middle of too many
>       infrastructure patches.
>    2. This infrastructure is not tied only to the aforementioned
>       scheduler patches as it can also help another user, namely the
>       transition to SCMI (which is currently being discussed on the list).
> 
> > otherwise we end up with the
> > infamous static power scenario in 2-3 years down the row. If we
> > currently do not have users of this IN MAINLINE KERNEL, then the series
> > is not for upstream.
> 
> Again, this patchset is the first step of an upstreaming work, and as such,
> and I hope it can make its way to getting merged.

I would rather see a single series showing all users of the new API
instead. If you want to split the series and add a link to the users
of the new API into the series that takes it out of thermal subsystem,
I am also fine, as long as I see the other users. Otherwise, this is a
light nack, reason: no real new users of the new API, even though I
can surely see how the scheduler could use it. But if you do not really
present the code of the new users, this is just speculation.

Presenting the entire code of the new API + all its users can help us to
judge if: (1) a new API is really needed, (2) where it should be place.

Cheers,

> 
> Thank you very much,
> Quentin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library
  2018-01-12 17:24     ` Eduardo Valentin
@ 2018-01-12 17:44       ` Quentin Perret
  2018-01-12 17:47         ` Eduardo Valentin
  2018-01-15  4:26       ` Viresh Kumar
  1 sibling, 1 reply; 18+ messages in thread
From: Quentin Perret @ 2018-01-12 17:44 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla, amit.kachhap,
	javi.merino, rui.zhang, matthias.bgg, dietmar.eggemann,
	morten.rasmussen, patrick.bellasi, ionela.voinescu

On Friday 12 Jan 2018 at 09:24:12 (-0800), Eduardo Valentin wrote:
> I would rather see a single series showing all users of the new API
> instead. If you want to split the series and add a link to the users
> of the new API into the series that takes it out of thermal subsystem,
> I am also fine, as long as I see the other users. Otherwise, this is a
> light nack, reason: no real new users of the new API, even though I
> can surely see how the scheduler could use it. But if you do not really
> present the code of the new users, this is just speculation.
> 
> Presenting the entire code of the new API + all its users can help us to
> judge if: (1) a new API is really needed, (2) where it should be place.
> 

Understood. I will still send a v2 next week with the changes requested
by Viresh and I'll mark my upcoming scheduler patches as depending on this
series.

Cheers,
Quentin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library
  2018-01-12 17:44       ` Quentin Perret
@ 2018-01-12 17:47         ` Eduardo Valentin
  2018-01-12 17:50           ` Quentin Perret
  0 siblings, 1 reply; 18+ messages in thread
From: Eduardo Valentin @ 2018-01-12 17:47 UTC (permalink / raw)
  To: Quentin Perret
  Cc: linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla, amit.kachhap,
	javi.merino, rui.zhang, matthias.bgg, dietmar.eggemann,
	morten.rasmussen, patrick.bellasi, ionela.voinescu

On Fri, Jan 12, 2018 at 05:44:10PM +0000, Quentin Perret wrote:
> On Friday 12 Jan 2018 at 09:24:12 (-0800), Eduardo Valentin wrote:
> > I would rather see a single series showing all users of the new API
> > instead. If you want to split the series and add a link to the users
> > of the new API into the series that takes it out of thermal subsystem,
> > I am also fine, as long as I see the other users. Otherwise, this is a
> > light nack, reason: no real new users of the new API, even though I
> > can surely see how the scheduler could use it. But if you do not really
> > present the code of the new users, this is just speculation.
> > 
> > Presenting the entire code of the new API + all its users can help us to
> > judge if: (1) a new API is really needed, (2) where it should be place.
> > 
> 
> Understood. I will still send a v2 next week with the changes requested
> by Viresh and I'll mark my upcoming scheduler patches as depending on this
> series.

Please note on this series the patches that will be depending on this
too, so this present review can also benefit.

> 
> Cheers,
> Quentin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library
  2018-01-12 17:47         ` Eduardo Valentin
@ 2018-01-12 17:50           ` Quentin Perret
  0 siblings, 0 replies; 18+ messages in thread
From: Quentin Perret @ 2018-01-12 17:50 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla, amit.kachhap,
	javi.merino, rui.zhang, matthias.bgg, dietmar.eggemann,
	morten.rasmussen, patrick.bellasi, ionela.voinescu

> Please note on this series the patches that will be depending on this
> too, so this present review can also benefit.

Will do!

Cheers,
Quentin

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library
  2018-01-12 17:24     ` Eduardo Valentin
  2018-01-12 17:44       ` Quentin Perret
@ 2018-01-15  4:26       ` Viresh Kumar
  2018-01-15 17:46         ` Eduardo Valentin
  1 sibling, 1 reply; 18+ messages in thread
From: Viresh Kumar @ 2018-01-15  4:26 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Quentin Perret, linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla,
	amit.kachhap, javi.merino, rui.zhang, matthias.bgg,
	dietmar.eggemann, morten.rasmussen, patrick.bellasi,
	ionela.voinescu

On 12-01-18, 09:24, Eduardo Valentin wrote:
> I would rather see a single series showing all users of the new API
> instead. If you want to split the series and add a link to the users
> of the new API into the series that takes it out of thermal subsystem,
> I am also fine, as long as I see the other users. Otherwise, this is a
> light nack, reason: no real new users of the new API, even though I
> can surely see how the scheduler could use it. But if you do not really
> present the code of the new users, this is just speculation.
> 
> Presenting the entire code of the new API + all its users can help us to
> judge if: (1) a new API is really needed, (2) where it should be place.

So the idea will be to wait until the scheduler maintainers are ready to apply
those patches and then only apply the thermal/OPP patches ?

I am asking because it normally takes *ages* for anything to get accepted in the
scheduler and what gets merged eventually can be very much different than what
was proposed.

-- 
viresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library
  2018-01-15  4:26       ` Viresh Kumar
@ 2018-01-15 17:46         ` Eduardo Valentin
  2018-01-16  9:16           ` Quentin Perret
  0 siblings, 1 reply; 18+ messages in thread
From: Eduardo Valentin @ 2018-01-15 17:46 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Quentin Perret, linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla,
	amit.kachhap, javi.merino, rui.zhang, matthias.bgg,
	dietmar.eggemann, morten.rasmussen, patrick.bellasi,
	ionela.voinescu

On Mon, Jan 15, 2018 at 09:56:17AM +0530, Viresh Kumar wrote:
> On 12-01-18, 09:24, Eduardo Valentin wrote:
> > I would rather see a single series showing all users of the new API
> > instead. If you want to split the series and add a link to the users
> > of the new API into the series that takes it out of thermal subsystem,
> > I am also fine, as long as I see the other users. Otherwise, this is a
> > light nack, reason: no real new users of the new API, even though I
> > can surely see how the scheduler could use it. But if you do not really
> > present the code of the new users, this is just speculation.
> > 
> > Presenting the entire code of the new API + all its users can help us to
> > judge if: (1) a new API is really needed, (2) where it should be place.
> 
> So the idea will be to wait until the scheduler maintainers are ready to apply
> those patches and then only apply the thermal/OPP patches ?

Yes.

> 
> I am asking because it normally takes *ages* for anything to get accepted in the
> scheduler and what gets merged eventually can be very much different than what
> was proposed.

Well, if the only other user of the new API is not ready to accept the
changes, it means, the API is not really ready to be changed, right?

> 
> -- 
> viresh

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library
  2018-01-15 17:46         ` Eduardo Valentin
@ 2018-01-16  9:16           ` Quentin Perret
  0 siblings, 0 replies; 18+ messages in thread
From: Quentin Perret @ 2018-01-16  9:16 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Viresh Kumar, linux-pm, rjw, vireshk, nm, sboyd, sudeep.holla,
	amit.kachhap, javi.merino, rui.zhang, matthias.bgg,
	dietmar.eggemann, morten.rasmussen, patrick.bellasi,
	ionela.voinescu

Hi Eduardo,

On Monday 15 Jan 2018 at 09:46:04 (-0800), Eduardo Valentin wrote:
> On Mon, Jan 15, 2018 at 09:56:17AM +0530, Viresh Kumar wrote:
> > On 12-01-18, 09:24, Eduardo Valentin wrote:
> > > I would rather see a single series showing all users of the new API
> > > instead. If you want to split the series and add a link to the users
> > > of the new API into the series that takes it out of thermal subsystem,
> > > I am also fine, as long as I see the other users. Otherwise, this is a
> > > light nack, reason: no real new users of the new API, even though I
> > > can surely see how the scheduler could use it. But if you do not really
> > > present the code of the new users, this is just speculation.
> > > 
> > > Presenting the entire code of the new API + all its users can help us to
> > > judge if: (1) a new API is really needed, (2) where it should be place.
> > 
> > So the idea will be to wait until the scheduler maintainers are ready to apply
> > those patches and then only apply the thermal/OPP patches ?
> 
> Yes.
> 
> > 
> > I am asking because it normally takes *ages* for anything to get accepted in the
> > scheduler and what gets merged eventually can be very much different than what
> > was proposed.
> 
> Well, if the only other user of the new API is not ready to accept the
> changes, it means, the API is not really ready to be changed, right?

The scheduler will probably be the only new client per-se of the power model,
but there is another use case where the OPP library can help today, namely
SCMI. You cannot fetch voltage information for the OPPs with SCMI so the
whole P=CV^2f approach doesn't work IIUC. However, SCMI gives you a power
value for each OPP directly. Hence, having a dev_pm_opp_get_power() function
behind which we can hide the DT-or-SCMI business might be useful. The
thermal subsystem already relies on the OPP library to provide the
platform abstraction (dev_pm_opp_get_voltage() returns values read from
the DT or coming from SCPI) so putting the SCMI abstraction here again kind
of makes sense.

This posting doesn't implement the SCMI abstraction or the energy-awareness
in the scheduler, but it contains the common denominator between the two, and
that's another reason why I though this could fly on its own.

I hope that's useful.

Thanks,
Quentin

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2018-01-16  9:16 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-09 11:02 [PATCH 0/2] thermal, OPP: move the CPU power estimation to the OPP library Quentin Perret
2018-01-09 11:02 ` [PATCH 1/2] PM / OPP: introduce an OPP power estimation helper Quentin Perret
2018-01-10  4:36   ` Viresh Kumar
2018-01-10 10:20     ` Quentin Perret
2018-01-10 10:25       ` Viresh Kumar
2018-01-10 10:36         ` Quentin Perret
2018-01-09 11:02 ` [PATCH 2/2] thermal: cpu_cooling: use power models from the OPP library Quentin Perret
2018-01-10  4:37   ` Viresh Kumar
2018-01-10 19:34 ` [PATCH 0/2] thermal, OPP: move the CPU power estimation to " Eduardo Valentin
2018-01-11  9:42   ` Viresh Kumar
2018-01-11  9:42   ` Quentin Perret
2018-01-12 17:24     ` Eduardo Valentin
2018-01-12 17:44       ` Quentin Perret
2018-01-12 17:47         ` Eduardo Valentin
2018-01-12 17:50           ` Quentin Perret
2018-01-15  4:26       ` Viresh Kumar
2018-01-15 17:46         ` Eduardo Valentin
2018-01-16  9:16           ` Quentin Perret

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.