linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/5] Register an Energy Model for Arm reference platforms
@ 2019-01-30 17:05 Quentin Perret
  2019-01-30 17:05 ` [PATCH v2 1/5] PM / OPP: Introduce a power estimation helper Quentin Perret
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Quentin Perret @ 2019-01-30 17:05 UTC (permalink / raw)
  To: viresh.kumar, sudeep.holla, rjw, nm, sboyd, mka
  Cc: linux-pm, linux-kernel, linux-arm-kernel, dietmar.eggemann,
	quentin.perret

The Energy Model (EM) framework feeds interested subsystems (the
scheduler/EAS as of now) with power costs provided by drivers. Yet, no
driver is actually doing that upstream yet. This series updates a set of
CPUFreq drivers in order to register power costs in the EM framework for
some of the Arm reference platforms for EAS: Hikey960, Juno and TC2.

The series is split as follows:
 - Patch 01 introduces in PM_OPP a helper function which estimates the
   CPU power using the P=CV²f equation also used by IPA. It should be
   noted that this introduces duplicate code with IPA, which will
   eventually be fixed by migrating IPA to using PM_EM. The ideal plan
   would be to do so later, in a separate patch series. I would indeed
   prefer to keep the thermal and CPUFreq discussion separate at this
   stage, if deemed acceptable.
 - Patches 02-04 make use of that PM_OPP helper function from the
   following CPUFreq drivers: cpufreq-dt, scpi-cpufreq and
   arm_big_little.
 - Patch 05 modifies the SCMI cpufreq driver to pass the power costs
   obtained from firmware to PM_EM. This patch is independent from the
   rest of the series.


Changes since v1 (20190128165522.31749-1-quentin.perret@arm.com):
 - Dropped the DT patches that have been queued by Sudeep
 - Introduced dev_pm_opp_of_register_em() helper to check the presence
   of the DT coeff before calling PM_EM (Viresh, Matthias)
 - Coding-style improvements (Viresh, Matthias)


Thanks,
Quentin


Dietmar Eggemann (1):
  cpufreq: arm_big_little: Register an Energy Model

Quentin Perret (4):
  PM / OPP: Introduce a power estimation helper
  cpufreq: dt: Register an Energy Model
  cpufreq: scpi: Register an Energy Model
  cpufreq: scmi: Register an Energy Model

 drivers/cpufreq/arm_big_little.c |  8 +++
 drivers/cpufreq/cpufreq-dt.c     |  8 +--
 drivers/cpufreq/scmi-cpufreq.c   | 39 ++++++++++++--
 drivers/cpufreq/scpi-cpufreq.c   |  9 ++--
 drivers/opp/of.c                 | 88 ++++++++++++++++++++++++++++++++
 include/linux/pm_opp.h           |  6 +++
 6 files changed, 149 insertions(+), 9 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/5] PM / OPP: Introduce a power estimation helper
  2019-01-30 17:05 [PATCH v2 0/5] Register an Energy Model for Arm reference platforms Quentin Perret
@ 2019-01-30 17:05 ` Quentin Perret
  2019-01-30 19:07   ` Matthias Kaehlcke
  2019-01-31  7:26   ` Viresh Kumar
  2019-01-30 17:05 ` [PATCH v2 2/5] cpufreq: dt: Register an Energy Model Quentin Perret
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 13+ messages in thread
From: Quentin Perret @ 2019-01-30 17:05 UTC (permalink / raw)
  To: viresh.kumar, sudeep.holla, rjw, nm, sboyd, mka
  Cc: linux-pm, linux-kernel, linux-arm-kernel, dietmar.eggemann,
	quentin.perret

The Energy Model (EM) framework provides an API to let drivers register
the active power of CPUs. The drivers are expected to provide a callback
method which estimates the power consumed by a CPU at each available
performance levels. How exactly this should be implemented, however,
depends on the platform.

On some systems, PM_OPP knows the voltage and frequency at which CPUs
can run. When coupled with the CPU 'capacitance' (as provided by the
'dynamic-power-coefficient' devicetree binding), it is possible to
estimate the dynamic power consumption of a CPU as P = C * V^2 * f, with
C its capacitance and V and f respectively the voltage and frequency of
the OPP. The Intelligent Power Allocator (IPA) thermal governor already
implements that estimation method, in the thermal framework.

However, this power estimation method can be applied to any platform
where all the parameters are known (C, V and f), and not only those
suffering thermal issues. As such, the code implementing this feature
can be re-used to also populate the EM framework now used by EAS.

As a first step, introduce in PM_OPP a helper function which CPUFreq
drivers can use to register into the EM framework. This duplicates the
power estimation done in IPA until it can be migrated to using the EM
framework. This will be done later, once the EM framework has support
for at least all platforms currently supported by IPA.

Signed-off-by: Quentin Perret <quentin.perret@arm.com>

---

Matthias: Given this patch changed a bit I dropped your Reviewed-by and
Tested-by, but let me know if you think they still hold.
---
 drivers/opp/of.c       | 88 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm_opp.h |  6 +++
 2 files changed, 94 insertions(+)

diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index 06f0f632ec47..4c8bf172e9ed 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -20,6 +20,7 @@
 #include <linux/pm_domain.h>
 #include <linux/slab.h>
 #include <linux/export.h>
+#include <linux/energy_model.h>
 
 #include "opp.h"
 
@@ -1047,3 +1048,90 @@ struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp)
 	return of_node_get(opp->np);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_of_node);
+
+/*
+ * Callback function provided to the Energy Model framework upon registration.
+ * This computes the power estimated by @CPU at the first OPP above @kHz (ceil),
+ * and updates @kHz and @mW accordingly. The power is estimated as
+ * P = C * V^2 * f with C being the CPU's capacitance and V and f respectively
+ * the voltage and frequency of the OPP.
+ *
+ * Returns -ENODEV if the CPU device cannot be found, -EINVAL if the power
+ * calculation failed because of missing parameters, 0 otherwise.
+ */
+static int __maybe_unused _get_cpu_power(unsigned long *mW, unsigned long *kHz,
+					 int cpu)
+{
+	struct device *cpu_dev;
+	struct dev_pm_opp *opp;
+	struct device_node *np;
+	unsigned long mV, Hz;
+	u32 cap;
+	u64 tmp;
+	int ret;
+
+	cpu_dev = get_cpu_device(cpu);
+	if (!cpu_dev)
+		return -ENODEV;
+
+	np = of_node_get(cpu_dev->of_node);
+	if (!np)
+		return -EINVAL;
+
+	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
+	of_node_put(np);
+	if (ret)
+		return -EINVAL;
+
+	Hz = *kHz * 1000;
+	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz);
+	if (IS_ERR(opp))
+		return -EINVAL;
+
+	mV = dev_pm_opp_get_voltage(opp) / 1000;
+	dev_pm_opp_put(opp);
+	if (!mV)
+		return -EINVAL;
+
+	tmp = (u64)cap * mV * mV * (Hz / 1000000);
+	do_div(tmp, 1000000000);
+
+	*mW = (unsigned long)tmp;
+	*kHz = Hz / 1000;
+
+	return 0;
+}
+
+/**
+ * dev_pm_opp_of_register_em() - Attempt to register an Energy Model
+ * @cpus	: CPUs for which an Energy Model has to be registered
+ * @nr_opp	: Number of OPPs to register in the Energy Model
+ *
+ * This checks whether the "dynamic-power-coefficient" devicetree binding has
+ * been specified, and tries to register an Energy Model with it if it has.
+ */
+void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp)
+{
+	struct em_data_callback em_cb = EM_DATA_CB(_get_cpu_power);
+	int ret, cpu = cpumask_first(cpus);
+	struct device *cpu_dev;
+	struct device_node *np;
+	u32 cap;
+
+	cpu_dev = get_cpu_device(cpu);
+	if (!cpu_dev)
+		return;
+
+	np = of_node_get(cpu_dev->of_node);
+	if (!np)
+		return;
+
+	/* Don't register an EM without the right DT binding */
+	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
+	of_node_put(np);
+	if (ret || !cap)
+		return;
+
+	em_register_perf_domain(cpus, nr_opp, &em_cb);
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_of_register_em);
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index b895f4e79868..58ae08b024bd 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -327,6 +327,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma
 struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev);
 struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp);
 int of_get_required_opp_performance_state(struct device_node *np, int index);
+void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp);
 #else
 static inline int dev_pm_opp_of_add_table(struct device *dev)
 {
@@ -365,6 +366,11 @@ static inline struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp)
 {
 	return NULL;
 }
+
+static inline void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp)
+{
+}
+
 static inline int of_get_required_opp_performance_state(struct device_node *np, int index)
 {
 	return -ENOTSUPP;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/5] cpufreq: dt: Register an Energy Model
  2019-01-30 17:05 [PATCH v2 0/5] Register an Energy Model for Arm reference platforms Quentin Perret
  2019-01-30 17:05 ` [PATCH v2 1/5] PM / OPP: Introduce a power estimation helper Quentin Perret
@ 2019-01-30 17:05 ` Quentin Perret
  2019-01-30 17:05 ` [PATCH v2 3/5] cpufreq: scpi: " Quentin Perret
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Quentin Perret @ 2019-01-30 17:05 UTC (permalink / raw)
  To: viresh.kumar, sudeep.holla, rjw, nm, sboyd, mka
  Cc: linux-pm, linux-kernel, linux-arm-kernel, dietmar.eggemann,
	quentin.perret

Now that PM_OPP provides a helper function to estimate the power
consumed by CPUs, make sure to try and register an Energy Model (EM)
from cpufreq-dt, hence ensuring interested subsystems (the task
scheduler, for example) can make use of that information when available.

Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 drivers/cpufreq/cpufreq-dt.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index e58bfcb1169e..1d4ad6f67408 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -160,7 +160,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 	unsigned int transition_latency;
 	bool fallback = false;
 	const char *name;
-	int ret;
+	int ret, nr_opp;
 
 	cpu_dev = get_cpu_device(policy->cpu);
 	if (!cpu_dev) {
@@ -231,8 +231,8 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 	 * But we need OPP table to function so if it is not there let's
 	 * give platform code chance to provide it for us.
 	 */
-	ret = dev_pm_opp_get_opp_count(cpu_dev);
-	if (ret <= 0) {
+	nr_opp = dev_pm_opp_get_opp_count(cpu_dev);
+	if (nr_opp <= 0) {
 		dev_dbg(cpu_dev, "OPP table is not ready, deferring probe\n");
 		ret = -EPROBE_DEFER;
 		goto out_free_opp;
@@ -280,6 +280,8 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 	policy->cpuinfo.transition_latency = transition_latency;
 	policy->dvfs_possible_from_any_cpu = true;
 
+	dev_pm_opp_of_register_em(policy->cpus, nr_opp);
+
 	return 0;
 
 out_free_cpufreq_table:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 3/5] cpufreq: scpi: Register an Energy Model
  2019-01-30 17:05 [PATCH v2 0/5] Register an Energy Model for Arm reference platforms Quentin Perret
  2019-01-30 17:05 ` [PATCH v2 1/5] PM / OPP: Introduce a power estimation helper Quentin Perret
  2019-01-30 17:05 ` [PATCH v2 2/5] cpufreq: dt: Register an Energy Model Quentin Perret
@ 2019-01-30 17:05 ` Quentin Perret
  2019-01-30 17:05 ` [PATCH v2 4/5] cpufreq: arm_big_little: " Quentin Perret
  2019-01-30 17:05 ` [PATCH v2 5/5] cpufreq: scmi: " Quentin Perret
  4 siblings, 0 replies; 13+ messages in thread
From: Quentin Perret @ 2019-01-30 17:05 UTC (permalink / raw)
  To: viresh.kumar, sudeep.holla, rjw, nm, sboyd, mka
  Cc: linux-pm, linux-kernel, linux-arm-kernel, dietmar.eggemann,
	quentin.perret

Now that PM_OPP provides a helper function to estimate the power
consumed by CPUs, make sure to try and register an Energy Model (EM)
from scpi-cpufreq, hence ensuring interested subsystems (the task
scheduler, for example) can make use of that information when available.

Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 drivers/cpufreq/scpi-cpufreq.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/scpi-cpufreq.c b/drivers/cpufreq/scpi-cpufreq.c
index 99449738faa4..8e77a67a8d8c 100644
--- a/drivers/cpufreq/scpi-cpufreq.c
+++ b/drivers/cpufreq/scpi-cpufreq.c
@@ -98,7 +98,7 @@ scpi_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask)
 
 static int scpi_cpufreq_init(struct cpufreq_policy *policy)
 {
-	int ret;
+	int ret, nr_opp;
 	unsigned int latency;
 	struct device *cpu_dev;
 	struct scpi_data *priv;
@@ -129,8 +129,8 @@ static int scpi_cpufreq_init(struct cpufreq_policy *policy)
 		return ret;
 	}
 
-	ret = dev_pm_opp_get_opp_count(cpu_dev);
-	if (ret <= 0) {
+	nr_opp = dev_pm_opp_get_opp_count(cpu_dev);
+	if (nr_opp <= 0) {
 		dev_dbg(cpu_dev, "OPP table is not ready, deferring probe\n");
 		ret = -EPROBE_DEFER;
 		goto out_free_opp;
@@ -170,6 +170,9 @@ static int scpi_cpufreq_init(struct cpufreq_policy *policy)
 	policy->cpuinfo.transition_latency = latency;
 
 	policy->fast_switch_possible = false;
+
+	dev_pm_opp_of_register_em(policy->cpus, nr_opp);
+
 	return 0;
 
 out_free_cpufreq_table:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 4/5] cpufreq: arm_big_little: Register an Energy Model
  2019-01-30 17:05 [PATCH v2 0/5] Register an Energy Model for Arm reference platforms Quentin Perret
                   ` (2 preceding siblings ...)
  2019-01-30 17:05 ` [PATCH v2 3/5] cpufreq: scpi: " Quentin Perret
@ 2019-01-30 17:05 ` Quentin Perret
  2019-01-30 17:05 ` [PATCH v2 5/5] cpufreq: scmi: " Quentin Perret
  4 siblings, 0 replies; 13+ messages in thread
From: Quentin Perret @ 2019-01-30 17:05 UTC (permalink / raw)
  To: viresh.kumar, sudeep.holla, rjw, nm, sboyd, mka
  Cc: linux-pm, linux-kernel, linux-arm-kernel, dietmar.eggemann,
	quentin.perret

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

Now that PM_OPP provides a helper function to estimate the power
consumed by CPUs, make sure to try and register an Energy Model (EM)
from the arm_big_little CPUFreq driver, hence ensuring interested
subsystems (the task scheduler, for example) can make use of that
information when available.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 drivers/cpufreq/arm_big_little.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c
index cf62a1f64dd7..18b05bcb2614 100644
--- a/drivers/cpufreq/arm_big_little.c
+++ b/drivers/cpufreq/arm_big_little.c
@@ -487,6 +487,14 @@ static int bL_cpufreq_init(struct cpufreq_policy *policy)
 	policy->cpuinfo.transition_latency =
 				arm_bL_ops->get_transition_latency(cpu_dev);
 
+	ret = dev_pm_opp_get_opp_count(cpu_dev);
+	if (ret <= 0) {
+		dev_dbg(cpu_dev, "OPP table is not ready, deferring probe\n");
+		return -EPROBE_DEFER;
+	}
+
+	dev_pm_opp_of_register_em(policy->cpus, ret);
+
 	if (is_bL_switching_enabled())
 		per_cpu(cpu_last_req_freq, policy->cpu) = clk_get_cpu_rate(policy->cpu);
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 5/5] cpufreq: scmi: Register an Energy Model
  2019-01-30 17:05 [PATCH v2 0/5] Register an Energy Model for Arm reference platforms Quentin Perret
                   ` (3 preceding siblings ...)
  2019-01-30 17:05 ` [PATCH v2 4/5] cpufreq: arm_big_little: " Quentin Perret
@ 2019-01-30 17:05 ` Quentin Perret
  4 siblings, 0 replies; 13+ messages in thread
From: Quentin Perret @ 2019-01-30 17:05 UTC (permalink / raw)
  To: viresh.kumar, sudeep.holla, rjw, nm, sboyd, mka
  Cc: linux-pm, linux-kernel, linux-arm-kernel, dietmar.eggemann,
	quentin.perret

The Energy Model (EM) framework provides an API to register the active
power of CPUs. Call this API from the scmi-cpufreq driver by using the
power costs obtained from firmware. This is done to ensure interested
subsystems (the task scheduler, for example) can make use of the EM
when available.

Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 drivers/cpufreq/scmi-cpufreq.c | 39 +++++++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/scmi-cpufreq.c b/drivers/cpufreq/scmi-cpufreq.c
index 242c3370544e..0d87e1433296 100644
--- a/drivers/cpufreq/scmi-cpufreq.c
+++ b/drivers/cpufreq/scmi-cpufreq.c
@@ -12,6 +12,7 @@
 #include <linux/cpufreq.h>
 #include <linux/cpumask.h>
 #include <linux/cpu_cooling.h>
+#include <linux/energy_model.h>
 #include <linux/export.h>
 #include <linux/module.h>
 #include <linux/pm_opp.h>
@@ -103,13 +104,42 @@ scmi_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask)
 	return 0;
 }
 
+static int __maybe_unused
+scmi_get_cpu_power(unsigned long *power, unsigned long *KHz, int cpu)
+{
+	struct device *cpu_dev = get_cpu_device(cpu);
+	unsigned long Hz;
+	int ret, domain;
+
+	if (!cpu_dev) {
+		pr_err("failed to get cpu%d device\n", cpu);
+		return -ENODEV;
+	}
+
+	domain = handle->perf_ops->device_domain_id(cpu_dev);
+	if (domain < 0)
+		return domain;
+
+	/* Get the power cost of the performance domain. */
+	Hz = *KHz * 1000;
+	ret = handle->perf_ops->est_power_get(handle, domain, &Hz, power);
+	if (ret)
+		return ret;
+
+	/* The EM framework specifies the frequency in KHz. */
+	*KHz = Hz / 1000;
+
+	return 0;
+}
+
 static int scmi_cpufreq_init(struct cpufreq_policy *policy)
 {
-	int ret;
+	int ret, nr_opp;
 	unsigned int latency;
 	struct device *cpu_dev;
 	struct scmi_data *priv;
 	struct cpufreq_frequency_table *freq_table;
+	struct em_data_callback em_cb = EM_DATA_CB(scmi_get_cpu_power);
 
 	cpu_dev = get_cpu_device(policy->cpu);
 	if (!cpu_dev) {
@@ -136,8 +166,8 @@ static int scmi_cpufreq_init(struct cpufreq_policy *policy)
 		return ret;
 	}
 
-	ret = dev_pm_opp_get_opp_count(cpu_dev);
-	if (ret <= 0) {
+	nr_opp = dev_pm_opp_get_opp_count(cpu_dev);
+	if (nr_opp <= 0) {
 		dev_dbg(cpu_dev, "OPP table is not ready, deferring probe\n");
 		ret = -EPROBE_DEFER;
 		goto out_free_opp;
@@ -171,6 +201,9 @@ static int scmi_cpufreq_init(struct cpufreq_policy *policy)
 	policy->cpuinfo.transition_latency = latency;
 
 	policy->fast_switch_possible = true;
+
+	em_register_perf_domain(policy->cpus, nr_opp, &em_cb);
+
 	return 0;
 
 out_free_priv:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/5] PM / OPP: Introduce a power estimation helper
  2019-01-30 17:05 ` [PATCH v2 1/5] PM / OPP: Introduce a power estimation helper Quentin Perret
@ 2019-01-30 19:07   ` Matthias Kaehlcke
  2019-01-31  7:22     ` Viresh Kumar
  2019-01-31  9:42     ` Quentin Perret
  2019-01-31  7:26   ` Viresh Kumar
  1 sibling, 2 replies; 13+ messages in thread
From: Matthias Kaehlcke @ 2019-01-30 19:07 UTC (permalink / raw)
  To: Quentin Perret
  Cc: viresh.kumar, sudeep.holla, rjw, nm, sboyd, linux-pm,
	linux-kernel, linux-arm-kernel, dietmar.eggemann

Hi Quentin,

On Wed, Jan 30, 2019 at 05:05:02PM +0000, Quentin Perret wrote:
> The Energy Model (EM) framework provides an API to let drivers register
> the active power of CPUs. The drivers are expected to provide a callback
> method which estimates the power consumed by a CPU at each available
> performance levels. How exactly this should be implemented, however,
> depends on the platform.
> 
> On some systems, PM_OPP knows the voltage and frequency at which CPUs
> can run. When coupled with the CPU 'capacitance' (as provided by the
> 'dynamic-power-coefficient' devicetree binding), it is possible to
> estimate the dynamic power consumption of a CPU as P = C * V^2 * f, with
> C its capacitance and V and f respectively the voltage and frequency of
> the OPP. The Intelligent Power Allocator (IPA) thermal governor already
> implements that estimation method, in the thermal framework.
> 
> However, this power estimation method can be applied to any platform
> where all the parameters are known (C, V and f), and not only those
> suffering thermal issues. As such, the code implementing this feature
> can be re-used to also populate the EM framework now used by EAS.
> 
> As a first step, introduce in PM_OPP a helper function which CPUFreq
> drivers can use to register into the EM framework. This duplicates the
> power estimation done in IPA until it can be migrated to using the EM
> framework. This will be done later, once the EM framework has support
> for at least all platforms currently supported by IPA.
> 
> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> 
> ---
> 
> Matthias: Given this patch changed a bit I dropped your Reviewed-by and
> Tested-by, but let me know if you think they still hold.
> ---
>  drivers/opp/of.c       | 88 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pm_opp.h |  6 +++
>  2 files changed, 94 insertions(+)
> 
> diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> index 06f0f632ec47..4c8bf172e9ed 100644
> --- a/drivers/opp/of.c
> +++ b/drivers/opp/of.c
> @@ -20,6 +20,7 @@
>  #include <linux/pm_domain.h>
>  #include <linux/slab.h>
>  #include <linux/export.h>
> +#include <linux/energy_model.h>

nit: AFAIK typically alphabetical order is used for includes, though
this file doesn't exactly adhere to it.

>  #include "opp.h"
>  
> @@ -1047,3 +1048,90 @@ struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp)
>  	return of_node_get(opp->np);
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_get_of_node);
> +
> +/*
> + * Callback function provided to the Energy Model framework upon registration.
> + * This computes the power estimated by @CPU at the first OPP above @kHz (ceil),

that's not entirely correct, it could be the OPP at @kHz.

> + * and updates @kHz and @mW accordingly. The power is estimated as
> + * P = C * V^2 * f with C being the CPU's capacitance and V and f respectively
> + * the voltage and frequency of the OPP.
> + *
> + * Returns -ENODEV if the CPU device cannot be found, -EINVAL if the power
> + * calculation failed because of missing parameters, 0 otherwise.
> + */
> +static int __maybe_unused _get_cpu_power(unsigned long *mW, unsigned long *kHz,
> +					 int cpu)

why __maybe_unused?

> +{
> +	struct device *cpu_dev;
> +	struct dev_pm_opp *opp;
> +	struct device_node *np;
> +	unsigned long mV, Hz;
> +	u32 cap;
> +	u64 tmp;
> +	int ret;
> +
> +	cpu_dev = get_cpu_device(cpu);
> +	if (!cpu_dev)
> +		return -ENODEV;
> +
> +	np = of_node_get(cpu_dev->of_node);
> +	if (!np)
> +		return -EINVAL;
> +
> +	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
> +	of_node_put(np);
> +	if (ret)
> +		return -EINVAL;
> +
> +	Hz = *kHz * 1000;
> +	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz);
> +	if (IS_ERR(opp))
> +		return -EINVAL;
> +
> +	mV = dev_pm_opp_get_voltage(opp) / 1000;
> +	dev_pm_opp_put(opp);
> +	if (!mV)
> +		return -EINVAL;
> +
> +	tmp = (u64)cap * mV * mV * (Hz / 1000000);
> +	do_div(tmp, 1000000000);
> +
> +	*mW = (unsigned long)tmp;
> +	*kHz = Hz / 1000;
> +
> +	return 0;
> +}
> +
> +/**
> + * dev_pm_opp_of_register_em() - Attempt to register an Energy Model
> + * @cpus	: CPUs for which an Energy Model has to be registered
> + * @nr_opp	: Number of OPPs to register in the Energy Model
> + *
> + * This checks whether the "dynamic-power-coefficient" devicetree binding has

s/binding/property/ ?

> + * been specified, and tries to register an Energy Model with it if it has.
> + */
> +void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp)

Is the nr_opp parameter really needed? The function looks up the CPU
device and hence could determine the OPP count itself with
dev_pm_opp_get_opp_count(). I see most cpufreq drivers call
dev_pm_opp_get_opp_count() anyway, so passing the count as parameter
can be considered a small optimization, not sure how relevant it is
though, since dev_pm_opp_of_register_em() isn't called frequently.

> +{
> +	struct em_data_callback em_cb = EM_DATA_CB(_get_cpu_power);
> +	int ret, cpu = cpumask_first(cpus);
> +	struct device *cpu_dev;
> +	struct device_node *np;
> +	u32 cap;
> +
> +	cpu_dev = get_cpu_device(cpu);
> +	if (!cpu_dev)
> +		return;
> +
> +	np = of_node_get(cpu_dev->of_node);
> +	if (!np)
> +		return;
> +
> +	/* Don't register an EM without the right DT binding */
> +	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
> +	of_node_put(np);
> +	if (ret || !cap)
> +		return;
> +
> +	em_register_perf_domain(cpus, nr_opp, &em_cb);
> +}
> +EXPORT_SYMBOL_GPL(dev_pm_opp_of_register_em);
> diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
> index b895f4e79868..58ae08b024bd 100644
> --- a/include/linux/pm_opp.h
> +++ b/include/linux/pm_opp.h
> @@ -327,6 +327,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma
>  struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev);
>  struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp);
>  int of_get_required_opp_performance_state(struct device_node *np, int index);
> +void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp);
>  #else
>  static inline int dev_pm_opp_of_add_table(struct device *dev)
>  {
> @@ -365,6 +366,11 @@ static inline struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp)
>  {
>  	return NULL;
>  }
> +
> +static inline void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp)
> +{
> +}
> +
>  static inline int of_get_required_opp_performance_state(struct device_node *np, int index)
>  {
>  	return -ENOTSUPP;

Tested-by: Matthias Kaehlcke <mka@chromium.org>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/5] PM / OPP: Introduce a power estimation helper
  2019-01-30 19:07   ` Matthias Kaehlcke
@ 2019-01-31  7:22     ` Viresh Kumar
  2019-01-31  9:34       ` Quentin Perret
  2019-01-31  9:42     ` Quentin Perret
  1 sibling, 1 reply; 13+ messages in thread
From: Viresh Kumar @ 2019-01-31  7:22 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: Quentin Perret, sudeep.holla, rjw, nm, sboyd, linux-pm,
	linux-kernel, linux-arm-kernel, dietmar.eggemann

On 30-01-19, 11:07, Matthias Kaehlcke wrote:
> On Wed, Jan 30, 2019 at 05:05:02PM +0000, Quentin Perret wrote:
> > +static int __maybe_unused _get_cpu_power(unsigned long *mW, unsigned long *kHz,
> > +					 int cpu)
> 
> why __maybe_unused?

Yeah, it isn't required I think. He probably added it for the case
where CONFIG_ENERGY_MODEL=n, but even then an inline routine is
defined which will accept it as argument and wouldn't do anything with
it. Had it been a macro, we would have required __maybe_unused but not
now.

-- 
viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/5] PM / OPP: Introduce a power estimation helper
  2019-01-30 17:05 ` [PATCH v2 1/5] PM / OPP: Introduce a power estimation helper Quentin Perret
  2019-01-30 19:07   ` Matthias Kaehlcke
@ 2019-01-31  7:26   ` Viresh Kumar
  2019-01-31  9:51     ` Quentin Perret
  1 sibling, 1 reply; 13+ messages in thread
From: Viresh Kumar @ 2019-01-31  7:26 UTC (permalink / raw)
  To: Quentin Perret
  Cc: sudeep.holla, rjw, nm, sboyd, mka, linux-pm, linux-kernel,
	linux-arm-kernel, dietmar.eggemann

On 30-01-19, 17:05, Quentin Perret wrote:
> +static int __maybe_unused _get_cpu_power(unsigned long *mW, unsigned long *kHz,
> +					 int cpu)
> +{
> +	struct device *cpu_dev;
> +	struct dev_pm_opp *opp;
> +	struct device_node *np;
> +	unsigned long mV, Hz;
> +	u32 cap;
> +	u64 tmp;
> +	int ret;
> +
> +	cpu_dev = get_cpu_device(cpu);
> +	if (!cpu_dev)
> +		return -ENODEV;
> +
> +	np = of_node_get(cpu_dev->of_node);
> +	if (!np)
> +		return -EINVAL;
> +
> +	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
> +	of_node_put(np);
> +	if (ret)
> +		return -EINVAL;
> +
> +	Hz = *kHz * 1000;
> +	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz);
> +	if (IS_ERR(opp))
> +		return -EINVAL;
> +
> +	mV = dev_pm_opp_get_voltage(opp) / 1000;

The voltage is also stored as triplet now a days and we must consider
the higher value for these calculations. Also what about the case of
multiple regulators here or performance-states ?

> +	dev_pm_opp_put(opp);
> +	if (!mV)
> +		return -EINVAL;
> +
> +	tmp = (u64)cap * mV * mV * (Hz / 1000000);
> +	do_div(tmp, 1000000000);
> +
> +	*mW = (unsigned long)tmp;

I was thinking will it be better if we just save this information in
opp->power field during init, so we can just read a value here
instead. But I am still not sure :(

> +	*kHz = Hz / 1000;
> +
> +	return 0;
> +}
> +
> +/**
> + * dev_pm_opp_of_register_em() - Attempt to register an Energy Model
> + * @cpus	: CPUs for which an Energy Model has to be registered
> + * @nr_opp	: Number of OPPs to register in the Energy Model
> + *
> + * This checks whether the "dynamic-power-coefficient" devicetree binding has
> + * been specified, and tries to register an Energy Model with it if it has.
> + */
> +void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp)
> +{
> +	struct em_data_callback em_cb = EM_DATA_CB(_get_cpu_power);
> +	int ret, cpu = cpumask_first(cpus);
> +	struct device *cpu_dev;
> +	struct device_node *np;
> +	u32 cap;
> +
> +	cpu_dev = get_cpu_device(cpu);
> +	if (!cpu_dev)
> +		return;
> +
> +	np = of_node_get(cpu_dev->of_node);
> +	if (!np)
> +		return;
> +
> +	/* Don't register an EM without the right DT binding */
> +	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
> +	of_node_put(np);
> +	if (ret || !cap)
> +		return;

What if no voltage is supplied in DT ?

> +
> +	em_register_perf_domain(cpus, nr_opp, &em_cb);
> +}
> +EXPORT_SYMBOL_GPL(dev_pm_opp_of_register_em);
> diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
> index b895f4e79868..58ae08b024bd 100644
> --- a/include/linux/pm_opp.h
> +++ b/include/linux/pm_opp.h
> @@ -327,6 +327,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma
>  struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev);
>  struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp);
>  int of_get_required_opp_performance_state(struct device_node *np, int index);
> +void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp);
>  #else
>  static inline int dev_pm_opp_of_add_table(struct device *dev)
>  {
> @@ -365,6 +366,11 @@ static inline struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp)
>  {
>  	return NULL;
>  }
> +
> +static inline void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp)
> +{
> +}
> +
>  static inline int of_get_required_opp_performance_state(struct device_node *np, int index)
>  {
>  	return -ENOTSUPP;
> -- 
> 2.20.1

-- 
viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/5] PM / OPP: Introduce a power estimation helper
  2019-01-31  7:22     ` Viresh Kumar
@ 2019-01-31  9:34       ` Quentin Perret
  2019-01-31  9:37         ` Viresh Kumar
  0 siblings, 1 reply; 13+ messages in thread
From: Quentin Perret @ 2019-01-31  9:34 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Matthias Kaehlcke, sudeep.holla, rjw, nm, sboyd, linux-pm,
	linux-kernel, linux-arm-kernel, dietmar.eggemann

On Thursday 31 Jan 2019 at 12:52:09 (+0530), Viresh Kumar wrote:
> On 30-01-19, 11:07, Matthias Kaehlcke wrote:
> > On Wed, Jan 30, 2019 at 05:05:02PM +0000, Quentin Perret wrote:
> > > +static int __maybe_unused _get_cpu_power(unsigned long *mW, unsigned long *kHz,
> > > +					 int cpu)
> > 
> > why __maybe_unused?
> 
> Yeah, it isn't required I think. He probably added it for the case
> where CONFIG_ENERGY_MODEL=n, but even then an inline routine is
> defined which will accept it as argument and wouldn't do anything with
> it. Had it been a macro, we would have required __maybe_unused but not
> now.

The thing is, the EM_DATA_CB() macro _is_ stubbed for
CONFIG_ENERGY_MODEL=n:

https://elixir.bootlin.com/linux/v5.0-rc4/source/include/linux/energy_model.h#L165

So, without __maybe_unused you get do get a compiler warning.

Thanks,
Quentin

> 
> -- 
> viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/5] PM / OPP: Introduce a power estimation helper
  2019-01-31  9:34       ` Quentin Perret
@ 2019-01-31  9:37         ` Viresh Kumar
  0 siblings, 0 replies; 13+ messages in thread
From: Viresh Kumar @ 2019-01-31  9:37 UTC (permalink / raw)
  To: Quentin Perret
  Cc: Matthias Kaehlcke, sudeep.holla, rjw, nm, sboyd, linux-pm,
	linux-kernel, linux-arm-kernel, dietmar.eggemann

On 31-01-19, 09:34, Quentin Perret wrote:
> On Thursday 31 Jan 2019 at 12:52:09 (+0530), Viresh Kumar wrote:
> > On 30-01-19, 11:07, Matthias Kaehlcke wrote:
> > > On Wed, Jan 30, 2019 at 05:05:02PM +0000, Quentin Perret wrote:
> > > > +static int __maybe_unused _get_cpu_power(unsigned long *mW, unsigned long *kHz,
> > > > +					 int cpu)
> > > 
> > > why __maybe_unused?
> > 
> > Yeah, it isn't required I think. He probably added it for the case
> > where CONFIG_ENERGY_MODEL=n, but even then an inline routine is
> > defined which will accept it as argument and wouldn't do anything with
> > it. Had it been a macro, we would have required __maybe_unused but not
> > now.
> 
> The thing is, the EM_DATA_CB() macro _is_ stubbed for
> CONFIG_ENERGY_MODEL=n:
> 
> https://elixir.bootlin.com/linux/v5.0-rc4/source/include/linux/energy_model.h#L165
> 
> So, without __maybe_unused you get do get a compiler warning.

Yeah, I almost got to it and still missed it :)

-- 
viresh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/5] PM / OPP: Introduce a power estimation helper
  2019-01-30 19:07   ` Matthias Kaehlcke
  2019-01-31  7:22     ` Viresh Kumar
@ 2019-01-31  9:42     ` Quentin Perret
  1 sibling, 0 replies; 13+ messages in thread
From: Quentin Perret @ 2019-01-31  9:42 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: viresh.kumar, sudeep.holla, rjw, nm, sboyd, linux-pm,
	linux-kernel, linux-arm-kernel, dietmar.eggemann

Hi Matthias,

On Wednesday 30 Jan 2019 at 11:07:03 (-0800), Matthias Kaehlcke wrote:
> > diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> > index 06f0f632ec47..4c8bf172e9ed 100644
> > --- a/drivers/opp/of.c
> > +++ b/drivers/opp/of.c
> > @@ -20,6 +20,7 @@
> >  #include <linux/pm_domain.h>
> >  #include <linux/slab.h>
> >  #include <linux/export.h>
> > +#include <linux/energy_model.h>
> 
> nit: AFAIK typically alphabetical order is used for includes, though
> this file doesn't exactly adhere to it.

Yeah that's what I was thinking too. Since nothing is in order here I
figured there wasn't a best place to put it so I just stick it there. I
happy to re-order all of them if necessary.

> 
> >  #include "opp.h"
> >  
> > @@ -1047,3 +1048,90 @@ struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp)
> >  	return of_node_get(opp->np);
> >  }
> >  EXPORT_SYMBOL_GPL(dev_pm_opp_get_of_node);
> > +
> > +/*
> > + * Callback function provided to the Energy Model framework upon registration.
> > + * This computes the power estimated by @CPU at the first OPP above @kHz (ceil),
> 
> that's not entirely correct, it could be the OPP at @kHz.

Right, I'll update that.

> 
> > + * and updates @kHz and @mW accordingly. The power is estimated as
> > + * P = C * V^2 * f with C being the CPU's capacitance and V and f respectively
> > + * the voltage and frequency of the OPP.
> > + *
> > + * Returns -ENODEV if the CPU device cannot be found, -EINVAL if the power
> > + * calculation failed because of missing parameters, 0 otherwise.
> > + */
> > +static int __maybe_unused _get_cpu_power(unsigned long *mW, unsigned long *kHz,
> > +					 int cpu)
> 
> why __maybe_unused?

To avoid compiler warnings with CONFIG_ENERGY_MODEL=n, see my other
email ;-)

> > +{
> > +	struct device *cpu_dev;
> > +	struct dev_pm_opp *opp;
> > +	struct device_node *np;
> > +	unsigned long mV, Hz;
> > +	u32 cap;
> > +	u64 tmp;
> > +	int ret;
> > +
> > +	cpu_dev = get_cpu_device(cpu);
> > +	if (!cpu_dev)
> > +		return -ENODEV;
> > +
> > +	np = of_node_get(cpu_dev->of_node);
> > +	if (!np)
> > +		return -EINVAL;
> > +
> > +	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
> > +	of_node_put(np);
> > +	if (ret)
> > +		return -EINVAL;
> > +
> > +	Hz = *kHz * 1000;
> > +	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz);
> > +	if (IS_ERR(opp))
> > +		return -EINVAL;
> > +
> > +	mV = dev_pm_opp_get_voltage(opp) / 1000;
> > +	dev_pm_opp_put(opp);
> > +	if (!mV)
> > +		return -EINVAL;
> > +
> > +	tmp = (u64)cap * mV * mV * (Hz / 1000000);
> > +	do_div(tmp, 1000000000);
> > +
> > +	*mW = (unsigned long)tmp;
> > +	*kHz = Hz / 1000;
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * dev_pm_opp_of_register_em() - Attempt to register an Energy Model
> > + * @cpus	: CPUs for which an Energy Model has to be registered
> > + * @nr_opp	: Number of OPPs to register in the Energy Model
> > + *
> > + * This checks whether the "dynamic-power-coefficient" devicetree binding has
> 
> s/binding/property/ ?

Sounds good.

> 
> > + * been specified, and tries to register an Energy Model with it if it has.
> > + */
> > +void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp)
> 
> Is the nr_opp parameter really needed? The function looks up the CPU
> device and hence could determine the OPP count itself with
> dev_pm_opp_get_opp_count(). I see most cpufreq drivers call
> dev_pm_opp_get_opp_count() anyway, so passing the count as parameter
> can be considered a small optimization, not sure how relevant it is
> though, since dev_pm_opp_of_register_em() isn't called frequently.

Yeah, I figured since most callers of dev_pm_opp_of_register_em()
already counted the OPPs, I could as well use the data instead of
counting again. I mean, dev_pm_opp_get_count() has to traverse the whole
list every time, so there is no point in doing that twice. Not a huge
deal I guess.

> 
> > +{
> > +	struct em_data_callback em_cb = EM_DATA_CB(_get_cpu_power);
> > +	int ret, cpu = cpumask_first(cpus);
> > +	struct device *cpu_dev;
> > +	struct device_node *np;
> > +	u32 cap;
> > +
> > +	cpu_dev = get_cpu_device(cpu);
> > +	if (!cpu_dev)
> > +		return;
> > +
> > +	np = of_node_get(cpu_dev->of_node);
> > +	if (!np)
> > +		return;
> > +
> > +	/* Don't register an EM without the right DT binding */
> > +	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
> > +	of_node_put(np);
> > +	if (ret || !cap)
> > +		return;
> > +
> > +	em_register_perf_domain(cpus, nr_opp, &em_cb);
> > +}
> > +EXPORT_SYMBOL_GPL(dev_pm_opp_of_register_em);
> > diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
> > index b895f4e79868..58ae08b024bd 100644
> > --- a/include/linux/pm_opp.h
> > +++ b/include/linux/pm_opp.h
> > @@ -327,6 +327,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma
> >  struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev);
> >  struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp);
> >  int of_get_required_opp_performance_state(struct device_node *np, int index);
> > +void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp);
> >  #else
> >  static inline int dev_pm_opp_of_add_table(struct device *dev)
> >  {
> > @@ -365,6 +366,11 @@ static inline struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp)
> >  {
> >  	return NULL;
> >  }
> > +
> > +static inline void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp)
> > +{
> > +}
> > +
> >  static inline int of_get_required_opp_performance_state(struct device_node *np, int index)
> >  {
> >  	return -ENOTSUPP;
> 
> Tested-by: Matthias Kaehlcke <mka@chromium.org>

Thank you very much !
Quentin

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/5] PM / OPP: Introduce a power estimation helper
  2019-01-31  7:26   ` Viresh Kumar
@ 2019-01-31  9:51     ` Quentin Perret
  0 siblings, 0 replies; 13+ messages in thread
From: Quentin Perret @ 2019-01-31  9:51 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: sudeep.holla, rjw, nm, sboyd, mka, linux-pm, linux-kernel,
	linux-arm-kernel, dietmar.eggemann

On Thursday 31 Jan 2019 at 12:56:33 (+0530), Viresh Kumar wrote:
> On 30-01-19, 17:05, Quentin Perret wrote:
> > +static int __maybe_unused _get_cpu_power(unsigned long *mW, unsigned long *kHz,
> > +					 int cpu)
> > +{
> > +	struct device *cpu_dev;
> > +	struct dev_pm_opp *opp;
> > +	struct device_node *np;
> > +	unsigned long mV, Hz;
> > +	u32 cap;
> > +	u64 tmp;
> > +	int ret;
> > +
> > +	cpu_dev = get_cpu_device(cpu);
> > +	if (!cpu_dev)
> > +		return -ENODEV;
> > +
> > +	np = of_node_get(cpu_dev->of_node);
> > +	if (!np)
> > +		return -EINVAL;
> > +
> > +	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
> > +	of_node_put(np);
> > +	if (ret)
> > +		return -EINVAL;
> > +
> > +	Hz = *kHz * 1000;
> > +	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz);
> > +	if (IS_ERR(opp))
> > +		return -EINVAL;
> > +
> > +	mV = dev_pm_opp_get_voltage(opp) / 1000;
> 
> The voltage is also stored as triplet now a days and we must consider
> the higher value for these calculations. Also what about the case of
> multiple regulators here or performance-states ?

Well at least this is not worst than what we already do for IPA :-)

https://elixir.bootlin.com/linux/latest/source/drivers/thermal/cpu_cooling.c#L245

In the case of multiple regulators, then maybe that should be dealt with
at the dev_pm_op_get_voltage() ? Not sure.

> 
> > +	dev_pm_opp_put(opp);
> > +	if (!mV)
> > +		return -EINVAL;
> > +
> > +	tmp = (u64)cap * mV * mV * (Hz / 1000000);
> > +	do_div(tmp, 1000000000);
> > +
> > +	*mW = (unsigned long)tmp;
> 
> I was thinking will it be better if we just save this information in
> opp->power field during init, so we can just read a value here
> instead. But I am still not sure :(

Yeah, I had the exact same question. But, then I thought, we're only
gonna use that once, so it's not clear we need to cache the value. And I
don't think we want other subsystems to ask PM_OPP for power values
directly. Those subsystems should ask the EM framework instead (which
exists for that very reason). So we're probably not gonna expose a
dev_pm_opp_get_power() accessor or so, I think.

That's why I went that way.

> 
> > +	*kHz = Hz / 1000;
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * dev_pm_opp_of_register_em() - Attempt to register an Energy Model
> > + * @cpus	: CPUs for which an Energy Model has to be registered
> > + * @nr_opp	: Number of OPPs to register in the Energy Model
> > + *
> > + * This checks whether the "dynamic-power-coefficient" devicetree binding has
> > + * been specified, and tries to register an Energy Model with it if it has.
> > + */
> > +void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp)
> > +{
> > +	struct em_data_callback em_cb = EM_DATA_CB(_get_cpu_power);
> > +	int ret, cpu = cpumask_first(cpus);
> > +	struct device *cpu_dev;
> > +	struct device_node *np;
> > +	u32 cap;
> > +
> > +	cpu_dev = get_cpu_device(cpu);
> > +	if (!cpu_dev)
> > +		return;
> > +
> > +	np = of_node_get(cpu_dev->of_node);
> > +	if (!np)
> > +		return;
> > +
> > +	/* Don't register an EM without the right DT binding */
> > +	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
> > +	of_node_put(np);
> > +	if (ret || !cap)
> > +		return;
> 
> What if no voltage is supplied in DT ?

Then don't provide 'dynamic-power-coefficient' ? There is nothing you
can do with that without voltages I think.

With this implementation you'll get an error message at some point,
which is probably sane.

> 
> > +
> > +	em_register_perf_domain(cpus, nr_opp, &em_cb);
> > +}
> > +EXPORT_SYMBOL_GPL(dev_pm_opp_of_register_em);
> > diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
> > index b895f4e79868..58ae08b024bd 100644
> > --- a/include/linux/pm_opp.h
> > +++ b/include/linux/pm_opp.h
> > @@ -327,6 +327,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma
> >  struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev);
> >  struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp);
> >  int of_get_required_opp_performance_state(struct device_node *np, int index);
> > +void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp);
> >  #else
> >  static inline int dev_pm_opp_of_add_table(struct device *dev)
> >  {
> > @@ -365,6 +366,11 @@ static inline struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp)
> >  {
> >  	return NULL;
> >  }
> > +
> > +static inline void dev_pm_opp_of_register_em(struct cpumask *cpus, int nr_opp)
> > +{
> > +}
> > +
> >  static inline int of_get_required_opp_performance_state(struct device_node *np, int index)
> >  {
> >  	return -ENOTSUPP;
> > -- 
> > 2.20.1
> 
> -- 
> viresh

Thanks,
Quentin

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-01-31  9:51 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-30 17:05 [PATCH v2 0/5] Register an Energy Model for Arm reference platforms Quentin Perret
2019-01-30 17:05 ` [PATCH v2 1/5] PM / OPP: Introduce a power estimation helper Quentin Perret
2019-01-30 19:07   ` Matthias Kaehlcke
2019-01-31  7:22     ` Viresh Kumar
2019-01-31  9:34       ` Quentin Perret
2019-01-31  9:37         ` Viresh Kumar
2019-01-31  9:42     ` Quentin Perret
2019-01-31  7:26   ` Viresh Kumar
2019-01-31  9:51     ` Quentin Perret
2019-01-30 17:05 ` [PATCH v2 2/5] cpufreq: dt: Register an Energy Model Quentin Perret
2019-01-30 17:05 ` [PATCH v2 3/5] cpufreq: scpi: " Quentin Perret
2019-01-30 17:05 ` [PATCH v2 4/5] cpufreq: arm_big_little: " Quentin Perret
2019-01-30 17:05 ` [PATCH v2 5/5] cpufreq: scmi: " Quentin Perret

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).