linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7] Register an Energy Model for Arm reference platforms
@ 2019-01-28 16:55 Quentin Perret
  2019-01-28 16:55 ` [PATCH 1/7] PM / OPP: Introduce a power estimation helper Quentin Perret
                   ` (6 more replies)
  0 siblings, 7 replies; 25+ messages in thread
From: Quentin Perret @ 2019-01-28 16:55 UTC (permalink / raw)
  To: rjw, viresh.kumar, sudeep.holla, liviu.dudau, lorenzo.pieralisi,
	robh+dt, mark.rutland, nm, sboyd
  Cc: linux-arm-kernel, devicetree, linux-kernel, linux-pm,
	dietmar.eggemann, mka, quentin.perret

The Energy Model (EM) framework feeds interested subsystems (the
scheduler/EAS as of now) with power costs provided by drivers. Yet, no
driver is actually doing that upstream yet. This series updates a set of
CPUFreq drivers and DT files in order to register power costs in the EM
framework for some of the Arm reference platforms for EAS: Hikey960,
Juno and TC2.

The series is split as follows:

 * Patch 01 introduces in PM_OPP a helper function which estimates the
   CPU power using the P=CV²f equation also used by IPA. It should be
   noted that this introduces duplicate code with IPA, which will
   eventually be fixed by migrating IPA to using PM_EM. The ideal plan
   would be to do so later, in a separate patch series. I would indeed
   prefer to keep the thermal and CPUFreq discussion separate at this
   stage, if deemed acceptable.

 * Patches 02-04 make use of that PM_OPP helper function from the
   following CPUFreq drivers: cpufreq-dt, scpi-cpufreq and
   arm_big_little.

 * Patch 05 modifies the SCMI cpufreq driver to pass the power costs
   obtained from firmware to PM_EM.

 * Patches 06-07 provide the dynamic-power-coefficient values for Juno
   and TC2.

Thanks,
Quentin

Dietmar Eggemann (3):
  cpufreq: arm_big_little: Register an Energy Model
  arm64: dts: juno: Add cpu dynamic-power-coefficient information
  arm: dts: vexpress-v2p-ca15_a7: Add cpu dynamic-power-coefficient
    information

Quentin Perret (4):
  PM / OPP: Introduce a power estimation helper
  cpufreq: dt: Register an Energy Model
  cpufreq: scpi: Register an Energy Model
  cpufreq: scmi: Register an Energy Model

 arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts |  5 ++
 arch/arm64/boot/dts/arm/juno-r2.dts        |  6 +++
 arch/arm64/boot/dts/arm/juno.dts           |  6 +++
 drivers/cpufreq/arm_big_little.c           | 10 ++++
 drivers/cpufreq/cpufreq-dt.c               |  7 ++-
 drivers/cpufreq/scmi-cpufreq.c             | 36 ++++++++++++-
 drivers/cpufreq/scpi-cpufreq.c             |  8 ++-
 drivers/opp/of.c                           | 60 ++++++++++++++++++++++
 include/linux/pm_opp.h                     |  5 ++
 9 files changed, 140 insertions(+), 3 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/7] PM / OPP: Introduce a power estimation helper
  2019-01-28 16:55 [PATCH 0/7] Register an Energy Model for Arm reference platforms Quentin Perret
@ 2019-01-28 16:55 ` Quentin Perret
  2019-01-28 19:02   ` Matthias Kaehlcke
  2019-01-29  5:10   ` Viresh Kumar
  2019-01-28 16:55 ` [PATCH 2/7] cpufreq: dt: Register an Energy Model Quentin Perret
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: Quentin Perret @ 2019-01-28 16:55 UTC (permalink / raw)
  To: rjw, viresh.kumar, sudeep.holla, liviu.dudau, lorenzo.pieralisi,
	robh+dt, mark.rutland, nm, sboyd
  Cc: linux-arm-kernel, devicetree, linux-kernel, linux-pm,
	dietmar.eggemann, mka, quentin.perret

The Energy Model (EM) framework provides an API to let drivers register
the active power of CPUs. The drivers are expected to provide a callback
method which estimates the power consumed by a CPU at each available
performance levels. How exactly this should be implemented, however,
depends on the platform.

On some systems, PM_OPP knows the voltage and frequency at which CPUs
can run. When coupled with the CPU 'capacitance' (as provided by the
'dynamic-power-coefficient' devicetree binding), it is possible to
estimate the dynamic power consumption of a CPU as P = C * V^2 * f, with
C its capacitance and V and f respectively the voltage and frequency of
the OPP. The Intelligent Power Allocator (IPA) thermal governor already
implements that estimation method, in the thermal framework.

However, this power estimation method can be applied to any platform
where all the parameters are known (C, V and f), and not only those
suffering thermal issues. As such, the code implementing this feature
can be re-used to also populate the EM framework now used by EAS.

As a first step, introduce in PM_OPP a helper function which CPUFreq
drivers can use to register into the EM framework. This duplicates the
power estimation done in IPA until it can be migrated to using the EM
framework. This will be done later, once the EM framework has support
for at least all platforms currently supported by IPA.

Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 drivers/opp/of.c       | 60 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm_opp.h |  5 ++++
 2 files changed, 65 insertions(+)

diff --git a/drivers/opp/of.c b/drivers/opp/of.c
index 06f0f632ec47..7572a2eb2fd4 100644
--- a/drivers/opp/of.c
+++ b/drivers/opp/of.c
@@ -1047,3 +1047,63 @@ struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp)
 	return of_node_get(opp->np);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_get_of_node);
+
+/**
+ * of_dev_pm_opp_get_cpu_power() - Estimates the power of a CPU
+ * @mW:		pointer to the power estimate in milli-watts
+ * @KHz:	pointer to the OPP's frequency, in kilo-hertz
+ * @cpu:	CPU for which power needs to be estimated
+ *
+ * Computes the power estimated by @CPU at the first OPP above @KHz (ceil),
+ * and updates @KHz and @mW accordingly.
+ *
+ * The power is estimated as P = C * V^2 * f, with C the CPU's capacitance
+ * (read from the 'dynamic-power-coefficient' devicetree binding) and V and f
+ * respectively the voltage and frequency of the OPP.
+ *
+ * Return: -ENODEV if the CPU device cannot be found, -EINVAL if the power
+ * calculation failed because of missing parameters, 0 otherwise.
+ */
+int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu)
+{
+	unsigned long mV, Hz, MHz;
+	struct device *cpu_dev;
+	struct dev_pm_opp *opp;
+	struct device_node *np;
+	u32 cap;
+	u64 tmp;
+	int ret;
+
+	cpu_dev = get_cpu_device(cpu);
+	if (!cpu_dev)
+		return -ENODEV;
+
+	np = of_node_get(cpu_dev->of_node);
+	if (!np)
+		return -EINVAL;
+
+	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
+	of_node_put(np);
+	if (ret)
+		return -EINVAL;
+
+	Hz = *KHz * 1000;
+	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz);
+	if (IS_ERR(opp))
+		return -EINVAL;
+
+	mV = dev_pm_opp_get_voltage(opp) / 1000;
+	dev_pm_opp_put(opp);
+	if (!mV)
+		return -EINVAL;
+
+	MHz = Hz / 1000000;
+	tmp = (u64)cap * mV * mV * MHz;
+	do_div(tmp, 1000000000);
+
+	*mW = (unsigned long)tmp;
+	*KHz = Hz / 1000;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(of_dev_pm_opp_get_cpu_power);
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index 0a2a88e5a383..fedde14f5187 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -322,6 +322,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma
 struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev);
 struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp);
 int of_get_required_opp_performance_state(struct device_node *np, int index);
+int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu);
 #else
 static inline int dev_pm_opp_of_add_table(struct device *dev)
 {
@@ -364,6 +365,10 @@ static inline int of_get_required_opp_performance_state(struct device_node *np,
 {
 	return -ENOTSUPP;
 }
+static inline int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu)
+{
+	return -ENOTSUPP;
+}
 #endif
 
 #endif		/* __LINUX_OPP_H__ */
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/7] cpufreq: dt: Register an Energy Model
  2019-01-28 16:55 [PATCH 0/7] Register an Energy Model for Arm reference platforms Quentin Perret
  2019-01-28 16:55 ` [PATCH 1/7] PM / OPP: Introduce a power estimation helper Quentin Perret
@ 2019-01-28 16:55 ` Quentin Perret
  2019-01-28 19:36   ` Matthias Kaehlcke
  2019-01-29  5:30   ` Viresh Kumar
  2019-01-28 16:55 ` [PATCH 3/7] cpufreq: scpi: " Quentin Perret
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: Quentin Perret @ 2019-01-28 16:55 UTC (permalink / raw)
  To: rjw, viresh.kumar, sudeep.holla, liviu.dudau, lorenzo.pieralisi,
	robh+dt, mark.rutland, nm, sboyd
  Cc: linux-arm-kernel, devicetree, linux-kernel, linux-pm,
	dietmar.eggemann, mka, quentin.perret

Now that PM_OPP provides a helper function to estimate the power
consumed by CPUs, make sure to try and register an Energy Model (EM)
from cpufreq-dt, hence ensuring interested subsystems (the task
scheduler, for example) can make use of that information when available.

Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 drivers/cpufreq/cpufreq-dt.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index e58bfcb1169e..7556e07e7a9f 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -16,6 +16,7 @@
 #include <linux/cpu_cooling.h>
 #include <linux/cpufreq.h>
 #include <linux/cpumask.h>
+#include <linux/energy_model.h>
 #include <linux/err.h>
 #include <linux/module.h>
 #include <linux/of.h>
@@ -152,6 +153,7 @@ static int resources_available(void)
 
 static int cpufreq_init(struct cpufreq_policy *policy)
 {
+	struct em_data_callback em_cb = EM_DATA_CB(of_dev_pm_opp_get_cpu_power);
 	struct cpufreq_frequency_table *freq_table;
 	struct opp_table *opp_table = NULL;
 	struct private_data *priv;
@@ -160,7 +162,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 	unsigned int transition_latency;
 	bool fallback = false;
 	const char *name;
-	int ret;
+	int ret, nr_opp;
 
 	cpu_dev = get_cpu_device(policy->cpu);
 	if (!cpu_dev) {
@@ -237,6 +239,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 		ret = -EPROBE_DEFER;
 		goto out_free_opp;
 	}
+	nr_opp = ret;
 
 	if (fallback) {
 		cpumask_setall(policy->cpus);
@@ -280,6 +283,8 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 	policy->cpuinfo.transition_latency = transition_latency;
 	policy->dvfs_possible_from_any_cpu = true;
 
+	em_register_perf_domain(policy->cpus, nr_opp, &em_cb);
+
 	return 0;
 
 out_free_cpufreq_table:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/7] cpufreq: scpi: Register an Energy Model
  2019-01-28 16:55 [PATCH 0/7] Register an Energy Model for Arm reference platforms Quentin Perret
  2019-01-28 16:55 ` [PATCH 1/7] PM / OPP: Introduce a power estimation helper Quentin Perret
  2019-01-28 16:55 ` [PATCH 2/7] cpufreq: dt: Register an Energy Model Quentin Perret
@ 2019-01-28 16:55 ` Quentin Perret
  2019-01-29  5:31   ` Viresh Kumar
  2019-01-28 16:55 ` [PATCH 4/7] cpufreq: arm_big_little: " Quentin Perret
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Quentin Perret @ 2019-01-28 16:55 UTC (permalink / raw)
  To: rjw, viresh.kumar, sudeep.holla, liviu.dudau, lorenzo.pieralisi,
	robh+dt, mark.rutland, nm, sboyd
  Cc: linux-arm-kernel, devicetree, linux-kernel, linux-pm,
	dietmar.eggemann, mka, quentin.perret

Now that PM_OPP provides a helper function to estimate the power
consumed by CPUs, make sure to try and register an Energy Model (EM)
from scpi-cpufreq, hence ensuring interested subsystems (the task
scheduler, for example) can make use of that information when available.

Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 drivers/cpufreq/scpi-cpufreq.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/scpi-cpufreq.c b/drivers/cpufreq/scpi-cpufreq.c
index 87a98ec77773..05fc7448f5cb 100644
--- a/drivers/cpufreq/scpi-cpufreq.c
+++ b/drivers/cpufreq/scpi-cpufreq.c
@@ -23,6 +23,7 @@
 #include <linux/cpufreq.h>
 #include <linux/cpumask.h>
 #include <linux/cpu_cooling.h>
+#include <linux/energy_model.h>
 #include <linux/export.h>
 #include <linux/module.h>
 #include <linux/of_platform.h>
@@ -98,11 +99,12 @@ scpi_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask)
 
 static int scpi_cpufreq_init(struct cpufreq_policy *policy)
 {
-	int ret;
+	int ret, nr_opp;
 	unsigned int latency;
 	struct device *cpu_dev;
 	struct scpi_data *priv;
 	struct cpufreq_frequency_table *freq_table;
+	struct em_data_callback em_cb = EM_DATA_CB(of_dev_pm_opp_get_cpu_power);
 
 	cpu_dev = get_cpu_device(policy->cpu);
 	if (!cpu_dev) {
@@ -135,6 +137,7 @@ static int scpi_cpufreq_init(struct cpufreq_policy *policy)
 		ret = -EPROBE_DEFER;
 		goto out_free_opp;
 	}
+	nr_opp = ret;
 
 	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
 	if (!priv) {
@@ -170,6 +173,9 @@ static int scpi_cpufreq_init(struct cpufreq_policy *policy)
 	policy->cpuinfo.transition_latency = latency;
 
 	policy->fast_switch_possible = false;
+
+	em_register_perf_domain(policy->cpus, nr_opp, &em_cb);
+
 	return 0;
 
 out_free_cpufreq_table:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/7] cpufreq: arm_big_little: Register an Energy Model
  2019-01-28 16:55 [PATCH 0/7] Register an Energy Model for Arm reference platforms Quentin Perret
                   ` (2 preceding siblings ...)
  2019-01-28 16:55 ` [PATCH 3/7] cpufreq: scpi: " Quentin Perret
@ 2019-01-28 16:55 ` Quentin Perret
  2019-01-28 16:55 ` [PATCH 5/7] cpufreq: scmi: " Quentin Perret
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 25+ messages in thread
From: Quentin Perret @ 2019-01-28 16:55 UTC (permalink / raw)
  To: rjw, viresh.kumar, sudeep.holla, liviu.dudau, lorenzo.pieralisi,
	robh+dt, mark.rutland, nm, sboyd
  Cc: linux-arm-kernel, devicetree, linux-kernel, linux-pm,
	dietmar.eggemann, mka, quentin.perret

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

Now that PM_OPP provides a helper function to estimate the power
consumed by CPUs, make sure to try and register an Energy Model (EM)
from the arm_big_little CPUFreq driver, hence ensuring interested
subsystems (the task scheduler, for example) can make use of that
information when available.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 drivers/cpufreq/arm_big_little.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c
index cf62a1f64dd7..803d41c629c3 100644
--- a/drivers/cpufreq/arm_big_little.c
+++ b/drivers/cpufreq/arm_big_little.c
@@ -24,6 +24,7 @@
 #include <linux/cpufreq.h>
 #include <linux/cpumask.h>
 #include <linux/cpu_cooling.h>
+#include <linux/energy_model.h>
 #include <linux/export.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
@@ -456,6 +457,7 @@ static int get_cluster_clk_and_freq_table(struct device *cpu_dev,
 /* Per-CPU initialization */
 static int bL_cpufreq_init(struct cpufreq_policy *policy)
 {
+	struct em_data_callback em_cb = EM_DATA_CB(of_dev_pm_opp_get_cpu_power);
 	u32 cur_cluster = cpu_to_cluster(policy->cpu);
 	struct device *cpu_dev;
 	int ret;
@@ -487,6 +489,14 @@ static int bL_cpufreq_init(struct cpufreq_policy *policy)
 	policy->cpuinfo.transition_latency =
 				arm_bL_ops->get_transition_latency(cpu_dev);
 
+	ret = dev_pm_opp_get_opp_count(cpu_dev);
+	if (ret <= 0) {
+		dev_dbg(cpu_dev, "OPP table is not ready, deferring probe\n");
+		return -EPROBE_DEFER;
+	}
+
+	em_register_perf_domain(policy->cpus, ret, &em_cb);
+
 	if (is_bL_switching_enabled())
 		per_cpu(cpu_last_req_freq, policy->cpu) = clk_get_cpu_rate(policy->cpu);
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 5/7] cpufreq: scmi: Register an Energy Model
  2019-01-28 16:55 [PATCH 0/7] Register an Energy Model for Arm reference platforms Quentin Perret
                   ` (3 preceding siblings ...)
  2019-01-28 16:55 ` [PATCH 4/7] cpufreq: arm_big_little: " Quentin Perret
@ 2019-01-28 16:55 ` Quentin Perret
  2019-01-28 16:55 ` [PATCH 6/7] arm64: dts: juno: Add cpu dynamic-power-coefficient information Quentin Perret
  2019-01-28 16:55 ` [PATCH 7/7] arm: dts: vexpress-v2p-ca15_a7: " Quentin Perret
  6 siblings, 0 replies; 25+ messages in thread
From: Quentin Perret @ 2019-01-28 16:55 UTC (permalink / raw)
  To: rjw, viresh.kumar, sudeep.holla, liviu.dudau, lorenzo.pieralisi,
	robh+dt, mark.rutland, nm, sboyd
  Cc: linux-arm-kernel, devicetree, linux-kernel, linux-pm,
	dietmar.eggemann, mka, quentin.perret

The Energy Model (EM) framework provides an API to register the active
power of CPUs. Call this API from the scmi-cpufreq driver by using the
power costs obtained from firmware. This is done to ensure interested
subsystems (the task scheduler, for example) can make use of the EM
when available.

Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 drivers/cpufreq/scmi-cpufreq.c | 36 +++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/scmi-cpufreq.c b/drivers/cpufreq/scmi-cpufreq.c
index 50b1551ba894..80a7f8da7e74 100644
--- a/drivers/cpufreq/scmi-cpufreq.c
+++ b/drivers/cpufreq/scmi-cpufreq.c
@@ -12,6 +12,7 @@
 #include <linux/cpufreq.h>
 #include <linux/cpumask.h>
 #include <linux/cpu_cooling.h>
+#include <linux/energy_model.h>
 #include <linux/export.h>
 #include <linux/module.h>
 #include <linux/pm_opp.h>
@@ -103,13 +104,42 @@ scmi_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask)
 	return 0;
 }
 
+static int __maybe_unused
+scmi_get_cpu_power(unsigned long *power, unsigned long *KHz, int cpu)
+{
+	struct device *cpu_dev = get_cpu_device(cpu);
+	unsigned long Hz;
+	int ret, domain;
+
+	if (!cpu_dev) {
+		pr_err("failed to get cpu%d device\n", cpu);
+		return -ENODEV;
+	}
+
+	domain = handle->perf_ops->device_domain_id(cpu_dev);
+	if (domain < 0)
+		return domain;
+
+	/* Get the power cost of the performance domain. */
+	Hz = *KHz * 1000;
+	ret = handle->perf_ops->est_power_get(handle, domain, &Hz, power);
+	if (ret)
+		return ret;
+
+	/* The EM framework specifies the frequency in KHz. */
+	*KHz = Hz / 1000;
+
+	return 0;
+}
+
 static int scmi_cpufreq_init(struct cpufreq_policy *policy)
 {
-	int ret;
+	int ret, nr_opp;
 	unsigned int latency;
 	struct device *cpu_dev;
 	struct scmi_data *priv;
 	struct cpufreq_frequency_table *freq_table;
+	struct em_data_callback em_cb = EM_DATA_CB(scmi_get_cpu_power);
 
 	cpu_dev = get_cpu_device(policy->cpu);
 	if (!cpu_dev) {
@@ -142,6 +172,7 @@ static int scmi_cpufreq_init(struct cpufreq_policy *policy)
 		ret = -EPROBE_DEFER;
 		goto out_free_opp;
 	}
+	nr_opp = ret;
 
 	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
 	if (!priv) {
@@ -171,6 +202,9 @@ static int scmi_cpufreq_init(struct cpufreq_policy *policy)
 	policy->cpuinfo.transition_latency = latency;
 
 	policy->fast_switch_possible = true;
+
+	em_register_perf_domain(policy->cpus, nr_opp, &em_cb);
+
 	return 0;
 
 out_free_priv:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 6/7] arm64: dts: juno: Add cpu dynamic-power-coefficient information
  2019-01-28 16:55 [PATCH 0/7] Register an Energy Model for Arm reference platforms Quentin Perret
                   ` (4 preceding siblings ...)
  2019-01-28 16:55 ` [PATCH 5/7] cpufreq: scmi: " Quentin Perret
@ 2019-01-28 16:55 ` Quentin Perret
  2019-01-29 15:27   ` Sudeep Holla
  2019-01-28 16:55 ` [PATCH 7/7] arm: dts: vexpress-v2p-ca15_a7: " Quentin Perret
  6 siblings, 1 reply; 25+ messages in thread
From: Quentin Perret @ 2019-01-28 16:55 UTC (permalink / raw)
  To: rjw, viresh.kumar, sudeep.holla, liviu.dudau, lorenzo.pieralisi,
	robh+dt, mark.rutland, nm, sboyd
  Cc: linux-arm-kernel, devicetree, linux-kernel, linux-pm,
	dietmar.eggemann, mka, quentin.perret

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

A CPUfreq driver, like the scpi driver used on Juno boards, which
provide the Energy Model with power cost information via the PM_OPP
of_dev_pm_opp_get_cpu_power() function, do need the
dynamic-power-coefficient (C) in the device tree.

Method used to obtain the C value:
C is computed by measuring energy (E) consumption of a frequency domain
(FD) over a 10s runtime (t) sysbench workload running at each Operating
Performance Point (OPP) affine to 1 or 2 CPUs of that FD while the other
CPUs of the system are hotplugged out.
By definition all CPUs of a FD have the the same micro-architecture. An
OPP is characterized by a certain frequency (f) and voltage (V) value.
The corresponding power values (P) are calculated by dividing the delta
of the E values between the runs with 2 and 1 CPUs by t.

With n data tuples (P, f, V), n equal to number of OPPs for this
frequency domain, we can solve C by:

P = Pstat + Pdyn

P = Pstat + CV²f

Cx = (Px - P1)/(Vx²fx - V1²f1) with x = {2, ..., n}

The C value is the arithmetic mean out of {C2, ..., Cn}.

Since DVFS is broken on Juno r1, no dynamic-power-coefficient
information has been added to its dts file.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 arch/arm64/boot/dts/arm/juno-r2.dts | 6 ++++++
 arch/arm64/boot/dts/arm/juno.dts    | 6 ++++++
 2 files changed, 12 insertions(+)

diff --git a/arch/arm64/boot/dts/arm/juno-r2.dts b/arch/arm64/boot/dts/arm/juno-r2.dts
index ab77adb4f3c2..66f0ec79c864 100644
--- a/arch/arm64/boot/dts/arm/juno-r2.dts
+++ b/arch/arm64/boot/dts/arm/juno-r2.dts
@@ -99,6 +99,7 @@
 			clocks = <&scpi_dvfs 0>;
 			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 			capacity-dmips-mhz = <1024>;
+			dynamic-power-coefficient = <450>;
 		};
 
 		A72_1: cpu@1 {
@@ -116,6 +117,7 @@
 			clocks = <&scpi_dvfs 0>;
 			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 			capacity-dmips-mhz = <1024>;
+			dynamic-power-coefficient = <450>;
 		};
 
 		A53_0: cpu@100 {
@@ -133,6 +135,7 @@
 			clocks = <&scpi_dvfs 1>;
 			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 			capacity-dmips-mhz = <485>;
+			dynamic-power-coefficient = <140>;
 		};
 
 		A53_1: cpu@101 {
@@ -150,6 +153,7 @@
 			clocks = <&scpi_dvfs 1>;
 			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 			capacity-dmips-mhz = <485>;
+			dynamic-power-coefficient = <140>;
 		};
 
 		A53_2: cpu@102 {
@@ -167,6 +171,7 @@
 			clocks = <&scpi_dvfs 1>;
 			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 			capacity-dmips-mhz = <485>;
+			dynamic-power-coefficient = <140>;
 		};
 
 		A53_3: cpu@103 {
@@ -184,6 +189,7 @@
 			clocks = <&scpi_dvfs 1>;
 			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 			capacity-dmips-mhz = <485>;
+			dynamic-power-coefficient = <140>;
 		};
 
 		A72_L2: l2-cache0 {
diff --git a/arch/arm64/boot/dts/arm/juno.dts b/arch/arm64/boot/dts/arm/juno.dts
index 08d4ba1716c3..9890afdda77b 100644
--- a/arch/arm64/boot/dts/arm/juno.dts
+++ b/arch/arm64/boot/dts/arm/juno.dts
@@ -98,6 +98,7 @@
 			clocks = <&scpi_dvfs 0>;
 			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 			capacity-dmips-mhz = <1024>;
+			dynamic-power-coefficient = <530>;
 		};
 
 		A57_1: cpu@1 {
@@ -115,6 +116,7 @@
 			clocks = <&scpi_dvfs 0>;
 			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 			capacity-dmips-mhz = <1024>;
+			dynamic-power-coefficient = <530>;
 		};
 
 		A53_0: cpu@100 {
@@ -132,6 +134,7 @@
 			clocks = <&scpi_dvfs 1>;
 			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 			capacity-dmips-mhz = <578>;
+			dynamic-power-coefficient = <140>;
 		};
 
 		A53_1: cpu@101 {
@@ -149,6 +152,7 @@
 			clocks = <&scpi_dvfs 1>;
 			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 			capacity-dmips-mhz = <578>;
+			dynamic-power-coefficient = <140>;
 		};
 
 		A53_2: cpu@102 {
@@ -166,6 +170,7 @@
 			clocks = <&scpi_dvfs 1>;
 			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 			capacity-dmips-mhz = <578>;
+			dynamic-power-coefficient = <140>;
 		};
 
 		A53_3: cpu@103 {
@@ -183,6 +188,7 @@
 			clocks = <&scpi_dvfs 1>;
 			cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
 			capacity-dmips-mhz = <578>;
+			dynamic-power-coefficient = <140>;
 		};
 
 		A57_L2: l2-cache0 {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 7/7] arm: dts: vexpress-v2p-ca15_a7: Add cpu dynamic-power-coefficient information
  2019-01-28 16:55 [PATCH 0/7] Register an Energy Model for Arm reference platforms Quentin Perret
                   ` (5 preceding siblings ...)
  2019-01-28 16:55 ` [PATCH 6/7] arm64: dts: juno: Add cpu dynamic-power-coefficient information Quentin Perret
@ 2019-01-28 16:55 ` Quentin Perret
  6 siblings, 0 replies; 25+ messages in thread
From: Quentin Perret @ 2019-01-28 16:55 UTC (permalink / raw)
  To: rjw, viresh.kumar, sudeep.holla, liviu.dudau, lorenzo.pieralisi,
	robh+dt, mark.rutland, nm, sboyd
  Cc: linux-arm-kernel, devicetree, linux-kernel, linux-pm,
	dietmar.eggemann, mka, quentin.perret

From: Dietmar Eggemann <dietmar.eggemann@arm.com>

A CPUfreq driver, like the ARM big.LITTLE driver used on the TC2 board,
which provide the Energy Model with power cost information via the
PM_OPP of_dev_pm_opp_get_cpu_power() function, do need the
dynamic-power-coefficient (C) in the device tree.

Method used to obtain the C value:
C is computed by measuring energy (E) consumption of a frequency domain
(FD) over a 10s runtime (t) sysbench workload running at each Operating
Performance Point (OPP) affine to 1 or 2 CPUs of that FD while the other
CPUs of the system are hotplugged out.
By definition all CPUs of a FD have the the same micro-architecture. An
OPP is characterized by a certain frequency (f) and voltage (V) value.
The corresponding power values (P) are calculated by dividing the delta
of the E values between the runs with 2 and 1 CPUs by t.

With n data tuples (P, f, V), n equal to number of OPPs for this
frequency domain, we can solve C by:

P = Pstat + Pdyn

P = Pstat + CV²f

Cx = (Px - P1)/(Vx²fx - V1²f1) with x = {2, ..., n}

The C value is the arithmetic mean out of {C2, ..., Cn}.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
index a2ccacd07f4f..00cd9f5bef2e 100644
--- a/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
+++ b/arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts
@@ -42,6 +42,7 @@
 			cci-control-port = <&cci_control1>;
 			cpu-idle-states = <&CLUSTER_SLEEP_BIG>;
 			capacity-dmips-mhz = <1024>;
+			dynamic-power-coefficient = <990>;
 		};
 
 		cpu1: cpu@1 {
@@ -51,6 +52,7 @@
 			cci-control-port = <&cci_control1>;
 			cpu-idle-states = <&CLUSTER_SLEEP_BIG>;
 			capacity-dmips-mhz = <1024>;
+			dynamic-power-coefficient = <990>;
 		};
 
 		cpu2: cpu@2 {
@@ -60,6 +62,7 @@
 			cci-control-port = <&cci_control2>;
 			cpu-idle-states = <&CLUSTER_SLEEP_LITTLE>;
 			capacity-dmips-mhz = <516>;
+			dynamic-power-coefficient = <133>;
 		};
 
 		cpu3: cpu@3 {
@@ -69,6 +72,7 @@
 			cci-control-port = <&cci_control2>;
 			cpu-idle-states = <&CLUSTER_SLEEP_LITTLE>;
 			capacity-dmips-mhz = <516>;
+			dynamic-power-coefficient = <133>;
 		};
 
 		cpu4: cpu@4 {
@@ -78,6 +82,7 @@
 			cci-control-port = <&cci_control2>;
 			cpu-idle-states = <&CLUSTER_SLEEP_LITTLE>;
 			capacity-dmips-mhz = <516>;
+			dynamic-power-coefficient = <133>;
 		};
 
 		idle-states {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/7] PM / OPP: Introduce a power estimation helper
  2019-01-28 16:55 ` [PATCH 1/7] PM / OPP: Introduce a power estimation helper Quentin Perret
@ 2019-01-28 19:02   ` Matthias Kaehlcke
  2019-01-29  9:03     ` Quentin Perret
  2019-01-29  5:10   ` Viresh Kumar
  1 sibling, 1 reply; 25+ messages in thread
From: Matthias Kaehlcke @ 2019-01-28 19:02 UTC (permalink / raw)
  To: Quentin Perret
  Cc: rjw, viresh.kumar, sudeep.holla, liviu.dudau, lorenzo.pieralisi,
	robh+dt, mark.rutland, nm, sboyd, linux-arm-kernel, devicetree,
	linux-kernel, linux-pm, dietmar.eggemann

rHi Quentin,

On Mon, Jan 28, 2019 at 04:55:16PM +0000, Quentin Perret wrote:
> The Energy Model (EM) framework provides an API to let drivers register
> the active power of CPUs. The drivers are expected to provide a callback
> method which estimates the power consumed by a CPU at each available
> performance levels. How exactly this should be implemented, however,
> depends on the platform.
> 
> On some systems, PM_OPP knows the voltage and frequency at which CPUs
> can run. When coupled with the CPU 'capacitance' (as provided by the
> 'dynamic-power-coefficient' devicetree binding), it is possible to
> estimate the dynamic power consumption of a CPU as P = C * V^2 * f, with
> C its capacitance and V and f respectively the voltage and frequency of
> the OPP. The Intelligent Power Allocator (IPA) thermal governor already
> implements that estimation method, in the thermal framework.
> 
> However, this power estimation method can be applied to any platform
> where all the parameters are known (C, V and f), and not only those
> suffering thermal issues. As such, the code implementing this feature
> can be re-used to also populate the EM framework now used by EAS.
> 
> As a first step, introduce in PM_OPP a helper function which CPUFreq
> drivers can use to register into the EM framework. This duplicates the
> power estimation done in IPA until it can be migrated to using the EM
> framework. This will be done later, once the EM framework has support
> for at least all platforms currently supported by IPA.
> 
> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> ---
>  drivers/opp/of.c       | 60 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/pm_opp.h |  5 ++++
>  2 files changed, 65 insertions(+)
> 
> diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> index 06f0f632ec47..7572a2eb2fd4 100644
> --- a/drivers/opp/of.c
> +++ b/drivers/opp/of.c
> @@ -1047,3 +1047,63 @@ struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp)
>  	return of_node_get(opp->np);
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_get_of_node);
> +
> +/**
> + * of_dev_pm_opp_get_cpu_power() - Estimates the power of a CPU
> + * @mW:		pointer to the power estimate in milli-watts
> + * @KHz:	pointer to the OPP's frequency, in kilo-hertz

nit: should be kHz

> + * @cpu:	CPU for which power needs to be estimated
> + *
> + * Computes the power estimated by @CPU at the first OPP above @KHz (ceil),
> + * and updates @KHz and @mW accordingly.
> + *
> + * The power is estimated as P = C * V^2 * f, with C the CPU's capacitance
> + * (read from the 'dynamic-power-coefficient' devicetree binding) and V and f
> + * respectively the voltage and frequency of the OPP.
> + *
> + * Return: -ENODEV if the CPU device cannot be found, -EINVAL if the power
> + * calculation failed because of missing parameters, 0 otherwise.
> + */
> +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu)

I think it is more common to put the input parameters first, then the
output ones, i.e. cpu, kHz, mW.

> +{
> +	unsigned long mV, Hz, MHz;
> +	struct device *cpu_dev;
> +	struct dev_pm_opp *opp;
> +	struct device_node *np;
> +	u32 cap;
> +	u64 tmp;
> +	int ret;
> +
> +	cpu_dev = get_cpu_device(cpu);
> +	if (!cpu_dev)
> +		return -ENODEV;
> +
> +	np = of_node_get(cpu_dev->of_node);
> +	if (!np)
> +		return -EINVAL;
> +
> +	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
> +	of_node_put(np);
> +	if (ret)
> +		return -EINVAL;
> +
> +	Hz = *KHz * 1000;
> +	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz);
> +	if (IS_ERR(opp))
> +		return -EINVAL;
> +
> +	mV = dev_pm_opp_get_voltage(opp) / 1000;
> +	dev_pm_opp_put(opp);
> +	if (!mV)
> +		return -EINVAL;
> +
> +	MHz = Hz / 1000000;
> +	tmp = (u64)cap * mV * mV * MHz;
> +	do_div(tmp, 1000000000);
> +
> +	*mW = (unsigned long)tmp;
> +	*KHz = Hz / 1000;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(of_dev_pm_opp_get_cpu_power);
> diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
> index 0a2a88e5a383..fedde14f5187 100644
> --- a/include/linux/pm_opp.h
> +++ b/include/linux/pm_opp.h
> @@ -322,6 +322,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma
>  struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev);
>  struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp);
>  int of_get_required_opp_performance_state(struct device_node *np, int index);
> +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu);
>  #else
>  static inline int dev_pm_opp_of_add_table(struct device *dev)
>  {
> @@ -364,6 +365,10 @@ static inline int of_get_required_opp_performance_state(struct device_node *np,
>  {
>  	return -ENOTSUPP;
>  }
> +static inline int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu)
> +{
> +	return -ENOTSUPP;
> +}
>  #endif
>  
>  #endif		/* __LINUX_OPP_H__ */

Besides the nits above:

Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Tested-by: Matthias Kaehlcke <mka@chromium.org>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: dt: Register an Energy Model
  2019-01-28 16:55 ` [PATCH 2/7] cpufreq: dt: Register an Energy Model Quentin Perret
@ 2019-01-28 19:36   ` Matthias Kaehlcke
  2019-01-29  5:21     ` Viresh Kumar
  2019-01-29  5:30   ` Viresh Kumar
  1 sibling, 1 reply; 25+ messages in thread
From: Matthias Kaehlcke @ 2019-01-28 19:36 UTC (permalink / raw)
  To: Quentin Perret
  Cc: rjw, viresh.kumar, sudeep.holla, liviu.dudau, lorenzo.pieralisi,
	robh+dt, mark.rutland, nm, sboyd, linux-arm-kernel, devicetree,
	linux-kernel, linux-pm, dietmar.eggemann

On Mon, Jan 28, 2019 at 04:55:17PM +0000, Quentin Perret wrote:
> Now that PM_OPP provides a helper function to estimate the power
> consumed by CPUs, make sure to try and register an Energy Model (EM)
> from cpufreq-dt, hence ensuring interested subsystems (the task
> scheduler, for example) can make use of that information when available.
> 
> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> ---
>  drivers/cpufreq/cpufreq-dt.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
> index e58bfcb1169e..7556e07e7a9f 100644
> --- a/drivers/cpufreq/cpufreq-dt.c
> +++ b/drivers/cpufreq/cpufreq-dt.c
> @@ -16,6 +16,7 @@
>  #include <linux/cpu_cooling.h>
>  #include <linux/cpufreq.h>
>  #include <linux/cpumask.h>
> +#include <linux/energy_model.h>
>  #include <linux/err.h>
>  #include <linux/module.h>
>  #include <linux/of.h>
> @@ -152,6 +153,7 @@ static int resources_available(void)
>  
>  static int cpufreq_init(struct cpufreq_policy *policy)
>  {
> +	struct em_data_callback em_cb = EM_DATA_CB(of_dev_pm_opp_get_cpu_power);
>  	struct cpufreq_frequency_table *freq_table;
>  	struct opp_table *opp_table = NULL;
>  	struct private_data *priv;
> @@ -160,7 +162,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
>  	unsigned int transition_latency;
>  	bool fallback = false;
>  	const char *name;
> -	int ret;
> +	int ret, nr_opp;
>  
>  	cpu_dev = get_cpu_device(policy->cpu);
>  	if (!cpu_dev) {
> @@ -237,6 +239,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
>  		ret = -EPROBE_DEFER;
>  		goto out_free_opp;
>  	}
> +	nr_opp = ret;
>  
>  	if (fallback) {
>  		cpumask_setall(policy->cpus);
> @@ -280,6 +283,8 @@ static int cpufreq_init(struct cpufreq_policy *policy)
>  	policy->cpuinfo.transition_latency = transition_latency;
>  	policy->dvfs_possible_from_any_cpu = true;
>  
> +	em_register_perf_domain(policy->cpus, nr_opp, &em_cb);

The perf domain is registered here but not unregistered in exit()
(there is currently no API do do this). When init() is called for the
second (or third, ...) time em_register_perf_domain() is called again,
though it skips the registration. To make things more explicit and not
rely on internal behavior of em_register_perf_domain() you could place
the registration inside an 'if (!em_cpu_get(cpu))' branch.

You should probably check the return value of em_register_perf_domain()
and at least print a warning in case of failure. This would require to
call em_register_perf_domain() only when CONFIG_ENERGY_MODEL=y (the
function returns -EINVAL otherwise).

I think this patch will result in error messages at registration on
platforms that use the cpufreq-dt driver and don't specify
'dynamic-power-coefficient' for the CPUs in the DT. Not sure if that's
a problem as long as the cpufreq initialization succeeds regardless,
it could be seen as a not-so-gentle nudge to add the values.

Cheers

Matthias

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/7] PM / OPP: Introduce a power estimation helper
  2019-01-28 16:55 ` [PATCH 1/7] PM / OPP: Introduce a power estimation helper Quentin Perret
  2019-01-28 19:02   ` Matthias Kaehlcke
@ 2019-01-29  5:10   ` Viresh Kumar
  2019-01-29  9:09     ` Quentin Perret
  1 sibling, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2019-01-29  5:10 UTC (permalink / raw)
  To: Quentin Perret
  Cc: rjw, sudeep.holla, liviu.dudau, lorenzo.pieralisi, robh+dt,
	mark.rutland, nm, sboyd, linux-arm-kernel, devicetree,
	linux-kernel, linux-pm, dietmar.eggemann, mka

On 28-01-19, 16:55, Quentin Perret wrote:
> diff --git a/drivers/opp/of.c b/drivers/opp/of.c
> index 06f0f632ec47..7572a2eb2fd4 100644
> --- a/drivers/opp/of.c
> +++ b/drivers/opp/of.c
> @@ -1047,3 +1047,63 @@ struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp)
>  	return of_node_get(opp->np);
>  }
>  EXPORT_SYMBOL_GPL(dev_pm_opp_get_of_node);
> +
> +/**
> + * of_dev_pm_opp_get_cpu_power() - Estimates the power of a CPU
> + * @mW:		pointer to the power estimate in milli-watts
> + * @KHz:	pointer to the OPP's frequency, in kilo-hertz
> + * @cpu:	CPU for which power needs to be estimated
> + *
> + * Computes the power estimated by @CPU at the first OPP above @KHz (ceil),
> + * and updates @KHz and @mW accordingly.
> + *
> + * The power is estimated as P = C * V^2 * f, with C the CPU's capacitance
> + * (read from the 'dynamic-power-coefficient' devicetree binding) and V and f
> + * respectively the voltage and frequency of the OPP.
> + *
> + * Return: -ENODEV if the CPU device cannot be found, -EINVAL if the power
> + * calculation failed because of missing parameters, 0 otherwise.
> + */
> +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu)

I would suggest to change the return type to unsigned long and return 0 for
errors and positive values for mW.

> +{
> +	unsigned long mV, Hz, MHz;

MHz is used only once, I think you can get rid of it just open code the division
by 1000000. Maybe remove Hz as well and directly use kHz.

> +	struct device *cpu_dev;
> +	struct dev_pm_opp *opp;
> +	struct device_node *np;
> +	u32 cap;
> +	u64 tmp;

Name this mW with the above changes.

> +	int ret;
> +
> +	cpu_dev = get_cpu_device(cpu);
> +	if (!cpu_dev)
> +		return -ENODEV;
> +
> +	np = of_node_get(cpu_dev->of_node);
> +	if (!np)
> +		return -EINVAL;
> +
> +	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
> +	of_node_put(np);
> +	if (ret)
> +		return -EINVAL;
> +
> +	Hz = *KHz * 1000;
> +	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz);
> +	if (IS_ERR(opp))
> +		return -EINVAL;
> +
> +	mV = dev_pm_opp_get_voltage(opp) / 1000;
> +	dev_pm_opp_put(opp);
> +	if (!mV)
> +		return -EINVAL;
> +
> +	MHz = Hz / 1000000;
> +	tmp = (u64)cap * mV * mV * MHz;
> +	do_div(tmp, 1000000000);
> +
> +	*mW = (unsigned long)tmp;
> +	*KHz = Hz / 1000;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(of_dev_pm_opp_get_cpu_power);
> diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
> index 0a2a88e5a383..fedde14f5187 100644
> --- a/include/linux/pm_opp.h
> +++ b/include/linux/pm_opp.h
> @@ -322,6 +322,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma
>  struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev);
>  struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp);
>  int of_get_required_opp_performance_state(struct device_node *np, int index);
> +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu);
>  #else
>  static inline int dev_pm_opp_of_add_table(struct device *dev)
>  {
> @@ -364,6 +365,10 @@ static inline int of_get_required_opp_performance_state(struct device_node *np,
>  {
>  	return -ENOTSUPP;
>  }

Add a blank line here. We missed adding the same before
of_get_required_opp_performance_state() earlier, maybe add the new routine
before of_get_required_opp_performance_state() and add blank line before and
after it :)

> +static inline int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu)
> +{
> +	return -ENOTSUPP;
> +}
>  #endif
>  
>  #endif		/* __LINUX_OPP_H__ */
> -- 
> 2.20.1

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: dt: Register an Energy Model
  2019-01-28 19:36   ` Matthias Kaehlcke
@ 2019-01-29  5:21     ` Viresh Kumar
  2019-01-29  9:15       ` Quentin Perret
  0 siblings, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2019-01-29  5:21 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: Quentin Perret, rjw, sudeep.holla, liviu.dudau,
	lorenzo.pieralisi, robh+dt, mark.rutland, nm, sboyd,
	linux-arm-kernel, devicetree, linux-kernel, linux-pm,
	dietmar.eggemann

On 28-01-19, 11:36, Matthias Kaehlcke wrote:
> I think this patch will result in error messages at registration on
> platforms that use the cpufreq-dt driver and don't specify
> 'dynamic-power-coefficient' for the CPUs in the DT. Not sure if that's
> a problem as long as the cpufreq initialization succeeds regardless,
> it could be seen as a not-so-gentle nudge to add the values.

That wouldn't be acceptable.

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: dt: Register an Energy Model
  2019-01-28 16:55 ` [PATCH 2/7] cpufreq: dt: Register an Energy Model Quentin Perret
  2019-01-28 19:36   ` Matthias Kaehlcke
@ 2019-01-29  5:30   ` Viresh Kumar
  2019-01-29  9:10     ` Quentin Perret
  1 sibling, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2019-01-29  5:30 UTC (permalink / raw)
  To: Quentin Perret
  Cc: rjw, sudeep.holla, liviu.dudau, lorenzo.pieralisi, robh+dt,
	mark.rutland, nm, sboyd, linux-arm-kernel, devicetree,
	linux-kernel, linux-pm, dietmar.eggemann, mka

On 28-01-19, 16:55, Quentin Perret wrote:
> Now that PM_OPP provides a helper function to estimate the power
> consumed by CPUs, make sure to try and register an Energy Model (EM)
> from cpufreq-dt, hence ensuring interested subsystems (the task
> scheduler, for example) can make use of that information when available.
> 
> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> ---
>  drivers/cpufreq/cpufreq-dt.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
> index e58bfcb1169e..7556e07e7a9f 100644
> --- a/drivers/cpufreq/cpufreq-dt.c
> +++ b/drivers/cpufreq/cpufreq-dt.c
> @@ -16,6 +16,7 @@
>  #include <linux/cpu_cooling.h>
>  #include <linux/cpufreq.h>
>  #include <linux/cpumask.h>
> +#include <linux/energy_model.h>
>  #include <linux/err.h>
>  #include <linux/module.h>
>  #include <linux/of.h>
> @@ -152,6 +153,7 @@ static int resources_available(void)
>  
>  static int cpufreq_init(struct cpufreq_policy *policy)
>  {
> +	struct em_data_callback em_cb = EM_DATA_CB(of_dev_pm_opp_get_cpu_power);

Hmm, so all the earlier comments regarding prototype of this routine are invalid
because of this.

>  	struct cpufreq_frequency_table *freq_table;
>  	struct opp_table *opp_table = NULL;
>  	struct private_data *priv;
> @@ -160,7 +162,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
>  	unsigned int transition_latency;
>  	bool fallback = false;
>  	const char *name;
> -	int ret;
> +	int ret, nr_opp;
>  
>  	cpu_dev = get_cpu_device(policy->cpu);
>  	if (!cpu_dev) {
> @@ -237,6 +239,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
>  		ret = -EPROBE_DEFER;
>  		goto out_free_opp;
>  	}
> +	nr_opp = ret;

Instead use nr_opp instead of ret while calling dev_pm_opp_get_opp_count().

>  
>  	if (fallback) {
>  		cpumask_setall(policy->cpus);
> @@ -280,6 +283,8 @@ static int cpufreq_init(struct cpufreq_policy *policy)
>  	policy->cpuinfo.transition_latency = transition_latency;
>  	policy->dvfs_possible_from_any_cpu = true;
>  
> +	em_register_perf_domain(policy->cpus, nr_opp, &em_cb);
> +
>  	return 0;
>  
>  out_free_cpufreq_table:
> -- 
> 2.20.1

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/7] cpufreq: scpi: Register an Energy Model
  2019-01-28 16:55 ` [PATCH 3/7] cpufreq: scpi: " Quentin Perret
@ 2019-01-29  5:31   ` Viresh Kumar
  0 siblings, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2019-01-29  5:31 UTC (permalink / raw)
  To: Quentin Perret
  Cc: rjw, sudeep.holla, liviu.dudau, lorenzo.pieralisi, robh+dt,
	mark.rutland, nm, sboyd, linux-arm-kernel, devicetree,
	linux-kernel, linux-pm, dietmar.eggemann, mka

On 28-01-19, 16:55, Quentin Perret wrote:
> Now that PM_OPP provides a helper function to estimate the power
> consumed by CPUs, make sure to try and register an Energy Model (EM)
> from scpi-cpufreq, hence ensuring interested subsystems (the task
> scheduler, for example) can make use of that information when available.
> 
> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> ---
>  drivers/cpufreq/scpi-cpufreq.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/scpi-cpufreq.c b/drivers/cpufreq/scpi-cpufreq.c
> index 87a98ec77773..05fc7448f5cb 100644
> --- a/drivers/cpufreq/scpi-cpufreq.c
> +++ b/drivers/cpufreq/scpi-cpufreq.c
> @@ -23,6 +23,7 @@
>  #include <linux/cpufreq.h>
>  #include <linux/cpumask.h>
>  #include <linux/cpu_cooling.h>
> +#include <linux/energy_model.h>
>  #include <linux/export.h>
>  #include <linux/module.h>
>  #include <linux/of_platform.h>
> @@ -98,11 +99,12 @@ scpi_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask)
>  
>  static int scpi_cpufreq_init(struct cpufreq_policy *policy)
>  {
> -	int ret;
> +	int ret, nr_opp;
>  	unsigned int latency;
>  	struct device *cpu_dev;
>  	struct scpi_data *priv;
>  	struct cpufreq_frequency_table *freq_table;
> +	struct em_data_callback em_cb = EM_DATA_CB(of_dev_pm_opp_get_cpu_power);
>  
>  	cpu_dev = get_cpu_device(policy->cpu);
>  	if (!cpu_dev) {
> @@ -135,6 +137,7 @@ static int scpi_cpufreq_init(struct cpufreq_policy *policy)
>  		ret = -EPROBE_DEFER;
>  		goto out_free_opp;
>  	}
> +	nr_opp = ret;

Same comment here.

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/7] PM / OPP: Introduce a power estimation helper
  2019-01-28 19:02   ` Matthias Kaehlcke
@ 2019-01-29  9:03     ` Quentin Perret
  2019-01-29 17:52       ` Matthias Kaehlcke
  0 siblings, 1 reply; 25+ messages in thread
From: Quentin Perret @ 2019-01-29  9:03 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: rjw, viresh.kumar, sudeep.holla, liviu.dudau, lorenzo.pieralisi,
	robh+dt, mark.rutland, nm, sboyd, linux-arm-kernel, devicetree,
	linux-kernel, linux-pm, dietmar.eggemann

Hi Matthias,

On Monday 28 Jan 2019 at 11:02:51 (-0800), Matthias Kaehlcke wrote:
> > +/**
> > + * of_dev_pm_opp_get_cpu_power() - Estimates the power of a CPU
> > + * @mW:		pointer to the power estimate in milli-watts
> > + * @KHz:	pointer to the OPP's frequency, in kilo-hertz
> 
> nit: should be kHz

Right, I can change that :-)

> 
> > + * @cpu:	CPU for which power needs to be estimated
> > + *
> > + * Computes the power estimated by @CPU at the first OPP above @KHz (ceil),
> > + * and updates @KHz and @mW accordingly.
> > + *
> > + * The power is estimated as P = C * V^2 * f, with C the CPU's capacitance
> > + * (read from the 'dynamic-power-coefficient' devicetree binding) and V and f
> > + * respectively the voltage and frequency of the OPP.
> > + *
> > + * Return: -ENODEV if the CPU device cannot be found, -EINVAL if the power
> > + * calculation failed because of missing parameters, 0 otherwise.
> > + */
> > +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu)
> 
> I think it is more common to put the input parameters first, then the
> output ones, i.e. cpu, kHz, mW.

Hmm, the issue is this must match the expectations of the EM framework:

https://elixir.bootlin.com/linux/v5.0-rc4/source/include/linux/energy_model.h#L62

So, I don't really have a choice. Or we can change the core code if this
_really_ is a problem -- we don't have users yet so now is the best time
to do so I guess ...

> 
> > +{
> > +	unsigned long mV, Hz, MHz;
> > +	struct device *cpu_dev;
> > +	struct dev_pm_opp *opp;
> > +	struct device_node *np;
> > +	u32 cap;
> > +	u64 tmp;
> > +	int ret;
> > +
> > +	cpu_dev = get_cpu_device(cpu);
> > +	if (!cpu_dev)
> > +		return -ENODEV;
> > +
> > +	np = of_node_get(cpu_dev->of_node);
> > +	if (!np)
> > +		return -EINVAL;
> > +
> > +	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
> > +	of_node_put(np);
> > +	if (ret)
> > +		return -EINVAL;
> > +
> > +	Hz = *KHz * 1000;
> > +	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz);
> > +	if (IS_ERR(opp))
> > +		return -EINVAL;
> > +
> > +	mV = dev_pm_opp_get_voltage(opp) / 1000;
> > +	dev_pm_opp_put(opp);
> > +	if (!mV)
> > +		return -EINVAL;
> > +
> > +	MHz = Hz / 1000000;
> > +	tmp = (u64)cap * mV * mV * MHz;
> > +	do_div(tmp, 1000000000);
> > +
> > +	*mW = (unsigned long)tmp;
> > +	*KHz = Hz / 1000;
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(of_dev_pm_opp_get_cpu_power);
> > diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
> > index 0a2a88e5a383..fedde14f5187 100644
> > --- a/include/linux/pm_opp.h
> > +++ b/include/linux/pm_opp.h
> > @@ -322,6 +322,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma
> >  struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev);
> >  struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp);
> >  int of_get_required_opp_performance_state(struct device_node *np, int index);
> > +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu);
> >  #else
> >  static inline int dev_pm_opp_of_add_table(struct device *dev)
> >  {
> > @@ -364,6 +365,10 @@ static inline int of_get_required_opp_performance_state(struct device_node *np,
> >  {
> >  	return -ENOTSUPP;
> >  }
> > +static inline int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu)
> > +{
> > +	return -ENOTSUPP;
> > +}
> >  #endif
> >  
> >  #endif		/* __LINUX_OPP_H__ */
> 
> Besides the nits above:
> 
> Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
> Tested-by: Matthias Kaehlcke <mka@chromium.org>

Thank you !
Quentin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/7] PM / OPP: Introduce a power estimation helper
  2019-01-29  5:10   ` Viresh Kumar
@ 2019-01-29  9:09     ` Quentin Perret
  0 siblings, 0 replies; 25+ messages in thread
From: Quentin Perret @ 2019-01-29  9:09 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: rjw, sudeep.holla, liviu.dudau, lorenzo.pieralisi, robh+dt,
	mark.rutland, nm, sboyd, linux-arm-kernel, devicetree,
	linux-kernel, linux-pm, dietmar.eggemann, mka

Hi Viresh,

On Tuesday 29 Jan 2019 at 10:40:30 (+0530), Viresh Kumar wrote:
> > +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu)
> 
> I would suggest to change the return type to unsigned long and return 0 for
> errors and positive values for mW.

Well, I can't really do that -- this must match the API of PM_EM:

https://elixir.bootlin.com/linux/v5.0-rc4/source/include/linux/energy_model.h#L62

I'll make sure to make that explicit in the commit message for v2.

> 
> > +{
> > +	unsigned long mV, Hz, MHz;
> 
> MHz is used only once, I think you can get rid of it just open code the division
> by 1000000. Maybe remove Hz as well and directly use kHz.

I would expect/hope the compiler will optimize things like that for me
but yes, I can try to reduce a bit the temp variables if you prefer.

> 
> > +	struct device *cpu_dev;
> > +	struct dev_pm_opp *opp;
> > +	struct device_node *np;
> > +	u32 cap;
> > +	u64 tmp;
> 
> Name this mW with the above changes.

So that doesn't really work given what I mentioned above.

> > +	int ret;
> > +
> > +	cpu_dev = get_cpu_device(cpu);
> > +	if (!cpu_dev)
> > +		return -ENODEV;
> > +
> > +	np = of_node_get(cpu_dev->of_node);
> > +	if (!np)
> > +		return -EINVAL;
> > +
> > +	ret = of_property_read_u32(np, "dynamic-power-coefficient", &cap);
> > +	of_node_put(np);
> > +	if (ret)
> > +		return -EINVAL;
> > +
> > +	Hz = *KHz * 1000;
> > +	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &Hz);
> > +	if (IS_ERR(opp))
> > +		return -EINVAL;
> > +
> > +	mV = dev_pm_opp_get_voltage(opp) / 1000;
> > +	dev_pm_opp_put(opp);
> > +	if (!mV)
> > +		return -EINVAL;
> > +
> > +	MHz = Hz / 1000000;
> > +	tmp = (u64)cap * mV * mV * MHz;
> > +	do_div(tmp, 1000000000);
> > +
> > +	*mW = (unsigned long)tmp;
> > +	*KHz = Hz / 1000;
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(of_dev_pm_opp_get_cpu_power);
> > diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
> > index 0a2a88e5a383..fedde14f5187 100644
> > --- a/include/linux/pm_opp.h
> > +++ b/include/linux/pm_opp.h
> > @@ -322,6 +322,7 @@ int dev_pm_opp_of_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpuma
> >  struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev);
> >  struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp);
> >  int of_get_required_opp_performance_state(struct device_node *np, int index);
> > +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu);
> >  #else
> >  static inline int dev_pm_opp_of_add_table(struct device *dev)
> >  {
> > @@ -364,6 +365,10 @@ static inline int of_get_required_opp_performance_state(struct device_node *np,
> >  {
> >  	return -ENOTSUPP;
> >  }
> 
> Add a blank line here. We missed adding the same before
> of_get_required_opp_performance_state() earlier, maybe add the new routine
> before of_get_required_opp_performance_state() and add blank line before and
> after it :)

Sounds good !

> 
> > +static inline int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu)
> > +{
> > +	return -ENOTSUPP;
> > +}
> >  #endif
> >  
> >  #endif		/* __LINUX_OPP_H__ */
> > -- 
> > 2.20.1
> 
> -- 
> viresh

Thanks,
Quentin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: dt: Register an Energy Model
  2019-01-29  5:30   ` Viresh Kumar
@ 2019-01-29  9:10     ` Quentin Perret
  0 siblings, 0 replies; 25+ messages in thread
From: Quentin Perret @ 2019-01-29  9:10 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: rjw, sudeep.holla, liviu.dudau, lorenzo.pieralisi, robh+dt,
	mark.rutland, nm, sboyd, linux-arm-kernel, devicetree,
	linux-kernel, linux-pm, dietmar.eggemann, mka

On Tuesday 29 Jan 2019 at 11:00:41 (+0530), Viresh Kumar wrote:
> On 28-01-19, 16:55, Quentin Perret wrote:
> > Now that PM_OPP provides a helper function to estimate the power
> > consumed by CPUs, make sure to try and register an Energy Model (EM)
> > from cpufreq-dt, hence ensuring interested subsystems (the task
> > scheduler, for example) can make use of that information when available.
> > 
> > Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> > ---
> >  drivers/cpufreq/cpufreq-dt.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
> > index e58bfcb1169e..7556e07e7a9f 100644
> > --- a/drivers/cpufreq/cpufreq-dt.c
> > +++ b/drivers/cpufreq/cpufreq-dt.c
> > @@ -16,6 +16,7 @@
> >  #include <linux/cpu_cooling.h>
> >  #include <linux/cpufreq.h>
> >  #include <linux/cpumask.h>
> > +#include <linux/energy_model.h>
> >  #include <linux/err.h>
> >  #include <linux/module.h>
> >  #include <linux/of.h>
> > @@ -152,6 +153,7 @@ static int resources_available(void)
> >  
> >  static int cpufreq_init(struct cpufreq_policy *policy)
> >  {
> > +	struct em_data_callback em_cb = EM_DATA_CB(of_dev_pm_opp_get_cpu_power);
> 
> Hmm, so all the earlier comments regarding prototype of this routine are invalid
> because of this.

Right.

> 
> >  	struct cpufreq_frequency_table *freq_table;
> >  	struct opp_table *opp_table = NULL;
> >  	struct private_data *priv;
> > @@ -160,7 +162,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
> >  	unsigned int transition_latency;
> >  	bool fallback = false;
> >  	const char *name;
> > -	int ret;
> > +	int ret, nr_opp;
> >  
> >  	cpu_dev = get_cpu_device(policy->cpu);
> >  	if (!cpu_dev) {
> > @@ -237,6 +239,7 @@ static int cpufreq_init(struct cpufreq_policy *policy)
> >  		ret = -EPROBE_DEFER;
> >  		goto out_free_opp;
> >  	}
> > +	nr_opp = ret;
> 
> Instead use nr_opp instead of ret while calling dev_pm_opp_get_opp_count().

Will do.

> 
> >  
> >  	if (fallback) {
> >  		cpumask_setall(policy->cpus);
> > @@ -280,6 +283,8 @@ static int cpufreq_init(struct cpufreq_policy *policy)
> >  	policy->cpuinfo.transition_latency = transition_latency;
> >  	policy->dvfs_possible_from_any_cpu = true;
> >  
> > +	em_register_perf_domain(policy->cpus, nr_opp, &em_cb);
> > +
> >  	return 0;
> >  
> >  out_free_cpufreq_table:
> > -- 
> > 2.20.1
> 
> -- 
> viresh

Thanks,
Quentin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: dt: Register an Energy Model
  2019-01-29  5:21     ` Viresh Kumar
@ 2019-01-29  9:15       ` Quentin Perret
  2019-01-30  5:18         ` Viresh Kumar
  0 siblings, 1 reply; 25+ messages in thread
From: Quentin Perret @ 2019-01-29  9:15 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Matthias Kaehlcke, rjw, sudeep.holla, liviu.dudau,
	lorenzo.pieralisi, robh+dt, mark.rutland, nm, sboyd,
	linux-arm-kernel, devicetree, linux-kernel, linux-pm,
	dietmar.eggemann

On Tuesday 29 Jan 2019 at 10:51:44 (+0530), Viresh Kumar wrote:
> On 28-01-19, 11:36, Matthias Kaehlcke wrote:
> > I think this patch will result in error messages at registration on
> > platforms that use the cpufreq-dt driver and don't specify
> > 'dynamic-power-coefficient' for the CPUs in the DT. Not sure if that's
> > a problem as long as the cpufreq initialization succeeds regardless,
> > it could be seen as a not-so-gentle nudge to add the values.
> 
> That wouldn't be acceptable.

Fair enough. What I can propose in this case is to have in PM_OPP a
helper called 'dev_pm_opp_of_register_em()' or something like this. This
function will check all prerequisites are present (we have the right
values in DT, and so on) and then call (or not) em_register_perf_domain().
Then we can make the CPUFreq drivers use that instead of calling
em_register_perf_domain() directly.

That would also make it easy to implement Matthias' suggestion to not
call em_register_perf_domain() if an EM is already present.

Would that work ?

Thanks,
Quentin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/7] arm64: dts: juno: Add cpu dynamic-power-coefficient information
  2019-01-28 16:55 ` [PATCH 6/7] arm64: dts: juno: Add cpu dynamic-power-coefficient information Quentin Perret
@ 2019-01-29 15:27   ` Sudeep Holla
  2019-01-30 10:23     ` Quentin Perret
  0 siblings, 1 reply; 25+ messages in thread
From: Sudeep Holla @ 2019-01-29 15:27 UTC (permalink / raw)
  To: Quentin Perret
  Cc: rjw, viresh.kumar, liviu.dudau, lorenzo.pieralisi, robh+dt,
	mark.rutland, nm, sboyd, linux-arm-kernel, devicetree,
	linux-kernel, linux-pm, dietmar.eggemann, Sudeep Holla, mka

On Mon, Jan 28, 2019 at 04:55:21PM +0000, Quentin Perret wrote:
> From: Dietmar Eggemann <dietmar.eggemann@arm.com>
>
> A CPUfreq driver, like the scpi driver used on Juno boards, which
> provide the Energy Model with power cost information via the PM_OPP
> of_dev_pm_opp_get_cpu_power() function, do need the
> dynamic-power-coefficient (C) in the device tree.
>
> Method used to obtain the C value:
> C is computed by measuring energy (E) consumption of a frequency domain
> (FD) over a 10s runtime (t) sysbench workload running at each Operating
> Performance Point (OPP) affine to 1 or 2 CPUs of that FD while the other
> CPUs of the system are hotplugged out.
> By definition all CPUs of a FD have the the same micro-architecture. An
> OPP is characterized by a certain frequency (f) and voltage (V) value.
> The corresponding power values (P) are calculated by dividing the delta
> of the E values between the runs with 2 and 1 CPUs by t.
>
> With n data tuples (P, f, V), n equal to number of OPPs for this
> frequency domain, we can solve C by:
>
> P = Pstat + Pdyn
>
> P = Pstat + CV²f
>
> Cx = (Px - P1)/(Vx²fx - V1²f1) with x = {2, ..., n}
>
> The C value is the arithmetic mean out of {C2, ..., Cn}.
>
> Since DVFS is broken on Juno r1, no dynamic-power-coefficient
> information has been added to its dts file.
>

Since the binding for "dynamic-power-coefficient" property already exist,
and I don't see any dependency for this and the next patch(TC2) on this
series, I will apply them. Please shout if for any reason that's not true.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/7] PM / OPP: Introduce a power estimation helper
  2019-01-29  9:03     ` Quentin Perret
@ 2019-01-29 17:52       ` Matthias Kaehlcke
  0 siblings, 0 replies; 25+ messages in thread
From: Matthias Kaehlcke @ 2019-01-29 17:52 UTC (permalink / raw)
  To: Quentin Perret
  Cc: rjw, viresh.kumar, sudeep.holla, liviu.dudau, lorenzo.pieralisi,
	robh+dt, mark.rutland, nm, sboyd, linux-arm-kernel, devicetree,
	linux-kernel, linux-pm, dietmar.eggemann

Hi Quentin,

On Tue, Jan 29, 2019 at 09:03:34AM +0000, Quentin Perret wrote:
> Hi Matthias,
> 
> On Monday 28 Jan 2019 at 11:02:51 (-0800), Matthias Kaehlcke wrote:
> > > +/**
> > > + * of_dev_pm_opp_get_cpu_power() - Estimates the power of a CPU
> > > + * @mW:		pointer to the power estimate in milli-watts
> > > + * @KHz:	pointer to the OPP's frequency, in kilo-hertz
> > 
> > nit: should be kHz
> 
> Right, I can change that :-)
> 
> > 
> > > + * @cpu:	CPU for which power needs to be estimated
> > > + *
> > > + * Computes the power estimated by @CPU at the first OPP above @KHz (ceil),
> > > + * and updates @KHz and @mW accordingly.
> > > + *
> > > + * The power is estimated as P = C * V^2 * f, with C the CPU's capacitance
> > > + * (read from the 'dynamic-power-coefficient' devicetree binding) and V and f
> > > + * respectively the voltage and frequency of the OPP.
> > > + *
> > > + * Return: -ENODEV if the CPU device cannot be found, -EINVAL if the power
> > > + * calculation failed because of missing parameters, 0 otherwise.
> > > + */
> > > +int of_dev_pm_opp_get_cpu_power(unsigned long *mW, unsigned long *KHz, int cpu)
> > 
> > I think it is more common to put the input parameters first, then the
> > output ones, i.e. cpu, kHz, mW.
> 
> Hmm, the issue is this must match the expectations of the EM framework:
> 
> https://elixir.bootlin.com/linux/v5.0-rc4/source/include/linux/energy_model.h#L62
> 
> So, I don't really have a choice. Or we can change the core code if this
> _really_ is a problem -- we don't have users yet so now is the best time
> to do so I guess ...

I see.

I think it would be preferable to follow the common scheme, but it's
also not a major issue, up to you whether you want to change the
definition of the callback. It's certainly true that now would be the
best time for a change.

Cheers

Matthias

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: dt: Register an Energy Model
  2019-01-29  9:15       ` Quentin Perret
@ 2019-01-30  5:18         ` Viresh Kumar
  2019-01-30  9:12           ` Quentin Perret
  0 siblings, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2019-01-30  5:18 UTC (permalink / raw)
  To: Quentin Perret
  Cc: Matthias Kaehlcke, rjw, sudeep.holla, liviu.dudau,
	lorenzo.pieralisi, robh+dt, mark.rutland, nm, sboyd,
	linux-arm-kernel, devicetree, linux-kernel, linux-pm,
	dietmar.eggemann

On 29-01-19, 09:15, Quentin Perret wrote:
> On Tuesday 29 Jan 2019 at 10:51:44 (+0530), Viresh Kumar wrote:
> > On 28-01-19, 11:36, Matthias Kaehlcke wrote:
> > > I think this patch will result in error messages at registration on
> > > platforms that use the cpufreq-dt driver and don't specify
> > > 'dynamic-power-coefficient' for the CPUs in the DT. Not sure if that's
> > > a problem as long as the cpufreq initialization succeeds regardless,
> > > it could be seen as a not-so-gentle nudge to add the values.
> > 
> > That wouldn't be acceptable.
> 
> Fair enough. What I can propose in this case is to have in PM_OPP a
> helper called 'dev_pm_opp_of_register_em()' or something like this. This
> function will check all prerequisites are present (we have the right
> values in DT, and so on) and then call (or not) em_register_perf_domain().
> Then we can make the CPUFreq drivers use that instead of calling
> em_register_perf_domain() directly.

That should be fine.

> That would also make it easy to implement Matthias' suggestion to not
> call em_register_perf_domain() if an EM is already present.

So you will track registration state within the OPP core for that ?
Sorry but that doesn't sound right. What's wrong with having an
unregister helper in energy-model to keep proper code flow everywhere
?

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: dt: Register an Energy Model
  2019-01-30  5:18         ` Viresh Kumar
@ 2019-01-30  9:12           ` Quentin Perret
  2019-01-30 10:17             ` Viresh Kumar
  0 siblings, 1 reply; 25+ messages in thread
From: Quentin Perret @ 2019-01-30  9:12 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Matthias Kaehlcke, rjw, sudeep.holla, liviu.dudau,
	lorenzo.pieralisi, robh+dt, mark.rutland, nm, sboyd,
	linux-arm-kernel, devicetree, linux-kernel, linux-pm,
	dietmar.eggemann

Hi Viresh,

On Wednesday 30 Jan 2019 at 10:48:06 (+0530), Viresh Kumar wrote:
> On 29-01-19, 09:15, Quentin Perret wrote:
> > On Tuesday 29 Jan 2019 at 10:51:44 (+0530), Viresh Kumar wrote:
> > > On 28-01-19, 11:36, Matthias Kaehlcke wrote:
> > > > I think this patch will result in error messages at registration on
> > > > platforms that use the cpufreq-dt driver and don't specify
> > > > 'dynamic-power-coefficient' for the CPUs in the DT. Not sure if that's
> > > > a problem as long as the cpufreq initialization succeeds regardless,
> > > > it could be seen as a not-so-gentle nudge to add the values.
> > > 
> > > That wouldn't be acceptable.
> > 
> > Fair enough. What I can propose in this case is to have in PM_OPP a
> > helper called 'dev_pm_opp_of_register_em()' or something like this. This
> > function will check all prerequisites are present (we have the right
> > values in DT, and so on) and then call (or not) em_register_perf_domain().
> > Then we can make the CPUFreq drivers use that instead of calling
> > em_register_perf_domain() directly.
> 
> That should be fine.
> 
> > That would also make it easy to implement Matthias' suggestion to not
> > call em_register_perf_domain() if an EM is already present.
> 
> So you will track registration state within the OPP core for that ?

What I had in mind is something as simple as:

	void of_dev_pm_opp_register_em(struct cpumask *cpus)
	{
		/* Bail out if an EM is there */
		if (em_cpu_get(cpumask_first(cpus)))
			return;

		/* Check prerequisites: dpc coeff in DT, ... */
		...

		em_register_perf_domain(...);
	}

IIUC, Matthias' point was that if the EM is already registered, there is
no good reason to call em_register_perf_domain() again. Now, that should
in fact be harmless because em_register_perf_domain() already does that
check. It's just cleaner and easier to understand from a conceptual
standpoint to not call that function several times for no reason I
assume.

> Sorry but that doesn't sound right. What's wrong with having an
> unregister helper in energy-model to keep proper code flow everywhere
> ?

For the EM we basically allocate the tables in memory once and for all
at boot time and never touch them again. That makes it easy for users
(the scheduler as of now, IPA soon) to access them without messing
around with RCU or so. So there isn't really a concept or unregistering
a pd right now.

Thanks,
Quentin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: dt: Register an Energy Model
  2019-01-30  9:12           ` Quentin Perret
@ 2019-01-30 10:17             ` Viresh Kumar
  2019-01-30 10:20               ` Quentin Perret
  0 siblings, 1 reply; 25+ messages in thread
From: Viresh Kumar @ 2019-01-30 10:17 UTC (permalink / raw)
  To: Quentin Perret
  Cc: Matthias Kaehlcke, rjw, sudeep.holla, liviu.dudau,
	lorenzo.pieralisi, robh+dt, mark.rutland, nm, sboyd,
	linux-arm-kernel, devicetree, linux-kernel, linux-pm,
	dietmar.eggemann

On 30-01-19, 09:12, Quentin Perret wrote:
> What I had in mind is something as simple as:
> 
> 	void of_dev_pm_opp_register_em(struct cpumask *cpus)
> 	{
> 		/* Bail out if an EM is there */
> 		if (em_cpu_get(cpumask_first(cpus)))
> 			return;
> 
> 		/* Check prerequisites: dpc coeff in DT, ... */
> 		...
> 
> 		em_register_perf_domain(...);
> 	}
> 
> IIUC, Matthias' point was that if the EM is already registered, there is
> no good reason to call em_register_perf_domain() again. Now, that should
> in fact be harmless because em_register_perf_domain() already does that
> check. It's just cleaner and easier to understand from a conceptual
> standpoint to not call that function several times for no reason I
> assume.

If there is no good reason to call em_register_perf_domain() several
times, then the same applies to of_dev_pm_opp_register_em() as well,
isn't it ?

This is init code anyway isn't going to run a lot, so I wouldn't
suggest adding any such (duplicate) checks in
of_dev_pm_opp_register_em().

-- 
viresh

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/7] cpufreq: dt: Register an Energy Model
  2019-01-30 10:17             ` Viresh Kumar
@ 2019-01-30 10:20               ` Quentin Perret
  0 siblings, 0 replies; 25+ messages in thread
From: Quentin Perret @ 2019-01-30 10:20 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Matthias Kaehlcke, rjw, sudeep.holla, liviu.dudau,
	lorenzo.pieralisi, robh+dt, mark.rutland, nm, sboyd,
	linux-arm-kernel, devicetree, linux-kernel, linux-pm,
	dietmar.eggemann

On Wednesday 30 Jan 2019 at 15:47:15 (+0530), Viresh Kumar wrote:
> On 30-01-19, 09:12, Quentin Perret wrote:
> > What I had in mind is something as simple as:
> > 
> > 	void of_dev_pm_opp_register_em(struct cpumask *cpus)
> > 	{
> > 		/* Bail out if an EM is there */
> > 		if (em_cpu_get(cpumask_first(cpus)))
> > 			return;
> > 
> > 		/* Check prerequisites: dpc coeff in DT, ... */
> > 		...
> > 
> > 		em_register_perf_domain(...);
> > 	}
> > 
> > IIUC, Matthias' point was that if the EM is already registered, there is
> > no good reason to call em_register_perf_domain() again. Now, that should
> > in fact be harmless because em_register_perf_domain() already does that
> > check. It's just cleaner and easier to understand from a conceptual
> > standpoint to not call that function several times for no reason I
> > assume.
> 
> If there is no good reason to call em_register_perf_domain() several
> times, then the same applies to of_dev_pm_opp_register_em() as well,
> isn't it ?

Hrmpf, that is true ... :-)

> This is init code anyway isn't going to run a lot, so I wouldn't
> suggest adding any such (duplicate) checks in
> of_dev_pm_opp_register_em().

Fair enough, I'll post a v2 w/o the check and we'll see if others have
complaints.

Thanks,
Quentin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/7] arm64: dts: juno: Add cpu dynamic-power-coefficient information
  2019-01-29 15:27   ` Sudeep Holla
@ 2019-01-30 10:23     ` Quentin Perret
  0 siblings, 0 replies; 25+ messages in thread
From: Quentin Perret @ 2019-01-30 10:23 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: rjw, viresh.kumar, liviu.dudau, lorenzo.pieralisi, robh+dt,
	mark.rutland, nm, sboyd, linux-arm-kernel, devicetree,
	linux-kernel, linux-pm, dietmar.eggemann, mka

On Tuesday 29 Jan 2019 at 15:27:35 (+0000), Sudeep Holla wrote:
> On Mon, Jan 28, 2019 at 04:55:21PM +0000, Quentin Perret wrote:
> > From: Dietmar Eggemann <dietmar.eggemann@arm.com>
> >
> > A CPUfreq driver, like the scpi driver used on Juno boards, which
> > provide the Energy Model with power cost information via the PM_OPP
> > of_dev_pm_opp_get_cpu_power() function, do need the
> > dynamic-power-coefficient (C) in the device tree.
> >
> > Method used to obtain the C value:
> > C is computed by measuring energy (E) consumption of a frequency domain
> > (FD) over a 10s runtime (t) sysbench workload running at each Operating
> > Performance Point (OPP) affine to 1 or 2 CPUs of that FD while the other
> > CPUs of the system are hotplugged out.
> > By definition all CPUs of a FD have the the same micro-architecture. An
> > OPP is characterized by a certain frequency (f) and voltage (V) value.
> > The corresponding power values (P) are calculated by dividing the delta
> > of the E values between the runs with 2 and 1 CPUs by t.
> >
> > With n data tuples (P, f, V), n equal to number of OPPs for this
> > frequency domain, we can solve C by:
> >
> > P = Pstat + Pdyn
> >
> > P = Pstat + CV²f
> >
> > Cx = (Px - P1)/(Vx²fx - V1²f1) with x = {2, ..., n}
> >
> > The C value is the arithmetic mean out of {C2, ..., Cn}.
> >
> > Since DVFS is broken on Juno r1, no dynamic-power-coefficient
> > information has been added to its dts file.
> >
> 
> Since the binding for "dynamic-power-coefficient" property already exist,
> and I don't see any dependency for this and the next patch(TC2) on this
> series, I will apply them. Please shout if for any reason that's not true.

Thanks Sudeep !
Quentin

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2019-01-30 10:23 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-28 16:55 [PATCH 0/7] Register an Energy Model for Arm reference platforms Quentin Perret
2019-01-28 16:55 ` [PATCH 1/7] PM / OPP: Introduce a power estimation helper Quentin Perret
2019-01-28 19:02   ` Matthias Kaehlcke
2019-01-29  9:03     ` Quentin Perret
2019-01-29 17:52       ` Matthias Kaehlcke
2019-01-29  5:10   ` Viresh Kumar
2019-01-29  9:09     ` Quentin Perret
2019-01-28 16:55 ` [PATCH 2/7] cpufreq: dt: Register an Energy Model Quentin Perret
2019-01-28 19:36   ` Matthias Kaehlcke
2019-01-29  5:21     ` Viresh Kumar
2019-01-29  9:15       ` Quentin Perret
2019-01-30  5:18         ` Viresh Kumar
2019-01-30  9:12           ` Quentin Perret
2019-01-30 10:17             ` Viresh Kumar
2019-01-30 10:20               ` Quentin Perret
2019-01-29  5:30   ` Viresh Kumar
2019-01-29  9:10     ` Quentin Perret
2019-01-28 16:55 ` [PATCH 3/7] cpufreq: scpi: " Quentin Perret
2019-01-29  5:31   ` Viresh Kumar
2019-01-28 16:55 ` [PATCH 4/7] cpufreq: arm_big_little: " Quentin Perret
2019-01-28 16:55 ` [PATCH 5/7] cpufreq: scmi: " Quentin Perret
2019-01-28 16:55 ` [PATCH 6/7] arm64: dts: juno: Add cpu dynamic-power-coefficient information Quentin Perret
2019-01-29 15:27   ` Sudeep Holla
2019-01-30 10:23     ` Quentin Perret
2019-01-28 16:55 ` [PATCH 7/7] arm: dts: vexpress-v2p-ca15_a7: " Quentin Perret

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).