All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/3] Make IPA use PM_EM
@ 2019-05-15  8:23 ` Quentin Perret
  0 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15  8:23 UTC (permalink / raw)
  To: edubezval, rui.zhang, javi.merino, viresh.kumar, amit.kachhap,
	rjw, will.deacon, catalin.marinas
  Cc: daniel.lezcano, dietmar.eggemann, ionela.voinescu, mka, linux-pm,
	linux-kernel, linux-arm-kernel, quentin.perret

Changes in v4:
**************
 - Added Viresh's Acked-by to all 3 patches
 - Improved commit message of patch 3/3 to explain how it has no
   functional impact on existing users (Eduardo)

Changes in v3:
**************
 - Changed warning message for unordered tables to something more
   explicit (Viresh)
 - Changed WARN() into a pr_err() for consistency

Changes in v2:
**************
 - Fixed patch 01/03 to actually enable CONFIG_ENERGY_MODEL
 - Added "depends on ENERGY_MODEL" to IPA (Daniel)
 - Added check to bail out if the freq table is unsorted (Viresh)

Cover letter:
*************

The Intelligent Power Allocator (IPA) thermal governor uses an Energy
Model (or EM) of the CPUs to re-distribute the power budget. To do so,
it builds a table of <frequency, power> tuples where the power values
are computed using the 'dynamic-power-coefficient' DT property. All of
this is done in and only for the thermal subsystem, and more
specifically for CPUs -- the power of other types of devices is obtained
differently.

Recently, the CPU scheduler has seen the introduction of Energy Aware
Scheduling (EAS) patches, which also rely on an EM of the CPUs. This EM,
however, is managed by an independent framework, called PM_EM, aimed to
be used by all kernel subsystems interested in the power consumed by
CPUs, and not only the scheduler.

This patch series follows this logic and removes the (now redundant)
thermal-specific EM computation code to migrate IPA to use PM_EM
instead.

Doing so should have no visible functional impact for existing users of
IPA since:

 - during the 5.1 development cycle, a series of patches [1] introduced
   in PM_OPP some infrastructure (dev_pm_opp_of_register_em()) enabling
   the registration of EMs in PM_EM using the DT property used by IPA;

 - the existing upstream cpufreq drivers marked with the
   'CPUFREQ_IS_COOLING_DEV' flag all call dev_pm_opp_of_register_em(),
   which means they all support PM_EM (the only two exceptions are
   qoriq-cpufreq which doesn't in fact use an EM and scmi-cpufreq which
   already supports PM_EM without using the PM_OPP infrastructurei
   because it read power costs directly from firmware);

So, migrating IPA to using PM_EM should effectively be just plumbing
since for the existing IPA users the PM_EM tables will contain the
exact same power values that IPA used to compute on its own until now.
The only new dependency is to compile in CONFIG_ENERGY_MODEL.

Why is this migration still a good thing ? For three main reasons.

 1. it removes redundant code;

 2. it introduces an abstraction layer between IPA and the EM
    computation. PM_EM offers to EAS and IPA (and potentially other
    clients) standardized EM tables and hides 'how' these tables have
    been obtained. PM_EM as of now supports power values either coming
    from the 'dynamic-power-coefficient' DT property or obtained
    directly from firmware using SCMI. The latter is a new feature for
    IPA and that comes 'for free' with the migration. This will also be
    true in the future every time PM_EM gets support for other ways of
    loading the EM. Moreover, PM_EM is documented and has a debugfs
    interface which should help adding support for new platforms.

 3. it builds a consistent view of the EM of CPUs across kernel
    subsystems, which is a pre-requisite for any kind of future work
    aiming at a smarter power allocation using scheduler knowledge about
    the system for example.

[1] https://lore.kernel.org/lkml/20190204110952.16025-1-quentin.perret@arm.com/


Quentin Perret (3):
  arm64: defconfig: Enable CONFIG_ENERGY_MODEL
  PM / EM: Expose perf domain struct
  thermal: cpu_cooling: Migrate to using the EM framework

 arch/arm64/configs/defconfig  |   1 +
 drivers/thermal/Kconfig       |   1 +
 drivers/thermal/cpu_cooling.c | 238 ++++++++++++----------------------
 include/linux/energy_model.h  |   3 +-
 4 files changed, 84 insertions(+), 159 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v4 0/3] Make IPA use PM_EM
@ 2019-05-15  8:23 ` Quentin Perret
  0 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15  8:23 UTC (permalink / raw)
  To: edubezval, rui.zhang, javi.merino, viresh.kumar, amit.kachhap,
	rjw, will.deacon, catalin.marinas
  Cc: linux-pm, daniel.lezcano, linux-kernel, quentin.perret, mka,
	ionela.voinescu, dietmar.eggemann, linux-arm-kernel

Changes in v4:
**************
 - Added Viresh's Acked-by to all 3 patches
 - Improved commit message of patch 3/3 to explain how it has no
   functional impact on existing users (Eduardo)

Changes in v3:
**************
 - Changed warning message for unordered tables to something more
   explicit (Viresh)
 - Changed WARN() into a pr_err() for consistency

Changes in v2:
**************
 - Fixed patch 01/03 to actually enable CONFIG_ENERGY_MODEL
 - Added "depends on ENERGY_MODEL" to IPA (Daniel)
 - Added check to bail out if the freq table is unsorted (Viresh)

Cover letter:
*************

The Intelligent Power Allocator (IPA) thermal governor uses an Energy
Model (or EM) of the CPUs to re-distribute the power budget. To do so,
it builds a table of <frequency, power> tuples where the power values
are computed using the 'dynamic-power-coefficient' DT property. All of
this is done in and only for the thermal subsystem, and more
specifically for CPUs -- the power of other types of devices is obtained
differently.

Recently, the CPU scheduler has seen the introduction of Energy Aware
Scheduling (EAS) patches, which also rely on an EM of the CPUs. This EM,
however, is managed by an independent framework, called PM_EM, aimed to
be used by all kernel subsystems interested in the power consumed by
CPUs, and not only the scheduler.

This patch series follows this logic and removes the (now redundant)
thermal-specific EM computation code to migrate IPA to use PM_EM
instead.

Doing so should have no visible functional impact for existing users of
IPA since:

 - during the 5.1 development cycle, a series of patches [1] introduced
   in PM_OPP some infrastructure (dev_pm_opp_of_register_em()) enabling
   the registration of EMs in PM_EM using the DT property used by IPA;

 - the existing upstream cpufreq drivers marked with the
   'CPUFREQ_IS_COOLING_DEV' flag all call dev_pm_opp_of_register_em(),
   which means they all support PM_EM (the only two exceptions are
   qoriq-cpufreq which doesn't in fact use an EM and scmi-cpufreq which
   already supports PM_EM without using the PM_OPP infrastructurei
   because it read power costs directly from firmware);

So, migrating IPA to using PM_EM should effectively be just plumbing
since for the existing IPA users the PM_EM tables will contain the
exact same power values that IPA used to compute on its own until now.
The only new dependency is to compile in CONFIG_ENERGY_MODEL.

Why is this migration still a good thing ? For three main reasons.

 1. it removes redundant code;

 2. it introduces an abstraction layer between IPA and the EM
    computation. PM_EM offers to EAS and IPA (and potentially other
    clients) standardized EM tables and hides 'how' these tables have
    been obtained. PM_EM as of now supports power values either coming
    from the 'dynamic-power-coefficient' DT property or obtained
    directly from firmware using SCMI. The latter is a new feature for
    IPA and that comes 'for free' with the migration. This will also be
    true in the future every time PM_EM gets support for other ways of
    loading the EM. Moreover, PM_EM is documented and has a debugfs
    interface which should help adding support for new platforms.

 3. it builds a consistent view of the EM of CPUs across kernel
    subsystems, which is a pre-requisite for any kind of future work
    aiming at a smarter power allocation using scheduler knowledge about
    the system for example.

[1] https://lore.kernel.org/lkml/20190204110952.16025-1-quentin.perret@arm.com/


Quentin Perret (3):
  arm64: defconfig: Enable CONFIG_ENERGY_MODEL
  PM / EM: Expose perf domain struct
  thermal: cpu_cooling: Migrate to using the EM framework

 arch/arm64/configs/defconfig  |   1 +
 drivers/thermal/Kconfig       |   1 +
 drivers/thermal/cpu_cooling.c | 238 ++++++++++++----------------------
 include/linux/energy_model.h  |   3 +-
 4 files changed, 84 insertions(+), 159 deletions(-)

-- 
2.21.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v4 1/3] arm64: defconfig: Enable CONFIG_ENERGY_MODEL
  2019-05-15  8:23 ` Quentin Perret
@ 2019-05-15  8:23   ` Quentin Perret
  -1 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15  8:23 UTC (permalink / raw)
  To: edubezval, rui.zhang, javi.merino, viresh.kumar, amit.kachhap,
	rjw, will.deacon, catalin.marinas
  Cc: daniel.lezcano, dietmar.eggemann, ionela.voinescu, mka, linux-pm,
	linux-kernel, linux-arm-kernel, quentin.perret

The recently introduced Energy Model (EM) framework manages power cost
tables for the CPUs of the system. Its only user right now is the
scheduler, in the context of Energy Aware Scheduling (EAS).

However, the EM framework also offers a generic infrastructure that
could replace subsystem-specific implementations of the same concepts,
as this is the case in the thermal framework.

So, in order to prepare the migration of the thermal subsystem to use
the EM framework, enable it in the default arm64 defconfig, which is the
most commonly used architecture for IPA. This will also compile-in all
of the EAS code, although it won't be enabled by default -- EAS requires
to use the 'schedutil' CPUFreq governor while arm64 defaults to
'performance'.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 2d9c39033c1a..4881a752ab3f 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -97,6 +97,7 @@ CONFIG_XEN=y
 CONFIG_COMPAT=y
 CONFIG_HIBERNATION=y
 CONFIG_WQ_POWER_EFFICIENT_DEFAULT=y
+CONFIG_ENERGY_MODEL=y
 CONFIG_ARM_CPUIDLE=y
 CONFIG_CPU_FREQ=y
 CONFIG_CPU_FREQ_STAT=y
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 1/3] arm64: defconfig: Enable CONFIG_ENERGY_MODEL
@ 2019-05-15  8:23   ` Quentin Perret
  0 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15  8:23 UTC (permalink / raw)
  To: edubezval, rui.zhang, javi.merino, viresh.kumar, amit.kachhap,
	rjw, will.deacon, catalin.marinas
  Cc: linux-pm, daniel.lezcano, linux-kernel, quentin.perret, mka,
	ionela.voinescu, dietmar.eggemann, linux-arm-kernel

The recently introduced Energy Model (EM) framework manages power cost
tables for the CPUs of the system. Its only user right now is the
scheduler, in the context of Energy Aware Scheduling (EAS).

However, the EM framework also offers a generic infrastructure that
could replace subsystem-specific implementations of the same concepts,
as this is the case in the thermal framework.

So, in order to prepare the migration of the thermal subsystem to use
the EM framework, enable it in the default arm64 defconfig, which is the
most commonly used architecture for IPA. This will also compile-in all
of the EAS code, although it won't be enabled by default -- EAS requires
to use the 'schedutil' CPUFreq governor while arm64 defaults to
'performance'.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 2d9c39033c1a..4881a752ab3f 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -97,6 +97,7 @@ CONFIG_XEN=y
 CONFIG_COMPAT=y
 CONFIG_HIBERNATION=y
 CONFIG_WQ_POWER_EFFICIENT_DEFAULT=y
+CONFIG_ENERGY_MODEL=y
 CONFIG_ARM_CPUIDLE=y
 CONFIG_CPU_FREQ=y
 CONFIG_CPU_FREQ_STAT=y
-- 
2.21.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 2/3] PM / EM: Expose perf domain struct
  2019-05-15  8:23 ` Quentin Perret
@ 2019-05-15  8:23   ` Quentin Perret
  -1 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15  8:23 UTC (permalink / raw)
  To: edubezval, rui.zhang, javi.merino, viresh.kumar, amit.kachhap,
	rjw, will.deacon, catalin.marinas
  Cc: daniel.lezcano, dietmar.eggemann, ionela.voinescu, mka, linux-pm,
	linux-kernel, linux-arm-kernel, quentin.perret

In the current state, the perf_domain struct is fully defined only when
CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
with or without that option in the thermal framework, make sure to
actually define the struct regardless of the config option. That allows
to avoid using stubbed accessor functions all the time in code paths
that use the EM.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 include/linux/energy_model.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
index aa027f7bcb3e..fb32b86a467d 100644
--- a/include/linux/energy_model.h
+++ b/include/linux/energy_model.h
@@ -9,7 +9,6 @@
 #include <linux/sched/topology.h>
 #include <linux/types.h>
 
-#ifdef CONFIG_ENERGY_MODEL
 /**
  * em_cap_state - Capacity state of a performance domain
  * @frequency:	The CPU frequency in KHz, for consistency with CPUFreq
@@ -40,6 +39,7 @@ struct em_perf_domain {
 	unsigned long cpus[0];
 };
 
+#ifdef CONFIG_ENERGY_MODEL
 #define EM_CPU_MAX_POWER 0xFFFF
 
 struct em_data_callback {
@@ -160,7 +160,6 @@ static inline int em_pd_nr_cap_states(struct em_perf_domain *pd)
 }
 
 #else
-struct em_perf_domain {};
 struct em_data_callback {};
 #define EM_DATA_CB(_active_power_cb) { }
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15  8:23   ` Quentin Perret
  0 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15  8:23 UTC (permalink / raw)
  To: edubezval, rui.zhang, javi.merino, viresh.kumar, amit.kachhap,
	rjw, will.deacon, catalin.marinas
  Cc: linux-pm, daniel.lezcano, linux-kernel, quentin.perret, mka,
	ionela.voinescu, dietmar.eggemann, linux-arm-kernel

In the current state, the perf_domain struct is fully defined only when
CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
with or without that option in the thermal framework, make sure to
actually define the struct regardless of the config option. That allows
to avoid using stubbed accessor functions all the time in code paths
that use the EM.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 include/linux/energy_model.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
index aa027f7bcb3e..fb32b86a467d 100644
--- a/include/linux/energy_model.h
+++ b/include/linux/energy_model.h
@@ -9,7 +9,6 @@
 #include <linux/sched/topology.h>
 #include <linux/types.h>
 
-#ifdef CONFIG_ENERGY_MODEL
 /**
  * em_cap_state - Capacity state of a performance domain
  * @frequency:	The CPU frequency in KHz, for consistency with CPUFreq
@@ -40,6 +39,7 @@ struct em_perf_domain {
 	unsigned long cpus[0];
 };
 
+#ifdef CONFIG_ENERGY_MODEL
 #define EM_CPU_MAX_POWER 0xFFFF
 
 struct em_data_callback {
@@ -160,7 +160,6 @@ static inline int em_pd_nr_cap_states(struct em_perf_domain *pd)
 }
 
 #else
-struct em_perf_domain {};
 struct em_data_callback {};
 #define EM_DATA_CB(_active_power_cb) { }
 
-- 
2.21.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 3/3] thermal: cpu_cooling: Migrate to using the EM framework
  2019-05-15  8:23 ` Quentin Perret
@ 2019-05-15  8:23   ` Quentin Perret
  -1 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15  8:23 UTC (permalink / raw)
  To: edubezval, rui.zhang, javi.merino, viresh.kumar, amit.kachhap,
	rjw, will.deacon, catalin.marinas
  Cc: daniel.lezcano, dietmar.eggemann, ionela.voinescu, mka, linux-pm,
	linux-kernel, linux-arm-kernel, quentin.perret

The newly introduced Energy Model framework manages power cost tables in
a generic way. Moreover, it supports several types of models since the
tables can come from DT or firmware (through SCMI) for example. On the
other hand, the cpu_cooling subsystem manages its own power cost tables
using only DT data.

In order to avoid the duplication of data in the kernel, and in order to
enable IPA with EMs coming from more than just DT, remove the private
tables from cpu_cooling.c and migrate it to using the centralized EM
framework. Doing so should have no visible functional impact for
existing users of IPA since:

 - recent extenstions to the the PM_OPP infrastructure enable the
   registration of EMs in PM_EM using the DT property used by IPA;

 - the existing upstream cpufreq drivers marked with the
   'CPUFREQ_IS_COOLING_DEV' flag all use the aforementioned PM_OPP
   infrastructure, which means they all support PM_EM. The only two
   exceptions are qoriq-cpufreq which doesn't in fact use an EM and
   scmi-cpufreq which doesn't use DT for power costs.

For existing users of cpu_cooling, PM_EM tables will contain the exact
same power values that IPA used to compute on its own until now. The
only new dependency for them is to compile in CONFIG_ENERGY_MODEL.

The case where the thermal subsystem is used without an Energy Model
(cpufreq_cooling_ops) is handled by looking directly at CPUFreq's
frequency table which is already a dependency for cpu_cooling.c anyway.
Since the thermal framework expects the cooling states in a particular
order, bail out whenever the CPUFreq table is unsorted, since that is
fairly uncommon in general, and there are currently no users of
cpu_cooling for this use-case.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 drivers/thermal/Kconfig       |   1 +
 drivers/thermal/cpu_cooling.c | 238 ++++++++++++----------------------
 2 files changed, 82 insertions(+), 157 deletions(-)

diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 653aa27a25a4..d695bd33c440 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -144,6 +144,7 @@ config THERMAL_GOV_USER_SPACE
 
 config THERMAL_GOV_POWER_ALLOCATOR
 	bool "Power allocator thermal governor"
+	depends on ENERGY_MODEL
 	help
 	  Enable this to manage platform thermals by dynamically
 	  allocating and limiting power to devices.
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index f7c1f49ec87f..322ea89dd078 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -31,6 +31,7 @@
 #include <linux/slab.h>
 #include <linux/cpu.h>
 #include <linux/cpu_cooling.h>
+#include <linux/energy_model.h>
 
 #include <trace/events/thermal.h>
 
@@ -48,19 +49,6 @@
  *	...
  */
 
-/**
- * struct freq_table - frequency table along with power entries
- * @frequency:	frequency in KHz
- * @power:	power in mW
- *
- * This structure is built when the cooling device registers and helps
- * in translating frequency to power and vice versa.
- */
-struct freq_table {
-	u32 frequency;
-	u32 power;
-};
-
 /**
  * struct time_in_idle - Idle time stats
  * @time: previous reading of the absolute time that this cpu was idle
@@ -82,7 +70,7 @@ struct time_in_idle {
  *	frequency.
  * @max_level: maximum cooling level. One less than total number of valid
  *	cpufreq frequencies.
- * @freq_table: Freq table in descending order of frequencies
+ * @em: Reference on the Energy Model of the device
  * @cdev: thermal_cooling_device pointer to keep track of the
  *	registered cooling device.
  * @policy: cpufreq policy.
@@ -98,7 +86,7 @@ struct cpufreq_cooling_device {
 	unsigned int cpufreq_state;
 	unsigned int clipped_freq;
 	unsigned int max_level;
-	struct freq_table *freq_table;	/* In descending order */
+	struct em_perf_domain *em;
 	struct thermal_cooling_device *cdev;
 	struct cpufreq_policy *policy;
 	struct list_head node;
@@ -121,14 +109,14 @@ static LIST_HEAD(cpufreq_cdev_list);
 static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
 			       unsigned int freq)
 {
-	struct freq_table *freq_table = cpufreq_cdev->freq_table;
-	unsigned long level;
+	int i;
 
-	for (level = 1; level <= cpufreq_cdev->max_level; level++)
-		if (freq > freq_table[level].frequency)
+	for (i = cpufreq_cdev->max_level - 1; i >= 0; i--) {
+		if (freq > cpufreq_cdev->em->table[i].frequency)
 			break;
+	}
 
-	return level - 1;
+	return cpufreq_cdev->max_level - i - 1;
 }
 
 /**
@@ -184,105 +172,30 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
 	return NOTIFY_OK;
 }
 
-/**
- * update_freq_table() - Update the freq table with power numbers
- * @cpufreq_cdev:	the cpufreq cooling device in which to update the table
- * @capacitance: dynamic power coefficient for these cpus
- *
- * Update the freq table with power numbers.  This table will be used in
- * cpu_power_to_freq() and cpu_freq_to_power() to convert between power and
- * frequency efficiently.  Power is stored in mW, frequency in KHz.  The
- * resulting table is in descending order.
- *
- * Return: 0 on success, -EINVAL if there are no OPPs for any CPUs,
- * or -ENOMEM if we run out of memory.
- */
-static int update_freq_table(struct cpufreq_cooling_device *cpufreq_cdev,
-			     u32 capacitance)
-{
-	struct freq_table *freq_table = cpufreq_cdev->freq_table;
-	struct dev_pm_opp *opp;
-	struct device *dev = NULL;
-	int num_opps = 0, cpu = cpufreq_cdev->policy->cpu, i;
-
-	dev = get_cpu_device(cpu);
-	if (unlikely(!dev)) {
-		dev_warn(&cpufreq_cdev->cdev->device,
-			 "No cpu device for cpu %d\n", cpu);
-		return -ENODEV;
-	}
-
-	num_opps = dev_pm_opp_get_opp_count(dev);
-	if (num_opps < 0)
-		return num_opps;
-
-	/*
-	 * The cpufreq table is also built from the OPP table and so the count
-	 * should match.
-	 */
-	if (num_opps != cpufreq_cdev->max_level + 1) {
-		dev_warn(dev, "Number of OPPs not matching with max_levels\n");
-		return -EINVAL;
-	}
-
-	for (i = 0; i <= cpufreq_cdev->max_level; i++) {
-		unsigned long freq = freq_table[i].frequency * 1000;
-		u32 freq_mhz = freq_table[i].frequency / 1000;
-		u64 power;
-		u32 voltage_mv;
-
-		/*
-		 * Find ceil frequency as 'freq' may be slightly lower than OPP
-		 * freq due to truncation while converting to kHz.
-		 */
-		opp = dev_pm_opp_find_freq_ceil(dev, &freq);
-		if (IS_ERR(opp)) {
-			dev_err(dev, "failed to get opp for %lu frequency\n",
-				freq);
-			return -EINVAL;
-		}
-
-		voltage_mv = dev_pm_opp_get_voltage(opp) / 1000;
-		dev_pm_opp_put(opp);
-
-		/*
-		 * Do the multiplication with MHz and millivolt so as
-		 * to not overflow.
-		 */
-		power = (u64)capacitance * freq_mhz * voltage_mv * voltage_mv;
-		do_div(power, 1000000000);
-
-		/* power is stored in mW */
-		freq_table[i].power = power;
-	}
-
-	return 0;
-}
-
 static u32 cpu_freq_to_power(struct cpufreq_cooling_device *cpufreq_cdev,
 			     u32 freq)
 {
 	int i;
-	struct freq_table *freq_table = cpufreq_cdev->freq_table;
 
-	for (i = 1; i <= cpufreq_cdev->max_level; i++)
-		if (freq > freq_table[i].frequency)
+	for (i = cpufreq_cdev->max_level - 1; i >= 0; i--) {
+		if (freq > cpufreq_cdev->em->table[i].frequency)
 			break;
+	}
 
-	return freq_table[i - 1].power;
+	return cpufreq_cdev->em->table[i + 1].power;
 }
 
 static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_cdev,
 			     u32 power)
 {
 	int i;
-	struct freq_table *freq_table = cpufreq_cdev->freq_table;
 
-	for (i = 1; i <= cpufreq_cdev->max_level; i++)
-		if (power > freq_table[i].power)
+	for (i = cpufreq_cdev->max_level - 1; i >= 0; i--) {
+		if (power > cpufreq_cdev->em->table[i].power)
 			break;
+	}
 
-	return freq_table[i - 1].frequency;
+	return cpufreq_cdev->em->table[i + 1].frequency;
 }
 
 /**
@@ -374,6 +287,28 @@ static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
 	return 0;
 }
 
+static unsigned int get_state_freq(struct cpufreq_cooling_device *cpufreq_cdev,
+			      unsigned long state)
+{
+	struct cpufreq_policy *policy;
+	unsigned long idx;
+
+	/* Use the Energy Model table if available */
+	if (cpufreq_cdev->em) {
+		idx = cpufreq_cdev->max_level - state;
+		return cpufreq_cdev->em->table[idx].frequency;
+	}
+
+	/* Otherwise, fallback on the CPUFreq table */
+	policy = cpufreq_cdev->policy;
+	if (policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_ASCENDING)
+		idx = cpufreq_cdev->max_level - state;
+	else
+		idx = state;
+
+	return policy->freq_table[idx].frequency;
+}
+
 /**
  * cpufreq_set_cur_state - callback function to set the current cooling state.
  * @cdev: thermal cooling device pointer.
@@ -398,7 +333,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
 	if (cpufreq_cdev->cpufreq_state == state)
 		return 0;
 
-	clip_freq = cpufreq_cdev->freq_table[state].frequency;
+	clip_freq = get_state_freq(cpufreq_cdev, state);
 	cpufreq_cdev->cpufreq_state = state;
 	cpufreq_cdev->clipped_freq = clip_freq;
 
@@ -497,7 +432,7 @@ static int cpufreq_state2power(struct thermal_cooling_device *cdev,
 			       struct thermal_zone_device *tz,
 			       unsigned long state, u32 *power)
 {
-	unsigned int freq, num_cpus;
+	unsigned int freq, num_cpus, idx;
 	struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
 
 	/* Request state should be less than max_level */
@@ -506,7 +441,8 @@ static int cpufreq_state2power(struct thermal_cooling_device *cdev,
 
 	num_cpus = cpumask_weight(cpufreq_cdev->policy->cpus);
 
-	freq = cpufreq_cdev->freq_table[state].frequency;
+	idx = cpufreq_cdev->max_level - state;
+	freq = cpufreq_cdev->em->table[idx].frequency;
 	*power = cpu_freq_to_power(cpufreq_cdev, freq) * num_cpus;
 
 	return 0;
@@ -559,7 +495,6 @@ static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
 	.get_cur_state = cpufreq_get_cur_state,
 	.set_cur_state = cpufreq_set_cur_state,
 };
-
 static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
 	.get_max_state		= cpufreq_get_max_state,
 	.get_cur_state		= cpufreq_get_cur_state,
@@ -574,18 +509,31 @@ static struct notifier_block thermal_cpufreq_notifier_block = {
 	.notifier_call = cpufreq_thermal_notifier,
 };
 
-static unsigned int find_next_max(struct cpufreq_frequency_table *table,
-				  unsigned int prev_max)
-{
-	struct cpufreq_frequency_table *pos;
-	unsigned int max = 0;
+static inline bool em_is_sane(struct cpufreq_cooling_device *cpufreq_cdev,
+			      struct em_perf_domain *em) {
+	struct cpufreq_policy *policy;
+	unsigned int nr_levels;
 
-	cpufreq_for_each_valid_entry(pos, table) {
-		if (pos->frequency > max && pos->frequency < prev_max)
-			max = pos->frequency;
+	if (!em)
+		return false;
+
+	policy = cpufreq_cdev->policy;
+	if (!cpumask_equal(policy->related_cpus, to_cpumask(em->cpus))) {
+		pr_err("The span of pd %*pbl is misaligned with cpufreq policy %*pbl\n",
+			cpumask_pr_args(to_cpumask(em->cpus)),
+			cpumask_pr_args(policy->related_cpus));
+		return false;
 	}
 
-	return max;
+	nr_levels = cpufreq_cdev->max_level + 1;
+	if (em->nr_cap_states != nr_levels) {
+		pr_err("The number of cap states in pd %*pbl (%u) doesn't match the number of cooling levels (%u)\n",
+			cpumask_pr_args(to_cpumask(em->cpus)),
+			em->nr_cap_states, nr_levels);
+		return false;
+	}
+
+	return true;
 }
 
 /**
@@ -593,7 +541,7 @@ static unsigned int find_next_max(struct cpufreq_frequency_table *table,
  * @np: a valid struct device_node to the cooling device device tree node
  * @policy: cpufreq policy
  * Normally this should be same as cpufreq policy->related_cpus.
- * @capacitance: dynamic power coefficient for these cpus
+ * @em: Energy Model of the cpufreq policy
  *
  * This interface function registers the cpufreq cooling device with the name
  * "thermal-cpufreq-%x". This api can support multiple instances of cpufreq
@@ -605,12 +553,13 @@ static unsigned int find_next_max(struct cpufreq_frequency_table *table,
  */
 static struct thermal_cooling_device *
 __cpufreq_cooling_register(struct device_node *np,
-			struct cpufreq_policy *policy, u32 capacitance)
+			struct cpufreq_policy *policy,
+			struct em_perf_domain *em)
 {
 	struct thermal_cooling_device *cdev;
 	struct cpufreq_cooling_device *cpufreq_cdev;
 	char dev_name[THERMAL_NAME_LENGTH];
-	unsigned int freq, i, num_cpus;
+	unsigned int i, num_cpus;
 	int ret;
 	struct thermal_cooling_device_ops *cooling_ops;
 	bool first;
@@ -644,46 +593,26 @@ __cpufreq_cooling_register(struct device_node *np,
 	/* max_level is an index, not a counter */
 	cpufreq_cdev->max_level = i - 1;
 
-	cpufreq_cdev->freq_table = kmalloc_array(i,
-					sizeof(*cpufreq_cdev->freq_table),
-					GFP_KERNEL);
-	if (!cpufreq_cdev->freq_table) {
-		cdev = ERR_PTR(-ENOMEM);
-		goto free_idle_time;
-	}
-
 	ret = ida_simple_get(&cpufreq_ida, 0, 0, GFP_KERNEL);
 	if (ret < 0) {
 		cdev = ERR_PTR(ret);
-		goto free_table;
+		goto free_idle_time;
 	}
 	cpufreq_cdev->id = ret;
 
 	snprintf(dev_name, sizeof(dev_name), "thermal-cpufreq-%d",
 		 cpufreq_cdev->id);
 
-	/* Fill freq-table in descending order of frequencies */
-	for (i = 0, freq = -1; i <= cpufreq_cdev->max_level; i++) {
-		freq = find_next_max(policy->freq_table, freq);
-		cpufreq_cdev->freq_table[i].frequency = freq;
-
-		/* Warn for duplicate entries */
-		if (!freq)
-			pr_warn("%s: table has duplicate entries\n", __func__);
-		else
-			pr_debug("%s: freq:%u KHz\n", __func__, freq);
-	}
-
-	if (capacitance) {
-		ret = update_freq_table(cpufreq_cdev, capacitance);
-		if (ret) {
-			cdev = ERR_PTR(ret);
-			goto remove_ida;
-		}
-
+	if (em_is_sane(cpufreq_cdev, em)) {
+		cpufreq_cdev->em = em;
 		cooling_ops = &cpufreq_power_cooling_ops;
-	} else {
+	} else if (policy->freq_table_sorted != CPUFREQ_TABLE_UNSORTED) {
 		cooling_ops = &cpufreq_cooling_ops;
+	} else {
+		pr_err("%s: unsorted frequency tables are not supported\n",
+				__func__);
+		cdev = ERR_PTR(-EINVAL);
+		goto remove_ida;
 	}
 
 	cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
@@ -691,7 +620,7 @@ __cpufreq_cooling_register(struct device_node *np,
 	if (IS_ERR(cdev))
 		goto remove_ida;
 
-	cpufreq_cdev->clipped_freq = cpufreq_cdev->freq_table[0].frequency;
+	cpufreq_cdev->clipped_freq = get_state_freq(cpufreq_cdev, 0);
 	cpufreq_cdev->cdev = cdev;
 
 	mutex_lock(&cooling_list_lock);
@@ -708,8 +637,6 @@ __cpufreq_cooling_register(struct device_node *np,
 
 remove_ida:
 	ida_simple_remove(&cpufreq_ida, cpufreq_cdev->id);
-free_table:
-	kfree(cpufreq_cdev->freq_table);
 free_idle_time:
 	kfree(cpufreq_cdev->idle_time);
 free_cdev:
@@ -731,7 +658,7 @@ __cpufreq_cooling_register(struct device_node *np,
 struct thermal_cooling_device *
 cpufreq_cooling_register(struct cpufreq_policy *policy)
 {
-	return __cpufreq_cooling_register(NULL, policy, 0);
+	return __cpufreq_cooling_register(NULL, policy, NULL);
 }
 EXPORT_SYMBOL_GPL(cpufreq_cooling_register);
 
@@ -759,7 +686,6 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 {
 	struct device_node *np = of_get_cpu_node(policy->cpu, NULL);
 	struct thermal_cooling_device *cdev = NULL;
-	u32 capacitance = 0;
 
 	if (!np) {
 		pr_err("cpu_cooling: OF node not available for cpu%d\n",
@@ -768,10 +694,9 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 	}
 
 	if (of_find_property(np, "#cooling-cells", NULL)) {
-		of_property_read_u32(np, "dynamic-power-coefficient",
-				     &capacitance);
+		struct em_perf_domain *em = em_cpu_get(policy->cpu);
 
-		cdev = __cpufreq_cooling_register(np, policy, capacitance);
+		cdev = __cpufreq_cooling_register(np, policy, em);
 		if (IS_ERR(cdev)) {
 			pr_err("cpu_cooling: cpu%d failed to register as cooling device: %ld\n",
 			       policy->cpu, PTR_ERR(cdev));
@@ -813,7 +738,6 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
 	thermal_cooling_device_unregister(cpufreq_cdev->cdev);
 	ida_simple_remove(&cpufreq_ida, cpufreq_cdev->id);
 	kfree(cpufreq_cdev->idle_time);
-	kfree(cpufreq_cdev->freq_table);
 	kfree(cpufreq_cdev);
 }
 EXPORT_SYMBOL_GPL(cpufreq_cooling_unregister);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v4 3/3] thermal: cpu_cooling: Migrate to using the EM framework
@ 2019-05-15  8:23   ` Quentin Perret
  0 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15  8:23 UTC (permalink / raw)
  To: edubezval, rui.zhang, javi.merino, viresh.kumar, amit.kachhap,
	rjw, will.deacon, catalin.marinas
  Cc: linux-pm, daniel.lezcano, linux-kernel, quentin.perret, mka,
	ionela.voinescu, dietmar.eggemann, linux-arm-kernel

The newly introduced Energy Model framework manages power cost tables in
a generic way. Moreover, it supports several types of models since the
tables can come from DT or firmware (through SCMI) for example. On the
other hand, the cpu_cooling subsystem manages its own power cost tables
using only DT data.

In order to avoid the duplication of data in the kernel, and in order to
enable IPA with EMs coming from more than just DT, remove the private
tables from cpu_cooling.c and migrate it to using the centralized EM
framework. Doing so should have no visible functional impact for
existing users of IPA since:

 - recent extenstions to the the PM_OPP infrastructure enable the
   registration of EMs in PM_EM using the DT property used by IPA;

 - the existing upstream cpufreq drivers marked with the
   'CPUFREQ_IS_COOLING_DEV' flag all use the aforementioned PM_OPP
   infrastructure, which means they all support PM_EM. The only two
   exceptions are qoriq-cpufreq which doesn't in fact use an EM and
   scmi-cpufreq which doesn't use DT for power costs.

For existing users of cpu_cooling, PM_EM tables will contain the exact
same power values that IPA used to compute on its own until now. The
only new dependency for them is to compile in CONFIG_ENERGY_MODEL.

The case where the thermal subsystem is used without an Energy Model
(cpufreq_cooling_ops) is handled by looking directly at CPUFreq's
frequency table which is already a dependency for cpu_cooling.c anyway.
Since the thermal framework expects the cooling states in a particular
order, bail out whenever the CPUFreq table is unsorted, since that is
fairly uncommon in general, and there are currently no users of
cpu_cooling for this use-case.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 drivers/thermal/Kconfig       |   1 +
 drivers/thermal/cpu_cooling.c | 238 ++++++++++++----------------------
 2 files changed, 82 insertions(+), 157 deletions(-)

diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 653aa27a25a4..d695bd33c440 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -144,6 +144,7 @@ config THERMAL_GOV_USER_SPACE
 
 config THERMAL_GOV_POWER_ALLOCATOR
 	bool "Power allocator thermal governor"
+	depends on ENERGY_MODEL
 	help
 	  Enable this to manage platform thermals by dynamically
 	  allocating and limiting power to devices.
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index f7c1f49ec87f..322ea89dd078 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -31,6 +31,7 @@
 #include <linux/slab.h>
 #include <linux/cpu.h>
 #include <linux/cpu_cooling.h>
+#include <linux/energy_model.h>
 
 #include <trace/events/thermal.h>
 
@@ -48,19 +49,6 @@
  *	...
  */
 
-/**
- * struct freq_table - frequency table along with power entries
- * @frequency:	frequency in KHz
- * @power:	power in mW
- *
- * This structure is built when the cooling device registers and helps
- * in translating frequency to power and vice versa.
- */
-struct freq_table {
-	u32 frequency;
-	u32 power;
-};
-
 /**
  * struct time_in_idle - Idle time stats
  * @time: previous reading of the absolute time that this cpu was idle
@@ -82,7 +70,7 @@ struct time_in_idle {
  *	frequency.
  * @max_level: maximum cooling level. One less than total number of valid
  *	cpufreq frequencies.
- * @freq_table: Freq table in descending order of frequencies
+ * @em: Reference on the Energy Model of the device
  * @cdev: thermal_cooling_device pointer to keep track of the
  *	registered cooling device.
  * @policy: cpufreq policy.
@@ -98,7 +86,7 @@ struct cpufreq_cooling_device {
 	unsigned int cpufreq_state;
 	unsigned int clipped_freq;
 	unsigned int max_level;
-	struct freq_table *freq_table;	/* In descending order */
+	struct em_perf_domain *em;
 	struct thermal_cooling_device *cdev;
 	struct cpufreq_policy *policy;
 	struct list_head node;
@@ -121,14 +109,14 @@ static LIST_HEAD(cpufreq_cdev_list);
 static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
 			       unsigned int freq)
 {
-	struct freq_table *freq_table = cpufreq_cdev->freq_table;
-	unsigned long level;
+	int i;
 
-	for (level = 1; level <= cpufreq_cdev->max_level; level++)
-		if (freq > freq_table[level].frequency)
+	for (i = cpufreq_cdev->max_level - 1; i >= 0; i--) {
+		if (freq > cpufreq_cdev->em->table[i].frequency)
 			break;
+	}
 
-	return level - 1;
+	return cpufreq_cdev->max_level - i - 1;
 }
 
 /**
@@ -184,105 +172,30 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
 	return NOTIFY_OK;
 }
 
-/**
- * update_freq_table() - Update the freq table with power numbers
- * @cpufreq_cdev:	the cpufreq cooling device in which to update the table
- * @capacitance: dynamic power coefficient for these cpus
- *
- * Update the freq table with power numbers.  This table will be used in
- * cpu_power_to_freq() and cpu_freq_to_power() to convert between power and
- * frequency efficiently.  Power is stored in mW, frequency in KHz.  The
- * resulting table is in descending order.
- *
- * Return: 0 on success, -EINVAL if there are no OPPs for any CPUs,
- * or -ENOMEM if we run out of memory.
- */
-static int update_freq_table(struct cpufreq_cooling_device *cpufreq_cdev,
-			     u32 capacitance)
-{
-	struct freq_table *freq_table = cpufreq_cdev->freq_table;
-	struct dev_pm_opp *opp;
-	struct device *dev = NULL;
-	int num_opps = 0, cpu = cpufreq_cdev->policy->cpu, i;
-
-	dev = get_cpu_device(cpu);
-	if (unlikely(!dev)) {
-		dev_warn(&cpufreq_cdev->cdev->device,
-			 "No cpu device for cpu %d\n", cpu);
-		return -ENODEV;
-	}
-
-	num_opps = dev_pm_opp_get_opp_count(dev);
-	if (num_opps < 0)
-		return num_opps;
-
-	/*
-	 * The cpufreq table is also built from the OPP table and so the count
-	 * should match.
-	 */
-	if (num_opps != cpufreq_cdev->max_level + 1) {
-		dev_warn(dev, "Number of OPPs not matching with max_levels\n");
-		return -EINVAL;
-	}
-
-	for (i = 0; i <= cpufreq_cdev->max_level; i++) {
-		unsigned long freq = freq_table[i].frequency * 1000;
-		u32 freq_mhz = freq_table[i].frequency / 1000;
-		u64 power;
-		u32 voltage_mv;
-
-		/*
-		 * Find ceil frequency as 'freq' may be slightly lower than OPP
-		 * freq due to truncation while converting to kHz.
-		 */
-		opp = dev_pm_opp_find_freq_ceil(dev, &freq);
-		if (IS_ERR(opp)) {
-			dev_err(dev, "failed to get opp for %lu frequency\n",
-				freq);
-			return -EINVAL;
-		}
-
-		voltage_mv = dev_pm_opp_get_voltage(opp) / 1000;
-		dev_pm_opp_put(opp);
-
-		/*
-		 * Do the multiplication with MHz and millivolt so as
-		 * to not overflow.
-		 */
-		power = (u64)capacitance * freq_mhz * voltage_mv * voltage_mv;
-		do_div(power, 1000000000);
-
-		/* power is stored in mW */
-		freq_table[i].power = power;
-	}
-
-	return 0;
-}
-
 static u32 cpu_freq_to_power(struct cpufreq_cooling_device *cpufreq_cdev,
 			     u32 freq)
 {
 	int i;
-	struct freq_table *freq_table = cpufreq_cdev->freq_table;
 
-	for (i = 1; i <= cpufreq_cdev->max_level; i++)
-		if (freq > freq_table[i].frequency)
+	for (i = cpufreq_cdev->max_level - 1; i >= 0; i--) {
+		if (freq > cpufreq_cdev->em->table[i].frequency)
 			break;
+	}
 
-	return freq_table[i - 1].power;
+	return cpufreq_cdev->em->table[i + 1].power;
 }
 
 static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_cdev,
 			     u32 power)
 {
 	int i;
-	struct freq_table *freq_table = cpufreq_cdev->freq_table;
 
-	for (i = 1; i <= cpufreq_cdev->max_level; i++)
-		if (power > freq_table[i].power)
+	for (i = cpufreq_cdev->max_level - 1; i >= 0; i--) {
+		if (power > cpufreq_cdev->em->table[i].power)
 			break;
+	}
 
-	return freq_table[i - 1].frequency;
+	return cpufreq_cdev->em->table[i + 1].frequency;
 }
 
 /**
@@ -374,6 +287,28 @@ static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
 	return 0;
 }
 
+static unsigned int get_state_freq(struct cpufreq_cooling_device *cpufreq_cdev,
+			      unsigned long state)
+{
+	struct cpufreq_policy *policy;
+	unsigned long idx;
+
+	/* Use the Energy Model table if available */
+	if (cpufreq_cdev->em) {
+		idx = cpufreq_cdev->max_level - state;
+		return cpufreq_cdev->em->table[idx].frequency;
+	}
+
+	/* Otherwise, fallback on the CPUFreq table */
+	policy = cpufreq_cdev->policy;
+	if (policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_ASCENDING)
+		idx = cpufreq_cdev->max_level - state;
+	else
+		idx = state;
+
+	return policy->freq_table[idx].frequency;
+}
+
 /**
  * cpufreq_set_cur_state - callback function to set the current cooling state.
  * @cdev: thermal cooling device pointer.
@@ -398,7 +333,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
 	if (cpufreq_cdev->cpufreq_state == state)
 		return 0;
 
-	clip_freq = cpufreq_cdev->freq_table[state].frequency;
+	clip_freq = get_state_freq(cpufreq_cdev, state);
 	cpufreq_cdev->cpufreq_state = state;
 	cpufreq_cdev->clipped_freq = clip_freq;
 
@@ -497,7 +432,7 @@ static int cpufreq_state2power(struct thermal_cooling_device *cdev,
 			       struct thermal_zone_device *tz,
 			       unsigned long state, u32 *power)
 {
-	unsigned int freq, num_cpus;
+	unsigned int freq, num_cpus, idx;
 	struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
 
 	/* Request state should be less than max_level */
@@ -506,7 +441,8 @@ static int cpufreq_state2power(struct thermal_cooling_device *cdev,
 
 	num_cpus = cpumask_weight(cpufreq_cdev->policy->cpus);
 
-	freq = cpufreq_cdev->freq_table[state].frequency;
+	idx = cpufreq_cdev->max_level - state;
+	freq = cpufreq_cdev->em->table[idx].frequency;
 	*power = cpu_freq_to_power(cpufreq_cdev, freq) * num_cpus;
 
 	return 0;
@@ -559,7 +495,6 @@ static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
 	.get_cur_state = cpufreq_get_cur_state,
 	.set_cur_state = cpufreq_set_cur_state,
 };
-
 static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
 	.get_max_state		= cpufreq_get_max_state,
 	.get_cur_state		= cpufreq_get_cur_state,
@@ -574,18 +509,31 @@ static struct notifier_block thermal_cpufreq_notifier_block = {
 	.notifier_call = cpufreq_thermal_notifier,
 };
 
-static unsigned int find_next_max(struct cpufreq_frequency_table *table,
-				  unsigned int prev_max)
-{
-	struct cpufreq_frequency_table *pos;
-	unsigned int max = 0;
+static inline bool em_is_sane(struct cpufreq_cooling_device *cpufreq_cdev,
+			      struct em_perf_domain *em) {
+	struct cpufreq_policy *policy;
+	unsigned int nr_levels;
 
-	cpufreq_for_each_valid_entry(pos, table) {
-		if (pos->frequency > max && pos->frequency < prev_max)
-			max = pos->frequency;
+	if (!em)
+		return false;
+
+	policy = cpufreq_cdev->policy;
+	if (!cpumask_equal(policy->related_cpus, to_cpumask(em->cpus))) {
+		pr_err("The span of pd %*pbl is misaligned with cpufreq policy %*pbl\n",
+			cpumask_pr_args(to_cpumask(em->cpus)),
+			cpumask_pr_args(policy->related_cpus));
+		return false;
 	}
 
-	return max;
+	nr_levels = cpufreq_cdev->max_level + 1;
+	if (em->nr_cap_states != nr_levels) {
+		pr_err("The number of cap states in pd %*pbl (%u) doesn't match the number of cooling levels (%u)\n",
+			cpumask_pr_args(to_cpumask(em->cpus)),
+			em->nr_cap_states, nr_levels);
+		return false;
+	}
+
+	return true;
 }
 
 /**
@@ -593,7 +541,7 @@ static unsigned int find_next_max(struct cpufreq_frequency_table *table,
  * @np: a valid struct device_node to the cooling device device tree node
  * @policy: cpufreq policy
  * Normally this should be same as cpufreq policy->related_cpus.
- * @capacitance: dynamic power coefficient for these cpus
+ * @em: Energy Model of the cpufreq policy
  *
  * This interface function registers the cpufreq cooling device with the name
  * "thermal-cpufreq-%x". This api can support multiple instances of cpufreq
@@ -605,12 +553,13 @@ static unsigned int find_next_max(struct cpufreq_frequency_table *table,
  */
 static struct thermal_cooling_device *
 __cpufreq_cooling_register(struct device_node *np,
-			struct cpufreq_policy *policy, u32 capacitance)
+			struct cpufreq_policy *policy,
+			struct em_perf_domain *em)
 {
 	struct thermal_cooling_device *cdev;
 	struct cpufreq_cooling_device *cpufreq_cdev;
 	char dev_name[THERMAL_NAME_LENGTH];
-	unsigned int freq, i, num_cpus;
+	unsigned int i, num_cpus;
 	int ret;
 	struct thermal_cooling_device_ops *cooling_ops;
 	bool first;
@@ -644,46 +593,26 @@ __cpufreq_cooling_register(struct device_node *np,
 	/* max_level is an index, not a counter */
 	cpufreq_cdev->max_level = i - 1;
 
-	cpufreq_cdev->freq_table = kmalloc_array(i,
-					sizeof(*cpufreq_cdev->freq_table),
-					GFP_KERNEL);
-	if (!cpufreq_cdev->freq_table) {
-		cdev = ERR_PTR(-ENOMEM);
-		goto free_idle_time;
-	}
-
 	ret = ida_simple_get(&cpufreq_ida, 0, 0, GFP_KERNEL);
 	if (ret < 0) {
 		cdev = ERR_PTR(ret);
-		goto free_table;
+		goto free_idle_time;
 	}
 	cpufreq_cdev->id = ret;
 
 	snprintf(dev_name, sizeof(dev_name), "thermal-cpufreq-%d",
 		 cpufreq_cdev->id);
 
-	/* Fill freq-table in descending order of frequencies */
-	for (i = 0, freq = -1; i <= cpufreq_cdev->max_level; i++) {
-		freq = find_next_max(policy->freq_table, freq);
-		cpufreq_cdev->freq_table[i].frequency = freq;
-
-		/* Warn for duplicate entries */
-		if (!freq)
-			pr_warn("%s: table has duplicate entries\n", __func__);
-		else
-			pr_debug("%s: freq:%u KHz\n", __func__, freq);
-	}
-
-	if (capacitance) {
-		ret = update_freq_table(cpufreq_cdev, capacitance);
-		if (ret) {
-			cdev = ERR_PTR(ret);
-			goto remove_ida;
-		}
-
+	if (em_is_sane(cpufreq_cdev, em)) {
+		cpufreq_cdev->em = em;
 		cooling_ops = &cpufreq_power_cooling_ops;
-	} else {
+	} else if (policy->freq_table_sorted != CPUFREQ_TABLE_UNSORTED) {
 		cooling_ops = &cpufreq_cooling_ops;
+	} else {
+		pr_err("%s: unsorted frequency tables are not supported\n",
+				__func__);
+		cdev = ERR_PTR(-EINVAL);
+		goto remove_ida;
 	}
 
 	cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
@@ -691,7 +620,7 @@ __cpufreq_cooling_register(struct device_node *np,
 	if (IS_ERR(cdev))
 		goto remove_ida;
 
-	cpufreq_cdev->clipped_freq = cpufreq_cdev->freq_table[0].frequency;
+	cpufreq_cdev->clipped_freq = get_state_freq(cpufreq_cdev, 0);
 	cpufreq_cdev->cdev = cdev;
 
 	mutex_lock(&cooling_list_lock);
@@ -708,8 +637,6 @@ __cpufreq_cooling_register(struct device_node *np,
 
 remove_ida:
 	ida_simple_remove(&cpufreq_ida, cpufreq_cdev->id);
-free_table:
-	kfree(cpufreq_cdev->freq_table);
 free_idle_time:
 	kfree(cpufreq_cdev->idle_time);
 free_cdev:
@@ -731,7 +658,7 @@ __cpufreq_cooling_register(struct device_node *np,
 struct thermal_cooling_device *
 cpufreq_cooling_register(struct cpufreq_policy *policy)
 {
-	return __cpufreq_cooling_register(NULL, policy, 0);
+	return __cpufreq_cooling_register(NULL, policy, NULL);
 }
 EXPORT_SYMBOL_GPL(cpufreq_cooling_register);
 
@@ -759,7 +686,6 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 {
 	struct device_node *np = of_get_cpu_node(policy->cpu, NULL);
 	struct thermal_cooling_device *cdev = NULL;
-	u32 capacitance = 0;
 
 	if (!np) {
 		pr_err("cpu_cooling: OF node not available for cpu%d\n",
@@ -768,10 +694,9 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 	}
 
 	if (of_find_property(np, "#cooling-cells", NULL)) {
-		of_property_read_u32(np, "dynamic-power-coefficient",
-				     &capacitance);
+		struct em_perf_domain *em = em_cpu_get(policy->cpu);
 
-		cdev = __cpufreq_cooling_register(np, policy, capacitance);
+		cdev = __cpufreq_cooling_register(np, policy, em);
 		if (IS_ERR(cdev)) {
 			pr_err("cpu_cooling: cpu%d failed to register as cooling device: %ld\n",
 			       policy->cpu, PTR_ERR(cdev));
@@ -813,7 +738,6 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
 	thermal_cooling_device_unregister(cpufreq_cdev->cdev);
 	ida_simple_remove(&cpufreq_ida, cpufreq_cdev->id);
 	kfree(cpufreq_cdev->idle_time);
-	kfree(cpufreq_cdev->freq_table);
 	kfree(cpufreq_cdev);
 }
 EXPORT_SYMBOL_GPL(cpufreq_cooling_unregister);
-- 
2.21.0


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 1/3] arm64: defconfig: Enable CONFIG_ENERGY_MODEL
  2019-05-15  8:23   ` Quentin Perret
@ 2019-05-15  8:46     ` Daniel Lezcano
  -1 siblings, 0 replies; 40+ messages in thread
From: Daniel Lezcano @ 2019-05-15  8:46 UTC (permalink / raw)
  To: Quentin Perret, edubezval, rui.zhang, javi.merino, viresh.kumar,
	amit.kachhap, rjw, will.deacon, catalin.marinas
  Cc: dietmar.eggemann, ionela.voinescu, mka, linux-pm, linux-kernel,
	linux-arm-kernel

On 15/05/2019 10:23, Quentin Perret wrote:
> The recently introduced Energy Model (EM) framework manages power cost
> tables for the CPUs of the system. Its only user right now is the
> scheduler, in the context of Energy Aware Scheduling (EAS).
> 
> However, the EM framework also offers a generic infrastructure that
> could replace subsystem-specific implementations of the same concepts,
> as this is the case in the thermal framework.
> 
> So, in order to prepare the migration of the thermal subsystem to use
> the EM framework, enable it in the default arm64 defconfig, which is the
> most commonly used architecture for IPA. This will also compile-in all
> of the EAS code, although it won't be enabled by default -- EAS requires
> to use the 'schedutil' CPUFreq governor while arm64 defaults to
> 'performance'.
> 
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Quentin Perret <quentin.perret@arm.com>

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>

> ---
>  arch/arm64/configs/defconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
> index 2d9c39033c1a..4881a752ab3f 100644
> --- a/arch/arm64/configs/defconfig
> +++ b/arch/arm64/configs/defconfig
> @@ -97,6 +97,7 @@ CONFIG_XEN=y
>  CONFIG_COMPAT=y
>  CONFIG_HIBERNATION=y
>  CONFIG_WQ_POWER_EFFICIENT_DEFAULT=y
> +CONFIG_ENERGY_MODEL=y
>  CONFIG_ARM_CPUIDLE=y
>  CONFIG_CPU_FREQ=y
>  CONFIG_CPU_FREQ_STAT=y
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 1/3] arm64: defconfig: Enable CONFIG_ENERGY_MODEL
@ 2019-05-15  8:46     ` Daniel Lezcano
  0 siblings, 0 replies; 40+ messages in thread
From: Daniel Lezcano @ 2019-05-15  8:46 UTC (permalink / raw)
  To: Quentin Perret, edubezval, rui.zhang, javi.merino, viresh.kumar,
	amit.kachhap, rjw, will.deacon, catalin.marinas
  Cc: linux-pm, linux-kernel, mka, ionela.voinescu, dietmar.eggemann,
	linux-arm-kernel

On 15/05/2019 10:23, Quentin Perret wrote:
> The recently introduced Energy Model (EM) framework manages power cost
> tables for the CPUs of the system. Its only user right now is the
> scheduler, in the context of Energy Aware Scheduling (EAS).
> 
> However, the EM framework also offers a generic infrastructure that
> could replace subsystem-specific implementations of the same concepts,
> as this is the case in the thermal framework.
> 
> So, in order to prepare the migration of the thermal subsystem to use
> the EM framework, enable it in the default arm64 defconfig, which is the
> most commonly used architecture for IPA. This will also compile-in all
> of the EAS code, although it won't be enabled by default -- EAS requires
> to use the 'schedutil' CPUFreq governor while arm64 defaults to
> 'performance'.
> 
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Quentin Perret <quentin.perret@arm.com>

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>

> ---
>  arch/arm64/configs/defconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
> index 2d9c39033c1a..4881a752ab3f 100644
> --- a/arch/arm64/configs/defconfig
> +++ b/arch/arm64/configs/defconfig
> @@ -97,6 +97,7 @@ CONFIG_XEN=y
>  CONFIG_COMPAT=y
>  CONFIG_HIBERNATION=y
>  CONFIG_WQ_POWER_EFFICIENT_DEFAULT=y
> +CONFIG_ENERGY_MODEL=y
>  CONFIG_ARM_CPUIDLE=y
>  CONFIG_CPU_FREQ=y
>  CONFIG_CPU_FREQ_STAT=y
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
  2019-05-15  8:23   ` Quentin Perret
@ 2019-05-15  9:06     ` Daniel Lezcano
  -1 siblings, 0 replies; 40+ messages in thread
From: Daniel Lezcano @ 2019-05-15  9:06 UTC (permalink / raw)
  To: Quentin Perret, edubezval, rui.zhang, javi.merino, viresh.kumar,
	amit.kachhap, rjw, will.deacon, catalin.marinas
  Cc: dietmar.eggemann, ionela.voinescu, mka, linux-pm, linux-kernel,
	linux-arm-kernel

On 15/05/2019 10:23, Quentin Perret wrote:
> In the current state, the perf_domain struct is fully defined only when
> CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
> with or without that option in the thermal framework, make sure to
> actually define the struct regardless of the config option. That allows
> to avoid using stubbed accessor functions all the time in code paths
> that use the EM.
> 
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Quentin Perret <quentin.perret@arm.com>

This patch implies the cpu cooling device can be set without the energy
model.

Isn't it possible to make a strong dependency for the cpu cooling device
on the energy model option, add the energy model as default on arm arch
and drop this patch?

After all, the cpu cooling is using the em framework.

> ---
>  include/linux/energy_model.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> index aa027f7bcb3e..fb32b86a467d 100644
> --- a/include/linux/energy_model.h
> +++ b/include/linux/energy_model.h
> @@ -9,7 +9,6 @@
>  #include <linux/sched/topology.h>
>  #include <linux/types.h>
>  
> -#ifdef CONFIG_ENERGY_MODEL
>  /**
>   * em_cap_state - Capacity state of a performance domain
>   * @frequency:	The CPU frequency in KHz, for consistency with CPUFreq
> @@ -40,6 +39,7 @@ struct em_perf_domain {
>  	unsigned long cpus[0];
>  };
>  
> +#ifdef CONFIG_ENERGY_MODEL
>  #define EM_CPU_MAX_POWER 0xFFFF
>  
>  struct em_data_callback {
> @@ -160,7 +160,6 @@ static inline int em_pd_nr_cap_states(struct em_perf_domain *pd)
>  }
>  
>  #else
> -struct em_perf_domain {};
>  struct em_data_callback {};
>  #define EM_DATA_CB(_active_power_cb) { }
>  
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15  9:06     ` Daniel Lezcano
  0 siblings, 0 replies; 40+ messages in thread
From: Daniel Lezcano @ 2019-05-15  9:06 UTC (permalink / raw)
  To: Quentin Perret, edubezval, rui.zhang, javi.merino, viresh.kumar,
	amit.kachhap, rjw, will.deacon, catalin.marinas
  Cc: linux-pm, linux-kernel, mka, ionela.voinescu, dietmar.eggemann,
	linux-arm-kernel

On 15/05/2019 10:23, Quentin Perret wrote:
> In the current state, the perf_domain struct is fully defined only when
> CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
> with or without that option in the thermal framework, make sure to
> actually define the struct regardless of the config option. That allows
> to avoid using stubbed accessor functions all the time in code paths
> that use the EM.
> 
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Quentin Perret <quentin.perret@arm.com>

This patch implies the cpu cooling device can be set without the energy
model.

Isn't it possible to make a strong dependency for the cpu cooling device
on the energy model option, add the energy model as default on arm arch
and drop this patch?

After all, the cpu cooling is using the em framework.

> ---
>  include/linux/energy_model.h | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
> index aa027f7bcb3e..fb32b86a467d 100644
> --- a/include/linux/energy_model.h
> +++ b/include/linux/energy_model.h
> @@ -9,7 +9,6 @@
>  #include <linux/sched/topology.h>
>  #include <linux/types.h>
>  
> -#ifdef CONFIG_ENERGY_MODEL
>  /**
>   * em_cap_state - Capacity state of a performance domain
>   * @frequency:	The CPU frequency in KHz, for consistency with CPUFreq
> @@ -40,6 +39,7 @@ struct em_perf_domain {
>  	unsigned long cpus[0];
>  };
>  
> +#ifdef CONFIG_ENERGY_MODEL
>  #define EM_CPU_MAX_POWER 0xFFFF
>  
>  struct em_data_callback {
> @@ -160,7 +160,6 @@ static inline int em_pd_nr_cap_states(struct em_perf_domain *pd)
>  }
>  
>  #else
> -struct em_perf_domain {};
>  struct em_data_callback {};
>  #define EM_DATA_CB(_active_power_cb) { }
>  
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
  2019-05-15  9:06     ` Daniel Lezcano
@ 2019-05-15  9:17       ` Quentin Perret
  -1 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15  9:17 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: edubezval, rui.zhang, javi.merino, viresh.kumar, amit.kachhap,
	rjw, will.deacon, catalin.marinas, dietmar.eggemann,
	ionela.voinescu, mka, linux-pm, linux-kernel, linux-arm-kernel

Hi Daniel,

On Wednesday 15 May 2019 at 11:06:18 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 10:23, Quentin Perret wrote:
> > In the current state, the perf_domain struct is fully defined only when
> > CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
> > with or without that option in the thermal framework, make sure to
> > actually define the struct regardless of the config option. That allows
> > to avoid using stubbed accessor functions all the time in code paths
> > that use the EM.
> > 
> > Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> > Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> 
> This patch implies the cpu cooling device can be set without the energy
> model.
> 
> Isn't it possible to make a strong dependency for the cpu cooling device
> on the energy model option, add the energy model as default on arm arch
> and drop this patch?

Right, that should work too.

> After all, the cpu cooling is using the em framework.

The reason I did it that way is simply to keep things flexible. If you
don't compile in THERMAL_GOV_POWER_ALLOCATOR, you will never use the EM
for CPU thermal. So I thought it would be good to not mandate compiling
in ENERGY_MODEL in this case -- that should save a bit of space.

But TBH I don't have a strong opinion on this one, so if everybody
agrees it's fine to just make CPU_THERMAL depend on ENERGY_MODEL, I'm
happy to drop this patch and fix patch 3/3. That would indeed simplify
things a bit.

Thanks,
Quentin

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15  9:17       ` Quentin Perret
  0 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15  9:17 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: linux-pm, viresh.kumar, amit.kachhap, rjw, linux-kernel,
	will.deacon, edubezval, mka, catalin.marinas, rui.zhang,
	javi.merino, ionela.voinescu, dietmar.eggemann, linux-arm-kernel

Hi Daniel,

On Wednesday 15 May 2019 at 11:06:18 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 10:23, Quentin Perret wrote:
> > In the current state, the perf_domain struct is fully defined only when
> > CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
> > with or without that option in the thermal framework, make sure to
> > actually define the struct regardless of the config option. That allows
> > to avoid using stubbed accessor functions all the time in code paths
> > that use the EM.
> > 
> > Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> > Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> 
> This patch implies the cpu cooling device can be set without the energy
> model.
> 
> Isn't it possible to make a strong dependency for the cpu cooling device
> on the energy model option, add the energy model as default on arm arch
> and drop this patch?

Right, that should work too.

> After all, the cpu cooling is using the em framework.

The reason I did it that way is simply to keep things flexible. If you
don't compile in THERMAL_GOV_POWER_ALLOCATOR, you will never use the EM
for CPU thermal. So I thought it would be good to not mandate compiling
in ENERGY_MODEL in this case -- that should save a bit of space.

But TBH I don't have a strong opinion on this one, so if everybody
agrees it's fine to just make CPU_THERMAL depend on ENERGY_MODEL, I'm
happy to drop this patch and fix patch 3/3. That would indeed simplify
things a bit.

Thanks,
Quentin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 1/3] arm64: defconfig: Enable CONFIG_ENERGY_MODEL
  2019-05-15  8:46     ` Daniel Lezcano
@ 2019-05-15  9:22       ` Quentin Perret
  -1 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15  9:22 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: edubezval, rui.zhang, javi.merino, viresh.kumar, amit.kachhap,
	rjw, will.deacon, catalin.marinas, dietmar.eggemann,
	ionela.voinescu, mka, linux-pm, linux-kernel, linux-arm-kernel

On Wednesday 15 May 2019 at 10:46:09 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 10:23, Quentin Perret wrote:
> > The recently introduced Energy Model (EM) framework manages power cost
> > tables for the CPUs of the system. Its only user right now is the
> > scheduler, in the context of Energy Aware Scheduling (EAS).
> > 
> > However, the EM framework also offers a generic infrastructure that
> > could replace subsystem-specific implementations of the same concepts,
> > as this is the case in the thermal framework.
> > 
> > So, in order to prepare the migration of the thermal subsystem to use
> > the EM framework, enable it in the default arm64 defconfig, which is the
> > most commonly used architecture for IPA. This will also compile-in all
> > of the EAS code, although it won't be enabled by default -- EAS requires
> > to use the 'schedutil' CPUFreq governor while arm64 defaults to
> > 'performance'.
> > 
> > Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> > Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> 
> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>

Thanks !
Quentin

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 1/3] arm64: defconfig: Enable CONFIG_ENERGY_MODEL
@ 2019-05-15  9:22       ` Quentin Perret
  0 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15  9:22 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: linux-pm, viresh.kumar, amit.kachhap, rjw, linux-kernel,
	will.deacon, edubezval, mka, catalin.marinas, rui.zhang,
	javi.merino, ionela.voinescu, dietmar.eggemann, linux-arm-kernel

On Wednesday 15 May 2019 at 10:46:09 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 10:23, Quentin Perret wrote:
> > The recently introduced Energy Model (EM) framework manages power cost
> > tables for the CPUs of the system. Its only user right now is the
> > scheduler, in the context of Energy Aware Scheduling (EAS).
> > 
> > However, the EM framework also offers a generic infrastructure that
> > could replace subsystem-specific implementations of the same concepts,
> > as this is the case in the thermal framework.
> > 
> > So, in order to prepare the migration of the thermal subsystem to use
> > the EM framework, enable it in the default arm64 defconfig, which is the
> > most commonly used architecture for IPA. This will also compile-in all
> > of the EAS code, although it won't be enabled by default -- EAS requires
> > to use the 'schedutil' CPUFreq governor while arm64 defaults to
> > 'performance'.
> > 
> > Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> > Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> 
> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>

Thanks !
Quentin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
  2019-05-15  9:17       ` Quentin Perret
@ 2019-05-15  9:56         ` Daniel Lezcano
  -1 siblings, 0 replies; 40+ messages in thread
From: Daniel Lezcano @ 2019-05-15  9:56 UTC (permalink / raw)
  To: Quentin Perret
  Cc: edubezval, rui.zhang, javi.merino, viresh.kumar, amit.kachhap,
	rjw, will.deacon, catalin.marinas, dietmar.eggemann,
	ionela.voinescu, mka, linux-pm, linux-kernel, linux-arm-kernel

On 15/05/2019 11:17, Quentin Perret wrote:
> Hi Daniel,
> 
> On Wednesday 15 May 2019 at 11:06:18 (+0200), Daniel Lezcano wrote:
>> On 15/05/2019 10:23, Quentin Perret wrote:
>>> In the current state, the perf_domain struct is fully defined only when
>>> CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
>>> with or without that option in the thermal framework, make sure to
>>> actually define the struct regardless of the config option. That allows
>>> to avoid using stubbed accessor functions all the time in code paths
>>> that use the EM.
>>>
>>> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
>>> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
>>
>> This patch implies the cpu cooling device can be set without the energy
>> model.
>>
>> Isn't it possible to make a strong dependency for the cpu cooling device
>> on the energy model option, add the energy model as default on arm arch
>> and drop this patch?
> 
> Right, that should work too.
> 
>> After all, the cpu cooling is using the em framework.
> 
> The reason I did it that way is simply to keep things flexible. If you
> don't compile in THERMAL_GOV_POWER_ALLOCATOR, you will never use the EM
> for CPU thermal. So I thought it would be good to not mandate compiling
> in ENERGY_MODEL in this case -- that should save a bit of space.
> 
> But TBH I don't have a strong opinion on this one, so if everybody
> agrees it's fine to just make CPU_THERMAL depend on ENERGY_MODEL, I'm
> happy to drop this patch and fix patch 3/3. That would indeed simplify
> things a bit.

Ok in this case it will be better to drop the 2/3 and add a small series
doing for the cpu_cooling.c

#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR

/* structure freq */

/* power2state */

/* state2power*/

/* getrequestedpower */

/* All functions needed for the above */

#endif

static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
        .get_max_state          = cpufreq_get_max_state,
        .get_cur_state          = cpufreq_get_cur_state,
        .set_cur_state          = cpufreq_set_cur_state,
#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        .get_requested_power    = cpufreq_get_requested_power,
        .state2power            = cpufreq_state2power,
        .power2state            = cpufreq_power2state,
#endif
};

So you don't have to care about ENERGY_MODEL to be set as
THERMAL_GOV_POWER_ALLOCATOR depends on it.

I think the result for cpu_cooling.c will be even more cleaner with the
em change.




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15  9:56         ` Daniel Lezcano
  0 siblings, 0 replies; 40+ messages in thread
From: Daniel Lezcano @ 2019-05-15  9:56 UTC (permalink / raw)
  To: Quentin Perret
  Cc: linux-pm, viresh.kumar, amit.kachhap, rjw, linux-kernel,
	will.deacon, edubezval, mka, catalin.marinas, rui.zhang,
	javi.merino, ionela.voinescu, dietmar.eggemann, linux-arm-kernel

On 15/05/2019 11:17, Quentin Perret wrote:
> Hi Daniel,
> 
> On Wednesday 15 May 2019 at 11:06:18 (+0200), Daniel Lezcano wrote:
>> On 15/05/2019 10:23, Quentin Perret wrote:
>>> In the current state, the perf_domain struct is fully defined only when
>>> CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
>>> with or without that option in the thermal framework, make sure to
>>> actually define the struct regardless of the config option. That allows
>>> to avoid using stubbed accessor functions all the time in code paths
>>> that use the EM.
>>>
>>> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
>>> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
>>
>> This patch implies the cpu cooling device can be set without the energy
>> model.
>>
>> Isn't it possible to make a strong dependency for the cpu cooling device
>> on the energy model option, add the energy model as default on arm arch
>> and drop this patch?
> 
> Right, that should work too.
> 
>> After all, the cpu cooling is using the em framework.
> 
> The reason I did it that way is simply to keep things flexible. If you
> don't compile in THERMAL_GOV_POWER_ALLOCATOR, you will never use the EM
> for CPU thermal. So I thought it would be good to not mandate compiling
> in ENERGY_MODEL in this case -- that should save a bit of space.
> 
> But TBH I don't have a strong opinion on this one, so if everybody
> agrees it's fine to just make CPU_THERMAL depend on ENERGY_MODEL, I'm
> happy to drop this patch and fix patch 3/3. That would indeed simplify
> things a bit.

Ok in this case it will be better to drop the 2/3 and add a small series
doing for the cpu_cooling.c

#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR

/* structure freq */

/* power2state */

/* state2power*/

/* getrequestedpower */

/* All functions needed for the above */

#endif

static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
        .get_max_state          = cpufreq_get_max_state,
        .get_cur_state          = cpufreq_get_cur_state,
        .set_cur_state          = cpufreq_set_cur_state,
#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        .get_requested_power    = cpufreq_get_requested_power,
        .state2power            = cpufreq_state2power,
        .power2state            = cpufreq_power2state,
#endif
};

So you don't have to care about ENERGY_MODEL to be set as
THERMAL_GOV_POWER_ALLOCATOR depends on it.

I think the result for cpu_cooling.c will be even more cleaner with the
em change.




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
  2019-05-15  9:56         ` Daniel Lezcano
@ 2019-05-15 10:07           ` Quentin Perret
  -1 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15 10:07 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: edubezval, rui.zhang, javi.merino, viresh.kumar, amit.kachhap,
	rjw, will.deacon, catalin.marinas, dietmar.eggemann,
	ionela.voinescu, mka, linux-pm, linux-kernel, linux-arm-kernel

On Wednesday 15 May 2019 at 11:56:30 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 11:17, Quentin Perret wrote:
> > Hi Daniel,
> > 
> > On Wednesday 15 May 2019 at 11:06:18 (+0200), Daniel Lezcano wrote:
> >> On 15/05/2019 10:23, Quentin Perret wrote:
> >>> In the current state, the perf_domain struct is fully defined only when
> >>> CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
> >>> with or without that option in the thermal framework, make sure to
> >>> actually define the struct regardless of the config option. That allows
> >>> to avoid using stubbed accessor functions all the time in code paths
> >>> that use the EM.
> >>>
> >>> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> >>> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> >>
> >> This patch implies the cpu cooling device can be set without the energy
> >> model.
> >>
> >> Isn't it possible to make a strong dependency for the cpu cooling device
> >> on the energy model option, add the energy model as default on arm arch
> >> and drop this patch?
> > 
> > Right, that should work too.
> > 
> >> After all, the cpu cooling is using the em framework.
> > 
> > The reason I did it that way is simply to keep things flexible. If you
> > don't compile in THERMAL_GOV_POWER_ALLOCATOR, you will never use the EM
> > for CPU thermal. So I thought it would be good to not mandate compiling
> > in ENERGY_MODEL in this case -- that should save a bit of space.
> > 
> > But TBH I don't have a strong opinion on this one, so if everybody
> > agrees it's fine to just make CPU_THERMAL depend on ENERGY_MODEL, I'm
> > happy to drop this patch and fix patch 3/3. That would indeed simplify
> > things a bit.
> 
> Ok in this case it will be better to drop the 2/3 and add a small series
> doing for the cpu_cooling.c
> 
> #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
> 
> /* structure freq */
> 
> /* power2state */
> 
> /* state2power*/
> 
> /* getrequestedpower */
> 
> /* All functions needed for the above */
> 
> #endif
> 
> static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
>         .get_max_state          = cpufreq_get_max_state,
>         .get_cur_state          = cpufreq_get_cur_state,
>         .set_cur_state          = cpufreq_set_cur_state,
> #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         .get_requested_power    = cpufreq_get_requested_power,
>         .state2power            = cpufreq_state2power,
>         .power2state            = cpufreq_power2state,
> #endif
> };
> 
> So you don't have to care about ENERGY_MODEL to be set as
> THERMAL_GOV_POWER_ALLOCATOR depends on it.
> 
> I think the result for cpu_cooling.c will be even more cleaner with the
> em change.

OK, that works for me. I'll give it a go in v5.

Thanks !
Quentin

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15 10:07           ` Quentin Perret
  0 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15 10:07 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: linux-pm, viresh.kumar, amit.kachhap, rjw, linux-kernel,
	will.deacon, edubezval, mka, catalin.marinas, rui.zhang,
	javi.merino, ionela.voinescu, dietmar.eggemann, linux-arm-kernel

On Wednesday 15 May 2019 at 11:56:30 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 11:17, Quentin Perret wrote:
> > Hi Daniel,
> > 
> > On Wednesday 15 May 2019 at 11:06:18 (+0200), Daniel Lezcano wrote:
> >> On 15/05/2019 10:23, Quentin Perret wrote:
> >>> In the current state, the perf_domain struct is fully defined only when
> >>> CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
> >>> with or without that option in the thermal framework, make sure to
> >>> actually define the struct regardless of the config option. That allows
> >>> to avoid using stubbed accessor functions all the time in code paths
> >>> that use the EM.
> >>>
> >>> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> >>> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
> >>
> >> This patch implies the cpu cooling device can be set without the energy
> >> model.
> >>
> >> Isn't it possible to make a strong dependency for the cpu cooling device
> >> on the energy model option, add the energy model as default on arm arch
> >> and drop this patch?
> > 
> > Right, that should work too.
> > 
> >> After all, the cpu cooling is using the em framework.
> > 
> > The reason I did it that way is simply to keep things flexible. If you
> > don't compile in THERMAL_GOV_POWER_ALLOCATOR, you will never use the EM
> > for CPU thermal. So I thought it would be good to not mandate compiling
> > in ENERGY_MODEL in this case -- that should save a bit of space.
> > 
> > But TBH I don't have a strong opinion on this one, so if everybody
> > agrees it's fine to just make CPU_THERMAL depend on ENERGY_MODEL, I'm
> > happy to drop this patch and fix patch 3/3. That would indeed simplify
> > things a bit.
> 
> Ok in this case it will be better to drop the 2/3 and add a small series
> doing for the cpu_cooling.c
> 
> #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
> 
> /* structure freq */
> 
> /* power2state */
> 
> /* state2power*/
> 
> /* getrequestedpower */
> 
> /* All functions needed for the above */
> 
> #endif
> 
> static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
>         .get_max_state          = cpufreq_get_max_state,
>         .get_cur_state          = cpufreq_get_cur_state,
>         .set_cur_state          = cpufreq_set_cur_state,
> #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         .get_requested_power    = cpufreq_get_requested_power,
>         .state2power            = cpufreq_state2power,
>         .power2state            = cpufreq_power2state,
> #endif
> };
> 
> So you don't have to care about ENERGY_MODEL to be set as
> THERMAL_GOV_POWER_ALLOCATOR depends on it.
> 
> I think the result for cpu_cooling.c will be even more cleaner with the
> em change.

OK, that works for me. I'll give it a go in v5.

Thanks !
Quentin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
  2019-05-15 10:07           ` Quentin Perret
@ 2019-05-15 10:16             ` Daniel Lezcano
  -1 siblings, 0 replies; 40+ messages in thread
From: Daniel Lezcano @ 2019-05-15 10:16 UTC (permalink / raw)
  To: Quentin Perret
  Cc: edubezval, rui.zhang, javi.merino, viresh.kumar, amit.kachhap,
	rjw, will.deacon, catalin.marinas, dietmar.eggemann,
	ionela.voinescu, mka, linux-pm, linux-kernel, linux-arm-kernel

On 15/05/2019 12:07, Quentin Perret wrote:
> On Wednesday 15 May 2019 at 11:56:30 (+0200), Daniel Lezcano wrote:
>> On 15/05/2019 11:17, Quentin Perret wrote:
>>> Hi Daniel,
>>>
>>> On Wednesday 15 May 2019 at 11:06:18 (+0200), Daniel Lezcano wrote:
>>>> On 15/05/2019 10:23, Quentin Perret wrote:
>>>>> In the current state, the perf_domain struct is fully defined only when
>>>>> CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
>>>>> with or without that option in the thermal framework, make sure to
>>>>> actually define the struct regardless of the config option. That allows
>>>>> to avoid using stubbed accessor functions all the time in code paths
>>>>> that use the EM.
>>>>>
>>>>> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
>>>>> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
>>>>
>>>> This patch implies the cpu cooling device can be set without the energy
>>>> model.
>>>>
>>>> Isn't it possible to make a strong dependency for the cpu cooling device
>>>> on the energy model option, add the energy model as default on arm arch
>>>> and drop this patch?
>>>
>>> Right, that should work too.
>>>
>>>> After all, the cpu cooling is using the em framework.
>>>
>>> The reason I did it that way is simply to keep things flexible. If you
>>> don't compile in THERMAL_GOV_POWER_ALLOCATOR, you will never use the EM
>>> for CPU thermal. So I thought it would be good to not mandate compiling
>>> in ENERGY_MODEL in this case -- that should save a bit of space.
>>>
>>> But TBH I don't have a strong opinion on this one, so if everybody
>>> agrees it's fine to just make CPU_THERMAL depend on ENERGY_MODEL, I'm
>>> happy to drop this patch and fix patch 3/3. That would indeed simplify
>>> things a bit.
>>
>> Ok in this case it will be better to drop the 2/3 and add a small series
>> doing for the cpu_cooling.c
>>
>> #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>>
>> /* structure freq */
>>
>> /* power2state */
>>
>> /* state2power*/
>>
>> /* getrequestedpower */
>>
>> /* All functions needed for the above */
>>
>> #endif
>>
>> static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
>>         .get_max_state          = cpufreq_get_max_state,
>>         .get_cur_state          = cpufreq_get_cur_state,
>>         .set_cur_state          = cpufreq_set_cur_state,
>> #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>>         .get_requested_power    = cpufreq_get_requested_power,
>>         .state2power            = cpufreq_state2power,
>>         .power2state            = cpufreq_power2state,
>> #endif
>> };
>>
>> So you don't have to care about ENERGY_MODEL to be set as
>> THERMAL_GOV_POWER_ALLOCATOR depends on it.
>>
>> I think the result for cpu_cooling.c will be even more cleaner with the
>> em change.
> 
> OK, that works for me. I'll give it a go in v5.

Viresh what do you think ?


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15 10:16             ` Daniel Lezcano
  0 siblings, 0 replies; 40+ messages in thread
From: Daniel Lezcano @ 2019-05-15 10:16 UTC (permalink / raw)
  To: Quentin Perret
  Cc: linux-pm, viresh.kumar, amit.kachhap, rjw, linux-kernel,
	will.deacon, edubezval, mka, catalin.marinas, rui.zhang,
	javi.merino, ionela.voinescu, dietmar.eggemann, linux-arm-kernel

On 15/05/2019 12:07, Quentin Perret wrote:
> On Wednesday 15 May 2019 at 11:56:30 (+0200), Daniel Lezcano wrote:
>> On 15/05/2019 11:17, Quentin Perret wrote:
>>> Hi Daniel,
>>>
>>> On Wednesday 15 May 2019 at 11:06:18 (+0200), Daniel Lezcano wrote:
>>>> On 15/05/2019 10:23, Quentin Perret wrote:
>>>>> In the current state, the perf_domain struct is fully defined only when
>>>>> CONFIG_ENERGY_MODEL=y. Since we need to write code that compiles both
>>>>> with or without that option in the thermal framework, make sure to
>>>>> actually define the struct regardless of the config option. That allows
>>>>> to avoid using stubbed accessor functions all the time in code paths
>>>>> that use the EM.
>>>>>
>>>>> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
>>>>> Signed-off-by: Quentin Perret <quentin.perret@arm.com>
>>>>
>>>> This patch implies the cpu cooling device can be set without the energy
>>>> model.
>>>>
>>>> Isn't it possible to make a strong dependency for the cpu cooling device
>>>> on the energy model option, add the energy model as default on arm arch
>>>> and drop this patch?
>>>
>>> Right, that should work too.
>>>
>>>> After all, the cpu cooling is using the em framework.
>>>
>>> The reason I did it that way is simply to keep things flexible. If you
>>> don't compile in THERMAL_GOV_POWER_ALLOCATOR, you will never use the EM
>>> for CPU thermal. So I thought it would be good to not mandate compiling
>>> in ENERGY_MODEL in this case -- that should save a bit of space.
>>>
>>> But TBH I don't have a strong opinion on this one, so if everybody
>>> agrees it's fine to just make CPU_THERMAL depend on ENERGY_MODEL, I'm
>>> happy to drop this patch and fix patch 3/3. That would indeed simplify
>>> things a bit.
>>
>> Ok in this case it will be better to drop the 2/3 and add a small series
>> doing for the cpu_cooling.c
>>
>> #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>>
>> /* structure freq */
>>
>> /* power2state */
>>
>> /* state2power*/
>>
>> /* getrequestedpower */
>>
>> /* All functions needed for the above */
>>
>> #endif
>>
>> static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
>>         .get_max_state          = cpufreq_get_max_state,
>>         .get_cur_state          = cpufreq_get_cur_state,
>>         .set_cur_state          = cpufreq_set_cur_state,
>> #ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>>         .get_requested_power    = cpufreq_get_requested_power,
>>         .state2power            = cpufreq_state2power,
>>         .power2state            = cpufreq_power2state,
>> #endif
>> };
>>
>> So you don't have to care about ENERGY_MODEL to be set as
>> THERMAL_GOV_POWER_ALLOCATOR depends on it.
>>
>> I think the result for cpu_cooling.c will be even more cleaner with the
>> em change.
> 
> OK, that works for me. I'll give it a go in v5.

Viresh what do you think ?


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
  2019-05-15 10:16             ` Daniel Lezcano
@ 2019-05-15 10:22               ` Viresh Kumar
  -1 siblings, 0 replies; 40+ messages in thread
From: Viresh Kumar @ 2019-05-15 10:22 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Quentin Perret, edubezval, rui.zhang, javi.merino, amit.kachhap,
	rjw, will.deacon, catalin.marinas, dietmar.eggemann,
	ionela.voinescu, mka, linux-pm, linux-kernel, linux-arm-kernel

On 15-05-19, 12:16, Daniel Lezcano wrote:
> Viresh what do you think ?

I agree with your last suggestions. They do make sense.

-- 
viresh

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15 10:22               ` Viresh Kumar
  0 siblings, 0 replies; 40+ messages in thread
From: Viresh Kumar @ 2019-05-15 10:22 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: linux-pm, will.deacon, amit.kachhap, rjw, linux-kernel,
	edubezval, Quentin Perret, mka, catalin.marinas, rui.zhang,
	javi.merino, ionela.voinescu, dietmar.eggemann, linux-arm-kernel

On 15-05-19, 12:16, Daniel Lezcano wrote:
> Viresh what do you think ?

I agree with your last suggestions. They do make sense.

-- 
viresh

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
  2019-05-15 10:22               ` Viresh Kumar
  (?)
@ 2019-05-15 10:40                 ` Quentin Perret
  -1 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15 10:40 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Daniel Lezcano, edubezval, rui.zhang, javi.merino, amit.kachhap,
	rjw, will.deacon, catalin.marinas, dietmar.eggemann,
	ionela.voinescu, mka, linux-pm, linux-kernel, linux-arm-kernel

On Wednesday 15 May 2019 at 15:52:00 (+0530), Viresh Kumar wrote:
> On 15-05-19, 12:16, Daniel Lezcano wrote:
> > Viresh what do you think ?
> 
> I agree with your last suggestions. They do make sense.

Good :-)

So, FWIW, the below compiles w/ or w/o THERMAL_GOV_POWER_ALLOCATOR. I'll
test it and clean it up some more and put it as patch 1 in the series if
that's OK.

Thanks,
Quentin


diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index f7c1f49ec87f..ee431848ef71 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -58,7 +58,9 @@
  */
 struct freq_table {
        u32 frequency;
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        u32 power;
+#endif
 };
 
 /**
@@ -109,28 +111,6 @@ static DEFINE_IDA(cpufreq_ida);
 static DEFINE_MUTEX(cooling_list_lock);
 static LIST_HEAD(cpufreq_cdev_list);
 
-/* Below code defines functions to be used for cpufreq as cooling device */
-
-/**
- * get_level: Find the level for a particular frequency
- * @cpufreq_cdev: cpufreq_cdev for which the property is required
- * @freq: Frequency
- *
- * Return: level corresponding to the frequency.
- */
-static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
-                              unsigned int freq)
-{
-       struct freq_table *freq_table = cpufreq_cdev->freq_table;
-       unsigned long level;
-
-       for (level = 1; level <= cpufreq_cdev->max_level; level++)
-               if (freq > freq_table[level].frequency)
-                       break;
-
-       return level - 1;
-}
-
 /**
  * cpufreq_thermal_notifier - notifier callback for cpufreq policy change.
  * @nb:        struct notifier_block * with callback info.
@@ -184,6 +164,27 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
        return NOTIFY_OK;
 }
 
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
+/**
+ * get_level: Find the level for a particular frequency
+ * @cpufreq_cdev: cpufreq_cdev for which the property is required
+ * @freq: Frequency
+ *
+ * Return: level corresponding to the frequency.
+ */
+static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
+                              unsigned int freq)
+{
+       struct freq_table *freq_table = cpufreq_cdev->freq_table;
+       unsigned long level;
+
+       for (level = 1; level <= cpufreq_cdev->max_level; level++)
+               if (freq > freq_table[level].frequency)
+                       break;
+
+       return level - 1;
+}
+
 /**
  * update_freq_table() - Update the freq table with power numbers
  * @cpufreq_cdev:      the cpufreq cooling device in which to update the table
@@ -333,80 +334,6 @@ static u32 get_dynamic_power(struct cpufreq_cooling_device *cpufreq_cdev,
        return (raw_cpu_power * cpufreq_cdev->last_load) / 100;
 }
 
-/* cpufreq cooling device callback functions are defined below */
-
-/**
- * cpufreq_get_max_state - callback function to get the max cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: fill this variable with the max cooling state.
- *
- * Callback for the thermal cooling device to return the cpufreq
- * max cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
-                                unsigned long *state)
-{
-       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-
-       *state = cpufreq_cdev->max_level;
-       return 0;
-}
-
-/**
- * cpufreq_get_cur_state - callback function to get the current cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: fill this variable with the current cooling state.
- *
- * Callback for the thermal cooling device to return the cpufreq
- * current cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
-                                unsigned long *state)
-{
-       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-
-       *state = cpufreq_cdev->cpufreq_state;
-
-       return 0;
-}
-
-/**
- * cpufreq_set_cur_state - callback function to set the current cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: set this variable to the current cooling state.
- *
- * Callback for the thermal cooling device to change the cpufreq
- * current cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
-                                unsigned long state)
-{
-       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-       unsigned int clip_freq;
-
-       /* Request state should be less than max_level */
-       if (WARN_ON(state > cpufreq_cdev->max_level))
-               return -EINVAL;
-
-       /* Check if the old cooling action is same as new cooling action */
-       if (cpufreq_cdev->cpufreq_state == state)
-               return 0;
-
-       clip_freq = cpufreq_cdev->freq_table[state].frequency;
-       cpufreq_cdev->cpufreq_state = state;
-       cpufreq_cdev->clipped_freq = clip_freq;
-
-       cpufreq_update_policy(cpufreq_cdev->policy->cpu);
-
-       return 0;
-}
-
 /**
  * cpufreq_get_requested_power() - get the current power
  * @cdev:      &thermal_cooling_device pointer
@@ -551,22 +478,93 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
                                      power);
        return 0;
 }
+#endif /* CONFIG_THERMAL_GOV_POWER_ALLOCATOR */
+
+/* cpufreq cooling device callback functions are defined below */
+
+/**
+ * cpufreq_get_max_state - callback function to get the max cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: fill this variable with the max cooling state.
+ *
+ * Callback for the thermal cooling device to return the cpufreq
+ * max cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
+                                unsigned long *state)
+{
+       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+
+       *state = cpufreq_cdev->max_level;
+       return 0;
+}
+
+/**
+ * cpufreq_get_cur_state - callback function to get the current cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: fill this variable with the current cooling state.
+ *
+ * Callback for the thermal cooling device to return the cpufreq
+ * current cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
+                                unsigned long *state)
+{
+       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+
+       *state = cpufreq_cdev->cpufreq_state;
+
+       return 0;
+}
+
+/**
+ * cpufreq_set_cur_state - callback function to set the current cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: set this variable to the current cooling state.
+ *
+ * Callback for the thermal cooling device to change the cpufreq
+ * current cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
+                                unsigned long state)
+{
+       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+       unsigned int clip_freq;
+
+       /* Request state should be less than max_level */
+       if (WARN_ON(state > cpufreq_cdev->max_level))
+               return -EINVAL;
+
+       /* Check if the old cooling action is same as new cooling action */
+       if (cpufreq_cdev->cpufreq_state == state)
+               return 0;
+
+       clip_freq = cpufreq_cdev->freq_table[state].frequency;
+       cpufreq_cdev->cpufreq_state = state;
+       cpufreq_cdev->clipped_freq = clip_freq;
+
+       cpufreq_update_policy(cpufreq_cdev->policy->cpu);
+
+       return 0;
+}
 
 /* Bind cpufreq callbacks to thermal cooling device ops */
 
 static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
-       .get_max_state = cpufreq_get_max_state,
-       .get_cur_state = cpufreq_get_cur_state,
-       .set_cur_state = cpufreq_set_cur_state,
-};
-
-static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
        .get_max_state          = cpufreq_get_max_state,
        .get_cur_state          = cpufreq_get_cur_state,
        .set_cur_state          = cpufreq_set_cur_state,
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        .get_requested_power    = cpufreq_get_requested_power,
        .state2power            = cpufreq_state2power,
        .power2state            = cpufreq_power2state,
+#endif
 };
 
 /* Notifier for cpufreq policy change */
@@ -674,17 +672,16 @@ __cpufreq_cooling_register(struct device_node *np,
                        pr_debug("%s: freq:%u KHz\n", __func__, freq);
        }
 
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        if (capacitance) {
                ret = update_freq_table(cpufreq_cdev, capacitance);
                if (ret) {
                        cdev = ERR_PTR(ret);
                        goto remove_ida;
                }
-
-               cooling_ops = &cpufreq_power_cooling_ops;
-       } else {
-               cooling_ops = &cpufreq_cooling_ops;
        }
+#endif
+       cooling_ops = &cpufreq_cooling_ops;
 
        cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
                                                  cooling_ops);

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15 10:40                 ` Quentin Perret
  0 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15 10:40 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-pm, will.deacon, amit.kachhap, Daniel Lezcano, rjw,
	linux-kernel, edubezval, mka, catalin.marinas, rui.zhang,
	javi.merino, ionela.voinescu, dietmar.eggemann, linux-arm-kernel

On Wednesday 15 May 2019 at 15:52:00 (+0530), Viresh Kumar wrote:
> On 15-05-19, 12:16, Daniel Lezcano wrote:
> > Viresh what do you think ?
> 
> I agree with your last suggestions. They do make sense.

Good :-)

So, FWIW, the below compiles w/ or w/o THERMAL_GOV_POWER_ALLOCATOR. I'll
test it and clean it up some more and put it as patch 1 in the series if
that's OK.

Thanks,
Quentin


diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index f7c1f49ec87f..ee431848ef71 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -58,7 +58,9 @@
  */
 struct freq_table {
        u32 frequency;
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        u32 power;
+#endif
 };
 
 /**
@@ -109,28 +111,6 @@ static DEFINE_IDA(cpufreq_ida);
 static DEFINE_MUTEX(cooling_list_lock);
 static LIST_HEAD(cpufreq_cdev_list);
 
-/* Below code defines functions to be used for cpufreq as cooling device */
-
-/**
- * get_level: Find the level for a particular frequency
- * @cpufreq_cdev: cpufreq_cdev for which the property is required
- * @freq: Frequency
- *
- * Return: level corresponding to the frequency.
- */
-static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
-                              unsigned int freq)
-{
-       struct freq_table *freq_table = cpufreq_cdev->freq_table;
-       unsigned long level;
-
-       for (level = 1; level <= cpufreq_cdev->max_level; level++)
-               if (freq > freq_table[level].frequency)
-                       break;
-
-       return level - 1;
-}
-
 /**
  * cpufreq_thermal_notifier - notifier callback for cpufreq policy change.
  * @nb:        struct notifier_block * with callback info.
@@ -184,6 +164,27 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
        return NOTIFY_OK;
 }
 
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
+/**
+ * get_level: Find the level for a particular frequency
+ * @cpufreq_cdev: cpufreq_cdev for which the property is required
+ * @freq: Frequency
+ *
+ * Return: level corresponding to the frequency.
+ */
+static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
+                              unsigned int freq)
+{
+       struct freq_table *freq_table = cpufreq_cdev->freq_table;
+       unsigned long level;
+
+       for (level = 1; level <= cpufreq_cdev->max_level; level++)
+               if (freq > freq_table[level].frequency)
+                       break;
+
+       return level - 1;
+}
+
 /**
  * update_freq_table() - Update the freq table with power numbers
  * @cpufreq_cdev:      the cpufreq cooling device in which to update the table
@@ -333,80 +334,6 @@ static u32 get_dynamic_power(struct cpufreq_cooling_device *cpufreq_cdev,
        return (raw_cpu_power * cpufreq_cdev->last_load) / 100;
 }
 
-/* cpufreq cooling device callback functions are defined below */
-
-/**
- * cpufreq_get_max_state - callback function to get the max cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: fill this variable with the max cooling state.
- *
- * Callback for the thermal cooling device to return the cpufreq
- * max cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
-                                unsigned long *state)
-{
-       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-
-       *state = cpufreq_cdev->max_level;
-       return 0;
-}
-
-/**
- * cpufreq_get_cur_state - callback function to get the current cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: fill this variable with the current cooling state.
- *
- * Callback for the thermal cooling device to return the cpufreq
- * current cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
-                                unsigned long *state)
-{
-       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-
-       *state = cpufreq_cdev->cpufreq_state;
-
-       return 0;
-}
-
-/**
- * cpufreq_set_cur_state - callback function to set the current cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: set this variable to the current cooling state.
- *
- * Callback for the thermal cooling device to change the cpufreq
- * current cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
-                                unsigned long state)
-{
-       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-       unsigned int clip_freq;
-
-       /* Request state should be less than max_level */
-       if (WARN_ON(state > cpufreq_cdev->max_level))
-               return -EINVAL;
-
-       /* Check if the old cooling action is same as new cooling action */
-       if (cpufreq_cdev->cpufreq_state == state)
-               return 0;
-
-       clip_freq = cpufreq_cdev->freq_table[state].frequency;
-       cpufreq_cdev->cpufreq_state = state;
-       cpufreq_cdev->clipped_freq = clip_freq;
-
-       cpufreq_update_policy(cpufreq_cdev->policy->cpu);
-
-       return 0;
-}
-
 /**
  * cpufreq_get_requested_power() - get the current power
  * @cdev:      &thermal_cooling_device pointer
@@ -551,22 +478,93 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
                                      power);
        return 0;
 }
+#endif /* CONFIG_THERMAL_GOV_POWER_ALLOCATOR */
+
+/* cpufreq cooling device callback functions are defined below */
+
+/**
+ * cpufreq_get_max_state - callback function to get the max cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: fill this variable with the max cooling state.
+ *
+ * Callback for the thermal cooling device to return the cpufreq
+ * max cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
+                                unsigned long *state)
+{
+       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+
+       *state = cpufreq_cdev->max_level;
+       return 0;
+}
+
+/**
+ * cpufreq_get_cur_state - callback function to get the current cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: fill this variable with the current cooling state.
+ *
+ * Callback for the thermal cooling device to return the cpufreq
+ * current cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
+                                unsigned long *state)
+{
+       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+
+       *state = cpufreq_cdev->cpufreq_state;
+
+       return 0;
+}
+
+/**
+ * cpufreq_set_cur_state - callback function to set the current cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: set this variable to the current cooling state.
+ *
+ * Callback for the thermal cooling device to change the cpufreq
+ * current cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
+                                unsigned long state)
+{
+       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+       unsigned int clip_freq;
+
+       /* Request state should be less than max_level */
+       if (WARN_ON(state > cpufreq_cdev->max_level))
+               return -EINVAL;
+
+       /* Check if the old cooling action is same as new cooling action */
+       if (cpufreq_cdev->cpufreq_state == state)
+               return 0;
+
+       clip_freq = cpufreq_cdev->freq_table[state].frequency;
+       cpufreq_cdev->cpufreq_state = state;
+       cpufreq_cdev->clipped_freq = clip_freq;
+
+       cpufreq_update_policy(cpufreq_cdev->policy->cpu);
+
+       return 0;
+}
 
 /* Bind cpufreq callbacks to thermal cooling device ops */
 
 static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
-       .get_max_state = cpufreq_get_max_state,
-       .get_cur_state = cpufreq_get_cur_state,
-       .set_cur_state = cpufreq_set_cur_state,
-};
-
-static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
        .get_max_state          = cpufreq_get_max_state,
        .get_cur_state          = cpufreq_get_cur_state,
        .set_cur_state          = cpufreq_set_cur_state,
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        .get_requested_power    = cpufreq_get_requested_power,
        .state2power            = cpufreq_state2power,
        .power2state            = cpufreq_power2state,
+#endif
 };
 
 /* Notifier for cpufreq policy change */
@@ -674,17 +672,16 @@ __cpufreq_cooling_register(struct device_node *np,
                        pr_debug("%s: freq:%u KHz\n", __func__, freq);
        }
 
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        if (capacitance) {
                ret = update_freq_table(cpufreq_cdev, capacitance);
                if (ret) {
                        cdev = ERR_PTR(ret);
                        goto remove_ida;
                }
-
-               cooling_ops = &cpufreq_power_cooling_ops;
-       } else {
-               cooling_ops = &cpufreq_cooling_ops;
        }
+#endif
+       cooling_ops = &cpufreq_cooling_ops;
 
        cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
                                                  cooling_ops);

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15 10:40                 ` Quentin Perret
  0 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15 10:40 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-pm, will.deacon, amit.kachhap, Daniel Lezcano, rjw,
	linux-kernel, edubezval, mka, catalin.marinas, rui.zhang,
	javi.merino, ionela.voinescu, dietmar.eggemann, linux-arm-kernel

On Wednesday 15 May 2019 at 15:52:00 (+0530), Viresh Kumar wrote:
> On 15-05-19, 12:16, Daniel Lezcano wrote:
> > Viresh what do you think ?
> 
> I agree with your last suggestions. They do make sense.

Good :-)

So, FWIW, the below compiles w/ or w/o THERMAL_GOV_POWER_ALLOCATOR. I'll
test it and clean it up some more and put it as patch 1 in the series if
that's OK.

Thanks,
Quentin


diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index f7c1f49ec87f..ee431848ef71 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -58,7 +58,9 @@
  */
 struct freq_table {
        u32 frequency;
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        u32 power;
+#endif
 };
 
 /**
@@ -109,28 +111,6 @@ static DEFINE_IDA(cpufreq_ida);
 static DEFINE_MUTEX(cooling_list_lock);
 static LIST_HEAD(cpufreq_cdev_list);
 
-/* Below code defines functions to be used for cpufreq as cooling device */
-
-/**
- * get_level: Find the level for a particular frequency
- * @cpufreq_cdev: cpufreq_cdev for which the property is required
- * @freq: Frequency
- *
- * Return: level corresponding to the frequency.
- */
-static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
-                              unsigned int freq)
-{
-       struct freq_table *freq_table = cpufreq_cdev->freq_table;
-       unsigned long level;
-
-       for (level = 1; level <= cpufreq_cdev->max_level; level++)
-               if (freq > freq_table[level].frequency)
-                       break;
-
-       return level - 1;
-}
-
 /**
  * cpufreq_thermal_notifier - notifier callback for cpufreq policy change.
  * @nb:        struct notifier_block * with callback info.
@@ -184,6 +164,27 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
        return NOTIFY_OK;
 }
 
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
+/**
+ * get_level: Find the level for a particular frequency
+ * @cpufreq_cdev: cpufreq_cdev for which the property is required
+ * @freq: Frequency
+ *
+ * Return: level corresponding to the frequency.
+ */
+static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
+                              unsigned int freq)
+{
+       struct freq_table *freq_table = cpufreq_cdev->freq_table;
+       unsigned long level;
+
+       for (level = 1; level <= cpufreq_cdev->max_level; level++)
+               if (freq > freq_table[level].frequency)
+                       break;
+
+       return level - 1;
+}
+
 /**
  * update_freq_table() - Update the freq table with power numbers
  * @cpufreq_cdev:      the cpufreq cooling device in which to update the table
@@ -333,80 +334,6 @@ static u32 get_dynamic_power(struct cpufreq_cooling_device *cpufreq_cdev,
        return (raw_cpu_power * cpufreq_cdev->last_load) / 100;
 }
 
-/* cpufreq cooling device callback functions are defined below */
-
-/**
- * cpufreq_get_max_state - callback function to get the max cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: fill this variable with the max cooling state.
- *
- * Callback for the thermal cooling device to return the cpufreq
- * max cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
-                                unsigned long *state)
-{
-       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-
-       *state = cpufreq_cdev->max_level;
-       return 0;
-}
-
-/**
- * cpufreq_get_cur_state - callback function to get the current cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: fill this variable with the current cooling state.
- *
- * Callback for the thermal cooling device to return the cpufreq
- * current cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
-                                unsigned long *state)
-{
-       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-
-       *state = cpufreq_cdev->cpufreq_state;
-
-       return 0;
-}
-
-/**
- * cpufreq_set_cur_state - callback function to set the current cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: set this variable to the current cooling state.
- *
- * Callback for the thermal cooling device to change the cpufreq
- * current cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
-                                unsigned long state)
-{
-       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-       unsigned int clip_freq;
-
-       /* Request state should be less than max_level */
-       if (WARN_ON(state > cpufreq_cdev->max_level))
-               return -EINVAL;
-
-       /* Check if the old cooling action is same as new cooling action */
-       if (cpufreq_cdev->cpufreq_state == state)
-               return 0;
-
-       clip_freq = cpufreq_cdev->freq_table[state].frequency;
-       cpufreq_cdev->cpufreq_state = state;
-       cpufreq_cdev->clipped_freq = clip_freq;
-
-       cpufreq_update_policy(cpufreq_cdev->policy->cpu);
-
-       return 0;
-}
-
 /**
  * cpufreq_get_requested_power() - get the current power
  * @cdev:      &thermal_cooling_device pointer
@@ -551,22 +478,93 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
                                      power);
        return 0;
 }
+#endif /* CONFIG_THERMAL_GOV_POWER_ALLOCATOR */
+
+/* cpufreq cooling device callback functions are defined below */
+
+/**
+ * cpufreq_get_max_state - callback function to get the max cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: fill this variable with the max cooling state.
+ *
+ * Callback for the thermal cooling device to return the cpufreq
+ * max cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
+                                unsigned long *state)
+{
+       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+
+       *state = cpufreq_cdev->max_level;
+       return 0;
+}
+
+/**
+ * cpufreq_get_cur_state - callback function to get the current cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: fill this variable with the current cooling state.
+ *
+ * Callback for the thermal cooling device to return the cpufreq
+ * current cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
+                                unsigned long *state)
+{
+       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+
+       *state = cpufreq_cdev->cpufreq_state;
+
+       return 0;
+}
+
+/**
+ * cpufreq_set_cur_state - callback function to set the current cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: set this variable to the current cooling state.
+ *
+ * Callback for the thermal cooling device to change the cpufreq
+ * current cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
+                                unsigned long state)
+{
+       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+       unsigned int clip_freq;
+
+       /* Request state should be less than max_level */
+       if (WARN_ON(state > cpufreq_cdev->max_level))
+               return -EINVAL;
+
+       /* Check if the old cooling action is same as new cooling action */
+       if (cpufreq_cdev->cpufreq_state == state)
+               return 0;
+
+       clip_freq = cpufreq_cdev->freq_table[state].frequency;
+       cpufreq_cdev->cpufreq_state = state;
+       cpufreq_cdev->clipped_freq = clip_freq;
+
+       cpufreq_update_policy(cpufreq_cdev->policy->cpu);
+
+       return 0;
+}
 
 /* Bind cpufreq callbacks to thermal cooling device ops */
 
 static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
-       .get_max_state = cpufreq_get_max_state,
-       .get_cur_state = cpufreq_get_cur_state,
-       .set_cur_state = cpufreq_set_cur_state,
-};
-
-static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
        .get_max_state          = cpufreq_get_max_state,
        .get_cur_state          = cpufreq_get_cur_state,
        .set_cur_state          = cpufreq_set_cur_state,
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        .get_requested_power    = cpufreq_get_requested_power,
        .state2power            = cpufreq_state2power,
        .power2state            = cpufreq_power2state,
+#endif
 };
 
 /* Notifier for cpufreq policy change */
@@ -674,17 +672,16 @@ __cpufreq_cooling_register(struct device_node *np,
                        pr_debug("%s: freq:%u KHz\n", __func__, freq);
        }
 
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        if (capacitance) {
                ret = update_freq_table(cpufreq_cdev, capacitance);
                if (ret) {
                        cdev = ERR_PTR(ret);
                        goto remove_ida;
                }
-
-               cooling_ops = &cpufreq_power_cooling_ops;
-       } else {
-               cooling_ops = &cpufreq_cooling_ops;
        }
+#endif
+       cooling_ops = &cpufreq_cooling_ops;
 
        cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
                                                  cooling_ops);

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
  2019-05-15 10:40                 ` Quentin Perret
@ 2019-05-15 10:46                   ` Quentin Perret
  -1 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15 10:46 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Daniel Lezcano, edubezval, rui.zhang, javi.merino, amit.kachhap,
	rjw, will.deacon, catalin.marinas, dietmar.eggemann,
	ionela.voinescu, mka, linux-pm, linux-kernel, linux-arm-kernel

On Wednesday 15 May 2019 at 11:40:44 (+0100), Quentin Perret wrote:
> On Wednesday 15 May 2019 at 15:52:00 (+0530), Viresh Kumar wrote:
> > On 15-05-19, 12:16, Daniel Lezcano wrote:
> > > Viresh what do you think ?
> > 
> > I agree with your last suggestions. They do make sense.
> 
> Good :-)
> 
> So, FWIW, the below compiles w/ or w/o THERMAL_GOV_POWER_ALLOCATOR. I'll
> test it and clean it up some more and put it as patch 1 in the series if
> that's OK.
> 
> Thanks,
> Quentin
> 
> 
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index f7c1f49ec87f..ee431848ef71 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -58,7 +58,9 @@
>   */
>  struct freq_table {
>         u32 frequency;
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         u32 power;
> +#endif
>  };
>  
>  /**
> @@ -109,28 +111,6 @@ static DEFINE_IDA(cpufreq_ida);
>  static DEFINE_MUTEX(cooling_list_lock);
>  static LIST_HEAD(cpufreq_cdev_list);
>  
> -/* Below code defines functions to be used for cpufreq as cooling device */
> -
> -/**
> - * get_level: Find the level for a particular frequency
> - * @cpufreq_cdev: cpufreq_cdev for which the property is required
> - * @freq: Frequency
> - *
> - * Return: level corresponding to the frequency.
> - */
> -static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
> -                              unsigned int freq)
> -{
> -       struct freq_table *freq_table = cpufreq_cdev->freq_table;
> -       unsigned long level;
> -
> -       for (level = 1; level <= cpufreq_cdev->max_level; level++)
> -               if (freq > freq_table[level].frequency)
> -                       break;
> -
> -       return level - 1;
> -}
> -
>  /**
>   * cpufreq_thermal_notifier - notifier callback for cpufreq policy change.
>   * @nb:        struct notifier_block * with callback info.
> @@ -184,6 +164,27 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
>         return NOTIFY_OK;
>  }
>  
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
> +/**
> + * get_level: Find the level for a particular frequency
> + * @cpufreq_cdev: cpufreq_cdev for which the property is required
> + * @freq: Frequency
> + *
> + * Return: level corresponding to the frequency.
> + */
> +static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
> +                              unsigned int freq)
> +{
> +       struct freq_table *freq_table = cpufreq_cdev->freq_table;
> +       unsigned long level;
> +
> +       for (level = 1; level <= cpufreq_cdev->max_level; level++)
> +               if (freq > freq_table[level].frequency)
> +                       break;
> +
> +       return level - 1;
> +}
> +
>  /**
>   * update_freq_table() - Update the freq table with power numbers
>   * @cpufreq_cdev:      the cpufreq cooling device in which to update the table
> @@ -333,80 +334,6 @@ static u32 get_dynamic_power(struct cpufreq_cooling_device *cpufreq_cdev,
>         return (raw_cpu_power * cpufreq_cdev->last_load) / 100;
>  }
>  
> -/* cpufreq cooling device callback functions are defined below */
> -
> -/**
> - * cpufreq_get_max_state - callback function to get the max cooling state.
> - * @cdev: thermal cooling device pointer.
> - * @state: fill this variable with the max cooling state.
> - *
> - * Callback for the thermal cooling device to return the cpufreq
> - * max cooling state.
> - *
> - * Return: 0 on success, an error code otherwise.
> - */
> -static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
> -                                unsigned long *state)
> -{
> -       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> -
> -       *state = cpufreq_cdev->max_level;
> -       return 0;
> -}
> -
> -/**
> - * cpufreq_get_cur_state - callback function to get the current cooling state.
> - * @cdev: thermal cooling device pointer.
> - * @state: fill this variable with the current cooling state.
> - *
> - * Callback for the thermal cooling device to return the cpufreq
> - * current cooling state.
> - *
> - * Return: 0 on success, an error code otherwise.
> - */
> -static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
> -                                unsigned long *state)
> -{
> -       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> -
> -       *state = cpufreq_cdev->cpufreq_state;
> -
> -       return 0;
> -}
> -
> -/**
> - * cpufreq_set_cur_state - callback function to set the current cooling state.
> - * @cdev: thermal cooling device pointer.
> - * @state: set this variable to the current cooling state.
> - *
> - * Callback for the thermal cooling device to change the cpufreq
> - * current cooling state.
> - *
> - * Return: 0 on success, an error code otherwise.
> - */
> -static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
> -                                unsigned long state)
> -{
> -       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> -       unsigned int clip_freq;
> -
> -       /* Request state should be less than max_level */
> -       if (WARN_ON(state > cpufreq_cdev->max_level))
> -               return -EINVAL;
> -
> -       /* Check if the old cooling action is same as new cooling action */
> -       if (cpufreq_cdev->cpufreq_state == state)
> -               return 0;
> -
> -       clip_freq = cpufreq_cdev->freq_table[state].frequency;
> -       cpufreq_cdev->cpufreq_state = state;
> -       cpufreq_cdev->clipped_freq = clip_freq;
> -
> -       cpufreq_update_policy(cpufreq_cdev->policy->cpu);
> -
> -       return 0;
> -}
> -
>  /**
>   * cpufreq_get_requested_power() - get the current power
>   * @cdev:      &thermal_cooling_device pointer
> @@ -551,22 +478,93 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
>                                       power);
>         return 0;
>  }
> +#endif /* CONFIG_THERMAL_GOV_POWER_ALLOCATOR */
> +
> +/* cpufreq cooling device callback functions are defined below */
> +
> +/**
> + * cpufreq_get_max_state - callback function to get the max cooling state.
> + * @cdev: thermal cooling device pointer.
> + * @state: fill this variable with the max cooling state.
> + *
> + * Callback for the thermal cooling device to return the cpufreq
> + * max cooling state.
> + *
> + * Return: 0 on success, an error code otherwise.
> + */
> +static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
> +                                unsigned long *state)
> +{
> +       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> +
> +       *state = cpufreq_cdev->max_level;
> +       return 0;
> +}
> +
> +/**
> + * cpufreq_get_cur_state - callback function to get the current cooling state.
> + * @cdev: thermal cooling device pointer.
> + * @state: fill this variable with the current cooling state.
> + *
> + * Callback for the thermal cooling device to return the cpufreq
> + * current cooling state.
> + *
> + * Return: 0 on success, an error code otherwise.
> + */
> +static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
> +                                unsigned long *state)
> +{
> +       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> +
> +       *state = cpufreq_cdev->cpufreq_state;
> +
> +       return 0;
> +}
> +
> +/**
> + * cpufreq_set_cur_state - callback function to set the current cooling state.
> + * @cdev: thermal cooling device pointer.
> + * @state: set this variable to the current cooling state.
> + *
> + * Callback for the thermal cooling device to change the cpufreq
> + * current cooling state.
> + *
> + * Return: 0 on success, an error code otherwise.
> + */
> +static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
> +                                unsigned long state)
> +{
> +       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> +       unsigned int clip_freq;
> +
> +       /* Request state should be less than max_level */
> +       if (WARN_ON(state > cpufreq_cdev->max_level))
> +               return -EINVAL;
> +
> +       /* Check if the old cooling action is same as new cooling action */
> +       if (cpufreq_cdev->cpufreq_state == state)
> +               return 0;
> +
> +       clip_freq = cpufreq_cdev->freq_table[state].frequency;
> +       cpufreq_cdev->cpufreq_state = state;
> +       cpufreq_cdev->clipped_freq = clip_freq;
> +
> +       cpufreq_update_policy(cpufreq_cdev->policy->cpu);
> +
> +       return 0;
> +}
>  
>  /* Bind cpufreq callbacks to thermal cooling device ops */
>  
>  static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
> -       .get_max_state = cpufreq_get_max_state,
> -       .get_cur_state = cpufreq_get_cur_state,
> -       .set_cur_state = cpufreq_set_cur_state,
> -};
> -
> -static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
>         .get_max_state          = cpufreq_get_max_state,
>         .get_cur_state          = cpufreq_get_cur_state,
>         .set_cur_state          = cpufreq_set_cur_state,
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         .get_requested_power    = cpufreq_get_requested_power,
>         .state2power            = cpufreq_state2power,
>         .power2state            = cpufreq_power2state,
> +#endif
>  };
>  
>  /* Notifier for cpufreq policy change */
> @@ -674,17 +672,16 @@ __cpufreq_cooling_register(struct device_node *np,
>                         pr_debug("%s: freq:%u KHz\n", __func__, freq);
>         }
>  
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         if (capacitance) {
>                 ret = update_freq_table(cpufreq_cdev, capacitance);
>                 if (ret) {
>                         cdev = ERR_PTR(ret);
>                         goto remove_ida;
>                 }
> -
> -               cooling_ops = &cpufreq_power_cooling_ops;
> -       } else {
> -               cooling_ops = &cpufreq_cooling_ops;
>         }
> +#endif
> +       cooling_ops = &cpufreq_cooling_ops;

Argh, that is actually broken with !capacitance and
THERMAL_GOV_POWER_ALLOCATOR=y ... Perhaps it's best to keep the two
thermal_cooling_device_ops struct separated in the end.

>  
>         cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
>                                                   cooling_ops);

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15 10:46                   ` Quentin Perret
  0 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15 10:46 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: linux-pm, will.deacon, amit.kachhap, Daniel Lezcano, rjw,
	linux-kernel, edubezval, mka, catalin.marinas, rui.zhang,
	javi.merino, ionela.voinescu, dietmar.eggemann, linux-arm-kernel

On Wednesday 15 May 2019 at 11:40:44 (+0100), Quentin Perret wrote:
> On Wednesday 15 May 2019 at 15:52:00 (+0530), Viresh Kumar wrote:
> > On 15-05-19, 12:16, Daniel Lezcano wrote:
> > > Viresh what do you think ?
> > 
> > I agree with your last suggestions. They do make sense.
> 
> Good :-)
> 
> So, FWIW, the below compiles w/ or w/o THERMAL_GOV_POWER_ALLOCATOR. I'll
> test it and clean it up some more and put it as patch 1 in the series if
> that's OK.
> 
> Thanks,
> Quentin
> 
> 
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index f7c1f49ec87f..ee431848ef71 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -58,7 +58,9 @@
>   */
>  struct freq_table {
>         u32 frequency;
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         u32 power;
> +#endif
>  };
>  
>  /**
> @@ -109,28 +111,6 @@ static DEFINE_IDA(cpufreq_ida);
>  static DEFINE_MUTEX(cooling_list_lock);
>  static LIST_HEAD(cpufreq_cdev_list);
>  
> -/* Below code defines functions to be used for cpufreq as cooling device */
> -
> -/**
> - * get_level: Find the level for a particular frequency
> - * @cpufreq_cdev: cpufreq_cdev for which the property is required
> - * @freq: Frequency
> - *
> - * Return: level corresponding to the frequency.
> - */
> -static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
> -                              unsigned int freq)
> -{
> -       struct freq_table *freq_table = cpufreq_cdev->freq_table;
> -       unsigned long level;
> -
> -       for (level = 1; level <= cpufreq_cdev->max_level; level++)
> -               if (freq > freq_table[level].frequency)
> -                       break;
> -
> -       return level - 1;
> -}
> -
>  /**
>   * cpufreq_thermal_notifier - notifier callback for cpufreq policy change.
>   * @nb:        struct notifier_block * with callback info.
> @@ -184,6 +164,27 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
>         return NOTIFY_OK;
>  }
>  
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
> +/**
> + * get_level: Find the level for a particular frequency
> + * @cpufreq_cdev: cpufreq_cdev for which the property is required
> + * @freq: Frequency
> + *
> + * Return: level corresponding to the frequency.
> + */
> +static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
> +                              unsigned int freq)
> +{
> +       struct freq_table *freq_table = cpufreq_cdev->freq_table;
> +       unsigned long level;
> +
> +       for (level = 1; level <= cpufreq_cdev->max_level; level++)
> +               if (freq > freq_table[level].frequency)
> +                       break;
> +
> +       return level - 1;
> +}
> +
>  /**
>   * update_freq_table() - Update the freq table with power numbers
>   * @cpufreq_cdev:      the cpufreq cooling device in which to update the table
> @@ -333,80 +334,6 @@ static u32 get_dynamic_power(struct cpufreq_cooling_device *cpufreq_cdev,
>         return (raw_cpu_power * cpufreq_cdev->last_load) / 100;
>  }
>  
> -/* cpufreq cooling device callback functions are defined below */
> -
> -/**
> - * cpufreq_get_max_state - callback function to get the max cooling state.
> - * @cdev: thermal cooling device pointer.
> - * @state: fill this variable with the max cooling state.
> - *
> - * Callback for the thermal cooling device to return the cpufreq
> - * max cooling state.
> - *
> - * Return: 0 on success, an error code otherwise.
> - */
> -static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
> -                                unsigned long *state)
> -{
> -       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> -
> -       *state = cpufreq_cdev->max_level;
> -       return 0;
> -}
> -
> -/**
> - * cpufreq_get_cur_state - callback function to get the current cooling state.
> - * @cdev: thermal cooling device pointer.
> - * @state: fill this variable with the current cooling state.
> - *
> - * Callback for the thermal cooling device to return the cpufreq
> - * current cooling state.
> - *
> - * Return: 0 on success, an error code otherwise.
> - */
> -static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
> -                                unsigned long *state)
> -{
> -       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> -
> -       *state = cpufreq_cdev->cpufreq_state;
> -
> -       return 0;
> -}
> -
> -/**
> - * cpufreq_set_cur_state - callback function to set the current cooling state.
> - * @cdev: thermal cooling device pointer.
> - * @state: set this variable to the current cooling state.
> - *
> - * Callback for the thermal cooling device to change the cpufreq
> - * current cooling state.
> - *
> - * Return: 0 on success, an error code otherwise.
> - */
> -static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
> -                                unsigned long state)
> -{
> -       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> -       unsigned int clip_freq;
> -
> -       /* Request state should be less than max_level */
> -       if (WARN_ON(state > cpufreq_cdev->max_level))
> -               return -EINVAL;
> -
> -       /* Check if the old cooling action is same as new cooling action */
> -       if (cpufreq_cdev->cpufreq_state == state)
> -               return 0;
> -
> -       clip_freq = cpufreq_cdev->freq_table[state].frequency;
> -       cpufreq_cdev->cpufreq_state = state;
> -       cpufreq_cdev->clipped_freq = clip_freq;
> -
> -       cpufreq_update_policy(cpufreq_cdev->policy->cpu);
> -
> -       return 0;
> -}
> -
>  /**
>   * cpufreq_get_requested_power() - get the current power
>   * @cdev:      &thermal_cooling_device pointer
> @@ -551,22 +478,93 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
>                                       power);
>         return 0;
>  }
> +#endif /* CONFIG_THERMAL_GOV_POWER_ALLOCATOR */
> +
> +/* cpufreq cooling device callback functions are defined below */
> +
> +/**
> + * cpufreq_get_max_state - callback function to get the max cooling state.
> + * @cdev: thermal cooling device pointer.
> + * @state: fill this variable with the max cooling state.
> + *
> + * Callback for the thermal cooling device to return the cpufreq
> + * max cooling state.
> + *
> + * Return: 0 on success, an error code otherwise.
> + */
> +static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
> +                                unsigned long *state)
> +{
> +       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> +
> +       *state = cpufreq_cdev->max_level;
> +       return 0;
> +}
> +
> +/**
> + * cpufreq_get_cur_state - callback function to get the current cooling state.
> + * @cdev: thermal cooling device pointer.
> + * @state: fill this variable with the current cooling state.
> + *
> + * Callback for the thermal cooling device to return the cpufreq
> + * current cooling state.
> + *
> + * Return: 0 on success, an error code otherwise.
> + */
> +static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
> +                                unsigned long *state)
> +{
> +       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> +
> +       *state = cpufreq_cdev->cpufreq_state;
> +
> +       return 0;
> +}
> +
> +/**
> + * cpufreq_set_cur_state - callback function to set the current cooling state.
> + * @cdev: thermal cooling device pointer.
> + * @state: set this variable to the current cooling state.
> + *
> + * Callback for the thermal cooling device to change the cpufreq
> + * current cooling state.
> + *
> + * Return: 0 on success, an error code otherwise.
> + */
> +static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
> +                                unsigned long state)
> +{
> +       struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
> +       unsigned int clip_freq;
> +
> +       /* Request state should be less than max_level */
> +       if (WARN_ON(state > cpufreq_cdev->max_level))
> +               return -EINVAL;
> +
> +       /* Check if the old cooling action is same as new cooling action */
> +       if (cpufreq_cdev->cpufreq_state == state)
> +               return 0;
> +
> +       clip_freq = cpufreq_cdev->freq_table[state].frequency;
> +       cpufreq_cdev->cpufreq_state = state;
> +       cpufreq_cdev->clipped_freq = clip_freq;
> +
> +       cpufreq_update_policy(cpufreq_cdev->policy->cpu);
> +
> +       return 0;
> +}
>  
>  /* Bind cpufreq callbacks to thermal cooling device ops */
>  
>  static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
> -       .get_max_state = cpufreq_get_max_state,
> -       .get_cur_state = cpufreq_get_cur_state,
> -       .set_cur_state = cpufreq_set_cur_state,
> -};
> -
> -static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
>         .get_max_state          = cpufreq_get_max_state,
>         .get_cur_state          = cpufreq_get_cur_state,
>         .set_cur_state          = cpufreq_set_cur_state,
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         .get_requested_power    = cpufreq_get_requested_power,
>         .state2power            = cpufreq_state2power,
>         .power2state            = cpufreq_power2state,
> +#endif
>  };
>  
>  /* Notifier for cpufreq policy change */
> @@ -674,17 +672,16 @@ __cpufreq_cooling_register(struct device_node *np,
>                         pr_debug("%s: freq:%u KHz\n", __func__, freq);
>         }
>  
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         if (capacitance) {
>                 ret = update_freq_table(cpufreq_cdev, capacitance);
>                 if (ret) {
>                         cdev = ERR_PTR(ret);
>                         goto remove_ida;
>                 }
> -
> -               cooling_ops = &cpufreq_power_cooling_ops;
> -       } else {
> -               cooling_ops = &cpufreq_cooling_ops;
>         }
> +#endif
> +       cooling_ops = &cpufreq_cooling_ops;

Argh, that is actually broken with !capacitance and
THERMAL_GOV_POWER_ALLOCATOR=y ... Perhaps it's best to keep the two
thermal_cooling_device_ops struct separated in the end.

>  
>         cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
>                                                   cooling_ops);

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
  2019-05-15 10:46                   ` Quentin Perret
@ 2019-05-15 10:51                     ` Daniel Lezcano
  -1 siblings, 0 replies; 40+ messages in thread
From: Daniel Lezcano @ 2019-05-15 10:51 UTC (permalink / raw)
  To: Quentin Perret, Viresh Kumar
  Cc: edubezval, rui.zhang, javi.merino, amit.kachhap, rjw,
	will.deacon, catalin.marinas, dietmar.eggemann, ionela.voinescu,
	mka, linux-pm, linux-kernel, linux-arm-kernel

On 15/05/2019 12:46, Quentin Perret wrote:
> On Wednesday 15 May 2019 at 11:40:44 (+0100), Quentin Perret wrote:

[ ... ]

>> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>>         if (capacitance) {
>>                 ret = update_freq_table(cpufreq_cdev, capacitance);
>>                 if (ret) {
>>                         cdev = ERR_PTR(ret);
>>                         goto remove_ida;
>>                 }
>> -
>> -               cooling_ops = &cpufreq_power_cooling_ops;
>> -       } else {
>> -               cooling_ops = &cpufreq_cooling_ops;
>>         }
>> +#endif
>> +       cooling_ops = &cpufreq_cooling_ops;
> 
> Argh, that is actually broken with !capacitance and
> THERMAL_GOV_POWER_ALLOCATOR=y ... Perhaps it's best to keep the two
> thermal_cooling_device_ops struct separated in the end.

Or alternatively you can keep one structure but instead of filling the
state2power,power2state and getrequestedpower fields in the declaration,
you fill them in the if (capacitance) block, no?




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15 10:51                     ` Daniel Lezcano
  0 siblings, 0 replies; 40+ messages in thread
From: Daniel Lezcano @ 2019-05-15 10:51 UTC (permalink / raw)
  To: Quentin Perret, Viresh Kumar
  Cc: linux-pm, rjw, amit.kachhap, will.deacon, linux-kernel,
	edubezval, mka, catalin.marinas, rui.zhang, javi.merino,
	ionela.voinescu, dietmar.eggemann, linux-arm-kernel

On 15/05/2019 12:46, Quentin Perret wrote:
> On Wednesday 15 May 2019 at 11:40:44 (+0100), Quentin Perret wrote:

[ ... ]

>> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>>         if (capacitance) {
>>                 ret = update_freq_table(cpufreq_cdev, capacitance);
>>                 if (ret) {
>>                         cdev = ERR_PTR(ret);
>>                         goto remove_ida;
>>                 }
>> -
>> -               cooling_ops = &cpufreq_power_cooling_ops;
>> -       } else {
>> -               cooling_ops = &cpufreq_cooling_ops;
>>         }
>> +#endif
>> +       cooling_ops = &cpufreq_cooling_ops;
> 
> Argh, that is actually broken with !capacitance and
> THERMAL_GOV_POWER_ALLOCATOR=y ... Perhaps it's best to keep the two
> thermal_cooling_device_ops struct separated in the end.

Or alternatively you can keep one structure but instead of filling the
state2power,power2state and getrequestedpower fields in the declaration,
you fill them in the if (capacitance) block, no?




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
  2019-05-15 10:40                 ` Quentin Perret
@ 2019-05-15 10:54                   ` Daniel Lezcano
  -1 siblings, 0 replies; 40+ messages in thread
From: Daniel Lezcano @ 2019-05-15 10:54 UTC (permalink / raw)
  To: Quentin Perret, Viresh Kumar
  Cc: edubezval, rui.zhang, javi.merino, amit.kachhap, rjw,
	will.deacon, catalin.marinas, dietmar.eggemann, ionela.voinescu,
	mka, linux-pm, linux-kernel, linux-arm-kernel

On 15/05/2019 12:40, Quentin Perret wrote:
> On Wednesday 15 May 2019 at 15:52:00 (+0530), Viresh Kumar wrote:
>> On 15-05-19, 12:16, Daniel Lezcano wrote:
>>> Viresh what do you think ?
>>
>> I agree with your last suggestions. They do make sense.
> 
> Good :-)
> 
> So, FWIW, the below compiles w/ or w/o THERMAL_GOV_POWER_ALLOCATOR. I'll
> test it and clean it up some more and put it as patch 1 in the series if
> that's OK.
> 
> Thanks,
> Quentin
> 
> 
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index f7c1f49ec87f..ee431848ef71 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -58,7 +58,9 @@
>   */
>  struct freq_table {
>         u32 frequency;

I suspect we will have a problem here as cpufreq_set_cur_state uses the
frequency and when switching to EM, we will still need a freq table.


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15 10:54                   ` Daniel Lezcano
  0 siblings, 0 replies; 40+ messages in thread
From: Daniel Lezcano @ 2019-05-15 10:54 UTC (permalink / raw)
  To: Quentin Perret, Viresh Kumar
  Cc: linux-pm, rjw, amit.kachhap, will.deacon, linux-kernel,
	edubezval, mka, catalin.marinas, rui.zhang, javi.merino,
	ionela.voinescu, dietmar.eggemann, linux-arm-kernel

On 15/05/2019 12:40, Quentin Perret wrote:
> On Wednesday 15 May 2019 at 15:52:00 (+0530), Viresh Kumar wrote:
>> On 15-05-19, 12:16, Daniel Lezcano wrote:
>>> Viresh what do you think ?
>>
>> I agree with your last suggestions. They do make sense.
> 
> Good :-)
> 
> So, FWIW, the below compiles w/ or w/o THERMAL_GOV_POWER_ALLOCATOR. I'll
> test it and clean it up some more and put it as patch 1 in the series if
> that's OK.
> 
> Thanks,
> Quentin
> 
> 
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index f7c1f49ec87f..ee431848ef71 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -58,7 +58,9 @@
>   */
>  struct freq_table {
>         u32 frequency;

I suspect we will have a problem here as cpufreq_set_cur_state uses the
frequency and when switching to EM, we will still need a freq table.


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
  2019-05-15 10:54                   ` Daniel Lezcano
@ 2019-05-15 10:57                     ` Quentin Perret
  -1 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15 10:57 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Viresh Kumar, edubezval, rui.zhang, javi.merino, amit.kachhap,
	rjw, will.deacon, catalin.marinas, dietmar.eggemann,
	ionela.voinescu, mka, linux-pm, linux-kernel, linux-arm-kernel

On Wednesday 15 May 2019 at 12:54:10 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 12:40, Quentin Perret wrote:
> > On Wednesday 15 May 2019 at 15:52:00 (+0530), Viresh Kumar wrote:
> >> On 15-05-19, 12:16, Daniel Lezcano wrote:
> >>> Viresh what do you think ?
> >>
> >> I agree with your last suggestions. They do make sense.
> > 
> > Good :-)
> > 
> > So, FWIW, the below compiles w/ or w/o THERMAL_GOV_POWER_ALLOCATOR. I'll
> > test it and clean it up some more and put it as patch 1 in the series if
> > that's OK.
> > 
> > Thanks,
> > Quentin
> > 
> > 
> > diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> > index f7c1f49ec87f..ee431848ef71 100644
> > --- a/drivers/thermal/cpu_cooling.c
> > +++ b/drivers/thermal/cpu_cooling.c
> > @@ -58,7 +58,9 @@
> >   */
> >  struct freq_table {
> >         u32 frequency;
> 
> I suspect we will have a problem here as cpufreq_set_cur_state uses the
> frequency and when switching to EM, we will still need a freq table.

Indeed, I'll need a bit of ifdefery in the get_state_freq() function
introduced in patch 3, but nothing too horrible I hope.

Thanks,
Quentin

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15 10:57                     ` Quentin Perret
  0 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15 10:57 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: linux-pm, Viresh Kumar, amit.kachhap, rjw, linux-kernel,
	will.deacon, edubezval, mka, catalin.marinas, rui.zhang,
	javi.merino, ionela.voinescu, dietmar.eggemann, linux-arm-kernel

On Wednesday 15 May 2019 at 12:54:10 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 12:40, Quentin Perret wrote:
> > On Wednesday 15 May 2019 at 15:52:00 (+0530), Viresh Kumar wrote:
> >> On 15-05-19, 12:16, Daniel Lezcano wrote:
> >>> Viresh what do you think ?
> >>
> >> I agree with your last suggestions. They do make sense.
> > 
> > Good :-)
> > 
> > So, FWIW, the below compiles w/ or w/o THERMAL_GOV_POWER_ALLOCATOR. I'll
> > test it and clean it up some more and put it as patch 1 in the series if
> > that's OK.
> > 
> > Thanks,
> > Quentin
> > 
> > 
> > diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> > index f7c1f49ec87f..ee431848ef71 100644
> > --- a/drivers/thermal/cpu_cooling.c
> > +++ b/drivers/thermal/cpu_cooling.c
> > @@ -58,7 +58,9 @@
> >   */
> >  struct freq_table {
> >         u32 frequency;
> 
> I suspect we will have a problem here as cpufreq_set_cur_state uses the
> frequency and when switching to EM, we will still need a freq table.

Indeed, I'll need a bit of ifdefery in the get_state_freq() function
introduced in patch 3, but nothing too horrible I hope.

Thanks,
Quentin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
  2019-05-15 10:51                     ` Daniel Lezcano
  (?)
@ 2019-05-15 11:01                       ` Quentin Perret
  -1 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15 11:01 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Viresh Kumar, edubezval, rui.zhang, javi.merino, amit.kachhap,
	rjw, will.deacon, catalin.marinas, dietmar.eggemann,
	ionela.voinescu, mka, linux-pm, linux-kernel, linux-arm-kernel

On Wednesday 15 May 2019 at 12:51:57 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 12:46, Quentin Perret wrote:
> > On Wednesday 15 May 2019 at 11:40:44 (+0100), Quentin Perret wrote:
> 
> [ ... ]
> 
> >> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
> >>         if (capacitance) {
> >>                 ret = update_freq_table(cpufreq_cdev, capacitance);
> >>                 if (ret) {
> >>                         cdev = ERR_PTR(ret);
> >>                         goto remove_ida;
> >>                 }
> >> -
> >> -               cooling_ops = &cpufreq_power_cooling_ops;
> >> -       } else {
> >> -               cooling_ops = &cpufreq_cooling_ops;
> >>         }
> >> +#endif
> >> +       cooling_ops = &cpufreq_cooling_ops;
> > 
> > Argh, that is actually broken with !capacitance and
> > THERMAL_GOV_POWER_ALLOCATOR=y ... Perhaps it's best to keep the two
> > thermal_cooling_device_ops struct separated in the end.
> 
> Or alternatively you can keep one structure but instead of filling the
> state2power,power2state and getrequestedpower fields in the declaration,
> you fill them in the if (capacitance) block, no?

Something like the below ? Yes, that works too. I'll write a proper
patch and send that next week or so.

Thanks,
Quentin

--->8---

 /* Bind cpufreq callbacks to thermal cooling device ops */

 static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
-       .get_max_state = cpufreq_get_max_state,
-       .get_cur_state = cpufreq_get_cur_state,
-       .set_cur_state = cpufreq_set_cur_state,
-};
-
-static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
        .get_max_state          = cpufreq_get_max_state,
        .get_cur_state          = cpufreq_get_cur_state,
        .set_cur_state          = cpufreq_set_cur_state,
-       .get_requested_power    = cpufreq_get_requested_power,
-       .state2power            = cpufreq_state2power,
-       .power2state            = cpufreq_power2state,
 };

 /* Notifier for cpufreq policy change */
@@ -674,18 +667,19 @@ __cpufreq_cooling_register(struct device_node *np,
                        pr_debug("%s: freq:%u KHz\n", __func__, freq);
        }

+       cooling_ops = &cpufreq_cooling_ops;
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        if (capacitance) {
                ret = update_freq_table(cpufreq_cdev, capacitance);
                if (ret) {
                        cdev = ERR_PTR(ret);
                        goto remove_ida;
                }
-
-               cooling_ops = &cpufreq_power_cooling_ops;
-       } else {
-               cooling_ops = &cpufreq_cooling_ops;
+               cooling_ops->get_requested_power = cpufreq_get_requested_power;
+               cooling_ops->state2power = cpufreq_state2power;
+               cooling_ops->power2state = cpufreq_power2state;
        }
-
+#endif
        cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
                                                  cooling_ops);
        if (IS_ERR(cdev))

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15 11:01                       ` Quentin Perret
  0 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15 11:01 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: linux-pm, Viresh Kumar, amit.kachhap, rjw, linux-kernel,
	will.deacon, edubezval, mka, catalin.marinas, rui.zhang,
	javi.merino, ionela.voinescu, dietmar.eggemann, linux-arm-kernel

On Wednesday 15 May 2019 at 12:51:57 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 12:46, Quentin Perret wrote:
> > On Wednesday 15 May 2019 at 11:40:44 (+0100), Quentin Perret wrote:
> 
> [ ... ]
> 
> >> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
> >>         if (capacitance) {
> >>                 ret = update_freq_table(cpufreq_cdev, capacitance);
> >>                 if (ret) {
> >>                         cdev = ERR_PTR(ret);
> >>                         goto remove_ida;
> >>                 }
> >> -
> >> -               cooling_ops = &cpufreq_power_cooling_ops;
> >> -       } else {
> >> -               cooling_ops = &cpufreq_cooling_ops;
> >>         }
> >> +#endif
> >> +       cooling_ops = &cpufreq_cooling_ops;
> > 
> > Argh, that is actually broken with !capacitance and
> > THERMAL_GOV_POWER_ALLOCATOR=y ... Perhaps it's best to keep the two
> > thermal_cooling_device_ops struct separated in the end.
> 
> Or alternatively you can keep one structure but instead of filling the
> state2power,power2state and getrequestedpower fields in the declaration,
> you fill them in the if (capacitance) block, no?

Something like the below ? Yes, that works too. I'll write a proper
patch and send that next week or so.

Thanks,
Quentin

--->8---

 /* Bind cpufreq callbacks to thermal cooling device ops */

 static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
-       .get_max_state = cpufreq_get_max_state,
-       .get_cur_state = cpufreq_get_cur_state,
-       .set_cur_state = cpufreq_set_cur_state,
-};
-
-static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
        .get_max_state          = cpufreq_get_max_state,
        .get_cur_state          = cpufreq_get_cur_state,
        .set_cur_state          = cpufreq_set_cur_state,
-       .get_requested_power    = cpufreq_get_requested_power,
-       .state2power            = cpufreq_state2power,
-       .power2state            = cpufreq_power2state,
 };

 /* Notifier for cpufreq policy change */
@@ -674,18 +667,19 @@ __cpufreq_cooling_register(struct device_node *np,
                        pr_debug("%s: freq:%u KHz\n", __func__, freq);
        }

+       cooling_ops = &cpufreq_cooling_ops;
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        if (capacitance) {
                ret = update_freq_table(cpufreq_cdev, capacitance);
                if (ret) {
                        cdev = ERR_PTR(ret);
                        goto remove_ida;
                }
-
-               cooling_ops = &cpufreq_power_cooling_ops;
-       } else {
-               cooling_ops = &cpufreq_cooling_ops;
+               cooling_ops->get_requested_power = cpufreq_get_requested_power;
+               cooling_ops->state2power = cpufreq_state2power;
+               cooling_ops->power2state = cpufreq_power2state;
        }
-
+#endif
        cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
                                                  cooling_ops);
        if (IS_ERR(cdev))

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15 11:01                       ` Quentin Perret
  0 siblings, 0 replies; 40+ messages in thread
From: Quentin Perret @ 2019-05-15 11:01 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: linux-pm, Viresh Kumar, amit.kachhap, rjw, linux-kernel,
	will.deacon, edubezval, mka, catalin.marinas, rui.zhang,
	javi.merino, ionela.voinescu, dietmar.eggemann, linux-arm-kernel

On Wednesday 15 May 2019 at 12:51:57 (+0200), Daniel Lezcano wrote:
> On 15/05/2019 12:46, Quentin Perret wrote:
> > On Wednesday 15 May 2019 at 11:40:44 (+0100), Quentin Perret wrote:
> 
> [ ... ]
> 
> >> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
> >>         if (capacitance) {
> >>                 ret = update_freq_table(cpufreq_cdev, capacitance);
> >>                 if (ret) {
> >>                         cdev = ERR_PTR(ret);
> >>                         goto remove_ida;
> >>                 }
> >> -
> >> -               cooling_ops = &cpufreq_power_cooling_ops;
> >> -       } else {
> >> -               cooling_ops = &cpufreq_cooling_ops;
> >>         }
> >> +#endif
> >> +       cooling_ops = &cpufreq_cooling_ops;
> > 
> > Argh, that is actually broken with !capacitance and
> > THERMAL_GOV_POWER_ALLOCATOR=y ... Perhaps it's best to keep the two
> > thermal_cooling_device_ops struct separated in the end.
> 
> Or alternatively you can keep one structure but instead of filling the
> state2power,power2state and getrequestedpower fields in the declaration,
> you fill them in the if (capacitance) block, no?

Something like the below ? Yes, that works too. I'll write a proper
patch and send that next week or so.

Thanks,
Quentin

--->8---

 /* Bind cpufreq callbacks to thermal cooling device ops */

 static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
-       .get_max_state = cpufreq_get_max_state,
-       .get_cur_state = cpufreq_get_cur_state,
-       .set_cur_state = cpufreq_set_cur_state,
-};
-
-static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
        .get_max_state          = cpufreq_get_max_state,
        .get_cur_state          = cpufreq_get_cur_state,
        .set_cur_state          = cpufreq_set_cur_state,
-       .get_requested_power    = cpufreq_get_requested_power,
-       .state2power            = cpufreq_state2power,
-       .power2state            = cpufreq_power2state,
 };

 /* Notifier for cpufreq policy change */
@@ -674,18 +667,19 @@ __cpufreq_cooling_register(struct device_node *np,
                        pr_debug("%s: freq:%u KHz\n", __func__, freq);
        }

+       cooling_ops = &cpufreq_cooling_ops;
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
        if (capacitance) {
                ret = update_freq_table(cpufreq_cdev, capacitance);
                if (ret) {
                        cdev = ERR_PTR(ret);
                        goto remove_ida;
                }
-
-               cooling_ops = &cpufreq_power_cooling_ops;
-       } else {
-               cooling_ops = &cpufreq_cooling_ops;
+               cooling_ops->get_requested_power = cpufreq_get_requested_power;
+               cooling_ops->state2power = cpufreq_state2power;
+               cooling_ops->power2state = cpufreq_power2state;
        }
-
+#endif
        cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
                                                  cooling_ops);
        if (IS_ERR(cdev))

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
  2019-05-15 11:01                       ` Quentin Perret
@ 2019-05-15 11:07                         ` Daniel Lezcano
  -1 siblings, 0 replies; 40+ messages in thread
From: Daniel Lezcano @ 2019-05-15 11:07 UTC (permalink / raw)
  To: Quentin Perret
  Cc: Viresh Kumar, edubezval, rui.zhang, javi.merino, amit.kachhap,
	rjw, will.deacon, catalin.marinas, dietmar.eggemann,
	ionela.voinescu, mka, linux-pm, linux-kernel, linux-arm-kernel

On 15/05/2019 13:01, Quentin Perret wrote:
> On Wednesday 15 May 2019 at 12:51:57 (+0200), Daniel Lezcano wrote:
>> On 15/05/2019 12:46, Quentin Perret wrote:
>>> On Wednesday 15 May 2019 at 11:40:44 (+0100), Quentin Perret wrote:
>>
>> [ ... ]
>>
>>>> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>>>>         if (capacitance) {
>>>>                 ret = update_freq_table(cpufreq_cdev, capacitance);
>>>>                 if (ret) {
>>>>                         cdev = ERR_PTR(ret);
>>>>                         goto remove_ida;
>>>>                 }
>>>> -
>>>> -               cooling_ops = &cpufreq_power_cooling_ops;
>>>> -       } else {
>>>> -               cooling_ops = &cpufreq_cooling_ops;
>>>>         }
>>>> +#endif
>>>> +       cooling_ops = &cpufreq_cooling_ops;
>>>
>>> Argh, that is actually broken with !capacitance and
>>> THERMAL_GOV_POWER_ALLOCATOR=y ... Perhaps it's best to keep the two
>>> thermal_cooling_device_ops struct separated in the end.
>>
>> Or alternatively you can keep one structure but instead of filling the
>> state2power,power2state and getrequestedpower fields in the declaration,
>> you fill them in the if (capacitance) block, no?
> 
> Something like the below ? Yes, that works too. I'll write a proper
> patch and send that next week or so.

Yes, exactly. And IMHO, that helps for the understanding of code also.

> --->8---
> 
>  /* Bind cpufreq callbacks to thermal cooling device ops */
> 
>  static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
> -       .get_max_state = cpufreq_get_max_state,
> -       .get_cur_state = cpufreq_get_cur_state,
> -       .set_cur_state = cpufreq_set_cur_state,
> -};
> -
> -static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
>         .get_max_state          = cpufreq_get_max_state,
>         .get_cur_state          = cpufreq_get_cur_state,
>         .set_cur_state          = cpufreq_set_cur_state,
> -       .get_requested_power    = cpufreq_get_requested_power,
> -       .state2power            = cpufreq_state2power,
> -       .power2state            = cpufreq_power2state,
>  };
> 
>  /* Notifier for cpufreq policy change */
> @@ -674,18 +667,19 @@ __cpufreq_cooling_register(struct device_node *np,
>                         pr_debug("%s: freq:%u KHz\n", __func__, freq);
>         }
> 
> +       cooling_ops = &cpufreq_cooling_ops;
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         if (capacitance) {
>                 ret = update_freq_table(cpufreq_cdev, capacitance);
>                 if (ret) {
>                         cdev = ERR_PTR(ret);
>                         goto remove_ida;
>                 }
> -
> -               cooling_ops = &cpufreq_power_cooling_ops;
> -       } else {
> -               cooling_ops = &cpufreq_cooling_ops;
> +               cooling_ops->get_requested_power = cpufreq_get_requested_power;
> +               cooling_ops->state2power = cpufreq_state2power;
> +               cooling_ops->power2state = cpufreq_power2state;
>         }
> -
> +#endif
>         cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
>                                                   cooling_ops);
>         if (IS_ERR(cdev))
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v4 2/3] PM / EM: Expose perf domain struct
@ 2019-05-15 11:07                         ` Daniel Lezcano
  0 siblings, 0 replies; 40+ messages in thread
From: Daniel Lezcano @ 2019-05-15 11:07 UTC (permalink / raw)
  To: Quentin Perret
  Cc: linux-pm, Viresh Kumar, amit.kachhap, rjw, linux-kernel,
	will.deacon, edubezval, mka, catalin.marinas, rui.zhang,
	javi.merino, ionela.voinescu, dietmar.eggemann, linux-arm-kernel

On 15/05/2019 13:01, Quentin Perret wrote:
> On Wednesday 15 May 2019 at 12:51:57 (+0200), Daniel Lezcano wrote:
>> On 15/05/2019 12:46, Quentin Perret wrote:
>>> On Wednesday 15 May 2019 at 11:40:44 (+0100), Quentin Perret wrote:
>>
>> [ ... ]
>>
>>>> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>>>>         if (capacitance) {
>>>>                 ret = update_freq_table(cpufreq_cdev, capacitance);
>>>>                 if (ret) {
>>>>                         cdev = ERR_PTR(ret);
>>>>                         goto remove_ida;
>>>>                 }
>>>> -
>>>> -               cooling_ops = &cpufreq_power_cooling_ops;
>>>> -       } else {
>>>> -               cooling_ops = &cpufreq_cooling_ops;
>>>>         }
>>>> +#endif
>>>> +       cooling_ops = &cpufreq_cooling_ops;
>>>
>>> Argh, that is actually broken with !capacitance and
>>> THERMAL_GOV_POWER_ALLOCATOR=y ... Perhaps it's best to keep the two
>>> thermal_cooling_device_ops struct separated in the end.
>>
>> Or alternatively you can keep one structure but instead of filling the
>> state2power,power2state and getrequestedpower fields in the declaration,
>> you fill them in the if (capacitance) block, no?
> 
> Something like the below ? Yes, that works too. I'll write a proper
> patch and send that next week or so.

Yes, exactly. And IMHO, that helps for the understanding of code also.

> --->8---
> 
>  /* Bind cpufreq callbacks to thermal cooling device ops */
> 
>  static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
> -       .get_max_state = cpufreq_get_max_state,
> -       .get_cur_state = cpufreq_get_cur_state,
> -       .set_cur_state = cpufreq_set_cur_state,
> -};
> -
> -static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
>         .get_max_state          = cpufreq_get_max_state,
>         .get_cur_state          = cpufreq_get_cur_state,
>         .set_cur_state          = cpufreq_set_cur_state,
> -       .get_requested_power    = cpufreq_get_requested_power,
> -       .state2power            = cpufreq_state2power,
> -       .power2state            = cpufreq_power2state,
>  };
> 
>  /* Notifier for cpufreq policy change */
> @@ -674,18 +667,19 @@ __cpufreq_cooling_register(struct device_node *np,
>                         pr_debug("%s: freq:%u KHz\n", __func__, freq);
>         }
> 
> +       cooling_ops = &cpufreq_cooling_ops;
> +#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
>         if (capacitance) {
>                 ret = update_freq_table(cpufreq_cdev, capacitance);
>                 if (ret) {
>                         cdev = ERR_PTR(ret);
>                         goto remove_ida;
>                 }
> -
> -               cooling_ops = &cpufreq_power_cooling_ops;
> -       } else {
> -               cooling_ops = &cpufreq_cooling_ops;
> +               cooling_ops->get_requested_power = cpufreq_get_requested_power;
> +               cooling_ops->state2power = cpufreq_state2power;
> +               cooling_ops->power2state = cpufreq_power2state;
>         }
> -
> +#endif
>         cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
>                                                   cooling_ops);
>         if (IS_ERR(cdev))
> 


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2019-05-15 12:17 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-15  8:23 [PATCH v4 0/3] Make IPA use PM_EM Quentin Perret
2019-05-15  8:23 ` Quentin Perret
2019-05-15  8:23 ` [PATCH v4 1/3] arm64: defconfig: Enable CONFIG_ENERGY_MODEL Quentin Perret
2019-05-15  8:23   ` Quentin Perret
2019-05-15  8:46   ` Daniel Lezcano
2019-05-15  8:46     ` Daniel Lezcano
2019-05-15  9:22     ` Quentin Perret
2019-05-15  9:22       ` Quentin Perret
2019-05-15  8:23 ` [PATCH v4 2/3] PM / EM: Expose perf domain struct Quentin Perret
2019-05-15  8:23   ` Quentin Perret
2019-05-15  9:06   ` Daniel Lezcano
2019-05-15  9:06     ` Daniel Lezcano
2019-05-15  9:17     ` Quentin Perret
2019-05-15  9:17       ` Quentin Perret
2019-05-15  9:56       ` Daniel Lezcano
2019-05-15  9:56         ` Daniel Lezcano
2019-05-15 10:07         ` Quentin Perret
2019-05-15 10:07           ` Quentin Perret
2019-05-15 10:16           ` Daniel Lezcano
2019-05-15 10:16             ` Daniel Lezcano
2019-05-15 10:22             ` Viresh Kumar
2019-05-15 10:22               ` Viresh Kumar
2019-05-15 10:40               ` Quentin Perret
2019-05-15 10:40                 ` Quentin Perret
2019-05-15 10:40                 ` Quentin Perret
2019-05-15 10:46                 ` Quentin Perret
2019-05-15 10:46                   ` Quentin Perret
2019-05-15 10:51                   ` Daniel Lezcano
2019-05-15 10:51                     ` Daniel Lezcano
2019-05-15 11:01                     ` Quentin Perret
2019-05-15 11:01                       ` Quentin Perret
2019-05-15 11:01                       ` Quentin Perret
2019-05-15 11:07                       ` Daniel Lezcano
2019-05-15 11:07                         ` Daniel Lezcano
2019-05-15 10:54                 ` Daniel Lezcano
2019-05-15 10:54                   ` Daniel Lezcano
2019-05-15 10:57                   ` Quentin Perret
2019-05-15 10:57                     ` Quentin Perret
2019-05-15  8:23 ` [PATCH v4 3/3] thermal: cpu_cooling: Migrate to using the EM framework Quentin Perret
2019-05-15  8:23   ` Quentin Perret

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.