* [RESEND PATCH v2 0/2] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler
@ 2020-02-20  9:56 Lukasz Luba
  2020-02-20  9:56 ` [RESEND PATCH v2 1/2] ARM: dts: exynos: Add dynamic-power-coefficient to Exynos5422 CPUs Lukasz Luba
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Lukasz Luba @ 2020-02-20  9:56 UTC (permalink / raw)
  To: linux-kernel, kgene, krzk, linux-arm-kernel, linux-samsung-soc,
	devicetree, linux-pm
  Cc: myungjoo.ham, kyungmin.park, cw00.choi, robh+dt, mark.rutland,
	b.zolnierkie, lukasz.luba, dietmar.eggemann

Hi all,

This is just a resend, now with proper v2 in the patches subject.

The Odroid-XU4/3 is a decent and easily accessible ARM big.LITTLE platform,
which might be used for research and development.

This small patch set makes it possible to run the Energy Aware Scheduler (EAS)
on Odroid-XU4/3 and experiment with it.

Patch 1/2 adds 'dynamic-power-coefficient' to the CPU DT nodes, which is
then used by the Energy Model (EM).
Patch 2/2 enables SCHED_MC (which adds another level in scheduling domains)
and enables the EM, making it possible to run EAS (when schedutil is set as
the CPUFreq governor).
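
As a rough illustration of how a power value can be derived from this
property: the DT binding expresses 'dynamic-power-coefficient' in uW/MHz/V^2,
so dynamic power is approximately coefficient * f * V^2. The sketch below
assumes exactly that formula; the function name and the 1000 MHz @ 1.0 V
operating point are hypothetical and are not taken from the Exynos5422 OPP
tables.

```python
# Sketch only: estimate dynamic power from 'dynamic-power-coefficient'.
# Assumes P_dyn = coeff * f * V^2 with coeff in uW/MHz/V^2, f in MHz, V in volts.

def dynamic_power_mw(coeff_uw_per_mhz_v2, freq_mhz, volt_v):
    """Return the estimated dynamic power in milliwatts."""
    power_uw = coeff_uw_per_mhz_v2 * freq_mhz * volt_v ** 2
    return power_uw / 1000.0  # microwatts -> milliwatts


# Hypothetical operating point (not from the real OPP tables).
FREQ_MHZ = 1000
VOLT_V = 1.0

little_mw = dynamic_power_mw(90, FREQ_MHZ, VOLT_V)   # coefficient from patch 1/2
big_mw = dynamic_power_mw(310, FREQ_MHZ, VOLT_V)     # coefficient from patch 1/2
print(f"LITTLE: {little_mw:.0f} mW, big: {big_mw:.0f} mW")
```

At an identical frequency and voltage the estimate scales linearly with the
coefficient; on real hardware the two clusters run at different OPPs, so the
effective ratio depends on the operating points chosen.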

1. Test results

Two different types of tests have been executed. The first is an energy test
case showing the impact of this patch set on energy consumption. It uses a
synthetic set of tasks (rt-app based). The second is a performance test
case which uses hackbench (less time to complete is better).
In both tests schedutil has been used as the cpufreq governor. In all tests
PROVE_LOCKING has not been compiled into the kernels.

1.1 Energy test case

10 iterations of 24 periodic rt-app tasks (16ms period, 10% duty-cycle)
with energy measurement. The cpufreq governor - schedutil. Unit is Joules.
The energy is calculated based on hwmon0 and hwmon3 power1_input.
The goal is to save energy, lower is better.

+-----------+-----------------+------------------------+
|           | Without patches | With patches           |
+-----------+--------+--------+----------------+-------+
| benchmark |  Mean  | RSD*   | Mean           | RSD*  |
+-----------+--------+--------+----------------+-------+
| 24 rt-app |  21.56 |  1.37% |  19.85 (-9.2%) | 0.92% |
|    tasks  |        |        |                |       |
+-----------+--------+--------+----------------+-------+
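
For readers who want to reproduce this kind of measurement, one possible way
to turn power readings into energy is sketched below. It assumes the samples
were collected by polling a hwmon power1_input file (which reports
microwatts) together with a timestamp; the sampling loop itself, the sample
values and the rail names are hypothetical, and this is not necessarily the
method used for the numbers above.

```python
# Sketch only: integrate sampled power into energy.
# Each sample is (timestamp in seconds, power in microwatts), e.g. collected
# by periodically reading a hwmon power1_input file.

def energy_joules(samples):
    """Trapezoidal integration of (seconds, microwatts) samples."""
    total_uj = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total_uj += (p0 + p1) / 2.0 * (t1 - t0)
    return total_uj / 1e6  # microjoules -> joules


# Hypothetical readings: constant 2 W on one rail, 0.5 W on the other, for 10 s.
rail_a = [(0.0, 2_000_000), (5.0, 2_000_000), (10.0, 2_000_000)]
rail_b = [(0.0, 500_000), (10.0, 500_000)]

total = energy_joules(rail_a) + energy_joules(rail_b)
print(f"{total:.2f} J")
```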

1.2 Performance test case

10 consecutive iterations of hackbench (hackbench -l 500 -s 4096),
no delay between two successive executions.
The cpufreq governor - schedutil. Units in seconds.
The goal is to see no regression; lower completion time is better.

+-----------+-----------------+------------------------+
|           | Without patches | With patches           |
+-----------+--------+--------+----------------+-------+
| benchmark | Mean   | RSD*   | Mean           | RSD*  |
+-----------+--------+--------+----------------+-------+
| hackbench |  8.15  | 2.86%  |  7.95 (-2.5%)  | 0.60% |
+-----------+--------+--------+----------------+-------+

*RSD: Relative Standard Deviation (std dev / mean)
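
The statistics in the tables can be computed along these lines. This is a
minimal sketch: whether the sample or the population standard deviation was
used for the RSD is not stated, so the choice of statistics.stdev (sample)
is an assumption, and both function names are illustrative.

```python
# Sketch only: RSD and relative change as used in the result tables.
import statistics


def rsd_percent(samples):
    """Relative standard deviation (std dev / mean) in percent."""
    return statistics.stdev(samples) / statistics.mean(samples) * 100.0


def relative_change_percent(baseline, patched):
    """Change of the patched mean relative to the baseline mean, in percent."""
    return (patched - baseline) / baseline * 100.0


# hackbench means taken from the table in 1.2
print(f"{relative_change_percent(8.15, 7.95):.1f}%")
```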

Changes:
v2:
- changed dynamic power coefficient to 90 for A7, which prevents odd
  behaviour for some low utilisation and at low OPPs;
  now, the power ratio is ~3x between big and LITTLE core;
  it's better aligned with [1]; probably due to measurement noise
  at lower OPPs the values obtained from hwmon0|3 were different
  from reality; some synthetic workloads showed these differences
- cleaned commit messages (no measurements in commit message)
- merged configs into one patch and re-ordered patches
- provided energy measurements in the cover letter
- measurements focused on comparing similar setup - with schedutil governor,
  to compare apples with apples

The v1 can be found in [2].
The patch set is on top of Krzysztof's tree, branch 'next/dt' [3] and has 
been tested on Odroid-XU3 rev0.2 20140529.

Regards,
Lukasz Luba

[1] https://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf
[2] https://lore.kernel.org/linux-arm-kernel/20200127215453.15144-1-lukasz.luba@arm.com/T/
[3] https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git/log/?h=next/dt


Lukasz Luba (2):
  ARM: dts: exynos: Add dynamic-power-coefficient to Exynos5422 CPUs
  ARM: exynos_defconfig: Enable SCHED_MC and ENERGY_MODEL

 arch/arm/boot/dts/exynos5422-cpus.dtsi | 8 ++++++++
 arch/arm/configs/exynos_defconfig      | 2 ++
 2 files changed, 10 insertions(+)

-- 
2.17.1



* [RESEND PATCH v2 1/2] ARM: dts: exynos: Add dynamic-power-coefficient to Exynos5422 CPUs
  2020-02-20  9:56 [RESEND PATCH v2 0/2] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler Lukasz Luba
@ 2020-02-20  9:56 ` Lukasz Luba
  2020-02-20  9:56 ` [RESEND PATCH v2 2/2] ARM: exynos_defconfig: Enable SCHED_MC and ENERGY_MODEL Lukasz Luba
  2020-02-20 18:00 ` [RESEND PATCH v2 0/2] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler Krzysztof Kozlowski
  2 siblings, 0 replies; 7+ messages in thread
From: Lukasz Luba @ 2020-02-20  9:56 UTC (permalink / raw)
  To: linux-kernel, kgene, krzk, linux-arm-kernel, linux-samsung-soc,
	devicetree, linux-pm
  Cc: myungjoo.ham, kyungmin.park, cw00.choi, robh+dt, mark.rutland,
	b.zolnierkie, lukasz.luba, dietmar.eggemann

To use the Energy Aware Scheduler (EAS), the Energy Model (EM) should be
registered for CPUs. Add dynamic-power-coefficient to the CPU nodes, which
lets the CPUFreq subsystem register the EM structures. This will increase
energy efficiency of big.LITTLE platforms.

The 'dynamic-power-coefficient' values have been obtained by experimenting
with different workloads. The power measurements taken from the big CPU
cluster and the LITTLE CPU cluster have been compared with official documents
and synthetic workload estimations. The effective power ratio between
Cortex-A7 and Cortex-A15 CPUs (~3x) is also aligned with documentation.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
---
 arch/arm/boot/dts/exynos5422-cpus.dtsi | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/exynos5422-cpus.dtsi b/arch/arm/boot/dts/exynos5422-cpus.dtsi
index 1b8605cf2407..4b641b9b8179 100644
--- a/arch/arm/boot/dts/exynos5422-cpus.dtsi
+++ b/arch/arm/boot/dts/exynos5422-cpus.dtsi
@@ -31,6 +31,7 @@
 			operating-points-v2 = <&cluster_a7_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <539>;
+			dynamic-power-coefficient = <90>;
 		};
 
 		cpu1: cpu@101 {
@@ -43,6 +44,7 @@
 			operating-points-v2 = <&cluster_a7_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <539>;
+			dynamic-power-coefficient = <90>;
 		};
 
 		cpu2: cpu@102 {
@@ -55,6 +57,7 @@
 			operating-points-v2 = <&cluster_a7_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <539>;
+			dynamic-power-coefficient = <90>;
 		};
 
 		cpu3: cpu@103 {
@@ -67,6 +70,7 @@
 			operating-points-v2 = <&cluster_a7_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <539>;
+			dynamic-power-coefficient = <90>;
 		};
 
 		cpu4: cpu@0 {
@@ -79,6 +83,7 @@
 			operating-points-v2 = <&cluster_a15_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <1024>;
+			dynamic-power-coefficient = <310>;
 		};
 
 		cpu5: cpu@1 {
@@ -91,6 +96,7 @@
 			operating-points-v2 = <&cluster_a15_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <1024>;
+			dynamic-power-coefficient = <310>;
 		};
 
 		cpu6: cpu@2 {
@@ -103,6 +109,7 @@
 			operating-points-v2 = <&cluster_a15_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <1024>;
+			dynamic-power-coefficient = <310>;
 		};
 
 		cpu7: cpu@3 {
@@ -115,6 +122,7 @@
 			operating-points-v2 = <&cluster_a15_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <1024>;
+			dynamic-power-coefficient = <310>;
 		};
 	};
 };
-- 
2.17.1



* [RESEND PATCH v2 2/2] ARM: exynos_defconfig: Enable SCHED_MC and ENERGY_MODEL
  2020-02-20  9:56 [RESEND PATCH v2 0/2] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler Lukasz Luba
  2020-02-20  9:56 ` [RESEND PATCH v2 1/2] ARM: dts: exynos: Add dynamic-power-coefficient to Exynos5422 CPUs Lukasz Luba
@ 2020-02-20  9:56 ` Lukasz Luba
  2020-02-20 18:00 ` [RESEND PATCH v2 0/2] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler Krzysztof Kozlowski
  2 siblings, 0 replies; 7+ messages in thread
From: Lukasz Luba @ 2020-02-20  9:56 UTC (permalink / raw)
  To: linux-kernel, kgene, krzk, linux-arm-kernel, linux-samsung-soc,
	devicetree, linux-pm
  Cc: myungjoo.ham, kyungmin.park, cw00.choi, robh+dt, mark.rutland,
	b.zolnierkie, lukasz.luba, dietmar.eggemann

The Energy Model (EM) is needed to run the Energy Aware Scheduler (EAS).
Enable ENERGY_MODEL to make that happen. This will increase energy
efficiency of the big.LITTLE platform through smarter task scheduling
decisions in non-heavy workloads.

Add SCHED_MC in order to create another level in scheduling domains: 'MC'.
This improves scheduler's decisions on platforms with CPU clusters, such
as big.LITTLE.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
---
 arch/arm/configs/exynos_defconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/configs/exynos_defconfig b/arch/arm/configs/exynos_defconfig
index c8e0c14092e8..90d376eb333b 100644
--- a/arch/arm/configs/exynos_defconfig
+++ b/arch/arm/configs/exynos_defconfig
@@ -8,6 +8,7 @@ CONFIG_PERF_EVENTS=y
 CONFIG_ARCH_EXYNOS=y
 CONFIG_CPU_ICACHE_MISMATCH_WORKAROUND=y
 CONFIG_SMP=y
+CONFIG_SCHED_MC=y
 CONFIG_BIG_LITTLE=y
 CONFIG_NR_CPUS=8
 CONFIG_HIGHMEM=y
@@ -17,6 +18,7 @@ CONFIG_ZBOOT_ROM_BSS=0x0
 CONFIG_ARM_APPENDED_DTB=y
 CONFIG_ARM_ATAG_DTB_COMPAT=y
 CONFIG_CMDLINE="root=/dev/ram0 rw ramdisk=8192 initrd=0x41000000,8M console=ttySAC1,115200 init=/linuxrc mem=256M"
+CONFIG_ENERGY_MODEL=y
 CONFIG_CPU_FREQ=y
 CONFIG_CPU_FREQ_STAT=y
 CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
-- 
2.17.1



* Re: [RESEND PATCH v2 0/2] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler
  2020-02-20  9:56 [RESEND PATCH v2 0/2] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler Lukasz Luba
  2020-02-20  9:56 ` [RESEND PATCH v2 1/2] ARM: dts: exynos: Add dynamic-power-coefficient to Exynos5422 CPUs Lukasz Luba
  2020-02-20  9:56 ` [RESEND PATCH v2 2/2] ARM: exynos_defconfig: Enable SCHED_MC and ENERGY_MODEL Lukasz Luba
@ 2020-02-20 18:00 ` Krzysztof Kozlowski
  2020-02-21 10:32   ` Lukasz Luba
  2 siblings, 1 reply; 7+ messages in thread
From: Krzysztof Kozlowski @ 2020-02-20 18:00 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: linux-kernel, kgene, linux-arm-kernel, linux-samsung-soc,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, cw00.choi,
	robh+dt, mark.rutland, b.zolnierkie, dietmar.eggemann

On Thu, Feb 20, 2020 at 09:56:34AM +0000, Lukasz Luba wrote:
> 1. Test results
> 
> Two different types of tests have been executed. The first is an energy test
> case showing the impact of this patch set on energy consumption. It uses a
> synthetic set of tasks (rt-app based). The second is a performance test
> case which uses hackbench (less time to complete is better).
> In both tests schedutil has been used as the cpufreq governor. In all tests
> PROVE_LOCKING has not been compiled into the kernels.
> 
> 1.1 Energy test case
> 
> 10 iterations of 24 periodic rt-app tasks (16ms period, 10% duty-cycle)
> with energy measurement. The cpufreq governor - schedutil. Unit is Joules.
> The energy is calculated based on hwmon0 and hwmon3 power1_input.
> The goal is to save energy, lower is better.
> 
> +-----------+-----------------+------------------------+
> |           | Without patches | With patches           |
> +-----------+--------+--------+----------------+-------+
> | benchmark |  Mean  | RSD*   | Mean           | RSD*  |
> +-----------+--------+--------+----------------+-------+
> | 24 rt-app |  21.56 |  1.37% |  19.85 (-9.2%) | 0.92% |
> |    tasks  |        |        |                |       |
> +-----------+--------+--------+----------------+-------+
> 
> 1.2 Performance test case
> 
> 10 consecutive iterations of hackbench (hackbench -l 500 -s 4096),
> no delay between two successive executions.
> The cpufreq governor - schedutil. Units in seconds.
> The goal is to see no regression; lower completion time is better.
> 
> +-----------+-----------------+------------------------+
> |           | Without patches | With patches           |
> +-----------+--------+--------+----------------+-------+
> | benchmark | Mean   | RSD*   | Mean           | RSD*  |
> +-----------+--------+--------+----------------+-------+
> | hackbench |  8.15  | 2.86%  |  7.95 (-2.5%)  | 0.60% |
> +-----------+--------+--------+----------------+-------+
> 
> *RSD: Relative Standard Deviation (std dev / mean)

Nice measurements!

Applied both, thank you.

Best regards,
Krzysztof



* Re: [RESEND PATCH v2 0/2] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler
  2020-02-20 18:00 ` [RESEND PATCH v2 0/2] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler Krzysztof Kozlowski
@ 2020-02-21 10:32   ` Lukasz Luba
  2020-02-28 10:59     ` Marek Szyprowski
  0 siblings, 1 reply; 7+ messages in thread
From: Lukasz Luba @ 2020-02-21 10:32 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: linux-kernel, kgene, linux-arm-kernel, linux-samsung-soc,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, cw00.choi,
	robh+dt, mark.rutland, b.zolnierkie, dietmar.eggemann

Hi Krzysztof,

On 2/20/20 6:00 PM, Krzysztof Kozlowski wrote:
> 
> Nice measurements!

Glad to hear that.

> 
> Applied both, thank you.
> 
> Best regards,
> Krzysztof
> 

Thank you for applying this.

Regards,
Lukasz


* Re: [RESEND PATCH v2 0/2] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler
  2020-02-21 10:32   ` Lukasz Luba
@ 2020-02-28 10:59     ` Marek Szyprowski
  2020-02-28 12:00       ` Lukasz Luba
  0 siblings, 1 reply; 7+ messages in thread
From: Marek Szyprowski @ 2020-02-28 10:59 UTC (permalink / raw)
  To: Lukasz Luba, Krzysztof Kozlowski
  Cc: linux-kernel, kgene, linux-arm-kernel, linux-samsung-soc,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, cw00.choi,
	robh+dt, mark.rutland, b.zolnierkie, dietmar.eggemann

Hi Lukasz

On 21.02.2020 11:32, Lukasz Luba wrote:
> On 2/20/20 6:00 PM, Krzysztof Kozlowski wrote:
>>
>> Nice measurements!
>
> Glad to hear that.
>
>>
>> Applied both, thank you.
>>
>
> Thank you for applying this.


After applying the patches I see the following warnings during boot (XU4):

energy_model: pd0: hertz/watts ratio non-monotonically decreasing: em_cap_state 1 >= em_cap_state0
energy_model: pd0: hertz/watts ratio non-monotonically decreasing: em_cap_state 3 >= em_cap_state2
energy_model: pd0: hertz/watts ratio non-monotonically decreasing: em_cap_state 4 >= em_cap_state3
energy_model: pd0: hertz/watts ratio non-monotonically decreasing: em_cap_state 5 >= em_cap_state4
energy_model: pd0: hertz/watts ratio non-monotonically decreasing: em_cap_state 8 >= em_cap_state7
energy_model: pd0: hertz/watts ratio non-monotonically decreasing: em_cap_state 10 >= em_cap_state9
energy_model: pd0: hertz/watts ratio non-monotonically decreasing: em_cap_state 11 >= em_cap_state10
energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 1 >= em_cap_state0
energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 2 >= em_cap_state1
energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 3 >= em_cap_state2
energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 4 >= em_cap_state3
energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 5 >= em_cap_state4
energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 6 >= em_cap_state5
energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 8 >= em_cap_state7
energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 9 >= em_cap_state8
energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 10 >= em_cap_state9
energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 13 >= em_cap_state12
energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 15 >= em_cap_state14
energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 16 >= em_cap_state15

Is it okay?

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



* Re: [RESEND PATCH v2 0/2] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler
  2020-02-28 10:59     ` Marek Szyprowski
@ 2020-02-28 12:00       ` Lukasz Luba
  0 siblings, 0 replies; 7+ messages in thread
From: Lukasz Luba @ 2020-02-28 12:00 UTC (permalink / raw)
  To: Marek Szyprowski, Krzysztof Kozlowski
  Cc: linux-kernel, kgene, linux-arm-kernel, linux-samsung-soc,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, cw00.choi,
	robh+dt, mark.rutland, b.zolnierkie, dietmar.eggemann

Hi Marek,

On 2/28/20 10:59 AM, Marek Szyprowski wrote:
> Hi Lukasz
> 
> On 21.02.2020 11:32, Lukasz Luba wrote:
>> On 2/20/20 6:00 PM, Krzysztof Kozlowski wrote:
>>>
>>> Nice measurements!
>>
>> Glad to hear that.
>>
>>>
>>> Applied both, thank you.
>>>
>>
>> Thank you for applying this.
> 
> 
> After applying the patches I see the following warnings during boot (XU4):
> 
> energy_model: pd0: hertz/watts ratio non-monotonically decreasing: em_cap_state 1 >= em_cap_state0
> energy_model: pd0: hertz/watts ratio non-monotonically decreasing: em_cap_state 3 >= em_cap_state2
> energy_model: pd0: hertz/watts ratio non-monotonically decreasing: em_cap_state 4 >= em_cap_state3
> energy_model: pd0: hertz/watts ratio non-monotonically decreasing: em_cap_state 5 >= em_cap_state4
> energy_model: pd0: hertz/watts ratio non-monotonically decreasing: em_cap_state 8 >= em_cap_state7
> energy_model: pd0: hertz/watts ratio non-monotonically decreasing: em_cap_state 10 >= em_cap_state9
> energy_model: pd0: hertz/watts ratio non-monotonically decreasing: em_cap_state 11 >= em_cap_state10
> energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 1 >= em_cap_state0
> energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 2 >= em_cap_state1
> energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 3 >= em_cap_state2
> energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 4 >= em_cap_state3
> energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 5 >= em_cap_state4
> energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 6 >= em_cap_state5
> energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 8 >= em_cap_state7
> energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 9 >= em_cap_state8
> energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 10 >= em_cap_state9
> energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 13 >= em_cap_state12
> energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 15 >= em_cap_state14
> energy_model: pd4: hertz/watts ratio non-monotonically decreasing: em_cap_state 16 >= em_cap_state15
> 
> Is it okay?

It shouldn't harm EAS, but these OPPs might be used by thermal, especially
those from the top, as with step_wise in your case (IIRC the DT
settings).
Removing some of them from the bottom would be good, though.
It would lower the Energy Model complexity, which is:
nr_perf_domain * (nr_cpus + nr_OPPs) [1] (on Odroid XU4 it is ~80 IIRC)

A smaller number of OPPs is better.
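
To make the arithmetic concrete, here is a small sketch of that formula. The
OPP counts (13 for the LITTLE cluster, 19 for the big cluster) are an
assumption used only for illustration; they are not stated in this thread,
and the function name is made up.

```python
# Sketch only: Energy Model complexity as nr_perf_domain * (nr_cpus + nr_OPPs).

def em_complexity(nr_perf_domains, nr_cpus, nr_opps_total):
    """Complexity estimate following the formula quoted above."""
    return nr_perf_domains * (nr_cpus + nr_opps_total)


# Assumed values for illustration: 2 performance domains (LITTLE and big),
# 8 CPUs, and 13 + 19 OPPs across both clusters.
print(em_complexity(2, 8, 13 + 19))

# Dropping OPPs lowers the result, e.g. 5 fewer OPPs in total:
print(em_complexity(2, 8, 13 + 19 - 5))
```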

Douglas is working on a patch set which could skip inefficient OPPs
(the OPPs which have the same voltage but different frequency).
However, we don't yet have numbers on how much energy it could save when
we use the fastest frequency for the set of OPPs with the same voltage,
compared to the slowest (theoretically entering idle earlier).
The discussion is ongoing here [2].
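
The idea can be sketched as follows. This is only an illustration of
"keep the fastest frequency per voltage", not the implementation from that
patch set; the function name and the OPP values are hypothetical.

```python
# Sketch only: among OPPs sharing a voltage, keep the highest frequency.

def keep_fastest_per_voltage(opps):
    """opps: iterable of (freq_mhz, volt_uv). Returns a list sorted by frequency."""
    fastest = {}
    for freq, volt in opps:
        if volt not in fastest or freq > fastest[volt]:
            fastest[volt] = freq
    return sorted((freq, volt) for volt, freq in fastest.items())


# Hypothetical table: the first two OPPs share the same voltage.
table = [(200, 900000), (300, 900000), (400, 950000)]
print(keep_fastest_per_voltage(table))
```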

Regarding the print message: it's not a bug in the platform, so in
my opinion we shouldn't use 'pr_warn' in this case.
It's going to be changed to just a debug-level print. I have this
change in the new Energy Model. It is the last point in the v3 changelog [3],
and the change which does this is in patch 1/4:
--------------------------------------------->8------------------
-			pr_warn("pd%d: hertz/watts ratio non-monotonically decreasing: em_cap_state %d >= em_cap_state%d\n",
-					cpu, i, i - 1);
+			dev_dbg(dev, "EM: hertz/watts ratio non-monotonically decreasing: em_perf_state %d >= em_perf_state%d\n",
+					i, i - 1);

--------------------------------------8<------------------------
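
For context, the condition behind this message can be sketched like this:
walk the states in ascending frequency order and flag every state whose
frequency/power ratio is not lower than that of the previous state. The
sketch uses floating-point division and made-up numbers; the function name
is hypothetical and the kernel code differs in detail.

```python
# Sketch only: flag states whose hertz/watts ratio does not decrease.

def find_flagged_states(states):
    """states: list of (freq_khz, power_mw) sorted by ascending frequency.

    Returns the indices i for which freq/power of state i is greater than
    or equal to freq/power of state i - 1.
    """
    flagged = []
    prev_eff = None
    for i, (freq, power) in enumerate(states):
        eff = freq / power
        if prev_eff is not None and eff >= prev_eff:
            flagged.append(i)
        prev_eff = eff
    return flagged


# Hypothetical table: state 1 is more efficient than state 0, so it is flagged.
example = [(200000, 10), (300000, 12), (400000, 30)]
print(find_flagged_states(example))
```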


Regards,
Lukasz

[1] https://elixir.bootlin.com/linux/latest/source/kernel/sched/topology.c#L397
[2] https://lkml.org/lkml/2020/1/22/1169
[3] https://lkml.org/lkml/2020/2/21/1910
