linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler
@ 2020-01-27 21:54 lukasz.luba
  2020-01-27 21:54 ` [PATCH 1/3] ARM: exynos_defconfig: Enable SCHED_MC lukasz.luba
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: lukasz.luba @ 2020-01-27 21:54 UTC (permalink / raw)
  To: kgene, krzk, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm
  Cc: myungjoo.ham, kyungmin.park, cw00.choi, robh+dt, mark.rutland,
	b.zolnierkie, lukasz.luba, dietmar.eggemann

From: Lukasz Luba <lukasz.luba@arm.com>

The Odroid-XU3/4 is a decent and easy accessible ARM big.LITTLE platform,
which might be used for research and development.

This small patch set provides possibility to run Energy Aware Scheduler (EAS)
on Odroid-XU4/3 and experiment with it. 

The patch 1 enables SCHED_MC, which adds another level in sched_domain.
The patch 2 provides 'dynamic-power-coefficient' in CPU DT nodes, which is
then is used by the Energy Model (EM).
The patch 3 enables EM and makes EAS possible to run. Please read the commit
message in the patch 3 describing how to enable or disable EAS at runtime.
Some of the test results are provided also in there.

The patch set is on top of Krzysztof's tree, branch 'next/dt' [1] and has 
been tested on Odroid-XU3.

Regards,
Lukasz Luba

[1] https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git/log/?h=next/dt

Lukasz Luba (3):
  ARM: exynos_defconfig: Enable SCHED_MC
  ARM: dts: exynos: Add Exynos5422 CPU dynamic-power-coefficient
    information
  ARM: exynos_defconfig: Enable Energy Model framework

 arch/arm/boot/dts/exynos5422-cpus.dtsi | 8 ++++++++
 arch/arm/configs/exynos_defconfig      | 2 ++
 2 files changed, 10 insertions(+)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/3] ARM: exynos_defconfig: Enable SCHED_MC
  2020-01-27 21:54 [PATCH 0/3] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler lukasz.luba
@ 2020-01-27 21:54 ` lukasz.luba
  2020-01-31 12:47   ` Krzysztof Kozlowski
  2020-01-27 21:54 ` [PATCH 2/3] ARM: dts: exynos: Add Exynos5422 CPU dynamic-power-coefficient information lukasz.luba
  2020-01-27 21:54 ` [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework lukasz.luba
  2 siblings, 1 reply; 20+ messages in thread
From: lukasz.luba @ 2020-01-27 21:54 UTC (permalink / raw)
  To: kgene, krzk, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm
  Cc: myungjoo.ham, kyungmin.park, cw00.choi, robh+dt, mark.rutland,
	b.zolnierkie, lukasz.luba, dietmar.eggemann

From: Lukasz Luba <lukasz.luba@arm.com>

Since the 'capacities-dmips-mhz' are present in the CPU nodes, make use of
this knowledge in smarter decisions during scheduling.

The values in 'capacities-dmips-mhz' are normilized, this means that i.e.
when CPU0's capacities-dmips-mhz=100 and CPU1's 'capacities-dmips-mhz'=50,
cpu0 is twice fast as CPU1, at the same frequency. The proper hirarchy
in sched_domain topology could exploit the SoC architecture advantages
like big.LITTLE. Enabling the SCHED_MC will create two levels in
sched_domain hierarchy, which might be observed in:

grep . /proc/sys/kernel/sched_domain/cpu*/domain*/{name,flags}
/proc/sys/kernel/sched_domain/cpu0/domain0/name:MC
/proc/sys/kernel/sched_domain/cpu0/domain1/name:DIE
...
/proc/sys/kernel/sched_domain/cpu0/domain0/flags:575
/proc/sys/kernel/sched_domain/cpu0/domain1/flags:4223

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
---
 arch/arm/configs/exynos_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/exynos_defconfig b/arch/arm/configs/exynos_defconfig
index e7e4bb5ad8d5..1db857056992 100644
--- a/arch/arm/configs/exynos_defconfig
+++ b/arch/arm/configs/exynos_defconfig
@@ -8,6 +8,7 @@ CONFIG_PERF_EVENTS=y
 CONFIG_ARCH_EXYNOS=y
 CONFIG_CPU_ICACHE_MISMATCH_WORKAROUND=y
 CONFIG_SMP=y
+CONFIG_SCHED_MC=y
 CONFIG_BIG_LITTLE=y
 CONFIG_NR_CPUS=8
 CONFIG_HIGHMEM=y
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/3] ARM: dts: exynos: Add Exynos5422 CPU dynamic-power-coefficient information
  2020-01-27 21:54 [PATCH 0/3] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler lukasz.luba
  2020-01-27 21:54 ` [PATCH 1/3] ARM: exynos_defconfig: Enable SCHED_MC lukasz.luba
@ 2020-01-27 21:54 ` lukasz.luba
  2020-01-31 13:05   ` Krzysztof Kozlowski
  2020-01-27 21:54 ` [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework lukasz.luba
  2 siblings, 1 reply; 20+ messages in thread
From: lukasz.luba @ 2020-01-27 21:54 UTC (permalink / raw)
  To: kgene, krzk, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm
  Cc: myungjoo.ham, kyungmin.park, cw00.choi, robh+dt, mark.rutland,
	b.zolnierkie, lukasz.luba, dietmar.eggemann

From: Lukasz Luba <lukasz.luba@arm.com>

Add dynamic power coefficient into CPU nodes which let CPUFreq subsystem
register the Energy Model (EM) for the CPUs.

The 'dynamic-power-coefficient' is used for calculating the dynamic power
according to the equation in documentation [1].  The Energy Model (EM)
framework relies on calculated power and cost for each OPP. The OPP power
values come from CPUFreq driver, which registered required callback
function. The simple implementation of a CPUFREQ driver, like cpufreq-dt,
uses 'dev_pm_opp_of_register_em()' which relay on
'dynamic-power-coefficient' to calculate the power of requested OPP for the
EM [2].

The calculated values might be checked in
/sys/kernel/debug/energy_model/pd*/

$ grep . /sys/kernel/debug/energy_model/pd1/cs*/*
/sys/kernel/debug/energy_model/pd1/cs:1000000/cost:558
/sys/kernel/debug/energy_model/pd1/cs:1000000/frequency:1000000
/sys/kernel/debug/energy_model/pd1/cs:1000000/power:310
/sys/kernel/debug/energy_model/pd1/cs:1100000/cost:558
/sys/kernel/debug/energy_model/pd1/cs:1100000/frequency:1100000
/sys/kernel/debug/energy_model/pd1/cs:1100000/power:341
/sys/kernel/debug/energy_model/pd1/cs:1200000/cost:558
/sys/kernel/debug/energy_model/pd1/cs:1200000/frequency:1200000
/sys/kernel/debug/energy_model/pd1/cs:1200000/power:372
/sys/kernel/debug/energy_model/pd1/cs:1300000/cost:674
/sys/kernel/debug/energy_model/pd1/cs:1300000/frequency:1300000
/sys/kernel/debug/energy_model/pd1/cs:1300000/power:487
/sys/kernel/debug/energy_model/pd1/cs:1400000/cost:675 ...

$ grep . /sys/kernel/debug/energy_model/pd0/cs*/*
/sys/kernel/debug/energy_model/pd0/cs:1000000/cost:200
/sys/kernel/debug/energy_model/pd0/cs:1000000/frequency:1000000
/sys/kernel/debug/energy_model/pd0/cs:1000000/power:154
/sys/kernel/debug/energy_model/pd0/cs:1100000/cost:260
/sys/kernel/debug/energy_model/pd0/cs:1100000/frequency:1100000
/sys/kernel/debug/energy_model/pd0/cs:1100000/power:220
/sys/kernel/debug/energy_model/pd0/cs:1200000/cost:260
/sys/kernel/debug/energy_model/pd0/cs:1200000/frequency:1200000
/sys/kernel/debug/energy_model/pd0/cs:1200000/power:240
/sys/kernel/debug/energy_model/pd0/cs:1300000/cost:260
/sys/kernel/debug/energy_model/pd0/cs:1300000/frequency:1300000
/sys/kernel/debug/energy_model/pd0/cs:1300000/power:260
/sys/kernel/debug/energy_model/pd0/cs:200000/cost:130 ...

To provide a proper value of the 'dynamic-power-coefficient' the real power
can be measured using a dedicated hardware, i.e. INA2xx. The Odroid-XU3
hwmon sensors have been used to capture the power value during a sysbench
test running on single core and at each possible OPP. The measured values
were divided by 2, since the dynamic power is typically half of the
consumed power (the second half is static power). Next, the approximation
was made and the power model derived, showing the 'C' value of routhly X.
Check the example equations in drivers/opp/of.c [2].
Thus, i.e. the power = 1.0Watt at 1GHz => 0.5W dynamic power =>
dynamic-power-coefficient = 400

Using this simple technique we can provide and needed coefficient.  The
approximation does not have to be super precised. The proportion is
important and the difference between power consumed by different CPUs
running at the same frequency, which is then used in Energy Aware Scheduler
algorithms. An example power values on Odroid-XU3:

(LITTLE CPU)
/sys/kernel/debug/energy_model/pd0/cs:1000000/frequency:1000000
/sys/kernel/debug/energy_model/pd0/cs:1000000/power:154
(big CPU)
/sys/kernel/debug/energy_model/pd1/cs:1000000/frequency:1000000
/sys/kernel/debug/energy_model/pd1/cs:1000000/power:310

In Odroid-XU3 case the derived coefficient value for 'big' CPU has:
dynamic-power-coefficient = <310>;
while the 'LITTLE':
dynamic-power-coefficient = <128>;

[1] Documentation/devicetree/bindings/arm/cpus.yaml
[2] https://elixir.bootlin.com/linux/v5.4/source/drivers/opp/of.c#L1044

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
---
 arch/arm/boot/dts/exynos5422-cpus.dtsi | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/exynos5422-cpus.dtsi b/arch/arm/boot/dts/exynos5422-cpus.dtsi
index 1b8605cf2407..c9a0dc99d2fb 100644
--- a/arch/arm/boot/dts/exynos5422-cpus.dtsi
+++ b/arch/arm/boot/dts/exynos5422-cpus.dtsi
@@ -31,6 +31,7 @@
 			operating-points-v2 = <&cluster_a7_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <539>;
+			dynamic-power-coefficient = <128>;
 		};
 
 		cpu1: cpu@101 {
@@ -43,6 +44,7 @@
 			operating-points-v2 = <&cluster_a7_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <539>;
+			dynamic-power-coefficient = <128>;
 		};
 
 		cpu2: cpu@102 {
@@ -55,6 +57,7 @@
 			operating-points-v2 = <&cluster_a7_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <539>;
+			dynamic-power-coefficient = <128>;
 		};
 
 		cpu3: cpu@103 {
@@ -67,6 +70,7 @@
 			operating-points-v2 = <&cluster_a7_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <539>;
+			dynamic-power-coefficient = <128>;
 		};
 
 		cpu4: cpu@0 {
@@ -79,6 +83,7 @@
 			operating-points-v2 = <&cluster_a15_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <1024>;
+			dynamic-power-coefficient = <310>;
 		};
 
 		cpu5: cpu@1 {
@@ -91,6 +96,7 @@
 			operating-points-v2 = <&cluster_a15_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <1024>;
+			dynamic-power-coefficient = <310>;
 		};
 
 		cpu6: cpu@2 {
@@ -103,6 +109,7 @@
 			operating-points-v2 = <&cluster_a15_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <1024>;
+			dynamic-power-coefficient = <310>;
 		};
 
 		cpu7: cpu@3 {
@@ -115,6 +122,7 @@
 			operating-points-v2 = <&cluster_a15_opp_table>;
 			#cooling-cells = <2>; /* min followed by max */
 			capacity-dmips-mhz = <1024>;
+			dynamic-power-coefficient = <310>;
 		};
 	};
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework
  2020-01-27 21:54 [PATCH 0/3] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler lukasz.luba
  2020-01-27 21:54 ` [PATCH 1/3] ARM: exynos_defconfig: Enable SCHED_MC lukasz.luba
  2020-01-27 21:54 ` [PATCH 2/3] ARM: dts: exynos: Add Exynos5422 CPU dynamic-power-coefficient information lukasz.luba
@ 2020-01-27 21:54 ` lukasz.luba
  2020-01-31 13:16   ` Krzysztof Kozlowski
  2020-01-31 13:30   ` Bartlomiej Zolnierkiewicz
  2 siblings, 2 replies; 20+ messages in thread
From: lukasz.luba @ 2020-01-27 21:54 UTC (permalink / raw)
  To: kgene, krzk, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm
  Cc: myungjoo.ham, kyungmin.park, cw00.choi, robh+dt, mark.rutland,
	b.zolnierkie, lukasz.luba, dietmar.eggemann

From: Lukasz Luba <lukasz.luba@arm.com>

Enable the Energy Model (EM) brings possibility to use Energy Aware
Scheduler (EAS). This compiles the EM but does not enable to run EAS in
default. The EAS only works with SchedUtil - a CPUFreq governor which
handles direct requests from the scheduler for the frequency change. Thus,
to make EAS working in default, the SchedUtil governor should be
configured as default CPUFreq governor. Although, the EAS might be enabled
in runtime, when the EM is present for CPUs, the SchedUtil is compiled and
then set as CPUFreq governor, i.e.:

echo schedutil > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo schedutil > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor

To check if EAS is ready to work, the read output from the command below
should show '1':
cat /proc/sys/kernel/sched_energy_aware

To disable EAS in runtime simply 'echo 0' to the file above.

Some test results, which stress the scheduler on Odroid-XU3:
hackbench -l 500 -s 4096
With mainline code and with this patch set.

The tests have been made with and without CONFIG_PROVE_LOCKING (PL)
(which is set to =y in default exynos_defconfig)

		|		this patch set			| mainline
		|-----------------------------------------------|---------------
		| performance	| SchedUtil	| SchedUtil	| performance
		| governor	| governor	| governor	| governor
		|		| w/o EAS	| w/ EAS	|
----------------|---------------|---------------|---------------|---------------
hackbench w/ PL | 12.7s		| 11.7s		| 12.0s		| 13.0s - 12.2s
hackbench w/o PL| 9.2s		| 8.1s		| 8.2s		| 9.2s - 8.4s

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
---
 arch/arm/configs/exynos_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/exynos_defconfig b/arch/arm/configs/exynos_defconfig
index 1db857056992..c0f8ecabc607 100644
--- a/arch/arm/configs/exynos_defconfig
+++ b/arch/arm/configs/exynos_defconfig
@@ -18,6 +18,7 @@ CONFIG_ZBOOT_ROM_BSS=0x0
 CONFIG_ARM_APPENDED_DTB=y
 CONFIG_ARM_ATAG_DTB_COMPAT=y
 CONFIG_CMDLINE="root=/dev/ram0 rw ramdisk=8192 initrd=0x41000000,8M console=ttySAC1,115200 init=/linuxrc mem=256M"
+CONFIG_ENERGY_MODEL=y
 CONFIG_CPU_FREQ=y
 CONFIG_CPU_FREQ_STAT=y
 CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] ARM: exynos_defconfig: Enable SCHED_MC
  2020-01-27 21:54 ` [PATCH 1/3] ARM: exynos_defconfig: Enable SCHED_MC lukasz.luba
@ 2020-01-31 12:47   ` Krzysztof Kozlowski
  2020-01-31 15:59     ` Lukasz Luba
  0 siblings, 1 reply; 20+ messages in thread
From: Krzysztof Kozlowski @ 2020-01-31 12:47 UTC (permalink / raw)
  To: lukasz.luba
  Cc: kgene, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, Chanwoo Choi,
	robh+dt, mark.rutland, Bartłomiej Żołnierkiewicz,
	dietmar.eggemann

On Mon, 27 Jan 2020 at 22:55, <lukasz.luba@arm.com> wrote:
>
> From: Lukasz Luba <lukasz.luba@arm.com>
>
> Since the 'capacities-dmips-mhz' are present in the CPU nodes, make use of
> this knowledge in smarter decisions during scheduling.
>
> The values in 'capacities-dmips-mhz' are normilized, this means that i.e.
> when CPU0's capacities-dmips-mhz=100 and CPU1's 'capacities-dmips-mhz'=50,
> cpu0 is twice fast as CPU1, at the same frequency. The proper hirarchy
> in sched_domain topology could exploit the SoC architecture advantages
> like big.LITTLE.

I do not quite get how this is related to rationale behind changing defconfig...

> Enabling the SCHED_MC will create two levels in
> sched_domain hierarchy, which might be observed in:

This is looks more convincing... but still what is the need? To work with EAS?

> grep . /proc/sys/kernel/sched_domain/cpu*/domain*/{name,flags}
> /proc/sys/kernel/sched_domain/cpu0/domain0/name:MC
> /proc/sys/kernel/sched_domain/cpu0/domain1/name:DIE
> ...
> /proc/sys/kernel/sched_domain/cpu0/domain0/flags:575
> /proc/sys/kernel/sched_domain/cpu0/domain1/flags:4223

Not related to defconfig change and not visible after this commit.

Best regards,
Krzysztof

>
> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
> ---
>  arch/arm/configs/exynos_defconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm/configs/exynos_defconfig b/arch/arm/configs/exynos_defconfig
> index e7e4bb5ad8d5..1db857056992 100644
> --- a/arch/arm/configs/exynos_defconfig
> +++ b/arch/arm/configs/exynos_defconfig
> @@ -8,6 +8,7 @@ CONFIG_PERF_EVENTS=y
>  CONFIG_ARCH_EXYNOS=y
>  CONFIG_CPU_ICACHE_MISMATCH_WORKAROUND=y
>  CONFIG_SMP=y
> +CONFIG_SCHED_MC=y
>  CONFIG_BIG_LITTLE=y
>  CONFIG_NR_CPUS=8
>  CONFIG_HIGHMEM=y
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] ARM: dts: exynos: Add Exynos5422 CPU dynamic-power-coefficient information
  2020-01-27 21:54 ` [PATCH 2/3] ARM: dts: exynos: Add Exynos5422 CPU dynamic-power-coefficient information lukasz.luba
@ 2020-01-31 13:05   ` Krzysztof Kozlowski
  2020-01-31 16:42     ` Lukasz Luba
  0 siblings, 1 reply; 20+ messages in thread
From: Krzysztof Kozlowski @ 2020-01-31 13:05 UTC (permalink / raw)
  To: lukasz.luba
  Cc: kgene, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, Chanwoo Choi,
	robh+dt, mark.rutland, Bartłomiej Żołnierkiewicz,
	dietmar.eggemann

On Mon, 27 Jan 2020 at 22:55, <lukasz.luba@arm.com> wrote:
>
> From: Lukasz Luba <lukasz.luba@arm.com>
>
> Add dynamic power coefficient into CPU nodes which let CPUFreq subsystem
> register the Energy Model (EM) for the CPUs.
>
> The 'dynamic-power-coefficient' is used for calculating the dynamic power
> according to the equation in documentation [1].  The Energy Model (EM)
> framework relies on calculated power and cost for each OPP. The OPP power
> values come from CPUFreq driver, which registered required callback
> function. The simple implementation of a CPUFREQ driver, like cpufreq-dt,
> uses 'dev_pm_opp_of_register_em()' which relay on
> 'dynamic-power-coefficient' to calculate the power of requested OPP for the
> EM [2].
>
> The calculated values might be checked in
> /sys/kernel/debug/energy_model/pd*/
>
> $ grep . /sys/kernel/debug/energy_model/pd1/cs*/*
> /sys/kernel/debug/energy_model/pd1/cs:1000000/cost:558
> /sys/kernel/debug/energy_model/pd1/cs:1000000/frequency:1000000
> /sys/kernel/debug/energy_model/pd1/cs:1000000/power:310
> /sys/kernel/debug/energy_model/pd1/cs:1100000/cost:558
> /sys/kernel/debug/energy_model/pd1/cs:1100000/frequency:1100000
> /sys/kernel/debug/energy_model/pd1/cs:1100000/power:341
> /sys/kernel/debug/energy_model/pd1/cs:1200000/cost:558
> /sys/kernel/debug/energy_model/pd1/cs:1200000/frequency:1200000
> /sys/kernel/debug/energy_model/pd1/cs:1200000/power:372
> /sys/kernel/debug/energy_model/pd1/cs:1300000/cost:674
> /sys/kernel/debug/energy_model/pd1/cs:1300000/frequency:1300000
> /sys/kernel/debug/energy_model/pd1/cs:1300000/power:487
> /sys/kernel/debug/energy_model/pd1/cs:1400000/cost:675 ...
>
> $ grep . /sys/kernel/debug/energy_model/pd0/cs*/*
> /sys/kernel/debug/energy_model/pd0/cs:1000000/cost:200
> /sys/kernel/debug/energy_model/pd0/cs:1000000/frequency:1000000
> /sys/kernel/debug/energy_model/pd0/cs:1000000/power:154
> /sys/kernel/debug/energy_model/pd0/cs:1100000/cost:260
> /sys/kernel/debug/energy_model/pd0/cs:1100000/frequency:1100000
> /sys/kernel/debug/energy_model/pd0/cs:1100000/power:220
> /sys/kernel/debug/energy_model/pd0/cs:1200000/cost:260
> /sys/kernel/debug/energy_model/pd0/cs:1200000/frequency:1200000
> /sys/kernel/debug/energy_model/pd0/cs:1200000/power:240
> /sys/kernel/debug/energy_model/pd0/cs:1300000/cost:260
> /sys/kernel/debug/energy_model/pd0/cs:1300000/frequency:1300000
> /sys/kernel/debug/energy_model/pd0/cs:1300000/power:260
> /sys/kernel/debug/energy_model/pd0/cs:200000/cost:130 ...

Please, do not describe entire Energy Model in commit message touching
DTS. It brings too much information which look unrelated and therefore
it makes difficult to spot real rationale behind the change. Just
mention:
1. Why you are doing it?
2. What are you doing?
3. How did you figure out magic constants here (details of "what")?

> To provide a proper value of the 'dynamic-power-coefficient' the real power
> can be measured using a dedicated hardware, i.e. INA2xx. The Odroid-XU3
> hwmon sensors have been used to capture the power value during a sysbench
> test running on single core and at each possible OPP.

Since you mention the values, post them. That's the only thing which
reader cannot get on his own. All other values posted in commit
message will be seen after running tests...

> The measured values
> were divided by 2, since the dynamic power is typically half of the
> consumed power (the second half is static power). Next, the approximation
> was made and the power model derived, showing the 'C' value of routhly X.

s/routhly/roughly/

What is X?

> Check the example equations in drivers/opp/of.c [2].
> Thus, i.e. the power = 1.0Watt at 1GHz => 0.5W dynamic power =>
> dynamic-power-coefficient = 400
>
> Using this simple technique we can provide and needed coefficient.  The

s/and/the/ ?

> approximation does not have to be super precised. The proportion is
> important and the difference between power consumed by different CPUs
> running at the same frequency, which is then used in Energy Aware Scheduler
> algorithms. An example power values on Odroid-XU3:
>
> (LITTLE CPU)
> /sys/kernel/debug/energy_model/pd0/cs:1000000/frequency:1000000
> /sys/kernel/debug/energy_model/pd0/cs:1000000/power:154

For A7, 1V and 1 GHz this gives 142, not 154. Is it correct? What ASV
are you using?

> (big CPU)
> /sys/kernel/debug/energy_model/pd1/cs:1000000/frequency:1000000
> /sys/kernel/debug/energy_model/pd1/cs:1000000/power:310
>
> In Odroid-XU3 case the derived coefficient value for 'big' CPU has:
> dynamic-power-coefficient = <310>;
> while the 'LITTLE':
> dynamic-power-coefficient = <128>;

Make it all compact. First, you mention power values which are the
same as in the beginning of this commit message. Why repeating? Then
you mention the power coefficient in 4 lines instead of simple:
For Odroid XU3, the derived power coefficient is then 128 for an A7
CPU and 310 for an A15 CPU. Or something similar.

>
> [1] Documentation/devicetree/bindings/arm/cpus.yaml
> [2] https://elixir.bootlin.com/linux/v5.4/source/drivers/opp/of.c#L1044

Refer to path inside, no external sources unless needed.

Best regards,
Krzysztof

>
> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
> ---
>  arch/arm/boot/dts/exynos5422-cpus.dtsi | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/arch/arm/boot/dts/exynos5422-cpus.dtsi b/arch/arm/boot/dts/exynos5422-cpus.dtsi
> index 1b8605cf2407..c9a0dc99d2fb 100644
> --- a/arch/arm/boot/dts/exynos5422-cpus.dtsi
> +++ b/arch/arm/boot/dts/exynos5422-cpus.dtsi
> @@ -31,6 +31,7 @@
>                         operating-points-v2 = <&cluster_a7_opp_table>;
>                         #cooling-cells = <2>; /* min followed by max */
>                         capacity-dmips-mhz = <539>;
> +                       dynamic-power-coefficient = <128>;
>                 };
>
>                 cpu1: cpu@101 {
> @@ -43,6 +44,7 @@
>                         operating-points-v2 = <&cluster_a7_opp_table>;
>                         #cooling-cells = <2>; /* min followed by max */
>                         capacity-dmips-mhz = <539>;
> +                       dynamic-power-coefficient = <128>;
>                 };
>
>                 cpu2: cpu@102 {
> @@ -55,6 +57,7 @@
>                         operating-points-v2 = <&cluster_a7_opp_table>;
>                         #cooling-cells = <2>; /* min followed by max */
>                         capacity-dmips-mhz = <539>;
> +                       dynamic-power-coefficient = <128>;
>                 };
>
>                 cpu3: cpu@103 {
> @@ -67,6 +70,7 @@
>                         operating-points-v2 = <&cluster_a7_opp_table>;
>                         #cooling-cells = <2>; /* min followed by max */
>                         capacity-dmips-mhz = <539>;
> +                       dynamic-power-coefficient = <128>;
>                 };
>
>                 cpu4: cpu@0 {
> @@ -79,6 +83,7 @@
>                         operating-points-v2 = <&cluster_a15_opp_table>;
>                         #cooling-cells = <2>; /* min followed by max */
>                         capacity-dmips-mhz = <1024>;
> +                       dynamic-power-coefficient = <310>;
>                 };
>
>                 cpu5: cpu@1 {
> @@ -91,6 +96,7 @@
>                         operating-points-v2 = <&cluster_a15_opp_table>;
>                         #cooling-cells = <2>; /* min followed by max */
>                         capacity-dmips-mhz = <1024>;
> +                       dynamic-power-coefficient = <310>;
>                 };
>
>                 cpu6: cpu@2 {
> @@ -103,6 +109,7 @@
>                         operating-points-v2 = <&cluster_a15_opp_table>;
>                         #cooling-cells = <2>; /* min followed by max */
>                         capacity-dmips-mhz = <1024>;
> +                       dynamic-power-coefficient = <310>;
>                 };
>
>                 cpu7: cpu@3 {
> @@ -115,6 +122,7 @@
>                         operating-points-v2 = <&cluster_a15_opp_table>;
>                         #cooling-cells = <2>; /* min followed by max */
>                         capacity-dmips-mhz = <1024>;
> +                       dynamic-power-coefficient = <310>;
>                 };
>         };
>  };
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework
  2020-01-27 21:54 ` [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework lukasz.luba
@ 2020-01-31 13:16   ` Krzysztof Kozlowski
  2020-01-31 17:30     ` Lukasz Luba
  2020-01-31 13:30   ` Bartlomiej Zolnierkiewicz
  1 sibling, 1 reply; 20+ messages in thread
From: Krzysztof Kozlowski @ 2020-01-31 13:16 UTC (permalink / raw)
  To: lukasz.luba
  Cc: kgene, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, Chanwoo Choi,
	robh+dt, mark.rutland, Bartłomiej Żołnierkiewicz,
	dietmar.eggemann

On Mon, 27 Jan 2020 at 22:55, <lukasz.luba@arm.com> wrote:
>
> From: Lukasz Luba <lukasz.luba@arm.com>
>
> Enable the Energy Model (EM) brings possibility to use Energy Aware
> Scheduler (EAS). This compiles the EM but does not enable to run EAS in
> default. The EAS only works with SchedUtil - a CPUFreq governor which
> handles direct requests from the scheduler for the frequency change. Thus,
> to make EAS working in default, the SchedUtil governor should be
> configured as default CPUFreq governor.

Full stop. That's enough of needed explanation of schedutil.

> Although, the EAS might be enabled
> in runtime, when the EM is present for CPUs, the SchedUtil is compiled and
> then set as CPUFreq governor, i.e.:
>
> echo schedutil > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> echo schedutil > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
>
> To check if EAS is ready to work, the read output from the command below
> should show '1':
> cat /proc/sys/kernel/sched_energy_aware
>
> To disable EAS in runtime simply 'echo 0' to the file above.

Not related to this commit. If you were implemeting here
schedutil/EAS, then it makes sense to post all this. However what's
the point to describe it in every defconfig change?

> Some test results, which stress the scheduler on Odroid-XU3:
> hackbench -l 500 -s 4096
> With mainline code and with this patch set.

Skip the last sentence - duplicated information.

>
> The tests have been made with and without CONFIG_PROVE_LOCKING (PL)
> (which is set to =y in default exynos_defconfig)
>
>                 |               this patch set                  | mainline

The commit will be applied on its own branch so the meaning of "this
patch set" will be lost. Maybe just "before/after"?

>                 |-----------------------------------------------|---------------
>                 | performance   | SchedUtil     | SchedUtil     | performance
>                 | governor      | governor      | governor      | governor
>                 |               | w/o EAS       | w/ EAS        |
> ----------------|---------------|---------------|---------------|---------------
> hackbench w/ PL | 12.7s         | 11.7s         | 12.0s         | 13.0s - 12.2s
> hackbench w/o PL| 9.2s          | 8.1s          | 8.2s          | 9.2s - 8.4s

Why does the performance different before and after this patch?

Mention - lower better (?). Space between number and unit... or better
mention [s] in column title.

And last but not least:
Why this patch is separate from 1/3? I don't get the need of splitting them.

Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework
  2020-01-27 21:54 ` [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework lukasz.luba
  2020-01-31 13:16   ` Krzysztof Kozlowski
@ 2020-01-31 13:30   ` Bartlomiej Zolnierkiewicz
  2020-01-31 13:31     ` Krzysztof Kozlowski
  2020-01-31 17:38     ` Lukasz Luba
  1 sibling, 2 replies; 20+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2020-01-31 13:30 UTC (permalink / raw)
  To: lukasz.luba
  Cc: kgene, krzk, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, cw00.choi,
	robh+dt, mark.rutland, dietmar.eggemann


Hi,

On 1/27/20 10:54 PM, lukasz.luba@arm.com wrote:
> From: Lukasz Luba <lukasz.luba@arm.com>
> 
> Enable the Energy Model (EM) brings possibility to use Energy Aware
> Scheduler (EAS). This compiles the EM but does not enable to run EAS in
> default. The EAS only works with SchedUtil - a CPUFreq governor which
> handles direct requests from the scheduler for the frequency change. Thus,
> to make EAS working in default, the SchedUtil governor should be
> configured as default CPUFreq governor. Although, the EAS might be enabled
> in runtime, when the EM is present for CPUs, the SchedUtil is compiled and
> then set as CPUFreq governor, i.e.:
> 
> echo schedutil > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> echo schedutil > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
> 
> To check if EAS is ready to work, the read output from the command below
> should show '1':
> cat /proc/sys/kernel/sched_energy_aware
> 
> To disable EAS in runtime simply 'echo 0' to the file above.
> 
> Some test results, which stress the scheduler on Odroid-XU3:
> hackbench -l 500 -s 4096
> With mainline code and with this patch set.
> 
> The tests have been made with and without CONFIG_PROVE_LOCKING (PL)
> (which is set to =y in default exynos_defconfig)
> 
> 		|		this patch set			| mainline
> 		|-----------------------------------------------|---------------
> 		| performance	| SchedUtil	| SchedUtil	| performance
> 		| governor	| governor	| governor	| governor
> 		|		| w/o EAS	| w/ EAS	|
> ----------------|---------------|---------------|---------------|---------------
> hackbench w/ PL | 12.7s		| 11.7s		| 12.0s		| 13.0s - 12.2s
> hackbench w/o PL| 9.2s		| 8.1s		| 8.2s		| 9.2s - 8.4s

Would you happen to have measurements of how much power is
saved by running hackbench using "SchedUtil governor w/ EAS"
instead of "SchedUtil governor w/o EAS"?

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics

> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
> ---
>  arch/arm/configs/exynos_defconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm/configs/exynos_defconfig b/arch/arm/configs/exynos_defconfig
> index 1db857056992..c0f8ecabc607 100644
> --- a/arch/arm/configs/exynos_defconfig
> +++ b/arch/arm/configs/exynos_defconfig
> @@ -18,6 +18,7 @@ CONFIG_ZBOOT_ROM_BSS=0x0
>  CONFIG_ARM_APPENDED_DTB=y
>  CONFIG_ARM_ATAG_DTB_COMPAT=y
>  CONFIG_CMDLINE="root=/dev/ram0 rw ramdisk=8192 initrd=0x41000000,8M console=ttySAC1,115200 init=/linuxrc mem=256M"
> +CONFIG_ENERGY_MODEL=y
>  CONFIG_CPU_FREQ=y
>  CONFIG_CPU_FREQ_STAT=y
>  CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework
  2020-01-31 13:30   ` Bartlomiej Zolnierkiewicz
@ 2020-01-31 13:31     ` Krzysztof Kozlowski
  2020-01-31 13:47       ` Bartlomiej Zolnierkiewicz
  2020-01-31 17:38     ` Lukasz Luba
  1 sibling, 1 reply; 20+ messages in thread
From: Krzysztof Kozlowski @ 2020-01-31 13:31 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: lukasz.luba, kgene, linux-arm-kernel, linux-samsung-soc,
	linux-kernel, devicetree, linux-pm, myungjoo.ham, kyungmin.park,
	Chanwoo Choi, robh+dt, mark.rutland, dietmar.eggemann

On Fri, 31 Jan 2020 at 14:30, Bartlomiej Zolnierkiewicz
<b.zolnierkie@samsung.com> wrote:
>
>
> Hi,
>
> On 1/27/20 10:54 PM, lukasz.luba@arm.com wrote:
> > From: Lukasz Luba <lukasz.luba@arm.com>
> >
> > Enable the Energy Model (EM) brings possibility to use Energy Aware
> > Scheduler (EAS). This compiles the EM but does not enable to run EAS in
> > default. The EAS only works with SchedUtil - a CPUFreq governor which
> > handles direct requests from the scheduler for the frequency change. Thus,
> > to make EAS working in default, the SchedUtil governor should be
> > configured as default CPUFreq governor. Although, the EAS might be enabled
> > in runtime, when the EM is present for CPUs, the SchedUtil is compiled and
> > then set as CPUFreq governor, i.e.:
> >
> > echo schedutil > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> > echo schedutil > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
> >
> > To check if EAS is ready to work, the read output from the command below
> > should show '1':
> > cat /proc/sys/kernel/sched_energy_aware
> >
> > To disable EAS in runtime simply 'echo 0' to the file above.
> >
> > Some test results, which stress the scheduler on Odroid-XU3:
> > hackbench -l 500 -s 4096
> > With mainline code and with this patch set.
> >
> > The tests have been made with and without CONFIG_PROVE_LOCKING (PL)
> > (which is set to =y in default exynos_defconfig)
> >
> >               |               this patch set                  | mainline
> >               |-----------------------------------------------|---------------
> >               | performance   | SchedUtil     | SchedUtil     | performance
> >               | governor      | governor      | governor      | governor
> >               |               | w/o EAS       | w/ EAS        |
> > ----------------|---------------|---------------|---------------|---------------
> > hackbench w/ PL | 12.7s               | 11.7s         | 12.0s         | 13.0s - 12.2s
> > hackbench w/o PL| 9.2s                | 8.1s          | 8.2s          | 9.2s - 8.4s
>
> Would you happen to have measurements of how much power is
> saved by running hackbench using "SchedUtil governor w/ EAS"
> instead of "SchedUtil governor w/o EAS"?

That's a good point and quite important reason behind enabling (or not) EAS...

Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework
  2020-01-31 13:31     ` Krzysztof Kozlowski
@ 2020-01-31 13:47       ` Bartlomiej Zolnierkiewicz
  2020-01-31 13:48         ` Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 20+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2020-01-31 13:47 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: lukasz.luba, kgene, linux-arm-kernel, linux-samsung-soc,
	linux-kernel, devicetree, linux-pm, myungjoo.ham, kyungmin.park,
	Chanwoo Choi, robh+dt, mark.rutland, dietmar.eggemann


On 1/31/20 2:31 PM, Krzysztof Kozlowski wrote:
> On Fri, 31 Jan 2020 at 14:30, Bartlomiej Zolnierkiewicz
> <b.zolnierkie@samsung.com> wrote:
>>
>>
>> Hi,
>>
>> On 1/27/20 10:54 PM, lukasz.luba@arm.com wrote:
>>> From: Lukasz Luba <lukasz.luba@arm.com>
>>>
>>> Enable the Energy Model (EM) brings possibility to use Energy Aware
>>> Scheduler (EAS). This compiles the EM but does not enable to run EAS in
>>> default. The EAS only works with SchedUtil - a CPUFreq governor which
>>> handles direct requests from the scheduler for the frequency change. Thus,
>>> to make EAS working in default, the SchedUtil governor should be
>>> configured as default CPUFreq governor. Although, the EAS might be enabled
>>> in runtime, when the EM is present for CPUs, the SchedUtil is compiled and
>>> then set as CPUFreq governor, i.e.:
>>>
>>> echo schedutil > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
>>> echo schedutil > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
>>>
>>> To check if EAS is ready to work, the read output from the command below
>>> should show '1':
>>> cat /proc/sys/kernel/sched_energy_aware
>>>
>>> To disable EAS in runtime simply 'echo 0' to the file above.
>>>
>>> Some test results, which stress the scheduler on Odroid-XU3:
>>> hackbench -l 500 -s 4096
>>> With mainline code and with this patch set.
>>>
>>> The tests have been made with and without CONFIG_PROVE_LOCKING (PL)
>>> (which is set to =y in default exynos_defconfig)
>>>
>>>               |               this patch set                  | mainline
>>>               |-----------------------------------------------|---------------
>>>               | performance   | SchedUtil     | SchedUtil     | performance
>>>               | governor      | governor      | governor      | governor
>>>               |               | w/o EAS       | w/ EAS        |
>>> ----------------|---------------|---------------|---------------|---------------
>>> hackbench w/ PL | 12.7s               | 11.7s         | 12.0s         | 13.0s - 12.2s
>>> hackbench w/o PL| 9.2s                | 8.1s          | 8.2s          | 9.2s - 8.4s
>>
>> Would you happen to have measurements of how much power is
>> saved by running hackbench using "SchedUtil governor w/ EAS"
>> instead of "SchedUtil governor w/o EAS"?
> 
> That's a good point and quite important reason behind enabling (or not) EAS...

IIUC EAS is enabled by default if you use SchedUtil
governor and Energy Model is available on you platform.

[ SchedUtil governor is enabled in exynos_defconfig
  although not enabled by default currently. ]

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics

> Best regards,
> Krzysztof

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework
  2020-01-31 13:47       ` Bartlomiej Zolnierkiewicz
@ 2020-01-31 13:48         ` Bartlomiej Zolnierkiewicz
  0 siblings, 0 replies; 20+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2020-01-31 13:48 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: lukasz.luba, kgene, linux-arm-kernel, linux-samsung-soc,
	linux-kernel, devicetree, linux-pm, myungjoo.ham, kyungmin.park,
	Chanwoo Choi, robh+dt, mark.rutland, dietmar.eggemann


On 1/31/20 2:47 PM, Bartlomiej Zolnierkiewicz wrote:
> 
> On 1/31/20 2:31 PM, Krzysztof Kozlowski wrote:
>> On Fri, 31 Jan 2020 at 14:30, Bartlomiej Zolnierkiewicz
>> <b.zolnierkie@samsung.com> wrote:
>>>
>>>
>>> Hi,
>>>
>>> On 1/27/20 10:54 PM, lukasz.luba@arm.com wrote:
>>>> From: Lukasz Luba <lukasz.luba@arm.com>
>>>>
>>>> Enable the Energy Model (EM) brings possibility to use Energy Aware
>>>> Scheduler (EAS). This compiles the EM but does not enable to run EAS in
>>>> default. The EAS only works with SchedUtil - a CPUFreq governor which
>>>> handles direct requests from the scheduler for the frequency change. Thus,
>>>> to make EAS working in default, the SchedUtil governor should be
>>>> configured as default CPUFreq governor. Although, the EAS might be enabled
>>>> in runtime, when the EM is present for CPUs, the SchedUtil is compiled and
>>>> then set as CPUFreq governor, i.e.:
>>>>
>>>> echo schedutil > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
>>>> echo schedutil > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
>>>>
>>>> To check if EAS is ready to work, the read output from the command below
>>>> should show '1':
>>>> cat /proc/sys/kernel/sched_energy_aware
>>>>
>>>> To disable EAS in runtime simply 'echo 0' to the file above.
>>>>
>>>> Some test results, which stress the scheduler on Odroid-XU3:
>>>> hackbench -l 500 -s 4096
>>>> With mainline code and with this patch set.
>>>>
>>>> The tests have been made with and without CONFIG_PROVE_LOCKING (PL)
>>>> (which is set to =y in default exynos_defconfig)
>>>>
>>>>               |               this patch set                  | mainline
>>>>               |-----------------------------------------------|---------------
>>>>               | performance   | SchedUtil     | SchedUtil     | performance
>>>>               | governor      | governor      | governor      | governor
>>>>               |               | w/o EAS       | w/ EAS        |
>>>> ----------------|---------------|---------------|---------------|---------------
>>>> hackbench w/ PL | 12.7s               | 11.7s         | 12.0s         | 13.0s - 12.2s
>>>> hackbench w/o PL| 9.2s                | 8.1s          | 8.2s          | 9.2s - 8.4s
>>>
>>> Would you happen to have measurements of how much power is
>>> saved by running hackbench using "SchedUtil governor w/ EAS"
>>> instead of "SchedUtil governor w/o EAS"?
>>
>> That's a good point and quite important reason behind enabling (or not) EAS...
> 
> IIUC EAS is enabled by default if you use SchedUtil
> governor and Energy Model is available on you platform.
> 
> [ SchedUtil governor is enabled in exynos_defconfig
>   although not enabled by default currently. ]

s/enabled/used/

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] ARM: exynos_defconfig: Enable SCHED_MC
  2020-01-31 12:47   ` Krzysztof Kozlowski
@ 2020-01-31 15:59     ` Lukasz Luba
  2020-01-31 20:33       ` Krzysztof Kozlowski
  0 siblings, 1 reply; 20+ messages in thread
From: Lukasz Luba @ 2020-01-31 15:59 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: kgene, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, Chanwoo Choi,
	robh+dt, mark.rutland, Bartłomiej Żołnierkiewicz,
	dietmar.eggemann

Hi Krzysztof,

Thank you for your review, please see my comments below.

On 1/31/20 12:47 PM, Krzysztof Kozlowski wrote:
> On Mon, 27 Jan 2020 at 22:55, <lukasz.luba@arm.com> wrote:
>>
>> From: Lukasz Luba <lukasz.luba@arm.com>
>>
>> Since the 'capacities-dmips-mhz' are present in the CPU nodes, make use of
>> this knowledge in smarter decisions during scheduling.
>>
>> The values in 'capacities-dmips-mhz' are normilized, this means that i.e.
>> when CPU0's capacities-dmips-mhz=100 and CPU1's 'capacities-dmips-mhz'=50,
>> cpu0 is twice fast as CPU1, at the same frequency. The proper hirarchy
>> in sched_domain topology could exploit the SoC architecture advantages
>> like big.LITTLE.
> 
> I do not quite get how this is related to rationale behind changing defconfig...

It is not strictly about EAS, it is useful in general for big.LITTLE
platform with clusters.

> 
>> Enabling the SCHED_MC will create two levels in
>> sched_domain hierarchy, which might be observed in:
> 
> This is looks more convincing... but still what is the need? To work with EAS?

It is not only for EAS, but in general for the scheduler (load balance,
task's wake-up path, etc). The scheduler algorithms iterate CPUs in the
sched groups. To make better decisions, the information about MC domain
is needed. More about the scheduler domains and i.e. load_balance()
you can find here:

https://www.kernel.org/doc/html/latest/scheduler/sched-domains.html


> 
>> grep . /proc/sys/kernel/sched_domain/cpu*/domain*/{name,flags}
>> /proc/sys/kernel/sched_domain/cpu0/domain0/name:MC
>> /proc/sys/kernel/sched_domain/cpu0/domain1/name:DIE
>> ...
>> /proc/sys/kernel/sched_domain/cpu0/domain0/flags:575
>> /proc/sys/kernel/sched_domain/cpu0/domain1/flags:4223
> 
> Not related to defconfig change and not visible after this commit.

Without this patch there is only one domain: 'domain0' -> 'DIE'
cat /proc/sys/kernel/sched_domain/cpu0/domain0/name
DIE

When you apply this patch you will get two: 'domain0, 'domain1'
grep . /proc/sys/kernel/sched_domain/cpu0/domain?/name 

/proc/sys/kernel/sched_domain/cpu0/domain0/name:MC
/proc/sys/kernel/sched_domain/cpu0/domain1/name:DIE

I can remove it this information, but it is the most important
to spot this difference out.

This is also the main reason I haven't merge the patch 1 + 3.

Regards,
Lukasz

> 
> Best regards,
> Krzysztof
> 
>>
>> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
>> ---
>>   arch/arm/configs/exynos_defconfig | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/arm/configs/exynos_defconfig b/arch/arm/configs/exynos_defconfig
>> index e7e4bb5ad8d5..1db857056992 100644
>> --- a/arch/arm/configs/exynos_defconfig
>> +++ b/arch/arm/configs/exynos_defconfig
>> @@ -8,6 +8,7 @@ CONFIG_PERF_EVENTS=y
>>   CONFIG_ARCH_EXYNOS=y
>>   CONFIG_CPU_ICACHE_MISMATCH_WORKAROUND=y
>>   CONFIG_SMP=y
>> +CONFIG_SCHED_MC=y
>>   CONFIG_BIG_LITTLE=y
>>   CONFIG_NR_CPUS=8
>>   CONFIG_HIGHMEM=y
>> --
>> 2.17.1
>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/3] ARM: dts: exynos: Add Exynos5422 CPU dynamic-power-coefficient information
  2020-01-31 13:05   ` Krzysztof Kozlowski
@ 2020-01-31 16:42     ` Lukasz Luba
  0 siblings, 0 replies; 20+ messages in thread
From: Lukasz Luba @ 2020-01-31 16:42 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: kgene, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, Chanwoo Choi,
	robh+dt, mark.rutland, Bartłomiej Żołnierkiewicz,
	dietmar.eggemann



On 1/31/20 1:05 PM, Krzysztof Kozlowski wrote:
> On Mon, 27 Jan 2020 at 22:55, <lukasz.luba@arm.com> wrote:
>>
>> From: Lukasz Luba <lukasz.luba@arm.com>
>>
>> Add dynamic power coefficient into CPU nodes which let CPUFreq subsystem
>> register the Energy Model (EM) for the CPUs.
>>
>> The 'dynamic-power-coefficient' is used for calculating the dynamic power
>> according to the equation in documentation [1].  The Energy Model (EM)
>> framework relies on calculated power and cost for each OPP. The OPP power
>> values come from CPUFreq driver, which registered required callback
>> function. The simple implementation of a CPUFREQ driver, like cpufreq-dt,
>> uses 'dev_pm_opp_of_register_em()' which relay on
>> 'dynamic-power-coefficient' to calculate the power of requested OPP for the
>> EM [2].
>>
>> The calculated values might be checked in
>> /sys/kernel/debug/energy_model/pd*/
>>
>> $ grep . /sys/kernel/debug/energy_model/pd1/cs*/*
>> /sys/kernel/debug/energy_model/pd1/cs:1000000/cost:558
>> /sys/kernel/debug/energy_model/pd1/cs:1000000/frequency:1000000
>> /sys/kernel/debug/energy_model/pd1/cs:1000000/power:310
>> /sys/kernel/debug/energy_model/pd1/cs:1100000/cost:558
>> /sys/kernel/debug/energy_model/pd1/cs:1100000/frequency:1100000
>> /sys/kernel/debug/energy_model/pd1/cs:1100000/power:341
>> /sys/kernel/debug/energy_model/pd1/cs:1200000/cost:558
>> /sys/kernel/debug/energy_model/pd1/cs:1200000/frequency:1200000
>> /sys/kernel/debug/energy_model/pd1/cs:1200000/power:372
>> /sys/kernel/debug/energy_model/pd1/cs:1300000/cost:674
>> /sys/kernel/debug/energy_model/pd1/cs:1300000/frequency:1300000
>> /sys/kernel/debug/energy_model/pd1/cs:1300000/power:487
>> /sys/kernel/debug/energy_model/pd1/cs:1400000/cost:675 ...
>>
>> $ grep . /sys/kernel/debug/energy_model/pd0/cs*/*
>> /sys/kernel/debug/energy_model/pd0/cs:1000000/cost:200
>> /sys/kernel/debug/energy_model/pd0/cs:1000000/frequency:1000000
>> /sys/kernel/debug/energy_model/pd0/cs:1000000/power:154
>> /sys/kernel/debug/energy_model/pd0/cs:1100000/cost:260
>> /sys/kernel/debug/energy_model/pd0/cs:1100000/frequency:1100000
>> /sys/kernel/debug/energy_model/pd0/cs:1100000/power:220
>> /sys/kernel/debug/energy_model/pd0/cs:1200000/cost:260
>> /sys/kernel/debug/energy_model/pd0/cs:1200000/frequency:1200000
>> /sys/kernel/debug/energy_model/pd0/cs:1200000/power:240
>> /sys/kernel/debug/energy_model/pd0/cs:1300000/cost:260
>> /sys/kernel/debug/energy_model/pd0/cs:1300000/frequency:1300000
>> /sys/kernel/debug/energy_model/pd0/cs:1300000/power:260
>> /sys/kernel/debug/energy_model/pd0/cs:200000/cost:130 ...
> 
> Please, do not describe entire Energy Model in commit message touching
> DTS. It brings too much information which look unrelated and therefore
> it makes difficult to spot real rationale behind the change. Just
> mention:
> 1. Why you are doing it?
> 2. What are you doing?
> 3. How did you figure out magic constants here (details of "what")?

OK, I will clean this up.

> 
>> To provide a proper value of the 'dynamic-power-coefficient' the real power
>> can be measured using a dedicated hardware, i.e. INA2xx. The Odroid-XU3
>> hwmon sensors have been used to capture the power value during a sysbench
>> test running on single core and at each possible OPP.
> 
> Since you mention the values, post them. That's the only thing which
> reader cannot get on his own. All other values posted in commit
> message will be seen after running tests...

Makes sense, but as you spotted it can vary probably due to ASV, so I
will skip to put values in commit message.

> 
>> The measured values
>> were divided by 2, since the dynamic power is typically half of the
>> consumed power (the second half is static power). Next, the approximation
>> was made and the power model derived, showing the 'C' value of routhly X.
> 
> s/routhly/roughly/
> 
> What is X?

The 'X' is <128> or <310>

> 
>> Check the example equations in drivers/opp/of.c [2].
>> Thus, i.e. the power = 1.0Watt at 1GHz => 0.5W dynamic power =>
>> dynamic-power-coefficient = 400
>>
>> Using this simple technique we can provide and needed coefficient.  The
> 
> s/and/the/ ?

correct

> 
>> approximation does not have to be super precised. The proportion is
>> important and the difference between power consumed by different CPUs
>> running at the same frequency, which is then used in Energy Aware Scheduler
>> algorithms. An example power values on Odroid-XU3:
>>
>> (LITTLE CPU)
>> /sys/kernel/debug/energy_model/pd0/cs:1000000/frequency:1000000
>> /sys/kernel/debug/energy_model/pd0/cs:1000000/power:154
> 
> For A7, 1V and 1 GHz this gives 142, not 154. Is it correct? What ASV
> are you using?

Good question, it may vary depending on ASV. Would it vary also due to
bootloader?
This one is quite old:
U-Boot 2012.07 (Aug 11 2014 - 18:33:44) for Exynos5422

Odroid-xu3 rev0.2 20140529 ASV regs dump:
EXYNOS_CHIPID_REG_PKG_ID=0x320c832a
EXYNOS_CHIPID_REG_AUX_INFO=0x4f

Odroid-xu4 rev0.1 20180912 ASV regs dump:
EXYNOS_CHIPID_REG_PKG_ID=0x3b0e832a
EXYNOS_CHIPID_REG_AUX_INFO=0x100c004f

> 
>> (big CPU)
>> /sys/kernel/debug/energy_model/pd1/cs:1000000/frequency:1000000
>> /sys/kernel/debug/energy_model/pd1/cs:1000000/power:310
>>
>> In Odroid-XU3 case the derived coefficient value for 'big' CPU has:
>> dynamic-power-coefficient = <310>;
>> while the 'LITTLE':
>> dynamic-power-coefficient = <128>;
> 
> Make it all compact. First, you mention power values which are the
> same as in the beginning of this commit message. Why repeating? Then
> you mention the power coefficient in 4 lines instead of simple:
> For Odroid XU3, the derived power coefficient is then 128 for an A7
> CPU and 310 for an A15 CPU. Or something similar.

OK, I will keep simple, as you have commented.

> 
>>
>> [1] Documentation/devicetree/bindings/arm/cpus.yaml
>> [2] https://elixir.bootlin.com/linux/v5.4/source/drivers/opp/of.c#L1044
> 
> Refer to path inside, no external sources unless needed.

OK

Regards,
Lukasz

> 
> Best regards,
> Krzysztof
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework
  2020-01-31 13:16   ` Krzysztof Kozlowski
@ 2020-01-31 17:30     ` Lukasz Luba
  2020-01-31 20:41       ` Krzysztof Kozlowski
  0 siblings, 1 reply; 20+ messages in thread
From: Lukasz Luba @ 2020-01-31 17:30 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: kgene, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, Chanwoo Choi,
	robh+dt, mark.rutland, Bartłomiej Żołnierkiewicz,
	dietmar.eggemann

Hi Krzysztof,

On 1/31/20 1:16 PM, Krzysztof Kozlowski wrote:
> On Mon, 27 Jan 2020 at 22:55, <lukasz.luba@arm.com> wrote:
>>
>> From: Lukasz Luba <lukasz.luba@arm.com>
>>
>> Enable the Energy Model (EM) brings possibility to use Energy Aware
>> Scheduler (EAS). This compiles the EM but does not enable to run EAS in
>> default. The EAS only works with SchedUtil - a CPUFreq governor which
>> handles direct requests from the scheduler for the frequency change. Thus,
>> to make EAS working in default, the SchedUtil governor should be
>> configured as default CPUFreq governor.
> 
> Full stop. That's enough of needed explanation of schedutil.

OK

> 
>> Although, the EAS might be enabled
>> in runtime, when the EM is present for CPUs, the SchedUtil is compiled and
>> then set as CPUFreq governor, i.e.:
>>
>> echo schedutil > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
>> echo schedutil > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
>>
>> To check if EAS is ready to work, the read output from the command below
>> should show '1':
>> cat /proc/sys/kernel/sched_energy_aware
>>
>> To disable EAS in runtime simply 'echo 0' to the file above.
> 
> Not related to this commit. If you were implemeting here
> schedutil/EAS, then it makes sense to post all this. However what's
> the point to describe it in every defconfig change?

I will drop it.

> 
>> Some test results, which stress the scheduler on Odroid-XU3:
>> hackbench -l 500 -s 4096
>> With mainline code and with this patch set.
> 
> Skip the last sentence - duplicated information.

OK

> 
>>
>> The tests have been made with and without CONFIG_PROVE_LOCKING (PL)
>> (which is set to =y in default exynos_defconfig)
>>
>>                  |               this patch set                  | mainline
> 
> The commit will be applied on its own branch so the meaning of "this
> patch set" will be lost. Maybe just "before/after"?

OK

> 
>>                  |-----------------------------------------------|---------------
>>                  | performance   | SchedUtil     | SchedUtil     | performance
>>                  | governor      | governor      | governor      | governor
>>                  |               | w/o EAS       | w/ EAS        |
>> ----------------|---------------|---------------|---------------|---------------
>> hackbench w/ PL | 12.7s         | 11.7s         | 12.0s         | 13.0s - 12.2s
>> hackbench w/o PL| 9.2s          | 8.1s          | 8.2s          | 9.2s - 8.4s
> 
> Why does the performance different before and after this patch?

Probably due to better locality and cache utilization. I can see that
there is ~700k context switches vs ~450k and ~160k migrations vs ~50k.
If you need to communicate two threads in different clusters, it will go
through CCI.

> 
> Mention - lower better (?). Space between number and unit... or better
> mention [s] in column title.

OK

> 
> And last but not least:
> Why this patch is separate from 1/3? I don't get the need of splitting them.

As mentioned in response to patch 1/3. The fist patch would create MC
domain, something different than Energy Model or EAS. The decisions in
the scheduler would be different.

I can merge 1/3 and 3/3 if you like, though.

Regards,
Lukasz

> 
> Best regards,
> Krzysztof
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework
  2020-01-31 13:30   ` Bartlomiej Zolnierkiewicz
  2020-01-31 13:31     ` Krzysztof Kozlowski
@ 2020-01-31 17:38     ` Lukasz Luba
  1 sibling, 0 replies; 20+ messages in thread
From: Lukasz Luba @ 2020-01-31 17:38 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: kgene, krzk, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, cw00.choi,
	robh+dt, mark.rutland, dietmar.eggemann

Hi Bartek,

On 1/31/20 1:30 PM, Bartlomiej Zolnierkiewicz wrote:
> 
> Hi,
> 
> On 1/27/20 10:54 PM, lukasz.luba@arm.com wrote:
>> From: Lukasz Luba <lukasz.luba@arm.com>
>>
>> Enable the Energy Model (EM) brings possibility to use Energy Aware
>> Scheduler (EAS). This compiles the EM but does not enable to run EAS in
>> default. The EAS only works with SchedUtil - a CPUFreq governor which
>> handles direct requests from the scheduler for the frequency change. Thus,
>> to make EAS working in default, the SchedUtil governor should be
>> configured as default CPUFreq governor. Although, the EAS might be enabled
>> in runtime, when the EM is present for CPUs, the SchedUtil is compiled and
>> then set as CPUFreq governor, i.e.:
>>
>> echo schedutil > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
>> echo schedutil > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
>>
>> To check if EAS is ready to work, the read output from the command below
>> should show '1':
>> cat /proc/sys/kernel/sched_energy_aware
>>
>> To disable EAS in runtime simply 'echo 0' to the file above.
>>
>> Some test results, which stress the scheduler on Odroid-XU3:
>> hackbench -l 500 -s 4096
>> With mainline code and with this patch set.
>>
>> The tests have been made with and without CONFIG_PROVE_LOCKING (PL)
>> (which is set to =y in default exynos_defconfig)
>>
>> 		|		this patch set			| mainline
>> 		|-----------------------------------------------|---------------
>> 		| performance	| SchedUtil	| SchedUtil	| performance
>> 		| governor	| governor	| governor	| governor
>> 		|		| w/o EAS	| w/ EAS	|
>> ----------------|---------------|---------------|---------------|---------------
>> hackbench w/ PL | 12.7s		| 11.7s		| 12.0s		| 13.0s - 12.2s
>> hackbench w/o PL| 9.2s		| 8.1s		| 8.2s		| 9.2s - 8.4s
> 
> Would you happen to have measurements of how much power is
> saved by running hackbench using "SchedUtil governor w/ EAS"
> instead of "SchedUtil governor w/o EAS"?

I need to check if this xu3 ina2xx can aggregate energy or
it's only drained-current-at-that-moment value.

Regards,
Lukasz

> 
> Best regards,
> --
> Bartlomiej Zolnierkiewicz
> Samsung R&D Institute Poland
> Samsung Electronics
> 
>> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
>> ---
>>   arch/arm/configs/exynos_defconfig | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/arm/configs/exynos_defconfig b/arch/arm/configs/exynos_defconfig
>> index 1db857056992..c0f8ecabc607 100644
>> --- a/arch/arm/configs/exynos_defconfig
>> +++ b/arch/arm/configs/exynos_defconfig
>> @@ -18,6 +18,7 @@ CONFIG_ZBOOT_ROM_BSS=0x0
>>   CONFIG_ARM_APPENDED_DTB=y
>>   CONFIG_ARM_ATAG_DTB_COMPAT=y
>>   CONFIG_CMDLINE="root=/dev/ram0 rw ramdisk=8192 initrd=0x41000000,8M console=ttySAC1,115200 init=/linuxrc mem=256M"
>> +CONFIG_ENERGY_MODEL=y
>>   CONFIG_CPU_FREQ=y
>>   CONFIG_CPU_FREQ_STAT=y
>>   CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND=y

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/3] ARM: exynos_defconfig: Enable SCHED_MC
  2020-01-31 15:59     ` Lukasz Luba
@ 2020-01-31 20:33       ` Krzysztof Kozlowski
  0 siblings, 0 replies; 20+ messages in thread
From: Krzysztof Kozlowski @ 2020-01-31 20:33 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: kgene, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, Chanwoo Choi,
	robh+dt, mark.rutland, Bartłomiej Żołnierkiewicz,
	dietmar.eggemann

On Fri, Jan 31, 2020 at 03:59:30PM +0000, Lukasz Luba wrote:
> Hi Krzysztof,
> 
> Thank you for your review, please see my comments below.
> 
> On 1/31/20 12:47 PM, Krzysztof Kozlowski wrote:
> > On Mon, 27 Jan 2020 at 22:55, <lukasz.luba@arm.com> wrote:
> > > 
> > > From: Lukasz Luba <lukasz.luba@arm.com>
> > > 
> > > Since the 'capacities-dmips-mhz' are present in the CPU nodes, make use of
> > > this knowledge in smarter decisions during scheduling.
> > > 
> > > The values in 'capacities-dmips-mhz' are normilized, this means that i.e.
> > > when CPU0's capacities-dmips-mhz=100 and CPU1's 'capacities-dmips-mhz'=50,
> > > cpu0 is twice fast as CPU1, at the same frequency. The proper hirarchy
> > > in sched_domain topology could exploit the SoC architecture advantages
> > > like big.LITTLE.
> > 
> > I do not quite get how this is related to rationale behind changing defconfig...
> 
> It is not strictly about EAS, it is useful in general for big.LITTLE
> platform with clusters.
> 
> > 
> > > Enabling the SCHED_MC will create two levels in
> > > sched_domain hierarchy, which might be observed in:
> > 
> > This is looks more convincing... but still what is the need? To work with EAS?
> 
> It is not only for EAS, but in general for the scheduler (load balance,
> task's wake-up path, etc). The scheduler algorithms iterate CPUs in the
> sched groups. To make better decisions, the information about MC domain
> is needed. More about the scheduler domains and i.e. load_balance()
> you can find here:
> 
> https://www.kernel.org/doc/html/latest/scheduler/sched-domains.html

Ahhh, I see, it's independent of later patches. Somehow I had impression
it is a prerequisite...

> 
> > 
> > > grep . /proc/sys/kernel/sched_domain/cpu*/domain*/{name,flags}
> > > /proc/sys/kernel/sched_domain/cpu0/domain0/name:MC
> > > /proc/sys/kernel/sched_domain/cpu0/domain1/name:DIE
> > > ...
> > > /proc/sys/kernel/sched_domain/cpu0/domain0/flags:575
> > > /proc/sys/kernel/sched_domain/cpu0/domain1/flags:4223
> > 
> > Not related to defconfig change and not visible after this commit.
> 
> Without this patch there is only one domain: 'domain0' -> 'DIE'
> cat /proc/sys/kernel/sched_domain/cpu0/domain0/name
> DIE
> 
> When you apply this patch you will get two: 'domain0, 'domain1'
> grep . /proc/sys/kernel/sched_domain/cpu0/domain?/name
> 
> /proc/sys/kernel/sched_domain/cpu0/domain0/name:MC
> /proc/sys/kernel/sched_domain/cpu0/domain1/name:DIE
> 
> I can remove it this information, but it is the most important
> to spot this difference out.
> 
> This is also the main reason I haven't merge the patch 1 + 3.

Indeed. I thought that these will pop up at the end of the patchset, my
bad.

I do not see big benefits of adding these outputs as proofs of working
SCHED_MC because they are kind of obvious. It is not a measurement but
report of current system state. However they don't harm neither, so I am
fine with it.

However please us in commit msg also the name of turned on option, next
or instead of SCHED_MC.  The options might be sometimes cryptic or too
vague and the name actually easily expresses what you want enable.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework
  2020-01-31 17:30     ` Lukasz Luba
@ 2020-01-31 20:41       ` Krzysztof Kozlowski
  2020-02-05 12:49         ` Lukasz Luba
  0 siblings, 1 reply; 20+ messages in thread
From: Krzysztof Kozlowski @ 2020-01-31 20:41 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: kgene, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, Chanwoo Choi,
	robh+dt, mark.rutland, Bartłomiej Żołnierkiewicz,
	dietmar.eggemann

On Fri, Jan 31, 2020 at 05:30:46PM +0000, Lukasz Luba wrote:
 
> > 
> > >                  |-----------------------------------------------|---------------
> > >                  | performance   | SchedUtil     | SchedUtil     | performance
> > >                  | governor      | governor      | governor      | governor
> > >                  |               | w/o EAS       | w/ EAS        |
> > > ----------------|---------------|---------------|---------------|---------------
> > > hackbench w/ PL | 12.7s         | 11.7s         | 12.0s         | 13.0s - 12.2s
> > > hackbench w/o PL| 9.2s          | 8.1s          | 8.2s          | 9.2s - 8.4s
> > 
> > Why does the performance different before and after this patch?
> 
> Probably due to better locality and cache utilization. I can see that
> there is ~700k context switches vs ~450k and ~160k migrations vs ~50k.
> If you need to communicate two threads in different clusters, it will go
> through CCI.

Mhmm... I was not specific - I mean, "performance governor". All this
you mentioned should not differ between performance governor before and
after. However once you have 12.7, then 13.0 - 12.2. Unless multi-core
scheduler affects it... but then these numbers here are not showing
only this change, but also the SCHED_MC effect.  In such case each of
commits should be coming with their own numbers.

> As mentioned in response to patch 1/3. The fist patch would create MC
> domain, something different than Energy Model or EAS. The decisions in
> the scheduler would be different.
> 
> I can merge 1/3 and 3/3 if you like, though.

I understand now that their independent. Still, they are part of one
goal to tune the scheduler for Exynos platform. Splitting these looks
too much, like enabling multiple drivers one after another.

However if you provide numbers for each of cases (before patches, multi
core scheduler, energy model with DTS), then I see benefit of splitting
it.  Each commit would have its own rationale.  I am not sure if it is
worth such investigation - that's just defconfig... distros might ignore
it anyway.

Best regards,
Krzysztof


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework
  2020-01-31 20:41       ` Krzysztof Kozlowski
@ 2020-02-05 12:49         ` Lukasz Luba
  2020-02-06 12:59           ` Krzysztof Kozlowski
  0 siblings, 1 reply; 20+ messages in thread
From: Lukasz Luba @ 2020-02-05 12:49 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: kgene, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, Chanwoo Choi,
	robh+dt, mark.rutland, Bartłomiej Żołnierkiewicz,
	dietmar.eggemann

Hi Krzysztof,

On 1/31/20 8:41 PM, Krzysztof Kozlowski wrote:
> On Fri, Jan 31, 2020 at 05:30:46PM +0000, Lukasz Luba wrote:
>   
>>>
>>>>                   |-----------------------------------------------|---------------
>>>>                   | performance   | SchedUtil     | SchedUtil     | performance
>>>>                   | governor      | governor      | governor      | governor
>>>>                   |               | w/o EAS       | w/ EAS        |
>>>> ----------------|---------------|---------------|---------------|---------------
>>>> hackbench w/ PL | 12.7s         | 11.7s         | 12.0s         | 13.0s - 12.2s
>>>> hackbench w/o PL| 9.2s          | 8.1s          | 8.2s          | 9.2s - 8.4s
>>>
>>> Why does the performance different before and after this patch?
>>
>> Probably due to better locality and cache utilization. I can see that
>> there is ~700k context switches vs ~450k and ~160k migrations vs ~50k.
>> If you need to communicate two threads in different clusters, it will go
>> through CCI.
> 
> Mhmm... I was not specific - I mean, "performance governor". All this
> you mentioned should not differ between performance governor before and
> after. However once you have 12.7, then 13.0 - 12.2. Unless multi-core
> scheduler affects it... but then these numbers here are not showing
> only this change, but also the SCHED_MC effect.  In such case each of
> commits should be coming with their own numbers.

Agree, I should have not put 'this patch set' in the commit
msg. It should go into the cover letter and avoid this confusion.
You are right with ' Unless multi-core scheduler affects it...',
that's why when the SCHED_MC is missing, the decisions about task
placing might cause this variation and delay '13.0 - 12.2' seconds.

> 
>> As mentioned in response to patch 1/3. The fist patch would create MC
>> domain, something different than Energy Model or EAS. The decisions in
>> the scheduler would be different.
>>
>> I can merge 1/3 and 3/3 if you like, though.
> 
> I understand now that their independent. Still, they are part of one
> goal to tune the scheduler for Exynos platform. Splitting these looks
> too much, like enabling multiple drivers one after another.
> 
> However if you provide numbers for each of cases (before patches, multi
> core scheduler, energy model with DTS), then I see benefit of splitting
> it.  Each commit would have its own rationale.  I am not sure if it is
> worth such investigation - that's just defconfig... distros might ignore
> it anyway.

Good point, and I agree that it would require more investigation, for
which unfortunately I don't have currently spare cycles.

Should I merge patch 1/3 and 3/3 and send the v2 with a cover letter
which would have the test results?

Regards,
Lukasz


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework
  2020-02-05 12:49         ` Lukasz Luba
@ 2020-02-06 12:59           ` Krzysztof Kozlowski
  2020-02-06 14:15             ` Lukasz Luba
  0 siblings, 1 reply; 20+ messages in thread
From: Krzysztof Kozlowski @ 2020-02-06 12:59 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: kgene, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, Chanwoo Choi,
	robh+dt, mark.rutland, Bartłomiej Żołnierkiewicz,
	dietmar.eggemann

On Wed, 5 Feb 2020 at 13:49, Lukasz Luba <lukasz.luba@arm.com> wrote:
> >> As mentioned in response to patch 1/3. The fist patch would create MC
> >> domain, something different than Energy Model or EAS. The decisions in
> >> the scheduler would be different.
> >>
> >> I can merge 1/3 and 3/3 if you like, though.
> >
> > I understand now that their independent. Still, they are part of one
> > goal to tune the scheduler for Exynos platform. Splitting these looks
> > too much, like enabling multiple drivers one after another.
> >
> > However if you provide numbers for each of cases (before patches, multi
> > core scheduler, energy model with DTS), then I see benefit of splitting
> > it.  Each commit would have its own rationale.  I am not sure if it is
> > worth such investigation - that's just defconfig... distros might ignore
> > it anyway.
>
> Good point, and I agree that it would require more investigation, for
> which unfortunately I don't have currently spare cycles.
>
> Should I merge patch 1/3 and 3/3 and send the v2 with a cover letter
> which would have the test results?

Yes, let's do this way.

Thanks for working on this!

Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework
  2020-02-06 12:59           ` Krzysztof Kozlowski
@ 2020-02-06 14:15             ` Lukasz Luba
  0 siblings, 0 replies; 20+ messages in thread
From: Lukasz Luba @ 2020-02-06 14:15 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: kgene, linux-arm-kernel, linux-samsung-soc, linux-kernel,
	devicetree, linux-pm, myungjoo.ham, kyungmin.park, Chanwoo Choi,
	robh+dt, mark.rutland, Bartłomiej Żołnierkiewicz,
	dietmar.eggemann



On 2/6/20 12:59 PM, Krzysztof Kozlowski wrote:
> On Wed, 5 Feb 2020 at 13:49, Lukasz Luba <lukasz.luba@arm.com> wrote:
>>>> As mentioned in response to patch 1/3. The fist patch would create MC
>>>> domain, something different than Energy Model or EAS. The decisions in
>>>> the scheduler would be different.
>>>>
>>>> I can merge 1/3 and 3/3 if you like, though.
>>>
>>> I understand now that their independent. Still, they are part of one
>>> goal to tune the scheduler for Exynos platform. Splitting these looks
>>> too much, like enabling multiple drivers one after another.
>>>
>>> However if you provide numbers for each of cases (before patches, multi
>>> core scheduler, energy model with DTS), then I see benefit of splitting
>>> it.  Each commit would have its own rationale.  I am not sure if it is
>>> worth such investigation - that's just defconfig... distros might ignore
>>> it anyway.
>>
>> Good point, and I agree that it would require more investigation, for
>> which unfortunately I don't have currently spare cycles.
>>
>> Should I merge patch 1/3 and 3/3 and send the v2 with a cover letter
>> which would have the test results?
> 
> Yes, let's do this way.

Thank you, I will send the v2 then.

Regards,
Lukasz

> 
> Thanks for working on this!
> 
> Best regards,
> Krzysztof
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2020-02-06 14:15 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-27 21:54 [PATCH 0/3] Enable Odroid-XU3/4 to use Energy Model and Energy Aware Scheduler lukasz.luba
2020-01-27 21:54 ` [PATCH 1/3] ARM: exynos_defconfig: Enable SCHED_MC lukasz.luba
2020-01-31 12:47   ` Krzysztof Kozlowski
2020-01-31 15:59     ` Lukasz Luba
2020-01-31 20:33       ` Krzysztof Kozlowski
2020-01-27 21:54 ` [PATCH 2/3] ARM: dts: exynos: Add Exynos5422 CPU dynamic-power-coefficient information lukasz.luba
2020-01-31 13:05   ` Krzysztof Kozlowski
2020-01-31 16:42     ` Lukasz Luba
2020-01-27 21:54 ` [PATCH 3/3] ARM: exynos_defconfig: Enable Energy Model framework lukasz.luba
2020-01-31 13:16   ` Krzysztof Kozlowski
2020-01-31 17:30     ` Lukasz Luba
2020-01-31 20:41       ` Krzysztof Kozlowski
2020-02-05 12:49         ` Lukasz Luba
2020-02-06 12:59           ` Krzysztof Kozlowski
2020-02-06 14:15             ` Lukasz Luba
2020-01-31 13:30   ` Bartlomiej Zolnierkiewicz
2020-01-31 13:31     ` Krzysztof Kozlowski
2020-01-31 13:47       ` Bartlomiej Zolnierkiewicz
2020-01-31 13:48         ` Bartlomiej Zolnierkiewicz
2020-01-31 17:38     ` Lukasz Luba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).