linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/4] arm: remove cpu_efficiency
@ 2017-08-30 14:41 Dietmar Eggemann
  2017-08-30 14:41 ` [PATCH 1/4] arm: topology: " Dietmar Eggemann
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Dietmar Eggemann @ 2017-08-30 14:41 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc
  Cc: Russell King, Rob Herring, Mark Rutland, Kukjin Kim,
	Krzysztof Kozlowski, Vincent Guittot, Juri Lelli

For Cortex-A15/A7 arm big.LITTLE systems there are currently two ways to
set the cpu capacity.

The first one (commit 06073ee26775 "ARM: 8621/3: parse cpu
capacity-dmips-mhz from DT") is based on dt 'cpu capacity-dmips-mhz'
bindings and the appropriate dt parsing code in
drivers/base/arch_topology.c. It further takes differences in maximum
cpu frequency values into consideration, normalizes the maximum cpu
capacity to SCHED_CAPACITY_SCALE (1024) and scales all the cpus
accordingly.

  cpu capacity = (capacity-dmips-mhz * max cpu frequency) /
                 (max (capacity-dmips-mhz) * max (max cpu frequency))

This solution is shared between arm and arm64 and works for other
combinations of big and little cpus (besides Cortex-A15/A7) as well.
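
As a rough illustration of the formula above, the following Python
sketch computes the normalized capacities for the peach-pi values quoted
in patch 2/4 (capacity-dmips-mhz of 1024/539, max cpu frequencies of
1.8 GHz/1.3 GHz). The function name, the explicit multiplication by
SCHED_CAPACITY_SCALE and the integer rounding are assumptions made for
this example; the actual implementation lives in
drivers/base/arch_topology.c and may differ in detail.

```python
SCHED_CAPACITY_SCALE = 1024


def normalized_capacities(cpus):
    """cpus: list of (capacity_dmips_mhz, max_freq_khz) tuples, one per cpu.

    Hypothetical helper: scales each cpu against the largest
    capacity-dmips-mhz value and the largest max frequency in the system.
    """
    max_dmips = max(dmips for dmips, _ in cpus)
    max_freq = max(freq for _, freq in cpus)
    return [
        dmips * freq * SCHED_CAPACITY_SCALE // (max_dmips * max_freq)
        for dmips, freq in cpus
    ]


# Four Cortex-A15s at 1.8 GHz, four Cortex-A7s at 1.3 GHz.
cpus = [(1024, 1800000)] * 4 + [(539, 1300000)] * 4
print(normalized_capacities(cpus))
```

With these inputs the big cpus end up at 1024 and the little cpus at
389, which matches the cpu_capacity sysfs output shown in patch 2/4.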

The second one (commit 339ca09d7ada "ARM: 7463/1: topology: Update
cpu_power according to DT information") is based on the 'struct
cpu_efficiency table_efficiency[]' and the dt parsing code in
arch/arm/kernel/topology.c. It further requires a clock-frequency
property per cpu node, calculates a so-called middle capacity so that
the capacity of an average cpu in the system is as close as possible to
SCHED_CAPACITY_SCALE (1024) and uses this to compute the cpu capacity
values.

  cpu capacity = (cpu efficiency * clock frequency) / middle capacity

This solution only works for Cortex-A15/A7 arm big.LITTLE systems.
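
For comparison, here is a minimal Python sketch of the second solution,
modelled on the parse_dt_topology()/update_cpu_capacity() code that
patch 1/4 removes. The function name is hypothetical, and the 1.8 GHz
clock-frequency value for the Cortex-A15s is an assumption (the 1 GHz
value for the Cortex-A7s is the one mentioned in patch 2/4).

```python
SCHED_CAPACITY_SHIFT = 10

# Efficiency values from table_efficiency[] in arch/arm/kernel/topology.c.
EFFICIENCY = {"arm,cortex-a15": 3891, "arm,cortex-a7": 2048}


def legacy_capacities(cpus):
    """cpus: list of (compatible, clock_frequency_hz) tuples, one per cpu."""
    # Raw capacity: clock-frequency (roughly in MHz) times efficiency.
    raw = [(hz >> 20) * EFFICIENCY[compat] for compat, hz in cpus]
    lo, hi = min(raw), max(raw)

    # Pick middle_capacity so that an 'average' cpu lands near 1024
    # while the largest value stays below 3 * 1024 / 2.
    if 4 * hi < 3 * (hi + lo):
        middle = (lo + hi) >> (SCHED_CAPACITY_SHIFT + 1)
    else:
        middle = ((hi // 3) >> (SCHED_CAPACITY_SHIFT - 1)) + 1

    return [r // middle for r in raw]


cpus = [("arm,cortex-a15", 1800000000)] * 4 + [("arm,cortex-a7", 1000000000)] * 4
print(legacy_capacities(cpus))
```

Under these assumptions the sketch yields 1535 for the big cpus and 448
for the little cpus, i.e. the values reported for the
'cpu_efficiency/clock-frequency dt property' solution in patch 2/4. Note
that the highest value is not 1024 here.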

The aim of this patch-set is to have only one solution for all arm and
arm64 big.LITTLE platforms.

(1) Therefore, it removes the code for the 'cpu_efficiency/
    clock-frequency dt property' (second) solution [patch 01/04] and
    migrates the arm big.LITTLE platforms currently using this approach
    [patch 02-04/04] to use the 'cpu capacity-dmips-mhz' (first)
    solution.

(2) Moreover, it also ensures that the highest original cpu capacity
    (rq->cpu_capacity_orig) in a non-smt system is SCHED_CAPACITY_SCALE
    (1024).

(3) And finally, another advantage is the dynamic detection of the max
    cpu frequency which comes with the first solution instead of the
    static clock-frequency dt property value.

Currently, the arm dt parsing code in parse_dt_topology() checks if the
dt uses the capacity-dmips-mhz property. If this is the case it uses
the first, otherwise the second solution. This patch-set removes the
code for the second solution from arch/arm/kernel/topology.c.

The following arm big.LITTLE platforms which use cpu node descriptions
with the 'compatible' properties "arm,cortex-a15" and "arm,cortex-a7"
as well as the "clock-frequency" are (theoretically*) affected:

(1) arndale-octa, peach-pi, peach-pit, smdk5420 (exynos5420-cpus.dtsi)

(2) odroidxu3, odroidxu3-lite, odroidxu4 (exynos5422-cpus.dtsi)

(3) r8a7790-lager (r8a7790.dtsi)

TC2 (vexpress-v2p-ca15_a7.dts) already has the capacity-dmips-mhz
properties (it never had "clock-frequency" properties per cpu node
though).

*Currently, these platforms are only theoretically affected. The reason
is that heterogeneous cpu capacity support on arm stopped working with
commit 8cd5601c5060 ("sched/fair: Convert arch_scale_cpu_capacity() from
weak function to #define"): the arch never defined
arch_scale_cpu_capacity, so the task scheduler uses the default
implementation in kernel/sched/sched.h. This will change as soon as the
patch "arm: wire cpu-invariant accounting support up to the task
scheduler" [1] is in mainline.

This patch-set has been tested on TC2 and Samsung Chromebook 2 13"
(peach-pi, Exynos 5800).

[1] https://marc.info/?l=linux-kernel&m=150367158111303&w=2

Dietmar Eggemann (4):
  arm: topology: remove cpu_efficiency
  arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information
  arm: dts: exynos: add exynos5422 cpu capacity-dmips-mhz information
  arm: dts: r8a7790: add cpu capacity-dmips-mhz information

 arch/arm/boot/dts/exynos5420-cpus.dtsi |   8 +++
 arch/arm/boot/dts/exynos5422-cpus.dtsi |   8 +++
 arch/arm/boot/dts/r8a7790.dtsi         |   8 +++
 arch/arm/kernel/topology.c             | 113 +--------------------------------
 4 files changed, 27 insertions(+), 110 deletions(-)

-- 
2.11.0


* [PATCH 1/4] arm: topology: remove cpu_efficiency
  2017-08-30 14:41 [PATCH 0/4] arm: remove cpu_efficiency Dietmar Eggemann
@ 2017-08-30 14:41 ` Dietmar Eggemann
  2017-09-04  7:49   ` Vincent Guittot
  2017-08-30 14:41 ` [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information Dietmar Eggemann
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Dietmar Eggemann @ 2017-08-30 14:41 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc
  Cc: Russell King, Rob Herring, Mark Rutland, Kukjin Kim,
	Krzysztof Kozlowski, Vincent Guittot, Juri Lelli

Remove the 'cpu_efficiency/clock-frequency dt property' based solution
to set cpu capacity which was only working for Cortex-A15/A7 arm
big.LITTLE systems.

I.e. the 'capacity-dmips-mhz' based solution is now the only one. It is
shared between arm and arm64 and works for every big.LITTLE system no
matter which core types it consists of.

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 arch/arm/kernel/topology.c | 113 ++-------------------------------------------
 1 file changed, 3 insertions(+), 110 deletions(-)

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index bf949a763dbe..04ccfdd94213 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -47,52 +47,10 @@
  */
 
 #ifdef CONFIG_OF
-struct cpu_efficiency {
-	const char *compatible;
-	unsigned long efficiency;
-};
-
-/*
- * Table of relative efficiency of each processors
- * The efficiency value must fit in 20bit and the final
- * cpu_scale value must be in the range
- *   0 < cpu_scale < 3*SCHED_CAPACITY_SCALE/2
- * in order to return at most 1 when DIV_ROUND_CLOSEST
- * is used to compute the capacity of a CPU.
- * Processors that are not defined in the table,
- * use the default SCHED_CAPACITY_SCALE value for cpu_scale.
- */
-static const struct cpu_efficiency table_efficiency[] = {
-	{"arm,cortex-a15", 3891},
-	{"arm,cortex-a7",  2048},
-	{NULL, },
-};
-
-static unsigned long *__cpu_capacity;
-#define cpu_capacity(cpu)	__cpu_capacity[cpu]
-
-static unsigned long middle_capacity = 1;
-static bool cap_from_dt = true;
-
-/*
- * Iterate all CPUs' descriptor in DT and compute the efficiency
- * (as per table_efficiency). Also calculate a middle efficiency
- * as close as possible to  (max{eff_i} - min{eff_i}) / 2
- * This is later used to scale the cpu_capacity field such that an
- * 'average' CPU is of middle capacity. Also see the comments near
- * table_efficiency[] and update_cpu_capacity().
- */
 static void __init parse_dt_topology(void)
 {
-	const struct cpu_efficiency *cpu_eff;
-	struct device_node *cn = NULL;
-	unsigned long min_capacity = ULONG_MAX;
-	unsigned long max_capacity = 0;
-	unsigned long capacity = 0;
-	int cpu = 0;
-
-	__cpu_capacity = kcalloc(nr_cpu_ids, sizeof(*__cpu_capacity),
-				 GFP_NOWAIT);
+	struct device_node *cn;
+	int cpu;
 
 	cn = of_find_node_by_path("/cpus");
 	if (!cn) {
@@ -101,9 +59,6 @@ static void __init parse_dt_topology(void)
 	}
 
 	for_each_possible_cpu(cpu) {
-		const u32 *rate;
-		int len;
-
 		/* too early to use cpu->of_node */
 		cn = of_get_cpu_node(cpu, NULL);
 		if (!cn) {
@@ -115,73 +70,13 @@ static void __init parse_dt_topology(void)
 			of_node_put(cn);
 			continue;
 		}
-
-		cap_from_dt = false;
-
-		for (cpu_eff = table_efficiency; cpu_eff->compatible; cpu_eff++)
-			if (of_device_is_compatible(cn, cpu_eff->compatible))
-				break;
-
-		if (cpu_eff->compatible == NULL)
-			continue;
-
-		rate = of_get_property(cn, "clock-frequency", &len);
-		if (!rate || len != 4) {
-			pr_err("%s missing clock-frequency property\n",
-				cn->full_name);
-			continue;
-		}
-
-		capacity = ((be32_to_cpup(rate)) >> 20) * cpu_eff->efficiency;
-
-		/* Save min capacity of the system */
-		if (capacity < min_capacity)
-			min_capacity = capacity;
-
-		/* Save max capacity of the system */
-		if (capacity > max_capacity)
-			max_capacity = capacity;
-
-		cpu_capacity(cpu) = capacity;
 	}
 
-	/* If min and max capacities are equals, we bypass the update of the
-	 * cpu_scale because all CPUs have the same capacity. Otherwise, we
-	 * compute a middle_capacity factor that will ensure that the capacity
-	 * of an 'average' CPU of the system will be as close as possible to
-	 * SCHED_CAPACITY_SCALE, which is the default value, but with the
-	 * constraint explained near table_efficiency[].
-	 */
-	if (4*max_capacity < (3*(max_capacity + min_capacity)))
-		middle_capacity = (min_capacity + max_capacity)
-				>> (SCHED_CAPACITY_SHIFT+1);
-	else
-		middle_capacity = ((max_capacity / 3)
-				>> (SCHED_CAPACITY_SHIFT-1)) + 1;
-
-	if (cap_from_dt)
-		topology_normalize_cpu_scale();
-}
-
-/*
- * Look for a customed capacity of a CPU in the cpu_capacity table during the
- * boot. The update of all CPUs is in O(n^2) for heteregeneous system but the
- * function returns directly for SMP system.
- */
-static void update_cpu_capacity(unsigned int cpu)
-{
-	if (!cpu_capacity(cpu) || cap_from_dt)
-		return;
-
-	topology_set_cpu_scale(cpu, cpu_capacity(cpu) / middle_capacity);
-
-	pr_info("CPU%u: update cpu_capacity %lu\n",
-		cpu, topology_get_cpu_scale(NULL, cpu));
+	topology_normalize_cpu_scale();
 }
 
 #else
 static inline void parse_dt_topology(void) {}
-static inline void update_cpu_capacity(unsigned int cpuid) {}
 #endif
 
  /*
@@ -277,8 +172,6 @@ void store_cpu_topology(unsigned int cpuid)
 
 	update_siblings_masks(cpuid);
 
-	update_cpu_capacity(cpuid);
-
 	pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
 		cpuid, cpu_topology[cpuid].thread_id,
 		cpu_topology[cpuid].core_id,
-- 
2.11.0


* [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information
  2017-08-30 14:41 [PATCH 0/4] arm: remove cpu_efficiency Dietmar Eggemann
  2017-08-30 14:41 ` [PATCH 1/4] arm: topology: " Dietmar Eggemann
@ 2017-08-30 14:41 ` Dietmar Eggemann
  2017-08-30 20:26   ` Krzysztof Kozlowski
  2017-09-17  7:37   ` Krzysztof Kozlowski
  2017-08-30 14:41 ` [PATCH 3/4] arm: dts: exynos: add exynos5422 " Dietmar Eggemann
  2017-08-30 14:41 ` [PATCH 4/4] arm: dts: r8a7790: add " Dietmar Eggemann
  3 siblings, 2 replies; 17+ messages in thread
From: Dietmar Eggemann @ 2017-08-30 14:41 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc
  Cc: Russell King, Rob Herring, Mark Rutland, Kukjin Kim,
	Krzysztof Kozlowski, Vincent Guittot, Juri Lelli

The following 'capacity-dmips-mhz' dt property values are used:

Cortex-A15: 1024, Cortex-A7: 539

They have been derived from the cpu_efficiency values:

Cortex-A15: 3891, Cortex-A7: 2048

by scaling them so that the Cortex-A15s (big cores) use 1024.
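
The rescaling can be checked with a few lines of Python. This is only a
sketch: the helper name is hypothetical and round-to-nearest is an
assumption about how 539 was obtained (2048 * 1024 / 3891 is about
538.98).

```python
def scale_to_big(efficiency, big="arm,cortex-a15", scale=1024):
    """Rescale all values so that the big core maps to `scale`.

    Uses integer round-to-nearest (assumed rounding mode).
    """
    ref = efficiency[big]
    return {
        name: (value * scale + ref // 2) // ref
        for name, value in efficiency.items()
    }


dmips = scale_to_big({"arm,cortex-a15": 3891, "arm,cortex-a7": 2048})
print(dmips)
```

This gives 1024 for the Cortex-A15 and 539 for the Cortex-A7, the
'capacity-dmips-mhz' values used in this patch.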

The cpu_efficiency values were originally derived from the "Big.LITTLE
Processing with ARM Cortex™-A15 & Cortex-A7" white paper
(http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
(3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
Dhrystone benchmark.

The following platforms are affected once cpu-invariant accounting
support is re-connected to the task scheduler:

arndale-octa, peach-pi, peach-pit, smdk5420

The patch has been tested on Samsung Chromebook 2 13" (peach-pi, Exynos
5800).

$ cat /sys/devices/system/cpu/cpu*/cpu_capacity
1024
1024
1024
1024
389
389
389
389

The Cortex-A15 vs Cortex-A7 performance ratio is 1024/389 = 2.63.

The values derived with the 'cpu_efficiency/clock-frequency dt property'
solution are:

$ cat /sys/devices/system/cpu/cpu*/cpu_capacity
1535
1535
1535
1535
448
448
448
448

The Cortex-A15 vs Cortex-A7 performance ratio is 1535/448 = 3.43.

The discrepancy between 2.63 and 3.43 is due to the false assumption
made by the 'cpu_efficiency/clock-frequency dt property' solution
that the max cpu frequency of the little cpus is 1 GHz and not 1.3 GHz.
The Cortex-A7 cluster runs with a max cpu frequency of 1.3 GHz whereas
the 'clock-frequency' property value is set to 1 GHz.

3.43/1.3 = 2.64

$ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq
1800000
1800000
1800000
1800000
1300000 <-- max cpu frequency of the Cortex-A7s (little cores)
1300000
1300000
1300000

Running another benchmark (single-threaded sysbench affine to the
individual cpus) with performance cpufreq governor on the Samsung
Chromebook 2 13" showed the following numbers:

$ for i in `seq 0 7`; do taskset -c $i sysbench --test=cpu \
  --num-threads=1 --max-time=10 run | grep "total number of events:";
  done

total number of events: 1083
total number of events: 1085
total number of events: 1085
total number of events: 1085
total number of events: 454
total number of events: 454
total number of events: 454
total number of events: 454

The Cortex-A15 vs Cortex-A7 performance ratio is 2.39, i.e. close to the
ratio derived from the Dhrystone-based values of the "Big.LITTLE
Processing with ARM Cortex™-A15 & Cortex-A7" white paper (2.63).
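
The two ratios can be reproduced from the numbers above with a short
Python sketch. Averaging the per-cpu event counts within each cluster is
an assumption about how the 2.39 figure was computed.

```python
# sysbench "total number of events" per cpu, as listed above.
big_events = [1083, 1085, 1085, 1085]
little_events = [454, 454, 454, 454]


def mean(values):
    return sum(values) / len(values)


# Measured ratio from the sysbench run.
sysbench_ratio = mean(big_events) / mean(little_events)

# Ratio implied by the cpu_capacity values (1024 vs 389).
capacity_ratio = 1024 / 389

print(f"sysbench: {sysbench_ratio:.2f}, cpu_capacity: {capacity_ratio:.2f}")
```

The measured sysbench ratio is roughly 9% below the ratio implied by the
cpu capacity values.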

We don't aim for exact cpu capacity values. Besides the CPI (Cycles Per
Instruction), the instruction mix and whether the system runs cpu-bound
or memory-bound also have an impact on the cpu capacity values derived
from these benchmark results.

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 arch/arm/boot/dts/exynos5420-cpus.dtsi | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/exynos5420-cpus.dtsi b/arch/arm/boot/dts/exynos5420-cpus.dtsi
index 5c052d7ff554..d7d703aa1699 100644
--- a/arch/arm/boot/dts/exynos5420-cpus.dtsi
+++ b/arch/arm/boot/dts/exynos5420-cpus.dtsi
@@ -36,6 +36,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu1: cpu@1 {
@@ -48,6 +49,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu2: cpu@2 {
@@ -60,6 +62,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu3: cpu@3 {
@@ -72,6 +75,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu4: cpu@100 {
@@ -85,6 +89,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <7>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu5: cpu@101 {
@@ -97,6 +102,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <7>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu6: cpu@102 {
@@ -109,6 +115,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <7>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu7: cpu@103 {
@@ -121,6 +128,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <7>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 	};
 };
-- 
2.11.0


* [PATCH 3/4] arm: dts: exynos: add exynos5422 cpu capacity-dmips-mhz information
  2017-08-30 14:41 [PATCH 0/4] arm: remove cpu_efficiency Dietmar Eggemann
  2017-08-30 14:41 ` [PATCH 1/4] arm: topology: " Dietmar Eggemann
  2017-08-30 14:41 ` [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information Dietmar Eggemann
@ 2017-08-30 14:41 ` Dietmar Eggemann
  2017-09-17  7:37   ` Krzysztof Kozlowski
  2017-08-30 14:41 ` [PATCH 4/4] arm: dts: r8a7790: add " Dietmar Eggemann
  3 siblings, 1 reply; 17+ messages in thread
From: Dietmar Eggemann @ 2017-08-30 14:41 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc
  Cc: Russell King, Rob Herring, Mark Rutland, Kukjin Kim,
	Krzysztof Kozlowski, Vincent Guittot, Juri Lelli

The following 'capacity-dmips-mhz' dt property values are used:

Cortex-A15: 1024, Cortex-A7: 539

They have been derived from the cpu_efficiency values:

Cortex-A15: 3891, Cortex-A7: 2048

by scaling them so that the Cortex-A15s (big cores) use 1024.

The cpu_efficiency values were originally derived from the "Big.LITTLE
Processing with ARM Cortex™-A15 & Cortex-A7" white paper
(http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
(3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
Dhrystone benchmark.

The following platforms are affected once cpu-invariant accounting
support is re-connected to the task scheduler:

odroidxu3, odroidxu3-lite, odroidxu4

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Kukjin Kim <kgene@kernel.org>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 arch/arm/boot/dts/exynos5422-cpus.dtsi | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/exynos5422-cpus.dtsi b/arch/arm/boot/dts/exynos5422-cpus.dtsi
index bf3c6f1ec4ee..ec01d8020c2d 100644
--- a/arch/arm/boot/dts/exynos5422-cpus.dtsi
+++ b/arch/arm/boot/dts/exynos5422-cpus.dtsi
@@ -35,6 +35,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu1: cpu@101 {
@@ -47,6 +48,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu2: cpu@102 {
@@ -59,6 +61,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu3: cpu@103 {
@@ -71,6 +74,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <11>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu4: cpu@0 {
@@ -84,6 +88,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <15>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu5: cpu@1 {
@@ -96,6 +101,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <15>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu6: cpu@2 {
@@ -108,6 +114,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <15>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu7: cpu@3 {
@@ -120,6 +127,7 @@
 			cooling-min-level = <0>;
 			cooling-max-level = <15>;
 			#cooling-cells = <2>; /* min followed by max */
+			capacity-dmips-mhz = <1024>;
 		};
 	};
 };
-- 
2.11.0


* [PATCH 4/4] arm: dts: r8a7790: add cpu capacity-dmips-mhz information
  2017-08-30 14:41 [PATCH 0/4] arm: remove cpu_efficiency Dietmar Eggemann
                   ` (2 preceding siblings ...)
  2017-08-30 14:41 ` [PATCH 3/4] arm: dts: exynos: add exynos5422 " Dietmar Eggemann
@ 2017-08-30 14:41 ` Dietmar Eggemann
  2017-09-18  7:39   ` Simon Horman
  3 siblings, 1 reply; 17+ messages in thread
From: Dietmar Eggemann @ 2017-08-30 14:41 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc
  Cc: Russell King, Rob Herring, Mark Rutland, Kukjin Kim,
	Krzysztof Kozlowski, Vincent Guittot, Juri Lelli

The following 'capacity-dmips-mhz' dt property values are used:

Cortex-A15: 1024, Cortex-A7: 539

They have been derived from the cpu_efficiency values:

Cortex-A15: 3891, Cortex-A7: 2048

by scaling them so that the Cortex-A15s (big cores) use 1024.

The cpu_efficiency values were originally derived from the "Big.LITTLE
Processing with ARM Cortex™-A15 & Cortex-A7" white paper
(http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
(3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
Dhrystone benchmark.

The following platform is affected once cpu-invariant accounting
support is re-connected to the task scheduler:

r8a7790-lager

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 arch/arm/boot/dts/r8a7790.dtsi | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/r8a7790.dtsi b/arch/arm/boot/dts/r8a7790.dtsi
index 2805a8608d4b..a57c0e170d8b 100644
--- a/arch/arm/boot/dts/r8a7790.dtsi
+++ b/arch/arm/boot/dts/r8a7790.dtsi
@@ -56,6 +56,7 @@
 			clock-latency = <300000>; /* 300 us */
 			power-domains = <&sysc R8A7790_PD_CA15_CPU0>;
 			next-level-cache = <&L2_CA15>;
+			capacity-dmips-mhz = <1024>;
 
 			/* kHz - uV - OPPs unknown yet */
 			operating-points = <1400000 1000000>,
@@ -73,6 +74,7 @@
 			clock-frequency = <1300000000>;
 			power-domains = <&sysc R8A7790_PD_CA15_CPU1>;
 			next-level-cache = <&L2_CA15>;
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu2: cpu@2 {
@@ -82,6 +84,7 @@
 			clock-frequency = <1300000000>;
 			power-domains = <&sysc R8A7790_PD_CA15_CPU2>;
 			next-level-cache = <&L2_CA15>;
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu3: cpu@3 {
@@ -91,6 +94,7 @@
 			clock-frequency = <1300000000>;
 			power-domains = <&sysc R8A7790_PD_CA15_CPU3>;
 			next-level-cache = <&L2_CA15>;
+			capacity-dmips-mhz = <1024>;
 		};
 
 		cpu4: cpu@100 {
@@ -100,6 +104,7 @@
 			clock-frequency = <780000000>;
 			power-domains = <&sysc R8A7790_PD_CA7_CPU0>;
 			next-level-cache = <&L2_CA7>;
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu5: cpu@101 {
@@ -109,6 +114,7 @@
 			clock-frequency = <780000000>;
 			power-domains = <&sysc R8A7790_PD_CA7_CPU1>;
 			next-level-cache = <&L2_CA7>;
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu6: cpu@102 {
@@ -118,6 +124,7 @@
 			clock-frequency = <780000000>;
 			power-domains = <&sysc R8A7790_PD_CA7_CPU2>;
 			next-level-cache = <&L2_CA7>;
+			capacity-dmips-mhz = <539>;
 		};
 
 		cpu7: cpu@103 {
@@ -127,6 +134,7 @@
 			clock-frequency = <780000000>;
 			power-domains = <&sysc R8A7790_PD_CA7_CPU3>;
 			next-level-cache = <&L2_CA7>;
+			capacity-dmips-mhz = <539>;
 		};
 
 		L2_CA15: cache-controller-0 {
-- 
2.11.0


* Re: [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information
  2017-08-30 14:41 ` [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information Dietmar Eggemann
@ 2017-08-30 20:26   ` Krzysztof Kozlowski
  2017-08-31 10:36     ` Dietmar Eggemann
  2017-09-17  7:37   ` Krzysztof Kozlowski
  1 sibling, 1 reply; 17+ messages in thread
From: Krzysztof Kozlowski @ 2017-08-30 20:26 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Vincent Guittot, Juri Lelli

On Wed, Aug 30, 2017 at 03:41:18PM +0100, Dietmar Eggemann wrote:
> The following 'capacity-dmips-mhz' dt property values are used:
> 
> Cortex-A15: 1024, Cortex-A7: 539
> 
> They have been derived from the cpu_efficiency values:
> 
> Cortex-A15: 3891, Cortex-A7: 2048
> 
> by scaling them so that the Cortex-A15s (big cores) use 1024.
> 
> The cpu_efficiency values were originally derived from the "Big.LITTLE
> Processing with ARM Cortex™-A15 & Cortex-A7" white paper
> (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
> (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
> Dhrystone benchmark.
> 
> The following platforms are affected once cpu-invariant accounting
> support is re-connected to the task scheduler:
> 
> arndale-octa, peach-pi, peach-pit, smdk5420
> 
> The patch has been tested on Samsung Chromebook 2 13" (peach-pi, Exynos
> 5800).
> 
> $ cat /sys/devices/system/cpu/cpu*/cpu_capacity
> 1024
> 1024
> 1024
> 1024
> 389
> 389
> 389
> 389

I am missing something... shouldn't this be 539? Or is it scaled with
the clock-frequency (1 GHz) value?


Best regards,
Krzysztof


> 
> The Cortex-A15 vs Cortex-A7 performance ratio is 1024/389 = 2.63.
> 
> The values derived with the 'cpu_efficiency/clock-frequency dt property'
> solution are:
> 
> $ cat /sys/devices/system/cpu/cpu*/cpu_capacity
> 1535
> 1535
> 1535
> 1535
> 448
> 448
> 448
> 448
> 
> The Cortex-A15 vs Cortex-A7 performance ratio is 1535/448 = 3.43.
> 
> The discrepancy between 2.63 and 3.43 is due to the false assumption
> made by the 'cpu_efficiency/clock-frequency dt property' solution
> that the max cpu frequency of the little cpus is 1 GHz and not 1.3 GHz.
> The Cortex-A7 cluster runs with a max cpu frequency of 1.3 GHz whereas
> the 'clock-frequency' property value is set to 1 GHz.
> 
> 3.43/1.3 = 2.64
> 
> $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq
> 1800000
> 1800000
> 1800000
> 1800000
> 1300000 <-- max cpu frequency of the Cortex-A7s (little cores)
> 1300000
> 1300000
> 1300000
> 
> Running another benchmark (single-threaded sysbench affine to the
> individual cpus) with performance cpufreq governor on the Samsung
> Chromebook 2 13" showed the following numbers:
> 
> $ for i in `seq 0 7`; do taskset -c $i sysbench --test=cpu \
>   --num-threads=1 --max-time=10 run | grep "total number of events:";
>   done
> 
> total number of events: 1083
> total number of events: 1085
> total number of events: 1085
> total number of events: 1085
> total number of events: 454
> total number of events: 454
> total number of events: 454
> total number of events: 454
> 
> The Cortex-A15 vs Cortex-A7 performance ratio is 2.39, i.e. close to the
> ratio derived from the Dhrystone-based values of the "Big.LITTLE
> Processing with ARM Cortex™-A15 & Cortex-A7" white paper (2.63).
> 
> We don't aim for exact cpu capacity values. Besides the CPI (Cycles Per
> Instruction), the instruction mix and whether the system runs cpu-bound
> or memory-bound also have an impact on the cpu capacity values derived
> from these benchmark results.
> 
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Kukjin Kim <kgene@kernel.org>
> Cc: Krzysztof Kozlowski <krzk@kernel.org>
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> ---
>  arch/arm/boot/dts/exynos5420-cpus.dtsi | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/exynos5420-cpus.dtsi b/arch/arm/boot/dts/exynos5420-cpus.dtsi
> index 5c052d7ff554..d7d703aa1699 100644
> --- a/arch/arm/boot/dts/exynos5420-cpus.dtsi
> +++ b/arch/arm/boot/dts/exynos5420-cpus.dtsi
> @@ -36,6 +36,7 @@
>  			cooling-min-level = <0>;
>  			cooling-max-level = <11>;
>  			#cooling-cells = <2>; /* min followed by max */
> +			capacity-dmips-mhz = <1024>;
>  		};
>  
>  		cpu1: cpu@1 {
> @@ -48,6 +49,7 @@
>  			cooling-min-level = <0>;
>  			cooling-max-level = <11>;
>  			#cooling-cells = <2>; /* min followed by max */
> +			capacity-dmips-mhz = <1024>;
>  		};
>  
>  		cpu2: cpu@2 {
> @@ -60,6 +62,7 @@
>  			cooling-min-level = <0>;
>  			cooling-max-level = <11>;
>  			#cooling-cells = <2>; /* min followed by max */
> +			capacity-dmips-mhz = <1024>;
>  		};
>  
>  		cpu3: cpu@3 {
> @@ -72,6 +75,7 @@
>  			cooling-min-level = <0>;
>  			cooling-max-level = <11>;
>  			#cooling-cells = <2>; /* min followed by max */
> +			capacity-dmips-mhz = <1024>;
>  		};
>  
>  		cpu4: cpu@100 {
> @@ -85,6 +89,7 @@
>  			cooling-min-level = <0>;
>  			cooling-max-level = <7>;
>  			#cooling-cells = <2>; /* min followed by max */
> +			capacity-dmips-mhz = <539>;
>  		};
>  
>  		cpu5: cpu@101 {
> @@ -97,6 +102,7 @@
>  			cooling-min-level = <0>;
>  			cooling-max-level = <7>;
>  			#cooling-cells = <2>; /* min followed by max */
> +			capacity-dmips-mhz = <539>;
>  		};
>  
>  		cpu6: cpu@102 {
> @@ -109,6 +115,7 @@
>  			cooling-min-level = <0>;
>  			cooling-max-level = <7>;
>  			#cooling-cells = <2>; /* min followed by max */
> +			capacity-dmips-mhz = <539>;
>  		};
>  
>  		cpu7: cpu@103 {
> @@ -121,6 +128,7 @@
>  			cooling-min-level = <0>;
>  			cooling-max-level = <7>;
>  			#cooling-cells = <2>; /* min followed by max */
> +			capacity-dmips-mhz = <539>;
>  		};
>  	};
>  };
> -- 
> 2.11.0
> 


* Re: [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information
  2017-08-30 20:26   ` Krzysztof Kozlowski
@ 2017-08-31 10:36     ` Dietmar Eggemann
  2017-09-03 19:56       ` Krzysztof Kozlowski
  0 siblings, 1 reply; 17+ messages in thread
From: Dietmar Eggemann @ 2017-08-31 10:36 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Vincent Guittot, Juri Lelli

On 30/08/17 21:26, Krzysztof Kozlowski wrote:
> On Wed, Aug 30, 2017 at 03:41:18PM +0100, Dietmar Eggemann wrote:
>> The following 'capacity-dmips-mhz' dt property values are used:
>>
>> Cortex-A15: 1024, Cortex-A7: 539
>>
>> They have been derived from the cpu_efficiency values:
>>
>> Cortex-A15: 3891, Cortex-A7: 2048
>>
>> by scaling them so that the Cortex-A15s (big cores) use 1024.
>>
>> The cpu_efficiency values were originally derived from the "Big.LITTLE
>> Processing with ARM Cortex™-A15 & Cortex-A7" white paper
>> (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
>> (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
>> Dhrystone benchmark.
>>
>> The following platforms are affected once cpu-invariant accounting
>> support is re-connected to the task scheduler:
>>
>> arndale-octa, peach-pi, peach-pit, smdk5420
>>
>> The patch has been tested on Samsung Chromebook 2 13" (peach-pi, Exynos
>> 5800).
>>
>> $ cat /sys/devices/system/cpu/cpu*/cpu_capacity
>> 1024
>> 1024
>> 1024
>> 1024
>> 389
>> 389
>> 389
>> 389
> 
> I am missing something... shouldn't this be 539? Or is it scaled with
> the clock-frequency (1 GHz) value?

Yeah, the capacity-dmips-mhz dt value of 539 for the little cpus is
scaled by 1.3/1.8 (max cpu frequency / system-wide max cpu frequency):

539 * 1.3/1.8 = 389

This max cpu frequency scaling is part of both solutions, the 'cpu
capacity-dmips-mhz' and the 'cpu_efficiency/clock-frequency dt property'
one.

The (original*) cpu capacity on a heterogeneous platform expresses uArch
and max cpu frequency differences between the (logical) cpus of the
system.

* not further reduced by rt and/or irq pressure.

[...]


* Re: [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information
  2017-08-31 10:36     ` Dietmar Eggemann
@ 2017-09-03 19:56       ` Krzysztof Kozlowski
  2017-09-06 11:47         ` Dietmar Eggemann
  0 siblings, 1 reply; 17+ messages in thread
From: Krzysztof Kozlowski @ 2017-09-03 19:56 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Vincent Guittot, Juri Lelli

On Thu, Aug 31, 2017 at 11:36:07AM +0100, Dietmar Eggemann wrote:
> On 30/08/17 21:26, Krzysztof Kozlowski wrote:
> > On Wed, Aug 30, 2017 at 03:41:18PM +0100, Dietmar Eggemann wrote:
> >> The following 'capacity-dmips-mhz' dt property values are used:
> >>
> >> Cortex-A15: 1024, Cortex-A7: 539
> >>
> >> They have been derived from the cpu_efficiency values:
> >>
> >> Cortex-A15: 3891, Cortex-A7: 2048
> >>
> >> by scaling them so that the Cortex-A15s (big cores) use 1024.
> >>
> >> The cpu_efficiency values were originally derived from the "Big.LITTLE
> >> Processing with ARM Cortex™-A15 & Cortex-A7" white paper
> >> (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
> >> (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
> >> Dhrystone benchmark.
> >>
> >> The following platforms are affected once cpu-invariant accounting
> >> support is re-connected to the task scheduler:
> >>
> >> arndale-octa, peach-pi, peach-pit, smdk5420
> >>
> >> The patch has been tested on Samsung Chromebook 2 13" (peach-pi, Exynos
> >> 5800).
> >>
> >> $ cat /sys/devices/system/cpu/cpu*/cpu_capacity
> >> 1024
> >> 1024
> >> 1024
> >> 1024
> >> 389
> >> 389
> >> 389
> >> 389
> > 
> > I am missing something... shouldn't this be 539? Or is it scaled with
> > the clock-frequency (1 GHz) value?
> 
> Yeah, the capacity-dmips-mhz dt value of 539 for the little cpus is
> scaled by 1.3/1.8 (max cpu frequency / system-wide max cpu frequency):
> 
> 539 * 1.3/1.8 = 389
> 
> This max cpu capacity scaling is part of both solutions, the 'cpu
> capacity-dmips-mhz' and the 'cpu_efficiency/clock-frequency dt property'
> one.
> 
> The (original*) cpu capacity on a heterogeneous platform expresses uArch
> and max cpu frequency differences between the (logical) cpus of the
> system.
> 
> * not further reduced by rt and/or irq pressure.
> 
> [...]

Thanks for the explanation, looks fine to me. I'll take it after the
merge window.

Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] arm: topology: remove cpu_efficiency
  2017-08-30 14:41 ` [PATCH 1/4] arm: topology: " Dietmar Eggemann
@ 2017-09-04  7:49   ` Vincent Guittot
  2017-09-06 11:43     ` Dietmar Eggemann
  0 siblings, 1 reply; 17+ messages in thread
From: Vincent Guittot @ 2017-09-04  7:49 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: linux-kernel, LAK, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Krzysztof Kozlowski, Juri Lelli

Hi Dietmar,

Removing the cpu efficiency table looks good to me. Nevertheless, I have
some comments below for this patch.

On 30 August 2017 at 16:41, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> Remove the 'cpu_efficiency/clock-frequency dt property' based solution
> to set cpu capacity which was only working for Cortex-A15/A7 arm
> big.LITTLE systems.
>
> I.e. the 'capacity-dmips-mhz' based solution is now the only one. It is
> shared between arm and arm64 and works for every big.LITTLE system no
> matter which core types it consists of.
>
> Cc: Russell King <linux@arm.linux.org.uk>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: Juri Lelli <juri.lelli@arm.com>
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> ---
>  arch/arm/kernel/topology.c | 113 ++-------------------------------------------
>  1 file changed, 3 insertions(+), 110 deletions(-)
>
> diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
> index bf949a763dbe..04ccfdd94213 100644
> --- a/arch/arm/kernel/topology.c
> +++ b/arch/arm/kernel/topology.c
> @@ -47,52 +47,10 @@
>   */
>
>  #ifdef CONFIG_OF
> -struct cpu_efficiency {
> -       const char *compatible;
> -       unsigned long efficiency;
> -};
> -
> -/*
> - * Table of relative efficiency of each processors
> - * The efficiency value must fit in 20bit and the final
> - * cpu_scale value must be in the range
> - *   0 < cpu_scale < 3*SCHED_CAPACITY_SCALE/2
> - * in order to return at most 1 when DIV_ROUND_CLOSEST
> - * is used to compute the capacity of a CPU.
> - * Processors that are not defined in the table,
> - * use the default SCHED_CAPACITY_SCALE value for cpu_scale.
> - */
> -static const struct cpu_efficiency table_efficiency[] = {
> -       {"arm,cortex-a15", 3891},
> -       {"arm,cortex-a7",  2048},
> -       {NULL, },
> -};
> -
> -static unsigned long *__cpu_capacity;
> -#define cpu_capacity(cpu)      __cpu_capacity[cpu]
> -
> -static unsigned long middle_capacity = 1;
> -static bool cap_from_dt = true;
> -
> -/*
> - * Iterate all CPUs' descriptor in DT and compute the efficiency
> - * (as per table_efficiency). Also calculate a middle efficiency
> - * as close as possible to  (max{eff_i} - min{eff_i}) / 2
> - * This is later used to scale the cpu_capacity field such that an
> - * 'average' CPU is of middle capacity. Also see the comments near
> - * table_efficiency[] and update_cpu_capacity().
> - */
>  static void __init parse_dt_topology(void)
>  {
> -       const struct cpu_efficiency *cpu_eff;
> -       struct device_node *cn = NULL;
> -       unsigned long min_capacity = ULONG_MAX;
> -       unsigned long max_capacity = 0;
> -       unsigned long capacity = 0;
> -       int cpu = 0;
> -
> -       __cpu_capacity = kcalloc(nr_cpu_ids, sizeof(*__cpu_capacity),
> -                                GFP_NOWAIT);
> +       struct device_node *cn;
> +       int cpu;
>
>         cn = of_find_node_by_path("/cpus");
>         if (!cn) {
> @@ -101,9 +59,6 @@ static void __init parse_dt_topology(void)
>         }
>
>         for_each_possible_cpu(cpu) {
> -               const u32 *rate;
> -               int len;
> -
>                 /* too early to use cpu->of_node */
>                 cn = of_get_cpu_node(cpu, NULL);
>                 if (!cn) {
> @@ -115,73 +70,13 @@ static void __init parse_dt_topology(void)
>                         of_node_put(cn);
>                         continue;

AFAICT, this continue is now useless as it was there to skip the cpu
efficiency table method.

>                 }
> -
> -               cap_from_dt = false;
> -
> -               for (cpu_eff = table_efficiency; cpu_eff->compatible; cpu_eff++)
> -                       if (of_device_is_compatible(cn, cpu_eff->compatible))
> -                               break;
> -
> -               if (cpu_eff->compatible == NULL)
> -                       continue;
> -
> -               rate = of_get_property(cn, "clock-frequency", &len);
> -               if (!rate || len != 4) {
> -                       pr_err("%s missing clock-frequency property\n",
> -                               cn->full_name);
> -                       continue;
> -               }
> -
> -               capacity = ((be32_to_cpup(rate)) >> 20) * cpu_eff->efficiency;
> -
> -               /* Save min capacity of the system */
> -               if (capacity < min_capacity)
> -                       min_capacity = capacity;
> -
> -               /* Save max capacity of the system */
> -               if (capacity > max_capacity)
> -                       max_capacity = capacity;
> -
> -               cpu_capacity(cpu) = capacity;
>         }
>
> -       /* If min and max capacities are equals, we bypass the update of the
> -        * cpu_scale because all CPUs have the same capacity. Otherwise, we
> -        * compute a middle_capacity factor that will ensure that the capacity
> -        * of an 'average' CPU of the system will be as close as possible to
> -        * SCHED_CAPACITY_SCALE, which is the default value, but with the
> -        * constraint explained near table_efficiency[].
> -        */
> -       if (4*max_capacity < (3*(max_capacity + min_capacity)))
> -               middle_capacity = (min_capacity + max_capacity)
> -                               >> (SCHED_CAPACITY_SHIFT+1);
> -       else
> -               middle_capacity = ((max_capacity / 3)
> -                               >> (SCHED_CAPACITY_SHIFT-1)) + 1;
> -
> -       if (cap_from_dt)
> -               topology_normalize_cpu_scale();

Why have you moved the call to topology_normalize_cpu_scale() from
parse_dt_topology() to update_cpu_capacity() ?

You should keep it in parse_dt_topology() as it is part of the dt
parsing sequence.

> -}
> -
> -/*
> - * Look for a customed capacity of a CPU in the cpu_capacity table during the
> - * boot. The update of all CPUs is in O(n^2) for heteregeneous system but the
> - * function returns directly for SMP system.
> - */
> -static void update_cpu_capacity(unsigned int cpu)
> -{
> -       if (!cpu_capacity(cpu) || cap_from_dt)
> -               return;
> -
> -       topology_set_cpu_scale(cpu, cpu_capacity(cpu) / middle_capacity);
> -
> -       pr_info("CPU%u: update cpu_capacity %lu\n",
> -               cpu, topology_get_cpu_scale(NULL, cpu));
> +       topology_normalize_cpu_scale();
>  }

You can probably just remove update_cpu_capacity().

>
>  #else
>  static inline void parse_dt_topology(void) {}
> -static inline void update_cpu_capacity(unsigned int cpuid) {}
>  #endif
>
>   /*
> @@ -277,8 +172,6 @@ void store_cpu_topology(unsigned int cpuid)
>
>         update_siblings_masks(cpuid);
>
> -       update_cpu_capacity(cpuid);
> -
>         pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
>                 cpuid, cpu_topology[cpuid].thread_id,
>                 cpu_topology[cpuid].core_id,
> --
> 2.11.0
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] arm: topology: remove cpu_efficiency
  2017-09-04  7:49   ` Vincent Guittot
@ 2017-09-06 11:43     ` Dietmar Eggemann
  2017-09-06 12:40       ` Vincent Guittot
  0 siblings, 1 reply; 17+ messages in thread
From: Dietmar Eggemann @ 2017-09-06 11:43 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: linux-kernel, LAK, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Krzysztof Kozlowski, Juri Lelli

Hi Vincent,

On 04/09/17 08:49, Vincent Guittot wrote:
> Hi Dietmar,
> 
> Removing the cpu efficiency table looks good to me. Nevertheless, I have
> some comments below for this patch.

Thanks for the review!

> On 30 August 2017 at 16:41, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>> Remove the 'cpu_efficiency/clock-frequency dt property' based solution
>> to set cpu capacity which was only working for Cortex-A15/A7 arm
>> big.LITTLE systems.
>>
>> I.e. the 'capacity-dmips-mhz' based solution is now the only one. It is
>> shared between arm and arm64 and works for every big.LITTLE system no
>> matter which core types it consists of.
>>
>> Cc: Russell King <linux@arm.linux.org.uk>
>> Cc: Vincent Guittot <vincent.guittot@linaro.org>
>> Cc: Juri Lelli <juri.lelli@arm.com>
>> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
>> ---
>>  arch/arm/kernel/topology.c | 113 ++-------------------------------------------
>>  1 file changed, 3 insertions(+), 110 deletions(-)

[...]

>> @@ -115,73 +70,13 @@ static void __init parse_dt_topology(void)
>>                         of_node_put(cn);
>>                         continue;
> 
> AFAICT, this continue is now useless as it was there to skip the cpu
> efficiency table method.

You're right ... will remove it.

[...]

>> -       if (cap_from_dt)
>> -               topology_normalize_cpu_scale();
> 
> Why have you moved the call to topology_normalize_cpu_scale() from
> parse_dt_topology() to update_cpu_capacity() ?

I didn't move it; it's still called from parse_dt_topology().

> You should keep it in parse_dt_topology() as it is part of the dt
> parsing sequence.

Yes, this should be the case.

[...]

>> -/*
>> - * Look for a customed capacity of a CPU in the cpu_capacity table during the
>> - * boot. The update of all CPUs is in O(n^2) for heteregeneous system but the
>> - * function returns directly for SMP system.
>> - */
>> -static void update_cpu_capacity(unsigned int cpu)
>> -{
>> -       if (!cpu_capacity(cpu) || cap_from_dt)
>> -               return;
>> -
>> -       topology_set_cpu_scale(cpu, cpu_capacity(cpu) / middle_capacity);
>> -
>> -       pr_info("CPU%u: update cpu_capacity %lu\n",
>> -               cpu, topology_get_cpu_scale(NULL, cpu));
>> +       topology_normalize_cpu_scale();
>>  }
> 
> You can probably just remove update_cpu_capacity().

I did remove update_cpu_capacity(). Maybe the patch layout is confusing?

[...]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information
  2017-09-03 19:56       ` Krzysztof Kozlowski
@ 2017-09-06 11:47         ` Dietmar Eggemann
  0 siblings, 0 replies; 17+ messages in thread
From: Dietmar Eggemann @ 2017-09-06 11:47 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Vincent Guittot, Juri Lelli

On 03/09/17 20:56, Krzysztof Kozlowski wrote:
> On Thu, Aug 31, 2017 at 11:36:07AM +0100, Dietmar Eggemann wrote:
>> On 30/08/17 21:26, Krzysztof Kozlowski wrote:
>>> On Wed, Aug 30, 2017 at 03:41:18PM +0100, Dietmar Eggemann wrote:

[...]

>>>> The patch has been tested on Samsung Chromebook 2 13" (peach-pi, Exynos
>>>> 5800).
>>>>
>>>> $ cat /sys/devices/system/cpu/cpu*/cpu_capacity
>>>> 1024
>>>> 1024
>>>> 1024
>>>> 1024
>>>> 389
>>>> 389
>>>> 389
>>>> 389
>>>
>>> I am missing something... shouldn't this be 539? Or is it scaled with
>>> the clock-frequency (1 GHz) value?
>>
>> Yeah, the capacity-dmips-mhz dt value of 539 for the little cpus is
>> scaled by 1.3/1.8 (max cpu frequency / system-wide max cpu frequency):
>>
>> 539 * 1.3/1.8 = 389
>>
>> This max cpu capacity scaling is part of both solutions, the 'cpu
>> capacity-dmips-mhz' and the 'cpu_efficiency/clock-frequency dt property'
>> one.
>>
>> The (original*) cpu capacity on a heterogeneous platform expresses uArch
>> and max cpu frequency differences between the (logical) cpus of the
>> system.
>>
>> * not further reduced by rt and/or irq pressure.
>>
>> [...]
> 
> Thanks for the explanation, looks fine to me. I'll take it after the
> merge window.

Nice, since the 'cpu capacity-dmips-mhz' is already supported for arm
(and used by TC2 (vexpress-v2p-ca15_a7.dts)) this can be done
independently of the actual removal of the
'cpu_efficiency/clock-frequency dt property' solution in patch 1/4.

[..]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] arm: topology: remove cpu_efficiency
  2017-09-06 11:43     ` Dietmar Eggemann
@ 2017-09-06 12:40       ` Vincent Guittot
  2017-09-07 10:41         ` Dietmar Eggemann
  0 siblings, 1 reply; 17+ messages in thread
From: Vincent Guittot @ 2017-09-06 12:40 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: linux-kernel, LAK, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Krzysztof Kozlowski, Juri Lelli

On 6 September 2017 at 13:43, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> Hi Vincent,
>
> On 04/09/17 08:49, Vincent Guittot wrote:
>> Hi Dietmar,
>>
>> Removing the cpu efficiency table looks good to me. Nevertheless, I have
>> some comments below for this patch.
>
> Thanks for the review!
>
>> On 30 August 2017 at 16:41, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>>> Remove the 'cpu_efficiency/clock-frequency dt property' based solution
>>> to set cpu capacity which was only working for Cortex-A15/A7 arm
>>> big.LITTLE systems.
>>>
>>> I.e. the 'capacity-dmips-mhz' based solution is now the only one. It is
>>> shared between arm and arm64 and works for every big.LITTLE system no
>>> matter which core types it consists of.
>>>
>>> Cc: Russell King <linux@arm.linux.org.uk>
>>> Cc: Vincent Guittot <vincent.guittot@linaro.org>
>>> Cc: Juri Lelli <juri.lelli@arm.com>
>>> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
>>> ---
>>>  arch/arm/kernel/topology.c | 113 ++-------------------------------------------
>>>  1 file changed, 3 insertions(+), 110 deletions(-)
>
> [...]
>
>>> @@ -115,73 +70,13 @@ static void __init parse_dt_topology(void)
>>>                         of_node_put(cn);
>>>                         continue;
>>
>> AFAICT, this continue is now useless as it was there to skip the cpu
>> efficiency table method.
>
> You're right ... will remove it.
>
> [...]
>
>>> -       if (cap_from_dt)
>>> -               topology_normalize_cpu_scale();
>>
>> Why have you moved the call to topology_normalize_cpu_scale() from
>> parse_dt_topology() to update_cpu_capacity() ?
>
> I didn't move it; it's still called from parse_dt_topology().
>
>> You should keep it in parse_dt_topology() as it is part of the dt
>> parsing sequence.
>
> Yes, this should be the case.
>
> [...]
>
>>> -/*
>>> - * Look for a customed capacity of a CPU in the cpu_capacity table during the
>>> - * boot. The update of all CPUs is in O(n^2) for heteregeneous system but the
>>> - * function returns directly for SMP system.
>>> - */
>>> -static void update_cpu_capacity(unsigned int cpu)
>>> -{
>>> -       if (!cpu_capacity(cpu) || cap_from_dt)
>>> -               return;
>>> -
>>> -       topology_set_cpu_scale(cpu, cpu_capacity(cpu) / middle_capacity);
>>> -
>>> -       pr_info("CPU%u: update cpu_capacity %lu\n",
>>> -               cpu, topology_get_cpu_scale(NULL, cpu));
>>> +       topology_normalize_cpu_scale();
>>>  }
>>
>> You can probably just remove update_cpu_capacity().
>
> I did remove update_cpu_capacity(). Maybe the patch layout is confusing?

Yes, you're right, I was confused by the layout.

>
> [...]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] arm: topology: remove cpu_efficiency
  2017-09-06 12:40       ` Vincent Guittot
@ 2017-09-07 10:41         ` Dietmar Eggemann
  0 siblings, 0 replies; 17+ messages in thread
From: Dietmar Eggemann @ 2017-09-07 10:41 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: linux-kernel, LAK, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Krzysztof Kozlowski, Juri Lelli

On 06/09/17 13:40, Vincent Guittot wrote:
> On 6 September 2017 at 13:43, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>> Hi Vincent,
>>
>> On 04/09/17 08:49, Vincent Guittot wrote:
>>> Hi Dietmar,
>>>
>>> Removing the cpu efficiency table looks good to me. Nevertheless, I have
>>> some comments below for this patch.
>>
>> Thanks for the review!
>>
>>> On 30 August 2017 at 16:41, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:

[...]

I fixed the issue with the continue statement. Moreover, I think we should also
remove the comment blocks about 'cpu capacity scale management' and 'cpu capacity
table' above parse_dt_topology() because this is now all handled in
drivers/base/arch_topology.c.

-- >8 --

From: Dietmar Eggemann <dietmar.eggemann@arm.com>
Date: Sun, 9 Jul 2017 23:43:43 +0100
Subject: [PATCH] arm: topology: remove cpu_efficiency

Remove the 'cpu_efficiency/clock-frequency dt property' based solution
to set cpu capacity which was only working for Cortex-A15/A7 arm
big.LITTLE systems.

I.e. the 'capacity-dmips-mhz' based solution is now the only one. It is
shared between arm and arm64 and works for every big.LITTLE system no
matter which core types it consists of.

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
---
 arch/arm/kernel/topology.c | 130 ++-------------------------------------------
 1 file changed, 3 insertions(+), 127 deletions(-)

diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c
index bf949a763dbe..5f46d290e34b 100644
--- a/arch/arm/kernel/topology.c
+++ b/arch/arm/kernel/topology.c
@@ -30,69 +30,11 @@
 #include <asm/cputype.h>
 #include <asm/topology.h>
 
-/*
- * cpu capacity scale management
- */
-
-/*
- * cpu capacity table
- * This per cpu data structure describes the relative capacity of each core.
- * On a heteregenous system, cores don't have the same computation capacity
- * and we reflect that difference in the cpu_capacity field so the scheduler
- * can take this difference into account during load balance. A per cpu
- * structure is preferred because each CPU updates its own cpu_capacity field
- * during the load balance except for idle cores. One idle core is selected
- * to run the rebalance_domains for all idle cores and the cpu_capacity can be
- * updated during this sequence.
- */
-
 #ifdef CONFIG_OF
-struct cpu_efficiency {
-       const char *compatible;
-       unsigned long efficiency;
-};
-
-/*
- * Table of relative efficiency of each processors
- * The efficiency value must fit in 20bit and the final
- * cpu_scale value must be in the range
- *   0 < cpu_scale < 3*SCHED_CAPACITY_SCALE/2
- * in order to return at most 1 when DIV_ROUND_CLOSEST
- * is used to compute the capacity of a CPU.
- * Processors that are not defined in the table,
- * use the default SCHED_CAPACITY_SCALE value for cpu_scale.
- */
-static const struct cpu_efficiency table_efficiency[] = {
-       {"arm,cortex-a15", 3891},
-       {"arm,cortex-a7",  2048},
-       {NULL, },
-};
-
-static unsigned long *__cpu_capacity;
-#define cpu_capacity(cpu)      __cpu_capacity[cpu]
-
-static unsigned long middle_capacity = 1;
-static bool cap_from_dt = true;
-
-/*
- * Iterate all CPUs' descriptor in DT and compute the efficiency
- * (as per table_efficiency). Also calculate a middle efficiency
- * as close as possible to  (max{eff_i} - min{eff_i}) / 2
- * This is later used to scale the cpu_capacity field such that an
- * 'average' CPU is of middle capacity. Also see the comments near
- * table_efficiency[] and update_cpu_capacity().
- */
 static void __init parse_dt_topology(void)
 {
-       const struct cpu_efficiency *cpu_eff;
-       struct device_node *cn = NULL;
-       unsigned long min_capacity = ULONG_MAX;
-       unsigned long max_capacity = 0;
-       unsigned long capacity = 0;
-       int cpu = 0;
-
-       __cpu_capacity = kcalloc(nr_cpu_ids, sizeof(*__cpu_capacity),
-                                GFP_NOWAIT);
+       struct device_node *cn;
+       int cpu;
 
        cn = of_find_node_by_path("/cpus");
        if (!cn) {
@@ -101,9 +43,6 @@ static void __init parse_dt_topology(void)
        }
 
        for_each_possible_cpu(cpu) {
-               const u32 *rate;
-               int len;
-
                /* too early to use cpu->of_node */
                cn = of_get_cpu_node(cpu, NULL);
                if (!cn) {
@@ -113,75 +52,14 @@ static void __init parse_dt_topology(void)
 
                if (topology_parse_cpu_capacity(cn, cpu)) {
                        of_node_put(cn);
-                       continue;
                }
-
-               cap_from_dt = false;
-
-               for (cpu_eff = table_efficiency; cpu_eff->compatible; cpu_eff++)
-                       if (of_device_is_compatible(cn, cpu_eff->compatible))
-                               break;
-
-               if (cpu_eff->compatible == NULL)
-                       continue;
-
-               rate = of_get_property(cn, "clock-frequency", &len);
-               if (!rate || len != 4) {
-                       pr_err("%s missing clock-frequency property\n",
-                               cn->full_name);
-                       continue;
-               }
-
-               capacity = ((be32_to_cpup(rate)) >> 20) * cpu_eff->efficiency;
-
-               /* Save min capacity of the system */
-               if (capacity < min_capacity)
-                       min_capacity = capacity;
-
-               /* Save max capacity of the system */
-               if (capacity > max_capacity)
-                       max_capacity = capacity;
-
-               cpu_capacity(cpu) = capacity;
        }
 
-       /* If min and max capacities are equals, we bypass the update of the
-        * cpu_scale because all CPUs have the same capacity. Otherwise, we
-        * compute a middle_capacity factor that will ensure that the capacity
-        * of an 'average' CPU of the system will be as close as possible to
-        * SCHED_CAPACITY_SCALE, which is the default value, but with the
-        * constraint explained near table_efficiency[].
-        */
-       if (4*max_capacity < (3*(max_capacity + min_capacity)))
-               middle_capacity = (min_capacity + max_capacity)
-                               >> (SCHED_CAPACITY_SHIFT+1);
-       else
-               middle_capacity = ((max_capacity / 3)
-                               >> (SCHED_CAPACITY_SHIFT-1)) + 1;
-
-       if (cap_from_dt)
-               topology_normalize_cpu_scale();
-}
-
-/*
- * Look for a customed capacity of a CPU in the cpu_capacity table during the
- * boot. The update of all CPUs is in O(n^2) for heteregeneous system but the
- * function returns directly for SMP system.
- */
-static void update_cpu_capacity(unsigned int cpu)
-{
-       if (!cpu_capacity(cpu) || cap_from_dt)
-               return;
-
-       topology_set_cpu_scale(cpu, cpu_capacity(cpu) / middle_capacity);
-
-       pr_info("CPU%u: update cpu_capacity %lu\n",
-               cpu, topology_get_cpu_scale(NULL, cpu));
+       topology_normalize_cpu_scale();
 }
 
 #else
 static inline void parse_dt_topology(void) {}
-static inline void update_cpu_capacity(unsigned int cpuid) {}
 #endif
 
  /*
@@ -277,8 +155,6 @@ void store_cpu_topology(unsigned int cpuid)
 
        update_siblings_masks(cpuid);
 
-       update_cpu_capacity(cpuid);
-
        pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
                cpuid, cpu_topology[cpuid].thread_id,
                cpu_topology[cpuid].core_id,
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information
  2017-08-30 14:41 ` [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information Dietmar Eggemann
  2017-08-30 20:26   ` Krzysztof Kozlowski
@ 2017-09-17  7:37   ` Krzysztof Kozlowski
  1 sibling, 0 replies; 17+ messages in thread
From: Krzysztof Kozlowski @ 2017-09-17  7:37 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Vincent Guittot, Juri Lelli

On Wed, Aug 30, 2017 at 03:41:18PM +0100, Dietmar Eggemann wrote:
> The following 'capacity-dmips-mhz' dt property values are used:
> 
> Cortex-A15: 1024, Cortex-A7: 539
> 
> They have been derived from the cpu_efficiency values:
> 
> Cortex-A15: 3891, Cortex-A7: 2048
> 
> by scaling them so that the Cortex-A15s (big cores) use 1024.
> 
> The cpu_efficiency values were originally derived from the "Big.LITTLE
> Processing with ARM Cortex™-A15 & Cortex-A7" white paper
> (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
> (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
> Dhrystone benchmark.
> 
> The following platforms are affected once cpu-invariant accounting
> support is re-connected to the task scheduler:
> 
> arndale-octa, peach-pi, peach-pit, smdk5420
> 
> The patch has been tested on Samsung Chromebook 2 13" (peach-pi, Exynos
> 5800).
> 
> $ cat /sys/devices/system/cpu/cpu*/cpu_capacity
> 1024
> 1024
> 1024
> 1024
> 389
> 389
> 389
> 389
> 
> The Cortex-A15 vs Cortex-A7 performance ratio is 1024/389 = 2.63.
> 
> The values derived with the 'cpu_efficiency/clock-frequency dt property'
> solution are:
> 
> $ cat /sys/devices/system/cpu/cpu*/cpu_capacity
> 1535
> 1535
> 1535
> 1535
> 448
> 448
> 448
> 448
> 
> The Cortex-A15 vs Cortex-A7 performance ratio is 1535/448 = 3.43.
> 
> The discrepancy between 2.63 and 3.43 is due to the false assumption
> when using the 'cpu_efficiency/clock-frequency dt property' solution
> that the max cpu frequency of the little cpus is 1 GHz and not 1.3 GHz.
> The Cortex-A7 cluster runs with a max cpu frequency of 1.3 GHz whereas
> the 'clock-frequency' property value is set to 1 GHz.
> 
> 3.43/1.3 = 2.64
> 
> $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq
> 1800000
> 1800000
> 1800000
> 1800000
> 1300000 <-- max cpu frequency of the Cortex-A7s (little cores)
> 1300000
> 1300000
> 1300000
> 
> Running another benchmark (single-threaded sysbench affine to the
> individual cpus) with performance cpufreq governor on the Samsung
> Chromebook 2 13" showed the following numbers:
> 
> $ for i in `seq 0 7`; do taskset -c $i sysbench --test=cpu
>   --num-threads=1 --max-time=10 run | grep "total number of events:";
>   done
> 
> total number of events: 1083
> total number of events: 1085
> total number of events: 1085
> total number of events: 1085
> total number of events: 454
> total number of events: 454
> total number of events: 454
> total number of events: 454
> 
> The Cortex-A15 vs Cortex-A7 performance ratio is 2.39, i.e. very close
> to the one derived from the Dhrystone based one of the "Big.LITTLE
> Processing with ARM Cortex™-A15 & Cortex-A7" white paper (2.63).
> 
> We don't aim for exact values for the cpu capacity values. Besides the
> CPI (Cycles Per Instruction), the instruction mix and whether the system
> runs cpu-bound or memory-bound has an impact on the cpu capacity values
> derived from these benchmark results.
> 
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Kukjin Kim <kgene@kernel.org>
> Cc: Krzysztof Kozlowski <krzk@kernel.org>
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> ---
>  arch/arm/boot/dts/exynos5420-cpus.dtsi | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
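
For comparison, the arithmetic of the removed
'cpu_efficiency/clock-frequency dt property' solution can be sketched in
user space as follows. The two helper names are hypothetical; the
expressions follow the lines deleted in patch 1/4. A 'clock-frequency'
of 1.8 GHz for the Cortex-A15s is an assumption (it matches the
scaling_max_freq output above), while the 1 GHz for the Cortex-A7s is
the value stated in the commit message.

```c
#define SCHED_CAPACITY_SHIFT 10 /* SCHED_CAPACITY_SCALE == 1024 */

/* Raw value as in the removed loop: (clock-frequency >> 20) * efficiency. */
static unsigned long eff_raw_capacity(unsigned long clock_hz,
				      unsigned long efficiency)
{
	return (clock_hz >> 20) * efficiency;
}

/*
 * Middle capacity factor, following the removed tail of
 * parse_dt_topology(). The final cpu capacity is then
 * raw capacity / middle capacity.
 */
static unsigned long eff_middle_capacity(unsigned long min_cap,
					 unsigned long max_cap)
{
	if (4 * max_cap < 3 * (max_cap + min_cap))
		return (min_cap + max_cap) >> (SCHED_CAPACITY_SHIFT + 1);

	return ((max_cap / 3) >> (SCHED_CAPACITY_SHIFT - 1)) + 1;
}
```

With raw values from (1800000000, 3891) and (1000000000, 2048), dividing
each by the middle capacity gives 1535 and 448, as in the second sysfs
listing. Substituting 1300000000 for the little cpus gives 1484 and 564,
a ratio of about 2.63.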

Thanks, applied (with s/arm/ARM/ change in subject).

Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] arm: dts: exynos: add exynos5422 cpu capacity-dmips-mhz information
  2017-08-30 14:41 ` [PATCH 3/4] arm: dts: exynos: add exynos5422 " Dietmar Eggemann
@ 2017-09-17  7:37   ` Krzysztof Kozlowski
  0 siblings, 0 replies; 17+ messages in thread
From: Krzysztof Kozlowski @ 2017-09-17  7:37 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Vincent Guittot, Juri Lelli

On Wed, Aug 30, 2017 at 03:41:19PM +0100, Dietmar Eggemann wrote:
> The following 'capacity-dmips-mhz' dt property values are used:
> 
> Cortex-A15: 1024, Cortex-A7: 539
> 
> They have been derived from the cpu_efficiency values:
> 
> Cortex-A15: 3891, Cortex-A7: 2048
> 
> by scaling them so that the Cortex-A15s (big cores) use 1024.
> 
> The cpu_efficiency values were originally derived from the "Big.LITTLE
> Processing with ARM Cortex™-A15 & Cortex-A7" white paper
> (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
> (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
> Dhrystone benchmark.
> 
> The following platforms are affected once cpu-invariant accounting
> support is re-connected to the task scheduler:
> 
> odroidxu3, odroidxu3-lite, odroidxu4
> 
> Cc: Rob Herring <robh+dt@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Kukjin Kim <kgene@kernel.org>
> Cc: Krzysztof Kozlowski <krzk@kernel.org>
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> ---
>  arch/arm/boot/dts/exynos5422-cpus.dtsi | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 

Thanks, applied.

Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] arm: dts: r8a7790: add cpu capacity-dmips-mhz information
  2017-08-30 14:41 ` [PATCH 4/4] arm: dts: r8a7790: add " Dietmar Eggemann
@ 2017-09-18  7:39   ` Simon Horman
  2017-10-09 17:55     ` Dietmar Eggemann
  0 siblings, 1 reply; 17+ messages in thread
From: Simon Horman @ 2017-09-18  7:39 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Krzysztof Kozlowski, Vincent Guittot, Juri Lelli

On Wed, Aug 30, 2017 at 03:41:20PM +0100, Dietmar Eggemann wrote:
> The following 'capacity-dmips-mhz' dt property values are used:
> 
> Cortex-A15: 1024, Cortex-A7: 539
> 
> They have been derived from the cpu_efficiency values:
> 
> Cortex-A15: 3891, Cortex-A7: 2048
> 
> by scaling them so that the Cortex-A15s (big cores) use 1024.
> 
> The cpu_efficiency values were originally derived from the "Big.LITTLE
> Processing with ARM Cortex™-A15 & Cortex-A7" white paper
> (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
> (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
> Dhrystone benchmark.
> 
> The following platform is affected once cpu-invariant accounting
> support is re-connected to the task scheduler:

Thanks, applied for v4.15.

My understanding, from the comment in the cover letter, is that this is not
currently the case and thus there is no behavioural change in applying this
patch.

For the record I observed the following with and without this patch
applied. I believe this is the expected result.

v4.14-rc1
# cat /sys/devices/system/cpu/cpu*/cpu_capacity
1535
1535
1535
1535
1024
1024
1024
1024

v4.14-rc1 + patch
# cat /sys/devices/system/cpu/cpu*/cpu_capacity
1024
1024
1024
1024
539
539
539
539


* Re: [PATCH 4/4] arm: dts: r8a7790: add cpu capacity-dmips-mhz information
  2017-09-18  7:39   ` Simon Horman
@ 2017-10-09 17:55     ` Dietmar Eggemann
  0 siblings, 0 replies; 17+ messages in thread
From: Dietmar Eggemann @ 2017-10-09 17:55 UTC (permalink / raw)
  To: Simon Horman
  Cc: linux-kernel, linux-arm-kernel, devicetree, linux-samsung-soc,
	linux-renesas-soc, Russell King, Rob Herring, Mark Rutland,
	Kukjin Kim, Krzysztof Kozlowski, Vincent Guittot, Juri Lelli

On 18/09/17 08:39, Simon Horman wrote:
> On Wed, Aug 30, 2017 at 03:41:20PM +0100, Dietmar Eggemann wrote:
>> The following 'capacity-dmips-mhz' dt property values are used:
>>
>> Cortex-A15: 1024, Cortex-A7: 539
>>
>> They have been derived from the cpu_efficiency values:
>>
>> Cortex-A15: 3891, Cortex-A7: 2048
>>
>> by scaling them so that the Cortex-A15s (big cores) use 1024.
>>
>> The cpu_efficiency values were originally derived from the "Big.LITTLE
>> Processing with ARM Cortex™-A15 & Cortex-A7" white paper
>> (http://www.cl.cam.ac.uk/~rdm34/big.LITTLE.pdf). Table 1 lists 1.9x
>> (3891/2048) as the Cortex-A15 vs Cortex-A7 performance ratio for the
>> Dhrystone benchmark.
>>
>> The following platform is affected once cpu-invariant accounting
>> support is re-connected to the task scheduler:
> 
> Thanks, applied for v4.15.
> 
> My understanding, from the comment in the cover letter, is that this is not
> currently the case and thus there is no behavioural change in applying this
> patch.
> 
> For the record I observed the following with and without this patch
> applied. I believe this is the expected result.
> 
> v4.14-rc1
> # cat /sys/devices/system/cpu/cpu*/cpu_capacity
> 1535
> 1535
> 1535
> 1535
> 1024
> 1024
> 1024
> 1024
> 
> v4.14-rc1 + patch
> # cat /sys/devices/system/cpu/cpu*/cpu_capacity
> 1024
> 1024
> 1024
> 1024
> 539
> 539
> 539
> 539

Thanks Simon! Yes, that is the expected behaviour. And sorry for not
responding earlier!

With exynos542{0,2} and r8a7790 switching to the 'capacity-dmips-mhz'
based solution in v4.15, I can push for removal of the cpu_efficiency
code [patch 1/4].
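
For reference, the scaling described in the quoted patch description
(cpu_efficiency 3891/2048 becoming capacity-dmips-mhz 1024/539) can be
checked with a small sketch. This is not code from the series: the helper
name scale_to_dmips_mhz is hypothetical, and rounding to the nearest
integer is an assumption, chosen because it matches the 539 quoted above
(plain truncation would give 538).

```python
SCHED_CAPACITY_SCALE = 1024

# cpu_efficiency values as quoted in the patch description.
CPU_EFFICIENCY = {"Cortex-A15": 3891, "Cortex-A7": 2048}


def scale_to_dmips_mhz(efficiency, scale=SCHED_CAPACITY_SCALE):
    """Scale values so the largest one maps to `scale`.

    Hypothetical helper for illustration only. Uses integer arithmetic
    with round-to-nearest, which is an assumption about how 539 was
    obtained, not something stated in the thread.
    """
    biggest = max(efficiency.values())
    return {
        cpu: (value * scale + biggest // 2) // biggest
        for cpu, value in efficiency.items()
    }


capacity_dmips_mhz = scale_to_dmips_mhz(CPU_EFFICIENCY)
print(capacity_dmips_mhz)
```

The big cores land on 1024 by construction, and the little cores keep
roughly the same 1.9x performance ratio (3891/2048) that the white paper
reports for Dhrystone.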


end of thread, other threads:[~2017-10-09 17:55 UTC | newest]

Thread overview: 17+ messages
2017-08-30 14:41 [PATCH 0/4] arm: remove cpu_efficiency Dietmar Eggemann
2017-08-30 14:41 ` [PATCH 1/4] arm: topology: " Dietmar Eggemann
2017-09-04  7:49   ` Vincent Guittot
2017-09-06 11:43     ` Dietmar Eggemann
2017-09-06 12:40       ` Vincent Guittot
2017-09-07 10:41         ` Dietmar Eggemann
2017-08-30 14:41 ` [PATCH 2/4] arm: dts: exynos: add exynos5420 cpu capacity-dmips-mhz information Dietmar Eggemann
2017-08-30 20:26   ` Krzysztof Kozlowski
2017-08-31 10:36     ` Dietmar Eggemann
2017-09-03 19:56       ` Krzysztof Kozlowski
2017-09-06 11:47         ` Dietmar Eggemann
2017-09-17  7:37   ` Krzysztof Kozlowski
2017-08-30 14:41 ` [PATCH 3/4] arm: dts: exynos: add exynos5422 " Dietmar Eggemann
2017-09-17  7:37   ` Krzysztof Kozlowski
2017-08-30 14:41 ` [PATCH 4/4] arm: dts: r8a7790: add " Dietmar Eggemann
2017-09-18  7:39   ` Simon Horman
2017-10-09 17:55     ` Dietmar Eggemann
