* [PATCH v2 0/2] Optimization with aware of cpu capacity for R-Car Gen3
@ 2018-11-08 7:24 Gaku Inami
2018-11-08 7:24 ` [PATCH v2 1/2] arm64: dts: renesas: Add CPU topology on R-Car Gen3 SoCs Gaku Inami
2018-11-08 7:24 ` [PATCH v2 2/2] arm64: dts: renesas: Add CPU capacity-dmips-mhz Gaku Inami
0 siblings, 2 replies; 8+ messages in thread
From: Gaku Inami @ 2018-11-08 7:24 UTC (permalink / raw)
To: horms, magnus.damm, robh+dt, mark.rutland
Cc: linux-renesas-soc, devicetree, Gaku Inami
Commit 05484e098448 ("sched/topology: Add SD_ASYM_CPUCAPACITY
flag detection"), which automatically detects asymmetric CPU
capacity, has been merged into v4.20-rc1, so I am reposting this
patch series as v2.
These patches add the information the scheduler needs to be aware of
CPU capacity. Some R-Car SoCs have a big.LITTLE architecture (e.g.
CA57/CA53), in which the CPUs differ in performance and power
consumption.
Once the scheduler is aware of the capacity of each CPU, it balances
load so that the free capacity of each CPU is even. This means that it
aggressively migrates tasks to the big CPUs (e.g. CA57), which have
larger capacity, when the system load is low or medium, so the
performance of user applications improves.
Since most IVI users favor performance over power consumption, this
change benefits their use cases. Some benchmark results improve, as
shown in the examples below.
UnixBench (1 parallel) on r8a7796 SoC (CA57x2 + CA53x4):
                                           before     after
- Dhrystone 2 using register variables    4777159  11353624  +58%
- Double-Precision Whetstone                  866      1218  +29%
- Execl Throughput                            728       920  +21%
- File Copy 1024 bufsize 2000 maxblocks     69405    115962  +40%
- File Copy 256 bufsize 500 maxblocks       21404     28685  +25%
- File Copy 4096 bufsize 8000 maxblocks    102749    159978  +36%
- Pipe Throughput                           93876    150848  +38%
- Pipe-based Context Switching              27257     25317   -8%
- Process Creation                           1885      2292  +18%
- Shell Scripts (1 concurrent)                135       137   +1%
- Shell Scripts (8 concurrent)                 35        34   -3%
- System Call Overhead                      99169    140146  +29%
- System Benchmarks Index Score               112       152  +26%
UnixBench (8 parallel) on r8a7795 SoC (CA57x4 + CA53x4):
                                           before     after
- Dhrystone 2 using register variables   64686060  64472624    0%
- Double-Precision Whetstone                 8380      8423   +1%
- Execl Throughput                           5856      6147   +5%
- File Copy 1024 bufsize 2000 maxblocks    142923    164482  +13%
- File Copy 256 bufsize 500 maxblocks       46257     51344  +10%
- File Copy 4096 bufsize 8000 maxblocks    360398    393339   +8%
- Pipe Throughput                          974106    972545    0%
- Pipe-based Context Switching             162455    146567  -11%
- Process Creation                          10164      9659   -5%
- Shell Scripts (1 concurrent)                317       317    0%
- Shell Scripts (8 concurrent)                 30        31   +3%
- System Call Overhead                     897596    899274    0%
- System Benchmarks Index Score               523       534   +2%
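The cover letter does not say how the percentage column is computed.
Judging from the figures, it appears to be the change relative to the
"after" score rather than the "before" score. That reading is inferred
from the numbers, not stated by the author. A minimal Python sketch
(the function names are illustrative only) that reproduces a few rows
under that assumption and shows the conventional figure next to it:

```python
def change_vs_after(before, after):
    """Percent change relative to the 'after' score (assumed convention)."""
    return round((after - before) * 100 / after)


def change_vs_before(before, after):
    """Conventional percent change relative to the 'before' score."""
    return round((after - before) * 100 / before)


# (benchmark, before, after) copied from the two UnixBench tables
rows = [
    ("Dhrystone 2 (r8a7796, 1 parallel)", 4777159, 11353624),
    ("Double-Precision Whetstone (r8a7796, 1 parallel)", 866, 1218),
    ("Pipe-based Context Switching (r8a7795, 8 parallel)", 162455, 146567),
]

for name, before, after in rows:
    print(f"{name}: {change_vs_after(before, after):+d}% vs after, "
          f"{change_vs_before(before, after):+d}% vs before")
```

Measured against the "before" score, the single-process Dhrystone gain
on r8a7796 would be about +138% rather than +58%.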
based on renesas-devel-20181105-v4.20-rc1
v1 -> v2:
- Consolidate two patches for r8a7795 and r8a7796 into one patch
- Add the formula for capacity-dmips-mhz into description
- Remove the static setting of SD_ASYM_CPUCAPACITY for R-Car
Gaku Inami (2):
arm64: dts: renesas: Add CPU topology on R-Car Gen3 SoCs
arm64: dts: renesas: Add CPU capacity-dmips-mhz
arch/arm64/boot/dts/renesas/r8a7795.dtsi | 40 ++++++++++++++++++++++++++++++++
arch/arm64/boot/dts/renesas/r8a7796.dtsi | 32 +++++++++++++++++++++++++
2 files changed, 72 insertions(+)
--
2.7.4
* [PATCH v2 1/2] arm64: dts: renesas: Add CPU topology on R-Car Gen3 SoCs
2018-11-08 7:24 [PATCH v2 0/2] Optimization with aware of cpu capacity for R-Car Gen3 Gaku Inami
@ 2018-11-08 7:24 ` Gaku Inami
2018-11-14 9:50 ` Geert Uytterhoeven
2018-11-08 7:24 ` [PATCH v2 2/2] arm64: dts: renesas: Add CPU capacity-dmips-mhz Gaku Inami
1 sibling, 1 reply; 8+ messages in thread
From: Gaku Inami @ 2018-11-08 7:24 UTC (permalink / raw)
To: horms, magnus.damm, robh+dt, mark.rutland
Cc: linux-renesas-soc, devicetree, Gaku Inami
This patch adds a "cpu-map" node to r8a7795/r8a7796, which are composed
of multiple clusters. This definition is used to parse the CPU topology.
Signed-off-by: Gaku Inami <gaku.inami.xh@renesas.com>
---
v1 -> v2:
- Consolidate two patches for r8a7795 and r8a7796 into one patch
---
arch/arm64/boot/dts/renesas/r8a7795.dtsi | 32 ++++++++++++++++++++++++++++++++
arch/arm64/boot/dts/renesas/r8a7796.dtsi | 26 ++++++++++++++++++++++++++
2 files changed, 58 insertions(+)
diff --git a/arch/arm64/boot/dts/renesas/r8a7795.dtsi b/arch/arm64/boot/dts/renesas/r8a7795.dtsi
index 0b54c53..63d5b61 100644
--- a/arch/arm64/boot/dts/renesas/r8a7795.dtsi
+++ b/arch/arm64/boot/dts/renesas/r8a7795.dtsi
@@ -116,6 +116,38 @@
#address-cells = <1>;
#size-cells = <0>;
+ cpu-map {
+ cluster0 {
+ core0 {
+ cpu = <&a57_0>;
+ };
+ core1 {
+ cpu = <&a57_1>;
+ };
+ core2 {
+ cpu = <&a57_2>;
+ };
+ core3 {
+ cpu = <&a57_3>;
+ };
+ };
+
+ cluster1 {
+ core0 {
+ cpu = <&a53_0>;
+ };
+ core1 {
+ cpu = <&a53_1>;
+ };
+ core2 {
+ cpu = <&a53_2>;
+ };
+ core3 {
+ cpu = <&a53_3>;
+ };
+ };
+ };
+
a57_0: cpu@0 {
compatible = "arm,cortex-a57", "arm,armv8";
reg = <0x0>;
diff --git a/arch/arm64/boot/dts/renesas/r8a7796.dtsi b/arch/arm64/boot/dts/renesas/r8a7796.dtsi
index 3baee26..b12bf73 100644
--- a/arch/arm64/boot/dts/renesas/r8a7796.dtsi
+++ b/arch/arm64/boot/dts/renesas/r8a7796.dtsi
@@ -127,6 +127,32 @@
#address-cells = <1>;
#size-cells = <0>;
+ cpu-map {
+ cluster0 {
+ core0 {
+ cpu = <&a57_0>;
+ };
+ core1 {
+ cpu = <&a57_1>;
+ };
+ };
+
+ cluster1 {
+ core0 {
+ cpu = <&a53_0>;
+ };
+ core1 {
+ cpu = <&a53_1>;
+ };
+ core2 {
+ cpu = <&a53_2>;
+ };
+ core3 {
+ cpu = <&a53_3>;
+ };
+ };
+ };
+
a57_0: cpu@0 {
compatible = "arm,cortex-a57", "arm,armv8";
reg = <0x0>;
--
2.7.4
* [PATCH v2 2/2] arm64: dts: renesas: Add CPU capacity-dmips-mhz
2018-11-08 7:24 [PATCH v2 0/2] Optimization with aware of cpu capacity for R-Car Gen3 Gaku Inami
2018-11-08 7:24 ` [PATCH v2 1/2] arm64: dts: renesas: Add CPU topology on R-Car Gen3 SoCs Gaku Inami
@ 2018-11-08 7:24 ` Gaku Inami
2018-11-14 9:56 ` Geert Uytterhoeven
1 sibling, 1 reply; 8+ messages in thread
From: Gaku Inami @ 2018-11-08 7:24 UTC (permalink / raw)
To: horms, magnus.damm, robh+dt, mark.rutland
Cc: linux-renesas-soc, devicetree, Gaku Inami
Set capacity-dmips-mhz for R-Car Gen3 SoCs, based on Dhrystone
results. The averages over 10 Dhrystone runs are as follows:
r8a7795 SoC (A57x4 + A53x4)
  CPU   max-freq   dhrystone
  ---------------------------------
  A57   1500 MHz   11470943 lps/s
  A53   1200 MHz    4798583 lps/s
r8a7796 SoC (A57x2 + A53x4)
  CPU   max-freq   dhrystone
  ---------------------------------
  A57   1500 MHz   11463526 lps/s
  A53   1200 MHz    4793276 lps/s
Based on the above, capacity-dmips-mhz values are calculated as follows:
r8a7795 SoC
A57 : 1024 / (11470943 / 1500) * (11470943 / 1500) = 1024
A53 : 1024 / (11470943 / 1500) * ( 4798583 / 1200) = 535
r8a7796 SoC
A57 : 1024 / (11463526 / 1500) * (11463526 / 1500) = 1024
A53 : 1024 / (11463526 / 1500) * ( 4793276 / 1200) = 535
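The calculation above can be reproduced with a short Python sketch. The
function name and the truncation to an integer are assumptions made for
illustration; the commit message only gives the formula and the
resulting values.

```python
def capacity_dmips_mhz(score, max_freq_mhz, ref_score, ref_max_freq_mhz,
                       scale=1024):
    """Normalize a core's Dhrystone-per-MHz figure to the reference core.

    Integer arithmetic is used and the result is truncated. This is an
    assumption: the source does not say whether it rounds or truncates.
    """
    # scale / (ref_score / ref_freq) * (score / freq), rearranged so that
    # a single integer division is performed at the end.
    return scale * score * ref_max_freq_mhz // (max_freq_mhz * ref_score)


# Dhrystone averages from the tables in this commit message
print(capacity_dmips_mhz(11470943, 1500, 11470943, 1500))  # r8a7795 A57
print(capacity_dmips_mhz(4798583, 1200, 11470943, 1500))   # r8a7795 A53
print(capacity_dmips_mhz(4793276, 1200, 11463526, 1500))   # r8a7796 A53
```

Both rounding and truncation give the same A53 value here, since the
unrounded results are roughly 535.5 and 535.2.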
However, since the CPUs have different maximum frequencies, the final
CPU capacities of the A53 cores are scaled by this difference. The
resulting values are as follows.
[r8a7795 SoC]
$ cat /sys/devices/system/cpu/cpu*/cpu_capacity
1024 <---- CPU capacity of A57
1024
1024
1024
428 <---- CPU capacity of A53
428
428
428
[r8a7796 SoC]
$ cat /sys/devices/system/cpu/cpu*/cpu_capacity
1024 <---- CPU capacity of A57
1024
428 <---- CPU capacity of A53
428
428
428
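The value 428 follows from scaling capacity-dmips-mhz by each cluster's
maximum frequency and normalizing so that the largest CPU gets 1024.
The sketch below is a simplified Python model of that normalization,
not the kernel code itself; the function name is hypothetical.

```python
def cpu_capacities(cpus, scale=1024):
    """cpus: mapping of name -> (capacity_dmips_mhz, max_freq_mhz).

    Returns name -> normalized capacity, with the largest CPU at `scale`.
    """
    # Raw capacity is the per-MHz figure multiplied by the maximum frequency.
    raw = {name: dmips * freq for name, (dmips, freq) in cpus.items()}
    biggest = max(raw.values())
    return {name: value * scale // biggest for name, value in raw.items()}


# Values from this patch: A57 at 1500 MHz, A53 at 1200 MHz
caps = cpu_capacities({"A57": (1024, 1500), "A53": (535, 1200)})
print(caps)  # {'A57': 1024, 'A53': 428}
```

Equivalently, 535 * 1200 / 1500 = 428.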
Signed-off-by: Gaku Inami <gaku.inami.xh@renesas.com>
---
v1 -> v2:
- Consolidate two patches for r8a7795 and r8a7796 into one patch
- Add the formula for capacity-dmips-mhz into description
---
arch/arm64/boot/dts/renesas/r8a7795.dtsi | 8 ++++++++
arch/arm64/boot/dts/renesas/r8a7796.dtsi | 6 ++++++
2 files changed, 14 insertions(+)
diff --git a/arch/arm64/boot/dts/renesas/r8a7795.dtsi b/arch/arm64/boot/dts/renesas/r8a7795.dtsi
index 63d5b61..94a4ab6 100644
--- a/arch/arm64/boot/dts/renesas/r8a7795.dtsi
+++ b/arch/arm64/boot/dts/renesas/r8a7795.dtsi
@@ -157,6 +157,7 @@
enable-method = "psci";
clocks = <&cpg CPG_CORE R8A7795_CLK_Z>;
operating-points-v2 = <&cluster0_opp>;
+ capacity-dmips-mhz = <1024>;
#cooling-cells = <2>;
};
@@ -169,6 +170,7 @@
enable-method = "psci";
clocks = <&cpg CPG_CORE R8A7795_CLK_Z>;
operating-points-v2 = <&cluster0_opp>;
+ capacity-dmips-mhz = <1024>;
#cooling-cells = <2>;
};
@@ -181,6 +183,7 @@
enable-method = "psci";
clocks = <&cpg CPG_CORE R8A7795_CLK_Z>;
operating-points-v2 = <&cluster0_opp>;
+ capacity-dmips-mhz = <1024>;
#cooling-cells = <2>;
};
@@ -193,6 +196,7 @@
enable-method = "psci";
clocks = <&cpg CPG_CORE R8A7795_CLK_Z>;
operating-points-v2 = <&cluster0_opp>;
+ capacity-dmips-mhz = <1024>;
#cooling-cells = <2>;
};
@@ -205,6 +209,7 @@
enable-method = "psci";
clocks = <&cpg CPG_CORE R8A7795_CLK_Z2>;
operating-points-v2 = <&cluster1_opp>;
+ capacity-dmips-mhz = <535>;
};
a53_1: cpu@101 {
@@ -216,6 +221,7 @@
enable-method = "psci";
clocks = <&cpg CPG_CORE R8A7795_CLK_Z2>;
operating-points-v2 = <&cluster1_opp>;
+ capacity-dmips-mhz = <535>;
};
a53_2: cpu@102 {
@@ -227,6 +233,7 @@
enable-method = "psci";
clocks = <&cpg CPG_CORE R8A7795_CLK_Z2>;
operating-points-v2 = <&cluster1_opp>;
+ capacity-dmips-mhz = <535>;
};
a53_3: cpu@103 {
@@ -238,6 +245,7 @@
enable-method = "psci";
clocks = <&cpg CPG_CORE R8A7795_CLK_Z2>;
operating-points-v2 = <&cluster1_opp>;
+ capacity-dmips-mhz = <535>;
};
L2_CA57: cache-controller-0 {
diff --git a/arch/arm64/boot/dts/renesas/r8a7796.dtsi b/arch/arm64/boot/dts/renesas/r8a7796.dtsi
index b12bf73..369d0bc 100644
--- a/arch/arm64/boot/dts/renesas/r8a7796.dtsi
+++ b/arch/arm64/boot/dts/renesas/r8a7796.dtsi
@@ -162,6 +162,7 @@
enable-method = "psci";
clocks = <&cpg CPG_CORE R8A7796_CLK_Z>;
operating-points-v2 = <&cluster0_opp>;
+ capacity-dmips-mhz = <1024>;
#cooling-cells = <2>;
};
@@ -174,6 +175,7 @@
enable-method = "psci";
clocks = <&cpg CPG_CORE R8A7796_CLK_Z>;
operating-points-v2 = <&cluster0_opp>;
+ capacity-dmips-mhz = <1024>;
#cooling-cells = <2>;
};
@@ -186,6 +188,7 @@
enable-method = "psci";
clocks = <&cpg CPG_CORE R8A7796_CLK_Z2>;
operating-points-v2 = <&cluster1_opp>;
+ capacity-dmips-mhz = <535>;
};
a53_1: cpu@101 {
@@ -197,6 +200,7 @@
enable-method = "psci";
clocks = <&cpg CPG_CORE R8A7796_CLK_Z2>;
operating-points-v2 = <&cluster1_opp>;
+ capacity-dmips-mhz = <535>;
};
a53_2: cpu@102 {
@@ -208,6 +212,7 @@
enable-method = "psci";
clocks = <&cpg CPG_CORE R8A7796_CLK_Z2>;
operating-points-v2 = <&cluster1_opp>;
+ capacity-dmips-mhz = <535>;
};
a53_3: cpu@103 {
@@ -219,6 +224,7 @@
enable-method = "psci";
clocks = <&cpg CPG_CORE R8A7796_CLK_Z2>;
operating-points-v2 = <&cluster1_opp>;
+ capacity-dmips-mhz = <535>;
};
L2_CA57: cache-controller-0 {
--
2.7.4
* Re: [PATCH v2 1/2] arm64: dts: renesas: Add CPU topology on R-Car Gen3 SoCs
2018-11-08 7:24 ` [PATCH v2 1/2] arm64: dts: renesas: Add CPU topology on R-Car Gen3 SoCs Gaku Inami
@ 2018-11-14 9:50 ` Geert Uytterhoeven
2018-11-15 1:07 ` Gaku Inami
2018-11-15 14:27 ` Simon Horman
0 siblings, 2 replies; 8+ messages in thread
From: Geert Uytterhoeven @ 2018-11-14 9:50 UTC (permalink / raw)
To: Gaku Inami
Cc: Simon Horman, Magnus Damm, Rob Herring, Mark Rutland,
Linux-Renesas,
open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS
Hi Inami-san,
On Thu, Nov 8, 2018 at 8:25 AM Gaku Inami <gaku.inami.xh@renesas.com> wrote:
> This patch adds a "cpu-map" node to r8a7795/r8a7796, which are composed
> of multiple clusters. This definition is used to parse the CPU topology.
>
> Signed-off-by: Gaku Inami <gaku.inami.xh@renesas.com>
Thanks for your patch!
Next time, please collect tags provided by reviewers on the previous
version.
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
* Re: [PATCH v2 2/2] arm64: dts: renesas: Add CPU capacity-dmips-mhz
2018-11-08 7:24 ` [PATCH v2 2/2] arm64: dts: renesas: Add CPU capacity-dmips-mhz Gaku Inami
@ 2018-11-14 9:56 ` Geert Uytterhoeven
2018-11-15 14:28 ` Simon Horman
0 siblings, 1 reply; 8+ messages in thread
From: Geert Uytterhoeven @ 2018-11-14 9:56 UTC (permalink / raw)
To: Gaku Inami
Cc: Simon Horman, Magnus Damm, Rob Herring, Mark Rutland,
Linux-Renesas,
open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS
Hi Inami-san,
On Thu, Nov 8, 2018 at 8:25 AM Gaku Inami <gaku.inami.xh@renesas.com> wrote:
> Set capacity-dmips-mhz for R-Car Gen3 SoCs, based on Dhrystone
> results. The averages over 10 Dhrystone runs are as follows:
[...]
> Signed-off-by: Gaku Inami <gaku.inami.xh@renesas.com>
> ---
> v1 -> v2:
> - Consolidate two patches for r8a7795 and r8a7796 into one patch
> - Add the formula for capacity-dmips-mhz into description
Thanks for the update, and the extensive and clear description!
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
* RE: [PATCH v2 1/2] arm64: dts: renesas: Add CPU topology on R-Car Gen3 SoCs
2018-11-14 9:50 ` Geert Uytterhoeven
@ 2018-11-15 1:07 ` Gaku Inami
2018-11-15 14:27 ` Simon Horman
1 sibling, 0 replies; 8+ messages in thread
From: Gaku Inami @ 2018-11-15 1:07 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Simon Horman, Magnus Damm, Rob Herring, Mark Rutland,
Linux-Renesas,
open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS
Hi Geert-san,
Thanks for your review.
> -----Original Message-----
> From: Geert Uytterhoeven <geert@linux-m68k.org>
> Sent: Wednesday, November 14, 2018 6:50 PM
> To: Gaku Inami <gaku.inami.xh@renesas.com>
> Cc: Simon Horman <horms@verge.net.au>; Magnus Damm <magnus.damm@gmail.com>; Rob Herring <robh+dt@kernel.org>; Mark
> Rutland <mark.rutland@arm.com>; Linux-Renesas <linux-renesas-soc@vger.kernel.org>; open list:OPEN FIRMWARE AND FLATTENED
> DEVICE TREE BINDINGS <devicetree@vger.kernel.org>
> Subject: Re: [PATCH v2 1/2] arm64: dts: renesas: Add CPU topology on R-Car Gen3 SoCs
[snip]
> Next time, please collect tags provided by reviewers on the previous
> version.
I am sorry that your Reviewed-by tag was missing. I will add the correct tags next time.
Regards,
Inami
* Re: [PATCH v2 1/2] arm64: dts: renesas: Add CPU topology on R-Car Gen3 SoCs
2018-11-14 9:50 ` Geert Uytterhoeven
2018-11-15 1:07 ` Gaku Inami
@ 2018-11-15 14:27 ` Simon Horman
1 sibling, 0 replies; 8+ messages in thread
From: Simon Horman @ 2018-11-15 14:27 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Gaku Inami, Magnus Damm, Rob Herring, Mark Rutland,
Linux-Renesas,
open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS
On Wed, Nov 14, 2018 at 10:50:03AM +0100, Geert Uytterhoeven wrote:
> Hi Inami-san,
>
> On Thu, Nov 8, 2018 at 8:25 AM Gaku Inami <gaku.inami.xh@renesas.com> wrote:
> > This patch adds a "cpu-map" node to r8a7795/r8a7796, which are composed
> > of multiple clusters. This definition is used to parse the CPU topology.
> >
> > Signed-off-by: Gaku Inami <gaku.inami.xh@renesas.com>
>
> Thanks for your patch!
>
> Next time, please collect tags provided by reviewers on the previous
> version.
>
> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Thanks, applied for v4.21.
* Re: [PATCH v2 2/2] arm64: dts: renesas: Add CPU capacity-dmips-mhz
2018-11-14 9:56 ` Geert Uytterhoeven
@ 2018-11-15 14:28 ` Simon Horman
0 siblings, 0 replies; 8+ messages in thread
From: Simon Horman @ 2018-11-15 14:28 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Gaku Inami, Magnus Damm, Rob Herring, Mark Rutland,
Linux-Renesas,
open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS
On Wed, Nov 14, 2018 at 10:56:04AM +0100, Geert Uytterhoeven wrote:
> Hi Inami-san,
>
> On Thu, Nov 8, 2018 at 8:25 AM Gaku Inami <gaku.inami.xh@renesas.com> wrote:
> > Set capacity-dmips-mhz for R-Car Gen3 SoCs, based on Dhrystone
> > results. The averages over 10 Dhrystone runs are as follows:
>
> [...]
>
> > Signed-off-by: Gaku Inami <gaku.inami.xh@renesas.com>
> > ---
> > v1 -> v2:
> > - Consolidate two patches for r8a7795 and r8a7796 into one patch
> > - Add the formula for capacity-dmips-mhz into description
>
> Thanks for the update, and the extensive and clear description!
>
> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Thanks, applied for v4.21.