* [PATCH 0/3] CPU power management for SM8150
@ 2020-12-21 0:29 Danny Lin
2020-12-21 0:29 ` [PATCH 1/3] arm64: dts: qcom: sm8150: Define CPU topology Danny Lin
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Danny Lin @ 2020-12-21 0:29 UTC (permalink / raw)
Cc: Danny Lin, Andy Gross, Bjorn Andersson, Rob Herring,
linux-arm-msm, devicetree, linux-kernel
These patches add support for high-level CPU power management on the
SM8150 platform. cpuidle and energy-aware scheduling are now working
with the new idle states and CPU energy model.
Danny Lin (3):
arm64: dts: qcom: sm8150: Define CPU topology
arm64: dts: qcom: sm8150: Add PSCI idle states
arm64: dts: qcom: sm8150: Add CPU capacities and energy model
arch/arm64/boot/dts/qcom/sm8150.dtsi | 102 +++++++++++++++++++++++++++
1 file changed, 102 insertions(+)
--
2.29.2
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/3] arm64: dts: qcom: sm8150: Define CPU topology
2020-12-21 0:29 [PATCH 0/3] CPU power management for SM8150 Danny Lin
@ 2020-12-21 0:29 ` Danny Lin
2020-12-21 0:29 ` [PATCH 2/3] arm64: dts: qcom: sm8150: Add PSCI idle states Danny Lin
` (2 subsequent siblings)
3 siblings, 0 replies; 10+ messages in thread
From: Danny Lin @ 2020-12-21 0:29 UTC (permalink / raw)
Cc: Danny Lin, Andy Gross, Bjorn Andersson, Rob Herring,
linux-arm-msm, devicetree, linux-kernel
sm8150 has a big.LITTLE CPU setup with DynamIQ, so all cores are within
the same CPU cluster and LLC (Last-Level Cache) domain. Define this
topology to help the scheduler make decisions.
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
---
arch/arm64/boot/dts/qcom/sm8150.dtsi | 36 ++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
index b58cf1b8542c..75ed38ee5d88 100644
--- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
@@ -160,6 +160,42 @@ L2_700: l2-cache {
next-level-cache = <&L3_0>;
};
};
+
+ cpu-map {
+ cluster0 {
+ core0 {
+ cpu = <&CPU0>;
+ };
+
+ core1 {
+ cpu = <&CPU1>;
+ };
+
+ core2 {
+ cpu = <&CPU2>;
+ };
+
+ core3 {
+ cpu = <&CPU3>;
+ };
+
+ core4 {
+ cpu = <&CPU4>;
+ };
+
+ core5 {
+ cpu = <&CPU5>;
+ };
+
+ core6 {
+ cpu = <&CPU6>;
+ };
+
+ core7 {
+ cpu = <&CPU7>;
+ };
+ };
+ };
};
firmware {
--
2.29.2
* [PATCH 2/3] arm64: dts: qcom: sm8150: Add PSCI idle states
2020-12-21 0:29 [PATCH 0/3] CPU power management for SM8150 Danny Lin
2020-12-21 0:29 ` [PATCH 1/3] arm64: dts: qcom: sm8150: Define CPU topology Danny Lin
@ 2020-12-21 0:29 ` Danny Lin
2020-12-21 3:48 ` Bjorn Andersson
2020-12-21 0:29 ` [PATCH 3/3] arm64: dts: qcom: sm8150: Add CPU capacities and energy model Danny Lin
2021-01-06 3:50 ` [PATCH 0/3] CPU power management for SM8150 patchwork-bot+linux-arm-msm
3 siblings, 1 reply; 10+ messages in thread
From: Danny Lin @ 2020-12-21 0:29 UTC (permalink / raw)
Cc: Danny Lin, Andy Gross, Bjorn Andersson, Rob Herring,
linux-arm-msm, devicetree, linux-kernel
Like other Qualcomm SoCs, sm8150 exposes CPU and cluster idle states
through PSCI. Define the idle states to save power when the CPU is not
in active use.
These idle states, latency, and residency values match the downstream
4.14 kernel from Qualcomm as of LA.UM.8.1.r1-15600-sm8150.0.
It's worth noting that the CPU has an additional C3 power collapse idle
state between WFI and rail power collapse (with PSCI mode 0x40000003),
but it is not officially used in downstream kernels due to "thermal
throttling issues."
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
---
arch/arm64/boot/dts/qcom/sm8150.dtsi | 50 ++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)
diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
index 75ed38ee5d88..edc1fe6d7f1b 100644
--- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
@@ -50,6 +50,8 @@ CPU0: cpu@0 {
compatible = "qcom,kryo485";
reg = <0x0 0x0>;
enable-method = "psci";
+ cpu-idle-states = <&LITTLE_CPU_SLEEP_0
+ &CLUSTER_SLEEP_0>;
next-level-cache = <&L2_0>;
qcom,freq-domain = <&cpufreq_hw 0>;
#cooling-cells = <2>;
@@ -67,6 +69,8 @@ CPU1: cpu@100 {
compatible = "qcom,kryo485";
reg = <0x0 0x100>;
enable-method = "psci";
+ cpu-idle-states = <&LITTLE_CPU_SLEEP_0
+ &CLUSTER_SLEEP_0>;
next-level-cache = <&L2_100>;
qcom,freq-domain = <&cpufreq_hw 0>;
#cooling-cells = <2>;
@@ -82,6 +86,8 @@ CPU2: cpu@200 {
compatible = "qcom,kryo485";
reg = <0x0 0x200>;
enable-method = "psci";
+ cpu-idle-states = <&LITTLE_CPU_SLEEP_0
+ &CLUSTER_SLEEP_0>;
next-level-cache = <&L2_200>;
qcom,freq-domain = <&cpufreq_hw 0>;
#cooling-cells = <2>;
@@ -96,6 +102,8 @@ CPU3: cpu@300 {
compatible = "qcom,kryo485";
reg = <0x0 0x300>;
enable-method = "psci";
+ cpu-idle-states = <&LITTLE_CPU_SLEEP_0
+ &CLUSTER_SLEEP_0>;
next-level-cache = <&L2_300>;
qcom,freq-domain = <&cpufreq_hw 0>;
#cooling-cells = <2>;
@@ -110,6 +118,8 @@ CPU4: cpu@400 {
compatible = "qcom,kryo485";
reg = <0x0 0x400>;
enable-method = "psci";
+ cpu-idle-states = <&BIG_CPU_SLEEP_0
+ &CLUSTER_SLEEP_0>;
next-level-cache = <&L2_400>;
qcom,freq-domain = <&cpufreq_hw 1>;
#cooling-cells = <2>;
@@ -124,6 +134,8 @@ CPU5: cpu@500 {
compatible = "qcom,kryo485";
reg = <0x0 0x500>;
enable-method = "psci";
+ cpu-idle-states = <&BIG_CPU_SLEEP_0
+ &CLUSTER_SLEEP_0>;
next-level-cache = <&L2_500>;
qcom,freq-domain = <&cpufreq_hw 1>;
#cooling-cells = <2>;
@@ -138,6 +150,8 @@ CPU6: cpu@600 {
compatible = "qcom,kryo485";
reg = <0x0 0x600>;
enable-method = "psci";
+ cpu-idle-states = <&BIG_CPU_SLEEP_0
+ &CLUSTER_SLEEP_0>;
next-level-cache = <&L2_600>;
qcom,freq-domain = <&cpufreq_hw 1>;
#cooling-cells = <2>;
@@ -152,6 +166,8 @@ CPU7: cpu@700 {
compatible = "qcom,kryo485";
reg = <0x0 0x700>;
enable-method = "psci";
+ cpu-idle-states = <&BIG_CPU_SLEEP_0
+ &CLUSTER_SLEEP_0>;
next-level-cache = <&L2_700>;
qcom,freq-domain = <&cpufreq_hw 2>;
#cooling-cells = <2>;
@@ -196,6 +212,40 @@ core7 {
};
};
};
+
+ idle-states {
+ entry-method = "psci";
+
+ LITTLE_CPU_SLEEP_0: cpu-sleep-0-0 {
+ compatible = "arm,idle-state";
+ idle-state-name = "little-rail-power-collapse";
+ arm,psci-suspend-param = <0x40000004>;
+ entry-latency-us = <355>;
+ exit-latency-us = <909>;
+ min-residency-us = <3934>;
+ local-timer-stop;
+ };
+
+ BIG_CPU_SLEEP_0: cpu-sleep-1-0 {
+ compatible = "arm,idle-state";
+ idle-state-name = "big-rail-power-collapse";
+ arm,psci-suspend-param = <0x40000004>;
+ entry-latency-us = <241>;
+ exit-latency-us = <1461>;
+ min-residency-us = <4488>;
+ local-timer-stop;
+ };
+
+ CLUSTER_SLEEP_0: cluster-sleep-0 {
+ compatible = "arm,idle-state";
+ idle-state-name = "cluster-power-collapse";
+ arm,psci-suspend-param = <0x400000F4>;
+ entry-latency-us = <3263>;
+ exit-latency-us = <6562>;
+ min-residency-us = <9987>;
+ local-timer-stop;
+ };
+ };
};
firmware {
--
2.29.2
* [PATCH 3/3] arm64: dts: qcom: sm8150: Add CPU capacities and energy model
2020-12-21 0:29 [PATCH 0/3] CPU power management for SM8150 Danny Lin
2020-12-21 0:29 ` [PATCH 1/3] arm64: dts: qcom: sm8150: Define CPU topology Danny Lin
2020-12-21 0:29 ` [PATCH 2/3] arm64: dts: qcom: sm8150: Add PSCI idle states Danny Lin
@ 2020-12-21 0:29 ` Danny Lin
2021-01-06 3:50 ` [PATCH 0/3] CPU power management for SM8150 patchwork-bot+linux-arm-msm
3 siblings, 0 replies; 10+ messages in thread
From: Danny Lin @ 2020-12-21 0:29 UTC (permalink / raw)
Cc: Danny Lin, Andy Gross, Bjorn Andersson, Rob Herring,
linux-arm-msm, devicetree, linux-kernel
Power and performance measurements were made using my freqbench [1]
benchmark coordinator, which isolates, offlines, and disables the timer
tick on test CPUs to maximize accuracy. It uses EEMBC CoreMark [2] as
the workload and measures power usage using the PM8150B PMIC's fuel
gauge.
The energy model dynamic-power-coefficient values were calculated with
DPC = µW / MHz / V^2
for each OPP, and averaged across all OPPs within each cluster for the
final coefficient. Voltages were obtained from the qcom-cpufreq-hw
driver that reads voltages from the OSM LUT programmed into the SoC.
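
As a rough illustration of the calculation described above, the sketch
below computes a per-OPP coefficient and averages it across a cluster.
The function names and the example OPP values (including the voltages)
are hypothetical placeholders, not figures from the benchmark run; the
real voltages come from the OSM LUT and are not listed in this message.

```python
# Sketch only: per-OPP dynamic power coefficient, averaged per cluster.
# All values in `example_opps` are hypothetical placeholders.

def dpc(power_mw: float, freq_mhz: float, volts: float) -> float:
    """DPC = uW / MHz / V^2 for a single OPP."""
    return (power_mw * 1000.0) / freq_mhz / (volts ** 2)

def cluster_dpc(opps) -> int:
    """Average the per-OPP DPC values of one cluster and round to an integer."""
    values = [dpc(power_mw, freq_mhz, volts) for (freq_mhz, power_mw, volts) in opps]
    return round(sum(values) / len(values))

# (frequency in MHz, power in mW, voltage in V), placeholder numbers
example_opps = [
    (1000.0, 100.0, 0.70),
    (2000.0, 400.0, 1.00),
]

print(cluster_dpc(example_opps))
```

The rounded average is what would go into dynamic-power-coefficient for
every CPU of that cluster.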
Normalized DMIPS/MHz capacity scale values for each CPU were calculated
from CoreMarks/MHz (CoreMark iterations per second per MHz), which
serves the same purpose. For each CPU, the final capacity-dmips-mhz
value is the C/MHz value of its maximum frequency normalized to
SCHED_CAPACITY_SCALE (1024) for the fastest CPU in the system.
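
The normalization above can be sketched as follows. This is a minimal
example that assumes the rounded CoreMark scores at each domain's
maximum frequency from the summary below as input; the helper name is
hypothetical, and the raw results in [3] remain the authoritative data.

```python
# Sketch only: derive capacity-dmips-mhz style values from CoreMark/MHz.
SCHED_CAPACITY_SCALE = 1024

def capacities(max_opp):
    """Map cpu -> capacity, scaling the best CoreMark/MHz to SCHED_CAPACITY_SCALE."""
    per_mhz = {cpu: score / mhz for cpu, (mhz, score) in max_opp.items()}
    fastest = max(per_mhz.values())
    return {cpu: round(SCHED_CAPACITY_SCALE * value / fastest)
            for cpu, value in per_mhz.items()}

# (maximum frequency in MHz, CoreMark score at that frequency),
# taken from the rounded human-readable summary
summary_max_opp = {
    "cpu1": (1785, 6630),
    "cpu4": (2419, 18835),
    "cpu7": (2841, 22147),
}

caps = capacities(summary_max_opp)
print(caps)
```

With these rounded inputs the little cores come out at 488, matching the
patch. cpu4 comes out at 1023 rather than the 1024 used in the patch, so
treat the summary numbers as approximate.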
An Asus ZenFone 6 device running a downstream Qualcomm 4.14 kernel
(LA.UM.8.1.r1-15600-sm8150.0) was used for benchmarks to ensure proper
frequency scaling and other low-level controls.
Raw benchmark results can be found in the freqbench repository [3].
Below is a human-readable summary:
Frequency domains: cpu1 cpu4 cpu7
Offline CPUs: cpu1 cpu2 cpu3 cpu4 cpu5 cpu6 cpu7
Baseline power usage: 1400 mW
===== CPU 1 =====
Frequencies: 300 403 499 576 672 768 844 940 1036 1113 1209 1305 1382 1478 1555 1632 1708 1785
300: 1114 3.7 C/MHz 52 mW 11.8 J 21.3 I/mJ 224.4 s
403: 1497 3.7 C/MHz 57 mW 9.5 J 26.2 I/mJ 167.0 s
499: 1854 3.7 C/MHz 73 mW 9.8 J 25.5 I/mJ 134.9 s
576: 2139 3.7 C/MHz 83 mW 9.7 J 25.8 I/mJ 116.9 s
672: 2495 3.7 C/MHz 65 mW 6.5 J 38.6 I/mJ 100.2 s
768: 2852 3.7 C/MHz 72 mW 6.3 J 39.4 I/mJ 87.7 s
844: 3137 3.7 C/MHz 77 mW 6.2 J 40.5 I/mJ 79.7 s
940: 3493 3.7 C/MHz 84 mW 6.0 J 41.8 I/mJ 71.6 s
1036: 3850 3.7 C/MHz 91 mW 5.9 J 42.5 I/mJ 64.9 s
1113: 4135 3.7 C/MHz 96 mW 5.8 J 43.2 I/mJ 60.5 s
1209: 4491 3.7 C/MHz 102 mW 5.7 J 44.2 I/mJ 55.7 s
1305: 4848 3.7 C/MHz 110 mW 5.7 J 44.0 I/mJ 51.6 s
1382: 5133 3.7 C/MHz 114 mW 5.5 J 45.2 I/mJ 48.7 s
1478: 5490 3.7 C/MHz 120 mW 5.5 J 45.7 I/mJ 45.5 s
1555: 5775 3.7 C/MHz 126 mW 5.5 J 45.8 I/mJ 43.3 s
1632: 6060 3.7 C/MHz 131 mW 5.4 J 46.1 I/mJ 41.3 s
1708: 6345 3.7 C/MHz 137 mW 5.4 J 46.3 I/mJ 39.4 s
1785: 6630 3.7 C/MHz 146 mW 5.5 J 45.5 I/mJ 37.7 s
===== CPU 4 =====
Frequencies: 710 825 940 1056 1171 1286 1401 1497 1612 1708 1804 1920 2016 2131 2227 2323 2419
710: 2765 3.9 C/MHz 126 mW 11.4 J 22.0 I/mJ 90.4 s
825: 6432 7.8 C/MHz 206 mW 8.0 J 31.2 I/mJ 38.9 s
940: 7331 7.8 C/MHz 227 mW 7.7 J 32.3 I/mJ 34.1 s
1056: 8227 7.8 C/MHz 249 mW 7.6 J 33.0 I/mJ 30.4 s
1171: 9127 7.8 C/MHz 261 mW 7.2 J 34.9 I/mJ 27.4 s
1286: 10020 7.8 C/MHz 289 mW 7.2 J 34.6 I/mJ 25.0 s
1401: 10918 7.8 C/MHz 311 mW 7.1 J 35.1 I/mJ 22.9 s
1497: 11663 7.8 C/MHz 336 mW 7.2 J 34.7 I/mJ 21.4 s
1612: 12546 7.8 C/MHz 375 mW 7.5 J 33.5 I/mJ 19.9 s
1708: 13320 7.8 C/MHz 398 mW 7.5 J 33.5 I/mJ 18.8 s
1804: 14069 7.8 C/MHz 456 mW 8.1 J 30.9 I/mJ 17.8 s
1920: 14909 7.8 C/MHz 507 mW 8.5 J 29.4 I/mJ 16.8 s
2016: 15706 7.8 C/MHz 558 mW 8.9 J 28.1 I/mJ 15.9 s
2131: 16612 7.8 C/MHz 632 mW 9.5 J 26.3 I/mJ 15.1 s
2227: 17349 7.8 C/MHz 698 mW 10.1 J 24.8 I/mJ 14.4 s
2323: 18088 7.8 C/MHz 717 mW 9.9 J 25.2 I/mJ 13.8 s
2419: 18835 7.8 C/MHz 845 mW 11.2 J 22.3 I/mJ 13.3 s
===== CPU 7 =====
Frequencies: 825 940 1056 1171 1286 1401 1497 1612 1708 1804 1920 2016 2131 2227 2323 2419 2534 2649 2745 2841
825: 3215 3.9 C/MHz 158 mW 12.3 J 20.3 I/mJ 77.8 s
940: 7330 7.8 C/MHz 269 mW 9.2 J 27.3 I/mJ 34.1 s
1056: 8227 7.8 C/MHz 291 mW 8.8 J 28.2 I/mJ 30.4 s
1171: 9125 7.8 C/MHz 316 mW 8.7 J 28.9 I/mJ 27.4 s
1286: 10024 7.8 C/MHz 338 mW 8.4 J 29.6 I/mJ 25.0 s
1401: 10922 7.8 C/MHz 365 mW 8.4 J 29.9 I/mJ 22.9 s
1497: 11674 7.8 C/MHz 383 mW 8.2 J 30.4 I/mJ 21.4 s
1612: 12564 7.8 C/MHz 406 mW 8.1 J 30.9 I/mJ 19.9 s
1708: 13317 7.8 C/MHz 427 mW 8.0 J 31.2 I/mJ 18.8 s
1804: 14062 7.8 C/MHz 446 mW 7.9 J 31.5 I/mJ 17.8 s
1920: 14966 7.8 C/MHz 498 mW 8.3 J 30.1 I/mJ 16.7 s
2016: 15711 7.8 C/MHz 513 mW 8.2 J 30.6 I/mJ 15.9 s
2131: 16599 7.8 C/MHz 599 mW 9.0 J 27.7 I/mJ 15.1 s
2227: 17353 7.8 C/MHz 622 mW 9.0 J 27.9 I/mJ 14.4 s
2323: 18095 7.8 C/MHz 704 mW 9.7 J 25.7 I/mJ 13.8 s
2419: 18849 7.8 C/MHz 738 mW 9.8 J 25.5 I/mJ 13.3 s
2534: 19761 7.8 C/MHz 824 mW 10.4 J 23.9 I/mJ 12.7 s
2649: 20658 7.8 C/MHz 882 mW 10.7 J 23.4 I/mJ 12.1 s
2745: 21400 7.8 C/MHz 1003 mW 11.7 J 21.3 I/mJ 11.7 s
2841: 22147 7.8 C/MHz 1092 mW 12.3 J 20.3 I/mJ 11.3 s
[1] https://github.com/kdrag0n/freqbench
[2] https://www.eembc.org/coremark/
[3] https://github.com/kdrag0n/freqbench/tree/master/results/sm8150/main
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
---
arch/arm64/boot/dts/qcom/sm8150.dtsi | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
index edc1fe6d7f1b..5cadab1e052a 100644
--- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
@@ -50,6 +50,8 @@ CPU0: cpu@0 {
compatible = "qcom,kryo485";
reg = <0x0 0x0>;
enable-method = "psci";
+ capacity-dmips-mhz = <488>;
+ dynamic-power-coefficient = <232>;
cpu-idle-states = <&LITTLE_CPU_SLEEP_0
&CLUSTER_SLEEP_0>;
next-level-cache = <&L2_0>;
@@ -69,6 +71,8 @@ CPU1: cpu@100 {
compatible = "qcom,kryo485";
reg = <0x0 0x100>;
enable-method = "psci";
+ capacity-dmips-mhz = <488>;
+ dynamic-power-coefficient = <232>;
cpu-idle-states = <&LITTLE_CPU_SLEEP_0
&CLUSTER_SLEEP_0>;
next-level-cache = <&L2_100>;
@@ -86,6 +90,8 @@ CPU2: cpu@200 {
compatible = "qcom,kryo485";
reg = <0x0 0x200>;
enable-method = "psci";
+ capacity-dmips-mhz = <488>;
+ dynamic-power-coefficient = <232>;
cpu-idle-states = <&LITTLE_CPU_SLEEP_0
&CLUSTER_SLEEP_0>;
next-level-cache = <&L2_200>;
@@ -102,6 +108,8 @@ CPU3: cpu@300 {
compatible = "qcom,kryo485";
reg = <0x0 0x300>;
enable-method = "psci";
+ capacity-dmips-mhz = <488>;
+ dynamic-power-coefficient = <232>;
cpu-idle-states = <&LITTLE_CPU_SLEEP_0
&CLUSTER_SLEEP_0>;
next-level-cache = <&L2_300>;
@@ -118,6 +126,8 @@ CPU4: cpu@400 {
compatible = "qcom,kryo485";
reg = <0x0 0x400>;
enable-method = "psci";
+ capacity-dmips-mhz = <1024>;
+ dynamic-power-coefficient = <369>;
cpu-idle-states = <&BIG_CPU_SLEEP_0
&CLUSTER_SLEEP_0>;
next-level-cache = <&L2_400>;
@@ -134,6 +144,8 @@ CPU5: cpu@500 {
compatible = "qcom,kryo485";
reg = <0x0 0x500>;
enable-method = "psci";
+ capacity-dmips-mhz = <1024>;
+ dynamic-power-coefficient = <369>;
cpu-idle-states = <&BIG_CPU_SLEEP_0
&CLUSTER_SLEEP_0>;
next-level-cache = <&L2_500>;
@@ -150,6 +162,8 @@ CPU6: cpu@600 {
compatible = "qcom,kryo485";
reg = <0x0 0x600>;
enable-method = "psci";
+ capacity-dmips-mhz = <1024>;
+ dynamic-power-coefficient = <369>;
cpu-idle-states = <&BIG_CPU_SLEEP_0
&CLUSTER_SLEEP_0>;
next-level-cache = <&L2_600>;
@@ -166,6 +180,8 @@ CPU7: cpu@700 {
compatible = "qcom,kryo485";
reg = <0x0 0x700>;
enable-method = "psci";
+ capacity-dmips-mhz = <1024>;
+ dynamic-power-coefficient = <421>;
cpu-idle-states = <&BIG_CPU_SLEEP_0
&CLUSTER_SLEEP_0>;
next-level-cache = <&L2_700>;
--
2.29.2
* Re: [PATCH 2/3] arm64: dts: qcom: sm8150: Add PSCI idle states
2020-12-21 0:29 ` [PATCH 2/3] arm64: dts: qcom: sm8150: Add PSCI idle states Danny Lin
@ 2020-12-21 3:48 ` Bjorn Andersson
2020-12-23 2:00 ` Danny Lin
0 siblings, 1 reply; 10+ messages in thread
From: Bjorn Andersson @ 2020-12-21 3:48 UTC (permalink / raw)
To: Danny Lin
Cc: Andy Gross, Rob Herring, linux-arm-msm, devicetree, linux-kernel
On Sun 20 Dec 16:29 PST 2020, Danny Lin wrote:
> Like other Qualcomm SoCs, sm8150 exposes CPU and cluster idle states
> through PSCI. Define the idle states to save power when the CPU is not
> in active use.
>
> These idle states, latency, and residency values match the downstream
> 4.14 kernel from Qualcomm as of LA.UM.8.1.r1-15600-sm8150.0.
>
> It's worth noting that the CPU has an additional C3 power collapse idle
> state between WFI and rail power collapse (with PSCI mode 0x40000003),
> but it is not officially used in downstream kernels due to "thermal
> throttling issues."
>
Thanks Danny for this series, very happy to see this kind of addition.
Just one small question about the cluster param below.
> Signed-off-by: Danny Lin <danny@kdrag0n.dev>
> ---
> arch/arm64/boot/dts/qcom/sm8150.dtsi | 50 ++++++++++++++++++++++++++++
> 1 file changed, 50 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
> index 75ed38ee5d88..edc1fe6d7f1b 100644
> --- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
> @@ -50,6 +50,8 @@ CPU0: cpu@0 {
> compatible = "qcom,kryo485";
> reg = <0x0 0x0>;
> enable-method = "psci";
> + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> + &CLUSTER_SLEEP_0>;
> next-level-cache = <&L2_0>;
> qcom,freq-domain = <&cpufreq_hw 0>;
> #cooling-cells = <2>;
> @@ -67,6 +69,8 @@ CPU1: cpu@100 {
> compatible = "qcom,kryo485";
> reg = <0x0 0x100>;
> enable-method = "psci";
> + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> + &CLUSTER_SLEEP_0>;
> next-level-cache = <&L2_100>;
> qcom,freq-domain = <&cpufreq_hw 0>;
> #cooling-cells = <2>;
> @@ -82,6 +86,8 @@ CPU2: cpu@200 {
> compatible = "qcom,kryo485";
> reg = <0x0 0x200>;
> enable-method = "psci";
> + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> + &CLUSTER_SLEEP_0>;
> next-level-cache = <&L2_200>;
> qcom,freq-domain = <&cpufreq_hw 0>;
> #cooling-cells = <2>;
> @@ -96,6 +102,8 @@ CPU3: cpu@300 {
> compatible = "qcom,kryo485";
> reg = <0x0 0x300>;
> enable-method = "psci";
> + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> + &CLUSTER_SLEEP_0>;
> next-level-cache = <&L2_300>;
> qcom,freq-domain = <&cpufreq_hw 0>;
> #cooling-cells = <2>;
> @@ -110,6 +118,8 @@ CPU4: cpu@400 {
> compatible = "qcom,kryo485";
> reg = <0x0 0x400>;
> enable-method = "psci";
> + cpu-idle-states = <&BIG_CPU_SLEEP_0
> + &CLUSTER_SLEEP_0>;
> next-level-cache = <&L2_400>;
> qcom,freq-domain = <&cpufreq_hw 1>;
> #cooling-cells = <2>;
> @@ -124,6 +134,8 @@ CPU5: cpu@500 {
> compatible = "qcom,kryo485";
> reg = <0x0 0x500>;
> enable-method = "psci";
> + cpu-idle-states = <&BIG_CPU_SLEEP_0
> + &CLUSTER_SLEEP_0>;
> next-level-cache = <&L2_500>;
> qcom,freq-domain = <&cpufreq_hw 1>;
> #cooling-cells = <2>;
> @@ -138,6 +150,8 @@ CPU6: cpu@600 {
> compatible = "qcom,kryo485";
> reg = <0x0 0x600>;
> enable-method = "psci";
> + cpu-idle-states = <&BIG_CPU_SLEEP_0
> + &CLUSTER_SLEEP_0>;
> next-level-cache = <&L2_600>;
> qcom,freq-domain = <&cpufreq_hw 1>;
> #cooling-cells = <2>;
> @@ -152,6 +166,8 @@ CPU7: cpu@700 {
> compatible = "qcom,kryo485";
> reg = <0x0 0x700>;
> enable-method = "psci";
> + cpu-idle-states = <&BIG_CPU_SLEEP_0
> + &CLUSTER_SLEEP_0>;
> next-level-cache = <&L2_700>;
> qcom,freq-domain = <&cpufreq_hw 2>;
> #cooling-cells = <2>;
> @@ -196,6 +212,40 @@ core7 {
> };
> };
> };
> +
> + idle-states {
> + entry-method = "psci";
> +
> + LITTLE_CPU_SLEEP_0: cpu-sleep-0-0 {
> + compatible = "arm,idle-state";
> + idle-state-name = "little-rail-power-collapse";
> + arm,psci-suspend-param = <0x40000004>;
> + entry-latency-us = <355>;
> + exit-latency-us = <909>;
> + min-residency-us = <3934>;
> + local-timer-stop;
> + };
> +
> + BIG_CPU_SLEEP_0: cpu-sleep-1-0 {
> + compatible = "arm,idle-state";
> + idle-state-name = "big-rail-power-collapse";
> + arm,psci-suspend-param = <0x40000004>;
> + entry-latency-us = <241>;
> + exit-latency-us = <1461>;
> + min-residency-us = <4488>;
> + local-timer-stop;
> + };
> +
> + CLUSTER_SLEEP_0: cluster-sleep-0 {
> + compatible = "arm,idle-state";
> + idle-state-name = "cluster-power-collapse";
> + arm,psci-suspend-param = <0x400000F4>;
How come this is 0xf4?
Isn't downstream saying that this should be either 0x1 << 4 or 0xc24 <<
4, depending on how deep we want to go? Could we at least mention why
this is 0xf4?
Regards,
Bjorn
> + entry-latency-us = <3263>;
> + exit-latency-us = <6562>;
> + min-residency-us = <9987>;
> + local-timer-stop;
> + };
> + };
> };
>
> firmware {
> --
> 2.29.2
>
* Re: [PATCH 2/3] arm64: dts: qcom: sm8150: Add PSCI idle states
2020-12-21 3:48 ` Bjorn Andersson
@ 2020-12-23 2:00 ` Danny Lin
2020-12-28 18:02 ` Bjorn Andersson
0 siblings, 1 reply; 10+ messages in thread
From: Danny Lin @ 2020-12-23 2:00 UTC (permalink / raw)
To: Bjorn Andersson
Cc: Andy Gross, Rob Herring, linux-arm-msm, devicetree, linux-kernel
On Sun, Dec 20, 2020 at 7:48 PM, Bjorn Andersson wrote:
> On Sun 20 Dec 16:29 PST 2020, Danny Lin wrote:
>
> > Like other Qualcomm SoCs, sm8150 exposes CPU and cluster idle states
> > through PSCI. Define the idle states to save power when the CPU is not
> > in active use.
> >
> > These idle states, latency, and residency values match the downstream
> > 4.14 kernel from Qualcomm as of LA.UM.8.1.r1-15600-sm8150.0.
> >
> > It's worth noting that the CPU has an additional C3 power collapse idle
> > state between WFI and rail power collapse (with PSCI mode 0x40000003),
> > but it is not officially used in downstream kernels due to "thermal
> > throttling issues."
> >
>
> Thanks Danny for this series, very happy to see this kind of addition.
> Just one small question about the cluster param below.
>
> > Signed-off-by: Danny Lin <danny@kdrag0n.dev>
> > ---
> > arch/arm64/boot/dts/qcom/sm8150.dtsi | 50 ++++++++++++++++++++++++++++
> > 1 file changed, 50 insertions(+)
> >
> > diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > index 75ed38ee5d88..edc1fe6d7f1b 100644
> > --- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > @@ -50,6 +50,8 @@ CPU0: cpu@0 {
> > compatible = "qcom,kryo485";
> > reg = <0x0 0x0>;
> > enable-method = "psci";
> > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > + &CLUSTER_SLEEP_0>;
> > next-level-cache = <&L2_0>;
> > qcom,freq-domain = <&cpufreq_hw 0>;
> > #cooling-cells = <2>;
> > @@ -67,6 +69,8 @@ CPU1: cpu@100 {
> > compatible = "qcom,kryo485";
> > reg = <0x0 0x100>;
> > enable-method = "psci";
> > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > + &CLUSTER_SLEEP_0>;
> > next-level-cache = <&L2_100>;
> > qcom,freq-domain = <&cpufreq_hw 0>;
> > #cooling-cells = <2>;
> > @@ -82,6 +86,8 @@ CPU2: cpu@200 {
> > compatible = "qcom,kryo485";
> > reg = <0x0 0x200>;
> > enable-method = "psci";
> > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > + &CLUSTER_SLEEP_0>;
> > next-level-cache = <&L2_200>;
> > qcom,freq-domain = <&cpufreq_hw 0>;
> > #cooling-cells = <2>;
> > @@ -96,6 +102,8 @@ CPU3: cpu@300 {
> > compatible = "qcom,kryo485";
> > reg = <0x0 0x300>;
> > enable-method = "psci";
> > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > + &CLUSTER_SLEEP_0>;
> > next-level-cache = <&L2_300>;
> > qcom,freq-domain = <&cpufreq_hw 0>;
> > #cooling-cells = <2>;
> > @@ -110,6 +118,8 @@ CPU4: cpu@400 {
> > compatible = "qcom,kryo485";
> > reg = <0x0 0x400>;
> > enable-method = "psci";
> > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > + &CLUSTER_SLEEP_0>;
> > next-level-cache = <&L2_400>;
> > qcom,freq-domain = <&cpufreq_hw 1>;
> > #cooling-cells = <2>;
> > @@ -124,6 +134,8 @@ CPU5: cpu@500 {
> > compatible = "qcom,kryo485";
> > reg = <0x0 0x500>;
> > enable-method = "psci";
> > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > + &CLUSTER_SLEEP_0>;
> > next-level-cache = <&L2_500>;
> > qcom,freq-domain = <&cpufreq_hw 1>;
> > #cooling-cells = <2>;
> > @@ -138,6 +150,8 @@ CPU6: cpu@600 {
> > compatible = "qcom,kryo485";
> > reg = <0x0 0x600>;
> > enable-method = "psci";
> > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > + &CLUSTER_SLEEP_0>;
> > next-level-cache = <&L2_600>;
> > qcom,freq-domain = <&cpufreq_hw 1>;
> > #cooling-cells = <2>;
> > @@ -152,6 +166,8 @@ CPU7: cpu@700 {
> > compatible = "qcom,kryo485";
> > reg = <0x0 0x700>;
> > enable-method = "psci";
> > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > + &CLUSTER_SLEEP_0>;
> > next-level-cache = <&L2_700>;
> > qcom,freq-domain = <&cpufreq_hw 2>;
> > #cooling-cells = <2>;
> > @@ -196,6 +212,40 @@ core7 {
> > };
> > };
> > };
> > +
> > + idle-states {
> > + entry-method = "psci";
> > +
> > + LITTLE_CPU_SLEEP_0: cpu-sleep-0-0 {
> > + compatible = "arm,idle-state";
> > + idle-state-name = "little-rail-power-collapse";
> > + arm,psci-suspend-param = <0x40000004>;
> > + entry-latency-us = <355>;
> > + exit-latency-us = <909>;
> > + min-residency-us = <3934>;
> > + local-timer-stop;
> > + };
> > +
> > + BIG_CPU_SLEEP_0: cpu-sleep-1-0 {
> > + compatible = "arm,idle-state";
> > + idle-state-name = "big-rail-power-collapse";
> > + arm,psci-suspend-param = <0x40000004>;
> > + entry-latency-us = <241>;
> > + exit-latency-us = <1461>;
> > + min-residency-us = <4488>;
> > + local-timer-stop;
> > + };
> > +
> > + CLUSTER_SLEEP_0: cluster-sleep-0 {
> > + compatible = "arm,idle-state";
> > + idle-state-name = "cluster-power-collapse";
> > + arm,psci-suspend-param = <0x400000F4>;
>
> How come this is 0xf4?
>
> Isn't downstream saying that this should be either 0x1 << 4 or 0xc24 <<
> 4, depending on how deep we want to go? Could we at least mention why
> this is 0xf4?
I'm not sure where 0x400000F4 originally came from. I noticed that
sdm845 uses the same 0xc24 mode in downstream, but Qualcomm used
0x400000F4 in mainline.
I did some testing on a downstream kernel and found that the real value
it uses on sm8150 is 0x4100c244, but the idle state doesn't work at all
if I use the same value on mainline. The logic appears to be the same in
the downstream sdm845 kernel. Maybe it has to do with how downstream has
"notify RPM" before attempting to enter the idle state?
In downstream, the final PSCI value is calculated as the sum of:
1. (cluster-mode & cluster-mode-mask) << cluster-mode-shift = (0xc24 & 0xfff) << 4 = 0xc240
2. (is-reset << 30) = 0x40000000
3. (affinity level & 0x3) << 24 = 0x1000000
4. (cpu-mode) = 0x4
so 0xc240 + 0x40000000 + 0x1000000 + 0x4 = 0x4100c244.
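
The composition above can be checked with a short sketch. The function
name is hypothetical and this is not the downstream driver code; the
affinity level of 1 is inferred from the 0x1000000 term.

```python
# Sketch only: rebuild the downstream PSCI suspend parameter from its parts.

def downstream_psci_param(cluster_mode, cpu_mode, affinity_level, is_reset,
                          cluster_mode_mask=0xfff, cluster_mode_shift=4):
    """Sum the four fields listed above into one 32-bit PSCI power_state value."""
    return (((cluster_mode & cluster_mode_mask) << cluster_mode_shift)
            + (int(is_reset) << 30)
            + ((affinity_level & 0x3) << 24)
            + cpu_mode)

param = downstream_psci_param(cluster_mode=0xc24, cpu_mode=0x4,
                              affinity_level=1, is_reset=True)
print(hex(param))
```

The fields occupy disjoint bits here, so adding them is equivalent to
OR-ing them together.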
It's also possible that the problem comes from the cluster idle state
needing all CPUs in the cluster to be asleep (as far as I know), since
it doesn't look like mainline handles that.
>
> Regards,
> Bjorn
>
> > + entry-latency-us = <3263>;
> > + exit-latency-us = <6562>;
> > + min-residency-us = <9987>;
> > + local-timer-stop;
> > + };
> > + };
> > };
> >
> > firmware {
> > --
> > 2.29.2
> >
>
* Re: [PATCH 2/3] arm64: dts: qcom: sm8150: Add PSCI idle states
2020-12-23 2:00 ` Danny Lin
@ 2020-12-28 18:02 ` Bjorn Andersson
2020-12-29 23:19 ` Danny Lin
0 siblings, 1 reply; 10+ messages in thread
From: Bjorn Andersson @ 2020-12-28 18:02 UTC (permalink / raw)
To: Danny Lin
Cc: Andy Gross, Rob Herring, linux-arm-msm, devicetree, linux-kernel
On Tue 22 Dec 20:00 CST 2020, Danny Lin wrote:
> On Sun, Dec 20, 2020 at 7:48 PM, Bjorn Andersson wrote:
> > On Sun 20 Dec 16:29 PST 2020, Danny Lin wrote:
> >
> > > Like other Qualcomm SoCs, sm8150 exposes CPU and cluster idle states
> > > through PSCI. Define the idle states to save power when the CPU is not
> > > in active use.
> > >
> > > These idle states, latency, and residency values match the downstream
> > > 4.14 kernel from Qualcomm as of LA.UM.8.1.r1-15600-sm8150.0.
> > >
> > > It's worth noting that the CPU has an additional C3 power collapse idle
> > > state between WFI and rail power collapse (with PSCI mode 0x40000003),
> > > but it is not officially used in downstream kernels due to "thermal
> > > throttling issues."
> > >
> >
> > > Thanks Danny for this series, very happy to see this kind of addition.
> > Just one small question about the cluster param below.
> >
> > > Signed-off-by: Danny Lin <danny@kdrag0n.dev>
> > > ---
> > > arch/arm64/boot/dts/qcom/sm8150.dtsi | 50 ++++++++++++++++++++++++++++
> > > 1 file changed, 50 insertions(+)
> > >
> > > diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > > index 75ed38ee5d88..edc1fe6d7f1b 100644
> > > --- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > > @@ -50,6 +50,8 @@ CPU0: cpu@0 {
> > > compatible = "qcom,kryo485";
> > > reg = <0x0 0x0>;
> > > enable-method = "psci";
> > > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > + &CLUSTER_SLEEP_0>;
> > > next-level-cache = <&L2_0>;
> > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > #cooling-cells = <2>;
> > > @@ -67,6 +69,8 @@ CPU1: cpu@100 {
> > > compatible = "qcom,kryo485";
> > > reg = <0x0 0x100>;
> > > enable-method = "psci";
> > > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > + &CLUSTER_SLEEP_0>;
> > > next-level-cache = <&L2_100>;
> > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > #cooling-cells = <2>;
> > > @@ -82,6 +86,8 @@ CPU2: cpu@200 {
> > > compatible = "qcom,kryo485";
> > > reg = <0x0 0x200>;
> > > enable-method = "psci";
> > > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > + &CLUSTER_SLEEP_0>;
> > > next-level-cache = <&L2_200>;
> > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > #cooling-cells = <2>;
> > > @@ -96,6 +102,8 @@ CPU3: cpu@300 {
> > > compatible = "qcom,kryo485";
> > > reg = <0x0 0x300>;
> > > enable-method = "psci";
> > > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > + &CLUSTER_SLEEP_0>;
> > > next-level-cache = <&L2_300>;
> > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > #cooling-cells = <2>;
> > > @@ -110,6 +118,8 @@ CPU4: cpu@400 {
> > > compatible = "qcom,kryo485";
> > > reg = <0x0 0x400>;
> > > enable-method = "psci";
> > > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > + &CLUSTER_SLEEP_0>;
> > > next-level-cache = <&L2_400>;
> > > qcom,freq-domain = <&cpufreq_hw 1>;
> > > #cooling-cells = <2>;
> > > @@ -124,6 +134,8 @@ CPU5: cpu@500 {
> > > compatible = "qcom,kryo485";
> > > reg = <0x0 0x500>;
> > > enable-method = "psci";
> > > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > + &CLUSTER_SLEEP_0>;
> > > next-level-cache = <&L2_500>;
> > > qcom,freq-domain = <&cpufreq_hw 1>;
> > > #cooling-cells = <2>;
> > > @@ -138,6 +150,8 @@ CPU6: cpu@600 {
> > > compatible = "qcom,kryo485";
> > > reg = <0x0 0x600>;
> > > enable-method = "psci";
> > > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > + &CLUSTER_SLEEP_0>;
> > > next-level-cache = <&L2_600>;
> > > qcom,freq-domain = <&cpufreq_hw 1>;
> > > #cooling-cells = <2>;
> > > @@ -152,6 +166,8 @@ CPU7: cpu@700 {
> > > compatible = "qcom,kryo485";
> > > reg = <0x0 0x700>;
> > > enable-method = "psci";
> > > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > + &CLUSTER_SLEEP_0>;
> > > next-level-cache = <&L2_700>;
> > > qcom,freq-domain = <&cpufreq_hw 2>;
> > > #cooling-cells = <2>;
> > > @@ -196,6 +212,40 @@ core7 {
> > > };
> > > };
> > > };
> > > +
> > > + idle-states {
> > > + entry-method = "psci";
> > > +
> > > + LITTLE_CPU_SLEEP_0: cpu-sleep-0-0 {
> > > + compatible = "arm,idle-state";
> > > + idle-state-name = "little-rail-power-collapse";
> > > + arm,psci-suspend-param = <0x40000004>;
> > > + entry-latency-us = <355>;
> > > + exit-latency-us = <909>;
> > > + min-residency-us = <3934>;
> > > + local-timer-stop;
> > > + };
> > > +
> > > + BIG_CPU_SLEEP_0: cpu-sleep-1-0 {
> > > + compatible = "arm,idle-state";
> > > + idle-state-name = "big-rail-power-collapse";
> > > + arm,psci-suspend-param = <0x40000004>;
> > > + entry-latency-us = <241>;
> > > + exit-latency-us = <1461>;
> > > + min-residency-us = <4488>;
> > > + local-timer-stop;
> > > + };
> > > +
> > > + CLUSTER_SLEEP_0: cluster-sleep-0 {
> > > + compatible = "arm,idle-state";
> > > + idle-state-name = "cluster-power-collapse";
> > > + arm,psci-suspend-param = <0x400000F4>;
> >
> > How come this is 0xf4?
> >
> > Isn't downstream saying that this should be either 0x1 << 4 or 0xc24 <<
> > 4, depending on how deep we want to go? Could we at least mention why
> > this is 0xf4?
>
> I'm not sure where 0x400000F4 originally came from. I noticed that
> sdm845 uses the same 0xc24 mode in downstream, but Qualcomm used
> 0x400000F4 in mainline.
>
> I did some testing on a downstream kernel and found that the real value
> it uses on sm8150 is 0x4100c244, but the idle state doesn't work at all
> if I use the same value on mainline. The logic appears to be the same in
> the downstream sdm845 kernel. Maybe it has to do with how downstream has
> "notify RPM" before attempting to enter the idle state?
>
> In downstream, the final PSCI value is calculated as the sum of:
>
> 1. (cluster-mode & cluster-mode-mask) << cluster-mode-shift = (0xc24 & 0xfff) << 4 = 0xc240
> 2. (is-reset << 30) = 0x40000000
> 3. (affinity level & 0x3) << 24 = 0x1000000
> 4. (cpu-mode) = 0x4
>
> so 0xc240 + 0x40000000 + 0x1000000 + 0x4 = 0x4100c244.
>
> It's also possible that the problem comes from the cluster idle state
> needing all CPUs in the cluster to be asleep (as far as I know), since
> it doesn't look like mainline handles that.
>
Thanks for the explanation. I believe we have the code in place to do
OSI sleep using the "psci domain cpuidle" driver, but I'm not entirely
sure about the details about it - perhaps it's just a matter of wiring
it all up(?).

Let's go with your current patches and then swing back to this once
we've figured out the remaining details.

Thanks,
Bjorn
> >
> > Regards,
> > Bjorn
> >
> > > + entry-latency-us = <3263>;
> > > + exit-latency-us = <6562>;
> > > + min-residency-us = <9987>;
> > > + local-timer-stop;
> > > + };
> > > + };
> > > };
> > >
> > > firmware {
> > > --
> > > 2.29.2
> > >
> >
* Re: [PATCH 2/3] arm64: dts: qcom: sm8150: Add PSCI idle states
2020-12-28 18:02 ` Bjorn Andersson
@ 2020-12-29 23:19 ` Danny Lin
2021-01-04 4:44 ` Bjorn Andersson
0 siblings, 1 reply; 10+ messages in thread
From: Danny Lin @ 2020-12-29 23:19 UTC (permalink / raw)
To: Bjorn Andersson
Cc: Andy Gross, Rob Herring, linux-arm-msm, devicetree, linux-kernel

On Mon, Dec 28, 2020 at 10:02 AM, Bjorn Andersson wrote:
> On Tue 22 Dec 20:00 CST 2020, Danny Lin wrote:
>
> > On Sun, Dec 20, 2020 at 7:48 PM, Bjorn Andersson wrote:
> > > On Sun 20 Dec 16:29 PST 2020, Danny Lin wrote:
> > >
> > > > Like other Qualcomm SoCs, sm8150 exposes CPU and cluster idle states
> > > > through PSCI. Define the idle states to save power when the CPU is not
> > > > in active use.
> > > >
> > > > These idle states, latency, and residency values match the downstream
> > > > 4.14 kernel from Qualcomm as of LA.UM.8.1.r1-15600-sm8150.0.
> > > >
> > > > It's worth noting that the CPU has an additional C3 power collapse idle
> > > > state between WFI and rail power collapse (with PSCI mode 0x40000003),
> > > > but it is not officially used in downstream kernels due to "thermal
> > > > throttling issues."
> > > >
> > >
> > > Thanks Danny for this series, very happy to see this kind of additions.
> > > Just one small question about the cluster param below.
> > >
> > > > Signed-off-by: Danny Lin <danny@kdrag0n.dev>
> > > > ---
> > > > arch/arm64/boot/dts/qcom/sm8150.dtsi | 50 ++++++++++++++++++++++++++++
> > > > 1 file changed, 50 insertions(+)
> > > >
> > > > diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > > > index 75ed38ee5d88..edc1fe6d7f1b 100644
> > > > --- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > > > +++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > > > @@ -50,6 +50,8 @@ CPU0: cpu@0 {
> > > > compatible = "qcom,kryo485";
> > > > reg = <0x0 0x0>;
> > > > enable-method = "psci";
> > > > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > > + &CLUSTER_SLEEP_0>;
> > > > next-level-cache = <&L2_0>;
> > > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > > #cooling-cells = <2>;
> > > > @@ -67,6 +69,8 @@ CPU1: cpu@100 {
> > > > compatible = "qcom,kryo485";
> > > > reg = <0x0 0x100>;
> > > > enable-method = "psci";
> > > > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > > + &CLUSTER_SLEEP_0>;
> > > > next-level-cache = <&L2_100>;
> > > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > > #cooling-cells = <2>;
> > > > @@ -82,6 +86,8 @@ CPU2: cpu@200 {
> > > > compatible = "qcom,kryo485";
> > > > reg = <0x0 0x200>;
> > > > enable-method = "psci";
> > > > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > > + &CLUSTER_SLEEP_0>;
> > > > next-level-cache = <&L2_200>;
> > > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > > #cooling-cells = <2>;
> > > > @@ -96,6 +102,8 @@ CPU3: cpu@300 {
> > > > compatible = "qcom,kryo485";
> > > > reg = <0x0 0x300>;
> > > > enable-method = "psci";
> > > > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > > + &CLUSTER_SLEEP_0>;
> > > > next-level-cache = <&L2_300>;
> > > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > > #cooling-cells = <2>;
> > > > @@ -110,6 +118,8 @@ CPU4: cpu@400 {
> > > > compatible = "qcom,kryo485";
> > > > reg = <0x0 0x400>;
> > > > enable-method = "psci";
> > > > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > > + &CLUSTER_SLEEP_0>;
> > > > next-level-cache = <&L2_400>;
> > > > qcom,freq-domain = <&cpufreq_hw 1>;
> > > > #cooling-cells = <2>;
> > > > @@ -124,6 +134,8 @@ CPU5: cpu@500 {
> > > > compatible = "qcom,kryo485";
> > > > reg = <0x0 0x500>;
> > > > enable-method = "psci";
> > > > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > > + &CLUSTER_SLEEP_0>;
> > > > next-level-cache = <&L2_500>;
> > > > qcom,freq-domain = <&cpufreq_hw 1>;
> > > > #cooling-cells = <2>;
> > > > @@ -138,6 +150,8 @@ CPU6: cpu@600 {
> > > > compatible = "qcom,kryo485";
> > > > reg = <0x0 0x600>;
> > > > enable-method = "psci";
> > > > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > > + &CLUSTER_SLEEP_0>;
> > > > next-level-cache = <&L2_600>;
> > > > qcom,freq-domain = <&cpufreq_hw 1>;
> > > > #cooling-cells = <2>;
> > > > @@ -152,6 +166,8 @@ CPU7: cpu@700 {
> > > > compatible = "qcom,kryo485";
> > > > reg = <0x0 0x700>;
> > > > enable-method = "psci";
> > > > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > > + &CLUSTER_SLEEP_0>;
> > > > next-level-cache = <&L2_700>;
> > > > qcom,freq-domain = <&cpufreq_hw 2>;
> > > > #cooling-cells = <2>;
> > > > @@ -196,6 +212,40 @@ core7 {
> > > > };
> > > > };
> > > > };
> > > > +
> > > > + idle-states {
> > > > + entry-method = "psci";
> > > > +
> > > > + LITTLE_CPU_SLEEP_0: cpu-sleep-0-0 {
> > > > + compatible = "arm,idle-state";
> > > > + idle-state-name = "little-rail-power-collapse";
> > > > + arm,psci-suspend-param = <0x40000004>;
> > > > + entry-latency-us = <355>;
> > > > + exit-latency-us = <909>;
> > > > + min-residency-us = <3934>;
> > > > + local-timer-stop;
> > > > + };
> > > > +
> > > > + BIG_CPU_SLEEP_0: cpu-sleep-1-0 {
> > > > + compatible = "arm,idle-state";
> > > > + idle-state-name = "big-rail-power-collapse";
> > > > + arm,psci-suspend-param = <0x40000004>;
> > > > + entry-latency-us = <241>;
> > > > + exit-latency-us = <1461>;
> > > > + min-residency-us = <4488>;
> > > > + local-timer-stop;
> > > > + };
> > > > +
> > > > + CLUSTER_SLEEP_0: cluster-sleep-0 {
> > > > + compatible = "arm,idle-state";
> > > > + idle-state-name = "cluster-power-collapse";
> > > > + arm,psci-suspend-param = <0x400000F4>;
> > >
> > > How come this is 0xf4?
> > >
> > > Isn't downstream saying that this should be either 0x1 << 4 or 0xc24 <<
> > > 4, depending on how deep we want to go? Could we at least mention why
> > > this is 0xf4?
> >
> > I'm not sure where 0x400000F4 originally came from. I noticed that
> > sdm845 uses the same 0xc24 mode in downstream, but Qualcomm used
> > 0x400000F4 in mainline.
> >
> > I did some testing on a downstream kernel and found that the real value
> > it uses on sm8150 is 0x4100c244, but the idle state doesn't work at all
> > if I use the same value on mainline. The logic appears to be the same in
> > the downstream sdm845 kernel. Maybe it has to do with how downstream has
> > "notify RPM" before attempting to enter the idle state?
> >
> > In downstream, the final PSCI value is calculated as the sum of:
> >
> > 1. (cluster-mode & cluster-mode-mask) << cluster-mode-shift = (0xc24 & 0xfff) << 4 = 0xc240
> > 2. (is-reset << 30) = 0x40000000
> > 3. (affinity level & 0x3) << 24 = 0x1000000
> > 4. (cpu-mode) = 0x4
> >
> > so 0xc240 + 0x40000000 + 0x1000000 + 0x4 = 0x4100c244.
> >
> > It's also possible that the problem comes from the cluster idle state
> > needing all CPUs in the cluster to be asleep (as far as I know), since
> > it doesn't look like mainline handles that.
> >
>
> Thanks for the explanation. I believe we have the code in place to do
> OSI sleep using the "psci domain cpuidle" driver, but I'm not entirely
> sure about the details about it - perhaps it's just a matter of wiring
> it all up(?).
>
> Let's go with your current patches and then swing back to this once
> we've figured out the remaining details.

Following your hint, I was able to get cluster idle working using power
domain idle states. The cluster idle state is now successfully using the
same value as downstream with no apparent issues, and individual CPU
idle states are still working. Time spent in the cluster idle state
increases when and only when all CPUs are idle, which matches the
expected behavior.
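
For reference, the downstream value mentioned here is the one derived in
the quoted text above. The following is a minimal sketch that reproduces
that sum, assuming the field layout exactly as described in the thread;
the constant and function names are hypothetical and are not taken from
any kernel header or downstream source file.

```python
# Field layout as described in this thread for the downstream sm8150 kernel.
# All names are illustrative assumptions, not real kernel identifiers.
CLUSTER_MODE_MASK = 0xFFF
CLUSTER_MODE_SHIFT = 4
RESET_SHIFT = 30
AFFINITY_MASK = 0x3
AFFINITY_SHIFT = 24


def compose_psci_param(cluster_mode, is_reset, affinity_level, cpu_mode):
    """Combine the four downstream fields into one PSCI suspend parameter."""
    value = (cluster_mode & CLUSTER_MODE_MASK) << CLUSTER_MODE_SHIFT
    value |= (1 if is_reset else 0) << RESET_SHIFT
    value |= (affinity_level & AFFINITY_MASK) << AFFINITY_SHIFT
    value |= cpu_mode
    return value


# Inputs from the thread: cluster mode 0xc24, reset state, affinity level 1,
# CPU mode 0x4.
downstream = compose_psci_param(0xC24, True, 1, 0x4)
print(hex(downstream))  # 0x4100c244
```

The fields do not overlap for these inputs, so OR-ing them gives the same
result as the sum in the quoted text. Under the same assumed layout,
0x400000F4 would decompose into cluster mode 0xf, affinity level 0 and CPU
mode 0x4, but nothing in this thread confirms that this is how the mainline
value was chosen.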

Should I send a separate patch for it or revise this series? It might be
helpful for future reference to keep a record of how to convert the
current 0xf4 cluster states on modern Qualcomm SoCs in the commit
history.
>
> Thanks,
> Bjorn
>
> > >
> > > Regards,
> > > Bjorn
> > >
> > > > + entry-latency-us = <3263>;
> > > > + exit-latency-us = <6562>;
> > > > + min-residency-us = <9987>;
> > > > + local-timer-stop;
> > > > + };
> > > > + };
> > > > };
> > > >
> > > > firmware {
> > > > --
> > > > 2.29.2
> > > >
> > >
>
* Re: [PATCH 2/3] arm64: dts: qcom: sm8150: Add PSCI idle states
2020-12-29 23:19 ` Danny Lin
@ 2021-01-04 4:44 ` Bjorn Andersson
0 siblings, 0 replies; 10+ messages in thread
From: Bjorn Andersson @ 2021-01-04 4:44 UTC (permalink / raw)
To: Danny Lin
Cc: Andy Gross, Rob Herring, linux-arm-msm, devicetree, linux-kernel
On Tue 29 Dec 17:19 CST 2020, Danny Lin wrote:
> On Mon, Dec 28, 2020 at 10:02 AM, Bjorn Andersson wrote:
> > On Tue 22 Dec 20:00 CST 2020, Danny Lin wrote:
> >
> > > On Sun, Dec 20, 2020 at 7:48 PM, Bjorn Andersson wrote:
> > > > On Sun 20 Dec 16:29 PST 2020, Danny Lin wrote:
> > > >
> > > > > Like other Qualcomm SoCs, sm8150 exposes CPU and cluster idle states
> > > > > through PSCI. Define the idle states to save power when the CPU is not
> > > > > in active use.
> > > > >
> > > > > These idle states, latency, and residency values match the downstream
> > > > > 4.14 kernel from Qualcomm as of LA.UM.8.1.r1-15600-sm8150.0.
> > > > >
> > > > > It's worth noting that the CPU has an additional C3 power collapse idle
> > > > > state between WFI and rail power collapse (with PSCI mode 0x40000003),
> > > > > but it is not officially used in downstream kernels due to "thermal
> > > > > throttling issues."
> > > > >
> > > >
> > > > Thanks Danny for this series, very happy to see this kind of additions.
> > > > Just one small question about the cluster param below.
> > > >
> > > > > Signed-off-by: Danny Lin <danny@kdrag0n.dev>
> > > > > ---
> > > > > arch/arm64/boot/dts/qcom/sm8150.dtsi | 50 ++++++++++++++++++++++++++++
> > > > > 1 file changed, 50 insertions(+)
> > > > >
> > > > > diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > > > > index 75ed38ee5d88..edc1fe6d7f1b 100644
> > > > > --- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > > > > +++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
> > > > > @@ -50,6 +50,8 @@ CPU0: cpu@0 {
> > > > > compatible = "qcom,kryo485";
> > > > > reg = <0x0 0x0>;
> > > > > enable-method = "psci";
> > > > > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > > > + &CLUSTER_SLEEP_0>;
> > > > > next-level-cache = <&L2_0>;
> > > > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > > > #cooling-cells = <2>;
> > > > > @@ -67,6 +69,8 @@ CPU1: cpu@100 {
> > > > > compatible = "qcom,kryo485";
> > > > > reg = <0x0 0x100>;
> > > > > enable-method = "psci";
> > > > > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > > > + &CLUSTER_SLEEP_0>;
> > > > > next-level-cache = <&L2_100>;
> > > > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > > > #cooling-cells = <2>;
> > > > > @@ -82,6 +86,8 @@ CPU2: cpu@200 {
> > > > > compatible = "qcom,kryo485";
> > > > > reg = <0x0 0x200>;
> > > > > enable-method = "psci";
> > > > > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > > > + &CLUSTER_SLEEP_0>;
> > > > > next-level-cache = <&L2_200>;
> > > > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > > > #cooling-cells = <2>;
> > > > > @@ -96,6 +102,8 @@ CPU3: cpu@300 {
> > > > > compatible = "qcom,kryo485";
> > > > > reg = <0x0 0x300>;
> > > > > enable-method = "psci";
> > > > > + cpu-idle-states = <&LITTLE_CPU_SLEEP_0
> > > > > + &CLUSTER_SLEEP_0>;
> > > > > next-level-cache = <&L2_300>;
> > > > > qcom,freq-domain = <&cpufreq_hw 0>;
> > > > > #cooling-cells = <2>;
> > > > > @@ -110,6 +118,8 @@ CPU4: cpu@400 {
> > > > > compatible = "qcom,kryo485";
> > > > > reg = <0x0 0x400>;
> > > > > enable-method = "psci";
> > > > > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > > > + &CLUSTER_SLEEP_0>;
> > > > > next-level-cache = <&L2_400>;
> > > > > qcom,freq-domain = <&cpufreq_hw 1>;
> > > > > #cooling-cells = <2>;
> > > > > @@ -124,6 +134,8 @@ CPU5: cpu@500 {
> > > > > compatible = "qcom,kryo485";
> > > > > reg = <0x0 0x500>;
> > > > > enable-method = "psci";
> > > > > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > > > + &CLUSTER_SLEEP_0>;
> > > > > next-level-cache = <&L2_500>;
> > > > > qcom,freq-domain = <&cpufreq_hw 1>;
> > > > > #cooling-cells = <2>;
> > > > > @@ -138,6 +150,8 @@ CPU6: cpu@600 {
> > > > > compatible = "qcom,kryo485";
> > > > > reg = <0x0 0x600>;
> > > > > enable-method = "psci";
> > > > > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > > > + &CLUSTER_SLEEP_0>;
> > > > > next-level-cache = <&L2_600>;
> > > > > qcom,freq-domain = <&cpufreq_hw 1>;
> > > > > #cooling-cells = <2>;
> > > > > @@ -152,6 +166,8 @@ CPU7: cpu@700 {
> > > > > compatible = "qcom,kryo485";
> > > > > reg = <0x0 0x700>;
> > > > > enable-method = "psci";
> > > > > + cpu-idle-states = <&BIG_CPU_SLEEP_0
> > > > > + &CLUSTER_SLEEP_0>;
> > > > > next-level-cache = <&L2_700>;
> > > > > qcom,freq-domain = <&cpufreq_hw 2>;
> > > > > #cooling-cells = <2>;
> > > > > @@ -196,6 +212,40 @@ core7 {
> > > > > };
> > > > > };
> > > > > };
> > > > > +
> > > > > + idle-states {
> > > > > + entry-method = "psci";
> > > > > +
> > > > > + LITTLE_CPU_SLEEP_0: cpu-sleep-0-0 {
> > > > > + compatible = "arm,idle-state";
> > > > > + idle-state-name = "little-rail-power-collapse";
> > > > > + arm,psci-suspend-param = <0x40000004>;
> > > > > + entry-latency-us = <355>;
> > > > > + exit-latency-us = <909>;
> > > > > + min-residency-us = <3934>;
> > > > > + local-timer-stop;
> > > > > + };
> > > > > +
> > > > > + BIG_CPU_SLEEP_0: cpu-sleep-1-0 {
> > > > > + compatible = "arm,idle-state";
> > > > > + idle-state-name = "big-rail-power-collapse";
> > > > > + arm,psci-suspend-param = <0x40000004>;
> > > > > + entry-latency-us = <241>;
> > > > > + exit-latency-us = <1461>;
> > > > > + min-residency-us = <4488>;
> > > > > + local-timer-stop;
> > > > > + };
> > > > > +
> > > > > + CLUSTER_SLEEP_0: cluster-sleep-0 {
> > > > > + compatible = "arm,idle-state";
> > > > > + idle-state-name = "cluster-power-collapse";
> > > > > + arm,psci-suspend-param = <0x400000F4>;
> > > >
> > > > How come this is 0xf4?
> > > >
> > > > Isn't downstream saying that this should be either 0x1 << 4 or 0xc24 <<
> > > > 4, depending on how deep we want to go? Could we at least mention why
> > > > this is 0xf4?
> > >
> > > I'm not sure where 0x400000F4 originally came from. I noticed that
> > > sdm845 uses the same 0xc24 mode in downstream, but Qualcomm used
> > > 0x400000F4 in mainline.
> > >
> > > I did some testing on a downstream kernel and found that the real value
> > > it uses on sm8150 is 0x4100c244, but the idle state doesn't work at all
> > > if I use the same value on mainline. The logic appears to be the same in
> > > the downstream sdm845 kernel. Maybe it has to do with how downstream has
> > > "notify RPM" before attempting to enter the idle state?
> > >
> > > In downstream, the final PSCI value is calculated as the sum of:
> > >
> > > 1. (cluster-mode & cluster-mode-mask) << cluster-mode-shift = (0xc24 & 0xfff) << 4 = 0xc240
> > > 2. (is-reset << 30) = 0x40000000
> > > 3. (affinity level & 0x3) << 24 = 0x1000000
> > > 4. (cpu-mode) = 0x4
> > >
> > > so 0xc240 + 0x40000000 + 0x1000000 + 0x4 = 0x4100c244.
> > >
> > > It's also possible that the problem comes from the cluster idle state
> > > needing all CPUs in the cluster to be asleep (as far as I know), since
> > > it doesn't look like mainline handles that.
> > >
> >
> > Thanks for the explanation. I believe we have the code in place to do
> > OSI sleep using the "psci domain cpuidle" driver, but I'm not entirely
> > sure about the details about it - perhaps it's just a matter of wiring
> > it all up(?).
> >
> > Let's go with your current patches and then swing back to this once
> > we've figured out the remaining details.
>
> Following your hint, I was able to get cluster idle working using power
> domain idle states. The cluster idle state is now successfully using the
> same value as downstream with no apparent issues, and individual CPU
> idle states are still working. Time spent in the cluster idle state
> increases when and only when all CPUs are idle, which matches the
> expected behavior.
>
Really interesting, thanks for pursuing this!

> Should I send a separate patch for it or revise this series? It might be
> helpful for future reference to keep a record of how to convert the
> current 0xf4 cluster states on modern Qualcomm SoCs in the commit
> history.
>
I did go ahead and merge this series last week, and I like the idea of
"documenting" the difference, so please send this as separate
patch(es).

Thanks,
Bjorn
> >
> > Thanks,
> > Bjorn
> >
> > > >
> > > > Regards,
> > > > Bjorn
> > > >
> > > > > + entry-latency-us = <3263>;
> > > > > + exit-latency-us = <6562>;
> > > > > + min-residency-us = <9987>;
> > > > > + local-timer-stop;
> > > > > + };
> > > > > + };
> > > > > };
> > > > >
> > > > > firmware {
> > > > > --
> > > > > 2.29.2
> > > > >
> > > >
> >
* Re: [PATCH 0/3] CPU power management for SM8150
2020-12-21 0:29 [PATCH 0/3] CPU power management for SM8150 Danny Lin
` (2 preceding siblings ...)
2020-12-21 0:29 ` [PATCH 3/3] arm64: dts: qcom: sm8150: Add CPU capacities and energy model Danny Lin
@ 2021-01-06 3:50 ` patchwork-bot+linux-arm-msm
3 siblings, 0 replies; 10+ messages in thread
From: patchwork-bot+linux-arm-msm @ 2021-01-06 3:50 UTC (permalink / raw)
To: Danny Lin; +Cc: linux-arm-msm
Hello:
This series was applied to qcom/linux.git (refs/heads/for-next):
On Sun, 20 Dec 2020 16:29:04 -0800 you wrote:
> These patches add support for high-level CPU power management on the
> SM8150 platform. cpuidle and energy-aware scheduling are now working
> with the new idle states and CPU energy model.
>
> Danny Lin (3):
> arm64: dts: qcom: sm8150: Define CPU topology
> arm64: dts: qcom: sm8150: Add PSCI idle states
> arm64: dts: qcom: sm8150: Add CPU capacities and energy model
>
> [...]
Here is the summary with links:
  - [1/3] arm64: dts: qcom: sm8150: Define CPU topology
    https://git.kernel.org/qcom/c/066d21bcf605
  - [2/3] arm64: dts: qcom: sm8150: Add PSCI idle states
    https://git.kernel.org/qcom/c/81188f585d02
  - [3/3] arm64: dts: qcom: sm8150: Add CPU capacities and energy model
    https://git.kernel.org/qcom/c/5b2dae72187d

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
Thread overview: 10+ messages
2020-12-21 0:29 [PATCH 0/3] CPU power management for SM8150 Danny Lin
2020-12-21 0:29 ` [PATCH 1/3] arm64: dts: qcom: sm8150: Define CPU topology Danny Lin
2020-12-21 0:29 ` [PATCH 2/3] arm64: dts: qcom: sm8150: Add PSCI idle states Danny Lin
2020-12-21 3:48 ` Bjorn Andersson
2020-12-23 2:00 ` Danny Lin
2020-12-28 18:02 ` Bjorn Andersson
2020-12-29 23:19 ` Danny Lin
2021-01-04 4:44 ` Bjorn Andersson
2020-12-21 0:29 ` [PATCH 3/3] arm64: dts: qcom: sm8150: Add CPU capacities and energy model Danny Lin
2021-01-06 3:50 ` [PATCH 0/3] CPU power management for SM8150 patchwork-bot+linux-arm-msm