Add support for thermal throttling on SDM845. We introduce a generic .ready callback to be used by cpufreq drivers to register as a thermal cooling device. If this approach is acceptable I can send a series converting other cpufreq drivers to use this callback. Amit Kucheria (7): drivers: thermal: of-thermal: Print name of device node with error drivers: cpufreq: Add thermal_cooling_device pointer to struct cpufreq_policy cpu_cooling: Add generic driver ready callback cpufreq: qcom-hw: Move to device_initcall cpufreq: qcom-hw: Register as a cpufreq cooling device arm64: dts: sdm845: Increase alert trip point to 95 degrees arm64: dts: sdm845: wireup the thermal trip points to cpufreq arch/arm64/boot/dts/qcom/sdm845.dtsi | 161 +++++++++++++++++++++++++-- drivers/cpufreq/qcom-cpufreq-hw.c | 7 +- drivers/thermal/cpu_cooling.c | 18 +++ drivers/thermal/of-thermal.c | 4 +- include/linux/cpu_cooling.h | 9 ++ include/linux/cpufreq.h | 2 + 6 files changed, 190 insertions(+), 11 deletions(-) -- 2.17.1
Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> --- drivers/thermal/of-thermal.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c index 4bfdb4a1e47d..5ca2a6b370ea 100644 --- a/drivers/thermal/of-thermal.c +++ b/drivers/thermal/of-thermal.c @@ -867,14 +867,14 @@ __init *thermal_of_build_thermal_zone(struct device_node *np) ret = of_property_read_u32(np, "polling-delay-passive", &prop); if (ret < 0) { - pr_err("missing polling-delay-passive property\n"); + pr_err("%s: missing polling-delay-passive property\n", np->name); goto free_tz; } tz->passive_delay = prop; ret = of_property_read_u32(np, "polling-delay", &prop); if (ret < 0) { - pr_err("missing polling-delay property\n"); + pr_err("%s: missing polling-delay property\n", np->name); goto free_tz; } tz->polling_delay = prop; -- 2.17.1
Several cpufreq drivers register themselves as thermal cooling devices. Adding a pointer to struct cpufreq_policy removes the need for them to store this pointer in a private data structure. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> --- include/linux/cpufreq.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index c86d6d8bdfed..2496549d7573 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -95,6 +95,8 @@ struct cpufreq_policy { struct cpufreq_frequency_table *freq_table; enum cpufreq_table_sorting freq_table_sorted; + struct thermal_cooling_device *cooldev; /* Pointer to the cooling + * device if used for thermal mitigation */ struct list_head policy_list; struct kobject kobj; struct completion kobj_unregister; -- 2.17.1
All cpufreq drivers do similar things to register as a cooling device. Provide a generic call back so we can get rid of duplicated code in the drivers. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> Suggested-by: Stephen Boyd <swboyd@chromium.org> --- drivers/thermal/cpu_cooling.c | 18 ++++++++++++++++++ include/linux/cpu_cooling.h | 9 +++++++++ 2 files changed, 27 insertions(+) diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c index dfd23245f778..5154ffc12332 100644 --- a/drivers/thermal/cpu_cooling.c +++ b/drivers/thermal/cpu_cooling.c @@ -694,6 +694,7 @@ __cpufreq_cooling_register(struct device_node *np, cpufreq_cdev->clipped_freq = cpufreq_cdev->freq_table[0].frequency; cpufreq_cdev->cdev = cdev; + policy->cooldev = cdev; mutex_lock(&cooling_list_lock); /* Register the notifier for first cpufreq cooling device */ @@ -785,6 +786,23 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy) } EXPORT_SYMBOL_GPL(of_cpufreq_cooling_register); +/** + * generic_cpufreq_ready - generic driver callback to register a thermal_cooling_device + * @policy: cpufreq policy + */ +void generic_cpufreq_ready(struct cpufreq_policy *policy) +{ + struct thermal_cooling_device **cdev = &policy->cooldev; + + *cdev = of_cpufreq_cooling_register(policy); + if (IS_ERR(*cdev)) { + pr_err("Failed to register cpufreq cooling device for CPU%d: %ld\n", + policy->cpu, PTR_ERR(cdev)); + cdev = NULL; + } +} +EXPORT_SYMBOL_GPL(generic_cpufreq_ready); + /** * cpufreq_cooling_unregister - function to remove cpufreq cooling device. * @cdev: thermal cooling device pointer. diff --git a/include/linux/cpu_cooling.h b/include/linux/cpu_cooling.h index de0dafb9399d..d7815bb2967a 100644 --- a/include/linux/cpu_cooling.h +++ b/include/linux/cpu_cooling.h @@ -65,12 +65,21 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev) */ struct thermal_cooling_device * of_cpufreq_cooling_register(struct cpufreq_policy *policy); + +/** + * generic_cpufreq_ready - generic driver callback to register a thermal_cooling_device + * @policy: cpufreq policy + */ +void generic_cpufreq_ready(struct cpufreq_policy *policy); #else static inline struct thermal_cooling_device * of_cpufreq_cooling_register(struct cpufreq_policy *policy) { return NULL; } + +void generic_cpufreq_ready(struct cpufreq_policy *policy) {} + #endif /* defined(CONFIG_THERMAL_OF) && defined(CONFIG_CPU_THERMAL) */ #endif /* __CPU_COOLING_H__ */ -- 2.17.1
subsys_initcall causes problems registering the driver as a thermal cooling device. If "faster boot" is the main reason for doing subsys_initcall, this should be handled in the bootloader or another boot constraint framework. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> --- drivers/cpufreq/qcom-cpufreq-hw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c index d83939a1b3d4..649dddd72749 100644 --- a/drivers/cpufreq/qcom-cpufreq-hw.c +++ b/drivers/cpufreq/qcom-cpufreq-hw.c @@ -296,7 +296,7 @@ static int __init qcom_cpufreq_hw_init(void) { return platform_driver_register(&qcom_cpufreq_hw_driver); } -subsys_initcall(qcom_cpufreq_hw_init); +device_initcall(qcom_cpufreq_hw_init); static void __exit qcom_cpufreq_hw_exit(void) { -- 2.17.1
Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> --- drivers/cpufreq/qcom-cpufreq-hw.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c index 649dddd72749..1c01311e5927 100644 --- a/drivers/cpufreq/qcom-cpufreq-hw.c +++ b/drivers/cpufreq/qcom-cpufreq-hw.c @@ -5,6 +5,7 @@ #include <linux/bitfield.h> #include <linux/cpufreq.h> +#include <linux/cpu_cooling.h> #include <linux/init.h> #include <linux/kernel.h> #include <linux/module.h> @@ -216,7 +217,10 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy) static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy) { void __iomem *base = policy->driver_data - REG_PERF_STATE; + struct thermal_cooling_device *cdev = policy->cooldev; + if (cdev) + cpufreq_cooling_unregister(cdev); kfree(policy->freq_table); devm_iounmap(&global_pdev->dev, base); @@ -238,6 +242,7 @@ static struct cpufreq_driver cpufreq_qcom_hw_driver = { .init = qcom_cpufreq_hw_cpu_init, .exit = qcom_cpufreq_hw_cpu_exit, .fast_switch = qcom_cpufreq_hw_fast_switch, + .ready = generic_cpufreq_ready, .name = "qcom-cpufreq-hw", .attr = qcom_cpufreq_hw_attr, }; -- 2.17.1
75 degrees is too aggressive for throttling the CPU. After speaking to Qualcomm engineers, increase it to 95 degrees. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> --- arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi index c27cbd3bcb0a..29e823b0caf4 100644 --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi @@ -1692,7 +1692,7 @@ trips { cpu_alert0: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; @@ -1713,7 +1713,7 @@ trips { cpu_alert1: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; @@ -1734,7 +1734,7 @@ trips { cpu_alert2: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; @@ -1755,7 +1755,7 @@ trips { cpu_alert3: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; @@ -1776,7 +1776,7 @@ trips { cpu_alert4: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; @@ -1797,7 +1797,7 @@ trips { cpu_alert5: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; @@ -1818,7 +1818,7 @@ trips { cpu_alert6: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; @@ -1839,7 +1839,7 @@ trips { cpu_alert7: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; -- 2.17.1
Since the big and little cpus are in the same frequency domain, use all of them for mitigation in the cooling-map. At the lower trip points we restrict ourselves to throttling only a few OPPs. At higher trip temperatures, allow ourselves to be throttled to any extent. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> --- arch/arm64/boot/dts/qcom/sdm845.dtsi | 145 +++++++++++++++++++++++++++ 1 file changed, 145 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi index 29e823b0caf4..cd6402a9aa64 100644 --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi @@ -13,6 +13,7 @@ #include <dt-bindings/reset/qcom,sdm845-aoss.h> #include <dt-bindings/soc/qcom,rpmh-rsc.h> #include <dt-bindings/clock/qcom,gcc-sdm845.h> +#include <dt-bindings/thermal/thermal.h> / { interrupt-parent = <&intc>; @@ -99,6 +100,7 @@ compatible = "qcom,kryo385"; reg = <0x0 0x0>; enable-method = "psci"; + #cooling-cells = <2>; next-level-cache = <&L2_0>; L2_0: l2-cache { compatible = "cache"; @@ -114,6 +116,7 @@ compatible = "qcom,kryo385"; reg = <0x0 0x100>; enable-method = "psci"; + #cooling-cells = <2>; next-level-cache = <&L2_100>; L2_100: l2-cache { compatible = "cache"; @@ -126,6 +129,7 @@ compatible = "qcom,kryo385"; reg = <0x0 0x200>; enable-method = "psci"; + #cooling-cells = <2>; next-level-cache = <&L2_200>; L2_200: l2-cache { compatible = "cache"; @@ -138,6 +142,7 @@ compatible = "qcom,kryo385"; reg = <0x0 0x300>; enable-method = "psci"; + #cooling-cells = <2>; next-level-cache = <&L2_300>; L2_300: l2-cache { compatible = "cache"; @@ -150,6 +155,7 @@ compatible = "qcom,kryo385"; reg = <0x0 0x400>; enable-method = "psci"; + #cooling-cells = <2>; next-level-cache = <&L2_400>; L2_400: l2-cache { compatible = "cache"; @@ -162,6 +168,7 @@ compatible = "qcom,kryo385"; reg = <0x0 0x500>; enable-method = "psci"; + #cooling-cells = <2>; next-level-cache = <&L2_500>; L2_500: l2-cache { compatible = "cache"; @@ -174,6 +181,7 @@ compatible = "qcom,kryo385"; reg = <0x0 0x600>; enable-method = "psci"; + #cooling-cells = <2>; next-level-cache = <&L2_600>; L2_600: l2-cache { compatible = "cache"; @@ -186,6 +194,7 @@ compatible = "qcom,kryo385"; reg = <0x0 0x700>; enable-method = "psci"; + #cooling-cells = <2>; next-level-cache = <&L2_700>; L2_700: l2-cache { compatible = "cache"; @@ -1703,6 +1712,23 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <&cpu_alert0>; + cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>, + <&CPU1 THERMAL_NO_LIMIT 4>, + <&CPU2 THERMAL_NO_LIMIT 4>, + <&CPU3 THERMAL_NO_LIMIT 4>; + }; + map1 { + trip = <&cpu_crit0>; + cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; cpu1-thermal { @@ -1724,6 +1750,23 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <&cpu_alert1>; + cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>, + <&CPU1 THERMAL_NO_LIMIT 4>, + <&CPU2 THERMAL_NO_LIMIT 4>, + <&CPU3 THERMAL_NO_LIMIT 4>; + }; + map1 { + trip = <&cpu_crit1>; + cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; cpu2-thermal { @@ -1745,6 +1788,23 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <&cpu_alert2>; + cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>, + <&CPU1 THERMAL_NO_LIMIT 4>, + <&CPU2 THERMAL_NO_LIMIT 4>, + <&CPU3 THERMAL_NO_LIMIT 4>; + }; + map1 { + trip = <&cpu_crit2>; + cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; cpu3-thermal { @@ -1766,6 +1826,23 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <&cpu_alert3>; + cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>, + <&CPU1 THERMAL_NO_LIMIT 4>, + <&CPU2 THERMAL_NO_LIMIT 4>, + <&CPU3 THERMAL_NO_LIMIT 4>; + }; + map1 { + trip = <&cpu_crit3>; + cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; cpu4-thermal { @@ -1787,6 +1864,23 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <&cpu_alert4>; + cooling-device = <&CPU4 THERMAL_NO_LIMIT 4>, + <&CPU5 THERMAL_NO_LIMIT 4>, + <&CPU6 THERMAL_NO_LIMIT 4>, + <&CPU7 THERMAL_NO_LIMIT 4>; + }; + map1 { + trip = <&cpu_crit4>; + cooling-device = <&CPU4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; cpu5-thermal { @@ -1808,6 +1902,23 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <&cpu_alert5>; + cooling-device = <&CPU4 THERMAL_NO_LIMIT 4>, + <&CPU5 THERMAL_NO_LIMIT 4>, + <&CPU6 THERMAL_NO_LIMIT 4>, + <&CPU7 THERMAL_NO_LIMIT 4>; + }; + map1 { + trip = <&cpu_crit5>; + cooling-device = <&CPU4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; cpu6-thermal { @@ -1829,6 +1940,23 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <&cpu_alert6>; + cooling-device = <&CPU4 THERMAL_NO_LIMIT 4>, + <&CPU5 THERMAL_NO_LIMIT 4>, + <&CPU6 THERMAL_NO_LIMIT 4>, + <&CPU7 THERMAL_NO_LIMIT 4>; + }; + map1 { + trip = <&cpu_crit6>; + cooling-device = <&CPU4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; cpu7-thermal { @@ -1850,6 +1978,23 @@ type = "critical"; }; }; + + cooling-maps { + map0 { + trip = <&cpu_alert7>; + cooling-device = <&CPU4 THERMAL_NO_LIMIT 4>, + <&CPU5 THERMAL_NO_LIMIT 4>, + <&CPU6 THERMAL_NO_LIMIT 4>, + <&CPU7 THERMAL_NO_LIMIT 4>; + }; + map1 { + trip = <&cpu_crit7>; + cooling-device = <&CPU4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, + <&CPU7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; + }; + }; }; }; }; -- 2.17.1
Quoting Amit Kucheria (2019-01-09 16:00:50)
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> ---
> drivers/thermal/of-thermal.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c
> index 4bfdb4a1e47d..5ca2a6b370ea 100644
> --- a/drivers/thermal/of-thermal.c
> +++ b/drivers/thermal/of-thermal.c
> @@ -867,14 +867,14 @@ __init *thermal_of_build_thermal_zone(struct device_node *np)
>
> ret = of_property_read_u32(np, "polling-delay-passive", &prop);
> if (ret < 0) {
> - pr_err("missing polling-delay-passive property\n");
> + pr_err("%s: missing polling-delay-passive property\n", np->name);
You should use %pOFn to print node names.
Quoting Amit Kucheria (2019-01-09 16:00:56) > Since the big and little cpus are in the same frequency domain, use all Oh? I thought the big and little cpus were in different frequency domains and voltage domains. Maybe that's what you're saying here but I'm misunderstanding. So change the wording a bit to be more clear? > of them for mitigation in the cooling-map. At the lower trip points we > restrict ourselves to throttling only a few OPPs. At higher trip > temperatures, allow ourselves to be throttled to any extent. > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Quoting Amit Kucheria (2019-01-09 16:00:55)
> 75 degrees is too aggressive for throttling the CPU. After speaking to
> Qualcomm engineers, increase it to 95 degrees.
>
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> ---
> arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------
Is the plan that these are some defaults that would be adjusted by board
variants? Just curious why we have anything in here and don't punt it
all to each board dts file.
On Thu, Jan 10, 2019 at 05:30:51AM +0530, Amit Kucheria wrote:
> Several cpufreq drivers register themselves as thermal cooling devices.
> Adding a pointer to struct cpufreq_policy removes the need for them to
> store this pointer in a private data structure.
>
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> ---
> include/linux/cpufreq.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index c86d6d8bdfed..2496549d7573 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -95,6 +95,8 @@ struct cpufreq_policy {
> struct cpufreq_frequency_table *freq_table;
> enum cpufreq_table_sorting freq_table_sorted;
>
> + struct thermal_cooling_device *cooldev; /* Pointer to the cooling
> + * device if used for thermal mitigation */
> struct list_head policy_list;
> struct kobject kobj;
> struct completion kobj_unregister;
I've mixed feelings about this. It's definitely desirable to avoid
code duplication and tying the cooling device to the cpufreq_policy is
a convenient way to achieve that. However semantically it seems a bit
odd that a CPU cooling device is part of the cpufreq policy.
Anyway, unless there are better ideas we probably want to be pragmatic
here, so if Viresh is fine with it who am I to complain ;-)
Cheers
Matthias
Hi Amit, On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote: > 75 degrees is too aggressive for throttling the CPU. After speaking to > Qualcomm engineers, increase it to 95 degrees. > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > --- > arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++-------- > 1 file changed, 8 insertions(+), 8 deletions(-) > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi > index c27cbd3bcb0a..29e823b0caf4 100644 > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > @@ -1692,7 +1692,7 @@ > > trips { > cpu_alert0: trip0 { > - temperature = <75000>; > + temperature = <95000>; > hysteresis = <2000>; > type = "passive"; > }; > @@ -1713,7 +1713,7 @@ > > trips { > cpu_alert1: trip0 { > - temperature = <75000>; > + temperature = <95000>; > hysteresis = <2000>; > type = "passive"; > }; > @@ -1734,7 +1734,7 @@ > > trips { > cpu_alert2: trip0 { > - temperature = <75000>; > + temperature = <95000>; > hysteresis = <2000>; > type = "passive"; > }; > @@ -1755,7 +1755,7 @@ > > trips { > cpu_alert3: trip0 { > - temperature = <75000>; > + temperature = <95000>; > hysteresis = <2000>; > type = "passive"; > }; > @@ -1776,7 +1776,7 @@ > > trips { > cpu_alert4: trip0 { > - temperature = <75000>; > + temperature = <95000>; > hysteresis = <2000>; > type = "passive"; > }; > @@ -1797,7 +1797,7 @@ > > trips { > cpu_alert5: trip0 { > - temperature = <75000>; > + temperature = <95000>; > hysteresis = <2000>; > type = "passive"; > }; > @@ -1818,7 +1818,7 @@ > > trips { > cpu_alert6: trip0 { > - temperature = <75000>; > + temperature = <95000>; > hysteresis = <2000>; > type = "passive"; > }; > @@ -1839,7 +1839,7 @@ > > trips { > cpu_alert7: trip0 { > - temperature = <75000>; > + temperature = <95000>; > hysteresis = <2000>; > type = "passive"; > }; The change itself looks good to me, however I wonder if it would be worth to eliminate redundancy and merge the current 8 thermal zones into 2, one for the Silver and one for the Gold cluster (as done by http://crrev.com/c/1381752). There is a single cooling device for each cluster, so it's not clear to me if there is any gain from having a separate thermal zone for each CPU. If it is important to monitor the temperatures of the individual cores this can still be done by configuring the thermal zone of the cluster with multiple thermal sensors. Cheers Matthias
On Wed, Jan 09, 2019 at 05:15:33PM -0800, Matthias Kaehlcke wrote: > Hi Amit, > > On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote: > > 75 degrees is too aggressive for throttling the CPU. After speaking to > > Qualcomm engineers, increase it to 95 degrees. > > > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > > --- > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++-------- > > 1 file changed, 8 insertions(+), 8 deletions(-) > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > index c27cbd3bcb0a..29e823b0caf4 100644 > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > @@ -1692,7 +1692,7 @@ > > > > trips { > > cpu_alert0: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1713,7 +1713,7 @@ > > > > trips { > > cpu_alert1: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1734,7 +1734,7 @@ > > > > trips { > > cpu_alert2: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1755,7 +1755,7 @@ > > > > trips { > > cpu_alert3: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1776,7 +1776,7 @@ > > > > trips { > > cpu_alert4: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1797,7 +1797,7 @@ > > > > trips { > > cpu_alert5: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1818,7 +1818,7 @@ > > > > trips { > > cpu_alert6: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1839,7 +1839,7 @@ > > > > trips { > > cpu_alert7: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > The change itself looks good to me, however I wonder if it would be > worth to eliminate redundancy and merge the current 8 thermal zones > into 2, one for the Silver and one for the Gold cluster (as done by > http://crrev.com/c/1381752). There is a single cooling device for > each cluster, so it's not clear to me if there is any gain from having > a separate thermal zone for each CPU. If it is important to monitor > the temperatures of the individual cores this can still be done by > configuring the thermal zone of the cluster with multiple thermal > sensors. I see your idea is to have a cooling device per CPU ("arm64: dts: sdm845: wireup the thermal trip points to cpufreq" / https://lore.kernel.org/patchwork/patch/1030742/), however that doesn't work as intended. Only two cpufreq 'devices' are created, one for CPU0 and one for CPU4. In consequence cpufreq->ready() only runs for these cores and only two cooling devices are registered. Since the cores of a cluster all run at the same frequency I also doubt if having multiple cooling devices would bring any benefits. Cheers Matthias
Hi Amit, On Thu, Jan 10, 2019 at 05:30:56AM +0530, Amit Kucheria wrote: > Since the big and little cpus are in the same frequency domain, use all > of them for mitigation in the cooling-map. At the lower trip points we > restrict ourselves to throttling only a few OPPs. At higher trip > temperatures, allow ourselves to be throttled to any extent. > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > --- > arch/arm64/boot/dts/qcom/sdm845.dtsi | 145 +++++++++++++++++++++++++++ > 1 file changed, 145 insertions(+) > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi > index 29e823b0caf4..cd6402a9aa64 100644 > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > @@ -13,6 +13,7 @@ > #include <dt-bindings/reset/qcom,sdm845-aoss.h> > #include <dt-bindings/soc/qcom,rpmh-rsc.h> > #include <dt-bindings/clock/qcom,gcc-sdm845.h> > +#include <dt-bindings/thermal/thermal.h> > > / { > interrupt-parent = <&intc>; > @@ -99,6 +100,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x0>; > enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <&L2_0>; > L2_0: l2-cache { > compatible = "cache"; > @@ -114,6 +116,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x100>; > enable-method = "psci"; > + #cooling-cells = <2>; This is not needed (also applies to other for other non-policy cores). A single cpufreq device is created per frequency domain / cluster, hence a single cooling device is registered per cluster, which IMO makes sense given that the CPUs of a cluster can't change their frequencies independently. > next-level-cache = <&L2_100>; > L2_100: l2-cache { > compatible = "cache"; > @@ -126,6 +129,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x200>; > enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <&L2_200>; > L2_200: l2-cache { > compatible = "cache"; > @@ -138,6 +142,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x300>; > enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <&L2_300>; > L2_300: l2-cache { > compatible = "cache"; > @@ -150,6 +155,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x400>; > enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <&L2_400>; > L2_400: l2-cache { > compatible = "cache"; > @@ -162,6 +168,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x500>; > enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <&L2_500>; > L2_500: l2-cache { > compatible = "cache"; > @@ -174,6 +181,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x600>; > enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <&L2_600>; > L2_600: l2-cache { > compatible = "cache"; > @@ -186,6 +194,7 @@ > compatible = "qcom,kryo385"; > reg = <0x0 0x700>; > enable-method = "psci"; > + #cooling-cells = <2>; > next-level-cache = <&L2_700>; > L2_700: l2-cache { > compatible = "cache"; > @@ -1703,6 +1712,23 @@ > type = "critical"; > }; > }; > + > + cooling-maps { > + map0 { > + trip = <&cpu_alert0>; > + cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>, > + <&CPU1 THERMAL_NO_LIMIT 4>, As per above, there are no cooling devices for CPU1-3 and CPU5-7. Cheers Matthias
On 10-01-19, 05:30, Amit Kucheria wrote:
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> ---
> drivers/cpufreq/qcom-cpufreq-hw.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index 649dddd72749..1c01311e5927 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -5,6 +5,7 @@
>
> #include <linux/bitfield.h>
> #include <linux/cpufreq.h>
> +#include <linux/cpu_cooling.h>
> #include <linux/init.h>
> #include <linux/kernel.h>
> #include <linux/module.h>
> @@ -216,7 +217,10 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
> static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
> {
> void __iomem *base = policy->driver_data - REG_PERF_STATE;
> + struct thermal_cooling_device *cdev = policy->cooldev;
>
> + if (cdev)
> + cpufreq_cooling_unregister(cdev);
> kfree(policy->freq_table);
> devm_iounmap(&global_pdev->dev, base);
>
> @@ -238,6 +242,7 @@ static struct cpufreq_driver cpufreq_qcom_hw_driver = {
> .init = qcom_cpufreq_hw_cpu_init,
> .exit = qcom_cpufreq_hw_cpu_exit,
> .fast_switch = qcom_cpufreq_hw_fast_switch,
> + .ready = generic_cpufreq_ready,
> .name = "qcom-cpufreq-hw",
> .attr = qcom_cpufreq_hw_attr,
> };
I liked the idea of reducing code duplication, but not much the
implementation. All we were able to get rid of was a call to
of_cpufreq_cooling_register() and nothing else. Is it worth it ?
Maybe we can add another flag in cpufreq.h:
#define CPUFREQ_AUTO_REGISTER_COOLING_DEV (1 << 7)
and let the core do it all automatically by itself, that will get rid
of code duplication actually.
@Rafael: What do you say ?
--
viresh
On 10-01-19, 05:30, Amit Kucheria wrote: > All cpufreq drivers do similar things to register as a cooling device. > Provide a generic call back so we can get rid of duplicated code in the > drivers. > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > Suggested-by: Stephen Boyd <swboyd@chromium.org> > --- > drivers/thermal/cpu_cooling.c | 18 ++++++++++++++++++ > include/linux/cpu_cooling.h | 9 +++++++++ > 2 files changed, 27 insertions(+) We may be doing this differently, but I found some other issues with the patch. > diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c > index dfd23245f778..5154ffc12332 100644 > --- a/drivers/thermal/cpu_cooling.c > +++ b/drivers/thermal/cpu_cooling.c > @@ -694,6 +694,7 @@ __cpufreq_cooling_register(struct device_node *np, > > cpufreq_cdev->clipped_freq = cpufreq_cdev->freq_table[0].frequency; > cpufreq_cdev->cdev = cdev; > + policy->cooldev = cdev; Why was this required ? The below routine was already doing it... > > mutex_lock(&cooling_list_lock); > /* Register the notifier for first cpufreq cooling device */ > @@ -785,6 +786,23 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy) > } > EXPORT_SYMBOL_GPL(of_cpufreq_cooling_register); > > +/** > + * generic_cpufreq_ready - generic driver callback to register a thermal_cooling_device > + * @policy: cpufreq policy > + */ > +void generic_cpufreq_ready(struct cpufreq_policy *policy) > +{ > + struct thermal_cooling_device **cdev = &policy->cooldev; > + > + *cdev = of_cpufreq_cooling_register(policy); ... here > + if (IS_ERR(*cdev)) { We never get an error here, only NULL. > + pr_err("Failed to register cpufreq cooling device for CPU%d: %ld\n", > + policy->cpu, PTR_ERR(cdev)); The of_cpufreq_cooling_register() routine already prints enough error messages on failures. > + cdev = NULL; This is a meaningless statement, perhaps you wanted to do *cdev = NULL :) -- viresh
On 09-01-19, 18:22, Matthias Kaehlcke wrote: > Hi Amit, > > On Thu, Jan 10, 2019 at 05:30:56AM +0530, Amit Kucheria wrote: > > Since the big and little cpus are in the same frequency domain, use all > > of them for mitigation in the cooling-map. At the lower trip points we > > restrict ourselves to throttling only a few OPPs. At higher trip > > temperatures, allow ourselves to be throttled to any extent. > > > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > > --- > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 145 +++++++++++++++++++++++++++ > > 1 file changed, 145 insertions(+) > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > index 29e823b0caf4..cd6402a9aa64 100644 > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > @@ -13,6 +13,7 @@ > > #include <dt-bindings/reset/qcom,sdm845-aoss.h> > > #include <dt-bindings/soc/qcom,rpmh-rsc.h> > > #include <dt-bindings/clock/qcom,gcc-sdm845.h> > > +#include <dt-bindings/thermal/thermal.h> > > > > / { > > interrupt-parent = <&intc>; > > @@ -99,6 +100,7 @@ > > compatible = "qcom,kryo385"; > > reg = <0x0 0x0>; > > enable-method = "psci"; > > + #cooling-cells = <2>; > > next-level-cache = <&L2_0>; > > L2_0: l2-cache { > > compatible = "cache"; > > @@ -114,6 +116,7 @@ > > compatible = "qcom,kryo385"; > > reg = <0x0 0x100>; > > enable-method = "psci"; > > + #cooling-cells = <2>; > > This is not needed (also applies to other for other non-policy > cores). A single cpufreq device is created per frequency domain / > cluster, hence a single cooling device is registered per cluster, > which IMO makes sense given that the CPUs of a cluster can't change > their frequencies independently. > As per above, there are no cooling devices for CPU1-3 and CPU5-7. lore.kernel.org/lkml/cover.1527244200.git.viresh.kumar@linaro.org lore.kernel.org/lkml/b687bb6035fbb010383f4511a206abb4006679fa.1527244201.git.viresh.kumar@linaro.org -- viresh
On 10-01-19, 05:30, Amit Kucheria wrote:
> subsys_initcall causes problems registering the driver as a thermal
> cooling device.
>
> If "faster boot" is the main reason for doing subsys_initcall, this
> should be handled in the bootloader or another boot constraint
> framework.
>
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> ---
> drivers/cpufreq/qcom-cpufreq-hw.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index d83939a1b3d4..649dddd72749 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -296,7 +296,7 @@ static int __init qcom_cpufreq_hw_init(void)
> {
> return platform_driver_register(&qcom_cpufreq_hw_driver);
> }
> -subsys_initcall(qcom_cpufreq_hw_init);
> +device_initcall(qcom_cpufreq_hw_init);
>
> static void __exit qcom_cpufreq_hw_exit(void)
> {
Applied. Thanks.
--
viresh
On Thu, Jan 10, 2019 at 11:42 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > On 10-01-19, 05:30, Amit Kucheria wrote: > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > > --- > > drivers/cpufreq/qcom-cpufreq-hw.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c > > index 649dddd72749..1c01311e5927 100644 > > --- a/drivers/cpufreq/qcom-cpufreq-hw.c > > +++ b/drivers/cpufreq/qcom-cpufreq-hw.c > > @@ -5,6 +5,7 @@ > > > > #include <linux/bitfield.h> > > #include <linux/cpufreq.h> > > +#include <linux/cpu_cooling.h> > > #include <linux/init.h> > > #include <linux/kernel.h> > > #include <linux/module.h> > > @@ -216,7 +217,10 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy) > > static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy) > > { > > void __iomem *base = policy->driver_data - REG_PERF_STATE; > > + struct thermal_cooling_device *cdev = policy->cooldev; > > > > + if (cdev) > > + cpufreq_cooling_unregister(cdev); > > kfree(policy->freq_table); > > devm_iounmap(&global_pdev->dev, base); > > > > @@ -238,6 +242,7 @@ static struct cpufreq_driver cpufreq_qcom_hw_driver = { > > .init = qcom_cpufreq_hw_cpu_init, > > .exit = qcom_cpufreq_hw_cpu_exit, > > .fast_switch = qcom_cpufreq_hw_fast_switch, > > + .ready = generic_cpufreq_ready, > > .name = "qcom-cpufreq-hw", > > .attr = qcom_cpufreq_hw_attr, > > }; > > I liked the idea of reducing code duplication, but not much the > implementation. All we were able to get rid of was a call to > of_cpufreq_cooling_register() and nothing else. Is it worth it ? > > Maybe we can add another flag in cpufreq.h: > > #define CPUFREQ_AUTO_REGISTER_COOLING_DEV (1 << 7) > > and let the core do it all automatically by itself, that will get rid > of code duplication actually. I like the idea of a flag. I'll spin something implementing it in the next rev. > @Rafael: What do you say ?
On Thu, Jan 10, 2019 at 7:12 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 10-01-19, 05:30, Amit Kucheria wrote:
> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > ---
> > drivers/cpufreq/qcom-cpufreq-hw.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> > index 649dddd72749..1c01311e5927 100644
> > --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> > +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> > @@ -5,6 +5,7 @@
> >
> > #include <linux/bitfield.h>
> > #include <linux/cpufreq.h>
> > +#include <linux/cpu_cooling.h>
> > #include <linux/init.h>
> > #include <linux/kernel.h>
> > #include <linux/module.h>
> > @@ -216,7 +217,10 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
> > static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
> > {
> > void __iomem *base = policy->driver_data - REG_PERF_STATE;
> > + struct thermal_cooling_device *cdev = policy->cooldev;
> >
> > + if (cdev)
> > + cpufreq_cooling_unregister(cdev);
> > kfree(policy->freq_table);
> > devm_iounmap(&global_pdev->dev, base);
> >
> > @@ -238,6 +242,7 @@ static struct cpufreq_driver cpufreq_qcom_hw_driver = {
> > .init = qcom_cpufreq_hw_cpu_init,
> > .exit = qcom_cpufreq_hw_cpu_exit,
> > .fast_switch = qcom_cpufreq_hw_fast_switch,
> > + .ready = generic_cpufreq_ready,
> > .name = "qcom-cpufreq-hw",
> > .attr = qcom_cpufreq_hw_attr,
> > };
>
> I liked the idea of reducing code duplication, but not much the
> implementation. All we were able to get rid of was a call to
> of_cpufreq_cooling_register() and nothing else. Is it worth it ?
>
> Maybe we can add another flag in cpufreq.h:
>
> #define CPUFREQ_AUTO_REGISTER_COOLING_DEV (1 << 7)
>
> and let the core do it all automatically by itself, that will get rid
> of code duplication actually.
>
> @Rafael: What do you say ?
Getting rid of code duplication is good, let's do that.
On Thu, Jan 10, 2019 at 5:58 AM Stephen Boyd <swboyd@chromium.org> wrote: > > Quoting Amit Kucheria (2019-01-09 16:00:56) > > Since the big and little cpus are in the same frequency domain, use all > > Oh? I thought the big and little cpus were in different frequency > domains and voltage domains. Maybe that's what you're saying here but > I'm misunderstanding. So change the wording a bit to be more clear? Yeah, forgive my English - it is my second language. ;-) That should've been "since all the cpus in the big and little clusters are in the same frequency domain". Will fix. > > of them for mitigation in the cooling-map. At the lower trip points we > > restrict ourselves to throttling only a few OPPs. At higher trip > > temperatures, allow ourselves to be throttled to any extent. > > > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> >
Hi,
On Wed, Jan 9, 2019 at 4:29 PM Stephen Boyd <swboyd@chromium.org> wrote:
>
> Quoting Amit Kucheria (2019-01-09 16:00:55)
> > 75 degrees is too aggressive for throttling the CPU. After speaking to
> > Qualcomm engineers, increase it to 95 degrees.
> >
> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > ---
> > arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------
>
> Is the plan that these are some defaults that would be adjusted by board
> variants? Just curious why we have anything in here and don't punt it
> all to each board dts file.
My preference would be that the SoC device tree file should contain
thermal numbers that are important to pay attention to for the safety
/ proper operation of the SoC. ...then individual boards could (if
they needed to) override with lower values to control, for instance,
skin temperature.
From experience with previous boards, if you've got enough an off-SoC
thermistors then those are the ones you'd want to monitor to control
skin temperature. It's OK if the SoC spikes up quite high as long as
that heat has somewhere to go (like a heat pipe). The sensors that
are part of Amit's patch are on-chip.
...if you've got a board without external thermistors and are using
the SoC's on-chip sensors as a proxy for the heat in the overall
system then you might want to lower your values in the board device
tree file. You won't be able to have as many short term spikes, but
that's what you gotta do without the extra sensors.
Sound sane?
-Doug
On Thu, Jan 10, 2019 at 11:53:59AM +0530, Viresh Kumar wrote: > On 09-01-19, 18:22, Matthias Kaehlcke wrote: > > Hi Amit, > > > > On Thu, Jan 10, 2019 at 05:30:56AM +0530, Amit Kucheria wrote: > > > Since the big and little cpus are in the same frequency domain, use all > > > of them for mitigation in the cooling-map. At the lower trip points we > > > restrict ourselves to throttling only a few OPPs. At higher trip > > > temperatures, allow ourselves to be throttled to any extent. > > > > > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > > > --- > > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 145 +++++++++++++++++++++++++++ > > > 1 file changed, 145 insertions(+) > > > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > index 29e823b0caf4..cd6402a9aa64 100644 > > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > @@ -13,6 +13,7 @@ > > > #include <dt-bindings/reset/qcom,sdm845-aoss.h> > > > #include <dt-bindings/soc/qcom,rpmh-rsc.h> > > > #include <dt-bindings/clock/qcom,gcc-sdm845.h> > > > +#include <dt-bindings/thermal/thermal.h> > > > > > > / { > > > interrupt-parent = <&intc>; > > > @@ -99,6 +100,7 @@ > > > compatible = "qcom,kryo385"; > > > reg = <0x0 0x0>; > > > enable-method = "psci"; > > > + #cooling-cells = <2>; > > > next-level-cache = <&L2_0>; > > > L2_0: l2-cache { > > > compatible = "cache"; > > > @@ -114,6 +116,7 @@ > > > compatible = "qcom,kryo385"; > > > reg = <0x0 0x100>; > > > enable-method = "psci"; > > > + #cooling-cells = <2>; > > > > This is not needed (also applies to other for other non-policy > > cores). A single cpufreq device is created per frequency domain / > > cluster, hence a single cooling device is registered per cluster, > > which IMO makes sense given that the CPUs of a cluster can't change > > their frequencies independently. > > > As per above, there are no cooling devices for CPU1-3 and CPU5-7. > > lore.kernel.org/lkml/cover.1527244200.git.viresh.kumar@linaro.org > lore.kernel.org/lkml/b687bb6035fbb010383f4511a206abb4006679fa.1527244201.git.viresh.kumar@linaro.org Thanks for the pointer, there's always something new to learn! Ok, so the policy CPU and hence the CPU registered as cooling device may vary. I understand that this requires to list all possible cooling devices, even though only one will be active at any given time. However I wonder if we could change this: From 103703a46495ff210a521b5b6fbf32632053c64f Mon Sep 17 00:00:00 2001 From: Matthias Kaehlcke <mka@chromium.org> Date: Thu, 10 Jan 2019 09:48:38 -0800 Subject: [PATCH] thermal: cpu_cooling: always use first CPU of a freq domain as cooling device For all CPUs of a frequency domain a single cooling device is registered, since the CPUs can't switch their frequencies independently from each other. The cpufreq policy CPU is used to represent cooling device of the frequency domain. Which CPU is the policy CPU may vary based on the order of initialization or CPU hotplug. For device tree based platform the above implies that cooling maps must include a list of all possible cooling devices of a frequency domain, even though only one of them will exist at any given time. For example: cooling-maps { map0 { trip = <&cpu_alert0>; cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>, <&CPU1 THERMAL_NO_LIMIT 4>, <&CPU2 THERMAL_NO_LIMIT 4>, <&CPU3 THERMAL_NO_LIMIT 4>; }; map1 { trip = <&cpu_crit0>; cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; }; }; This can be avoided by using always the first CPU of a frequency domain as cooling device. It may happen that the first CPU is offline when the cooling device is registered (e.g. CPU2 is initialized first in the above example), hence the nominal cooling device might be offline. This may seem odd, however it is not really different from the current behavior: when the policy CPU is taking offline the cooling device corresponding to it remains active, unless it is unregistered because all other CPUs of the frequency domain are offline too. A single cooling device associated with a specific CPU of the frequency domain reduces redundant device tree clutter in CPU nodes and cooling maps. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> --- drivers/thermal/cpu_cooling.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c index dfd23245f778a..bb5ea06f893a2 100644 --- a/drivers/thermal/cpu_cooling.c +++ b/drivers/thermal/cpu_cooling.c @@ -758,13 +758,14 @@ EXPORT_SYMBOL_GPL(cpufreq_cooling_register); struct thermal_cooling_device * of_cpufreq_cooling_register(struct cpufreq_policy *policy) { - struct device_node *np = of_get_cpu_node(policy->cpu, NULL); + unsigned int first_cpu = cpumask_first(policy->related_cpus); + struct device_node *np = of_get_cpu_node(first_cpu, NULL); struct thermal_cooling_device *cdev = NULL; u32 capacitance = 0; if (!np) { pr_err("cpu_cooling: OF node not available for cpu%d\n", - policy->cpu); + first_cpu); return NULL; } @@ -775,7 +776,7 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy) cdev = __cpufreq_cooling_register(np, policy, capacitance); if (IS_ERR(cdev)) { pr_err("cpu_cooling: cpu%d is not running as cooling device: %ld\n", - policy->cpu, PTR_ERR(cdev)); + first_cpu, PTR_ERR(cdev)); cdev = NULL; } } Would that make sense or is there something I'm overlooking? Cheers Matthias
On Thu, Jan 10, 2019 at 7:45 AM Matthias Kaehlcke <mka@chromium.org> wrote:
>
> On Wed, Jan 09, 2019 at 05:15:33PM -0800, Matthias Kaehlcke wrote:
> > Hi Amit,
> >
> > On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote:
> > > 75 degrees is too aggressive for throttling the CPU. After speaking to
> > > Qualcomm engineers, increase it to 95 degrees.
> > >
> > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > > ---
> > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------
> > > 1 file changed, 8 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > index c27cbd3bcb0a..29e823b0caf4 100644
> > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > @@ -1692,7 +1692,7 @@
> > >
> > > trips {
> > > cpu_alert0: trip0 {
> > > - temperature = <75000>;
> > > + temperature = <95000>;
> > > hysteresis = <2000>;
> > > type = "passive";
> > > };
> > > @@ -1713,7 +1713,7 @@
> > >
> > > trips {
> > > cpu_alert1: trip0 {
> > > - temperature = <75000>;
> > > + temperature = <95000>;
> > > hysteresis = <2000>;
> > > type = "passive";
> > > };
> > > @@ -1734,7 +1734,7 @@
> > >
> > > trips {
> > > cpu_alert2: trip0 {
> > > - temperature = <75000>;
> > > + temperature = <95000>;
> > > hysteresis = <2000>;
> > > type = "passive";
> > > };
> > > @@ -1755,7 +1755,7 @@
> > >
> > > trips {
> > > cpu_alert3: trip0 {
> > > - temperature = <75000>;
> > > + temperature = <95000>;
> > > hysteresis = <2000>;
> > > type = "passive";
> > > };
> > > @@ -1776,7 +1776,7 @@
> > >
> > > trips {
> > > cpu_alert4: trip0 {
> > > - temperature = <75000>;
> > > + temperature = <95000>;
> > > hysteresis = <2000>;
> > > type = "passive";
> > > };
> > > @@ -1797,7 +1797,7 @@
> > >
> > > trips {
> > > cpu_alert5: trip0 {
> > > - temperature = <75000>;
> > > + temperature = <95000>;
> > > hysteresis = <2000>;
> > > type = "passive";
> > > };
> > > @@ -1818,7 +1818,7 @@
> > >
> > > trips {
> > > cpu_alert6: trip0 {
> > > - temperature = <75000>;
> > > + temperature = <95000>;
> > > hysteresis = <2000>;
> > > type = "passive";
> > > };
> > > @@ -1839,7 +1839,7 @@
> > >
> > > trips {
> > > cpu_alert7: trip0 {
> > > - temperature = <75000>;
> > > + temperature = <95000>;
> > > hysteresis = <2000>;
> > > type = "passive";
> > > };
> >
> > The change itself looks good to me, however I wonder if it would be
> > worth to eliminate redundancy and merge the current 8 thermal zones
> > into 2, one for the Silver and one for the Gold cluster (as done by
> > http://crrev.com/c/1381752). There is a single cooling device for
> > each cluster, so it's not clear to me if there is any gain from having
> > a separate thermal zone for each CPU. If it is important to monitor
> > the temperatures of the individual cores this can still be done by
> > configuring the thermal zone of the cluster with multiple thermal
> > sensors.
>
> I see your idea is to have a cooling device per CPU ("arm64: dts:
> sdm845: wireup the thermal trip points to cpufreq" /
> https://lore.kernel.org/patchwork/patch/1030742/), however that
> doesn't work as intended. Only two cpufreq 'devices' are created,
> one for CPU0 and one for CPU4. In consequence cpufreq->ready() only
> runs for these cores and only two cooling devices are
> registered. Since the cores of a cluster all run at the same
> frequency I also doubt if having multiple cooling devices would
> bring any benefits.
I actually only intended for two cooling devices - one for each
frequency domain. I'll clarify it better in the patch description.
On Fri, Jan 11, 2019 at 01:15:09AM +0530, Amit Kucheria wrote:
> On Thu, Jan 10, 2019 at 7:45 AM Matthias Kaehlcke <mka@chromium.org> wrote:
> >
> > On Wed, Jan 09, 2019 at 05:15:33PM -0800, Matthias Kaehlcke wrote:
> > > Hi Amit,
> > >
> > > On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote:
> > > > 75 degrees is too aggressive for throttling the CPU. After speaking to
> > > > Qualcomm engineers, increase it to 95 degrees.
> > > >
> > > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > > > ---
> > > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------
> > > > 1 file changed, 8 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > > index c27cbd3bcb0a..29e823b0caf4 100644
> > > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > > @@ -1692,7 +1692,7 @@
> > > >
> > > > trips {
> > > > cpu_alert0: trip0 {
> > > > - temperature = <75000>;
> > > > + temperature = <95000>;
> > > > hysteresis = <2000>;
> > > > type = "passive";
> > > > };
> > > > @@ -1713,7 +1713,7 @@
> > > >
> > > > trips {
> > > > cpu_alert1: trip0 {
> > > > - temperature = <75000>;
> > > > + temperature = <95000>;
> > > > hysteresis = <2000>;
> > > > type = "passive";
> > > > };
> > > > @@ -1734,7 +1734,7 @@
> > > >
> > > > trips {
> > > > cpu_alert2: trip0 {
> > > > - temperature = <75000>;
> > > > + temperature = <95000>;
> > > > hysteresis = <2000>;
> > > > type = "passive";
> > > > };
> > > > @@ -1755,7 +1755,7 @@
> > > >
> > > > trips {
> > > > cpu_alert3: trip0 {
> > > > - temperature = <75000>;
> > > > + temperature = <95000>;
> > > > hysteresis = <2000>;
> > > > type = "passive";
> > > > };
> > > > @@ -1776,7 +1776,7 @@
> > > >
> > > > trips {
> > > > cpu_alert4: trip0 {
> > > > - temperature = <75000>;
> > > > + temperature = <95000>;
> > > > hysteresis = <2000>;
> > > > type = "passive";
> > > > };
> > > > @@ -1797,7 +1797,7 @@
> > > >
> > > > trips {
> > > > cpu_alert5: trip0 {
> > > > - temperature = <75000>;
> > > > + temperature = <95000>;
> > > > hysteresis = <2000>;
> > > > type = "passive";
> > > > };
> > > > @@ -1818,7 +1818,7 @@
> > > >
> > > > trips {
> > > > cpu_alert6: trip0 {
> > > > - temperature = <75000>;
> > > > + temperature = <95000>;
> > > > hysteresis = <2000>;
> > > > type = "passive";
> > > > };
> > > > @@ -1839,7 +1839,7 @@
> > > >
> > > > trips {
> > > > cpu_alert7: trip0 {
> > > > - temperature = <75000>;
> > > > + temperature = <95000>;
> > > > hysteresis = <2000>;
> > > > type = "passive";
> > > > };
> > >
> > > The change itself looks good to me, however I wonder if it would be
> > > worth to eliminate redundancy and merge the current 8 thermal zones
> > > into 2, one for the Silver and one for the Gold cluster (as done by
> > > http://crrev.com/c/1381752). There is a single cooling device for
> > > each cluster, so it's not clear to me if there is any gain from having
> > > a separate thermal zone for each CPU. If it is important to monitor
> > > the temperatures of the individual cores this can still be done by
> > > configuring the thermal zone of the cluster with multiple thermal
> > > sensors.
> >
> > I see your idea is to have a cooling device per CPU ("arm64: dts:
> > sdm845: wireup the thermal trip points to cpufreq" /
> > https://lore.kernel.org/patchwork/patch/1030742/), however that
> > doesn't work as intended. Only two cpufreq 'devices' are created,
> > one for CPU0 and one for CPU4. In consequence cpufreq->ready() only
> > runs for these cores and only two cooling devices are
> > registered. Since the cores of a cluster all run at the same
> > frequency I also doubt if having multiple cooling devices would
> > bring any benefits.
>
> I actually only intended for two cooling devices - one for each
> frequency domain. I'll clarify it better in the patch description.
Viresh helped me understand that we currently need to add cooling
device entries for all CPUs to the DT, even though at most one will be
active per freq domain at any time (I wonder if this could be changed
though).
Independently of the cooling devices, is there any value in having a
thermal zone for each CPU instead of having only one per freq domain?
Thanks
Matthias
On Thu, Jan 10, 2019 at 5:59 AM Stephen Boyd <swboyd@chromium.org> wrote:
>
> Quoting Amit Kucheria (2019-01-09 16:00:55)
> > 75 degrees is too aggressive for throttling the CPU. After speaking to
> > Qualcomm engineers, increase it to 95 degrees.
> >
> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > ---
> > arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------
>
> Is the plan that these are some defaults that would be adjusted by board
> variants? Just curious why we have anything in here and don't punt it
> all to each board dts file.
These values are meant to be safe values for silicon operation as
characterised[1] by the HW team. So I'd consider them a more than just
default values. It also gives future boards a safe starting point.
IMHO it would be better to have the characterized values merged
upstream and if really required, those values can be tweaked in the
board-specific DTS.
[1] Caveat: The characterisation probably focused on mobile devices
and might change depending on form-factor, active cooling and heat
dissipation design but those differences should only impact the skin
temperature, not the operation of the SoC itself.
On Thu, Jan 10, 2019 at 05:30:56AM +0530, Amit Kucheria wrote:
> Since the big and little cpus are in the same frequency domain, use all
> of them for mitigation in the cooling-map. At the lower trip points we
> restrict ourselves to throttling only a few OPPs. At higher trip
> temperatures, allow ourselves to be throttled to any extent.
>
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> ---
> arch/arm64/boot/dts/qcom/sdm845.dtsi | 145 +++++++++++++++++++++++++++
> 1 file changed, 145 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> index 29e823b0caf4..cd6402a9aa64 100644
> --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> @@ -13,6 +13,7 @@
> #include <dt-bindings/reset/qcom,sdm845-aoss.h>
> #include <dt-bindings/soc/qcom,rpmh-rsc.h>
> #include <dt-bindings/clock/qcom,gcc-sdm845.h>
> +#include <dt-bindings/thermal/thermal.h>
>
> / {
> interrupt-parent = <&intc>;
> @@ -99,6 +100,7 @@
> compatible = "qcom,kryo385";
> reg = <0x0 0x0>;
> enable-method = "psci";
> + #cooling-cells = <2>;
> next-level-cache = <&L2_0>;
> L2_0: l2-cache {
> compatible = "cache";
> @@ -114,6 +116,7 @@
> compatible = "qcom,kryo385";
> reg = <0x0 0x100>;
> enable-method = "psci";
> + #cooling-cells = <2>;
> next-level-cache = <&L2_100>;
> L2_100: l2-cache {
> compatible = "cache";
> @@ -126,6 +129,7 @@
> compatible = "qcom,kryo385";
> reg = <0x0 0x200>;
> enable-method = "psci";
> + #cooling-cells = <2>;
> next-level-cache = <&L2_200>;
> L2_200: l2-cache {
> compatible = "cache";
> @@ -138,6 +142,7 @@
> compatible = "qcom,kryo385";
> reg = <0x0 0x300>;
> enable-method = "psci";
> + #cooling-cells = <2>;
> next-level-cache = <&L2_300>;
> L2_300: l2-cache {
> compatible = "cache";
> @@ -150,6 +155,7 @@
> compatible = "qcom,kryo385";
> reg = <0x0 0x400>;
> enable-method = "psci";
> + #cooling-cells = <2>;
> next-level-cache = <&L2_400>;
> L2_400: l2-cache {
> compatible = "cache";
> @@ -162,6 +168,7 @@
> compatible = "qcom,kryo385";
> reg = <0x0 0x500>;
> enable-method = "psci";
> + #cooling-cells = <2>;
> next-level-cache = <&L2_500>;
> L2_500: l2-cache {
> compatible = "cache";
> @@ -174,6 +181,7 @@
> compatible = "qcom,kryo385";
> reg = <0x0 0x600>;
> enable-method = "psci";
> + #cooling-cells = <2>;
> next-level-cache = <&L2_600>;
> L2_600: l2-cache {
> compatible = "cache";
> @@ -186,6 +194,7 @@
> compatible = "qcom,kryo385";
> reg = <0x0 0x700>;
> enable-method = "psci";
> + #cooling-cells = <2>;
> next-level-cache = <&L2_700>;
> L2_700: l2-cache {
> compatible = "cache";
> @@ -1703,6 +1712,23 @@
> type = "critical";
> };
> };
> +
> + cooling-maps {
> + map0 {
> + trip = <&cpu_alert0>;
> + cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
> + <&CPU1 THERMAL_NO_LIMIT 4>,
> + <&CPU2 THERMAL_NO_LIMIT 4>,
> + <&CPU3 THERMAL_NO_LIMIT 4>;
> + };
> + map1 {
> + trip = <&cpu_crit0>;
> + cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> + <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> + <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> + <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> + };
> + };
Slightly off-topic, buy maybe not so much since we are just starting
to use the trip points:
Currently we use the naming scheme 'cpu_<type>N' for trip points. I
anticipate that we're going to add more passive trip points soon, to
keep the 'power_allocator' thermal governor happy, which expects a
'switch_on' and a 'desired_temperature' trip point. With the current
naming scheme this could become a bit messy. I suggest to change it to
'cpuN_<type>[X]', which would allow for something like 'cpuN_alert0'
and 'cpuN_alert1'.
If you think the change makes sense you can consider to do it within
this series, I can also send a separate patch once it has landed.
You could also consider to add the additional trip point in this
series if you agree that it will be needed.
This is not necessarily a call for action, just thinking loudly about
a closely related topic ;-)
Cheers
Matthias
On 10-01-19, 12:00, Matthias Kaehlcke wrote:
> Viresh helped me understand that we currently need to add cooling
> device entries for all CPUs to the DT, even though at most one will be
> active per freq domain at any time (I wonder if this could be changed
> though).
Actually we were only adding cooling-cells in CPU0 until now and I
fixed that, so that is going to stay :)
The idea is that the hardware should be described properly and not
partially. Even if all the CPUs are part of the same freq-domain, they
are all capable of being a cooling device here and the DT should
describe that. Kernel will ofcourse create a single cooling device.
--
viresh
On 10-01-19, 10:42, Matthias Kaehlcke wrote: > Thanks for the pointer, there's always something new to learn! > > Ok, so the policy CPU and hence the CPU registered as cooling > device may vary. I understand that this requires to list all possible > cooling devices, I won't say that I changed DT because of a design issue with kernel, rather the DT shall be complete by itself and that's why that change was made. And then we can have more things going on. For example with cpuidle cooling, we can individually control each CPU (and force idle on that) even if all CPUs are part of the same freq-domain. Each CPU shall expose its capabilities. > even though only one will be active at any given > time. However I wonder if we could change this: I won't say it that way. I see it as all the CPUs are active during a cooling state, i.e. they are all participating. > >From 103703a46495ff210a521b5b6fbf32632053c64f Mon Sep 17 00:00:00 2001 > From: Matthias Kaehlcke <mka@chromium.org> > Date: Thu, 10 Jan 2019 09:48:38 -0800 > Subject: [PATCH] thermal: cpu_cooling: always use first CPU of a freq domain > as cooling device > > For all CPUs of a frequency domain a single cooling device is > registered, since the CPUs can't switch their frequencies > independently from each other. The cpufreq policy CPU is used to > represent cooling device of the frequency domain. Which CPU is the > policy CPU may vary based on the order of initialization or CPU > hotplug. > > For device tree based platform the above implies that cooling maps > must include a list of all possible cooling devices of a frequency > domain, even though only one of them will exist at any given time. > > For example: > > cooling-maps { > map0 { > trip = <&cpu_alert0>; > cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>, > <&CPU1 THERMAL_NO_LIMIT 4>, > <&CPU2 THERMAL_NO_LIMIT 4>, > <&CPU3 THERMAL_NO_LIMIT 4>; > }; > map1 { > trip = <&cpu_crit0>; > cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; This is the right thing to do hardware description wise, no matter what the kernel does. > }; > }; > > This can be avoided by using always the first CPU of a frequency > domain as cooling device. It may happen that the first CPU is offline > when the cooling device is registered (e.g. CPU2 is initialized > first in the above example), hence the nominal cooling device might > be offline. This may seem odd, however it is not really different from > the current behavior: when the policy CPU is taking offline the cooling > device corresponding to it remains active, unless it is unregistered > because all other CPUs of the frequency domain are offline too. > > A single cooling device associated with a specific CPU of the frequency > domain reduces redundant device tree clutter in CPU nodes and cooling > maps. > > Signed-off-by: Matthias Kaehlcke <mka@chromium.org> > --- > drivers/thermal/cpu_cooling.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c > index dfd23245f778a..bb5ea06f893a2 100644 > --- a/drivers/thermal/cpu_cooling.c > +++ b/drivers/thermal/cpu_cooling.c > @@ -758,13 +758,14 @@ EXPORT_SYMBOL_GPL(cpufreq_cooling_register); > struct thermal_cooling_device * > of_cpufreq_cooling_register(struct cpufreq_policy *policy) > { > - struct device_node *np = of_get_cpu_node(policy->cpu, NULL); > + unsigned int first_cpu = cpumask_first(policy->related_cpus); > + struct device_node *np = of_get_cpu_node(first_cpu, NULL); > struct thermal_cooling_device *cdev = NULL; > u32 capacitance = 0; > > if (!np) { > pr_err("cpu_cooling: OF node not available for cpu%d\n", > - policy->cpu); > + first_cpu); > return NULL; > } > > @@ -775,7 +776,7 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy) > cdev = __cpufreq_cooling_register(np, policy, capacitance); > if (IS_ERR(cdev)) { > pr_err("cpu_cooling: cpu%d is not running as cooling device: %ld\n", > - policy->cpu, PTR_ERR(cdev)); > + first_cpu, PTR_ERR(cdev)); > cdev = NULL; > } > } > > > Would that make sense or is there something I'm overlooking? I don't see any benefits of this to be honest. Even if we make this change, the DT should remain in its current form. -- viresh
On Thu, Jan 10, 2019 at 6:45 AM Matthias Kaehlcke <mka@chromium.org> wrote: > > Hi Amit, > > On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote: > > 75 degrees is too aggressive for throttling the CPU. After speaking to > > Qualcomm engineers, increase it to 95 degrees. > > > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > > --- > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++-------- > > 1 file changed, 8 insertions(+), 8 deletions(-) > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > index c27cbd3bcb0a..29e823b0caf4 100644 > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > @@ -1692,7 +1692,7 @@ > > > > trips { > > cpu_alert0: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1713,7 +1713,7 @@ > > > > trips { > > cpu_alert1: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1734,7 +1734,7 @@ > > > > trips { > > cpu_alert2: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1755,7 +1755,7 @@ > > > > trips { > > cpu_alert3: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1776,7 +1776,7 @@ > > > > trips { > > cpu_alert4: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1797,7 +1797,7 @@ > > > > trips { > > cpu_alert5: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1818,7 +1818,7 @@ > > > > trips { > > cpu_alert6: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1839,7 +1839,7 @@ > > > > trips { > > cpu_alert7: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > The change itself looks good to me, however I wonder if it would be > worth to eliminate redundancy and merge the current 8 thermal zones > into 2, one for the Silver and one for the Gold cluster (as done by > http://crrev.com/c/1381752). There is a single cooling device for > each cluster, so it's not clear to me if there is any gain from having > a separate thermal zone for each CPU. If it is important to monitor > the temperatures of the individual cores this can still be done by > configuring the thermal zone of the cluster with multiple thermal > sensors. Reducing the number of thermal zones to 2 (by grouping 4 sensors per zone) is not possible due a limitation of the thermal framework[1]. It is something that we want to address. Previous attempts to fix this were rejected for various reasons. Eduardo was going to share a way to have more flexible mapping between sensors and zones after discussions at LPC. <nag> Eduardo, do you have anything we can review? </nag> :-) Having said that, we'll need some aggregation functions when we add multiple sensors to a zone (e.g. max, mean) to reflect the zone. This will lose information about hotspots and prevent things like idle injection on a particular CPU that is causing most of the heat in the aggregated zone. So IMHO, it might be useful to have information about the hotspots (i.e TZ per sensor) and aggregated values (ambient temperature) that can be fed to the thermal policy. [1] https://elixir.bootlin.com/linux/v5.0-rc1/source/drivers/thermal/of-thermal.c#L502
On Fri, Jan 11, 2019 at 6:00 AM Matthias Kaehlcke <mka@chromium.org> wrote: > > On Thu, Jan 10, 2019 at 05:30:56AM +0530, Amit Kucheria wrote: > > Since the big and little cpus are in the same frequency domain, use all > > of them for mitigation in the cooling-map. At the lower trip points we > > restrict ourselves to throttling only a few OPPs. At higher trip > > temperatures, allow ourselves to be throttled to any extent. > > > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > > --- > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 145 +++++++++++++++++++++++++++ > > 1 file changed, 145 insertions(+) > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > index 29e823b0caf4..cd6402a9aa64 100644 > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > @@ -13,6 +13,7 @@ > > #include <dt-bindings/reset/qcom,sdm845-aoss.h> > > #include <dt-bindings/soc/qcom,rpmh-rsc.h> > > #include <dt-bindings/clock/qcom,gcc-sdm845.h> > > +#include <dt-bindings/thermal/thermal.h> > > > > / { > > interrupt-parent = <&intc>; > > @@ -99,6 +100,7 @@ > > compatible = "qcom,kryo385"; > > reg = <0x0 0x0>; > > enable-method = "psci"; > > + #cooling-cells = <2>; > > next-level-cache = <&L2_0>; > > L2_0: l2-cache { > > compatible = "cache"; > > @@ -114,6 +116,7 @@ > > compatible = "qcom,kryo385"; > > reg = <0x0 0x100>; > > enable-method = "psci"; > > + #cooling-cells = <2>; > > next-level-cache = <&L2_100>; > > L2_100: l2-cache { > > compatible = "cache"; > > @@ -126,6 +129,7 @@ > > compatible = "qcom,kryo385"; > > reg = <0x0 0x200>; > > enable-method = "psci"; > > + #cooling-cells = <2>; > > next-level-cache = <&L2_200>; > > L2_200: l2-cache { > > compatible = "cache"; > > @@ -138,6 +142,7 @@ > > compatible = "qcom,kryo385"; > > reg = <0x0 0x300>; > > enable-method = "psci"; > > + #cooling-cells = <2>; > > next-level-cache = <&L2_300>; > > L2_300: l2-cache { > > compatible = "cache"; > > @@ -150,6 +155,7 @@ > > compatible = "qcom,kryo385"; > > reg = <0x0 0x400>; > > enable-method = "psci"; > > + #cooling-cells = <2>; > > next-level-cache = <&L2_400>; > > L2_400: l2-cache { > > compatible = "cache"; > > @@ -162,6 +168,7 @@ > > compatible = "qcom,kryo385"; > > reg = <0x0 0x500>; > > enable-method = "psci"; > > + #cooling-cells = <2>; > > next-level-cache = <&L2_500>; > > L2_500: l2-cache { > > compatible = "cache"; > > @@ -174,6 +181,7 @@ > > compatible = "qcom,kryo385"; > > reg = <0x0 0x600>; > > enable-method = "psci"; > > + #cooling-cells = <2>; > > next-level-cache = <&L2_600>; > > L2_600: l2-cache { > > compatible = "cache"; > > @@ -186,6 +194,7 @@ > > compatible = "qcom,kryo385"; > > reg = <0x0 0x700>; > > enable-method = "psci"; > > + #cooling-cells = <2>; > > next-level-cache = <&L2_700>; > > L2_700: l2-cache { > > compatible = "cache"; > > @@ -1703,6 +1712,23 @@ > > type = "critical"; > > }; > > }; > > + > > + cooling-maps { > > + map0 { > > + trip = <&cpu_alert0>; > > + cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>, > > + <&CPU1 THERMAL_NO_LIMIT 4>, > > + <&CPU2 THERMAL_NO_LIMIT 4>, > > + <&CPU3 THERMAL_NO_LIMIT 4>; > > + }; > > + map1 { > > + trip = <&cpu_crit0>; > > + cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > > + <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > > + <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > > + <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > > + }; > > + }; > > Slightly off-topic, buy maybe not so much since we are just starting > to use the trip points: > > Currently we use the naming scheme 'cpu_<type>N' for trip points. I > anticipate that we're going to add more passive trip points soon, to > keep the 'power_allocator' thermal governor happy, which expects a > 'switch_on' and a 'desired_temperature' trip point. With the current > naming scheme this could become a bit messy. I suggest to change it to > 'cpuN_<type>[X]', which would allow for something like 'cpuN_alert0' > and 'cpuN_alert1'. > > If you think the change makes sense you can consider to do it within > this series, I can also send a separate patch once it has landed. Sure, I can change them to cpuN_alertX format. > You could also consider to add the additional trip point in this > series if you agree that it will be needed. I expect that we'll end up with at least 2 passive trip points but I don't know what temperature to set the next one at. So let's just go with 1 passive and 1 critical trip point in this series and you can send a patch adding more once we've characterised IPA. > This is not necessarily a call for action, just thinking loudly about > a closely related topic ;-) Thanks for the reviews. Regards, Amit
On Fri, Jan 11, 2019 at 03:54:23PM +0530, Amit Kucheria wrote: > On Thu, Jan 10, 2019 at 6:45 AM Matthias Kaehlcke <mka@chromium.org> wrote: > > > > Hi Amit, > > > > On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote: > > > 75 degrees is too aggressive for throttling the CPU. After speaking to > > > Qualcomm engineers, increase it to 95 degrees. > > > > > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > > > --- > > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++-------- > > > 1 file changed, 8 insertions(+), 8 deletions(-) > > > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > index c27cbd3bcb0a..29e823b0caf4 100644 > > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > @@ -1692,7 +1692,7 @@ > > > > > > trips { > > > cpu_alert0: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > @@ -1713,7 +1713,7 @@ > > > > > > trips { > > > cpu_alert1: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > @@ -1734,7 +1734,7 @@ > > > > > > trips { > > > cpu_alert2: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > @@ -1755,7 +1755,7 @@ > > > > > > trips { > > > cpu_alert3: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > @@ -1776,7 +1776,7 @@ > > > > > > trips { > > > cpu_alert4: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > @@ -1797,7 +1797,7 @@ > > > > > > trips { > > > cpu_alert5: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > @@ -1818,7 +1818,7 @@ > > > > > > trips { > > > cpu_alert6: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > @@ -1839,7 +1839,7 @@ > > > > > > trips { > > > cpu_alert7: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > > The change itself looks good to me, however I wonder if it would be > > worth to eliminate redundancy and merge the current 8 thermal zones > > into 2, one for the Silver and one for the Gold cluster (as done by > > http://crrev.com/c/1381752). There is a single cooling device for > > each cluster, so it's not clear to me if there is any gain from having > > a separate thermal zone for each CPU. If it is important to monitor > > the temperatures of the individual cores this can still be done by > > configuring the thermal zone of the cluster with multiple thermal > > sensors. > > Reducing the number of thermal zones to 2 (by grouping 4 sensors per > zone) is not possible due a limitation of the thermal framework[1]. It > is something that we want to address. Previous attempts to fix this > were rejected for various reasons. Eduardo was going to share a way to > have more flexible mapping between sensors and zones after discussions > at LPC. I wasn't aware of this limitation, thanks for the clarification! With this I understand that for now we indeed need the 8 thermal zones with all the redundant information :( > <nag> Eduardo, do you have anything we can review? </nag> :-) > > Having said that, we'll need some aggregation functions when we add > multiple sensors to a zone (e.g. max, mean) to reflect the zone. This > will lose information about hotspots and prevent things like idle > injection on a particular CPU that is causing most of the heat in the > aggregated zone. So IMHO, it might be useful to have information about > the hotspots (i.e TZ per sensor) and aggregated values (ambient > temperature) that can be fed to the thermal policy. Ok, it seems for now we need the 8 thermal zones in any case, when support for multiple sensors becomes available we can evaluate whether it's worth to change that or not. Cheers Matthias
On Fri, Jan 11, 2019 at 09:16:53AM +0530, Viresh Kumar wrote: > On 10-01-19, 10:42, Matthias Kaehlcke wrote: > > Thanks for the pointer, there's always something new to learn! > > > > Ok, so the policy CPU and hence the CPU registered as cooling > > device may vary. I understand that this requires to list all possible > > cooling devices, > > I won't say that I changed DT because of a design issue with kernel, > rather the DT shall be complete by itself and that's why that change > was made. fair enough > And then we can have more things going on. For example with cpuidle > cooling, we can individually control each CPU (and force idle on that) > even if all CPUs are part of the same freq-domain. Each CPU shall > expose its capabilities. Just to gain a better understanding: is cpuidle cooling already available for arm64 (or is there a patch set)? I came across the relatively new idle injecting framework but it seems currently the only user is the Intel powerclamp driver. > > even though only one will be active at any given > > time. However I wonder if we could change this: > > I won't say it that way. I see it as all the CPUs are active during a > cooling state, i.e. they are all participating. agreed, I was referring to the CPU cooling device, which (without cpuidle injection) could be considered a single device per freq domain. > > For device tree based platform the above implies that cooling maps > > must include a list of all possible cooling devices of a frequency > > domain, even though only one of them will exist at any given time. > > > > For example: > > > > cooling-maps { > > map0 { > > trip = <&cpu_alert0>; > > cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>, > > <&CPU1 THERMAL_NO_LIMIT 4>, > > <&CPU2 THERMAL_NO_LIMIT 4>, > > <&CPU3 THERMAL_NO_LIMIT 4>; > > }; > > map1 { > > trip = <&cpu_crit0>; > > cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > > <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > > <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > > <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > > This is the right thing to do hardware description wise, no matter > what the kernel does. Not sure I would call it a hardware description. I'd say we pretend the thermal configuration is a hardware description so the DT folks don't yell at us ;-) IMO a CPU cooling device is an abstraction, I think there is no such IP block on most systems. It seems with cpuidle injection CPUs can perform cooling actions individually, with that I agree that representing them as individual cooling devices in the DT makes sense. Without that a cooling device per freq domain would seem a resonable abstraction. One of the reasons I dislike the above list of cooling devices is that it is repeated for different thermal-zone/cooling-maps, but I guess we have to live with that, would be nice if the DT would allow to do something like this: thermal-zones { cooling_maps_fd0 : cooling-maps { map0 { trip = <&cpu_alert0>; cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>, <&CPU1 THERMAL_NO_LIMIT 4>, <&CPU2 THERMAL_NO_LIMIT 4>, <&CPU3 THERMAL_NO_LIMIT 4>; }; map1 { trip = <&cpu_crit0>; cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; }; cpu0-thermal { ... cooling-maps = @cooling_maps_fd0; ... }; cpu1-thermal { ... cooling-maps = @cooling_maps_fd0; ... }; ... }; Cheers Matthias
On Fri, Jan 11, 2019 at 04:47:15PM +0530, Amit Kucheria wrote: > On Fri, Jan 11, 2019 at 6:00 AM Matthias Kaehlcke <mka@chromium.org> wrote: > > > > On Thu, Jan 10, 2019 at 05:30:56AM +0530, Amit Kucheria wrote: > > > Since the big and little cpus are in the same frequency domain, use all > > > of them for mitigation in the cooling-map. At the lower trip points we > > > restrict ourselves to throttling only a few OPPs. At higher trip > > > temperatures, allow ourselves to be throttled to any extent. > > > > > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > > > --- > > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 145 +++++++++++++++++++++++++++ > > > 1 file changed, 145 insertions(+) > > > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > index 29e823b0caf4..cd6402a9aa64 100644 > > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > @@ -13,6 +13,7 @@ > > > #include <dt-bindings/reset/qcom,sdm845-aoss.h> > > > #include <dt-bindings/soc/qcom,rpmh-rsc.h> > > > #include <dt-bindings/clock/qcom,gcc-sdm845.h> > > > +#include <dt-bindings/thermal/thermal.h> > > > > > > / { > > > interrupt-parent = <&intc>; > > > @@ -99,6 +100,7 @@ > > > compatible = "qcom,kryo385"; > > > reg = <0x0 0x0>; > > > enable-method = "psci"; > > > + #cooling-cells = <2>; > > > next-level-cache = <&L2_0>; > > > L2_0: l2-cache { > > > compatible = "cache"; > > > @@ -114,6 +116,7 @@ > > > compatible = "qcom,kryo385"; > > > reg = <0x0 0x100>; > > > enable-method = "psci"; > > > + #cooling-cells = <2>; > > > next-level-cache = <&L2_100>; > > > L2_100: l2-cache { > > > compatible = "cache"; > > > @@ -126,6 +129,7 @@ > > > compatible = "qcom,kryo385"; > > > reg = <0x0 0x200>; > > > enable-method = "psci"; > > > + #cooling-cells = <2>; > > > next-level-cache = <&L2_200>; > > > L2_200: l2-cache { > > > compatible = "cache"; > > > @@ -138,6 +142,7 @@ > > > compatible = "qcom,kryo385"; > > > reg = <0x0 0x300>; > > > enable-method = "psci"; > > > + #cooling-cells = <2>; > > > next-level-cache = <&L2_300>; > > > L2_300: l2-cache { > > > compatible = "cache"; > > > @@ -150,6 +155,7 @@ > > > compatible = "qcom,kryo385"; > > > reg = <0x0 0x400>; > > > enable-method = "psci"; > > > + #cooling-cells = <2>; > > > next-level-cache = <&L2_400>; > > > L2_400: l2-cache { > > > compatible = "cache"; > > > @@ -162,6 +168,7 @@ > > > compatible = "qcom,kryo385"; > > > reg = <0x0 0x500>; > > > enable-method = "psci"; > > > + #cooling-cells = <2>; > > > next-level-cache = <&L2_500>; > > > L2_500: l2-cache { > > > compatible = "cache"; > > > @@ -174,6 +181,7 @@ > > > compatible = "qcom,kryo385"; > > > reg = <0x0 0x600>; > > > enable-method = "psci"; > > > + #cooling-cells = <2>; > > > next-level-cache = <&L2_600>; > > > L2_600: l2-cache { > > > compatible = "cache"; > > > @@ -186,6 +194,7 @@ > > > compatible = "qcom,kryo385"; > > > reg = <0x0 0x700>; > > > enable-method = "psci"; > > > + #cooling-cells = <2>; > > > next-level-cache = <&L2_700>; > > > L2_700: l2-cache { > > > compatible = "cache"; > > > @@ -1703,6 +1712,23 @@ > > > type = "critical"; > > > }; > > > }; > > > + > > > + cooling-maps { > > > + map0 { > > > + trip = <&cpu_alert0>; > > > + cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>, > > > + <&CPU1 THERMAL_NO_LIMIT 4>, > > > + <&CPU2 THERMAL_NO_LIMIT 4>, > > > + <&CPU3 THERMAL_NO_LIMIT 4>; > > > + }; > > > + map1 { > > > + trip = <&cpu_crit0>; > > > + cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > > > + <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > > > + <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > > > + <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > > > + }; > > > + }; > > > > Slightly off-topic, buy maybe not so much since we are just starting > > to use the trip points: > > > > Currently we use the naming scheme 'cpu_<type>N' for trip points. I > > anticipate that we're going to add more passive trip points soon, to > > keep the 'power_allocator' thermal governor happy, which expects a > > 'switch_on' and a 'desired_temperature' trip point. With the current > > naming scheme this could become a bit messy. I suggest to change it to > > 'cpuN_<type>[X]', which would allow for something like 'cpuN_alert0' > > and 'cpuN_alert1'. > > > > If you think the change makes sense you can consider to do it within > > this series, I can also send a separate patch once it has landed. > > Sure, I can change them to cpuN_alertX format. Great, thanks! Another concern about adding trip points later could be the node name. We currently have: trips { cpu0_alert0: trip0 { ... }; cpu0_crit: trip1 { ... }; }; If we keep increasing enumeration with the node name this would become: trips { cpu0_alert0: trip0 { ... }; cpu0_alert1: trip1 { ... }; cpu0_crit: trip2 { ... }; }; i.e. the node name of the critical trip-point changes, which might be a concern for dtsi's that override a value, though they should probably use the phandle &cpu0_crit anyway. If this is a concern we could change the node names to 'alert0' and 'crit'. I looked around a bit and actually I kinda like the naming scheme used by hisilicon/hi6220.dtsi, mediatek/mt8173.dtsi and rockchip/rk3328.dtsi (with minor variations): trips { threshold: trip-point@0 { temperature = <68000>; hysteresis = <2000>; type = "passive"; }; target: trip-point@1 { temperature = <85000>; hysteresis = <2000>; type = "passive"; }; cpu_crit: cpu_crit@0 { temperature = <115000>; hysteresis = <2000>; type = "critical"; }; }; If we were to use this we'd have to adapt it slightly since we have multiple thermal zones. In line with the other scheme this could be cpuN_threshold, cpuN_target and cpuN_crit. Up to you, just providing some options ;-) > > You could also consider to add the additional trip point in this > > series if you agree that it will be needed. > > I expect that we'll end up with at least 2 passive trip points but I > don't know what temperature to set the next one at. So let's just go > with 1 passive and 1 critical trip point in this series and you can > send a patch adding more once we've characterised IPA. Sounds good Thanks! Matthias
On 11-01-19, 11:58, Matthias Kaehlcke wrote: > On Fri, Jan 11, 2019 at 09:16:53AM +0530, Viresh Kumar wrote: > Just to gain a better understanding: is cpuidle cooling already > available for arm64 (or is there a patch set)? I came across the > relatively new idle injecting framework but it seems currently the > only user is the Intel powerclamp driver. Daniel was trying to upstream it earlier: lore.kernel.org/lkml/1522945005-7165-7-git-send-email-daniel.lezcano@linaro.org > > > even though only one will be active at any given > > > time. However I wonder if we could change this: > > > > I won't say it that way. I see it as all the CPUs are active during a > > cooling state, i.e. they are all participating. > > agreed, I was referring to the CPU cooling device, which (without > cpuidle injection) could be considered a single device per freq domain. Even without cpuidle injection all CPUs actually take part in cooling. > > > For device tree based platform the above implies that cooling maps > > > must include a list of all possible cooling devices of a frequency > > > domain, even though only one of them will exist at any given time. > > > > > > For example: > > > > > > cooling-maps { > > > map0 { > > > trip = <&cpu_alert0>; > > > cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>, > > > <&CPU1 THERMAL_NO_LIMIT 4>, > > > <&CPU2 THERMAL_NO_LIMIT 4>, > > > <&CPU3 THERMAL_NO_LIMIT 4>; > > > }; > > > map1 { > > > trip = <&cpu_crit0>; > > > cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > > > <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > > > <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > > > <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > > > > This is the right thing to do hardware description wise, no matter > > what the kernel does. > > Not sure I would call it a hardware description. I'd say we pretend > the thermal configuration is a hardware description so the DT folks > don't yell at us ;-) IMO a CPU cooling device is an abstraction, I > think there is no such IP block on most systems. Right. > It seems with cpuidle injection CPUs can perform cooling actions > individually, with that I agree that representing them as individual > cooling devices in the DT makes sense. Without that a cooling device > per freq domain would seem a resonable abstraction. But we actually have 4 different cooling devices no matter what. The only thing is that they switch their cooling state together. And that shouldn't bother DT is what I thought :) > One of the reasons I dislike the above list of cooling devices is that > it is repeated for different thermal-zone/cooling-maps, but I guess > we have to live with that, would be nice if the DT would allow to do > something like this: > > thermal-zones { > cooling_maps_fd0 : cooling-maps { > map0 { > trip = <&cpu_alert0>; > cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>, > <&CPU1 THERMAL_NO_LIMIT 4>, > <&CPU2 THERMAL_NO_LIMIT 4>, > <&CPU3 THERMAL_NO_LIMIT 4>; > }; > map1 { > trip = <&cpu_crit0>; > cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>, > <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > }; > > cpu0-thermal { > ... > cooling-maps = @cooling_maps_fd0; > ... > }; > > cpu1-thermal { > ... > cooling-maps = @cooling_maps_fd0; > ... > }; > > ... > }; Yeah, maybe. There aren't lot of examples of such duplication though if I remember correctly. -- viresh
On Sat, Jan 12, 2019 at 2:06 AM Matthias Kaehlcke <mka@chromium.org> wrote:
>
> Another concern about adding trip points later could be the node
> name. We currently have:
>
>
> trips {
> cpu0_alert0: trip0 {
> ...
> };
>
> cpu0_crit: trip1 {
> ...
> };
> };
>
> If we keep increasing enumeration with the node name this would become:
>
> trips {
> cpu0_alert0: trip0 {
> ...
> };
>
> cpu0_alert1: trip1 {
> ...
> };
>
> cpu0_crit: trip2 {
> ...
> };
> };
>
> i.e. the node name of the critical trip-point changes, which might be
> a concern for dtsi's that override a value, though they should
> probably use the phandle &cpu0_crit anyway. If this is a concern we
> could change the node names to 'alert0' and 'crit'.
>
> I looked around a bit and actually I kinda like the naming scheme used
> by hisilicon/hi6220.dtsi, mediatek/mt8173.dtsi and rockchip/rk3328.dtsi
> (with minor variations):
>
> trips {
> threshold: trip-point@0 {
> temperature = <68000>;
> hysteresis = <2000>;
> type = "passive";
> };
>
> target: trip-point@1 {
> temperature = <85000>;
> hysteresis = <2000>;
> type = "passive";
> };
>
> cpu_crit: cpu_crit@0 {
> temperature = <115000>;
> hysteresis = <2000>;
> type = "critical";
> };
> };
>
> If we were to use this we'd have to adapt it slightly since we have
> multiple thermal zones. In line with the other scheme this could be
> cpuN_threshold, cpuN_target and cpuN_crit.
>
I like this scheme enough that I adopted it for v2.