linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/7] Thermal throttling for SDM845
@ 2019-01-10  0:00 Amit Kucheria
  2019-01-10  0:00 ` [PATCH v1 1/7] drivers: thermal: of-thermal: Print name of device node with error Amit Kucheria
                   ` (6 more replies)
  0 siblings, 7 replies; 37+ messages in thread
From: Amit Kucheria @ 2019-01-10  0:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arm-msm, bjorn.andersson, viresh.kumar, edubezval,
	andy.gross, tdas, swboyd, dianders, mka, Rafael J. Wysocki,
	Amit Daniel Kachhap, Daniel Lezcano, David Brown, Javi Merino,
	Mark Rutland, Rob Herring, Zhang Rui,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	open list:CPU FREQUENCY DRIVERS

Add support for thermal throttling on SDM845. 

We introduce a generic .ready callback to be used by cpufreq drivers to
register as a thermal cooling device. If this approach is acceptable I can
send a series converting other cpufreq drivers to use this callback.

Amit Kucheria (7):
  drivers: thermal: of-thermal: Print name of device node with error
  drivers: cpufreq: Add thermal_cooling_device pointer to struct
    cpufreq_policy
  cpu_cooling: Add generic driver ready callback
  cpufreq: qcom-hw: Move to device_initcall
  cpufreq: qcom-hw: Register as a cpufreq cooling device
  arm64: dts: sdm845: Increase alert trip point to 95 degrees
  arm64: dts: sdm845: wireup the thermal trip points to cpufreq

 arch/arm64/boot/dts/qcom/sdm845.dtsi | 161 +++++++++++++++++++++++++--
 drivers/cpufreq/qcom-cpufreq-hw.c    |   7 +-
 drivers/thermal/cpu_cooling.c        |  18 +++
 drivers/thermal/of-thermal.c         |   4 +-
 include/linux/cpu_cooling.h          |   9 ++
 include/linux/cpufreq.h              |   2 +
 6 files changed, 190 insertions(+), 11 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v1 1/7] drivers: thermal: of-thermal: Print name of device node with error
  2019-01-10  0:00 [PATCH v1 0/7] Thermal throttling for SDM845 Amit Kucheria
@ 2019-01-10  0:00 ` Amit Kucheria
  2019-01-10  0:15   ` Stephen Boyd
  2019-01-10  0:00 ` [PATCH v1 2/7] drivers: cpufreq: Add thermal_cooling_device pointer to struct cpufreq_policy Amit Kucheria
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Amit Kucheria @ 2019-01-10  0:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arm-msm, bjorn.andersson, viresh.kumar, edubezval,
	andy.gross, tdas, swboyd, dianders, mka, Zhang Rui,
	Daniel Lezcano, open list:THERMAL

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
---
 drivers/thermal/of-thermal.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c
index 4bfdb4a1e47d..5ca2a6b370ea 100644
--- a/drivers/thermal/of-thermal.c
+++ b/drivers/thermal/of-thermal.c
@@ -867,14 +867,14 @@ __init *thermal_of_build_thermal_zone(struct device_node *np)
 
 	ret = of_property_read_u32(np, "polling-delay-passive", &prop);
 	if (ret < 0) {
-		pr_err("missing polling-delay-passive property\n");
+		pr_err("%s: missing polling-delay-passive property\n", np->name);
 		goto free_tz;
 	}
 	tz->passive_delay = prop;
 
 	ret = of_property_read_u32(np, "polling-delay", &prop);
 	if (ret < 0) {
-		pr_err("missing polling-delay property\n");
+		pr_err("%s: missing polling-delay property\n", np->name);
 		goto free_tz;
 	}
 	tz->polling_delay = prop;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v1 2/7] drivers: cpufreq: Add thermal_cooling_device pointer to struct cpufreq_policy
  2019-01-10  0:00 [PATCH v1 0/7] Thermal throttling for SDM845 Amit Kucheria
  2019-01-10  0:00 ` [PATCH v1 1/7] drivers: thermal: of-thermal: Print name of device node with error Amit Kucheria
@ 2019-01-10  0:00 ` Amit Kucheria
  2019-01-10  1:01   ` Matthias Kaehlcke
  2019-01-10  0:00 ` [PATCH v1 3/7] cpu_cooling: Add generic driver ready callback Amit Kucheria
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Amit Kucheria @ 2019-01-10  0:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arm-msm, bjorn.andersson, viresh.kumar, edubezval,
	andy.gross, tdas, swboyd, dianders, mka, Rafael J. Wysocki,
	open list:CPU FREQUENCY DRIVERS

Several cpufreq drivers register themselves as thermal cooling devices.
Adding a pointer to struct cpufreq_policy removes the need for them to
store this pointer in a private data structure.

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
---
 include/linux/cpufreq.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index c86d6d8bdfed..2496549d7573 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -95,6 +95,8 @@ struct cpufreq_policy {
 	struct cpufreq_frequency_table	*freq_table;
 	enum cpufreq_table_sorting freq_table_sorted;
 
+	struct thermal_cooling_device *cooldev; /* Pointer to the cooling
+						 * device if used for thermal mitigation */
 	struct list_head        policy_list;
 	struct kobject		kobj;
 	struct completion	kobj_unregister;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v1 3/7] cpu_cooling: Add generic driver ready callback
  2019-01-10  0:00 [PATCH v1 0/7] Thermal throttling for SDM845 Amit Kucheria
  2019-01-10  0:00 ` [PATCH v1 1/7] drivers: thermal: of-thermal: Print name of device node with error Amit Kucheria
  2019-01-10  0:00 ` [PATCH v1 2/7] drivers: cpufreq: Add thermal_cooling_device pointer to struct cpufreq_policy Amit Kucheria
@ 2019-01-10  0:00 ` Amit Kucheria
  2019-01-10  6:14   ` Viresh Kumar
  2019-01-10  0:00 ` [PATCH v1 4/7] cpufreq: qcom-hw: Move to device_initcall Amit Kucheria
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Amit Kucheria @ 2019-01-10  0:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arm-msm, bjorn.andersson, viresh.kumar, edubezval,
	andy.gross, tdas, swboyd, dianders, mka, Amit Daniel Kachhap,
	Javi Merino, Zhang Rui, Daniel Lezcano,
	open list:THERMAL/CPU_COOLING

All cpufreq drivers do similar things to register as a cooling device.
Provide a generic call back so we can get rid of duplicated code in the
drivers.

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
Suggested-by: Stephen Boyd <swboyd@chromium.org>
---
 drivers/thermal/cpu_cooling.c | 18 ++++++++++++++++++
 include/linux/cpu_cooling.h   |  9 +++++++++
 2 files changed, 27 insertions(+)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index dfd23245f778..5154ffc12332 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -694,6 +694,7 @@ __cpufreq_cooling_register(struct device_node *np,
 
 	cpufreq_cdev->clipped_freq = cpufreq_cdev->freq_table[0].frequency;
 	cpufreq_cdev->cdev = cdev;
+	policy->cooldev = cdev;
 
 	mutex_lock(&cooling_list_lock);
 	/* Register the notifier for first cpufreq cooling device */
@@ -785,6 +786,23 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 }
 EXPORT_SYMBOL_GPL(of_cpufreq_cooling_register);
 
+/**
+ * generic_cpufreq_ready - generic driver callback to register a thermal_cooling_device
+ * @policy: cpufreq policy
+ */
+void generic_cpufreq_ready(struct cpufreq_policy *policy)
+{
+	struct thermal_cooling_device **cdev = &policy->cooldev;
+
+	*cdev = of_cpufreq_cooling_register(policy);
+	if (IS_ERR(*cdev)) {
+		pr_err("Failed to register cpufreq cooling device for CPU%d: %ld\n",
+		       policy->cpu, PTR_ERR(cdev));
+		cdev = NULL;
+	}
+}
+EXPORT_SYMBOL_GPL(generic_cpufreq_ready);
+
 /**
  * cpufreq_cooling_unregister - function to remove cpufreq cooling device.
  * @cdev: thermal cooling device pointer.
diff --git a/include/linux/cpu_cooling.h b/include/linux/cpu_cooling.h
index de0dafb9399d..d7815bb2967a 100644
--- a/include/linux/cpu_cooling.h
+++ b/include/linux/cpu_cooling.h
@@ -65,12 +65,21 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
  */
 struct thermal_cooling_device *
 of_cpufreq_cooling_register(struct cpufreq_policy *policy);
+
+/**
+ * generic_cpufreq_ready - generic driver callback to register a thermal_cooling_device
+ * @policy: cpufreq policy
+ */
+void generic_cpufreq_ready(struct cpufreq_policy *policy);
 #else
 static inline struct thermal_cooling_device *
 of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 {
 	return NULL;
 }
+
+void generic_cpufreq_ready(struct cpufreq_policy *policy) {}
+
 #endif /* defined(CONFIG_THERMAL_OF) && defined(CONFIG_CPU_THERMAL) */
 
 #endif /* __CPU_COOLING_H__ */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v1 4/7] cpufreq: qcom-hw: Move to device_initcall
  2019-01-10  0:00 [PATCH v1 0/7] Thermal throttling for SDM845 Amit Kucheria
                   ` (2 preceding siblings ...)
  2019-01-10  0:00 ` [PATCH v1 3/7] cpu_cooling: Add generic driver ready callback Amit Kucheria
@ 2019-01-10  0:00 ` Amit Kucheria
  2019-01-10  6:44   ` Viresh Kumar
  2019-01-10  0:00 ` [PATCH v1 5/7] cpufreq: qcom-hw: Register as a cpufreq cooling device Amit Kucheria
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 37+ messages in thread
From: Amit Kucheria @ 2019-01-10  0:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arm-msm, bjorn.andersson, viresh.kumar, edubezval,
	andy.gross, tdas, swboyd, dianders, mka, Rafael J. Wysocki,
	open list:CPU FREQUENCY DRIVERS

subsys_initcall causes problems registering the driver as a thermal
cooling device.

If "faster boot" is the main reason for doing subsys_initcall, this
should be handled in the bootloader or another boot constraint
framework.

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
---
 drivers/cpufreq/qcom-cpufreq-hw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index d83939a1b3d4..649dddd72749 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -296,7 +296,7 @@ static int __init qcom_cpufreq_hw_init(void)
 {
 	return platform_driver_register(&qcom_cpufreq_hw_driver);
 }
-subsys_initcall(qcom_cpufreq_hw_init);
+device_initcall(qcom_cpufreq_hw_init);
 
 static void __exit qcom_cpufreq_hw_exit(void)
 {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v1 5/7] cpufreq: qcom-hw: Register as a cpufreq cooling device
  2019-01-10  0:00 [PATCH v1 0/7] Thermal throttling for SDM845 Amit Kucheria
                   ` (3 preceding siblings ...)
  2019-01-10  0:00 ` [PATCH v1 4/7] cpufreq: qcom-hw: Move to device_initcall Amit Kucheria
@ 2019-01-10  0:00 ` Amit Kucheria
  2019-01-10  6:12   ` Viresh Kumar
  2019-01-10  0:00 ` [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees Amit Kucheria
  2019-01-10  0:00 ` [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq Amit Kucheria
  6 siblings, 1 reply; 37+ messages in thread
From: Amit Kucheria @ 2019-01-10  0:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arm-msm, bjorn.andersson, viresh.kumar, edubezval,
	andy.gross, tdas, swboyd, dianders, mka, Rafael J. Wysocki,
	open list:CPU FREQUENCY DRIVERS

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
---
 drivers/cpufreq/qcom-cpufreq-hw.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index 649dddd72749..1c01311e5927 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -5,6 +5,7 @@
 
 #include <linux/bitfield.h>
 #include <linux/cpufreq.h>
+#include <linux/cpu_cooling.h>
 #include <linux/init.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
@@ -216,7 +217,10 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
 static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
 {
 	void __iomem *base = policy->driver_data - REG_PERF_STATE;
+	struct thermal_cooling_device *cdev = policy->cooldev;
 
+	if (cdev)
+		cpufreq_cooling_unregister(cdev);
 	kfree(policy->freq_table);
 	devm_iounmap(&global_pdev->dev, base);
 
@@ -238,6 +242,7 @@ static struct cpufreq_driver cpufreq_qcom_hw_driver = {
 	.init		= qcom_cpufreq_hw_cpu_init,
 	.exit		= qcom_cpufreq_hw_cpu_exit,
 	.fast_switch    = qcom_cpufreq_hw_fast_switch,
+	.ready		= generic_cpufreq_ready,
 	.name		= "qcom-cpufreq-hw",
 	.attr		= qcom_cpufreq_hw_attr,
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees
  2019-01-10  0:00 [PATCH v1 0/7] Thermal throttling for SDM845 Amit Kucheria
                   ` (4 preceding siblings ...)
  2019-01-10  0:00 ` [PATCH v1 5/7] cpufreq: qcom-hw: Register as a cpufreq cooling device Amit Kucheria
@ 2019-01-10  0:00 ` Amit Kucheria
  2019-01-10  0:29   ` Stephen Boyd
  2019-01-10  1:15   ` Matthias Kaehlcke
  2019-01-10  0:00 ` [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq Amit Kucheria
  6 siblings, 2 replies; 37+ messages in thread
From: Amit Kucheria @ 2019-01-10  0:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arm-msm, bjorn.andersson, viresh.kumar, edubezval,
	andy.gross, tdas, swboyd, dianders, mka, David Brown,
	Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

75 degrees is too aggressive for throttling the CPU. After speaking to
Qualcomm engineers, increase it to 95 degrees.

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
---
 arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index c27cbd3bcb0a..29e823b0caf4 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -1692,7 +1692,7 @@
 
 			trips {
 				cpu_alert0: trip0 {
-					temperature = <75000>;
+					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
@@ -1713,7 +1713,7 @@
 
 			trips {
 				cpu_alert1: trip0 {
-					temperature = <75000>;
+					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
@@ -1734,7 +1734,7 @@
 
 			trips {
 				cpu_alert2: trip0 {
-					temperature = <75000>;
+					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
@@ -1755,7 +1755,7 @@
 
 			trips {
 				cpu_alert3: trip0 {
-					temperature = <75000>;
+					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
@@ -1776,7 +1776,7 @@
 
 			trips {
 				cpu_alert4: trip0 {
-					temperature = <75000>;
+					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
@@ -1797,7 +1797,7 @@
 
 			trips {
 				cpu_alert5: trip0 {
-					temperature = <75000>;
+					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
@@ -1818,7 +1818,7 @@
 
 			trips {
 				cpu_alert6: trip0 {
-					temperature = <75000>;
+					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
@@ -1839,7 +1839,7 @@
 
 			trips {
 				cpu_alert7: trip0 {
-					temperature = <75000>;
+					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq
  2019-01-10  0:00 [PATCH v1 0/7] Thermal throttling for SDM845 Amit Kucheria
                   ` (5 preceding siblings ...)
  2019-01-10  0:00 ` [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees Amit Kucheria
@ 2019-01-10  0:00 ` Amit Kucheria
  2019-01-10  0:28   ` Stephen Boyd
                     ` (2 more replies)
  6 siblings, 3 replies; 37+ messages in thread
From: Amit Kucheria @ 2019-01-10  0:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-arm-msm, bjorn.andersson, viresh.kumar, edubezval,
	andy.gross, tdas, swboyd, dianders, mka, David Brown,
	Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

Since the big and little cpus are in the same frequency domain, use all
of them for mitigation in the cooling-map. At the lower trip points we
restrict ourselves to throttling only a few OPPs. At higher trip
temperatures, allow ourselves to be throttled to any extent.

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
---
 arch/arm64/boot/dts/qcom/sdm845.dtsi | 145 +++++++++++++++++++++++++++
 1 file changed, 145 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 29e823b0caf4..cd6402a9aa64 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -13,6 +13,7 @@
 #include <dt-bindings/reset/qcom,sdm845-aoss.h>
 #include <dt-bindings/soc/qcom,rpmh-rsc.h>
 #include <dt-bindings/clock/qcom,gcc-sdm845.h>
+#include <dt-bindings/thermal/thermal.h>
 
 / {
 	interrupt-parent = <&intc>;
@@ -99,6 +100,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x0>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_0>;
 			L2_0: l2-cache {
 				compatible = "cache";
@@ -114,6 +116,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x100>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_100>;
 			L2_100: l2-cache {
 				compatible = "cache";
@@ -126,6 +129,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x200>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_200>;
 			L2_200: l2-cache {
 				compatible = "cache";
@@ -138,6 +142,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x300>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_300>;
 			L2_300: l2-cache {
 				compatible = "cache";
@@ -150,6 +155,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x400>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_400>;
 			L2_400: l2-cache {
 				compatible = "cache";
@@ -162,6 +168,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x500>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_500>;
 			L2_500: l2-cache {
 				compatible = "cache";
@@ -174,6 +181,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x600>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_600>;
 			L2_600: l2-cache {
 				compatible = "cache";
@@ -186,6 +194,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x700>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_700>;
 			L2_700: l2-cache {
 				compatible = "cache";
@@ -1703,6 +1712,23 @@
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu_alert0>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
+							 <&CPU1 THERMAL_NO_LIMIT 4>,
+							 <&CPU2 THERMAL_NO_LIMIT 4>,
+							 <&CPU3 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu_crit0>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 
 		cpu1-thermal {
@@ -1724,6 +1750,23 @@
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu_alert1>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
+							 <&CPU1 THERMAL_NO_LIMIT 4>,
+							 <&CPU2 THERMAL_NO_LIMIT 4>,
+							 <&CPU3 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu_crit1>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 
 		cpu2-thermal {
@@ -1745,6 +1788,23 @@
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu_alert2>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
+							 <&CPU1 THERMAL_NO_LIMIT 4>,
+							 <&CPU2 THERMAL_NO_LIMIT 4>,
+							 <&CPU3 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu_crit2>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 
 		cpu3-thermal {
@@ -1766,6 +1826,23 @@
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu_alert3>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
+							 <&CPU1 THERMAL_NO_LIMIT 4>,
+							 <&CPU2 THERMAL_NO_LIMIT 4>,
+							 <&CPU3 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu_crit3>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 
 		cpu4-thermal {
@@ -1787,6 +1864,23 @@
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu_alert4>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT 4>,
+							 <&CPU5 THERMAL_NO_LIMIT 4>,
+							 <&CPU6 THERMAL_NO_LIMIT 4>,
+							 <&CPU7 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu_crit4>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 
 		cpu5-thermal {
@@ -1808,6 +1902,23 @@
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu_alert5>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT 4>,
+							 <&CPU5 THERMAL_NO_LIMIT 4>,
+							 <&CPU6 THERMAL_NO_LIMIT 4>,
+							 <&CPU7 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu_crit5>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 
 		cpu6-thermal {
@@ -1829,6 +1940,23 @@
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu_alert6>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT 4>,
+							 <&CPU5 THERMAL_NO_LIMIT 4>,
+							 <&CPU6 THERMAL_NO_LIMIT 4>,
+							 <&CPU7 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu_crit6>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 
 		cpu7-thermal {
@@ -1850,6 +1978,23 @@
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu_alert7>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT 4>,
+							 <&CPU5 THERMAL_NO_LIMIT 4>,
+							 <&CPU6 THERMAL_NO_LIMIT 4>,
+							 <&CPU7 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu_crit7>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 	};
 };
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 1/7] drivers: thermal: of-thermal: Print name of device node with error
  2019-01-10  0:00 ` [PATCH v1 1/7] drivers: thermal: of-thermal: Print name of device node with error Amit Kucheria
@ 2019-01-10  0:15   ` Stephen Boyd
  0 siblings, 0 replies; 37+ messages in thread
From: Stephen Boyd @ 2019-01-10  0:15 UTC (permalink / raw)
  To: Amit Kucheria, linux-kernel
  Cc: linux-arm-msm, bjorn.andersson, viresh.kumar, edubezval,
	andy.gross, tdas, dianders, mka, Zhang Rui, Daniel Lezcano,
	linux-pm

Quoting Amit Kucheria (2019-01-09 16:00:50)
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> ---
>  drivers/thermal/of-thermal.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/thermal/of-thermal.c b/drivers/thermal/of-thermal.c
> index 4bfdb4a1e47d..5ca2a6b370ea 100644
> --- a/drivers/thermal/of-thermal.c
> +++ b/drivers/thermal/of-thermal.c
> @@ -867,14 +867,14 @@ __init *thermal_of_build_thermal_zone(struct device_node *np)
>  
>         ret = of_property_read_u32(np, "polling-delay-passive", &prop);
>         if (ret < 0) {
> -               pr_err("missing polling-delay-passive property\n");
> +               pr_err("%s: missing polling-delay-passive property\n", np->name);

You should use %pOFn to print node names.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq
  2019-01-10  0:00 ` [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq Amit Kucheria
@ 2019-01-10  0:28   ` Stephen Boyd
  2019-01-10 12:28     ` Amit Kucheria
  2019-01-10  2:22   ` Matthias Kaehlcke
  2019-01-11  0:30   ` Matthias Kaehlcke
  2 siblings, 1 reply; 37+ messages in thread
From: Stephen Boyd @ 2019-01-10  0:28 UTC (permalink / raw)
  To: Amit Kucheria, linux-kernel
  Cc: linux-arm-msm, bjorn.andersson, viresh.kumar, edubezval,
	andy.gross, tdas, dianders, mka, David Brown, Rob Herring,
	Mark Rutland, devicetree

Quoting Amit Kucheria (2019-01-09 16:00:56)
> Since the big and little cpus are in the same frequency domain, use all

Oh? I thought the big and little cpus were in different frequency
domains and voltage domains. Maybe that's what you're saying here but
I'm misunderstanding. So change the wording a bit to be more clear?

> of them for mitigation in the cooling-map. At the lower trip points we
> restrict ourselves to throttling only a few OPPs. At higher trip
> temperatures, allow ourselves to be throttled to any extent.
> 
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees
  2019-01-10  0:00 ` [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees Amit Kucheria
@ 2019-01-10  0:29   ` Stephen Boyd
  2019-01-10 17:14     ` Doug Anderson
  2019-01-10 20:06     ` Amit Kucheria
  2019-01-10  1:15   ` Matthias Kaehlcke
  1 sibling, 2 replies; 37+ messages in thread
From: Stephen Boyd @ 2019-01-10  0:29 UTC (permalink / raw)
  To: Amit Kucheria, linux-kernel
  Cc: linux-arm-msm, bjorn.andersson, viresh.kumar, edubezval,
	andy.gross, tdas, dianders, mka, David Brown, Rob Herring,
	Mark Rutland, devicetree

Quoting Amit Kucheria (2019-01-09 16:00:55)
> 75 degrees is too aggressive for throttling the CPU. After speaking to
> Qualcomm engineers, increase it to 95 degrees.
> 
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> ---
>  arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------

Is the plan that these are some defaults that would be adjusted by board
variants? Just curious why we have anything in here and don't punt it
all to each board dts file.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 2/7] drivers: cpufreq: Add thermal_cooling_device pointer to struct cpufreq_policy
  2019-01-10  0:00 ` [PATCH v1 2/7] drivers: cpufreq: Add thermal_cooling_device pointer to struct cpufreq_policy Amit Kucheria
@ 2019-01-10  1:01   ` Matthias Kaehlcke
  0 siblings, 0 replies; 37+ messages in thread
From: Matthias Kaehlcke @ 2019-01-10  1:01 UTC (permalink / raw)
  To: Amit Kucheria
  Cc: linux-kernel, linux-arm-msm, bjorn.andersson, viresh.kumar,
	edubezval, andy.gross, tdas, swboyd, dianders, Rafael J. Wysocki,
	open list:CPU FREQUENCY DRIVERS

On Thu, Jan 10, 2019 at 05:30:51AM +0530, Amit Kucheria wrote:
> Several cpufreq drivers register themselves as thermal cooling devices.
> Adding a pointer to struct cpufreq_policy removes the need for them to
> store this pointer in a private data structure.
> 
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> ---
>  include/linux/cpufreq.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index c86d6d8bdfed..2496549d7573 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -95,6 +95,8 @@ struct cpufreq_policy {
>  	struct cpufreq_frequency_table	*freq_table;
>  	enum cpufreq_table_sorting freq_table_sorted;
>  
> +	struct thermal_cooling_device *cooldev; /* Pointer to the cooling
> +						 * device if used for thermal mitigation */
>  	struct list_head        policy_list;
>  	struct kobject		kobj;
>  	struct completion	kobj_unregister;

I've mixed feelings about this. It's definitely desirable to avoid
code duplication and tying the cooling device to the cpufreq_policy is
a convenient way to achieve that. However semantically it seems a bit
odd that a CPU cooling device is part of the cpufreq policy.

Anyway, unless there are better ideas we probably want to be pragmatic
here, so if Viresh is fine with it who am I to complain ;-)

Cheers

Matthias



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees
  2019-01-10  0:00 ` [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees Amit Kucheria
  2019-01-10  0:29   ` Stephen Boyd
@ 2019-01-10  1:15   ` Matthias Kaehlcke
  2019-01-10  2:15     ` Matthias Kaehlcke
  2019-01-11 10:24     ` Amit Kucheria
  1 sibling, 2 replies; 37+ messages in thread
From: Matthias Kaehlcke @ 2019-01-10  1:15 UTC (permalink / raw)
  To: Amit Kucheria
  Cc: linux-kernel, linux-arm-msm, bjorn.andersson, viresh.kumar,
	edubezval, andy.gross, tdas, swboyd, dianders, David Brown,
	Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

Hi Amit,

On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote:
> 75 degrees is too aggressive for throttling the CPU. After speaking to
> Qualcomm engineers, increase it to 95 degrees.
> 
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> ---
>  arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> index c27cbd3bcb0a..29e823b0caf4 100644
> --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> @@ -1692,7 +1692,7 @@
>  
>  			trips {
>  				cpu_alert0: trip0 {
> -					temperature = <75000>;
> +					temperature = <95000>;
>  					hysteresis = <2000>;
>  					type = "passive";
>  				};
> @@ -1713,7 +1713,7 @@
>  
>  			trips {
>  				cpu_alert1: trip0 {
> -					temperature = <75000>;
> +					temperature = <95000>;
>  					hysteresis = <2000>;
>  					type = "passive";
>  				};
> @@ -1734,7 +1734,7 @@
>  
>  			trips {
>  				cpu_alert2: trip0 {
> -					temperature = <75000>;
> +					temperature = <95000>;
>  					hysteresis = <2000>;
>  					type = "passive";
>  				};
> @@ -1755,7 +1755,7 @@
>  
>  			trips {
>  				cpu_alert3: trip0 {
> -					temperature = <75000>;
> +					temperature = <95000>;
>  					hysteresis = <2000>;
>  					type = "passive";
>  				};
> @@ -1776,7 +1776,7 @@
>  
>  			trips {
>  				cpu_alert4: trip0 {
> -					temperature = <75000>;
> +					temperature = <95000>;
>  					hysteresis = <2000>;
>  					type = "passive";
>  				};
> @@ -1797,7 +1797,7 @@
>  
>  			trips {
>  				cpu_alert5: trip0 {
> -					temperature = <75000>;
> +					temperature = <95000>;
>  					hysteresis = <2000>;
>  					type = "passive";
>  				};
> @@ -1818,7 +1818,7 @@
>  
>  			trips {
>  				cpu_alert6: trip0 {
> -					temperature = <75000>;
> +					temperature = <95000>;
>  					hysteresis = <2000>;
>  					type = "passive";
>  				};
> @@ -1839,7 +1839,7 @@
>  
>  			trips {
>  				cpu_alert7: trip0 {
> -					temperature = <75000>;
> +					temperature = <95000>;
>  					hysteresis = <2000>;
>  					type = "passive";
>  				};

The change itself looks good to me, however I wonder if it would be
worth to eliminate redundancy and merge the current 8 thermal zones
into 2, one for the Silver and one for the Gold cluster (as done by
http://crrev.com/c/1381752). There is a single cooling device for
each cluster, so it's not clear to me if there is any gain from having
a separate thermal zone for each CPU. If it is important to monitor
the temperatures of the individual cores this can still be done by
configuring the thermal zone of the cluster with multiple thermal
sensors.

Cheers

Matthias

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees
  2019-01-10  1:15   ` Matthias Kaehlcke
@ 2019-01-10  2:15     ` Matthias Kaehlcke
  2019-01-10 19:45       ` Amit Kucheria
  2019-01-11 10:24     ` Amit Kucheria
  1 sibling, 1 reply; 37+ messages in thread
From: Matthias Kaehlcke @ 2019-01-10  2:15 UTC (permalink / raw)
  To: Amit Kucheria
  Cc: linux-kernel, linux-arm-msm, bjorn.andersson, viresh.kumar,
	edubezval, andy.gross, tdas, swboyd, dianders, David Brown,
	Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On Wed, Jan 09, 2019 at 05:15:33PM -0800, Matthias Kaehlcke wrote:
> Hi Amit,
> 
> On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote:
> > 75 degrees is too aggressive for throttling the CPU. After speaking to
> > Qualcomm engineers, increase it to 95 degrees.
> > 
> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > ---
> >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> > 
> > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > index c27cbd3bcb0a..29e823b0caf4 100644
> > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > @@ -1692,7 +1692,7 @@
> >  
> >  			trips {
> >  				cpu_alert0: trip0 {
> > -					temperature = <75000>;
> > +					temperature = <95000>;
> >  					hysteresis = <2000>;
> >  					type = "passive";
> >  				};
> > @@ -1713,7 +1713,7 @@
> >  
> >  			trips {
> >  				cpu_alert1: trip0 {
> > -					temperature = <75000>;
> > +					temperature = <95000>;
> >  					hysteresis = <2000>;
> >  					type = "passive";
> >  				};
> > @@ -1734,7 +1734,7 @@
> >  
> >  			trips {
> >  				cpu_alert2: trip0 {
> > -					temperature = <75000>;
> > +					temperature = <95000>;
> >  					hysteresis = <2000>;
> >  					type = "passive";
> >  				};
> > @@ -1755,7 +1755,7 @@
> >  
> >  			trips {
> >  				cpu_alert3: trip0 {
> > -					temperature = <75000>;
> > +					temperature = <95000>;
> >  					hysteresis = <2000>;
> >  					type = "passive";
> >  				};
> > @@ -1776,7 +1776,7 @@
> >  
> >  			trips {
> >  				cpu_alert4: trip0 {
> > -					temperature = <75000>;
> > +					temperature = <95000>;
> >  					hysteresis = <2000>;
> >  					type = "passive";
> >  				};
> > @@ -1797,7 +1797,7 @@
> >  
> >  			trips {
> >  				cpu_alert5: trip0 {
> > -					temperature = <75000>;
> > +					temperature = <95000>;
> >  					hysteresis = <2000>;
> >  					type = "passive";
> >  				};
> > @@ -1818,7 +1818,7 @@
> >  
> >  			trips {
> >  				cpu_alert6: trip0 {
> > -					temperature = <75000>;
> > +					temperature = <95000>;
> >  					hysteresis = <2000>;
> >  					type = "passive";
> >  				};
> > @@ -1839,7 +1839,7 @@
> >  
> >  			trips {
> >  				cpu_alert7: trip0 {
> > -					temperature = <75000>;
> > +					temperature = <95000>;
> >  					hysteresis = <2000>;
> >  					type = "passive";
> >  				};
> 
> The change itself looks good to me, however I wonder if it would be
> worth to eliminate redundancy and merge the current 8 thermal zones
> into 2, one for the Silver and one for the Gold cluster (as done by
> http://crrev.com/c/1381752). There is a single cooling device for
> each cluster, so it's not clear to me if there is any gain from having
> a separate thermal zone for each CPU. If it is important to monitor
> the temperatures of the individual cores this can still be done by
> configuring the thermal zone of the cluster with multiple thermal
> sensors.

I see your idea is to have a cooling device per CPU ("arm64: dts:
sdm845: wireup the thermal trip points to cpufreq" /
https://lore.kernel.org/patchwork/patch/1030742/), however that
doesn't work as intended. Only two cpufreq 'devices' are created,
one for CPU0 and one for CPU4. In consequence cpufreq->ready() only
runs for these cores and only two cooling devices are
registered. Since the cores of a cluster all run at the same
frequency I also doubt if having multiple cooling devices would
bring any benefits.

Cheers

Matthias

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq
  2019-01-10  0:00 ` [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq Amit Kucheria
  2019-01-10  0:28   ` Stephen Boyd
@ 2019-01-10  2:22   ` Matthias Kaehlcke
  2019-01-10  6:23     ` Viresh Kumar
  2019-01-11  0:30   ` Matthias Kaehlcke
  2 siblings, 1 reply; 37+ messages in thread
From: Matthias Kaehlcke @ 2019-01-10  2:22 UTC (permalink / raw)
  To: Amit Kucheria
  Cc: linux-kernel, linux-arm-msm, bjorn.andersson, viresh.kumar,
	edubezval, andy.gross, tdas, swboyd, dianders, David Brown,
	Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

Hi Amit,

On Thu, Jan 10, 2019 at 05:30:56AM +0530, Amit Kucheria wrote:
> Since the big and little cpus are in the same frequency domain, use all
> of them for mitigation in the cooling-map. At the lower trip points we
> restrict ourselves to throttling only a few OPPs. At higher trip
> temperatures, allow ourselves to be throttled to any extent.
> 
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> ---
>  arch/arm64/boot/dts/qcom/sdm845.dtsi | 145 +++++++++++++++++++++++++++
>  1 file changed, 145 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> index 29e823b0caf4..cd6402a9aa64 100644
> --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> @@ -13,6 +13,7 @@
>  #include <dt-bindings/reset/qcom,sdm845-aoss.h>
>  #include <dt-bindings/soc/qcom,rpmh-rsc.h>
>  #include <dt-bindings/clock/qcom,gcc-sdm845.h>
> +#include <dt-bindings/thermal/thermal.h>
>  
>  / {
>  	interrupt-parent = <&intc>;
> @@ -99,6 +100,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x0>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_0>;
>  			L2_0: l2-cache {
>  				compatible = "cache";
> @@ -114,6 +116,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x100>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;

This is not needed (also applies to other for other non-policy
cores). A single cpufreq device is created per frequency domain /
cluster, hence a single cooling device is registered per cluster,
which IMO makes sense given that the CPUs of a cluster can't change
their frequencies independently.

>  			next-level-cache = <&L2_100>;
>  			L2_100: l2-cache {
>  				compatible = "cache";
> @@ -126,6 +129,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x200>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_200>;
>  			L2_200: l2-cache {
>  				compatible = "cache";
> @@ -138,6 +142,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x300>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_300>;
>  			L2_300: l2-cache {
>  				compatible = "cache";
> @@ -150,6 +155,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x400>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_400>;
>  			L2_400: l2-cache {
>  				compatible = "cache";
> @@ -162,6 +168,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x500>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_500>;
>  			L2_500: l2-cache {
>  				compatible = "cache";
> @@ -174,6 +181,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x600>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_600>;
>  			L2_600: l2-cache {
>  				compatible = "cache";
> @@ -186,6 +194,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x700>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_700>;
>  			L2_700: l2-cache {
>  				compatible = "cache";
> @@ -1703,6 +1712,23 @@
>  					type = "critical";
>  				};
>  			};
> +
> +			cooling-maps {
> +				map0 {
> +					trip = <&cpu_alert0>;
> +					cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
> +							 <&CPU1 THERMAL_NO_LIMIT 4>,

As per above, there are no cooling devices for CPU1-3 and CPU5-7.

Cheers

Matthias


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 5/7] cpufreq: qcom-hw: Register as a cpufreq cooling device
  2019-01-10  0:00 ` [PATCH v1 5/7] cpufreq: qcom-hw: Register as a cpufreq cooling device Amit Kucheria
@ 2019-01-10  6:12   ` Viresh Kumar
  2019-01-10  9:03     ` Amit Kucheria
  2019-01-10  9:32     ` Rafael J. Wysocki
  0 siblings, 2 replies; 37+ messages in thread
From: Viresh Kumar @ 2019-01-10  6:12 UTC (permalink / raw)
  To: Amit Kucheria
  Cc: linux-kernel, linux-arm-msm, bjorn.andersson, edubezval,
	andy.gross, tdas, swboyd, dianders, mka, Rafael J. Wysocki,
	open list:CPU FREQUENCY DRIVERS

On 10-01-19, 05:30, Amit Kucheria wrote:
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> ---
>  drivers/cpufreq/qcom-cpufreq-hw.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index 649dddd72749..1c01311e5927 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -5,6 +5,7 @@
>  
>  #include <linux/bitfield.h>
>  #include <linux/cpufreq.h>
> +#include <linux/cpu_cooling.h>
>  #include <linux/init.h>
>  #include <linux/kernel.h>
>  #include <linux/module.h>
> @@ -216,7 +217,10 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
>  static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
>  {
>  	void __iomem *base = policy->driver_data - REG_PERF_STATE;
> +	struct thermal_cooling_device *cdev = policy->cooldev;
>  
> +	if (cdev)
> +		cpufreq_cooling_unregister(cdev);
>  	kfree(policy->freq_table);
>  	devm_iounmap(&global_pdev->dev, base);
>  
> @@ -238,6 +242,7 @@ static struct cpufreq_driver cpufreq_qcom_hw_driver = {
>  	.init		= qcom_cpufreq_hw_cpu_init,
>  	.exit		= qcom_cpufreq_hw_cpu_exit,
>  	.fast_switch    = qcom_cpufreq_hw_fast_switch,
> +	.ready		= generic_cpufreq_ready,
>  	.name		= "qcom-cpufreq-hw",
>  	.attr		= qcom_cpufreq_hw_attr,
>  };

I liked the idea of reducing code duplication, but not much the
implementation. All we were able to get rid of was a call to
of_cpufreq_cooling_register() and nothing else. Is it worth it ?

Maybe we can add another flag in cpufreq.h:

#define CPUFREQ_AUTO_REGISTER_COOLING_DEV (1 << 7)

and let the core do it all automatically by itself, that will get rid
of code duplication actually.

@Rafael: What do you say ?

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 3/7] cpu_cooling: Add generic driver ready callback
  2019-01-10  0:00 ` [PATCH v1 3/7] cpu_cooling: Add generic driver ready callback Amit Kucheria
@ 2019-01-10  6:14   ` Viresh Kumar
  0 siblings, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2019-01-10  6:14 UTC (permalink / raw)
  To: Amit Kucheria
  Cc: linux-kernel, linux-arm-msm, bjorn.andersson, edubezval,
	andy.gross, tdas, swboyd, dianders, mka, Amit Daniel Kachhap,
	Javi Merino, Zhang Rui, Daniel Lezcano,
	open list:THERMAL/CPU_COOLING

On 10-01-19, 05:30, Amit Kucheria wrote:
> All cpufreq drivers do similar things to register as a cooling device.
> Provide a generic call back so we can get rid of duplicated code in the
> drivers.
> 
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> Suggested-by: Stephen Boyd <swboyd@chromium.org>
> ---
>  drivers/thermal/cpu_cooling.c | 18 ++++++++++++++++++
>  include/linux/cpu_cooling.h   |  9 +++++++++
>  2 files changed, 27 insertions(+)

We may be doing this differently, but I found some other issues with
the patch.

> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index dfd23245f778..5154ffc12332 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -694,6 +694,7 @@ __cpufreq_cooling_register(struct device_node *np,
>  
>  	cpufreq_cdev->clipped_freq = cpufreq_cdev->freq_table[0].frequency;
>  	cpufreq_cdev->cdev = cdev;
> +	policy->cooldev = cdev;

Why was this required ? The below routine was already doing it...

>  
>  	mutex_lock(&cooling_list_lock);
>  	/* Register the notifier for first cpufreq cooling device */
> @@ -785,6 +786,23 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
>  }
>  EXPORT_SYMBOL_GPL(of_cpufreq_cooling_register);
>  
> +/**
> + * generic_cpufreq_ready - generic driver callback to register a thermal_cooling_device
> + * @policy: cpufreq policy
> + */
> +void generic_cpufreq_ready(struct cpufreq_policy *policy)
> +{
> +	struct thermal_cooling_device **cdev = &policy->cooldev;
> +
> +	*cdev = of_cpufreq_cooling_register(policy);

... here

> +	if (IS_ERR(*cdev)) {

We never get an error here, only NULL.

> +		pr_err("Failed to register cpufreq cooling device for CPU%d: %ld\n",
> +		       policy->cpu, PTR_ERR(cdev));

The of_cpufreq_cooling_register() routine already prints enough error
messages on failures.

> +		cdev = NULL;

This is a meaningless statement, perhaps you wanted to do *cdev = NULL
:)

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq
  2019-01-10  2:22   ` Matthias Kaehlcke
@ 2019-01-10  6:23     ` Viresh Kumar
  2019-01-10 18:42       ` Matthias Kaehlcke
  0 siblings, 1 reply; 37+ messages in thread
From: Viresh Kumar @ 2019-01-10  6:23 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: Amit Kucheria, linux-kernel, linux-arm-msm, bjorn.andersson,
	edubezval, andy.gross, tdas, swboyd, dianders, David Brown,
	Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On 09-01-19, 18:22, Matthias Kaehlcke wrote:
> Hi Amit,
> 
> On Thu, Jan 10, 2019 at 05:30:56AM +0530, Amit Kucheria wrote:
> > Since the big and little cpus are in the same frequency domain, use all
> > of them for mitigation in the cooling-map. At the lower trip points we
> > restrict ourselves to throttling only a few OPPs. At higher trip
> > temperatures, allow ourselves to be throttled to any extent.
> > 
> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > ---
> >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 145 +++++++++++++++++++++++++++
> >  1 file changed, 145 insertions(+)
> > 
> > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > index 29e823b0caf4..cd6402a9aa64 100644
> > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > @@ -13,6 +13,7 @@
> >  #include <dt-bindings/reset/qcom,sdm845-aoss.h>
> >  #include <dt-bindings/soc/qcom,rpmh-rsc.h>
> >  #include <dt-bindings/clock/qcom,gcc-sdm845.h>
> > +#include <dt-bindings/thermal/thermal.h>
> >  
> >  / {
> >  	interrupt-parent = <&intc>;
> > @@ -99,6 +100,7 @@
> >  			compatible = "qcom,kryo385";
> >  			reg = <0x0 0x0>;
> >  			enable-method = "psci";
> > +			#cooling-cells = <2>;
> >  			next-level-cache = <&L2_0>;
> >  			L2_0: l2-cache {
> >  				compatible = "cache";
> > @@ -114,6 +116,7 @@
> >  			compatible = "qcom,kryo385";
> >  			reg = <0x0 0x100>;
> >  			enable-method = "psci";
> > +			#cooling-cells = <2>;
> 
> This is not needed (also applies to other for other non-policy
> cores). A single cpufreq device is created per frequency domain /
> cluster, hence a single cooling device is registered per cluster,
> which IMO makes sense given that the CPUs of a cluster can't change
> their frequencies independently.
 
> As per above, there are no cooling devices for CPU1-3 and CPU5-7.

lore.kernel.org/lkml/cover.1527244200.git.viresh.kumar@linaro.org
lore.kernel.org/lkml/b687bb6035fbb010383f4511a206abb4006679fa.1527244201.git.viresh.kumar@linaro.org

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 4/7] cpufreq: qcom-hw: Move to device_initcall
  2019-01-10  0:00 ` [PATCH v1 4/7] cpufreq: qcom-hw: Move to device_initcall Amit Kucheria
@ 2019-01-10  6:44   ` Viresh Kumar
  0 siblings, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2019-01-10  6:44 UTC (permalink / raw)
  To: Amit Kucheria
  Cc: linux-kernel, linux-arm-msm, bjorn.andersson, edubezval,
	andy.gross, tdas, swboyd, dianders, mka, Rafael J. Wysocki,
	open list:CPU FREQUENCY DRIVERS

On 10-01-19, 05:30, Amit Kucheria wrote:
> subsys_initcall causes problems registering the driver as a thermal
> cooling device.
> 
> If "faster boot" is the main reason for doing subsys_initcall, this
> should be handled in the bootloader or another boot constraint
> framework.
> 
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> ---
>  drivers/cpufreq/qcom-cpufreq-hw.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index d83939a1b3d4..649dddd72749 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -296,7 +296,7 @@ static int __init qcom_cpufreq_hw_init(void)
>  {
>  	return platform_driver_register(&qcom_cpufreq_hw_driver);
>  }
> -subsys_initcall(qcom_cpufreq_hw_init);
> +device_initcall(qcom_cpufreq_hw_init);
>  
>  static void __exit qcom_cpufreq_hw_exit(void)
>  {

Applied. Thanks.

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 5/7] cpufreq: qcom-hw: Register as a cpufreq cooling device
  2019-01-10  6:12   ` Viresh Kumar
@ 2019-01-10  9:03     ` Amit Kucheria
  2019-01-10  9:32     ` Rafael J. Wysocki
  1 sibling, 0 replies; 37+ messages in thread
From: Amit Kucheria @ 2019-01-10  9:03 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: LKML, linux-arm-msm, Bjorn Andersson, Eduardo Valentin,
	Andy Gross, Taniya Das, Stephen Boyd, Douglas Anderson,
	Matthias Kaehlcke, Rafael J. Wysocki,
	open list:CPU FREQUENCY DRIVERS

On Thu, Jan 10, 2019 at 11:42 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 10-01-19, 05:30, Amit Kucheria wrote:
> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > ---
> >  drivers/cpufreq/qcom-cpufreq-hw.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> > index 649dddd72749..1c01311e5927 100644
> > --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> > +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> > @@ -5,6 +5,7 @@
> >
> >  #include <linux/bitfield.h>
> >  #include <linux/cpufreq.h>
> > +#include <linux/cpu_cooling.h>
> >  #include <linux/init.h>
> >  #include <linux/kernel.h>
> >  #include <linux/module.h>
> > @@ -216,7 +217,10 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
> >  static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
> >  {
> >       void __iomem *base = policy->driver_data - REG_PERF_STATE;
> > +     struct thermal_cooling_device *cdev = policy->cooldev;
> >
> > +     if (cdev)
> > +             cpufreq_cooling_unregister(cdev);
> >       kfree(policy->freq_table);
> >       devm_iounmap(&global_pdev->dev, base);
> >
> > @@ -238,6 +242,7 @@ static struct cpufreq_driver cpufreq_qcom_hw_driver = {
> >       .init           = qcom_cpufreq_hw_cpu_init,
> >       .exit           = qcom_cpufreq_hw_cpu_exit,
> >       .fast_switch    = qcom_cpufreq_hw_fast_switch,
> > +     .ready          = generic_cpufreq_ready,
> >       .name           = "qcom-cpufreq-hw",
> >       .attr           = qcom_cpufreq_hw_attr,
> >  };
>
> I liked the idea of reducing code duplication, but not much the
> implementation. All we were able to get rid of was a call to
> of_cpufreq_cooling_register() and nothing else. Is it worth it ?
>
> Maybe we can add another flag in cpufreq.h:
>
> #define CPUFREQ_AUTO_REGISTER_COOLING_DEV (1 << 7)
>
> and let the core do it all automatically by itself, that will get rid
> of code duplication actually.

I like the idea of a flag. I'll spin something implementing it in the next rev.

> @Rafael: What do you say ?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 5/7] cpufreq: qcom-hw: Register as a cpufreq cooling device
  2019-01-10  6:12   ` Viresh Kumar
  2019-01-10  9:03     ` Amit Kucheria
@ 2019-01-10  9:32     ` Rafael J. Wysocki
  1 sibling, 0 replies; 37+ messages in thread
From: Rafael J. Wysocki @ 2019-01-10  9:32 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Amit Kucheria, Linux Kernel Mailing List, linux-arm-msm,
	bjorn.andersson, Eduardo Valentin, Andy Gross, Taniya Das,
	Stephen Boyd, Doug Anderson, Matthias Kaehlcke,
	Rafael J. Wysocki, open list:CPU FREQUENCY DRIVERS

On Thu, Jan 10, 2019 at 7:12 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 10-01-19, 05:30, Amit Kucheria wrote:
> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > ---
> >  drivers/cpufreq/qcom-cpufreq-hw.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> > index 649dddd72749..1c01311e5927 100644
> > --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> > +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> > @@ -5,6 +5,7 @@
> >
> >  #include <linux/bitfield.h>
> >  #include <linux/cpufreq.h>
> > +#include <linux/cpu_cooling.h>
> >  #include <linux/init.h>
> >  #include <linux/kernel.h>
> >  #include <linux/module.h>
> > @@ -216,7 +217,10 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
> >  static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
> >  {
> >       void __iomem *base = policy->driver_data - REG_PERF_STATE;
> > +     struct thermal_cooling_device *cdev = policy->cooldev;
> >
> > +     if (cdev)
> > +             cpufreq_cooling_unregister(cdev);
> >       kfree(policy->freq_table);
> >       devm_iounmap(&global_pdev->dev, base);
> >
> > @@ -238,6 +242,7 @@ static struct cpufreq_driver cpufreq_qcom_hw_driver = {
> >       .init           = qcom_cpufreq_hw_cpu_init,
> >       .exit           = qcom_cpufreq_hw_cpu_exit,
> >       .fast_switch    = qcom_cpufreq_hw_fast_switch,
> > +     .ready          = generic_cpufreq_ready,
> >       .name           = "qcom-cpufreq-hw",
> >       .attr           = qcom_cpufreq_hw_attr,
> >  };
>
> I liked the idea of reducing code duplication, but not much the
> implementation. All we were able to get rid of was a call to
> of_cpufreq_cooling_register() and nothing else. Is it worth it ?
>
> Maybe we can add another flag in cpufreq.h:
>
> #define CPUFREQ_AUTO_REGISTER_COOLING_DEV (1 << 7)
>
> and let the core do it all automatically by itself, that will get rid
> of code duplication actually.
>
> @Rafael: What do you say ?

Getting rid of code duplication is good, let's do that.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq
  2019-01-10  0:28   ` Stephen Boyd
@ 2019-01-10 12:28     ` Amit Kucheria
  0 siblings, 0 replies; 37+ messages in thread
From: Amit Kucheria @ 2019-01-10 12:28 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: LKML, linux-arm-msm, Bjorn Andersson, Viresh Kumar,
	Eduardo Valentin, Andy Gross, Taniya Das, Douglas Anderson,
	Matthias Kaehlcke, David Brown, Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On Thu, Jan 10, 2019 at 5:58 AM Stephen Boyd <swboyd@chromium.org> wrote:
>
> Quoting Amit Kucheria (2019-01-09 16:00:56)
> > Since the big and little cpus are in the same frequency domain, use all
>
> Oh? I thought the big and little cpus were in different frequency
> domains and voltage domains. Maybe that's what you're saying here but
> I'm misunderstanding. So change the wording a bit to be more clear?

Yeah, forgive my English - it is my second language. ;-)

That should've been "since all the cpus in the big and little clusters
are in the same frequency domain". Will fix.

> > of them for mitigation in the cooling-map. At the lower trip points we
> > restrict ourselves to throttling only a few OPPs. At higher trip
> > temperatures, allow ourselves to be throttled to any extent.
> >
> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees
  2019-01-10  0:29   ` Stephen Boyd
@ 2019-01-10 17:14     ` Doug Anderson
  2019-01-10 20:06     ` Amit Kucheria
  1 sibling, 0 replies; 37+ messages in thread
From: Doug Anderson @ 2019-01-10 17:14 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Amit Kucheria, LKML, linux-arm-msm, Bjorn Andersson,
	Viresh Kumar, Eduardo Valentin, Andy Gross, Taniya Das,
	Matthias Kaehlcke, David Brown, Rob Herring, Mark Rutland,
	devicetree

Hi,

On Wed, Jan 9, 2019 at 4:29 PM Stephen Boyd <swboyd@chromium.org> wrote:
>
> Quoting Amit Kucheria (2019-01-09 16:00:55)
> > 75 degrees is too aggressive for throttling the CPU. After speaking to
> > Qualcomm engineers, increase it to 95 degrees.
> >
> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > ---
> >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------
>
> Is the plan that these are some defaults that would be adjusted by board
> variants? Just curious why we have anything in here and don't punt it
> all to each board dts file.

My preference would be that the SoC device tree file should contain
thermal numbers that are important to pay attention to for the safety
/ proper operation of the SoC.  ...then individual boards could (if
they needed to) override with lower values to control, for instance,
skin temperature.

From experience with previous boards, if you've got enough an off-SoC
thermistors then those are the ones you'd want to monitor to control
skin temperature.  It's OK if the SoC spikes up quite high as long as
that heat has somewhere to go (like a heat pipe).  The sensors that
are part of Amit's patch are on-chip.

...if you've got a board without external thermistors and are using
the SoC's on-chip sensors as a proxy for the heat in the overall
system then you might want to lower your values in the board device
tree file.  You won't be able to have as many short term spikes, but
that's what you gotta do without the extra sensors.


Sound sane?

-Doug

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq
  2019-01-10  6:23     ` Viresh Kumar
@ 2019-01-10 18:42       ` Matthias Kaehlcke
  2019-01-11  3:46         ` Viresh Kumar
  0 siblings, 1 reply; 37+ messages in thread
From: Matthias Kaehlcke @ 2019-01-10 18:42 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Amit Kucheria, linux-kernel, linux-arm-msm, bjorn.andersson,
	edubezval, andy.gross, tdas, swboyd, dianders, David Brown,
	Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On Thu, Jan 10, 2019 at 11:53:59AM +0530, Viresh Kumar wrote:
> On 09-01-19, 18:22, Matthias Kaehlcke wrote:
> > Hi Amit,
> > 
> > On Thu, Jan 10, 2019 at 05:30:56AM +0530, Amit Kucheria wrote:
> > > Since the big and little cpus are in the same frequency domain, use all
> > > of them for mitigation in the cooling-map. At the lower trip points we
> > > restrict ourselves to throttling only a few OPPs. At higher trip
> > > temperatures, allow ourselves to be throttled to any extent.
> > > 
> > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > > ---
> > >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 145 +++++++++++++++++++++++++++
> > >  1 file changed, 145 insertions(+)
> > > 
> > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > index 29e823b0caf4..cd6402a9aa64 100644
> > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > @@ -13,6 +13,7 @@
> > >  #include <dt-bindings/reset/qcom,sdm845-aoss.h>
> > >  #include <dt-bindings/soc/qcom,rpmh-rsc.h>
> > >  #include <dt-bindings/clock/qcom,gcc-sdm845.h>
> > > +#include <dt-bindings/thermal/thermal.h>
> > >  
> > >  / {
> > >  	interrupt-parent = <&intc>;
> > > @@ -99,6 +100,7 @@
> > >  			compatible = "qcom,kryo385";
> > >  			reg = <0x0 0x0>;
> > >  			enable-method = "psci";
> > > +			#cooling-cells = <2>;
> > >  			next-level-cache = <&L2_0>;
> > >  			L2_0: l2-cache {
> > >  				compatible = "cache";
> > > @@ -114,6 +116,7 @@
> > >  			compatible = "qcom,kryo385";
> > >  			reg = <0x0 0x100>;
> > >  			enable-method = "psci";
> > > +			#cooling-cells = <2>;
> > 
> > This is not needed (also applies to other for other non-policy
> > cores). A single cpufreq device is created per frequency domain /
> > cluster, hence a single cooling device is registered per cluster,
> > which IMO makes sense given that the CPUs of a cluster can't change
> > their frequencies independently.
>  
> > As per above, there are no cooling devices for CPU1-3 and CPU5-7.
> 
> lore.kernel.org/lkml/cover.1527244200.git.viresh.kumar@linaro.org
> lore.kernel.org/lkml/b687bb6035fbb010383f4511a206abb4006679fa.1527244201.git.viresh.kumar@linaro.org

Thanks for the pointer, there's always something new to learn!

Ok, so the policy CPU and hence the CPU registered as cooling
device may vary. I understand that this requires to list all possible
cooling devices, even though only one will be active at any given
time. However I wonder if we could change this:


From 103703a46495ff210a521b5b6fbf32632053c64f Mon Sep 17 00:00:00 2001
From: Matthias Kaehlcke <mka@chromium.org>
Date: Thu, 10 Jan 2019 09:48:38 -0800
Subject: [PATCH] thermal: cpu_cooling: always use first CPU of a freq domain
 as cooling device

For all CPUs of a frequency domain a single cooling device is
registered, since the CPUs can't switch their frequencies
independently from each other. The cpufreq policy CPU is used to
represent cooling device of the frequency domain. Which CPU is the
policy CPU may vary based on the order of initialization or CPU
hotplug.

For device tree based platform the above implies that cooling maps
must include a list of all possible cooling devices of a frequency
domain, even though only one of them will exist at any given time.

For example:

cooling-maps {
	map0 {
		trip = <&cpu_alert0>;
		cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
				 <&CPU1 THERMAL_NO_LIMIT 4>,
				 <&CPU2 THERMAL_NO_LIMIT 4>,
				 <&CPU3 THERMAL_NO_LIMIT 4>;
	};
	map1 {
		trip = <&cpu_crit0>;
		cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
				 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
				 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
				 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
	};
};

This can be avoided by using always the first CPU of a frequency
domain as cooling device. It may happen that the first CPU is offline
when the cooling device is registered (e.g. CPU2 is initialized
first in the above example), hence the nominal cooling device might
be offline. This may seem odd, however it is not really different from
the current behavior: when the policy CPU is taking offline the cooling
device corresponding to it remains active, unless it is unregistered
because all other CPUs of the frequency domain are offline too.

A single cooling device associated with a specific CPU of the frequency
domain reduces redundant device tree clutter in CPU nodes and cooling
maps.

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
---
 drivers/thermal/cpu_cooling.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index dfd23245f778a..bb5ea06f893a2 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -758,13 +758,14 @@ EXPORT_SYMBOL_GPL(cpufreq_cooling_register);
 struct thermal_cooling_device *
 of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 {
-	struct device_node *np = of_get_cpu_node(policy->cpu, NULL);
+	unsigned int first_cpu = cpumask_first(policy->related_cpus);
+	struct device_node *np = of_get_cpu_node(first_cpu, NULL);
 	struct thermal_cooling_device *cdev = NULL;
 	u32 capacitance = 0;
 
 	if (!np) {
 		pr_err("cpu_cooling: OF node not available for cpu%d\n",
-		       policy->cpu);
+		       first_cpu);
 		return NULL;
 	}
 
@@ -775,7 +776,7 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 		cdev = __cpufreq_cooling_register(np, policy, capacitance);
 		if (IS_ERR(cdev)) {
 			pr_err("cpu_cooling: cpu%d is not running as cooling device: %ld\n",
-			       policy->cpu, PTR_ERR(cdev));
+			       first_cpu, PTR_ERR(cdev));
 			cdev = NULL;
 		}
 	}


Would that make sense or is there something I'm overlooking?

Cheers

Matthias

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees
  2019-01-10  2:15     ` Matthias Kaehlcke
@ 2019-01-10 19:45       ` Amit Kucheria
  2019-01-10 20:00         ` Matthias Kaehlcke
  0 siblings, 1 reply; 37+ messages in thread
From: Amit Kucheria @ 2019-01-10 19:45 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: LKML, linux-arm-msm, Bjorn Andersson, Viresh Kumar,
	Eduardo Valentin, Andy Gross, Taniya Das, Stephen Boyd,
	Douglas Anderson, David Brown, Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On Thu, Jan 10, 2019 at 7:45 AM Matthias Kaehlcke <mka@chromium.org> wrote:
>
> On Wed, Jan 09, 2019 at 05:15:33PM -0800, Matthias Kaehlcke wrote:
> > Hi Amit,
> >
> > On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote:
> > > 75 degrees is too aggressive for throttling the CPU. After speaking to
> > > Qualcomm engineers, increase it to 95 degrees.
> > >
> > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > > ---
> > >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------
> > >  1 file changed, 8 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > index c27cbd3bcb0a..29e823b0caf4 100644
> > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > @@ -1692,7 +1692,7 @@
> > >
> > >                     trips {
> > >                             cpu_alert0: trip0 {
> > > -                                   temperature = <75000>;
> > > +                                   temperature = <95000>;
> > >                                     hysteresis = <2000>;
> > >                                     type = "passive";
> > >                             };
> > > @@ -1713,7 +1713,7 @@
> > >
> > >                     trips {
> > >                             cpu_alert1: trip0 {
> > > -                                   temperature = <75000>;
> > > +                                   temperature = <95000>;
> > >                                     hysteresis = <2000>;
> > >                                     type = "passive";
> > >                             };
> > > @@ -1734,7 +1734,7 @@
> > >
> > >                     trips {
> > >                             cpu_alert2: trip0 {
> > > -                                   temperature = <75000>;
> > > +                                   temperature = <95000>;
> > >                                     hysteresis = <2000>;
> > >                                     type = "passive";
> > >                             };
> > > @@ -1755,7 +1755,7 @@
> > >
> > >                     trips {
> > >                             cpu_alert3: trip0 {
> > > -                                   temperature = <75000>;
> > > +                                   temperature = <95000>;
> > >                                     hysteresis = <2000>;
> > >                                     type = "passive";
> > >                             };
> > > @@ -1776,7 +1776,7 @@
> > >
> > >                     trips {
> > >                             cpu_alert4: trip0 {
> > > -                                   temperature = <75000>;
> > > +                                   temperature = <95000>;
> > >                                     hysteresis = <2000>;
> > >                                     type = "passive";
> > >                             };
> > > @@ -1797,7 +1797,7 @@
> > >
> > >                     trips {
> > >                             cpu_alert5: trip0 {
> > > -                                   temperature = <75000>;
> > > +                                   temperature = <95000>;
> > >                                     hysteresis = <2000>;
> > >                                     type = "passive";
> > >                             };
> > > @@ -1818,7 +1818,7 @@
> > >
> > >                     trips {
> > >                             cpu_alert6: trip0 {
> > > -                                   temperature = <75000>;
> > > +                                   temperature = <95000>;
> > >                                     hysteresis = <2000>;
> > >                                     type = "passive";
> > >                             };
> > > @@ -1839,7 +1839,7 @@
> > >
> > >                     trips {
> > >                             cpu_alert7: trip0 {
> > > -                                   temperature = <75000>;
> > > +                                   temperature = <95000>;
> > >                                     hysteresis = <2000>;
> > >                                     type = "passive";
> > >                             };
> >
> > The change itself looks good to me, however I wonder if it would be
> > worth to eliminate redundancy and merge the current 8 thermal zones
> > into 2, one for the Silver and one for the Gold cluster (as done by
> > http://crrev.com/c/1381752). There is a single cooling device for
> > each cluster, so it's not clear to me if there is any gain from having
> > a separate thermal zone for each CPU. If it is important to monitor
> > the temperatures of the individual cores this can still be done by
> > configuring the thermal zone of the cluster with multiple thermal
> > sensors.
>
> I see your idea is to have a cooling device per CPU ("arm64: dts:
> sdm845: wireup the thermal trip points to cpufreq" /
> https://lore.kernel.org/patchwork/patch/1030742/), however that
> doesn't work as intended. Only two cpufreq 'devices' are created,
> one for CPU0 and one for CPU4. In consequence cpufreq->ready() only
> runs for these cores and only two cooling devices are
> registered. Since the cores of a cluster all run at the same
> frequency I also doubt if having multiple cooling devices would
> bring any benefits.

I actually only intended for two cooling devices - one for each
frequency domain. I'll clarify it better in the patch description.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees
  2019-01-10 19:45       ` Amit Kucheria
@ 2019-01-10 20:00         ` Matthias Kaehlcke
  2019-01-11  3:32           ` Viresh Kumar
  0 siblings, 1 reply; 37+ messages in thread
From: Matthias Kaehlcke @ 2019-01-10 20:00 UTC (permalink / raw)
  To: Amit Kucheria
  Cc: LKML, linux-arm-msm, Bjorn Andersson, Viresh Kumar,
	Eduardo Valentin, Andy Gross, Taniya Das, Stephen Boyd,
	Douglas Anderson, David Brown, Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On Fri, Jan 11, 2019 at 01:15:09AM +0530, Amit Kucheria wrote:
> On Thu, Jan 10, 2019 at 7:45 AM Matthias Kaehlcke <mka@chromium.org> wrote:
> >
> > On Wed, Jan 09, 2019 at 05:15:33PM -0800, Matthias Kaehlcke wrote:
> > > Hi Amit,
> > >
> > > On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote:
> > > > 75 degrees is too aggressive for throttling the CPU. After speaking to
> > > > Qualcomm engineers, increase it to 95 degrees.
> > > >
> > > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > > > ---
> > > >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------
> > > >  1 file changed, 8 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > > index c27cbd3bcb0a..29e823b0caf4 100644
> > > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > > @@ -1692,7 +1692,7 @@
> > > >
> > > >                     trips {
> > > >                             cpu_alert0: trip0 {
> > > > -                                   temperature = <75000>;
> > > > +                                   temperature = <95000>;
> > > >                                     hysteresis = <2000>;
> > > >                                     type = "passive";
> > > >                             };
> > > > @@ -1713,7 +1713,7 @@
> > > >
> > > >                     trips {
> > > >                             cpu_alert1: trip0 {
> > > > -                                   temperature = <75000>;
> > > > +                                   temperature = <95000>;
> > > >                                     hysteresis = <2000>;
> > > >                                     type = "passive";
> > > >                             };
> > > > @@ -1734,7 +1734,7 @@
> > > >
> > > >                     trips {
> > > >                             cpu_alert2: trip0 {
> > > > -                                   temperature = <75000>;
> > > > +                                   temperature = <95000>;
> > > >                                     hysteresis = <2000>;
> > > >                                     type = "passive";
> > > >                             };
> > > > @@ -1755,7 +1755,7 @@
> > > >
> > > >                     trips {
> > > >                             cpu_alert3: trip0 {
> > > > -                                   temperature = <75000>;
> > > > +                                   temperature = <95000>;
> > > >                                     hysteresis = <2000>;
> > > >                                     type = "passive";
> > > >                             };
> > > > @@ -1776,7 +1776,7 @@
> > > >
> > > >                     trips {
> > > >                             cpu_alert4: trip0 {
> > > > -                                   temperature = <75000>;
> > > > +                                   temperature = <95000>;
> > > >                                     hysteresis = <2000>;
> > > >                                     type = "passive";
> > > >                             };
> > > > @@ -1797,7 +1797,7 @@
> > > >
> > > >                     trips {
> > > >                             cpu_alert5: trip0 {
> > > > -                                   temperature = <75000>;
> > > > +                                   temperature = <95000>;
> > > >                                     hysteresis = <2000>;
> > > >                                     type = "passive";
> > > >                             };
> > > > @@ -1818,7 +1818,7 @@
> > > >
> > > >                     trips {
> > > >                             cpu_alert6: trip0 {
> > > > -                                   temperature = <75000>;
> > > > +                                   temperature = <95000>;
> > > >                                     hysteresis = <2000>;
> > > >                                     type = "passive";
> > > >                             };
> > > > @@ -1839,7 +1839,7 @@
> > > >
> > > >                     trips {
> > > >                             cpu_alert7: trip0 {
> > > > -                                   temperature = <75000>;
> > > > +                                   temperature = <95000>;
> > > >                                     hysteresis = <2000>;
> > > >                                     type = "passive";
> > > >                             };
> > >
> > > The change itself looks good to me, however I wonder if it would be
> > > worth to eliminate redundancy and merge the current 8 thermal zones
> > > into 2, one for the Silver and one for the Gold cluster (as done by
> > > http://crrev.com/c/1381752). There is a single cooling device for
> > > each cluster, so it's not clear to me if there is any gain from having
> > > a separate thermal zone for each CPU. If it is important to monitor
> > > the temperatures of the individual cores this can still be done by
> > > configuring the thermal zone of the cluster with multiple thermal
> > > sensors.
> >
> > I see your idea is to have a cooling device per CPU ("arm64: dts:
> > sdm845: wireup the thermal trip points to cpufreq" /
> > https://lore.kernel.org/patchwork/patch/1030742/), however that
> > doesn't work as intended. Only two cpufreq 'devices' are created,
> > one for CPU0 and one for CPU4. In consequence cpufreq->ready() only
> > runs for these cores and only two cooling devices are
> > registered. Since the cores of a cluster all run at the same
> > frequency I also doubt if having multiple cooling devices would
> > bring any benefits.
> 
> I actually only intended for two cooling devices - one for each
> frequency domain. I'll clarify it better in the patch description.

Viresh helped me understand that we currently need to add cooling
device entries for all CPUs to the DT, even though at most one will be
active per freq domain at any time (I wonder if this could be changed
though).

Independently of the cooling devices, is there any value in having a
thermal zone for each CPU instead of having only one per freq domain?

Thanks

Matthias

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees
  2019-01-10  0:29   ` Stephen Boyd
  2019-01-10 17:14     ` Doug Anderson
@ 2019-01-10 20:06     ` Amit Kucheria
  1 sibling, 0 replies; 37+ messages in thread
From: Amit Kucheria @ 2019-01-10 20:06 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: LKML, linux-arm-msm, Bjorn Andersson, Viresh Kumar,
	Eduardo Valentin, Andy Gross, Taniya Das, Douglas Anderson,
	Matthias Kaehlcke, David Brown, Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On Thu, Jan 10, 2019 at 5:59 AM Stephen Boyd <swboyd@chromium.org> wrote:
>
> Quoting Amit Kucheria (2019-01-09 16:00:55)
> > 75 degrees is too aggressive for throttling the CPU. After speaking to
> > Qualcomm engineers, increase it to 95 degrees.
> >
> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > ---
> >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------
>
> Is the plan that these are some defaults that would be adjusted by board
> variants? Just curious why we have anything in here and don't punt it
> all to each board dts file.

These values are meant to be safe values for silicon operation as
characterised[1] by the HW team. So I'd consider them a more than just
default values. It also gives future boards a safe starting point.

IMHO it would be better to have the characterized values merged
upstream and if really required, those values can be tweaked in the
board-specific DTS.

[1] Caveat: The characterisation probably focused on mobile devices
and might change depending on form-factor, active cooling and heat
dissipation design but those differences should only impact the skin
temperature, not the operation of the SoC itself.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq
  2019-01-10  0:00 ` [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq Amit Kucheria
  2019-01-10  0:28   ` Stephen Boyd
  2019-01-10  2:22   ` Matthias Kaehlcke
@ 2019-01-11  0:30   ` Matthias Kaehlcke
  2019-01-11 11:17     ` Amit Kucheria
  2 siblings, 1 reply; 37+ messages in thread
From: Matthias Kaehlcke @ 2019-01-11  0:30 UTC (permalink / raw)
  To: Amit Kucheria
  Cc: linux-kernel, linux-arm-msm, bjorn.andersson, viresh.kumar,
	edubezval, andy.gross, tdas, swboyd, dianders, David Brown,
	Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On Thu, Jan 10, 2019 at 05:30:56AM +0530, Amit Kucheria wrote:
> Since the big and little cpus are in the same frequency domain, use all
> of them for mitigation in the cooling-map. At the lower trip points we
> restrict ourselves to throttling only a few OPPs. At higher trip
> temperatures, allow ourselves to be throttled to any extent.
> 
> Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> ---
>  arch/arm64/boot/dts/qcom/sdm845.dtsi | 145 +++++++++++++++++++++++++++
>  1 file changed, 145 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> index 29e823b0caf4..cd6402a9aa64 100644
> --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> @@ -13,6 +13,7 @@
>  #include <dt-bindings/reset/qcom,sdm845-aoss.h>
>  #include <dt-bindings/soc/qcom,rpmh-rsc.h>
>  #include <dt-bindings/clock/qcom,gcc-sdm845.h>
> +#include <dt-bindings/thermal/thermal.h>
>  
>  / {
>  	interrupt-parent = <&intc>;
> @@ -99,6 +100,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x0>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_0>;
>  			L2_0: l2-cache {
>  				compatible = "cache";
> @@ -114,6 +116,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x100>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_100>;
>  			L2_100: l2-cache {
>  				compatible = "cache";
> @@ -126,6 +129,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x200>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_200>;
>  			L2_200: l2-cache {
>  				compatible = "cache";
> @@ -138,6 +142,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x300>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_300>;
>  			L2_300: l2-cache {
>  				compatible = "cache";
> @@ -150,6 +155,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x400>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_400>;
>  			L2_400: l2-cache {
>  				compatible = "cache";
> @@ -162,6 +168,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x500>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_500>;
>  			L2_500: l2-cache {
>  				compatible = "cache";
> @@ -174,6 +181,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x600>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_600>;
>  			L2_600: l2-cache {
>  				compatible = "cache";
> @@ -186,6 +194,7 @@
>  			compatible = "qcom,kryo385";
>  			reg = <0x0 0x700>;
>  			enable-method = "psci";
> +			#cooling-cells = <2>;
>  			next-level-cache = <&L2_700>;
>  			L2_700: l2-cache {
>  				compatible = "cache";
> @@ -1703,6 +1712,23 @@
>  					type = "critical";
>  				};
>  			};
> +
> +			cooling-maps {
> +				map0 {
> +					trip = <&cpu_alert0>;
> +					cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
> +							 <&CPU1 THERMAL_NO_LIMIT 4>,
> +							 <&CPU2 THERMAL_NO_LIMIT 4>,
> +							 <&CPU3 THERMAL_NO_LIMIT 4>;
> +				};
> +				map1 {
> +					trip = <&cpu_crit0>;
> +					cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> +							 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> +							 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> +							 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> +				};
> +			};

Slightly off-topic, buy maybe not so much since we are just starting
to use the trip points:

Currently we use the naming scheme 'cpu_<type>N' for trip points. I
anticipate that we're going to add more passive trip points soon, to
keep the 'power_allocator' thermal governor happy, which expects a
'switch_on' and a 'desired_temperature' trip point. With the current
naming scheme this could become a bit messy. I suggest to change it to
'cpuN_<type>[X]', which would allow for something like 'cpuN_alert0'
and 'cpuN_alert1'.

If you think the change makes sense you can consider to do it within
this series, I can also send a separate patch once it has landed.

You could also consider to add the additional trip point in this
series if you agree that it will be needed.

This is not necessarily a call for action, just thinking loudly about
a closely related topic ;-)

Cheers

Matthias

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees
  2019-01-10 20:00         ` Matthias Kaehlcke
@ 2019-01-11  3:32           ` Viresh Kumar
  0 siblings, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2019-01-11  3:32 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: Amit Kucheria, LKML, linux-arm-msm, Bjorn Andersson,
	Eduardo Valentin, Andy Gross, Taniya Das, Stephen Boyd,
	Douglas Anderson, David Brown, Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On 10-01-19, 12:00, Matthias Kaehlcke wrote:
> Viresh helped me understand that we currently need to add cooling
> device entries for all CPUs to the DT, even though at most one will be
> active per freq domain at any time (I wonder if this could be changed
> though).

Actually we were only adding cooling-cells in CPU0 until now and I
fixed that, so that is going to stay :)

The idea is that the hardware should be described properly and not
partially. Even if all the CPUs are part of the same freq-domain, they
are all capable of being a cooling device here and the DT should
describe that. Kernel will ofcourse create a single cooling device.

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq
  2019-01-10 18:42       ` Matthias Kaehlcke
@ 2019-01-11  3:46         ` Viresh Kumar
  2019-01-11 19:58           ` Matthias Kaehlcke
  0 siblings, 1 reply; 37+ messages in thread
From: Viresh Kumar @ 2019-01-11  3:46 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: Amit Kucheria, linux-kernel, linux-arm-msm, bjorn.andersson,
	edubezval, andy.gross, tdas, swboyd, dianders, David Brown,
	Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On 10-01-19, 10:42, Matthias Kaehlcke wrote:
> Thanks for the pointer, there's always something new to learn!
> 
> Ok, so the policy CPU and hence the CPU registered as cooling
> device may vary. I understand that this requires to list all possible
> cooling devices,

I won't say that I changed DT because of a design issue with kernel,
rather the DT shall be complete by itself and that's why that change
was made.

And then we can have more things going on. For example with cpuidle
cooling, we can individually control each CPU (and force idle on that)
even if all CPUs are part of the same freq-domain. Each CPU shall
expose its capabilities.

> even though only one will be active at any given
> time. However I wonder if we could change this:

I won't say it that way. I see it as all the CPUs are active during a
cooling state, i.e. they are all participating.
 
> >From 103703a46495ff210a521b5b6fbf32632053c64f Mon Sep 17 00:00:00 2001
> From: Matthias Kaehlcke <mka@chromium.org>
> Date: Thu, 10 Jan 2019 09:48:38 -0800
> Subject: [PATCH] thermal: cpu_cooling: always use first CPU of a freq domain
>  as cooling device
> 
> For all CPUs of a frequency domain a single cooling device is
> registered, since the CPUs can't switch their frequencies
> independently from each other. The cpufreq policy CPU is used to
> represent cooling device of the frequency domain. Which CPU is the
> policy CPU may vary based on the order of initialization or CPU
> hotplug.
> 
> For device tree based platform the above implies that cooling maps
> must include a list of all possible cooling devices of a frequency
> domain, even though only one of them will exist at any given time.
> 
> For example:
> 
> cooling-maps {
> 	map0 {
> 		trip = <&cpu_alert0>;
> 		cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
> 				 <&CPU1 THERMAL_NO_LIMIT 4>,
> 				 <&CPU2 THERMAL_NO_LIMIT 4>,
> 				 <&CPU3 THERMAL_NO_LIMIT 4>;
> 	};
> 	map1 {
> 		trip = <&cpu_crit0>;
> 		cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> 				 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> 				 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> 				 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;

This is the right thing to do hardware description wise, no matter
what the kernel does.

> 	};
> };
> 
> This can be avoided by using always the first CPU of a frequency
> domain as cooling device. It may happen that the first CPU is offline
> when the cooling device is registered (e.g. CPU2 is initialized
> first in the above example), hence the nominal cooling device might
> be offline. This may seem odd, however it is not really different from
> the current behavior: when the policy CPU is taking offline the cooling
> device corresponding to it remains active, unless it is unregistered
> because all other CPUs of the frequency domain are offline too.
> 
> A single cooling device associated with a specific CPU of the frequency
> domain reduces redundant device tree clutter in CPU nodes and cooling
> maps.
> 
> Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> ---
>  drivers/thermal/cpu_cooling.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index dfd23245f778a..bb5ea06f893a2 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -758,13 +758,14 @@ EXPORT_SYMBOL_GPL(cpufreq_cooling_register);
>  struct thermal_cooling_device *
>  of_cpufreq_cooling_register(struct cpufreq_policy *policy)
>  {
> -	struct device_node *np = of_get_cpu_node(policy->cpu, NULL);
> +	unsigned int first_cpu = cpumask_first(policy->related_cpus);
> +	struct device_node *np = of_get_cpu_node(first_cpu, NULL);
>  	struct thermal_cooling_device *cdev = NULL;
>  	u32 capacitance = 0;
>  
>  	if (!np) {
>  		pr_err("cpu_cooling: OF node not available for cpu%d\n",
> -		       policy->cpu);
> +		       first_cpu);
>  		return NULL;
>  	}
>  
> @@ -775,7 +776,7 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
>  		cdev = __cpufreq_cooling_register(np, policy, capacitance);
>  		if (IS_ERR(cdev)) {
>  			pr_err("cpu_cooling: cpu%d is not running as cooling device: %ld\n",
> -			       policy->cpu, PTR_ERR(cdev));
> +			       first_cpu, PTR_ERR(cdev));
>  			cdev = NULL;
>  		}
>  	}
> 
> 
> Would that make sense or is there something I'm overlooking?

I don't see any benefits of this to be honest. Even if we make this
change, the DT should remain in its current form.

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees
  2019-01-10  1:15   ` Matthias Kaehlcke
  2019-01-10  2:15     ` Matthias Kaehlcke
@ 2019-01-11 10:24     ` Amit Kucheria
  2019-01-11 18:30       ` Matthias Kaehlcke
  1 sibling, 1 reply; 37+ messages in thread
From: Amit Kucheria @ 2019-01-11 10:24 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: LKML, linux-arm-msm, Bjorn Andersson, Viresh Kumar,
	Eduardo Valentin, Andy Gross, Taniya Das, Stephen Boyd,
	Douglas Anderson, David Brown, Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On Thu, Jan 10, 2019 at 6:45 AM Matthias Kaehlcke <mka@chromium.org> wrote:
>
> Hi Amit,
>
> On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote:
> > 75 degrees is too aggressive for throttling the CPU. After speaking to
> > Qualcomm engineers, increase it to 95 degrees.
> >
> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > ---
> >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > index c27cbd3bcb0a..29e823b0caf4 100644
> > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > @@ -1692,7 +1692,7 @@
> >
> >                       trips {
> >                               cpu_alert0: trip0 {
> > -                                     temperature = <75000>;
> > +                                     temperature = <95000>;
> >                                       hysteresis = <2000>;
> >                                       type = "passive";
> >                               };
> > @@ -1713,7 +1713,7 @@
> >
> >                       trips {
> >                               cpu_alert1: trip0 {
> > -                                     temperature = <75000>;
> > +                                     temperature = <95000>;
> >                                       hysteresis = <2000>;
> >                                       type = "passive";
> >                               };
> > @@ -1734,7 +1734,7 @@
> >
> >                       trips {
> >                               cpu_alert2: trip0 {
> > -                                     temperature = <75000>;
> > +                                     temperature = <95000>;
> >                                       hysteresis = <2000>;
> >                                       type = "passive";
> >                               };
> > @@ -1755,7 +1755,7 @@
> >
> >                       trips {
> >                               cpu_alert3: trip0 {
> > -                                     temperature = <75000>;
> > +                                     temperature = <95000>;
> >                                       hysteresis = <2000>;
> >                                       type = "passive";
> >                               };
> > @@ -1776,7 +1776,7 @@
> >
> >                       trips {
> >                               cpu_alert4: trip0 {
> > -                                     temperature = <75000>;
> > +                                     temperature = <95000>;
> >                                       hysteresis = <2000>;
> >                                       type = "passive";
> >                               };
> > @@ -1797,7 +1797,7 @@
> >
> >                       trips {
> >                               cpu_alert5: trip0 {
> > -                                     temperature = <75000>;
> > +                                     temperature = <95000>;
> >                                       hysteresis = <2000>;
> >                                       type = "passive";
> >                               };
> > @@ -1818,7 +1818,7 @@
> >
> >                       trips {
> >                               cpu_alert6: trip0 {
> > -                                     temperature = <75000>;
> > +                                     temperature = <95000>;
> >                                       hysteresis = <2000>;
> >                                       type = "passive";
> >                               };
> > @@ -1839,7 +1839,7 @@
> >
> >                       trips {
> >                               cpu_alert7: trip0 {
> > -                                     temperature = <75000>;
> > +                                     temperature = <95000>;
> >                                       hysteresis = <2000>;
> >                                       type = "passive";
> >                               };
>
> The change itself looks good to me, however I wonder if it would be
> worth to eliminate redundancy and merge the current 8 thermal zones
> into 2, one for the Silver and one for the Gold cluster (as done by
> http://crrev.com/c/1381752). There is a single cooling device for
> each cluster, so it's not clear to me if there is any gain from having
> a separate thermal zone for each CPU. If it is important to monitor
> the temperatures of the individual cores this can still be done by
> configuring the thermal zone of the cluster with multiple thermal
> sensors.

Reducing the number of thermal zones to 2 (by grouping 4 sensors per
zone) is not possible due a limitation of the thermal framework[1]. It
is something that we want to address. Previous attempts to fix this
were rejected for various reasons. Eduardo was going to share a way to
have more flexible mapping between sensors and zones after discussions
at LPC.

<nag> Eduardo, do you have anything we can review? </nag> :-)

Having said that, we'll need some aggregation functions when we add
multiple sensors to a zone (e.g. max, mean) to reflect the zone. This
will lose information about hotspots and prevent things like idle
injection on a particular CPU that is causing most of the heat in the
aggregated zone. So IMHO, it might be useful to have information about
the hotspots (i.e TZ per sensor) and aggregated values (ambient
temperature) that can be fed to the thermal policy.


[1] https://elixir.bootlin.com/linux/v5.0-rc1/source/drivers/thermal/of-thermal.c#L502

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq
  2019-01-11  0:30   ` Matthias Kaehlcke
@ 2019-01-11 11:17     ` Amit Kucheria
  2019-01-11 20:36       ` Matthias Kaehlcke
  0 siblings, 1 reply; 37+ messages in thread
From: Amit Kucheria @ 2019-01-11 11:17 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: LKML, linux-arm-msm, Bjorn Andersson, Viresh Kumar,
	Eduardo Valentin, Andy Gross, Taniya Das, Stephen Boyd,
	Douglas Anderson, David Brown, Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On Fri, Jan 11, 2019 at 6:00 AM Matthias Kaehlcke <mka@chromium.org> wrote:
>
> On Thu, Jan 10, 2019 at 05:30:56AM +0530, Amit Kucheria wrote:
> > Since the big and little cpus are in the same frequency domain, use all
> > of them for mitigation in the cooling-map. At the lower trip points we
> > restrict ourselves to throttling only a few OPPs. At higher trip
> > temperatures, allow ourselves to be throttled to any extent.
> >
> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > ---
> >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 145 +++++++++++++++++++++++++++
> >  1 file changed, 145 insertions(+)
> >
> > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > index 29e823b0caf4..cd6402a9aa64 100644
> > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > @@ -13,6 +13,7 @@
> >  #include <dt-bindings/reset/qcom,sdm845-aoss.h>
> >  #include <dt-bindings/soc/qcom,rpmh-rsc.h>
> >  #include <dt-bindings/clock/qcom,gcc-sdm845.h>
> > +#include <dt-bindings/thermal/thermal.h>
> >
> >  / {
> >       interrupt-parent = <&intc>;
> > @@ -99,6 +100,7 @@
> >                       compatible = "qcom,kryo385";
> >                       reg = <0x0 0x0>;
> >                       enable-method = "psci";
> > +                     #cooling-cells = <2>;
> >                       next-level-cache = <&L2_0>;
> >                       L2_0: l2-cache {
> >                               compatible = "cache";
> > @@ -114,6 +116,7 @@
> >                       compatible = "qcom,kryo385";
> >                       reg = <0x0 0x100>;
> >                       enable-method = "psci";
> > +                     #cooling-cells = <2>;
> >                       next-level-cache = <&L2_100>;
> >                       L2_100: l2-cache {
> >                               compatible = "cache";
> > @@ -126,6 +129,7 @@
> >                       compatible = "qcom,kryo385";
> >                       reg = <0x0 0x200>;
> >                       enable-method = "psci";
> > +                     #cooling-cells = <2>;
> >                       next-level-cache = <&L2_200>;
> >                       L2_200: l2-cache {
> >                               compatible = "cache";
> > @@ -138,6 +142,7 @@
> >                       compatible = "qcom,kryo385";
> >                       reg = <0x0 0x300>;
> >                       enable-method = "psci";
> > +                     #cooling-cells = <2>;
> >                       next-level-cache = <&L2_300>;
> >                       L2_300: l2-cache {
> >                               compatible = "cache";
> > @@ -150,6 +155,7 @@
> >                       compatible = "qcom,kryo385";
> >                       reg = <0x0 0x400>;
> >                       enable-method = "psci";
> > +                     #cooling-cells = <2>;
> >                       next-level-cache = <&L2_400>;
> >                       L2_400: l2-cache {
> >                               compatible = "cache";
> > @@ -162,6 +168,7 @@
> >                       compatible = "qcom,kryo385";
> >                       reg = <0x0 0x500>;
> >                       enable-method = "psci";
> > +                     #cooling-cells = <2>;
> >                       next-level-cache = <&L2_500>;
> >                       L2_500: l2-cache {
> >                               compatible = "cache";
> > @@ -174,6 +181,7 @@
> >                       compatible = "qcom,kryo385";
> >                       reg = <0x0 0x600>;
> >                       enable-method = "psci";
> > +                     #cooling-cells = <2>;
> >                       next-level-cache = <&L2_600>;
> >                       L2_600: l2-cache {
> >                               compatible = "cache";
> > @@ -186,6 +194,7 @@
> >                       compatible = "qcom,kryo385";
> >                       reg = <0x0 0x700>;
> >                       enable-method = "psci";
> > +                     #cooling-cells = <2>;
> >                       next-level-cache = <&L2_700>;
> >                       L2_700: l2-cache {
> >                               compatible = "cache";
> > @@ -1703,6 +1712,23 @@
> >                                       type = "critical";
> >                               };
> >                       };
> > +
> > +                     cooling-maps {
> > +                             map0 {
> > +                                     trip = <&cpu_alert0>;
> > +                                     cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
> > +                                                      <&CPU1 THERMAL_NO_LIMIT 4>,
> > +                                                      <&CPU2 THERMAL_NO_LIMIT 4>,
> > +                                                      <&CPU3 THERMAL_NO_LIMIT 4>;
> > +                             };
> > +                             map1 {
> > +                                     trip = <&cpu_crit0>;
> > +                                     cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> > +                                                      <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> > +                                                      <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> > +                                                      <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> > +                             };
> > +                     };
>
> Slightly off-topic, buy maybe not so much since we are just starting
> to use the trip points:
>
> Currently we use the naming scheme 'cpu_<type>N' for trip points. I
> anticipate that we're going to add more passive trip points soon, to
> keep the 'power_allocator' thermal governor happy, which expects a
> 'switch_on' and a 'desired_temperature' trip point. With the current
> naming scheme this could become a bit messy. I suggest to change it to
> 'cpuN_<type>[X]', which would allow for something like 'cpuN_alert0'
> and 'cpuN_alert1'.
>
> If you think the change makes sense you can consider to do it within
> this series, I can also send a separate patch once it has landed.

Sure, I can change them to cpuN_alertX format.

> You could also consider to add the additional trip point in this
> series if you agree that it will be needed.

I expect that we'll end up with at least 2 passive trip points but I
don't know what temperature to set the next one at. So let's just go
with 1 passive and 1 critical trip point in this series and you can
send a patch adding more once we've characterised IPA.

> This is not necessarily a call for action, just thinking loudly about
> a closely related topic ;-)

Thanks for the reviews.

Regards,
Amit

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees
  2019-01-11 10:24     ` Amit Kucheria
@ 2019-01-11 18:30       ` Matthias Kaehlcke
  0 siblings, 0 replies; 37+ messages in thread
From: Matthias Kaehlcke @ 2019-01-11 18:30 UTC (permalink / raw)
  To: Amit Kucheria
  Cc: LKML, linux-arm-msm, Bjorn Andersson, Viresh Kumar,
	Eduardo Valentin, Andy Gross, Taniya Das, Stephen Boyd,
	Douglas Anderson, David Brown, Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On Fri, Jan 11, 2019 at 03:54:23PM +0530, Amit Kucheria wrote:
> On Thu, Jan 10, 2019 at 6:45 AM Matthias Kaehlcke <mka@chromium.org> wrote:
> >
> > Hi Amit,
> >
> > On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote:
> > > 75 degrees is too aggressive for throttling the CPU. After speaking to
> > > Qualcomm engineers, increase it to 95 degrees.
> > >
> > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > > ---
> > >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++--------
> > >  1 file changed, 8 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > index c27cbd3bcb0a..29e823b0caf4 100644
> > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > @@ -1692,7 +1692,7 @@
> > >
> > >                       trips {
> > >                               cpu_alert0: trip0 {
> > > -                                     temperature = <75000>;
> > > +                                     temperature = <95000>;
> > >                                       hysteresis = <2000>;
> > >                                       type = "passive";
> > >                               };
> > > @@ -1713,7 +1713,7 @@
> > >
> > >                       trips {
> > >                               cpu_alert1: trip0 {
> > > -                                     temperature = <75000>;
> > > +                                     temperature = <95000>;
> > >                                       hysteresis = <2000>;
> > >                                       type = "passive";
> > >                               };
> > > @@ -1734,7 +1734,7 @@
> > >
> > >                       trips {
> > >                               cpu_alert2: trip0 {
> > > -                                     temperature = <75000>;
> > > +                                     temperature = <95000>;
> > >                                       hysteresis = <2000>;
> > >                                       type = "passive";
> > >                               };
> > > @@ -1755,7 +1755,7 @@
> > >
> > >                       trips {
> > >                               cpu_alert3: trip0 {
> > > -                                     temperature = <75000>;
> > > +                                     temperature = <95000>;
> > >                                       hysteresis = <2000>;
> > >                                       type = "passive";
> > >                               };
> > > @@ -1776,7 +1776,7 @@
> > >
> > >                       trips {
> > >                               cpu_alert4: trip0 {
> > > -                                     temperature = <75000>;
> > > +                                     temperature = <95000>;
> > >                                       hysteresis = <2000>;
> > >                                       type = "passive";
> > >                               };
> > > @@ -1797,7 +1797,7 @@
> > >
> > >                       trips {
> > >                               cpu_alert5: trip0 {
> > > -                                     temperature = <75000>;
> > > +                                     temperature = <95000>;
> > >                                       hysteresis = <2000>;
> > >                                       type = "passive";
> > >                               };
> > > @@ -1818,7 +1818,7 @@
> > >
> > >                       trips {
> > >                               cpu_alert6: trip0 {
> > > -                                     temperature = <75000>;
> > > +                                     temperature = <95000>;
> > >                                       hysteresis = <2000>;
> > >                                       type = "passive";
> > >                               };
> > > @@ -1839,7 +1839,7 @@
> > >
> > >                       trips {
> > >                               cpu_alert7: trip0 {
> > > -                                     temperature = <75000>;
> > > +                                     temperature = <95000>;
> > >                                       hysteresis = <2000>;
> > >                                       type = "passive";
> > >                               };
> >
> > The change itself looks good to me, however I wonder if it would be
> > worth to eliminate redundancy and merge the current 8 thermal zones
> > into 2, one for the Silver and one for the Gold cluster (as done by
> > http://crrev.com/c/1381752). There is a single cooling device for
> > each cluster, so it's not clear to me if there is any gain from having
> > a separate thermal zone for each CPU. If it is important to monitor
> > the temperatures of the individual cores this can still be done by
> > configuring the thermal zone of the cluster with multiple thermal
> > sensors.
> 
> Reducing the number of thermal zones to 2 (by grouping 4 sensors per
> zone) is not possible due a limitation of the thermal framework[1]. It
> is something that we want to address. Previous attempts to fix this
> were rejected for various reasons. Eduardo was going to share a way to
> have more flexible mapping between sensors and zones after discussions
> at LPC.

I wasn't aware of this limitation, thanks for the clarification! With
this I understand that for now we indeed need the 8 thermal zones with
all the redundant information :(

> <nag> Eduardo, do you have anything we can review? </nag> :-)
> 
> Having said that, we'll need some aggregation functions when we add
> multiple sensors to a zone (e.g. max, mean) to reflect the zone. This
> will lose information about hotspots and prevent things like idle
> injection on a particular CPU that is causing most of the heat in the
> aggregated zone. So IMHO, it might be useful to have information about
> the hotspots (i.e TZ per sensor) and aggregated values (ambient
> temperature) that can be fed to the thermal policy.

Ok, it seems for now we need the 8 thermal zones in any case, when
support for multiple sensors becomes available we can evaluate whether
it's worth to change that or not.

Cheers

Matthias

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq
  2019-01-11  3:46         ` Viresh Kumar
@ 2019-01-11 19:58           ` Matthias Kaehlcke
  2019-01-14  5:59             ` Viresh Kumar
  0 siblings, 1 reply; 37+ messages in thread
From: Matthias Kaehlcke @ 2019-01-11 19:58 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Amit Kucheria, linux-kernel, linux-arm-msm, bjorn.andersson,
	edubezval, andy.gross, tdas, swboyd, dianders, David Brown,
	Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On Fri, Jan 11, 2019 at 09:16:53AM +0530, Viresh Kumar wrote:
> On 10-01-19, 10:42, Matthias Kaehlcke wrote:
> > Thanks for the pointer, there's always something new to learn!
> > 
> > Ok, so the policy CPU and hence the CPU registered as cooling
> > device may vary. I understand that this requires to list all possible
> > cooling devices,
> 
> I won't say that I changed DT because of a design issue with kernel,
> rather the DT shall be complete by itself and that's why that change
> was made.

fair enough

> And then we can have more things going on. For example with cpuidle
> cooling, we can individually control each CPU (and force idle on that)
> even if all CPUs are part of the same freq-domain. Each CPU shall
> expose its capabilities.

Just to gain a better understanding: is cpuidle cooling already
available for arm64 (or is there a patch set)? I came across the
relatively new idle injecting framework but it seems currently the
only user is the Intel powerclamp driver.

> > even though only one will be active at any given
> > time. However I wonder if we could change this:
> 
> I won't say it that way. I see it as all the CPUs are active during a
> cooling state, i.e. they are all participating.

agreed, I was referring to the CPU cooling device, which (without
cpuidle injection) could be considered a single device per freq domain.

> > For device tree based platform the above implies that cooling maps
> > must include a list of all possible cooling devices of a frequency
> > domain, even though only one of them will exist at any given time.
> > 
> > For example:
> > 
> > cooling-maps {
> > 	map0 {
> > 		trip = <&cpu_alert0>;
> > 		cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
> > 				 <&CPU1 THERMAL_NO_LIMIT 4>,
> > 				 <&CPU2 THERMAL_NO_LIMIT 4>,
> > 				 <&CPU3 THERMAL_NO_LIMIT 4>;
> > 	};
> > 	map1 {
> > 		trip = <&cpu_crit0>;
> > 		cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> > 				 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> > 				 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> > 				 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> 
> This is the right thing to do hardware description wise, no matter
> what the kernel does.

Not sure I would call it a hardware description. I'd say we pretend
the thermal configuration is a hardware description so the DT folks
don't yell at us ;-) IMO a CPU cooling device is an abstraction, I
think there is no such IP block on most systems.

It seems with cpuidle injection CPUs can perform cooling actions
individually, with that I agree that representing them as individual
cooling devices in the DT makes sense. Without that a cooling device
per freq domain would seem a resonable abstraction.

One of the reasons I dislike the above list of cooling devices is that
it is repeated for different thermal-zone/cooling-maps, but I guess
we have to live with that, would be nice if the DT would allow to do
something like this:

thermal-zones {
	cooling_maps_fd0 : cooling-maps {
		map0 {
			trip = <&cpu_alert0>;
			cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
					 <&CPU1 THERMAL_NO_LIMIT 4>,
					 <&CPU2 THERMAL_NO_LIMIT 4>,
					 <&CPU3 THERMAL_NO_LIMIT 4>;
		};
		map1 {
			trip = <&cpu_crit0>;
			cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
					 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
					 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
					 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
	};

	cpu0-thermal {
		...
		cooling-maps = @cooling_maps_fd0;
		...
	};

	cpu1-thermal {
		...
		cooling-maps = @cooling_maps_fd0;
		...
	};

	...
};

Cheers

Matthias

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq
  2019-01-11 11:17     ` Amit Kucheria
@ 2019-01-11 20:36       ` Matthias Kaehlcke
  2019-01-14  8:22         ` Amit Kucheria
  0 siblings, 1 reply; 37+ messages in thread
From: Matthias Kaehlcke @ 2019-01-11 20:36 UTC (permalink / raw)
  To: Amit Kucheria
  Cc: LKML, linux-arm-msm, Bjorn Andersson, Viresh Kumar,
	Eduardo Valentin, Andy Gross, Taniya Das, Stephen Boyd,
	Douglas Anderson, David Brown, Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On Fri, Jan 11, 2019 at 04:47:15PM +0530, Amit Kucheria wrote:
> On Fri, Jan 11, 2019 at 6:00 AM Matthias Kaehlcke <mka@chromium.org> wrote:
> >
> > On Thu, Jan 10, 2019 at 05:30:56AM +0530, Amit Kucheria wrote:
> > > Since the big and little cpus are in the same frequency domain, use all
> > > of them for mitigation in the cooling-map. At the lower trip points we
> > > restrict ourselves to throttling only a few OPPs. At higher trip
> > > temperatures, allow ourselves to be throttled to any extent.
> > >
> > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > > ---
> > >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 145 +++++++++++++++++++++++++++
> > >  1 file changed, 145 insertions(+)
> > >
> > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > index 29e823b0caf4..cd6402a9aa64 100644
> > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > @@ -13,6 +13,7 @@
> > >  #include <dt-bindings/reset/qcom,sdm845-aoss.h>
> > >  #include <dt-bindings/soc/qcom,rpmh-rsc.h>
> > >  #include <dt-bindings/clock/qcom,gcc-sdm845.h>
> > > +#include <dt-bindings/thermal/thermal.h>
> > >
> > >  / {
> > >       interrupt-parent = <&intc>;
> > > @@ -99,6 +100,7 @@
> > >                       compatible = "qcom,kryo385";
> > >                       reg = <0x0 0x0>;
> > >                       enable-method = "psci";
> > > +                     #cooling-cells = <2>;
> > >                       next-level-cache = <&L2_0>;
> > >                       L2_0: l2-cache {
> > >                               compatible = "cache";
> > > @@ -114,6 +116,7 @@
> > >                       compatible = "qcom,kryo385";
> > >                       reg = <0x0 0x100>;
> > >                       enable-method = "psci";
> > > +                     #cooling-cells = <2>;
> > >                       next-level-cache = <&L2_100>;
> > >                       L2_100: l2-cache {
> > >                               compatible = "cache";
> > > @@ -126,6 +129,7 @@
> > >                       compatible = "qcom,kryo385";
> > >                       reg = <0x0 0x200>;
> > >                       enable-method = "psci";
> > > +                     #cooling-cells = <2>;
> > >                       next-level-cache = <&L2_200>;
> > >                       L2_200: l2-cache {
> > >                               compatible = "cache";
> > > @@ -138,6 +142,7 @@
> > >                       compatible = "qcom,kryo385";
> > >                       reg = <0x0 0x300>;
> > >                       enable-method = "psci";
> > > +                     #cooling-cells = <2>;
> > >                       next-level-cache = <&L2_300>;
> > >                       L2_300: l2-cache {
> > >                               compatible = "cache";
> > > @@ -150,6 +155,7 @@
> > >                       compatible = "qcom,kryo385";
> > >                       reg = <0x0 0x400>;
> > >                       enable-method = "psci";
> > > +                     #cooling-cells = <2>;
> > >                       next-level-cache = <&L2_400>;
> > >                       L2_400: l2-cache {
> > >                               compatible = "cache";
> > > @@ -162,6 +168,7 @@
> > >                       compatible = "qcom,kryo385";
> > >                       reg = <0x0 0x500>;
> > >                       enable-method = "psci";
> > > +                     #cooling-cells = <2>;
> > >                       next-level-cache = <&L2_500>;
> > >                       L2_500: l2-cache {
> > >                               compatible = "cache";
> > > @@ -174,6 +181,7 @@
> > >                       compatible = "qcom,kryo385";
> > >                       reg = <0x0 0x600>;
> > >                       enable-method = "psci";
> > > +                     #cooling-cells = <2>;
> > >                       next-level-cache = <&L2_600>;
> > >                       L2_600: l2-cache {
> > >                               compatible = "cache";
> > > @@ -186,6 +194,7 @@
> > >                       compatible = "qcom,kryo385";
> > >                       reg = <0x0 0x700>;
> > >                       enable-method = "psci";
> > > +                     #cooling-cells = <2>;
> > >                       next-level-cache = <&L2_700>;
> > >                       L2_700: l2-cache {
> > >                               compatible = "cache";
> > > @@ -1703,6 +1712,23 @@
> > >                                       type = "critical";
> > >                               };
> > >                       };
> > > +
> > > +                     cooling-maps {
> > > +                             map0 {
> > > +                                     trip = <&cpu_alert0>;
> > > +                                     cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
> > > +                                                      <&CPU1 THERMAL_NO_LIMIT 4>,
> > > +                                                      <&CPU2 THERMAL_NO_LIMIT 4>,
> > > +                                                      <&CPU3 THERMAL_NO_LIMIT 4>;
> > > +                             };
> > > +                             map1 {
> > > +                                     trip = <&cpu_crit0>;
> > > +                                     cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> > > +                                                      <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> > > +                                                      <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> > > +                                                      <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> > > +                             };
> > > +                     };
> >
> > Slightly off-topic, buy maybe not so much since we are just starting
> > to use the trip points:
> >
> > Currently we use the naming scheme 'cpu_<type>N' for trip points. I
> > anticipate that we're going to add more passive trip points soon, to
> > keep the 'power_allocator' thermal governor happy, which expects a
> > 'switch_on' and a 'desired_temperature' trip point. With the current
> > naming scheme this could become a bit messy. I suggest to change it to
> > 'cpuN_<type>[X]', which would allow for something like 'cpuN_alert0'
> > and 'cpuN_alert1'.
> >
> > If you think the change makes sense you can consider to do it within
> > this series, I can also send a separate patch once it has landed.
> 
> Sure, I can change them to cpuN_alertX format.

Great, thanks!

Another concern about adding trip points later could be the node
name. We currently have:


trips {
  cpu0_alert0: trip0 {
    ...
  };

  cpu0_crit: trip1 {
    ...
  };
};

If we keep increasing enumeration with the node name this would become:

trips {
  cpu0_alert0: trip0 {
    ...
  };

  cpu0_alert1: trip1 {
    ...
  };

  cpu0_crit: trip2 {
    ...
  };
};

i.e. the node name of the critical trip-point changes, which might be
a concern for dtsi's that override a value, though they should
probably use the phandle &cpu0_crit anyway. If this is a concern we
could change the node names to 'alert0' and 'crit'.

I looked around a bit and actually I kinda like the naming scheme used
by hisilicon/hi6220.dtsi, mediatek/mt8173.dtsi and rockchip/rk3328.dtsi
(with minor variations):

trips {
        threshold: trip-point@0 {
                temperature = <68000>;
                hysteresis = <2000>;
                type = "passive";
        };

        target: trip-point@1 {
                temperature = <85000>;
                hysteresis = <2000>;
                type = "passive";
        };

        cpu_crit: cpu_crit@0 {
                temperature = <115000>;
                hysteresis = <2000>;
                type = "critical";
        };
};

If we were to use this we'd have to adapt it slightly since we have
multiple thermal zones. In line with the other scheme this could be
cpuN_threshold, cpuN_target and cpuN_crit.

Up to you, just providing some options ;-)

> > You could also consider to add the additional trip point in this
> > series if you agree that it will be needed.
> 
> I expect that we'll end up with at least 2 passive trip points but I
> don't know what temperature to set the next one at. So let's just go
> with 1 passive and 1 critical trip point in this series and you can
> send a patch adding more once we've characterised IPA.

Sounds good

Thanks!

Matthias

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq
  2019-01-11 19:58           ` Matthias Kaehlcke
@ 2019-01-14  5:59             ` Viresh Kumar
  0 siblings, 0 replies; 37+ messages in thread
From: Viresh Kumar @ 2019-01-14  5:59 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: Amit Kucheria, linux-kernel, linux-arm-msm, bjorn.andersson,
	edubezval, andy.gross, tdas, swboyd, dianders, David Brown,
	Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	daniel.lezcano

On 11-01-19, 11:58, Matthias Kaehlcke wrote:
> On Fri, Jan 11, 2019 at 09:16:53AM +0530, Viresh Kumar wrote:
> Just to gain a better understanding: is cpuidle cooling already
> available for arm64 (or is there a patch set)? I came across the
> relatively new idle injecting framework but it seems currently the
> only user is the Intel powerclamp driver.

Daniel was trying to upstream it earlier:

lore.kernel.org/lkml/1522945005-7165-7-git-send-email-daniel.lezcano@linaro.org

> > > even though only one will be active at any given
> > > time. However I wonder if we could change this:
> > 
> > I won't say it that way. I see it as all the CPUs are active during a
> > cooling state, i.e. they are all participating.
> 
> agreed, I was referring to the CPU cooling device, which (without
> cpuidle injection) could be considered a single device per freq domain.

Even without cpuidle injection all CPUs actually take part in cooling.

> > > For device tree based platform the above implies that cooling maps
> > > must include a list of all possible cooling devices of a frequency
> > > domain, even though only one of them will exist at any given time.
> > > 
> > > For example:
> > > 
> > > cooling-maps {
> > > 	map0 {
> > > 		trip = <&cpu_alert0>;
> > > 		cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
> > > 				 <&CPU1 THERMAL_NO_LIMIT 4>,
> > > 				 <&CPU2 THERMAL_NO_LIMIT 4>,
> > > 				 <&CPU3 THERMAL_NO_LIMIT 4>;
> > > 	};
> > > 	map1 {
> > > 		trip = <&cpu_crit0>;
> > > 		cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> > > 				 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> > > 				 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> > > 				 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> > 
> > This is the right thing to do hardware description wise, no matter
> > what the kernel does.
> 
> Not sure I would call it a hardware description. I'd say we pretend
> the thermal configuration is a hardware description so the DT folks
> don't yell at us ;-) IMO a CPU cooling device is an abstraction, I
> think there is no such IP block on most systems.

Right.

> It seems with cpuidle injection CPUs can perform cooling actions
> individually, with that I agree that representing them as individual
> cooling devices in the DT makes sense. Without that a cooling device
> per freq domain would seem a resonable abstraction.

But we actually have 4 different cooling devices no matter what. The only thing
is that they switch their cooling state together. And that shouldn't bother DT
is what I thought :)

> One of the reasons I dislike the above list of cooling devices is that
> it is repeated for different thermal-zone/cooling-maps, but I guess
> we have to live with that, would be nice if the DT would allow to do
> something like this:
> 
> thermal-zones {
> 	cooling_maps_fd0 : cooling-maps {
> 		map0 {
> 			trip = <&cpu_alert0>;
> 			cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
> 					 <&CPU1 THERMAL_NO_LIMIT 4>,
> 					 <&CPU2 THERMAL_NO_LIMIT 4>,
> 					 <&CPU3 THERMAL_NO_LIMIT 4>;
> 		};
> 		map1 {
> 			trip = <&cpu_crit0>;
> 			cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> 					 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> 					 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> 					 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> 	};
> 
> 	cpu0-thermal {
> 		...
> 		cooling-maps = @cooling_maps_fd0;
> 		...
> 	};
> 
> 	cpu1-thermal {
> 		...
> 		cooling-maps = @cooling_maps_fd0;
> 		...
> 	};
> 
> 	...
> };

Yeah, maybe. There aren't lot of examples of such duplication though if I
remember correctly.

-- 
viresh

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq
  2019-01-11 20:36       ` Matthias Kaehlcke
@ 2019-01-14  8:22         ` Amit Kucheria
  0 siblings, 0 replies; 37+ messages in thread
From: Amit Kucheria @ 2019-01-14  8:22 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: LKML, linux-arm-msm, Bjorn Andersson, Viresh Kumar,
	Eduardo Valentin, Andy Gross, Taniya Das, Stephen Boyd,
	Douglas Anderson, David Brown, Rob Herring, Mark Rutland,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS

On Sat, Jan 12, 2019 at 2:06 AM Matthias Kaehlcke <mka@chromium.org> wrote:
>
> Another concern about adding trip points later could be the node
> name. We currently have:
>
>
> trips {
>   cpu0_alert0: trip0 {
>     ...
>   };
>
>   cpu0_crit: trip1 {
>     ...
>   };
> };
>
> If we keep increasing enumeration with the node name this would become:
>
> trips {
>   cpu0_alert0: trip0 {
>     ...
>   };
>
>   cpu0_alert1: trip1 {
>     ...
>   };
>
>   cpu0_crit: trip2 {
>     ...
>   };
> };
>
> i.e. the node name of the critical trip-point changes, which might be
> a concern for dtsi's that override a value, though they should
> probably use the phandle &cpu0_crit anyway. If this is a concern we
> could change the node names to 'alert0' and 'crit'.
>
> I looked around a bit and actually I kinda like the naming scheme used
> by hisilicon/hi6220.dtsi, mediatek/mt8173.dtsi and rockchip/rk3328.dtsi
> (with minor variations):
>
> trips {
>         threshold: trip-point@0 {
>                 temperature = <68000>;
>                 hysteresis = <2000>;
>                 type = "passive";
>         };
>
>         target: trip-point@1 {
>                 temperature = <85000>;
>                 hysteresis = <2000>;
>                 type = "passive";
>         };
>
>         cpu_crit: cpu_crit@0 {
>                 temperature = <115000>;
>                 hysteresis = <2000>;
>                 type = "critical";
>         };
> };
>
> If we were to use this we'd have to adapt it slightly since we have
> multiple thermal zones. In line with the other scheme this could be
> cpuN_threshold, cpuN_target and cpuN_crit.
>

I like this scheme enough that I adopted it for v2.

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2019-01-14  8:22 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-10  0:00 [PATCH v1 0/7] Thermal throttling for SDM845 Amit Kucheria
2019-01-10  0:00 ` [PATCH v1 1/7] drivers: thermal: of-thermal: Print name of device node with error Amit Kucheria
2019-01-10  0:15   ` Stephen Boyd
2019-01-10  0:00 ` [PATCH v1 2/7] drivers: cpufreq: Add thermal_cooling_device pointer to struct cpufreq_policy Amit Kucheria
2019-01-10  1:01   ` Matthias Kaehlcke
2019-01-10  0:00 ` [PATCH v1 3/7] cpu_cooling: Add generic driver ready callback Amit Kucheria
2019-01-10  6:14   ` Viresh Kumar
2019-01-10  0:00 ` [PATCH v1 4/7] cpufreq: qcom-hw: Move to device_initcall Amit Kucheria
2019-01-10  6:44   ` Viresh Kumar
2019-01-10  0:00 ` [PATCH v1 5/7] cpufreq: qcom-hw: Register as a cpufreq cooling device Amit Kucheria
2019-01-10  6:12   ` Viresh Kumar
2019-01-10  9:03     ` Amit Kucheria
2019-01-10  9:32     ` Rafael J. Wysocki
2019-01-10  0:00 ` [PATCH v1 6/7] arm64: dts: sdm845: Increase alert trip point to 95 degrees Amit Kucheria
2019-01-10  0:29   ` Stephen Boyd
2019-01-10 17:14     ` Doug Anderson
2019-01-10 20:06     ` Amit Kucheria
2019-01-10  1:15   ` Matthias Kaehlcke
2019-01-10  2:15     ` Matthias Kaehlcke
2019-01-10 19:45       ` Amit Kucheria
2019-01-10 20:00         ` Matthias Kaehlcke
2019-01-11  3:32           ` Viresh Kumar
2019-01-11 10:24     ` Amit Kucheria
2019-01-11 18:30       ` Matthias Kaehlcke
2019-01-10  0:00 ` [PATCH v1 7/7] arm64: dts: sdm845: wireup the thermal trip points to cpufreq Amit Kucheria
2019-01-10  0:28   ` Stephen Boyd
2019-01-10 12:28     ` Amit Kucheria
2019-01-10  2:22   ` Matthias Kaehlcke
2019-01-10  6:23     ` Viresh Kumar
2019-01-10 18:42       ` Matthias Kaehlcke
2019-01-11  3:46         ` Viresh Kumar
2019-01-11 19:58           ` Matthias Kaehlcke
2019-01-14  5:59             ` Viresh Kumar
2019-01-11  0:30   ` Matthias Kaehlcke
2019-01-11 11:17     ` Amit Kucheria
2019-01-11 20:36       ` Matthias Kaehlcke
2019-01-14  8:22         ` Amit Kucheria

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).