linux-kernel.vger.kernel.org archive mirror
* [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family
@ 2019-09-13 15:37 Adam Ford
  2019-09-13 15:37 ` [RFC v2 2/2] ARM: omap3: Consolidate thermal references to common omap3 Adam Ford
  2019-09-14  9:20 ` [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family H. Nikolaus Schaller
  0 siblings, 2 replies; 12+ messages in thread
From: Adam Ford @ 2019-09-13 15:37 UTC (permalink / raw)
  To: linux-omap
  Cc: adam.ford, nm, hns, Adam Ford, Benoît Cousson,
	Tony Lindgren, Rob Herring, Mark Rutland, devicetree,
	linux-kernel

The OMAP3530, AM3517 and DM3730 all show thresholds of 90C and 105C
depending on commercial or industrial temperature ratings.  This
patch expands the thermal information to the limits of 90 and 105
for alert and critical.

For boards that never use industrial temperatures, these can be
changed in their respective device trees with something like:

&cpu_alert0 {
	temperature = <85000>; /* millicelsius */
};

&cpu_crit {
	temperature = <90000>; /* millicelsius */
};

Signed-off-by: Adam Ford <aford173@gmail.com>
---
V2:  Change the CPU reference to &cpu instead of &cpu0

diff --git a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
index 235ecfd61e2d..dfbd0cb0b00b 100644
--- a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
+++ b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
@@ -17,4 +17,25 @@ cpu_thermal: cpu_thermal {
 
 			/* sensor       ID */
 	thermal-sensors = <&bandgap     0>;
+
+	cpu_trips: trips {
+		cpu_alert0: cpu_alert {
+			temperature = <90000>; /* millicelsius */
+			hysteresis = <2000>; /* millicelsius */
+			type = "passive";
+		};
+		cpu_crit: cpu_crit {
+			temperature = <105000>; /* millicelsius */
+			hysteresis = <2000>; /* millicelsius */
+			type = "critical";
+		};
+	};
+
+	cpu_cooling_maps: cooling-maps {
+		map0 {
+			trip = <&cpu_alert0>;
+			cooling-device =
+				<&cpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+		};
+	};
 };
-- 
2.17.1



* [RFC v2 2/2] ARM: omap3: Consolidate thermal references to common omap3
  2019-09-13 15:37 [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family Adam Ford
@ 2019-09-13 15:37 ` Adam Ford
       [not found]   ` <40FEEAC9-8F19-466F-83C3-C8F0142D44B7@goldelico.com>
  2019-09-14  9:20 ` [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family H. Nikolaus Schaller
  1 sibling, 1 reply; 12+ messages in thread
From: Adam Ford @ 2019-09-13 15:37 UTC (permalink / raw)
  To: linux-omap
  Cc: adam.ford, nm, hns, Adam Ford, Benoît Cousson,
	Tony Lindgren, Rob Herring, Mark Rutland, devicetree,
	linux-kernel

Because the omap34xx, omap36xx and am3517 SoCs have the same
thermal junction limits, there is no need to duplicate the entry
multiple times.

This patch removes the thermal references from omap36xx and
omap34xx and pushes them into the common omap3.dtsi file with
the added benefit of enabling the thermal info on the AM3517.

Signed-off-by: Adam Ford <aford173@gmail.com>
---
V2:	Add node name for cpu and add cooling-cells entry

diff --git a/arch/arm/boot/dts/omap3.dtsi b/arch/arm/boot/dts/omap3.dtsi
index 4043ecb38016..84704eb3b604 100644
--- a/arch/arm/boot/dts/omap3.dtsi
+++ b/arch/arm/boot/dts/omap3.dtsi
@@ -32,7 +32,7 @@
 		#address-cells = <1>;
 		#size-cells = <0>;
 
-		cpu@0 {
+		cpu: cpu@0 {
 			compatible = "arm,cortex-a8";
 			device_type = "cpu";
 			reg = <0x0>;
@@ -41,9 +41,14 @@
 			clock-names = "cpu";
 
 			clock-latency = <300000>; /* From omap-cpufreq driver */
+			#cooling-cells = <2>;
 		};
 	};
 
+	thermal_zones: thermal-zones {
+		#include "omap3-cpu-thermal.dtsi"
+	};
+
 	pmu@54000000 {
 		compatible = "arm,cortex-a8-pmu";
 		reg = <0x54000000 0x800000>;
diff --git a/arch/arm/boot/dts/omap34xx.dtsi b/arch/arm/boot/dts/omap34xx.dtsi
index f572a477f74c..b80378d6e5c1 100644
--- a/arch/arm/boot/dts/omap34xx.dtsi
+++ b/arch/arm/boot/dts/omap34xx.dtsi
@@ -101,10 +101,6 @@
 			};
 		};
 	};
-
-	thermal_zones: thermal-zones {
-		#include "omap3-cpu-thermal.dtsi"
-	};
 };
 
 &ssi {
diff --git a/arch/arm/boot/dts/omap36xx.dtsi b/arch/arm/boot/dts/omap36xx.dtsi
index 6fb23ada1f64..ff2dca63a04e 100644
--- a/arch/arm/boot/dts/omap36xx.dtsi
+++ b/arch/arm/boot/dts/omap36xx.dtsi
@@ -140,10 +140,6 @@
 			};
 		};
 	};
-
-	thermal_zones: thermal-zones {
-		#include "omap3-cpu-thermal.dtsi"
-	};
 };
 
 /* OMAP3630 needs dss_96m_fck for VENC */
-- 
2.17.1



* Re: [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family
  2019-09-13 15:37 [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family Adam Ford
  2019-09-13 15:37 ` [RFC v2 2/2] ARM: omap3: Consolidate thermal references to common omap3 Adam Ford
@ 2019-09-14  9:20 ` H. Nikolaus Schaller
  2019-09-14 13:42   ` Adam Ford
  1 sibling, 1 reply; 12+ messages in thread
From: H. Nikolaus Schaller @ 2019-09-14  9:20 UTC (permalink / raw)
  To: Adam Ford
  Cc: Linux-OMAP, Adam Ford, Nishanth Menon, Benoît Cousson,
	Tony Lindgren, Rob Herring, Mark Rutland, devicetree,
	Linux Kernel Mailing List, Grazvydas Ignotas


> On 13.09.2019 at 17:37, Adam Ford <aford173@gmail.com> wrote:
> 
> The OMAP3530, AM3517 and DM3730 all show thresholds of 90C and 105C
> depending on commercial or industrial temperature ratings.  This
> patch expands the thermal information to the limits of 90 and 105
> for alert and critical.
> 
> For boards that never use industrial temperatures, these can be
> changed in their respective device trees with something like:
> 
> &cpu_alert0 {
> 	temperature = <85000>; /* millicelsius */
> };
> 
> &cpu_crit {
> 	temperature = <90000>; /* millicelsius */
> };
> 
> Signed-off-by: Adam Ford <aford173@gmail.com>
> ---
> V2:  Change the CPU reference to &cpu instead of &cpu0
> 
> diff --git a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> index 235ecfd61e2d..dfbd0cb0b00b 100644
> --- a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> +++ b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> @@ -17,4 +17,25 @@ cpu_thermal: cpu_thermal {
> 
> 			/* sensor       ID */
> 	thermal-sensors = <&bandgap     0>;
> +
> +	cpu_trips: trips {
> +		cpu_alert0: cpu_alert {
> +			temperature = <90000>; /* millicelsius */
> +			hysteresis = <2000>; /* millicelsius */
> +			type = "passive";
> +		};
> +		cpu_crit: cpu_crit {
> +			temperature = <105000>; /* millicelsius */
> +			hysteresis = <2000>; /* millicelsius */
> +			type = "critical";
> +		};
> +	};
> +
> +	cpu_cooling_maps: cooling-maps {
> +		map0 {
> +			trip = <&cpu_alert0>;
> +			cooling-device =
> +				<&cpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> +		};
> +	};
> };
> -- 
> 2.17.1
> 

Here is my test log (GTA04A5 with DM3730CBP100).
The "high-load" script drives the NEON at full power
and would report any calculation errors.

There is no noise visible in the bandgap sensor data
induced by power supply fluctuations (log shows system
voltage while charging).

root@letux:~# ./high-load -n2
100% load stress test for 1 cores running ./neon_loop2
Sat Sep 14 09:05:50 UTC 2019 65° 4111mV 1000MHz
Sat Sep 14 09:05:50 UTC 2019 67° 4005mV 1000MHz
Sat Sep 14 09:05:52 UTC 2019 68° 4000mV 1000MHz
Sat Sep 14 09:05:53 UTC 2019 68° 4000mV 1000MHz
Sat Sep 14 09:05:55 UTC 2019 72° 3976mV 1000MHz
Sat Sep 14 09:05:56 UTC 2019 72° 4023mV 1000MHz
Sat Sep 14 09:05:57 UTC 2019 72° 3900mV 1000MHz
Sat Sep 14 09:05:59 UTC 2019 73° 4029mV 1000MHz
Sat Sep 14 09:06:00 UTC 2019 73° 3988mV 1000MHz
Sat Sep 14 09:06:01 UTC 2019 73° 4005mV 1000MHz
Sat Sep 14 09:06:03 UTC 2019 73° 4011mV 1000MHz
Sat Sep 14 09:06:04 UTC 2019 73° 4117mV 1000MHz
Sat Sep 14 09:06:06 UTC 2019 73° 4005mV 1000MHz
Sat Sep 14 09:06:07 UTC 2019 75° 3994mV 1000MHz
Sat Sep 14 09:06:08 UTC 2019 75° 3970mV 1000MHz
Sat Sep 14 09:06:09 UTC 2019 75° 4046mV 1000MHz
Sat Sep 14 09:06:11 UTC 2019 75° 4005mV 1000MHz
Sat Sep 14 09:06:12 UTC 2019 75° 4023mV 1000MHz
Sat Sep 14 09:06:14 UTC 2019 75° 3970mV 1000MHz
Sat Sep 14 09:06:15 UTC 2019 75° 4011mV 1000MHz
Sat Sep 14 09:06:16 UTC 2019 77° 4017mV 1000MHz
Sat Sep 14 09:06:18 UTC 2019 77° 3994mV 1000MHz
Sat Sep 14 09:06:19 UTC 2019 77° 3994mV 1000MHz
Sat Sep 14 09:06:20 UTC 2019 77° 3988mV 1000MHz
Sat Sep 14 09:06:22 UTC 2019 77° 4023mV 1000MHz
Sat Sep 14 09:06:23 UTC 2019 77° 4023mV 1000MHz
Sat Sep 14 09:06:24 UTC 2019 78° 4005mV 1000MHz
Sat Sep 14 09:06:26 UTC 2019 78° 4105mV 1000MHz
Sat Sep 14 09:06:27 UTC 2019 78° 4011mV 1000MHz
Sat Sep 14 09:06:28 UTC 2019 78° 3994mV 1000MHz
Sat Sep 14 09:06:30 UTC 2019 78° 4123mV 1000MHz
...
Sat Sep 14 09:09:57 UTC 2019 88° 4082mV 1000MHz
Sat Sep 14 09:09:59 UTC 2019 88° 4164mV 1000MHz
Sat Sep 14 09:10:00 UTC 2019 88° 4058mV 1000MHz
Sat Sep 14 09:10:01 UTC 2019 88° 4058mV 1000MHz
Sat Sep 14 09:10:03 UTC 2019 88° 4082mV 1000MHz
Sat Sep 14 09:10:04 UTC 2019 88° 4058mV 1000MHz
Sat Sep 14 09:10:06 UTC 2019 88° 4146mV 1000MHz
Sat Sep 14 09:10:07 UTC 2019 88° 4041mV 1000MHz
Sat Sep 14 09:10:08 UTC 2019 88° 4035mV 1000MHz
Sat Sep 14 09:10:10 UTC 2019 88° 4052mV 1000MHz
Sat Sep 14 09:10:11 UTC 2019 88° 4087mV 1000MHz
Sat Sep 14 09:10:12 UTC 2019 88° 4152mV 1000MHz
Sat Sep 14 09:10:14 UTC 2019 88° 4070mV 1000MHz
Sat Sep 14 09:10:15 UTC 2019 88° 4064mV 1000MHz
Sat Sep 14 09:10:17 UTC 2019 88° 4170mV 1000MHz
Sat Sep 14 09:10:18 UTC 2019 88° 4058mV 1000MHz
Sat Sep 14 09:10:19 UTC 2019 88° 4187mV 1000MHz
Sat Sep 14 09:10:21 UTC 2019 88° 4093mV 1000MHz
Sat Sep 14 09:10:22 UTC 2019 88° 4087mV 1000MHz
Sat Sep 14 09:10:23 UTC 2019 90° 4070mV 1000MHz
Sat Sep 14 09:10:25 UTC 2019 88° 4123mV 800MHz
Sat Sep 14 09:10:26 UTC 2019 88° 4064mV 1000MHz
Sat Sep 14 09:10:28 UTC 2019 90° 4058mV 1000MHz
Sat Sep 14 09:10:29 UTC 2019 88° 4076mV 1000MHz
Sat Sep 14 09:10:30 UTC 2019 88° 4064mV 1000MHz
Sat Sep 14 09:10:32 UTC 2019 88° 4117mV 1000MHz
Sat Sep 14 09:10:33 UTC 2019 88° 4105mV 800MHz
Sat Sep 14 09:10:34 UTC 2019 88° 4070mV 1000MHz
Sat Sep 14 09:10:36 UTC 2019 88° 4076mV 1000MHz
Sat Sep 14 09:10:37 UTC 2019 88° 4087mV 1000MHz
Sat Sep 14 09:10:39 UTC 2019 88° 4017mV 1000MHz
Sat Sep 14 09:10:40 UTC 2019 88° 4093mV 1000MHz
Sat Sep 14 09:10:41 UTC 2019 88° 4058mV 800MHz
Sat Sep 14 09:10:42 UTC 2019 88° 4035mV 1000MHz
Sat Sep 14 09:10:44 UTC 2019 90° 4058mV 1000MHz
Sat Sep 14 09:10:45 UTC 2019 88° 4064mV 1000MHz
Sat Sep 14 09:10:47 UTC 2019 88° 4064mV 1000MHz
Sat Sep 14 09:10:48 UTC 2019 88° 4029mV 1000MHz
Sat Sep 14 09:10:50 UTC 2019 90° 4046mV 1000MHz
^Ckill 4680
root@letux:~# cpufreq-info 
cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 0:
  driver: cpufreq-dt
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 300 us.
  hardware limits: 300 MHz - 1000 MHz
  available frequency steps: 300 MHz, 600 MHz, 800 MHz, 1000 MHz
  available cpufreq governors: conservative, userspace, powersave, ondemand, performance
  current policy: frequency should be within 300 MHz and 1000 MHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 600 MHz (asserted by call to hardware).
  cpufreq stats: 300 MHz:22.81%, 600 MHz:2.50%, 800 MHz:2.10%, 1000 MHz:72.59%  (1563)
root@letux:~# 

So OPP is reduced if bandgap sensor reports >= 90°C
which almost immediately makes the temperature
go down.
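
The pattern in the log above can be checked mechanically. The following
is a minimal Python sketch, not part of the test setup: it assumes the
log line layout shown above (timestamp, temperature with a degree sign,
voltage in mV, frequency in MHz), and parse_sample and throttle_events
are hypothetical helper names. For every sample at or above the 90° trip
it reports the frequency of the next sample.

```python
import re

# Matches the tail of a log line such as
# "Sat Sep 14 09:10:23 UTC 2019 90° 4070mV 1000MHz".
SAMPLE_RE = re.compile(r"(\d+)°\s+(\d+)mV\s+(\d+)MHz\s*$")


def parse_sample(line):
    """Return (temp_c, millivolts, mhz), or None if the line is not a sample."""
    m = SAMPLE_RE.search(line)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())


def throttle_events(lines, trip_c=90):
    """Yield (temp_c, next_mhz) for each sample at or above trip_c."""
    samples = [s for s in map(parse_sample, lines) if s is not None]
    for cur, nxt in zip(samples, samples[1:]):
        if cur[0] >= trip_c:
            yield cur[0], nxt[2]


# Six consecutive lines copied from the log above.
LOG = """\
Sat Sep 14 09:10:22 UTC 2019 88° 4087mV 1000MHz
Sat Sep 14 09:10:23 UTC 2019 90° 4070mV 1000MHz
Sat Sep 14 09:10:25 UTC 2019 88° 4123mV 800MHz
Sat Sep 14 09:10:26 UTC 2019 88° 4064mV 1000MHz
Sat Sep 14 09:10:28 UTC 2019 90° 4058mV 1000MHz
Sat Sep 14 09:10:29 UTC 2019 88° 4076mV 1000MHz
""".splitlines()

events = list(throttle_events(LOG))
print(events)
```

The script samples only about once per second, so a short step down to
800MHz may fall between two samples and not show up next to the 90°
reading that caused it.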

No operational hiccups were observed.

Surface temperature of the PoP chip did rise to
approx. 53°C during this test.

Tested-by: H. Nikolaus Schaller <hns@goldelico.com> # on GTA04A5 with dm3730cbp100



* Re: [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family
  2019-09-14  9:20 ` [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family H. Nikolaus Schaller
@ 2019-09-14 13:42   ` Adam Ford
  2019-09-14 14:38     ` H. Nikolaus Schaller
  0 siblings, 1 reply; 12+ messages in thread
From: Adam Ford @ 2019-09-14 13:42 UTC (permalink / raw)
  To: H. Nikolaus Schaller
  Cc: Linux-OMAP, Adam Ford, Nishanth Menon, Benoît Cousson,
	Tony Lindgren, Rob Herring, Mark Rutland, devicetree,
	Linux Kernel Mailing List, Grazvydas Ignotas

On Sat, Sep 14, 2019 at 4:20 AM H. Nikolaus Schaller <hns@goldelico.com> wrote:
>
>
> > On 13.09.2019 at 17:37, Adam Ford <aford173@gmail.com> wrote:
> >
> > The OMAP3530, AM3517 and DM3730 all show thresholds of 90C and 105C
> > depending on commercial or industrial temperature ratings.  This
> > patch expands the thermal information to the limits of 90 and 105
> > for alert and critical.
> >
> > For boards that never use industrial temperatures, these can be
> > changed in their respective device trees with something like:
> >
> > &cpu_alert0 {
> >       temperature = <85000>; /* millicelsius */
> > };
> >
> > &cpu_crit {
> >       temperature = <90000>; /* millicelsius */
> > };
> >
> > Signed-off-by: Adam Ford <aford173@gmail.com>
> > ---
> > V2:  Change the CPU reference to &cpu instead of &cpu0
> >
> > diff --git a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> > index 235ecfd61e2d..dfbd0cb0b00b 100644
> > --- a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> > +++ b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> > @@ -17,4 +17,25 @@ cpu_thermal: cpu_thermal {
> >
> >                       /* sensor       ID */
> >       thermal-sensors = <&bandgap     0>;
> > +
> > +     cpu_trips: trips {
> > +             cpu_alert0: cpu_alert {
> > +                     temperature = <90000>; /* millicelsius */
> > +                     hysteresis = <2000>; /* millicelsius */
> > +                     type = "passive";
> > +             };
> > +             cpu_crit: cpu_crit {
> > +                     temperature = <105000>; /* millicelsius */
> > +                     hysteresis = <2000>; /* millicelsius */
> > +                     type = "critical";
> > +             };
> > +     };
> > +
> > +     cpu_cooling_maps: cooling-maps {
> > +             map0 {
> > +                     trip = <&cpu_alert0>;
> > +                     cooling-device =
> > +                             <&cpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> > +             };
> > +     };
> > };
> > --
> > 2.17.1
> >
>
> Here is my test log (GTA04A5 with DM3730CBP100).
> The "high-load" script drives the NEON at full power
> and would report any calculation errors.
>
> There is no noise visible in the bandgap sensor data
> induced by power supply fluctuations (log shows system
> voltage while charging).
>

Great data!

> root@letux:~# ./high-load -n2
> 100% load stress test for 1 cores running ./neon_loop2
> Sat Sep 14 09:05:50 UTC 2019 65° 4111mV 1000MHz
> Sat Sep 14 09:05:50 UTC 2019 67° 4005mV 1000MHz
> Sat Sep 14 09:05:52 UTC 2019 68° 4000mV 1000MHz
> Sat Sep 14 09:05:53 UTC 2019 68° 4000mV 1000MHz
> Sat Sep 14 09:05:55 UTC 2019 72° 3976mV 1000MHz
> Sat Sep 14 09:05:56 UTC 2019 72° 4023mV 1000MHz
> Sat Sep 14 09:05:57 UTC 2019 72° 3900mV 1000MHz
> Sat Sep 14 09:05:59 UTC 2019 73° 4029mV 1000MHz
> Sat Sep 14 09:06:00 UTC 2019 73° 3988mV 1000MHz
> Sat Sep 14 09:06:01 UTC 2019 73° 4005mV 1000MHz
> Sat Sep 14 09:06:03 UTC 2019 73° 4011mV 1000MHz
> Sat Sep 14 09:06:04 UTC 2019 73° 4117mV 1000MHz
> Sat Sep 14 09:06:06 UTC 2019 73° 4005mV 1000MHz
> Sat Sep 14 09:06:07 UTC 2019 75° 3994mV 1000MHz
> Sat Sep 14 09:06:08 UTC 2019 75° 3970mV 1000MHz
> Sat Sep 14 09:06:09 UTC 2019 75° 4046mV 1000MHz
> Sat Sep 14 09:06:11 UTC 2019 75° 4005mV 1000MHz
> Sat Sep 14 09:06:12 UTC 2019 75° 4023mV 1000MHz
> Sat Sep 14 09:06:14 UTC 2019 75° 3970mV 1000MHz
> Sat Sep 14 09:06:15 UTC 2019 75° 4011mV 1000MHz
> Sat Sep 14 09:06:16 UTC 2019 77° 4017mV 1000MHz
> Sat Sep 14 09:06:18 UTC 2019 77° 3994mV 1000MHz
> Sat Sep 14 09:06:19 UTC 2019 77° 3994mV 1000MHz
> Sat Sep 14 09:06:20 UTC 2019 77° 3988mV 1000MHz
> Sat Sep 14 09:06:22 UTC 2019 77° 4023mV 1000MHz
> Sat Sep 14 09:06:23 UTC 2019 77° 4023mV 1000MHz
> Sat Sep 14 09:06:24 UTC 2019 78° 4005mV 1000MHz
> Sat Sep 14 09:06:26 UTC 2019 78° 4105mV 1000MHz
> Sat Sep 14 09:06:27 UTC 2019 78° 4011mV 1000MHz
> Sat Sep 14 09:06:28 UTC 2019 78° 3994mV 1000MHz
> Sat Sep 14 09:06:30 UTC 2019 78° 4123mV 1000MHz
> ...
> Sat Sep 14 09:09:57 UTC 2019 88° 4082mV 1000MHz
> Sat Sep 14 09:09:59 UTC 2019 88° 4164mV 1000MHz
> Sat Sep 14 09:10:00 UTC 2019 88° 4058mV 1000MHz
> Sat Sep 14 09:10:01 UTC 2019 88° 4058mV 1000MHz
> Sat Sep 14 09:10:03 UTC 2019 88° 4082mV 1000MHz
> Sat Sep 14 09:10:04 UTC 2019 88° 4058mV 1000MHz
> Sat Sep 14 09:10:06 UTC 2019 88° 4146mV 1000MHz
> Sat Sep 14 09:10:07 UTC 2019 88° 4041mV 1000MHz
> Sat Sep 14 09:10:08 UTC 2019 88° 4035mV 1000MHz
> Sat Sep 14 09:10:10 UTC 2019 88° 4052mV 1000MHz
> Sat Sep 14 09:10:11 UTC 2019 88° 4087mV 1000MHz
> Sat Sep 14 09:10:12 UTC 2019 88° 4152mV 1000MHz
> Sat Sep 14 09:10:14 UTC 2019 88° 4070mV 1000MHz
> Sat Sep 14 09:10:15 UTC 2019 88° 4064mV 1000MHz
> Sat Sep 14 09:10:17 UTC 2019 88° 4170mV 1000MHz
> Sat Sep 14 09:10:18 UTC 2019 88° 4058mV 1000MHz
> Sat Sep 14 09:10:19 UTC 2019 88° 4187mV 1000MHz
> Sat Sep 14 09:10:21 UTC 2019 88° 4093mV 1000MHz
> Sat Sep 14 09:10:22 UTC 2019 88° 4087mV 1000MHz
> Sat Sep 14 09:10:23 UTC 2019 90° 4070mV 1000MHz

Should we be a little more conservative?  Without knowing the
accuracy, I believe we do not want to run at 800 MHz or 1 GHz at 90C,
so if we made this value 89 instead of 90, we would throttle a little
earlier.
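
A rough way to see what lowering the trip to 89 would change is to apply
both values to the readings in the quoted log. The sketch below is Python
and rests on two assumptions that are not established in this thread:
that the passive trip counts as reached when the reported temperature is
greater than or equal to the trip temperature, and that the whole-degree
values in the log reflect what the sensor reports. READINGS_C lists the
distinct values visible in the excerpt (the log is elided between 78 and
88), and tripped is a hypothetical helper.

```python
# Distinct temperatures (degrees C) that appear in the quoted log excerpt.
READINGS_C = [65, 67, 68, 72, 73, 75, 77, 78, 88, 90]


def tripped(readings_c, trip_millicelsius):
    """Return the readings that reach the trip (assumed rule: temp >= trip)."""
    return [t for t in readings_c if t * 1000 >= trip_millicelsius]


at_90 = tripped(READINGS_C, 90000)
at_89 = tripped(READINGS_C, 89000)
at_88 = tripped(READINGS_C, 88000)

print(at_90, at_89, at_88)
```

With these readings, 89000 and 90000 select the same samples, because no
value between 88 and 90 shows up in the excerpt; only a trip of 88000 or
lower would throttle earlier. Whether the sensor can report 89 at all is
not something the log settles.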

> Sat Sep 14 09:10:25 UTC 2019 88° 4123mV 800MHz
> Sat Sep 14 09:10:26 UTC 2019 88° 4064mV 1000MHz
> Sat Sep 14 09:10:28 UTC 2019 90° 4058mV 1000MHz

Again here, if I interpret the data sheet correctly, we're technically out of spec.

> Sat Sep 14 09:10:29 UTC 2019 88° 4076mV 1000MHz
> Sat Sep 14 09:10:30 UTC 2019 88° 4064mV 1000MHz
> Sat Sep 14 09:10:32 UTC 2019 88° 4117mV 1000MHz
> Sat Sep 14 09:10:33 UTC 2019 88° 4105mV 800MHz
> Sat Sep 14 09:10:34 UTC 2019 88° 4070mV 1000MHz
> Sat Sep 14 09:10:36 UTC 2019 88° 4076mV 1000MHz
> Sat Sep 14 09:10:37 UTC 2019 88° 4087mV 1000MHz
> Sat Sep 14 09:10:39 UTC 2019 88° 4017mV 1000MHz
> Sat Sep 14 09:10:40 UTC 2019 88° 4093mV 1000MHz
> Sat Sep 14 09:10:41 UTC 2019 88° 4058mV 800MHz
> Sat Sep 14 09:10:42 UTC 2019 88° 4035mV 1000MHz
> Sat Sep 14 09:10:44 UTC 2019 90° 4058mV 1000MHz
> Sat Sep 14 09:10:45 UTC 2019 88° 4064mV 1000MHz
> Sat Sep 14 09:10:47 UTC 2019 88° 4064mV 1000MHz
> Sat Sep 14 09:10:48 UTC 2019 88° 4029mV 1000MHz
> Sat Sep 14 09:10:50 UTC 2019 90° 4046mV 1000MHz
> ^Ckill 4680
> root@letux:~# cpufreq-info
> cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
> Report errors and bugs to cpufreq@vger.kernel.org, please.
> analyzing CPU 0:
>   driver: cpufreq-dt
>   CPUs which run at the same hardware frequency: 0
>   CPUs which need to have their frequency coordinated by software: 0
>   maximum transition latency: 300 us.
>   hardware limits: 300 MHz - 1000 MHz
>   available frequency steps: 300 MHz, 600 MHz, 800 MHz, 1000 MHz
>   available cpufreq governors: conservative, userspace, powersave, ondemand, performance
>   current policy: frequency should be within 300 MHz and 1000 MHz.
>                   The governor "ondemand" may decide which speed to use
>                   within this range.
>   current CPU frequency is 600 MHz (asserted by call to hardware).
>   cpufreq stats: 300 MHz:22.81%, 600 MHz:2.50%, 800 MHz:2.10%, 1000 MHz:72.59%  (1563)
> root@letux:~#
>
> So OPP is reduced if bandgap sensor reports >= 90°C
> which almost immediately makes the temperature
> go down.
>
> No operational hiccups were observed.
>
> Surface temperature of the PoP chip did rise to
> approx. 53°C during this test.
>
> Tested-by: H. Nikolaus Schaller <hns@goldelico.com> # on GTA04A5 with dm3730cbp100
>


* Re: [RFC v2 2/2] ARM: omap3: Consolidate thermal references to common omap3
       [not found]   ` <40FEEAC9-8F19-466F-83C3-C8F0142D44B7@goldelico.com>
@ 2019-09-14 13:47     ` Adam Ford
  0 siblings, 0 replies; 12+ messages in thread
From: Adam Ford @ 2019-09-14 13:47 UTC (permalink / raw)
  To: H. Nikolaus Schaller
  Cc: Linux-OMAP, Adam Ford, Nishanth Menon, Benoît Cousson,
	Tony Lindgren, Rob Herring, Mark Rutland, devicetree,
	Linux Kernel Mailing List

On Sat, Sep 14, 2019 at 4:25 AM H. Nikolaus Schaller <hns@goldelico.com> wrote:
>
>
> > On 13.09.2019 at 17:37, Adam Ford <aford173@gmail.com> wrote:
> >
> > Because the omap34xx, omap36xx and am3517 SoCs have the same
> > thermal junction limits, there is no need to duplicate the entry
> > multiple times.
> >
> > This patch removes the thermal references from omap36xx and
> > omap34xx and pushes them into the common omap3.dtsi file with
> > the added benefit of enabling the thermal info on the AM3517.
> >
> > Signed-off-by: Adam Ford <aford173@gmail.com>

Disregard this patch.  I'll drop it based on Nikolaus' comments below.

> > ---
> > V2:   Add node name for cpu and add cooling-cells entry
> >
> > diff --git a/arch/arm/boot/dts/omap3.dtsi b/arch/arm/boot/dts/omap3.dtsi
> > index 4043ecb38016..84704eb3b604 100644
> > --- a/arch/arm/boot/dts/omap3.dtsi
> > +++ b/arch/arm/boot/dts/omap3.dtsi
> > @@ -32,7 +32,7 @@
> >               #address-cells = <1>;
> >               #size-cells = <0>;
> >
> > -             cpu@0 {
> > +             cpu: cpu@0 {
> >                       compatible = "arm,cortex-a8";
> >                       device_type = "cpu";
> >                       reg = <0x0>;
> > @@ -41,9 +41,14 @@
> >                       clock-names = "cpu";
> >
> >                       clock-latency = <300000>; /* From omap-cpufreq driver */
> > +                     #cooling-cells = <2>;
> >               };
> >       };
>
> Looks ok.
>
> >
> > +     thermal_zones: thermal-zones {
> > +             #include "omap3-cpu-thermal.dtsi"
> > +     };
> > +
>
> I have observed one compile issue: this is also included indirectly by am3517.dtsi,
> and the included code refers to <&bandgap 0>, but there is no bandgap definition in am3517.dtsi.
>
> Therefore I studied the am35x TRM (SPRUGR0C) and compared it to the am/dm37x TRM (SPRUGN4M).
>
> But I can't find a bandgap temperature sensor with ADC as it is described in
> "13.4.6 Band Gap Voltage and Temperature Sensor" for the am/dm37x. Only
> "BANDGAP Logic" exists in both, and both have the CM_FCLKEN3_CORE but with
> a different meaning for bit 0.

I didn't read the technical details; I just read that there was bandgap
logic, so I assumed the sensor existed.

>
> There is also no description of a CONTROL_TEMP_SENSOR (0x48002524) register for am35x.
> (note: the register is also documented for omap3530).

Thanks for looking into this.

>
> So this might mean that the am35x does not have this feature unless TI simply
> did not document it because the chip is specified for a single OPP only, where it
> makes no sense to monitor the temperature.
>
> We can only find out whether there is an undocumented bandgap converter
> by looking at 0x48002524.

I will try to read this register when I have some time, but I have to
watch Chelsea FC play in 15 minutes.  ;-)
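
For reference, one way such a read could be done from user space is
sketched below in Python. This is an assumption-laden illustration, not
something used in this thread: it presumes a kernel that exposes /dev/mem
and permits mapping that address, root privileges, and a little-endian
32-bit register. read_reg32 is a hypothetical helper. The slice read is
not guaranteed to be a single 32-bit bus access, so a small C helper or a
dedicated tool may be preferable on real hardware.

```python
import mmap
import os
import struct

CONTROL_TEMP_SENSOR = 0x48002524  # register address quoted above


def read_reg32(addr, path="/dev/mem"):
    """Read one little-endian 32-bit word at byte offset addr of path."""
    page = mmap.ALLOCATIONGRANULARITY
    base = addr & ~(page - 1)  # mmap offsets must be page aligned
    offset = addr - base
    fd = os.open(path, os.O_RDONLY | os.O_SYNC)
    try:
        with mmap.mmap(fd, page, access=mmap.ACCESS_READ, offset=base) as mem:
            (value,) = struct.unpack("<I", mem[offset:offset + 4])
            return value
    finally:
        os.close(fd)
```

On a target board this would be called as
hex(read_reg32(CONTROL_TEMP_SENSOR)). What a nonzero or zero result means
on the am35x is exactly the open question above, since the register is
not documented for that chip.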

>
> Which means we probably can't make thermal throttling work for it. And even
> if the bandgap sensor exists, we are lacking a value -> Celsius table.

I think it's probably best to abandon this patch, as I said above, based on
all your comments.

>
>
> >       pmu@54000000 {
> >               compatible = "arm,cortex-a8-pmu";
> >               reg = <0x54000000 0x800000>;
> > diff --git a/arch/arm/boot/dts/omap34xx.dtsi b/arch/arm/boot/dts/omap34xx.dtsi
> > index f572a477f74c..b80378d6e5c1 100644
> > --- a/arch/arm/boot/dts/omap34xx.dtsi
> > +++ b/arch/arm/boot/dts/omap34xx.dtsi
> > @@ -101,10 +101,6 @@
> >                       };
> >               };
> >       };
> > -
> > -     thermal_zones: thermal-zones {
> > -             #include "omap3-cpu-thermal.dtsi"
> > -     };
> > };
> >
> > &ssi {
> > diff --git a/arch/arm/boot/dts/omap36xx.dtsi b/arch/arm/boot/dts/omap36xx.dtsi
> > index 6fb23ada1f64..ff2dca63a04e 100644
> > --- a/arch/arm/boot/dts/omap36xx.dtsi
> > +++ b/arch/arm/boot/dts/omap36xx.dtsi
> > @@ -140,10 +140,6 @@
> >                       };
> >               };
> >       };
> > -
> > -     thermal_zones: thermal-zones {
> > -             #include "omap3-cpu-thermal.dtsi"
> > -     };
> > };
>
> So if we have to exclude the am3517, we cannot apply the rearrangement part
> of this patch.
>
> I'd suggest moving the cpu: cpu@0 and #cooling-cells into 1/2 (also to make it
> compile stand-alone), and doing the consolidation separately, if we can fix the
> am3517 bandgap sensor issue.

I'll drop this, leave everything in the omap3-cpu-thermal file, and
let omap34xx and omap36xx point to it as we do now.

>
> >
> > /* OMAP3630 needs dss_96m_fck for VENC */
> > --
> > 2.17.1
> >
>
> Tested-by: H. Nikolaus Schaller <hns@goldelico.com> # on GTA04A5 with dm3730cbp100
>


* Re: [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family
  2019-09-14 13:42   ` Adam Ford
@ 2019-09-14 14:38     ` H. Nikolaus Schaller
  2019-09-14 16:12       ` Adam Ford
  0 siblings, 1 reply; 12+ messages in thread
From: H. Nikolaus Schaller @ 2019-09-14 14:38 UTC (permalink / raw)
  To: Adam Ford
  Cc: Linux-OMAP, Adam Ford, Nishanth Menon, Benoît Cousson,
	Tony Lindgren, Rob Herring, Mark Rutland, devicetree,
	Linux Kernel Mailing List, Grazvydas Ignotas


> On 14.09.2019 at 15:42, Adam Ford <aford173@gmail.com> wrote:
> 
> On Sat, Sep 14, 2019 at 4:20 AM H. Nikolaus Schaller <hns@goldelico.com> wrote:
>> 
>> 
>>> On 13.09.2019 at 17:37, Adam Ford <aford173@gmail.com> wrote:
>>> 
>>> The OMAP3530, AM3517 and DM3730 all show thresholds of 90C and 105C
>>> depending on commercial or industrial temperature ratings.  This
>>> patch expands the thermal information to the limits of 90 and 105
>>> for alert and critical.
>>> 
>>> For boards that never use industrial temperatures, these can be
>>> changed in their respective device trees with something like:
>>> 
>>> &cpu_alert0 {
>>>      temperature = <85000>; /* millicelsius */
>>> };
>>> 
>>> &cpu_crit {
>>>      temperature = <90000>; /* millicelsius */
>>> };
>>> 
>>> Signed-off-by: Adam Ford <aford173@gmail.com>
>>> ---
>>> V2:  Change the CPU reference to &cpu instead of &cpu0
>>> 
>>> diff --git a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
>>> index 235ecfd61e2d..dfbd0cb0b00b 100644
>>> --- a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
>>> +++ b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
>>> @@ -17,4 +17,25 @@ cpu_thermal: cpu_thermal {
>>> 
>>>                      /* sensor       ID */
>>>      thermal-sensors = <&bandgap     0>;
>>> +
>>> +     cpu_trips: trips {
>>> +             cpu_alert0: cpu_alert {
>>> +                     temperature = <90000>; /* millicelsius */
>>> +                     hysteresis = <2000>; /* millicelsius */
>>> +                     type = "passive";
>>> +             };
>>> +             cpu_crit: cpu_crit {
>>> +                     temperature = <105000>; /* millicelsius */
>>> +                     hysteresis = <2000>; /* millicelsius */
>>> +                     type = "critical";
>>> +             };
>>> +     };
>>> +
>>> +     cpu_cooling_maps: cooling-maps {
>>> +             map0 {
>>> +                     trip = <&cpu_alert0>;
>>> +                     cooling-device =
>>> +                             <&cpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
>>> +             };
>>> +     };
>>> };
>>> --
>>> 2.17.1
>>> 
>> 
>> Here is my test log (GTA04A5 with DM3730CBP100).
>> The "high-load" script drives the NEON at full power
>> and would report any calculation errors.
>> 
>> There is no noise visible in the bandgap sensor data
>> induced by power supply fluctuations (log shows system
>> voltage while charging).
>> 
> 
> Great data!
> 
>> root@letux:~# ./high-load -n2
>> 100% load stress test for 1 cores running ./neon_loop2
>> Sat Sep 14 09:05:50 UTC 2019 65° 4111mV 1000MHz
>> Sat Sep 14 09:05:50 UTC 2019 67° 4005mV 1000MHz
>> Sat Sep 14 09:05:52 UTC 2019 68° 4000mV 1000MHz
>> Sat Sep 14 09:05:53 UTC 2019 68° 4000mV 1000MHz
>> Sat Sep 14 09:05:55 UTC 2019 72° 3976mV 1000MHz
>> Sat Sep 14 09:05:56 UTC 2019 72° 4023mV 1000MHz
>> Sat Sep 14 09:05:57 UTC 2019 72° 3900mV 1000MHz
>> Sat Sep 14 09:05:59 UTC 2019 73° 4029mV 1000MHz
>> Sat Sep 14 09:06:00 UTC 2019 73° 3988mV 1000MHz
>> Sat Sep 14 09:06:01 UTC 2019 73° 4005mV 1000MHz
>> Sat Sep 14 09:06:03 UTC 2019 73° 4011mV 1000MHz
>> Sat Sep 14 09:06:04 UTC 2019 73° 4117mV 1000MHz
>> Sat Sep 14 09:06:06 UTC 2019 73° 4005mV 1000MHz
>> Sat Sep 14 09:06:07 UTC 2019 75° 3994mV 1000MHz
>> Sat Sep 14 09:06:08 UTC 2019 75° 3970mV 1000MHz
>> Sat Sep 14 09:06:09 UTC 2019 75° 4046mV 1000MHz
>> Sat Sep 14 09:06:11 UTC 2019 75° 4005mV 1000MHz
>> Sat Sep 14 09:06:12 UTC 2019 75° 4023mV 1000MHz
>> Sat Sep 14 09:06:14 UTC 2019 75° 3970mV 1000MHz
>> Sat Sep 14 09:06:15 UTC 2019 75° 4011mV 1000MHz
>> Sat Sep 14 09:06:16 UTC 2019 77° 4017mV 1000MHz
>> Sat Sep 14 09:06:18 UTC 2019 77° 3994mV 1000MHz
>> Sat Sep 14 09:06:19 UTC 2019 77° 3994mV 1000MHz
>> Sat Sep 14 09:06:20 UTC 2019 77° 3988mV 1000MHz
>> Sat Sep 14 09:06:22 UTC 2019 77° 4023mV 1000MHz
>> Sat Sep 14 09:06:23 UTC 2019 77° 4023mV 1000MHz
>> Sat Sep 14 09:06:24 UTC 2019 78° 4005mV 1000MHz
>> Sat Sep 14 09:06:26 UTC 2019 78° 4105mV 1000MHz
>> Sat Sep 14 09:06:27 UTC 2019 78° 4011mV 1000MHz
>> Sat Sep 14 09:06:28 UTC 2019 78° 3994mV 1000MHz
>> Sat Sep 14 09:06:30 UTC 2019 78° 4123mV 1000MHz
>> ...
>> Sat Sep 14 09:09:57 UTC 2019 88° 4082mV 1000MHz
>> Sat Sep 14 09:09:59 UTC 2019 88° 4164mV 1000MHz
>> Sat Sep 14 09:10:00 UTC 2019 88° 4058mV 1000MHz
>> Sat Sep 14 09:10:01 UTC 2019 88° 4058mV 1000MHz
>> Sat Sep 14 09:10:03 UTC 2019 88° 4082mV 1000MHz
>> Sat Sep 14 09:10:04 UTC 2019 88° 4058mV 1000MHz
>> Sat Sep 14 09:10:06 UTC 2019 88° 4146mV 1000MHz
>> Sat Sep 14 09:10:07 UTC 2019 88° 4041mV 1000MHz
>> Sat Sep 14 09:10:08 UTC 2019 88° 4035mV 1000MHz
>> Sat Sep 14 09:10:10 UTC 2019 88° 4052mV 1000MHz
>> Sat Sep 14 09:10:11 UTC 2019 88° 4087mV 1000MHz
>> Sat Sep 14 09:10:12 UTC 2019 88° 4152mV 1000MHz
>> Sat Sep 14 09:10:14 UTC 2019 88° 4070mV 1000MHz
>> Sat Sep 14 09:10:15 UTC 2019 88° 4064mV 1000MHz
>> Sat Sep 14 09:10:17 UTC 2019 88° 4170mV 1000MHz
>> Sat Sep 14 09:10:18 UTC 2019 88° 4058mV 1000MHz
>> Sat Sep 14 09:10:19 UTC 2019 88° 4187mV 1000MHz
>> Sat Sep 14 09:10:21 UTC 2019 88° 4093mV 1000MHz
>> Sat Sep 14 09:10:22 UTC 2019 88° 4087mV 1000MHz
>> Sat Sep 14 09:10:23 UTC 2019 90° 4070mV 1000MHz
> 
> Should we be a little more conservative?  Without knowing the
> accuracy, I believe we do not want to run at 800 MHz or 1 GHz at 90C,
> so if we made this value 89 instead of 90, we would throttle a little
> earlier.

Well, the OMAP5 also defines exactly 100°C in the device tree.

I would assume that the bandgap sensor accuracy is such that it
never reports less than the real temperature. So if we
throttle at a reported 90°, TJ is likely lower.

>> Sat Sep 14 09:10:25 UTC 2019 88° 4123mV 800MHz
>> Sat Sep 14 09:10:26 UTC 2019 88° 4064mV 1000MHz
>> Sat Sep 14 09:10:28 UTC 2019 90° 4058mV 1000MHz
> 
> Again here, if I interpret the data sheet correctly, we're technically out of spec.

I read the data sheet as saying that 90°C at OPP1G is still within spec.
91 would obviously be outside (if the bandgap sensor is precise).

> 
>> Sat Sep 14 09:10:29 UTC 2019 88° 4076mV 1000MHz
>> Sat Sep 14 09:10:30 UTC 2019 88° 4064mV 1000MHz
>> Sat Sep 14 09:10:32 UTC 2019 88° 4117mV 1000MHz
>> Sat Sep 14 09:10:33 UTC 2019 88° 4105mV 800MHz
>> Sat Sep 14 09:10:34 UTC 2019 88° 4070mV 1000MHz
>> Sat Sep 14 09:10:36 UTC 2019 88° 4076mV 1000MHz
>> Sat Sep 14 09:10:37 UTC 2019 88° 4087mV 1000MHz
>> Sat Sep 14 09:10:39 UTC 2019 88° 4017mV 1000MHz
>> Sat Sep 14 09:10:40 UTC 2019 88° 4093mV 1000MHz
>> Sat Sep 14 09:10:41 UTC 2019 88° 4058mV 800MHz
>> Sat Sep 14 09:10:42 UTC 2019 88° 4035mV 1000MHz
>> Sat Sep 14 09:10:44 UTC 2019 90° 4058mV 1000MHz
>> Sat Sep 14 09:10:45 UTC 2019 88° 4064mV 1000MHz
>> Sat Sep 14 09:10:47 UTC 2019 88° 4064mV 1000MHz
>> Sat Sep 14 09:10:48 UTC 2019 88° 4029mV 1000MHz
>> Sat Sep 14 09:10:50 UTC 2019 90° 4046mV 1000MHz
>> ^Ckill 4680
>> root@letux:~# cpufreq-info
>> cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
>> Report errors and bugs to cpufreq@vger.kernel.org, please.
>> analyzing CPU 0:
>>  driver: cpufreq-dt
>>  CPUs which run at the same hardware frequency: 0
>>  CPUs which need to have their frequency coordinated by software: 0
>>  maximum transition latency: 300 us.
>>  hardware limits: 300 MHz - 1000 MHz
>>  available frequency steps: 300 MHz, 600 MHz, 800 MHz, 1000 MHz
>>  available cpufreq governors: conservative, userspace, powersave, ondemand, performance
>>  current policy: frequency should be within 300 MHz and 1000 MHz.
>>                  The governor "ondemand" may decide which speed to use
>>                  within this range.
>>  current CPU frequency is 600 MHz (asserted by call to hardware).
>>  cpufreq stats: 300 MHz:22.81%, 600 MHz:2.50%, 800 MHz:2.10%, 1000 MHz:72.59%  (1563)
>> root@letux:~#
>> 
>> So OPP is reduced if bandgap sensor reports >= 90°C
>> which almost immediately makes the temperature
>> go down.
>> 
>> No operational hiccups were observed.
>> 
>> Surface temperature of the PoP chip did rise to
>> approx. 53°C during this test.
>> 
>> Tested-by: H. Nikolaus Schaller <hns@goldelico.com> # on GTA04A5 with dm3730cbp100
>> 

BTW: this patch (set) is even independent of my 1GHz OPP patches.
Should also work with OPP-v1 definitions so that maintainers can
decide which one to apply first.

It is just more difficult to reach TJ of 90°C without 1GHz.

BR,
Nikolaus


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family
  2019-09-14 14:38     ` H. Nikolaus Schaller
@ 2019-09-14 16:12       ` Adam Ford
  2019-10-07 15:11         ` Adam Ford
  0 siblings, 1 reply; 12+ messages in thread
From: Adam Ford @ 2019-09-14 16:12 UTC (permalink / raw)
  To: H. Nikolaus Schaller
  Cc: Linux-OMAP, Adam Ford, Nishanth Menon, Benoît Cousson,
	Tony Lindgren, Rob Herring, Mark Rutland, devicetree,
	Linux Kernel Mailing List, Grazvydas Ignotas

On Sat, Sep 14, 2019 at 9:38 AM H. Nikolaus Schaller <hns@goldelico.com> wrote:
>
>
> > Am 14.09.2019 um 15:42 schrieb Adam Ford <aford173@gmail.com>:
> >
> > On Sat, Sep 14, 2019 at 4:20 AM H. Nikolaus Schaller <hns@goldelico.com> wrote:
> >>
> >>
> >>> Am 13.09.2019 um 17:37 schrieb Adam Ford <aford173@gmail.com>:
> >>>
> >>> The OMAP3530, AM3517 and DM3730 all show thresholds of 90C and 105C
> >>> depending on commercial or industrial temperature ratings.  This
> >>> patch expands the thermal information to the limits of 90 and 105
> >>> for alert and critical.
> >>>
> >>> For boards who never use industrial temperatures, these can be
> >>> changed on their respective device trees with something like:
> >>>
> >>> &cpu_alert0 {
> >>>      temperature = <85000>; /* millicelsius */
> >>> };
> >>>
> >>> &cpu_crit {
> >>>      temperature = <90000>; /* millicelsius */
> >>> };
> >>>
> >>> Signed-off-by: Adam Ford <aford173@gmail.com>
> >>> ---
> >>> V2:  Change the CPU reference to &cpu instead of &cpu0
> >>>
> >>> diff --git a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> >>> index 235ecfd61e2d..dfbd0cb0b00b 100644
> >>> --- a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> >>> +++ b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> >>> @@ -17,4 +17,25 @@ cpu_thermal: cpu_thermal {
> >>>
> >>>                      /* sensor       ID */
> >>>      thermal-sensors = <&bandgap     0>;
> >>> +
> >>> +     cpu_trips: trips {
> >>> +             cpu_alert0: cpu_alert {
> >>> +                     temperature = <90000>; /* millicelsius */
> >>> +                     hysteresis = <2000>; /* millicelsius */
> >>> +                     type = "passive";
> >>> +             };
> >>> +             cpu_crit: cpu_crit {
> >>> +                     temperature = <105000>; /* millicelsius */
> >>> +                     hysteresis = <2000>; /* millicelsius */
> >>> +                     type = "critical";
> >>> +             };
> >>> +     };
> >>> +
> >>> +     cpu_cooling_maps: cooling-maps {
> >>> +             map0 {
> >>> +                     trip = <&cpu_alert0>;
> >>> +                     cooling-device =
> >>> +                             <&cpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> >>> +             };
> >>> +     };
> >>> };
> >>> --
> >>> 2.17.1
> >>>
> >>
> >> Here is my test log (GTA04A5 with DM3730CBP100).
> >> "high-load" script is driving the NEON to full power
> >> and would report calculation errors.
> >>
> >> There is no noise visible in the bandgap sensor data
> >> induced by power supply fluctuations (log shows system
> >> voltage while charging).
> >>
> >
> > Great data!
> >
> >> root@letux:~# ./high-load -n2
> >> 100% load stress test for 1 cores running ./neon_loop2
> >> Sat Sep 14 09:05:50 UTC 2019 65° 4111mV 1000MHz
> >> Sat Sep 14 09:05:50 UTC 2019 67° 4005mV 1000MHz
> >> Sat Sep 14 09:05:52 UTC 2019 68° 4000mV 1000MHz
> >> Sat Sep 14 09:05:53 UTC 2019 68° 4000mV 1000MHz
> >> Sat Sep 14 09:05:55 UTC 2019 72° 3976mV 1000MHz
> >> Sat Sep 14 09:05:56 UTC 2019 72° 4023mV 1000MHz
> >> Sat Sep 14 09:05:57 UTC 2019 72° 3900mV 1000MHz
> >> Sat Sep 14 09:05:59 UTC 2019 73° 4029mV 1000MHz
> >> Sat Sep 14 09:06:00 UTC 2019 73° 3988mV 1000MHz
> >> Sat Sep 14 09:06:01 UTC 2019 73° 4005mV 1000MHz
> >> Sat Sep 14 09:06:03 UTC 2019 73° 4011mV 1000MHz
> >> Sat Sep 14 09:06:04 UTC 2019 73° 4117mV 1000MHz
> >> Sat Sep 14 09:06:06 UTC 2019 73° 4005mV 1000MHz
> >> Sat Sep 14 09:06:07 UTC 2019 75° 3994mV 1000MHz
> >> Sat Sep 14 09:06:08 UTC 2019 75° 3970mV 1000MHz
> >> Sat Sep 14 09:06:09 UTC 2019 75° 4046mV 1000MHz
> >> Sat Sep 14 09:06:11 UTC 2019 75° 4005mV 1000MHz
> >> Sat Sep 14 09:06:12 UTC 2019 75° 4023mV 1000MHz
> >> Sat Sep 14 09:06:14 UTC 2019 75° 3970mV 1000MHz
> >> Sat Sep 14 09:06:15 UTC 2019 75° 4011mV 1000MHz
> >> Sat Sep 14 09:06:16 UTC 2019 77° 4017mV 1000MHz
> >> Sat Sep 14 09:06:18 UTC 2019 77° 3994mV 1000MHz
> >> Sat Sep 14 09:06:19 UTC 2019 77° 3994mV 1000MHz
> >> Sat Sep 14 09:06:20 UTC 2019 77° 3988mV 1000MHz
> >> Sat Sep 14 09:06:22 UTC 2019 77° 4023mV 1000MHz
> >> Sat Sep 14 09:06:23 UTC 2019 77° 4023mV 1000MHz
> >> Sat Sep 14 09:06:24 UTC 2019 78° 4005mV 1000MHz
> >> Sat Sep 14 09:06:26 UTC 2019 78° 4105mV 1000MHz
> >> Sat Sep 14 09:06:27 UTC 2019 78° 4011mV 1000MHz
> >> Sat Sep 14 09:06:28 UTC 2019 78° 3994mV 1000MHz
> >> Sat Sep 14 09:06:30 UTC 2019 78° 4123mV 1000MHz
> >> ...
> >> Sat Sep 14 09:09:57 UTC 2019 88° 4082mV 1000MHz
> >> Sat Sep 14 09:09:59 UTC 2019 88° 4164mV 1000MHz
> >> Sat Sep 14 09:10:00 UTC 2019 88° 4058mV 1000MHz
> >> Sat Sep 14 09:10:01 UTC 2019 88° 4058mV 1000MHz
> >> Sat Sep 14 09:10:03 UTC 2019 88° 4082mV 1000MHz
> >> Sat Sep 14 09:10:04 UTC 2019 88° 4058mV 1000MHz
> >> Sat Sep 14 09:10:06 UTC 2019 88° 4146mV 1000MHz
> >> Sat Sep 14 09:10:07 UTC 2019 88° 4041mV 1000MHz
> >> Sat Sep 14 09:10:08 UTC 2019 88° 4035mV 1000MHz
> >> Sat Sep 14 09:10:10 UTC 2019 88° 4052mV 1000MHz
> >> Sat Sep 14 09:10:11 UTC 2019 88° 4087mV 1000MHz
> >> Sat Sep 14 09:10:12 UTC 2019 88° 4152mV 1000MHz
> >> Sat Sep 14 09:10:14 UTC 2019 88° 4070mV 1000MHz
> >> Sat Sep 14 09:10:15 UTC 2019 88° 4064mV 1000MHz
> >> Sat Sep 14 09:10:17 UTC 2019 88° 4170mV 1000MHz
> >> Sat Sep 14 09:10:18 UTC 2019 88° 4058mV 1000MHz
> >> Sat Sep 14 09:10:19 UTC 2019 88° 4187mV 1000MHz
> >> Sat Sep 14 09:10:21 UTC 2019 88° 4093mV 1000MHz
> >> Sat Sep 14 09:10:22 UTC 2019 88° 4087mV 1000MHz
> >> Sat Sep 14 09:10:23 UTC 2019 90° 4070mV 1000MHz
> >
> > Should we be a little more conservative?  Without knowing the
> > accuracy, i believe we do not want to run at 800 or 1GHz at 90C, so if
> > we made this value 89 instead of 90, we would throttle a little more
> > conservatively.
>
> Well, the OMAP5 also defines exactly 100°C in the device tree.
>
> I would assume that the bandgap sensor accuracy is such that it
> never reports less than the real temperature. So if we
> throttle at a reported 90°, TJ is likely lower.
>
> >> Sat Sep 14 09:10:25 UTC 2019 88° 4123mV 800MHz
> >> Sat Sep 14 09:10:26 UTC 2019 88° 4064mV 1000MHz
> >> Sat Sep 14 09:10:28 UTC 2019 90° 4058mV 1000MHz
> >
> > Again here, I interpret the data sheet correctly, we're technically out of spec
>
> I read the data sheet as if 90°C at OPP1G is still within spec.
> 91 would be obviously outside (if bandgap sensor is precise).
>
> >
> >> Sat Sep 14 09:10:29 UTC 2019 88° 4076mV 1000MHz
> >> Sat Sep 14 09:10:30 UTC 2019 88° 4064mV 1000MHz
> >> Sat Sep 14 09:10:32 UTC 2019 88° 4117mV 1000MHz
> >> Sat Sep 14 09:10:33 UTC 2019 88° 4105mV 800MHz
> >> Sat Sep 14 09:10:34 UTC 2019 88° 4070mV 1000MHz
> >> Sat Sep 14 09:10:36 UTC 2019 88° 4076mV 1000MHz
> >> Sat Sep 14 09:10:37 UTC 2019 88° 4087mV 1000MHz
> >> Sat Sep 14 09:10:39 UTC 2019 88° 4017mV 1000MHz
> >> Sat Sep 14 09:10:40 UTC 2019 88° 4093mV 1000MHz
> >> Sat Sep 14 09:10:41 UTC 2019 88° 4058mV 800MHz
> >> Sat Sep 14 09:10:42 UTC 2019 88° 4035mV 1000MHz
> >> Sat Sep 14 09:10:44 UTC 2019 90° 4058mV 1000MHz
> >> Sat Sep 14 09:10:45 UTC 2019 88° 4064mV 1000MHz
> >> Sat Sep 14 09:10:47 UTC 2019 88° 4064mV 1000MHz
> >> Sat Sep 14 09:10:48 UTC 2019 88° 4029mV 1000MHz
> >> Sat Sep 14 09:10:50 UTC 2019 90° 4046mV 1000MHz
> >> ^Ckill 4680
> >> root@letux:~# cpufreq-info
> >> cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
> >> Report errors and bugs to cpufreq@vger.kernel.org, please.
> >> analyzing CPU 0:
> >>  driver: cpufreq-dt
> >>  CPUs which run at the same hardware frequency: 0
> >>  CPUs which need to have their frequency coordinated by software: 0
> >>  maximum transition latency: 300 us.
> >>  hardware limits: 300 MHz - 1000 MHz
> >>  available frequency steps: 300 MHz, 600 MHz, 800 MHz, 1000 MHz
> >>  available cpufreq governors: conservative, userspace, powersave, ondemand, performance
> >>  current policy: frequency should be within 300 MHz and 1000 MHz.
> >>                  The governor "ondemand" may decide which speed to use
> >>                  within this range.
> >>  current CPU frequency is 600 MHz (asserted by call to hardware).
> >>  cpufreq stats: 300 MHz:22.81%, 600 MHz:2.50%, 800 MHz:2.10%, 1000 MHz:72.59%  (1563)
> >> root@letux:~#
> >>
> >> So OPP is reduced if bandgap sensor reports >= 90°C
> >> which almost immediately makes the temperature
> >> go down.
> >>
> >> No operational hiccups were observed.
> >>
> >> Surface temperature of the PoP chip did rise to
> >> approx. 53°C during this test.
> >>
> >> Tested-by: H. Nikolaus Schaller <hns@goldelico.com> # on GTA04A5 with dm3730cbp100
> >>
>
> BTW: this patch (set) is even independent of my 1GHz OPP patches.
> Should also work with OPP-v1 definitions so that maintainers can
> decide which one to apply first.

If I am going to integrate the cooling references into the &cpu node, I'll
probably base it on your work since the cooling isn't really that
important until we exceed 800MHz.  If I do it on the current linux
master or omap for-next branch, it may not apply cleanly.

>
> It is just more difficult to reach TJ of 90°C without 1GHz.

If it even does at all without external influences.

adam
>
> BR,
> Nikolaus
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family
  2019-09-14 16:12       ` Adam Ford
@ 2019-10-07 15:11         ` Adam Ford
  2019-10-07 15:44           ` H. Nikolaus Schaller
  0 siblings, 1 reply; 12+ messages in thread
From: Adam Ford @ 2019-10-07 15:11 UTC (permalink / raw)
  To: H. Nikolaus Schaller
  Cc: Linux-OMAP, Adam Ford, Nishanth Menon, Benoît Cousson,
	Tony Lindgren, Rob Herring, Mark Rutland, devicetree,
	Linux Kernel Mailing List, Grazvydas Ignotas

On Sat, Sep 14, 2019 at 11:12 AM Adam Ford <aford173@gmail.com> wrote:
>
> On Sat, Sep 14, 2019 at 9:38 AM H. Nikolaus Schaller <hns@goldelico.com> wrote:
> >
> >
> > > Am 14.09.2019 um 15:42 schrieb Adam Ford <aford173@gmail.com>:
> > >
> > > On Sat, Sep 14, 2019 at 4:20 AM H. Nikolaus Schaller <hns@goldelico.com> wrote:
> > >>
> > >>
> > >>> Am 13.09.2019 um 17:37 schrieb Adam Ford <aford173@gmail.com>:
> > >>>
> > >>> The OMAP3530, AM3517 and DM3730 all show thresholds of 90C and 105C
> > >>> depending on commercial or industrial temperature ratings.  This
> > >>> patch expands the thermal information to the limits of 90 and 105
> > >>> for alert and critical.
> > >>>

Tom / anyone from TI,

I am going to rebase this patch from the current 5.4-RC branch, remove
the AM3517 references, and leave the throttling only applicable to
omap34xx and 36xx (like it is now), and remove the RFC.  Before I do
that, I was hoping for some feedback on whether or not there is a
reason to not do this while acknowledging the thermal sensor isn't
very accurate.

Does anyone have any objections to this?

Other than the omap mailing list, are there other lists that should be CC'd?

adam

> > >>> For boards who never use industrial temperatures, these can be
> > >>> changed on their respective device trees with something like:
> > >>>
> > >>> &cpu_alert0 {
> > >>>      temperature = <85000>; /* millicelsius */
> > >>> };
> > >>>
> > >>> &cpu_crit {
> > >>>      temperature = <90000>; /* millicelsius */
> > >>> };
> > >>>
> > >>> Signed-off-by: Adam Ford <aford173@gmail.com>
> > >>> ---
> > >>> V2:  Change the CPU reference to &cpu instead of &cpu0
> > >>>
> > >>> diff --git a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> > >>> index 235ecfd61e2d..dfbd0cb0b00b 100644
> > >>> --- a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> > >>> +++ b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> > >>> @@ -17,4 +17,25 @@ cpu_thermal: cpu_thermal {
> > >>>
> > >>>                      /* sensor       ID */
> > >>>      thermal-sensors = <&bandgap     0>;
> > >>> +
> > >>> +     cpu_trips: trips {
> > >>> +             cpu_alert0: cpu_alert {
> > >>> +                     temperature = <90000>; /* millicelsius */
> > >>> +                     hysteresis = <2000>; /* millicelsius */
> > >>> +                     type = "passive";
> > >>> +             };
> > >>> +             cpu_crit: cpu_crit {
> > >>> +                     temperature = <105000>; /* millicelsius */
> > >>> +                     hysteresis = <2000>; /* millicelsius */
> > >>> +                     type = "critical";
> > >>> +             };
> > >>> +     };
> > >>> +
> > >>> +     cpu_cooling_maps: cooling-maps {
> > >>> +             map0 {
> > >>> +                     trip = <&cpu_alert0>;
> > >>> +                     cooling-device =
> > >>> +                             <&cpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> > >>> +             };
> > >>> +     };
> > >>> };
> > >>> --
> > >>> 2.17.1
> > >>>
> > >>
> > >> Here is my test log (GTA04A5 with DM3730CBP100).
> > >> "high-load" script is driving the NEON to full power
> > >> and would report calculation errors.
> > >>
> > >> There is no noise visible in the bandgap sensor data
> > >> induced by power supply fluctuations (log shows system
> > >> voltage while charging).
> > >>
> > >
> > > Great data!
> > >
> > >> root@letux:~# ./high-load -n2
> > >> 100% load stress test for 1 cores running ./neon_loop2
> > >> Sat Sep 14 09:05:50 UTC 2019 65° 4111mV 1000MHz
> > >> Sat Sep 14 09:05:50 UTC 2019 67° 4005mV 1000MHz
> > >> Sat Sep 14 09:05:52 UTC 2019 68° 4000mV 1000MHz
> > >> Sat Sep 14 09:05:53 UTC 2019 68° 4000mV 1000MHz
> > >> Sat Sep 14 09:05:55 UTC 2019 72° 3976mV 1000MHz
> > >> Sat Sep 14 09:05:56 UTC 2019 72° 4023mV 1000MHz
> > >> Sat Sep 14 09:05:57 UTC 2019 72° 3900mV 1000MHz
> > >> Sat Sep 14 09:05:59 UTC 2019 73° 4029mV 1000MHz
> > >> Sat Sep 14 09:06:00 UTC 2019 73° 3988mV 1000MHz
> > >> Sat Sep 14 09:06:01 UTC 2019 73° 4005mV 1000MHz
> > >> Sat Sep 14 09:06:03 UTC 2019 73° 4011mV 1000MHz
> > >> Sat Sep 14 09:06:04 UTC 2019 73° 4117mV 1000MHz
> > >> Sat Sep 14 09:06:06 UTC 2019 73° 4005mV 1000MHz
> > >> Sat Sep 14 09:06:07 UTC 2019 75° 3994mV 1000MHz
> > >> Sat Sep 14 09:06:08 UTC 2019 75° 3970mV 1000MHz
> > >> Sat Sep 14 09:06:09 UTC 2019 75° 4046mV 1000MHz
> > >> Sat Sep 14 09:06:11 UTC 2019 75° 4005mV 1000MHz
> > >> Sat Sep 14 09:06:12 UTC 2019 75° 4023mV 1000MHz
> > >> Sat Sep 14 09:06:14 UTC 2019 75° 3970mV 1000MHz
> > >> Sat Sep 14 09:06:15 UTC 2019 75° 4011mV 1000MHz
> > >> Sat Sep 14 09:06:16 UTC 2019 77° 4017mV 1000MHz
> > >> Sat Sep 14 09:06:18 UTC 2019 77° 3994mV 1000MHz
> > >> Sat Sep 14 09:06:19 UTC 2019 77° 3994mV 1000MHz
> > >> Sat Sep 14 09:06:20 UTC 2019 77° 3988mV 1000MHz
> > >> Sat Sep 14 09:06:22 UTC 2019 77° 4023mV 1000MHz
> > >> Sat Sep 14 09:06:23 UTC 2019 77° 4023mV 1000MHz
> > >> Sat Sep 14 09:06:24 UTC 2019 78° 4005mV 1000MHz
> > >> Sat Sep 14 09:06:26 UTC 2019 78° 4105mV 1000MHz
> > >> Sat Sep 14 09:06:27 UTC 2019 78° 4011mV 1000MHz
> > >> Sat Sep 14 09:06:28 UTC 2019 78° 3994mV 1000MHz
> > >> Sat Sep 14 09:06:30 UTC 2019 78° 4123mV 1000MHz
> > >> ...
> > >> Sat Sep 14 09:09:57 UTC 2019 88° 4082mV 1000MHz
> > >> Sat Sep 14 09:09:59 UTC 2019 88° 4164mV 1000MHz
> > >> Sat Sep 14 09:10:00 UTC 2019 88° 4058mV 1000MHz
> > >> Sat Sep 14 09:10:01 UTC 2019 88° 4058mV 1000MHz
> > >> Sat Sep 14 09:10:03 UTC 2019 88° 4082mV 1000MHz
> > >> Sat Sep 14 09:10:04 UTC 2019 88° 4058mV 1000MHz
> > >> Sat Sep 14 09:10:06 UTC 2019 88° 4146mV 1000MHz
> > >> Sat Sep 14 09:10:07 UTC 2019 88° 4041mV 1000MHz
> > >> Sat Sep 14 09:10:08 UTC 2019 88° 4035mV 1000MHz
> > >> Sat Sep 14 09:10:10 UTC 2019 88° 4052mV 1000MHz
> > >> Sat Sep 14 09:10:11 UTC 2019 88° 4087mV 1000MHz
> > >> Sat Sep 14 09:10:12 UTC 2019 88° 4152mV 1000MHz
> > >> Sat Sep 14 09:10:14 UTC 2019 88° 4070mV 1000MHz
> > >> Sat Sep 14 09:10:15 UTC 2019 88° 4064mV 1000MHz
> > >> Sat Sep 14 09:10:17 UTC 2019 88° 4170mV 1000MHz
> > >> Sat Sep 14 09:10:18 UTC 2019 88° 4058mV 1000MHz
> > >> Sat Sep 14 09:10:19 UTC 2019 88° 4187mV 1000MHz
> > >> Sat Sep 14 09:10:21 UTC 2019 88° 4093mV 1000MHz
> > >> Sat Sep 14 09:10:22 UTC 2019 88° 4087mV 1000MHz
> > >> Sat Sep 14 09:10:23 UTC 2019 90° 4070mV 1000MHz
> > >
> > > Should we be a little more conservative?  Without knowing the
> > > accuracy, i believe we do not want to run at 800 or 1GHz at 90C, so if
> > > we made this value 89 instead of 90, we would throttle a little more
> > > conservatively.
> >
> > Well, the OMAP5 also defines exactly 100°C in the device tree.
> >
> > I would assume that the bandgap sensor accuracy is such that it
> > never reports less than the real temperature. So if we
> > throttle at a reported 90°, TJ is likely lower.
> >
> > >> Sat Sep 14 09:10:25 UTC 2019 88° 4123mV 800MHz
> > >> Sat Sep 14 09:10:26 UTC 2019 88° 4064mV 1000MHz
> > >> Sat Sep 14 09:10:28 UTC 2019 90° 4058mV 1000MHz
> > >
> > > Again here, I interpret the data sheet correctly, we're technically out of spec
> >
> > I read the data sheet as if 90°C at OPP1G is still within spec.
> > 91 would be obviously outside (if bandgap sensor is precise).
> >
> > >
> > >> Sat Sep 14 09:10:29 UTC 2019 88° 4076mV 1000MHz
> > >> Sat Sep 14 09:10:30 UTC 2019 88° 4064mV 1000MHz
> > >> Sat Sep 14 09:10:32 UTC 2019 88° 4117mV 1000MHz
> > >> Sat Sep 14 09:10:33 UTC 2019 88° 4105mV 800MHz
> > >> Sat Sep 14 09:10:34 UTC 2019 88° 4070mV 1000MHz
> > >> Sat Sep 14 09:10:36 UTC 2019 88° 4076mV 1000MHz
> > >> Sat Sep 14 09:10:37 UTC 2019 88° 4087mV 1000MHz
> > >> Sat Sep 14 09:10:39 UTC 2019 88° 4017mV 1000MHz
> > >> Sat Sep 14 09:10:40 UTC 2019 88° 4093mV 1000MHz
> > >> Sat Sep 14 09:10:41 UTC 2019 88° 4058mV 800MHz
> > >> Sat Sep 14 09:10:42 UTC 2019 88° 4035mV 1000MHz
> > >> Sat Sep 14 09:10:44 UTC 2019 90° 4058mV 1000MHz
> > >> Sat Sep 14 09:10:45 UTC 2019 88° 4064mV 1000MHz
> > >> Sat Sep 14 09:10:47 UTC 2019 88° 4064mV 1000MHz
> > >> Sat Sep 14 09:10:48 UTC 2019 88° 4029mV 1000MHz
> > >> Sat Sep 14 09:10:50 UTC 2019 90° 4046mV 1000MHz
> > >> ^Ckill 4680
> > >> root@letux:~# cpufreq-info
> > >> cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
> > >> Report errors and bugs to cpufreq@vger.kernel.org, please.
> > >> analyzing CPU 0:
> > >>  driver: cpufreq-dt
> > >>  CPUs which run at the same hardware frequency: 0
> > >>  CPUs which need to have their frequency coordinated by software: 0
> > >>  maximum transition latency: 300 us.
> > >>  hardware limits: 300 MHz - 1000 MHz
> > >>  available frequency steps: 300 MHz, 600 MHz, 800 MHz, 1000 MHz
> > >>  available cpufreq governors: conservative, userspace, powersave, ondemand, performance
> > >>  current policy: frequency should be within 300 MHz and 1000 MHz.
> > >>                  The governor "ondemand" may decide which speed to use
> > >>                  within this range.
> > >>  current CPU frequency is 600 MHz (asserted by call to hardware).
> > >>  cpufreq stats: 300 MHz:22.81%, 600 MHz:2.50%, 800 MHz:2.10%, 1000 MHz:72.59%  (1563)
> > >> root@letux:~#
> > >>
> > >> So OPP is reduced if bandgap sensor reports >= 90°C
> > >> which almost immediately makes the temperature
> > >> go down.
> > >>
> > >> No operational hiccups were observed.
> > >>
> > >> Surface temperature of the PoP chip did rise to
> > >> approx. 53°C during this test.
> > >>
> > >> Tested-by: H. Nikolaus Schaller <hns@goldelico.com> # on GTA04A5 with dm3730cbp100
> > >>
> >
> > BTW: this patch (set) is even independent of my 1GHz OPP patches.
> > Should also work with OPP-v1 definitions so that maintainers can
> > decide which one to apply first.
>
> If I am going to integrate the cooling references into the &cpu node, I'll
> probably base it on your work since the cooling isn't really that
> important until we exceed 800MHz.  If I do it on the current linux
> master or omap for-next branch, it may not apply cleanly.
>
> >
> > It is just more difficult to reach TJ of 90°C without 1GHz.
>
> If it even does at all without external influences.
>
> adam
> >
> > BR,
> > Nikolaus
> >

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family
  2019-10-07 15:11         ` Adam Ford
@ 2019-10-07 15:44           ` H. Nikolaus Schaller
  2019-10-07 17:25             ` Adam Ford
  0 siblings, 1 reply; 12+ messages in thread
From: H. Nikolaus Schaller @ 2019-10-07 15:44 UTC (permalink / raw)
  To: Adam Ford
  Cc: Linux-OMAP, Adam Ford, Nishanth Menon, Benoît Cousson,
	Tony Lindgren, Rob Herring, Mark Rutland, devicetree,
	Linux Kernel Mailing List, Grazvydas Ignotas


> Am 07.10.2019 um 17:11 schrieb Adam Ford <aford173@gmail.com>:
> 
> On Sat, Sep 14, 2019 at 11:12 AM Adam Ford <aford173@gmail.com> wrote:
>> 
>> On Sat, Sep 14, 2019 at 9:38 AM H. Nikolaus Schaller <hns@goldelico.com> wrote:
>>> 
>>> 
>>>> Am 14.09.2019 um 15:42 schrieb Adam Ford <aford173@gmail.com>:
>>>> 
>>>> On Sat, Sep 14, 2019 at 4:20 AM H. Nikolaus Schaller <hns@goldelico.com> wrote:
>>>>> 
>>>>> 
>>>>>> Am 13.09.2019 um 17:37 schrieb Adam Ford <aford173@gmail.com>:
>>>>>> 
>>>>>> The OMAP3530, AM3517 and DM3730 all show thresholds of 90C and 105C
>>>>>> depending on commercial or industrial temperature ratings.  This
>>>>>> patch expands the thermal information to the limits of 90 and 105
>>>>>> for alert and critical.
>>>>>> 
> 
> Tom / anyone from TI,
> 
> I am going to rebase this patch from the current 5.4-RC branch, remove
> the AM3517 references, and leave the throttling only applicable to
> omap34xx and 36xx (like it is now), and remove the RFC.  Before I do
> that, I was hoping for some feedback on whether or not there is a
> reason to not do this while acknowledging the thermal sensor isn't
> very accurate.

I wonder if there is a more precise definition of what "isn't very accurate"
means?

Is it just because the TI_BANDGAP_FEATURE_UNRELIABLE bit is set in
the driver and we assume that it is right?

Of course the "junction temperature" (TJ) is not well defined (at which
edge? in which area?) and the bandgap sensor can only report a single point
of the die. So e.g. the GPU or the NEON unit may be hotter or cooler.

And, the bandgap sensor + ADC is unlikely to be well calibrated to
0.1°C precision.

But in my experiments there does not seem to be much noise, and values rise
or fall monotonically, as expected from the processor load.

So a report of 90°C may not be exactly 90°C and some parts of the SoC
may be hotter.

I would also assume that the TJ limit of 90°C has some safety margin,
but there seems to be no information on this in the data sheet.

So, IMHO an "unreliable" bandgap sensor is better than no sensor and
no trips / cooling maps.

One more thing concerns the omap3 bandgap sensor (or its driver?). It appears
to report the value of the previous measurement. So unless it is regularly
polled (as cpufreq seems to do), it will report outdated values. The first
read hours after boot may report the value measured during probe while booting.

This is of course also a source of inaccuracy. But I haven't
investigated it (it can only be tested if thermal management is turned
off) because I think it has no practical influence if cpufreq is polling.
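
If nothing is polling, one way to work around a stale first value from
user space might be to read the sensor twice and discard the first
result. A minimal sketch, assuming the zone is exposed through the usual
thermal sysfs "temp" file; the zone index 0, the function name and the
1 second settle delay are assumptions for illustration, and whether a
second read is enough on omap3 is exactly the part that has not been
investigated.

```python
import time


def read_fresh_millicelsius(path="/sys/class/thermal/thermal_zone0/temp",
                            settle_s=1.0):
    """Read the zone temperature twice and return the second value.

    The first read is discarded because it may report the previous
    conversion; the delay gives the sensor time to finish a new one.
    """
    def read_once():
        with open(path) as f:
            return int(f.read().strip())

    read_once()           # possibly stale, ignore
    time.sleep(settle_s)  # assumed settle time, tune for the hardware
    return read_once()
```

The path is a parameter so that the same function can be pointed at a
different zone, or at an ordinary file when trying it out off-target.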

> 
> Does anyone have any objections to this?
> 
> Other than the omap mailing list, are there other lists that should be CC'd?
> 
> adam
> 
>>>>>> For boards who never use industrial temperatures, these can be
>>>>>> changed on their respective device trees with something like:
>>>>>> 
>>>>>> &cpu_alert0 {
>>>>>>     temperature = <85000>; /* millicelsius */
>>>>>> };
>>>>>> 
>>>>>> &cpu_crit {
>>>>>>     temperature = <90000>; /* millicelsius */
>>>>>> };
>>>>>> 
>>>>>> Signed-off-by: Adam Ford <aford173@gmail.com>
>>>>>> ---
>>>>>> V2:  Change the CPU reference to &cpu instead of &cpu0
>>>>>> 
>>>>>> diff --git a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
>>>>>> index 235ecfd61e2d..dfbd0cb0b00b 100644
>>>>>> --- a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
>>>>>> +++ b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
>>>>>> @@ -17,4 +17,25 @@ cpu_thermal: cpu_thermal {
>>>>>> 
>>>>>>                     /* sensor       ID */
>>>>>>     thermal-sensors = <&bandgap     0>;
>>>>>> +
>>>>>> +     cpu_trips: trips {
>>>>>> +             cpu_alert0: cpu_alert {
>>>>>> +                     temperature = <90000>; /* millicelsius */
>>>>>> +                     hysteresis = <2000>; /* millicelsius */
>>>>>> +                     type = "passive";
>>>>>> +             };
>>>>>> +             cpu_crit: cpu_crit {
>>>>>> +                     temperature = <105000>; /* millicelsius */
>>>>>> +                     hysteresis = <2000>; /* millicelsius */
>>>>>> +                     type = "critical";
>>>>>> +             };
>>>>>> +     };
>>>>>> +
>>>>>> +     cpu_cooling_maps: cooling-maps {
>>>>>> +             map0 {
>>>>>> +                     trip = <&cpu_alert0>;
>>>>>> +                     cooling-device =
>>>>>> +                             <&cpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
>>>>>> +             };
>>>>>> +     };
>>>>>> };
>>>>>> --
>>>>>> 2.17.1
>>>>>> 
>>>>> 
>>>>> Here is my test log (GTA04A5 with DM3730CBP100).
>>>>> "high-load" script is driving the NEON to full power
>>>>> and would report calculation errors.
>>>>> 
>>>>> There is no noise visible in the bandgap sensor data
>>>>> induced by power supply fluctuations (log shows system
>>>>> voltage while charging).
>>>>> 
>>>> 
>>>> Great data!
>>>> 
>>>>> root@letux:~# ./high-load -n2
>>>>> 100% load stress test for 1 cores running ./neon_loop2
>>>>> Sat Sep 14 09:05:50 UTC 2019 65° 4111mV 1000MHz
>>>>> Sat Sep 14 09:05:50 UTC 2019 67° 4005mV 1000MHz
>>>>> Sat Sep 14 09:05:52 UTC 2019 68° 4000mV 1000MHz
>>>>> Sat Sep 14 09:05:53 UTC 2019 68° 4000mV 1000MHz
>>>>> Sat Sep 14 09:05:55 UTC 2019 72° 3976mV 1000MHz
>>>>> Sat Sep 14 09:05:56 UTC 2019 72° 4023mV 1000MHz
>>>>> Sat Sep 14 09:05:57 UTC 2019 72° 3900mV 1000MHz
>>>>> Sat Sep 14 09:05:59 UTC 2019 73° 4029mV 1000MHz
>>>>> Sat Sep 14 09:06:00 UTC 2019 73° 3988mV 1000MHz
>>>>> Sat Sep 14 09:06:01 UTC 2019 73° 4005mV 1000MHz
>>>>> Sat Sep 14 09:06:03 UTC 2019 73° 4011mV 1000MHz
>>>>> Sat Sep 14 09:06:04 UTC 2019 73° 4117mV 1000MHz
>>>>> Sat Sep 14 09:06:06 UTC 2019 73° 4005mV 1000MHz
>>>>> Sat Sep 14 09:06:07 UTC 2019 75° 3994mV 1000MHz
>>>>> Sat Sep 14 09:06:08 UTC 2019 75° 3970mV 1000MHz
>>>>> Sat Sep 14 09:06:09 UTC 2019 75° 4046mV 1000MHz
>>>>> Sat Sep 14 09:06:11 UTC 2019 75° 4005mV 1000MHz
>>>>> Sat Sep 14 09:06:12 UTC 2019 75° 4023mV 1000MHz
>>>>> Sat Sep 14 09:06:14 UTC 2019 75° 3970mV 1000MHz
>>>>> Sat Sep 14 09:06:15 UTC 2019 75° 4011mV 1000MHz
>>>>> Sat Sep 14 09:06:16 UTC 2019 77° 4017mV 1000MHz
>>>>> Sat Sep 14 09:06:18 UTC 2019 77° 3994mV 1000MHz
>>>>> Sat Sep 14 09:06:19 UTC 2019 77° 3994mV 1000MHz
>>>>> Sat Sep 14 09:06:20 UTC 2019 77° 3988mV 1000MHz
>>>>> Sat Sep 14 09:06:22 UTC 2019 77° 4023mV 1000MHz
>>>>> Sat Sep 14 09:06:23 UTC 2019 77° 4023mV 1000MHz
>>>>> Sat Sep 14 09:06:24 UTC 2019 78° 4005mV 1000MHz
>>>>> Sat Sep 14 09:06:26 UTC 2019 78° 4105mV 1000MHz
>>>>> Sat Sep 14 09:06:27 UTC 2019 78° 4011mV 1000MHz
>>>>> Sat Sep 14 09:06:28 UTC 2019 78° 3994mV 1000MHz
>>>>> Sat Sep 14 09:06:30 UTC 2019 78° 4123mV 1000MHz
>>>>> ...
>>>>> Sat Sep 14 09:09:57 UTC 2019 88° 4082mV 1000MHz
>>>>> Sat Sep 14 09:09:59 UTC 2019 88° 4164mV 1000MHz
>>>>> Sat Sep 14 09:10:00 UTC 2019 88° 4058mV 1000MHz
>>>>> Sat Sep 14 09:10:01 UTC 2019 88° 4058mV 1000MHz
>>>>> Sat Sep 14 09:10:03 UTC 2019 88° 4082mV 1000MHz
>>>>> Sat Sep 14 09:10:04 UTC 2019 88° 4058mV 1000MHz
>>>>> Sat Sep 14 09:10:06 UTC 2019 88° 4146mV 1000MHz
>>>>> Sat Sep 14 09:10:07 UTC 2019 88° 4041mV 1000MHz
>>>>> Sat Sep 14 09:10:08 UTC 2019 88° 4035mV 1000MHz
>>>>> Sat Sep 14 09:10:10 UTC 2019 88° 4052mV 1000MHz
>>>>> Sat Sep 14 09:10:11 UTC 2019 88° 4087mV 1000MHz
>>>>> Sat Sep 14 09:10:12 UTC 2019 88° 4152mV 1000MHz
>>>>> Sat Sep 14 09:10:14 UTC 2019 88° 4070mV 1000MHz
>>>>> Sat Sep 14 09:10:15 UTC 2019 88° 4064mV 1000MHz
>>>>> Sat Sep 14 09:10:17 UTC 2019 88° 4170mV 1000MHz
>>>>> Sat Sep 14 09:10:18 UTC 2019 88° 4058mV 1000MHz
>>>>> Sat Sep 14 09:10:19 UTC 2019 88° 4187mV 1000MHz
>>>>> Sat Sep 14 09:10:21 UTC 2019 88° 4093mV 1000MHz
>>>>> Sat Sep 14 09:10:22 UTC 2019 88° 4087mV 1000MHz
>>>>> Sat Sep 14 09:10:23 UTC 2019 90° 4070mV 1000MHz
>>>> 
>>>> Should we be a little more conservative?  Without knowing the
>>>> accuracy, i believe we do not want to run at 800 or 1GHz at 90C, so if
>>>> we made this value 89 instead of 90, we would throttle a little more
>>>> conservatively.
>>> 
>>> Well, the OMAP5 also defines exactly 100°C in the device tree.
>>> 
>>> I would assume that the bandgap sensor accuracy is such that it
>>> never reports less than the real temperature. So if we
>>> throttle at a reported 90°, TJ is likely lower.
>>> 
>>>>> Sat Sep 14 09:10:25 UTC 2019 88° 4123mV 800MHz
>>>>> Sat Sep 14 09:10:26 UTC 2019 88° 4064mV 1000MHz
>>>>> Sat Sep 14 09:10:28 UTC 2019 90° 4058mV 1000MHz
>>>> 
>>>> Again here, I interpret the data sheet correctly, we're technically out of spec
>>> 
>>> I read the data sheet as if 90°C at OPP1G is still within spec.
>>> 91 would be obviously outside (if bandgap sensor is precise).
>>> 
>>>> 
>>>>> Sat Sep 14 09:10:29 UTC 2019 88° 4076mV 1000MHz
>>>>> Sat Sep 14 09:10:30 UTC 2019 88° 4064mV 1000MHz
>>>>> Sat Sep 14 09:10:32 UTC 2019 88° 4117mV 1000MHz
>>>>> Sat Sep 14 09:10:33 UTC 2019 88° 4105mV 800MHz
>>>>> Sat Sep 14 09:10:34 UTC 2019 88° 4070mV 1000MHz
>>>>> Sat Sep 14 09:10:36 UTC 2019 88° 4076mV 1000MHz
>>>>> Sat Sep 14 09:10:37 UTC 2019 88° 4087mV 1000MHz
>>>>> Sat Sep 14 09:10:39 UTC 2019 88° 4017mV 1000MHz
>>>>> Sat Sep 14 09:10:40 UTC 2019 88° 4093mV 1000MHz
>>>>> Sat Sep 14 09:10:41 UTC 2019 88° 4058mV 800MHz
>>>>> Sat Sep 14 09:10:42 UTC 2019 88° 4035mV 1000MHz
>>>>> Sat Sep 14 09:10:44 UTC 2019 90° 4058mV 1000MHz
>>>>> Sat Sep 14 09:10:45 UTC 2019 88° 4064mV 1000MHz
>>>>> Sat Sep 14 09:10:47 UTC 2019 88° 4064mV 1000MHz
>>>>> Sat Sep 14 09:10:48 UTC 2019 88° 4029mV 1000MHz
>>>>> Sat Sep 14 09:10:50 UTC 2019 90° 4046mV 1000MHz
>>>>> ^Ckill 4680
>>>>> root@letux:~# cpufreq-info
>>>>> cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
>>>>> Report errors and bugs to cpufreq@vger.kernel.org, please.
>>>>> analyzing CPU 0:
>>>>> driver: cpufreq-dt
>>>>> CPUs which run at the same hardware frequency: 0
>>>>> CPUs which need to have their frequency coordinated by software: 0
>>>>> maximum transition latency: 300 us.
>>>>> hardware limits: 300 MHz - 1000 MHz
>>>>> available frequency steps: 300 MHz, 600 MHz, 800 MHz, 1000 MHz
>>>>> available cpufreq governors: conservative, userspace, powersave, ondemand, performance
>>>>> current policy: frequency should be within 300 MHz and 1000 MHz.
>>>>>                 The governor "ondemand" may decide which speed to use
>>>>>                 within this range.
>>>>> current CPU frequency is 600 MHz (asserted by call to hardware).
>>>>> cpufreq stats: 300 MHz:22.81%, 600 MHz:2.50%, 800 MHz:2.10%, 1000 MHz:72.59%  (1563)
>>>>> root@letux:~#
>>>>> 
>>>>> So OPP is reduced if bandgap sensor reports >= 90°C
>>>>> which almost immediately makes the temperature
>>>>> go down.
>>>>> 
>>>>> No operational hiccups were observed.
>>>>> 
>>>>> Surface temperature of the PoP chip did rise to
>>>>> approx. 53°C during this test.
>>>>> 
>>>>> Tested-by: H. Nikolaus Schaller <hns@goldelico.com> # on GTA04A5 with dm3730cbp100
>>>>> 
>>> 
>>> BTW: this patch (set) is even independent of my 1GHz OPP patches.
>>> Should also work with OPP-v1 definitions so that maintainers can
>>> decide which one to apply first.
>> 
>> If I am going to integrate the cooling references into &cpu node, I'll
>> probably base it on your work since the cooling isn't really that
>> important until we exceed 800MHz.  If I do it on the current linux
>> master or omap for-next branch, it may not apply cleanly.
>> 
>>> 
>>> It is just more difficult to reach TJ of 90°C without 1GHz.
>> 
>> If it even does at all without external influences.
>> 
>> adam
>>> 
>>> BR,
>>> Nikolaus
>>> 



* Re: [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family
  2019-10-07 15:44           ` H. Nikolaus Schaller
@ 2019-10-07 17:25             ` Adam Ford
  2019-10-30  8:39               ` H. Nikolaus Schaller
  0 siblings, 1 reply; 12+ messages in thread
From: Adam Ford @ 2019-10-07 17:25 UTC (permalink / raw)
  To: H. Nikolaus Schaller
  Cc: Linux-OMAP, Adam Ford, Nishanth Menon, Benoît Cousson,
	Tony Lindgren, Rob Herring, Mark Rutland, devicetree,
	Linux Kernel Mailing List, Grazvydas Ignotas

On Mon, Oct 7, 2019 at 10:45 AM H. Nikolaus Schaller <hns@goldelico.com> wrote:
>
>
> > On 07.10.2019 at 17:11, Adam Ford <aford173@gmail.com> wrote:
> >
> > On Sat, Sep 14, 2019 at 11:12 AM Adam Ford <aford173@gmail.com> wrote:
> >>
> >> On Sat, Sep 14, 2019 at 9:38 AM H. Nikolaus Schaller <hns@goldelico.com> wrote:
> >>>
> >>>
> >>>> On 14.09.2019 at 15:42, Adam Ford <aford173@gmail.com> wrote:
> >>>>
> >>>> On Sat, Sep 14, 2019 at 4:20 AM H. Nikolaus Schaller <hns@goldelico.com> wrote:
> >>>>>
> >>>>>
> >>>>>> On 13.09.2019 at 17:37, Adam Ford <aford173@gmail.com> wrote:
> >>>>>>
> >>>>>> The OMAP3530, AM3517 and DM3730 all show thresholds of 90C and 105C
> >>>>>> depending on commercial or industrial temperature ratings.  This
> >>>>>> patch expands the thermal information to the limits of 90 and 105
> >>>>>> for alert and critical.
> >>>>>>
> >
> > Tom / anyone from TI,
> >
> > I am going to rebase this patch from the current 5.4-RC branch, remove
> > the AM3517 references, and leave the throttling only applicable to
> > omap34xx and 36xx (like it is now), and remove the RFC.  Before I do
> > that, I was hoping for some feedback on whether or not there is a
> > reason to not do this while acknowledging the thermal sensor isn't
> > very accurate.
>
> I wonder if there is a more precise definition of what "isn't very accurate"
> means?

That's what I was trying to get by asking TI for feedback.

>
> Is it just because the TI_BANDGAP_FEATURE_UNRELIABLE bit is set in
> the driver and we assume that it is right?

The bandgap sensor is disabled by default and, when enabled, it throws
a comment saying 'You've been warned' so I mostly want to acknowledge
that in the patch.

>
> Of course the "junction temperature" (TJ) is not well defined (at which
> edge? in which area?) and the bandgap sensor can only report a single point
> of the die. So e.g. the GPU or the NEON unit may be hotter or cooler.

I look forward to running the GPU again.  ;-)

>
> And, the bandgap sensor + ADC is unlikely to be well calibrated to
> 0.1°C precision.
>
> But in my experiments there seems to be little noise, and values rise
> or fall monotonically, as expected from the processor load.
>
> So a report of 90°C may not be exactly 90°C and some parts of the SoC
> may be hotter.
>
> I would also assume that the TJ limits of 90°C have some safety margin
> but there seems to be no information in the data sheet.
>
> So, IMHO an "unreliable" bandgap sensor is better than no sensor and
> no trips / cooling maps.

I completely agree.

>
> One more thing about the omap3 bandgap sensor (driver?): it appears to
> report the value of the previous measurement. So unless it is regularly
> polled (like cpufreq seems to do) it will report outdated values. The
> first read hours after boot may report the value during probe while booting.
>
> This is also a source of inaccuracy of course. But I haven't
> investigated this (can only be tested if thermal management is turned
> off) because I think it has no practical influence if cpufreq is polling.
>
> >
> > Does anyone have any objections to this?
> >
> > Other than the omap mailing list, are there other lists that should be CC'd?
> >
> > adam
> >
> >>>>>> For boards who never use industrial temperatures, these can be
> >>>>>> changed on their respective device trees with something like:
> >>>>>>
> >>>>>> &cpu_alert0 {
> >>>>>>     temperature = <85000>; /* millicelsius */
> >>>>>> };
> >>>>>>
> >>>>>> &cpu_crit {
> >>>>>>     temperature = <90000>; /* millicelsius */
> >>>>>> };
> >>>>>>
> >>>>>> Signed-off-by: Adam Ford <aford173@gmail.com>
> >>>>>> ---
> >>>>>> V2:  Change the CPU reference to &cpu instead of &cpu0
> >>>>>>
> >>>>>> diff --git a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> >>>>>> index 235ecfd61e2d..dfbd0cb0b00b 100644
> >>>>>> --- a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> >>>>>> +++ b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi
> >>>>>> @@ -17,4 +17,25 @@ cpu_thermal: cpu_thermal {
> >>>>>>
> >>>>>>                     /* sensor       ID */
> >>>>>>     thermal-sensors = <&bandgap     0>;
> >>>>>> +
> >>>>>> +     cpu_trips: trips {
> >>>>>> +             cpu_alert0: cpu_alert {
> >>>>>> +                     temperature = <90000>; /* millicelsius */
> >>>>>> +                     hysteresis = <2000>; /* millicelsius */
> >>>>>> +                     type = "passive";
> >>>>>> +             };
> >>>>>> +             cpu_crit: cpu_crit {
> >>>>>> +                     temperature = <105000>; /* millicelsius */
> >>>>>> +                     hysteresis = <2000>; /* millicelsius */
> >>>>>> +                     type = "critical";
> >>>>>> +             };
> >>>>>> +     };
> >>>>>> +
> >>>>>> +     cpu_cooling_maps: cooling-maps {
> >>>>>> +             map0 {
> >>>>>> +                     trip = <&cpu_alert0>;
> >>>>>> +                     cooling-device =
> >>>>>> +                             <&cpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> >>>>>> +             };
> >>>>>> +     };
> >>>>>> };
> >>>>>> --
> >>>>>> 2.17.1
> >>>>>>
> >>>>>
> >>>>> Here is my test log (GTA04A5 with DM3730CBP100).
> >>>>> "high-load" script is driving the NEON to full power
> >>>>> and would report calculation errors.
> >>>>>
> >>>>> There is no noise visible in the bandgap sensor data
> >>>>> induced by power supply fluctuations (log shows system
> >>>>> voltage while charging).
> >>>>>
> >>>>
> >>>> Great data!
> >>>>
> >>>>> root@letux:~# ./high-load -n2
> >>>>> 100% load stress test for 1 cores running ./neon_loop2
> >>>>> Sat Sep 14 09:05:50 UTC 2019 65° 4111mV 1000MHz
> >>>>> Sat Sep 14 09:05:50 UTC 2019 67° 4005mV 1000MHz
> >>>>> Sat Sep 14 09:05:52 UTC 2019 68° 4000mV 1000MHz
> >>>>> Sat Sep 14 09:05:53 UTC 2019 68° 4000mV 1000MHz
> >>>>> Sat Sep 14 09:05:55 UTC 2019 72° 3976mV 1000MHz
> >>>>> Sat Sep 14 09:05:56 UTC 2019 72° 4023mV 1000MHz
> >>>>> Sat Sep 14 09:05:57 UTC 2019 72° 3900mV 1000MHz
> >>>>> Sat Sep 14 09:05:59 UTC 2019 73° 4029mV 1000MHz
> >>>>> Sat Sep 14 09:06:00 UTC 2019 73° 3988mV 1000MHz
> >>>>> Sat Sep 14 09:06:01 UTC 2019 73° 4005mV 1000MHz
> >>>>> Sat Sep 14 09:06:03 UTC 2019 73° 4011mV 1000MHz
> >>>>> Sat Sep 14 09:06:04 UTC 2019 73° 4117mV 1000MHz
> >>>>> Sat Sep 14 09:06:06 UTC 2019 73° 4005mV 1000MHz
> >>>>> Sat Sep 14 09:06:07 UTC 2019 75° 3994mV 1000MHz
> >>>>> Sat Sep 14 09:06:08 UTC 2019 75° 3970mV 1000MHz
> >>>>> Sat Sep 14 09:06:09 UTC 2019 75° 4046mV 1000MHz
> >>>>> Sat Sep 14 09:06:11 UTC 2019 75° 4005mV 1000MHz
> >>>>> Sat Sep 14 09:06:12 UTC 2019 75° 4023mV 1000MHz
> >>>>> Sat Sep 14 09:06:14 UTC 2019 75° 3970mV 1000MHz
> >>>>> Sat Sep 14 09:06:15 UTC 2019 75° 4011mV 1000MHz
> >>>>> Sat Sep 14 09:06:16 UTC 2019 77° 4017mV 1000MHz
> >>>>> Sat Sep 14 09:06:18 UTC 2019 77° 3994mV 1000MHz
> >>>>> Sat Sep 14 09:06:19 UTC 2019 77° 3994mV 1000MHz
> >>>>> Sat Sep 14 09:06:20 UTC 2019 77° 3988mV 1000MHz
> >>>>> Sat Sep 14 09:06:22 UTC 2019 77° 4023mV 1000MHz
> >>>>> Sat Sep 14 09:06:23 UTC 2019 77° 4023mV 1000MHz
> >>>>> Sat Sep 14 09:06:24 UTC 2019 78° 4005mV 1000MHz
> >>>>> Sat Sep 14 09:06:26 UTC 2019 78° 4105mV 1000MHz
> >>>>> Sat Sep 14 09:06:27 UTC 2019 78° 4011mV 1000MHz
> >>>>> Sat Sep 14 09:06:28 UTC 2019 78° 3994mV 1000MHz
> >>>>> Sat Sep 14 09:06:30 UTC 2019 78° 4123mV 1000MHz
> >>>>> ...
> >>>>> Sat Sep 14 09:09:57 UTC 2019 88° 4082mV 1000MHz
> >>>>> Sat Sep 14 09:09:59 UTC 2019 88° 4164mV 1000MHz
> >>>>> Sat Sep 14 09:10:00 UTC 2019 88° 4058mV 1000MHz
> >>>>> Sat Sep 14 09:10:01 UTC 2019 88° 4058mV 1000MHz
> >>>>> Sat Sep 14 09:10:03 UTC 2019 88° 4082mV 1000MHz
> >>>>> Sat Sep 14 09:10:04 UTC 2019 88° 4058mV 1000MHz
> >>>>> Sat Sep 14 09:10:06 UTC 2019 88° 4146mV 1000MHz
> >>>>> Sat Sep 14 09:10:07 UTC 2019 88° 4041mV 1000MHz
> >>>>> Sat Sep 14 09:10:08 UTC 2019 88° 4035mV 1000MHz
> >>>>> Sat Sep 14 09:10:10 UTC 2019 88° 4052mV 1000MHz
> >>>>> Sat Sep 14 09:10:11 UTC 2019 88° 4087mV 1000MHz
> >>>>> Sat Sep 14 09:10:12 UTC 2019 88° 4152mV 1000MHz
> >>>>> Sat Sep 14 09:10:14 UTC 2019 88° 4070mV 1000MHz
> >>>>> Sat Sep 14 09:10:15 UTC 2019 88° 4064mV 1000MHz
> >>>>> Sat Sep 14 09:10:17 UTC 2019 88° 4170mV 1000MHz
> >>>>> Sat Sep 14 09:10:18 UTC 2019 88° 4058mV 1000MHz
> >>>>> Sat Sep 14 09:10:19 UTC 2019 88° 4187mV 1000MHz
> >>>>> Sat Sep 14 09:10:21 UTC 2019 88° 4093mV 1000MHz
> >>>>> Sat Sep 14 09:10:22 UTC 2019 88° 4087mV 1000MHz
> >>>>> Sat Sep 14 09:10:23 UTC 2019 90° 4070mV 1000MHz
> >>>>
> >>>> Should we be a little more conservative?  Without knowing the
> >>>> accuracy, I believe we do not want to run at 800MHz or 1GHz at 90C, so if
> >>>> we made this value 89 instead of 90, we would throttle a little more
> >>>> conservatively.
> >>>
> >>> Well, the OMAP5 also defines exactly 100°C in the device tree.
> >>>
> >>> I would assume that the bandgap sensor accuracy is such that it
> >>> never reports less than the real temperature. So if we
> >>> throttle at reported 90°, TJ is likely lower.
> >>>
> >>>>> Sat Sep 14 09:10:25 UTC 2019 88° 4123mV 800MHz
> >>>>> Sat Sep 14 09:10:26 UTC 2019 88° 4064mV 1000MHz
> >>>>> Sat Sep 14 09:10:28 UTC 2019 90° 4058mV 1000MHz
> >>>>
> >>>> Again here, if I interpret the data sheet correctly, we're technically out of spec
> >>>
> >>> I read the data sheet as saying that 90°C at OPP1G is still within spec.
> >>> 91 would be obviously outside (if bandgap sensor is precise).
> >>>
> >>>>
> >>>>> Sat Sep 14 09:10:29 UTC 2019 88° 4076mV 1000MHz
> >>>>> Sat Sep 14 09:10:30 UTC 2019 88° 4064mV 1000MHz
> >>>>> Sat Sep 14 09:10:32 UTC 2019 88° 4117mV 1000MHz
> >>>>> Sat Sep 14 09:10:33 UTC 2019 88° 4105mV 800MHz
> >>>>> Sat Sep 14 09:10:34 UTC 2019 88° 4070mV 1000MHz
> >>>>> Sat Sep 14 09:10:36 UTC 2019 88° 4076mV 1000MHz
> >>>>> Sat Sep 14 09:10:37 UTC 2019 88° 4087mV 1000MHz
> >>>>> Sat Sep 14 09:10:39 UTC 2019 88° 4017mV 1000MHz
> >>>>> Sat Sep 14 09:10:40 UTC 2019 88° 4093mV 1000MHz
> >>>>> Sat Sep 14 09:10:41 UTC 2019 88° 4058mV 800MHz
> >>>>> Sat Sep 14 09:10:42 UTC 2019 88° 4035mV 1000MHz
> >>>>> Sat Sep 14 09:10:44 UTC 2019 90° 4058mV 1000MHz
> >>>>> Sat Sep 14 09:10:45 UTC 2019 88° 4064mV 1000MHz
> >>>>> Sat Sep 14 09:10:47 UTC 2019 88° 4064mV 1000MHz
> >>>>> Sat Sep 14 09:10:48 UTC 2019 88° 4029mV 1000MHz
> >>>>> Sat Sep 14 09:10:50 UTC 2019 90° 4046mV 1000MHz
> >>>>> ^Ckill 4680
> >>>>> root@letux:~# cpufreq-info
> >>>>> cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
> >>>>> Report errors and bugs to cpufreq@vger.kernel.org, please.
> >>>>> analyzing CPU 0:
> >>>>> driver: cpufreq-dt
> >>>>> CPUs which run at the same hardware frequency: 0
> >>>>> CPUs which need to have their frequency coordinated by software: 0
> >>>>> maximum transition latency: 300 us.
> >>>>> hardware limits: 300 MHz - 1000 MHz
> >>>>> available frequency steps: 300 MHz, 600 MHz, 800 MHz, 1000 MHz
> >>>>> available cpufreq governors: conservative, userspace, powersave, ondemand, performance
> >>>>> current policy: frequency should be within 300 MHz and 1000 MHz.
> >>>>>                 The governor "ondemand" may decide which speed to use
> >>>>>                 within this range.
> >>>>> current CPU frequency is 600 MHz (asserted by call to hardware).
> >>>>> cpufreq stats: 300 MHz:22.81%, 600 MHz:2.50%, 800 MHz:2.10%, 1000 MHz:72.59%  (1563)
> >>>>> root@letux:~#
> >>>>>
> >>>>> So OPP is reduced if bandgap sensor reports >= 90°C
> >>>>> which almost immediately makes the temperature
> >>>>> go down.
> >>>>>
> >>>>> No operational hiccups were observed.
> >>>>>
> >>>>> Surface temperature of the PoP chip did rise to
> >>>>> approx. 53°C during this test.
> >>>>>
> >>>>> Tested-by: H. Nikolaus Schaller <hns@goldelico.com> # on GTA04A5 with dm3730cbp100
> >>>>>
> >>>
> >>> BTW: this patch (set) is even independent of my 1GHz OPP patches.
> >>> Should also work with OPP-v1 definitions so that maintainers can
> >>> decide which one to apply first.
> >>
> >> If I am going to integrate the cooling references into &cpu node, I'll
> >> probably base it on your work since the cooling isn't really that
> >> important until we exceed 800MHz.  If I do it on the current linux
> >> master or omap for-next branch, it may not apply cleanly.
> >>
> >>>
> >>> It is just more difficult to reach TJ of 90°C without 1GHz.
> >>
> >> If it even does at all without external influences.
> >>
> >> adam
> >>>
> >>> BR,
> >>> Nikolaus
> >>>
>


* Re: [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family
  2019-10-07 17:25             ` Adam Ford
@ 2019-10-30  8:39               ` H. Nikolaus Schaller
  2019-10-30 12:00                 ` Adam Ford
  0 siblings, 1 reply; 12+ messages in thread
From: H. Nikolaus Schaller @ 2019-10-30  8:39 UTC (permalink / raw)
  To: Adam Ford
  Cc: Linux-OMAP, Adam Ford, Nishanth Menon, Benoît Cousson,
	Tony Lindgren, Rob Herring, Mark Rutland, devicetree,
	Linux Kernel Mailing List, Grazvydas Ignotas

Hi Adam,
what is the status of this RFC/PATCH?

BR and thanks,
Nikolaus



* Re: [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family
  2019-10-30  8:39               ` H. Nikolaus Schaller
@ 2019-10-30 12:00                 ` Adam Ford
  0 siblings, 0 replies; 12+ messages in thread
From: Adam Ford @ 2019-10-30 12:00 UTC (permalink / raw)
  To: H. Nikolaus Schaller
  Cc: Linux-OMAP, Adam Ford, Nishanth Menon, Benoît Cousson,
	Tony Lindgren, Rob Herring, Mark Rutland, devicetree,
	Linux Kernel Mailing List, Grazvydas Ignotas

On Wed, Oct 30, 2019 at 3:40 AM H. Nikolaus Schaller <hns@goldelico.com> wrote:
>
> Hi Adam,
> what is the status of this RFC/PATCH?

I've submitted a formal 2-part patch [1] and [2], but Tony is
concerned about power consumption.  As of right now, I don't have
cycles to work on it.  My employer is about to release two new SOMs,
and I'm writing up some documentation on some of the older ones to
help some of the developers working on the new ones make their job go
quicker.

[1] - https://patchwork.kernel.org/patch/11178561/
[2] - https://patchwork.kernel.org/patch/11178563/

I asked whether we could apply them as-is with 'status=disabled' for
now until the bus is fixed, but I think there was some push-back.
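
A sketch of what shipping the change disabled by default could look like. Which node would carry the property is not stated in this thread; placing it on the cpu_thermal zone is an assumption made here for illustration:

```dts
/* Common omap3 include (assumed placement): keep the thermal zone off by default. */
&cpu_thermal {
	status = "disabled";
};

/* Board .dts that wants throttling: opt in explicitly, overriding the default above. */
&cpu_thermal {
	status = "okay";
};
```

The two fragments would live in different files; they are shown together only to make the default-off, opt-in pattern visible.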

I'm trying to get to it.  I am hoping to find a little time this weekend.

adam
>
> BR and thanks,
> Nikolaus
>



Thread overview: 12+ messages
2019-09-13 15:37 [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family Adam Ford
2019-09-13 15:37 ` [RFC v2 2/2] ARM: omap3: Consolidate thermal references to common omap3 Adam Ford
     [not found]   ` <40FEEAC9-8F19-466F-83C3-C8F0142D44B7@goldelico.com>
2019-09-14 13:47     ` Adam Ford
2019-09-14  9:20 ` [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family H. Nikolaus Schaller
2019-09-14 13:42   ` Adam Ford
2019-09-14 14:38     ` H. Nikolaus Schaller
2019-09-14 16:12       ` Adam Ford
2019-10-07 15:11         ` Adam Ford
2019-10-07 15:44           ` H. Nikolaus Schaller
2019-10-07 17:25             ` Adam Ford
2019-10-30  8:39               ` H. Nikolaus Schaller
2019-10-30 12:00                 ` Adam Ford
