* [PATCH] imx_thermal: Fix temperature retrieval after overheat
From: Nicolas Cavallari @ 2022-02-07 16:18 UTC
  To: Rafael J. Wysocki, Daniel Lezcano, Shawn Guo, Sascha Hauer
  Cc: Amit Kucheria, Zhang Rui, Pengutronix Kernel Team, Fabio Estevam,
	NXP Linux Team, Andrzej Pietrasiewicz, linux-pm, linux-kernel

Once the CPU temperature has crossed the passive trip point, reading
the temperature fails forever with EAGAIN.  Fortunately, the thermal
core continues to assume that the system is overheating, so it puts
all passive cooling devices to the max.  Unfortunately, it does this
forever, even if the temperature returns to normal.

This can be easily tested by setting a very low trip point and crossing
it with while(1) loops.
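
As a rough illustration of such a load generator, here is a minimal
userspace sketch. The helper name busy_spin() is made up for this
example, and it bounds the loop by wall-clock time rather than spinning
forever. Lowering the trip point itself is assumed to be done separately,
for example through the zone's trip_point_*_temp sysfs attribute where
the kernel makes it writable.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/*
 * Hypothetical helper: keep one CPU busy for roughly `seconds` of
 * wall-clock time and return how many loop iterations were executed.
 */
static unsigned long busy_spin(double seconds)
{
	struct timespec start, now;
	volatile unsigned long iterations = 0;

	clock_gettime(CLOCK_MONOTONIC, &start);
	do {
		iterations++;
		clock_gettime(CLOCK_MONOTONIC, &now);
	} while ((double)(now.tv_sec - start.tv_sec) +
		 (double)(now.tv_nsec - start.tv_nsec) / 1e9 < seconds);

	return iterations;
}
```

Calling busy_spin(60.0) from main() in one process per core should be
enough to push the temperature over a deliberately low trip point.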

The cause is commit d92ed2c9d3ff ("thermal: imx: Use driver's local data
to decide whether to run a measurement"), which replaced the
thermal_zone_device_is_enabled() check with a check of irq_enabled,
which only records whether the passive trip interrupt is enabled.

Normally, while the thermal zone is enabled, the temperature sensor
is kept running and the interrupt is used to detect overheating.
When the interrupt fires, it must be disabled.
From then on, the commit causes each measurement to be done
manually (enable sensor, do measurement, disable sensor).
Once the thermal core cools the system below the trip point (which it
typically does quickly), the irq is enabled again but the sensor is
left disabled, so no later reading ever becomes valid.

To fix this without using thermal_zone_device_is_enabled(), use a
separate variable to record if the thermal zone is enabled.
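
The sequence described above can be reproduced with a small userspace
model. This is only a sketch under simplifying assumptions, not driver
code: struct model and the model_*() helpers are invented for this
example, sensor_running stands in for the hardware eventually setting
the "temperature valid" bit, and only irq_enabled, tz_enabled and
alarm_temp mirror fields of struct imx_thermal_data.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the driver state; not the real structure. */
struct model {
	bool tz_enabled;	/* zone mode, as recorded by the fix */
	bool irq_enabled;	/* passive trip interrupt is armed */
	bool sensor_running;	/* sensor powered and measuring continuously */
	int alarm_temp;		/* trip point that raises the interrupt */
	int hw_temp;		/* what the hardware would measure */
};

/* Models enabling the zone: sensor on, interrupt armed. */
static void model_enable(struct model *m)
{
	m->tz_enabled = true;
	m->sensor_running = true;
	m->irq_enabled = true;
}

/* Models the alarm handler, which has to mask the interrupt. */
static void model_alarm_irq(struct model *m)
{
	m->irq_enabled = false;
}

/*
 * use_tz_enabled == false follows the irq_enabled-based decision,
 * use_tz_enabled == true follows the tz_enabled-based decision.
 */
static int model_get_temp(struct model *m, bool use_tz_enabled, int *temp)
{
	bool continuous = use_tz_enabled ? m->tz_enabled : m->irq_enabled;
	bool valid;

	if (continuous) {
		/* Expect a reading from the already running sensor. */
		valid = m->sensor_running;
	} else {
		/* One-shot: enable, measure, then disable the sensor. */
		m->sensor_running = false;
		valid = true;
	}

	if (!valid)
		return -EAGAIN;

	*temp = m->hw_temp;

	/* Re-arm the interrupt once we are back below the alarm. */
	if (!m->irq_enabled && *temp < m->alarm_temp)
		m->irq_enabled = true;

	return 0;
}
```

In this model, enabling the zone, raising the alarm, reading once above
and once below alarm_temp, and then reading again ends in -EAGAIN when
use_tz_enabled is false, because the one-shot path left the sensor off
before the interrupt was re-armed. With use_tz_enabled set to true the
same sequence keeps returning 0.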

Fixes: d92ed2c9d3ff ("thermal: imx: Use driver's local data to decide whether to run a measurement")
Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
---
 drivers/thermal/imx_thermal.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/thermal/imx_thermal.c b/drivers/thermal/imx_thermal.c
index 2c7473d86a59..5a6ad5bae238 100644
--- a/drivers/thermal/imx_thermal.c
+++ b/drivers/thermal/imx_thermal.c
@@ -205,6 +205,7 @@ struct imx_thermal_data {
 	int alarm_temp;
 	int last_temp;
 	bool irq_enabled;
+	bool tz_enabled;
 	int irq;
 	struct clk *thermal_clk;
 	const struct thermal_soc_data *socdata;
@@ -252,11 +253,10 @@ static int imx_get_temp(struct thermal_zone_device *tz, int *temp)
 	const struct thermal_soc_data *soc_data = data->socdata;
 	struct regmap *map = data->tempmon;
 	unsigned int n_meas;
-	bool wait, run_measurement;
+	bool wait;
 	u32 val;
 
-	run_measurement = !data->irq_enabled;
-	if (!run_measurement) {
+	if (data->tz_enabled) {
 		/* Check if a measurement is currently in progress */
 		regmap_read(map, soc_data->temp_data, &val);
 		wait = !(val & soc_data->temp_valid_mask);
@@ -283,7 +283,7 @@ static int imx_get_temp(struct thermal_zone_device *tz, int *temp)
 
 	regmap_read(map, soc_data->temp_data, &val);
 
-	if (run_measurement) {
+	if (!data->tz_enabled) {
 		regmap_write(map, soc_data->sensor_ctrl + REG_CLR,
 			     soc_data->measure_temp_mask);
 		regmap_write(map, soc_data->sensor_ctrl + REG_SET,
@@ -339,6 +339,7 @@ static int imx_change_mode(struct thermal_zone_device *tz,
 	const struct thermal_soc_data *soc_data = data->socdata;
 
 	if (mode == THERMAL_DEVICE_ENABLED) {
+		data->tz_enabled = true;
 		regmap_write(map, soc_data->sensor_ctrl + REG_CLR,
 			     soc_data->power_down_mask);
 		regmap_write(map, soc_data->sensor_ctrl + REG_SET,
@@ -349,6 +350,7 @@ static int imx_change_mode(struct thermal_zone_device *tz,
 			enable_irq(data->irq);
 		}
 	} else {
+		data->tz_enabled = false;
 		regmap_write(map, soc_data->sensor_ctrl + REG_CLR,
 			     soc_data->measure_temp_mask);
 		regmap_write(map, soc_data->sensor_ctrl + REG_SET,
-- 
2.34.1



* Re: [PATCH] imx_thermal: Fix temperature retrieval after overheat
From: Nicolas Cavallari @ 2022-02-07 17:02 UTC
  To: Rafael J. Wysocki, Daniel Lezcano, Shawn Guo, Sascha Hauer
  Cc: Amit Kucheria, Zhang Rui, Pengutronix Kernel Team, Fabio Estevam,
	NXP Linux Team, Andrzej Pietrasiewicz, linux-pm, linux-kernel

On 07/02/2022 17:18, Nicolas Cavallari wrote:
> When the CPU temperature is above the passive trip point, reading the
> temperature would fail forever with EAGAIN.  Fortunately, the thermal
> core would continue to assume that the system is overheating, so would
> put all passive cooling devices to the max.  Unfortunately, it does
> this forever, even if the temperature returns to normal.

Please drop this patch; apparently this was already fixed. Sorry for the
noise.
