All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/5] thermal/core: Rearm the monitoring only one time
@ 2022-08-05 15:38 Daniel Lezcano
  2022-08-05 15:38 ` [PATCH 2/5] thermal/core: Rework the monitoring a bit Daniel Lezcano
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Daniel Lezcano @ 2022-08-05 15:38 UTC (permalink / raw)
  To: daniel.lezcano, rafael; +Cc: rui.zhang, linux-pm, linux-kernel, Amit Kucheria

The current code calls monitor_thermal_zone() inside the
handle_thermal_trip() function. But this one is called in a loop for
each trip point which means the monitoring is rearmed several times
for nothing (assuming there could be several passive and active trip
points).

Move the monitor_thermal_zone() function out of the
handle_thermal_trip() function and after the thermal trip loop, so the
timer will be disabled or rearmed one time.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/thermal/thermal_core.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index b4c68410c158..4e1a83987b99 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -383,11 +383,6 @@ static void handle_thermal_trip(struct thermal_zone_device *tz, int trip)
 		handle_critical_trips(tz, trip, trip_temp, type);
 	else
 		handle_non_critical_trips(tz, trip);
-	/*
-	 * Alright, we handled this trip successfully.
-	 * So, start monitoring again.
-	 */
-	monitor_thermal_zone(tz);
 }
 
 static void update_temperature(struct thermal_zone_device *tz)
@@ -503,6 +498,8 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
 
 	for (count = 0; count < tz->num_trips; count++)
 		handle_thermal_trip(tz, count);
+
+	monitor_thermal_zone(tz);
 }
 EXPORT_SYMBOL_GPL(thermal_zone_device_update);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/5] thermal/core: Rework the monitoring a bit
  2022-08-05 15:38 [PATCH 1/5] thermal/core: Rearm the monitoring only one time Daniel Lezcano
@ 2022-08-05 15:38 ` Daniel Lezcano
  2022-08-23 12:42   ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
  2022-08-05 15:38 ` [PATCH 3/5] thermal/governors: Group the thermal zone lock inside the throttle function Daniel Lezcano
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Daniel Lezcano @ 2022-08-05 15:38 UTC (permalink / raw)
  To: daniel.lezcano, rafael; +Cc: rui.zhang, linux-pm, linux-kernel, Amit Kucheria

The should_stop_polling() function wraps the function
thermal_zone_device_is_enabled().

The monitor_thermal_zone() function checks if the thermal zone is
enabled via the should_stop_polling() function.

However, the instant after checking the thermal zone is enabled, this
one can be disabled, so even if that reduces the race window, it does
not prevent that and the monitoring can be set again with the thermal
zone disabled.

For this reason, the function should_stop_polling() is replaced by a
direct check of the thermal zone mode with the mutex locks held, that
prevents the situation described above.

As the semantic is clear with the thermal_zone_is_enabled() function,
we can remove the should_stop_polling() function and replace the check
with the former function.

While at it, reorder the checks to improve the readability of the
monitor_thermal_zone() function.

In the future, the thermal_zone_device_disable() and the
thermal_zone_device_enable() functions should unset / set the polling
timer directly instead of relying on the next
thermal_zone_device_update() call to do that. That will make a
synchronous thermal zone mode change but the locking scheme should be
double checked for that which out of the scope of this change.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/thermal/thermal_core.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 4e1a83987b99..d7029fd1c112 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -295,25 +295,16 @@ static void thermal_zone_device_set_polling(struct thermal_zone_device *tz,
 		cancel_delayed_work(&tz->poll_queue);
 }
 
-static inline bool should_stop_polling(struct thermal_zone_device *tz)
-{
-	return !thermal_zone_device_is_enabled(tz);
-}
-
 static void monitor_thermal_zone(struct thermal_zone_device *tz)
 {
-	bool stop;
-
-	stop = should_stop_polling(tz);
-
 	mutex_lock(&tz->lock);
 
-	if (!stop && tz->passive)
+	if (tz->mode != THERMAL_DEVICE_ENABLED)
+		thermal_zone_device_set_polling(tz, 0);
+	else if (tz->passive)
 		thermal_zone_device_set_polling(tz, tz->passive_delay_jiffies);
-	else if (!stop && tz->polling_delay_jiffies)
+	else if (tz->polling_delay_jiffies)
 		thermal_zone_device_set_polling(tz, tz->polling_delay_jiffies);
-	else
-		thermal_zone_device_set_polling(tz, 0);
 
 	mutex_unlock(&tz->lock);
 }
@@ -480,7 +471,7 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
 {
 	int count;
 
-	if (should_stop_polling(tz))
+	if (!thermal_zone_device_is_enabled(tz))
 		return;
 
 	if (atomic_read(&in_suspend))
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/5] thermal/governors: Group the thermal zone lock inside the throttle function
  2022-08-05 15:38 [PATCH 1/5] thermal/core: Rearm the monitoring only one time Daniel Lezcano
  2022-08-05 15:38 ` [PATCH 2/5] thermal/core: Rework the monitoring a bit Daniel Lezcano
@ 2022-08-05 15:38 ` Daniel Lezcano
  2022-08-23 12:42   ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
  2022-08-05 15:38 ` [PATCH 4/5] thermal/core: Move the thermal zone lock out of the governors Daniel Lezcano
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Daniel Lezcano @ 2022-08-05 15:38 UTC (permalink / raw)
  To: daniel.lezcano, rafael
  Cc: rui.zhang, linux-pm, linux-kernel, Amit Kucheria, Lukasz Luba

The thermal zone lock is taken in the different places in the
throttling path.

At the first glance it does not hurt to move them at the beginning and
the end of the 'throttle' function. That will allow a consolidation of
the lock in the next following changes.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/thermal/gov_bang_bang.c       |  8 ++-----
 drivers/thermal/gov_fair_share.c      |  1 +
 drivers/thermal/gov_power_allocator.c | 34 ++++++++++++---------------
 drivers/thermal/gov_step_wise.c       |  8 ++-----
 4 files changed, 20 insertions(+), 31 deletions(-)

diff --git a/drivers/thermal/gov_bang_bang.c b/drivers/thermal/gov_bang_bang.c
index 991a1c54296d..f0bff2e0475b 100644
--- a/drivers/thermal/gov_bang_bang.c
+++ b/drivers/thermal/gov_bang_bang.c
@@ -31,8 +31,6 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 				trip, trip_temp, tz->temperature,
 				trip_hyst);
 
-	mutex_lock(&tz->lock);
-
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
 		if (instance->trip != trip)
 			continue;
@@ -65,8 +63,6 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 		instance->cdev->updated = false; /* cdev needs update */
 		mutex_unlock(&instance->cdev->lock);
 	}
-
-	mutex_unlock(&tz->lock);
 }
 
 /**
@@ -100,10 +96,10 @@ static int bang_bang_control(struct thermal_zone_device *tz, int trip)
 {
 	struct thermal_instance *instance;
 
-	thermal_zone_trip_update(tz, trip);
-
 	mutex_lock(&tz->lock);
 
+	thermal_zone_trip_update(tz, trip);
+
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node)
 		thermal_cdev_update(instance->cdev);
 
diff --git a/drivers/thermal/gov_fair_share.c b/drivers/thermal/gov_fair_share.c
index 6a2abcfc648f..5d5ddd648cd2 100644
--- a/drivers/thermal/gov_fair_share.c
+++ b/drivers/thermal/gov_fair_share.c
@@ -113,6 +113,7 @@ static int fair_share_throttle(struct thermal_zone_device *tz, int trip)
 	}
 
 	mutex_unlock(&tz->lock);
+
 	return 0;
 }
 
diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
index 1d5052470967..d3aca236e274 100644
--- a/drivers/thermal/gov_power_allocator.c
+++ b/drivers/thermal/gov_power_allocator.c
@@ -392,8 +392,6 @@ static int allocate_power(struct thermal_zone_device *tz,
 	int i, num_actors, total_weight, ret = 0;
 	int trip_max_desired_temperature = params->trip_max_desired_temperature;
 
-	mutex_lock(&tz->lock);
-
 	num_actors = 0;
 	total_weight = 0;
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
@@ -404,10 +402,8 @@ static int allocate_power(struct thermal_zone_device *tz,
 		}
 	}
 
-	if (!num_actors) {
-		ret = -ENODEV;
-		goto unlock;
-	}
+	if (!num_actors)
+		return -ENODEV;
 
 	/*
 	 * We need to allocate five arrays of the same size:
@@ -421,10 +417,8 @@ static int allocate_power(struct thermal_zone_device *tz,
 	BUILD_BUG_ON(sizeof(*req_power) != sizeof(*extra_actor_power));
 	BUILD_BUG_ON(sizeof(*req_power) != sizeof(*weighted_req_power));
 	req_power = kcalloc(num_actors * 5, sizeof(*req_power), GFP_KERNEL);
-	if (!req_power) {
-		ret = -ENOMEM;
-		goto unlock;
-	}
+	if (!req_power)
+		return -ENOMEM;
 
 	max_power = &req_power[num_actors];
 	granted_power = &req_power[2 * num_actors];
@@ -496,8 +490,6 @@ static int allocate_power(struct thermal_zone_device *tz,
 				      control_temp - tz->temperature);
 
 	kfree(req_power);
-unlock:
-	mutex_unlock(&tz->lock);
 
 	return ret;
 }
@@ -576,7 +568,6 @@ static void allow_maximum_power(struct thermal_zone_device *tz, bool update)
 	struct power_allocator_params *params = tz->governor_data;
 	u32 req_power;
 
-	mutex_lock(&tz->lock);
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
 		struct thermal_cooling_device *cdev = instance->cdev;
 
@@ -598,7 +589,6 @@ static void allow_maximum_power(struct thermal_zone_device *tz, bool update)
 
 		mutex_unlock(&instance->cdev->lock);
 	}
-	mutex_unlock(&tz->lock);
 }
 
 /**
@@ -707,17 +697,19 @@ static void power_allocator_unbind(struct thermal_zone_device *tz)
 
 static int power_allocator_throttle(struct thermal_zone_device *tz, int trip)
 {
-	int ret;
+	int ret = 0;
 	int switch_on_temp, control_temp;
 	struct power_allocator_params *params = tz->governor_data;
 	bool update;
 
+	mutex_lock(&tz->lock);
+
 	/*
 	 * We get called for every trip point but we only need to do
 	 * our calculations once
 	 */
 	if (trip != params->trip_max_desired_temperature)
-		return 0;
+		goto out;
 
 	ret = tz->ops->get_trip_temp(tz, params->trip_switch_on,
 				     &switch_on_temp);
@@ -726,7 +718,7 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip)
 		tz->passive = 0;
 		reset_pid_controller(params);
 		allow_maximum_power(tz, update);
-		return 0;
+		goto out;
 	}
 
 	tz->passive = 1;
@@ -737,10 +729,14 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip)
 		dev_warn(&tz->device,
 			 "Failed to get the maximum desired temperature: %d\n",
 			 ret);
-		return ret;
+		goto out;
 	}
 
-	return allocate_power(tz, control_temp);
+	ret = allocate_power(tz, control_temp);
+
+	mutex_unlock(&tz->lock);
+out:
+	return ret;
 }
 
 static struct thermal_governor thermal_gov_power_allocator = {
diff --git a/drivers/thermal/gov_step_wise.c b/drivers/thermal/gov_step_wise.c
index 9729b46d0258..597a0ebec7a4 100644
--- a/drivers/thermal/gov_step_wise.c
+++ b/drivers/thermal/gov_step_wise.c
@@ -117,8 +117,6 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 	dev_dbg(&tz->device, "Trip%d[type=%d,temp=%d]:trend=%d,throttle=%d\n",
 				trip, trip_type, trip_temp, trend, throttle);
 
-	mutex_lock(&tz->lock);
-
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
 		if (instance->trip != trip)
 			continue;
@@ -145,8 +143,6 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 		instance->cdev->updated = false; /* cdev needs update */
 		mutex_unlock(&instance->cdev->lock);
 	}
-
-	mutex_unlock(&tz->lock);
 }
 
 /**
@@ -164,10 +160,10 @@ static int step_wise_throttle(struct thermal_zone_device *tz, int trip)
 {
 	struct thermal_instance *instance;
 
-	thermal_zone_trip_update(tz, trip);
-
 	mutex_lock(&tz->lock);
 
+	thermal_zone_trip_update(tz, trip);
+
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node)
 		thermal_cdev_update(instance->cdev);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/5] thermal/core: Move the thermal zone lock out of the governors
  2022-08-05 15:38 [PATCH 1/5] thermal/core: Rearm the monitoring only one time Daniel Lezcano
  2022-08-05 15:38 ` [PATCH 2/5] thermal/core: Rework the monitoring a bit Daniel Lezcano
  2022-08-05 15:38 ` [PATCH 3/5] thermal/governors: Group the thermal zone lock inside the throttle function Daniel Lezcano
@ 2022-08-05 15:38 ` Daniel Lezcano
  2022-08-23 12:42   ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
  2022-10-04 14:14   ` [PATCH 4/5] " Guenter Roeck
  2022-08-05 15:38 ` [PATCH 5/5] thermal/core: Move the mutex inside the thermal_zone_device_update() function Daniel Lezcano
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 20+ messages in thread
From: Daniel Lezcano @ 2022-08-05 15:38 UTC (permalink / raw)
  To: daniel.lezcano, rafael
  Cc: rui.zhang, linux-pm, linux-kernel, Amit Kucheria, Lukasz Luba

All the governors throttling ops are taking/releasing the lock at the
beginning and the end of the function.

We can move the mutex to the throttling call site instead.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/thermal/gov_bang_bang.c       |  4 +---
 drivers/thermal/gov_fair_share.c      |  4 +---
 drivers/thermal/gov_power_allocator.c | 16 ++++++----------
 drivers/thermal/gov_step_wise.c       |  4 +---
 drivers/thermal/thermal_core.c        |  2 ++
 5 files changed, 11 insertions(+), 19 deletions(-)

diff --git a/drivers/thermal/gov_bang_bang.c b/drivers/thermal/gov_bang_bang.c
index f0bff2e0475b..a08bbe33be96 100644
--- a/drivers/thermal/gov_bang_bang.c
+++ b/drivers/thermal/gov_bang_bang.c
@@ -96,15 +96,13 @@ static int bang_bang_control(struct thermal_zone_device *tz, int trip)
 {
 	struct thermal_instance *instance;
 
-	mutex_lock(&tz->lock);
+	lockdep_assert_held(&tz->lock);
 
 	thermal_zone_trip_update(tz, trip);
 
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node)
 		thermal_cdev_update(instance->cdev);
 
-	mutex_unlock(&tz->lock);
-
 	return 0;
 }
 
diff --git a/drivers/thermal/gov_fair_share.c b/drivers/thermal/gov_fair_share.c
index 5d5ddd648cd2..a4ee4661e9cc 100644
--- a/drivers/thermal/gov_fair_share.c
+++ b/drivers/thermal/gov_fair_share.c
@@ -82,7 +82,7 @@ static int fair_share_throttle(struct thermal_zone_device *tz, int trip)
 	int total_instance = 0;
 	int cur_trip_level = get_trip_level(tz);
 
-	mutex_lock(&tz->lock);
+	lockdep_assert_held(&tz->lock);
 
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
 		if (instance->trip != trip)
@@ -112,8 +112,6 @@ static int fair_share_throttle(struct thermal_zone_device *tz, int trip)
 		mutex_unlock(&cdev->lock);
 	}
 
-	mutex_unlock(&tz->lock);
-
 	return 0;
 }
 
diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
index d3aca236e274..2d1aeaba38a8 100644
--- a/drivers/thermal/gov_power_allocator.c
+++ b/drivers/thermal/gov_power_allocator.c
@@ -697,19 +697,19 @@ static void power_allocator_unbind(struct thermal_zone_device *tz)
 
 static int power_allocator_throttle(struct thermal_zone_device *tz, int trip)
 {
-	int ret = 0;
+	int ret;
 	int switch_on_temp, control_temp;
 	struct power_allocator_params *params = tz->governor_data;
 	bool update;
 
-	mutex_lock(&tz->lock);
+	lockdep_assert_held(&tz->lock);
 
 	/*
 	 * We get called for every trip point but we only need to do
 	 * our calculations once
 	 */
 	if (trip != params->trip_max_desired_temperature)
-		goto out;
+		return 0;
 
 	ret = tz->ops->get_trip_temp(tz, params->trip_switch_on,
 				     &switch_on_temp);
@@ -718,7 +718,7 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip)
 		tz->passive = 0;
 		reset_pid_controller(params);
 		allow_maximum_power(tz, update);
-		goto out;
+		return 0;
 	}
 
 	tz->passive = 1;
@@ -729,14 +729,10 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip)
 		dev_warn(&tz->device,
 			 "Failed to get the maximum desired temperature: %d\n",
 			 ret);
-		goto out;
+		return ret;
 	}
 
-	ret = allocate_power(tz, control_temp);
-
-	mutex_unlock(&tz->lock);
-out:
-	return ret;
+	return allocate_power(tz, control_temp);
 }
 
 static struct thermal_governor thermal_gov_power_allocator = {
diff --git a/drivers/thermal/gov_step_wise.c b/drivers/thermal/gov_step_wise.c
index 597a0ebec7a4..cdd3354bc27f 100644
--- a/drivers/thermal/gov_step_wise.c
+++ b/drivers/thermal/gov_step_wise.c
@@ -160,15 +160,13 @@ static int step_wise_throttle(struct thermal_zone_device *tz, int trip)
 {
 	struct thermal_instance *instance;
 
-	mutex_lock(&tz->lock);
+	lockdep_assert_held(&tz->lock);
 
 	thermal_zone_trip_update(tz, trip);
 
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node)
 		thermal_cdev_update(instance->cdev);
 
-	mutex_unlock(&tz->lock);
-
 	return 0;
 }
 
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index d7029fd1c112..9d554f97e081 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -311,8 +311,10 @@ static void monitor_thermal_zone(struct thermal_zone_device *tz)
 
 static void handle_non_critical_trips(struct thermal_zone_device *tz, int trip)
 {
+	mutex_lock(&tz->lock);
 	tz->governor ? tz->governor->throttle(tz, trip) :
 		       def_governor->throttle(tz, trip);
+	mutex_unlock(&tz->lock);
 }
 
 void thermal_zone_device_critical(struct thermal_zone_device *tz)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 5/5] thermal/core: Move the mutex inside the thermal_zone_device_update() function
  2022-08-05 15:38 [PATCH 1/5] thermal/core: Rearm the monitoring only one time Daniel Lezcano
                   ` (2 preceding siblings ...)
  2022-08-05 15:38 ` [PATCH 4/5] thermal/core: Move the thermal zone lock out of the governors Daniel Lezcano
@ 2022-08-05 15:38 ` Daniel Lezcano
       [not found]   ` <CGME20220812103956eucas1p1849443855b3537c3f5be2891fa50a50b@eucas1p1.samsung.com>
  2022-08-23 12:42   ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
  2022-08-05 16:29 ` [PATCH 1/5] thermal/core: Rearm the monitoring only one time Rafael J. Wysocki
  2022-08-23 12:42 ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
  5 siblings, 2 replies; 20+ messages in thread
From: Daniel Lezcano @ 2022-08-05 15:38 UTC (permalink / raw)
  To: daniel.lezcano, rafael; +Cc: rui.zhang, linux-pm, linux-kernel, Amit Kucheria

All the different calls inside the thermal_zone_device_update()
function take the mutex.

The previous changes move the mutex out of the different functions,
like the throttling ops. Now that the mutexes are all at the same
level in the call stack for the thermal_zone_device_update() function,
they can be moved inside this one.

That has the benefit of:

1. Simplify the code by not having a plethora of places where the lock is taken

2. Probably closes more race windows because releasing the lock from
one line to another can give the opportunity to the thermal zone to change
its state in the meantime. For example, the thermal zone can be
enabled right after checking it is disabled.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/thermal/thermal_core.c    | 32 +++++--------
 drivers/thermal/thermal_core.h    |  2 +
 drivers/thermal/thermal_helpers.c | 75 ++++++++++++++++++-------------
 drivers/thermal/thermal_sysfs.c   |  6 ++-
 4 files changed, 62 insertions(+), 53 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 9d554f97e081..60110ac53e23 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -297,24 +297,18 @@ static void thermal_zone_device_set_polling(struct thermal_zone_device *tz,
 
 static void monitor_thermal_zone(struct thermal_zone_device *tz)
 {
-	mutex_lock(&tz->lock);
-
 	if (tz->mode != THERMAL_DEVICE_ENABLED)
 		thermal_zone_device_set_polling(tz, 0);
 	else if (tz->passive)
 		thermal_zone_device_set_polling(tz, tz->passive_delay_jiffies);
 	else if (tz->polling_delay_jiffies)
 		thermal_zone_device_set_polling(tz, tz->polling_delay_jiffies);
-
-	mutex_unlock(&tz->lock);
 }
 
 static void handle_non_critical_trips(struct thermal_zone_device *tz, int trip)
 {
-	mutex_lock(&tz->lock);
 	tz->governor ? tz->governor->throttle(tz, trip) :
 		       def_governor->throttle(tz, trip);
-	mutex_unlock(&tz->lock);
 }
 
 void thermal_zone_device_critical(struct thermal_zone_device *tz)
@@ -382,7 +376,7 @@ static void update_temperature(struct thermal_zone_device *tz)
 {
 	int temp, ret;
 
-	ret = thermal_zone_get_temp(tz, &temp);
+	ret = __thermal_zone_get_temp(tz, &temp);
 	if (ret) {
 		if (ret != -EAGAIN)
 			dev_warn(&tz->device,
@@ -391,10 +385,8 @@ static void update_temperature(struct thermal_zone_device *tz)
 		return;
 	}
 
-	mutex_lock(&tz->lock);
 	tz->last_temperature = tz->temperature;
 	tz->temperature = temp;
-	mutex_unlock(&tz->lock);
 
 	trace_thermal_temperature(tz);
 
@@ -457,15 +449,9 @@ EXPORT_SYMBOL_GPL(thermal_zone_device_disable);
 
 int thermal_zone_device_is_enabled(struct thermal_zone_device *tz)
 {
-	enum thermal_device_mode mode;
-
-	mutex_lock(&tz->lock);
-
-	mode = tz->mode;
+	lockdep_assert_held(&tz->lock);
 
-	mutex_unlock(&tz->lock);
-
-	return mode == THERMAL_DEVICE_ENABLED;
+	return tz->mode == THERMAL_DEVICE_ENABLED;
 }
 
 void thermal_zone_device_update(struct thermal_zone_device *tz,
@@ -473,9 +459,6 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
 {
 	int count;
 
-	if (!thermal_zone_device_is_enabled(tz))
-		return;
-
 	if (atomic_read(&in_suspend))
 		return;
 
@@ -483,9 +466,14 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
 		      "'get_temp' ops set\n", __func__))
 		return;
 
+	mutex_lock(&tz->lock);
+
+	if (!thermal_zone_device_is_enabled(tz))
+		goto out;
+
 	update_temperature(tz);
 
-	thermal_zone_set_trips(tz);
+	__thermal_zone_set_trips(tz);
 
 	tz->notify_event = event;
 
@@ -493,6 +481,8 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
 		handle_thermal_trip(tz, count);
 
 	monitor_thermal_zone(tz);
+out:
+	mutex_unlock(&tz->lock);
 }
 EXPORT_SYMBOL_GPL(thermal_zone_device_update);
 
diff --git a/drivers/thermal/thermal_core.h b/drivers/thermal/thermal_core.h
index 2241d2dce017..1571917bd3c8 100644
--- a/drivers/thermal/thermal_core.h
+++ b/drivers/thermal/thermal_core.h
@@ -112,6 +112,8 @@ int thermal_build_list_of_policies(char *buf);
 
 /* Helpers */
 void thermal_zone_set_trips(struct thermal_zone_device *tz);
+void __thermal_zone_set_trips(struct thermal_zone_device *tz);
+int __thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp);
 
 /* sysfs I/F */
 int thermal_zone_create_device_groups(struct thermal_zone_device *, int);
diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
index 690890f054a3..702c70bdca48 100644
--- a/drivers/thermal/thermal_helpers.c
+++ b/drivers/thermal/thermal_helpers.c
@@ -64,27 +64,17 @@ get_thermal_instance(struct thermal_zone_device *tz,
 }
 EXPORT_SYMBOL(get_thermal_instance);
 
-/**
- * thermal_zone_get_temp() - returns the temperature of a thermal zone
- * @tz: a valid pointer to a struct thermal_zone_device
- * @temp: a valid pointer to where to store the resulting temperature.
- *
- * When a valid thermal zone reference is passed, it will fetch its
- * temperature and fill @temp.
- *
- * Return: On success returns 0, an error code otherwise
- */
-int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
+int __thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
 {
 	int ret = -EINVAL;
 	int count;
 	int crit_temp = INT_MAX;
 	enum thermal_trip_type type;
 
+	lockdep_assert_held(&tz->lock);
+	
 	if (!tz || IS_ERR(tz) || !tz->ops->get_temp)
-		goto exit;
-
-	mutex_lock(&tz->lock);
+		return -EINVAL;
 
 	ret = tz->ops->get_temp(tz, temp);
 
@@ -107,35 +97,42 @@ int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
 			*temp = tz->emul_temperature;
 	}
 
-	mutex_unlock(&tz->lock);
-exit:
 	return ret;
 }
-EXPORT_SYMBOL_GPL(thermal_zone_get_temp);
 
 /**
- * thermal_zone_set_trips - Computes the next trip points for the driver
- * @tz: a pointer to a thermal zone device structure
+ * thermal_zone_get_temp() - returns the temperature of a thermal zone
+ * @tz: a valid pointer to a struct thermal_zone_device
+ * @temp: a valid pointer to where to store the resulting temperature.
  *
- * The function computes the next temperature boundaries by browsing
- * the trip points. The result is the closer low and high trip points
- * to the current temperature. These values are passed to the backend
- * driver to let it set its own notification mechanism (usually an
- * interrupt).
+ * When a valid thermal zone reference is passed, it will fetch its
+ * temperature and fill @temp.
  *
- * It does not return a value
+ * Return: On success returns 0, an error code otherwise
  */
-void thermal_zone_set_trips(struct thermal_zone_device *tz)
+int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
+{
+	int ret;
+
+	mutex_lock(&tz->lock);
+	ret = __thermal_zone_get_temp(tz, temp);
+	mutex_unlock(&tz->lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(thermal_zone_get_temp);
+
+void __thermal_zone_set_trips(struct thermal_zone_device *tz)
 {
 	int low = -INT_MAX;
 	int high = INT_MAX;
 	int trip_temp, hysteresis;
 	int i, ret;
 
-	mutex_lock(&tz->lock);
-
+	lockdep_assert_held(&tz->lock);
+	
 	if (!tz->ops->set_trips || !tz->ops->get_trip_hyst)
-		goto exit;
+		return;
 
 	for (i = 0; i < tz->num_trips; i++) {
 		int trip_low;
@@ -154,7 +151,7 @@ void thermal_zone_set_trips(struct thermal_zone_device *tz)
 
 	/* No need to change trip points */
 	if (tz->prev_low_trip == low && tz->prev_high_trip == high)
-		goto exit;
+		return;
 
 	tz->prev_low_trip = low;
 	tz->prev_high_trip = high;
@@ -169,8 +166,24 @@ void thermal_zone_set_trips(struct thermal_zone_device *tz)
 	ret = tz->ops->set_trips(tz, low, high);
 	if (ret)
 		dev_err(&tz->device, "Failed to set trips: %d\n", ret);
+}
 
-exit:
+/**
+ * thermal_zone_set_trips - Computes the next trip points for the driver
+ * @tz: a pointer to a thermal zone device structure
+ *
+ * The function computes the next temperature boundaries by browsing
+ * the trip points. The result is the closer low and high trip points
+ * to the current temperature. These values are passed to the backend
+ * driver to let it set its own notification mechanism (usually an
+ * interrupt).
+ *
+ * It does not return a value
+ */
+void thermal_zone_set_trips(struct thermal_zone_device *tz)
+{
+	mutex_lock(&tz->lock);
+	__thermal_zone_set_trips(tz);
 	mutex_unlock(&tz->lock);
 }
 
diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
index 3c513561d346..f094f7cbc455 100644
--- a/drivers/thermal/thermal_sysfs.c
+++ b/drivers/thermal/thermal_sysfs.c
@@ -49,7 +49,11 @@ static ssize_t
 mode_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct thermal_zone_device *tz = to_thermal_zone(dev);
-	int enabled = thermal_zone_device_is_enabled(tz);
+	int enabled;
+
+	mutex_lock(&tz->lock);
+	enabled = thermal_zone_device_is_enabled(tz);
+	mutex_unlock(&tz->lock);
 
 	return sprintf(buf, "%s\n", enabled ? "enabled" : "disabled");
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/5] thermal/core: Rearm the monitoring only one time
  2022-08-05 15:38 [PATCH 1/5] thermal/core: Rearm the monitoring only one time Daniel Lezcano
                   ` (3 preceding siblings ...)
  2022-08-05 15:38 ` [PATCH 5/5] thermal/core: Move the mutex inside the thermal_zone_device_update() function Daniel Lezcano
@ 2022-08-05 16:29 ` Rafael J. Wysocki
  2022-08-05 16:37   ` Daniel Lezcano
  2022-08-23 12:42 ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
  5 siblings, 1 reply; 20+ messages in thread
From: Rafael J. Wysocki @ 2022-08-05 16:29 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Rafael J. Wysocki, Zhang, Rui, Linux PM,
	Linux Kernel Mailing List, Amit Kucheria

On Fri, Aug 5, 2022 at 5:38 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>
> The current code calls monitor_thermal_zone() inside the
> handle_thermal_trip() function. But this one is called in a loop for
> each trip point which means the monitoring is rearmed several times
> for nothing (assuming there could be several passive and active trip
> points).
>
> Move the monitor_thermal_zone() function out of the
> handle_thermal_trip() function and after the thermal trip loop, so the
> timer will be disabled or rearmed one time.
>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

Does this series depend on any other?

You've been sending quite a lot of material lately and it is not
always easy to tell what the dependencies between the different patch
series are.

> ---
>  drivers/thermal/thermal_core.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index b4c68410c158..4e1a83987b99 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -383,11 +383,6 @@ static void handle_thermal_trip(struct thermal_zone_device *tz, int trip)
>                 handle_critical_trips(tz, trip, trip_temp, type);
>         else
>                 handle_non_critical_trips(tz, trip);
> -       /*
> -        * Alright, we handled this trip successfully.
> -        * So, start monitoring again.
> -        */
> -       monitor_thermal_zone(tz);
>  }
>
>  static void update_temperature(struct thermal_zone_device *tz)
> @@ -503,6 +498,8 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
>
>         for (count = 0; count < tz->num_trips; count++)
>                 handle_thermal_trip(tz, count);
> +
> +       monitor_thermal_zone(tz);
>  }
>  EXPORT_SYMBOL_GPL(thermal_zone_device_update);
>
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/5] thermal/core: Rearm the monitoring only one time
  2022-08-05 16:29 ` [PATCH 1/5] thermal/core: Rearm the monitoring only one time Rafael J. Wysocki
@ 2022-08-05 16:37   ` Daniel Lezcano
  0 siblings, 0 replies; 20+ messages in thread
From: Daniel Lezcano @ 2022-08-05 16:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Zhang, Rui, Linux PM, Linux Kernel Mailing List, Amit Kucheria


Hi Rafael,

On 05/08/2022 18:29, Rafael J. Wysocki wrote:
> On Fri, Aug 5, 2022 at 5:38 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>>
>> The current code calls monitor_thermal_zone() inside the
>> handle_thermal_trip() function. But this one is called in a loop for
>> each trip point which means the monitoring is rearmed several times
>> for nothing (assuming there could be several passive and active trip
>> points).
>>
>> Move the monitor_thermal_zone() function out of the
>> handle_thermal_trip() function and after the thermal trip loop, so the
>> timer will be disabled or rearmed one time.
>>
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> 
> Does this series depend on any other? >
> You've been sending quite a lot of material lately and it is not
> always easy to tell what the dependencies between the different patch
> series are.

Yes I understand.

This series does not depend on any other. It is not related to any 
pending changes.

It may have trivial conflicts with the other 26 patches series but it 
would be the case for any submissions posted by someone else anyway.




-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/5] thermal/core: Move the mutex inside the thermal_zone_device_update() function
       [not found]   ` <CGME20220812103956eucas1p1849443855b3537c3f5be2891fa50a50b@eucas1p1.samsung.com>
@ 2022-08-12 10:39     ` Marek Szyprowski
  2022-08-12 13:12       ` [PATCH] thermal/core: Fix lockdep_assert() warning Daniel Lezcano
  2022-08-12 13:17       ` [PATCH 5/5] thermal/core: Move the mutex inside the thermal_zone_device_update() function Daniel Lezcano
  0 siblings, 2 replies; 20+ messages in thread
From: Marek Szyprowski @ 2022-08-12 10:39 UTC (permalink / raw)
  To: Daniel Lezcano, rafael, Krzysztof Kozlowski
  Cc: rui.zhang, linux-pm, linux-kernel, Amit Kucheria,
	'Linux Samsung SOC'

Hi Daniel,

On 05.08.2022 17:38, Daniel Lezcano wrote:
> All the different calls inside the thermal_zone_device_update()
> function take the mutex.
>
> The previous changes move the mutex out of the different functions,
> like the throttling ops. Now that the mutexes are all at the same
> level in the call stack for the thermal_zone_device_update() function,
> they can be moved inside this one.
>
> That has the benefit of:
>
> 1. Simplify the code by not having a plethora of places where the lock is taken
>
> 2. Probably closes more race windows because releasing the lock from
> one line to another can give the opportunity to the thermal zone to change
> its state in the meantime. For example, the thermal zone can be
> enabled right after checking it is disabled.
>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

This patch landed recently in linux next-20220811 as commit ca48ad71717d 
("thermal/core: Move the mutex inside the thermal_zone_device_update() 
function"). Unfortunately it triggers a warning on Samsung ARM/ARM64 
Exynos-based systems during the system suspend/resume cycle:

Restarting tasks ... done.
random: crng reseeded on system resumption
------------[ cut here ]------------
WARNING: CPU: 1 PID: 1374 at drivers/thermal/thermal_core.c:452 
thermal_zone_device_is_enabled+0x58/0x5c
Modules linked in: cmac bnep hci_uart btbcm btintel bluetooth s5p_csis 
s5p_fimc exynos4_is_common v4l2_fwnode v4l2_async ecdh_generic ecc 
s5p_mfc brcmfmac cfg80211 s5p_jpeg videobuf2_dma_contig videobuf2_memops 
brcmutil v4l2_mem2mem videobuf2_v4l2 videobuf2_common videodev mc
CPU: 1 PID: 1374 Comm: rtcwake Not tainted 5.18.0-02136-gca48ad71717d #12560
Hardware name: Samsung Exynos (Flattened Device Tree)
  unwind_backtrace from show_stack+0x10/0x14
  show_stack from dump_stack_lvl+0x58/0x70
  dump_stack_lvl from __warn+0x230/0x234
  __warn from warn_slowpath_fmt+0xac/0xb4
  warn_slowpath_fmt from thermal_zone_device_is_enabled+0x58/0x5c
  thermal_zone_device_is_enabled from thermal_pm_notify+0x84/0xe8
  thermal_pm_notify from blocking_notifier_call_chain+0x6c/0x94
  blocking_notifier_call_chain from pm_suspend+0x2e8/0x428
  pm_suspend from state_store+0x68/0xc8
  state_store from kernfs_fop_write_iter+0x108/0x1b0
  kernfs_fop_write_iter from vfs_write+0x268/0x50c
  vfs_write from ksys_write+0x54/0xc8
  ksys_write from ret_fast_syscall+0x0/0x1c
Exception stack(0xf0fa9fa8 to 0xf0fa9ff0)
...
irq event stamp: 49341
hardirqs last  enabled at (49349): [<c019d3f0>] __up_console_sem+0x50/0x60
hardirqs last disabled at (49358): [<c019d3dc>] __up_console_sem+0x3c/0x60
softirqs last  enabled at (49336): [<c0101808>] __do_softirq+0x4c8/0x5e8
softirqs last disabled at (49331): [<c01305b8>] irq_exit+0x1cc/0x200
---[ end trace 0000000000000000 ]---
PM: suspend exit


It looks that either the exynos_thermal driver or the framework has to 
be somehow adjusted to avoid the above issue, but I didn't check the 
details in the code yet.


> ---
>   drivers/thermal/thermal_core.c    | 32 +++++--------
>   drivers/thermal/thermal_core.h    |  2 +
>   drivers/thermal/thermal_helpers.c | 75 ++++++++++++++++++-------------
>   drivers/thermal/thermal_sysfs.c   |  6 ++-
>   4 files changed, 62 insertions(+), 53 deletions(-)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 9d554f97e081..60110ac53e23 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -297,24 +297,18 @@ static void thermal_zone_device_set_polling(struct thermal_zone_device *tz,
>   
>   static void monitor_thermal_zone(struct thermal_zone_device *tz)
>   {
> -	mutex_lock(&tz->lock);
> -
>   	if (tz->mode != THERMAL_DEVICE_ENABLED)
>   		thermal_zone_device_set_polling(tz, 0);
>   	else if (tz->passive)
>   		thermal_zone_device_set_polling(tz, tz->passive_delay_jiffies);
>   	else if (tz->polling_delay_jiffies)
>   		thermal_zone_device_set_polling(tz, tz->polling_delay_jiffies);
> -
> -	mutex_unlock(&tz->lock);
>   }
>   
>   static void handle_non_critical_trips(struct thermal_zone_device *tz, int trip)
>   {
> -	mutex_lock(&tz->lock);
>   	tz->governor ? tz->governor->throttle(tz, trip) :
>   		       def_governor->throttle(tz, trip);
> -	mutex_unlock(&tz->lock);
>   }
>   
>   void thermal_zone_device_critical(struct thermal_zone_device *tz)
> @@ -382,7 +376,7 @@ static void update_temperature(struct thermal_zone_device *tz)
>   {
>   	int temp, ret;
>   
> -	ret = thermal_zone_get_temp(tz, &temp);
> +	ret = __thermal_zone_get_temp(tz, &temp);
>   	if (ret) {
>   		if (ret != -EAGAIN)
>   			dev_warn(&tz->device,
> @@ -391,10 +385,8 @@ static void update_temperature(struct thermal_zone_device *tz)
>   		return;
>   	}
>   
> -	mutex_lock(&tz->lock);
>   	tz->last_temperature = tz->temperature;
>   	tz->temperature = temp;
> -	mutex_unlock(&tz->lock);
>   
>   	trace_thermal_temperature(tz);
>   
> @@ -457,15 +449,9 @@ EXPORT_SYMBOL_GPL(thermal_zone_device_disable);
>   
>   int thermal_zone_device_is_enabled(struct thermal_zone_device *tz)
>   {
> -	enum thermal_device_mode mode;
> -
> -	mutex_lock(&tz->lock);
> -
> -	mode = tz->mode;
> +	lockdep_assert_held(&tz->lock);
>   
> -	mutex_unlock(&tz->lock);
> -
> -	return mode == THERMAL_DEVICE_ENABLED;
> +	return tz->mode == THERMAL_DEVICE_ENABLED;
>   }
>   
>   void thermal_zone_device_update(struct thermal_zone_device *tz,
> @@ -473,9 +459,6 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
>   {
>   	int count;
>   
> -	if (!thermal_zone_device_is_enabled(tz))
> -		return;
> -
>   	if (atomic_read(&in_suspend))
>   		return;
>   
> @@ -483,9 +466,14 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
>   		      "'get_temp' ops set\n", __func__))
>   		return;
>   
> +	mutex_lock(&tz->lock);
> +
> +	if (!thermal_zone_device_is_enabled(tz))
> +		goto out;
> +
>   	update_temperature(tz);
>   
> -	thermal_zone_set_trips(tz);
> +	__thermal_zone_set_trips(tz);
>   
>   	tz->notify_event = event;
>   
> @@ -493,6 +481,8 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
>   		handle_thermal_trip(tz, count);
>   
>   	monitor_thermal_zone(tz);
> +out:
> +	mutex_unlock(&tz->lock);
>   }
>   EXPORT_SYMBOL_GPL(thermal_zone_device_update);
>   
> diff --git a/drivers/thermal/thermal_core.h b/drivers/thermal/thermal_core.h
> index 2241d2dce017..1571917bd3c8 100644
> --- a/drivers/thermal/thermal_core.h
> +++ b/drivers/thermal/thermal_core.h
> @@ -112,6 +112,8 @@ int thermal_build_list_of_policies(char *buf);
>   
>   /* Helpers */
>   void thermal_zone_set_trips(struct thermal_zone_device *tz);
> +void __thermal_zone_set_trips(struct thermal_zone_device *tz);
> +int __thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp);
>   
>   /* sysfs I/F */
>   int thermal_zone_create_device_groups(struct thermal_zone_device *, int);
> diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
> index 690890f054a3..702c70bdca48 100644
> --- a/drivers/thermal/thermal_helpers.c
> +++ b/drivers/thermal/thermal_helpers.c
> @@ -64,27 +64,17 @@ get_thermal_instance(struct thermal_zone_device *tz,
>   }
>   EXPORT_SYMBOL(get_thermal_instance);
>   
> -/**
> - * thermal_zone_get_temp() - returns the temperature of a thermal zone
> - * @tz: a valid pointer to a struct thermal_zone_device
> - * @temp: a valid pointer to where to store the resulting temperature.
> - *
> - * When a valid thermal zone reference is passed, it will fetch its
> - * temperature and fill @temp.
> - *
> - * Return: On success returns 0, an error code otherwise
> - */
> -int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
> +int __thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
>   {
>   	int ret = -EINVAL;
>   	int count;
>   	int crit_temp = INT_MAX;
>   	enum thermal_trip_type type;
>   
> +	lockdep_assert_held(&tz->lock);
> +	
>   	if (!tz || IS_ERR(tz) || !tz->ops->get_temp)
> -		goto exit;
> -
> -	mutex_lock(&tz->lock);
> +		return -EINVAL;
>   
>   	ret = tz->ops->get_temp(tz, temp);
>   
> @@ -107,35 +97,42 @@ int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
>   			*temp = tz->emul_temperature;
>   	}
>   
> -	mutex_unlock(&tz->lock);
> -exit:
>   	return ret;
>   }
> -EXPORT_SYMBOL_GPL(thermal_zone_get_temp);
>   
>   /**
> - * thermal_zone_set_trips - Computes the next trip points for the driver
> - * @tz: a pointer to a thermal zone device structure
> + * thermal_zone_get_temp() - returns the temperature of a thermal zone
> + * @tz: a valid pointer to a struct thermal_zone_device
> + * @temp: a valid pointer to where to store the resulting temperature.
>    *
> - * The function computes the next temperature boundaries by browsing
> - * the trip points. The result is the closer low and high trip points
> - * to the current temperature. These values are passed to the backend
> - * driver to let it set its own notification mechanism (usually an
> - * interrupt).
> + * When a valid thermal zone reference is passed, it will fetch its
> + * temperature and fill @temp.
>    *
> - * It does not return a value
> + * Return: On success returns 0, an error code otherwise
>    */
> -void thermal_zone_set_trips(struct thermal_zone_device *tz)
> +int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
> +{
> +	int ret;
> +
> +	mutex_lock(&tz->lock);
> +	ret = __thermal_zone_get_temp(tz, temp);
> +	mutex_unlock(&tz->lock);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(thermal_zone_get_temp);
> +
> +void __thermal_zone_set_trips(struct thermal_zone_device *tz)
>   {
>   	int low = -INT_MAX;
>   	int high = INT_MAX;
>   	int trip_temp, hysteresis;
>   	int i, ret;
>   
> -	mutex_lock(&tz->lock);
> -
> +	lockdep_assert_held(&tz->lock);
> +	
>   	if (!tz->ops->set_trips || !tz->ops->get_trip_hyst)
> -		goto exit;
> +		return;
>   
>   	for (i = 0; i < tz->num_trips; i++) {
>   		int trip_low;
> @@ -154,7 +151,7 @@ void thermal_zone_set_trips(struct thermal_zone_device *tz)
>   
>   	/* No need to change trip points */
>   	if (tz->prev_low_trip == low && tz->prev_high_trip == high)
> -		goto exit;
> +		return;
>   
>   	tz->prev_low_trip = low;
>   	tz->prev_high_trip = high;
> @@ -169,8 +166,24 @@ void thermal_zone_set_trips(struct thermal_zone_device *tz)
>   	ret = tz->ops->set_trips(tz, low, high);
>   	if (ret)
>   		dev_err(&tz->device, "Failed to set trips: %d\n", ret);
> +}
>   
> -exit:
> +/**
> + * thermal_zone_set_trips - Computes the next trip points for the driver
> + * @tz: a pointer to a thermal zone device structure
> + *
> + * The function computes the next temperature boundaries by browsing
> + * the trip points. The result is the closer low and high trip points
> + * to the current temperature. These values are passed to the backend
> + * driver to let it set its own notification mechanism (usually an
> + * interrupt).
> + *
> + * It does not return a value
> + */
> +void thermal_zone_set_trips(struct thermal_zone_device *tz)
> +{
> +	mutex_lock(&tz->lock);
> +	__thermal_zone_set_trips(tz);
>   	mutex_unlock(&tz->lock);
>   }
>   
> diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
> index 3c513561d346..f094f7cbc455 100644
> --- a/drivers/thermal/thermal_sysfs.c
> +++ b/drivers/thermal/thermal_sysfs.c
> @@ -49,7 +49,11 @@ static ssize_t
>   mode_show(struct device *dev, struct device_attribute *attr, char *buf)
>   {
>   	struct thermal_zone_device *tz = to_thermal_zone(dev);
> -	int enabled = thermal_zone_device_is_enabled(tz);
> +	int enabled;
> +
> +	mutex_lock(&tz->lock);
> +	enabled = thermal_zone_device_is_enabled(tz);
> +	mutex_unlock(&tz->lock);
>   
>   	return sprintf(buf, "%s\n", enabled ? "enabled" : "disabled");
>   }

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH] thermal/core: Fix lockdep_assert() warning
  2022-08-12 10:39     ` Marek Szyprowski
@ 2022-08-12 13:12       ` Daniel Lezcano
  2022-08-12 13:34         ` Marek Szyprowski
  2022-08-12 13:17       ` [PATCH 5/5] thermal/core: Move the mutex inside the thermal_zone_device_update() function Daniel Lezcano
  1 sibling, 1 reply; 20+ messages in thread
From: Daniel Lezcano @ 2022-08-12 13:12 UTC (permalink / raw)
  To: m.szyprowski
  Cc: rafael, krzk, rui.zhang, linux-pm, linux-kernel, amitk,
	linux-samsung-soc

The function thermal_zone_device_is_enabled() must be called with the
thermal zone lock held. In the resume path, it is called without.

As the thermal_zone_device_is_enabled() is also checked in
thermal_zone_device_update(), do the check in resume() function is
pointless, except for saving an extra initialization which does not
hurt if it is done in all the cases.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/thermal/thermal_core.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 50814009339d..dc8ff6a84df1 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1438,9 +1438,6 @@ static int thermal_pm_notify(struct notifier_block *nb,
 	case PM_POST_SUSPEND:
 		atomic_set(&in_suspend, 0);
 		list_for_each_entry(tz, &thermal_tz_list, node) {
-			if (!thermal_zone_device_is_enabled(tz))
-				continue;
-
 			thermal_zone_device_init(tz);
 			thermal_zone_device_update(tz,
 						   THERMAL_EVENT_UNSPECIFIED);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 5/5] thermal/core: Move the mutex inside the thermal_zone_device_update() function
  2022-08-12 10:39     ` Marek Szyprowski
  2022-08-12 13:12       ` [PATCH] thermal/core: Fix lockdep_assert() warning Daniel Lezcano
@ 2022-08-12 13:17       ` Daniel Lezcano
  1 sibling, 0 replies; 20+ messages in thread
From: Daniel Lezcano @ 2022-08-12 13:17 UTC (permalink / raw)
  To: Marek Szyprowski, rafael, Krzysztof Kozlowski
  Cc: rui.zhang, linux-pm, linux-kernel, Amit Kucheria,
	'Linux Samsung SOC'


Hi Marek,

thanks for reporting.

Are you ok to check if the patch I've sent fixes the issue on your 
platform ?


On 12/08/2022 12:39, Marek Szyprowski wrote:
> Hi Daniel,
> 
> On 05.08.2022 17:38, Daniel Lezcano wrote:
>> All the different calls inside the thermal_zone_device_update()
>> function take the mutex.
>>
>> The previous changes move the mutex out of the different functions,
>> like the throttling ops. Now that the mutexes are all at the same
>> level in the call stack for the thermal_zone_device_update() function,
>> they can be moved inside this one.
>>
>> That has the benefit of:
>>
>> 1. Simplify the code by not having a plethora of places where the lock is taken
>>
>> 2. Probably closes more race windows because releasing the lock from
>> one line to another can give the opportunity to the thermal zone to change
>> its state in the meantime. For example, the thermal zone can be
>> enabled right after checking it is disabled.
>>
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> 
> This patch landed recently in linux next-20220811 as commit ca48ad71717d
> ("thermal/core: Move the mutex inside the thermal_zone_device_update()
> function"). Unfortunately it triggers a warning on Samsung ARM/ARM64
> Exynos-based systems during the system suspend/resume cycle:
> 
> Restarting tasks ... done.
> random: crng reseeded on system resumption
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 1374 at drivers/thermal/thermal_core.c:452
> thermal_zone_device_is_enabled+0x58/0x5c
> Modules linked in: cmac bnep hci_uart btbcm btintel bluetooth s5p_csis
> s5p_fimc exynos4_is_common v4l2_fwnode v4l2_async ecdh_generic ecc
> s5p_mfc brcmfmac cfg80211 s5p_jpeg videobuf2_dma_contig videobuf2_memops
> brcmutil v4l2_mem2mem videobuf2_v4l2 videobuf2_common videodev mc
> CPU: 1 PID: 1374 Comm: rtcwake Not tainted 5.18.0-02136-gca48ad71717d #12560
> Hardware name: Samsung Exynos (Flattened Device Tree)
>    unwind_backtrace from show_stack+0x10/0x14
>    show_stack from dump_stack_lvl+0x58/0x70
>    dump_stack_lvl from __warn+0x230/0x234
>    __warn from warn_slowpath_fmt+0xac/0xb4
>    warn_slowpath_fmt from thermal_zone_device_is_enabled+0x58/0x5c
>    thermal_zone_device_is_enabled from thermal_pm_notify+0x84/0xe8
>    thermal_pm_notify from blocking_notifier_call_chain+0x6c/0x94
>    blocking_notifier_call_chain from pm_suspend+0x2e8/0x428
>    pm_suspend from state_store+0x68/0xc8
>    state_store from kernfs_fop_write_iter+0x108/0x1b0
>    kernfs_fop_write_iter from vfs_write+0x268/0x50c
>    vfs_write from ksys_write+0x54/0xc8
>    ksys_write from ret_fast_syscall+0x0/0x1c
> Exception stack(0xf0fa9fa8 to 0xf0fa9ff0)
> ...
> irq event stamp: 49341
> hardirqs last  enabled at (49349): [<c019d3f0>] __up_console_sem+0x50/0x60
> hardirqs last disabled at (49358): [<c019d3dc>] __up_console_sem+0x3c/0x60
> softirqs last  enabled at (49336): [<c0101808>] __do_softirq+0x4c8/0x5e8
> softirqs last disabled at (49331): [<c01305b8>] irq_exit+0x1cc/0x200
> ---[ end trace 0000000000000000 ]---
> PM: suspend exit
> 
> 
> It looks that either the exynos_thermal driver or the framework has to
> be somehow adjusted to avoid the above issue, but I didn't check the
> details in the code yet.
> 
> 
>> ---
>>    drivers/thermal/thermal_core.c    | 32 +++++--------
>>    drivers/thermal/thermal_core.h    |  2 +
>>    drivers/thermal/thermal_helpers.c | 75 ++++++++++++++++++-------------
>>    drivers/thermal/thermal_sysfs.c   |  6 ++-
>>    4 files changed, 62 insertions(+), 53 deletions(-)
>>
>> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
>> index 9d554f97e081..60110ac53e23 100644
>> --- a/drivers/thermal/thermal_core.c
>> +++ b/drivers/thermal/thermal_core.c
>> @@ -297,24 +297,18 @@ static void thermal_zone_device_set_polling(struct thermal_zone_device *tz,
>>    
>>    static void monitor_thermal_zone(struct thermal_zone_device *tz)
>>    {
>> -	mutex_lock(&tz->lock);
>> -
>>    	if (tz->mode != THERMAL_DEVICE_ENABLED)
>>    		thermal_zone_device_set_polling(tz, 0);
>>    	else if (tz->passive)
>>    		thermal_zone_device_set_polling(tz, tz->passive_delay_jiffies);
>>    	else if (tz->polling_delay_jiffies)
>>    		thermal_zone_device_set_polling(tz, tz->polling_delay_jiffies);
>> -
>> -	mutex_unlock(&tz->lock);
>>    }
>>    
>>    static void handle_non_critical_trips(struct thermal_zone_device *tz, int trip)
>>    {
>> -	mutex_lock(&tz->lock);
>>    	tz->governor ? tz->governor->throttle(tz, trip) :
>>    		       def_governor->throttle(tz, trip);
>> -	mutex_unlock(&tz->lock);
>>    }
>>    
>>    void thermal_zone_device_critical(struct thermal_zone_device *tz)
>> @@ -382,7 +376,7 @@ static void update_temperature(struct thermal_zone_device *tz)
>>    {
>>    	int temp, ret;
>>    
>> -	ret = thermal_zone_get_temp(tz, &temp);
>> +	ret = __thermal_zone_get_temp(tz, &temp);
>>    	if (ret) {
>>    		if (ret != -EAGAIN)
>>    			dev_warn(&tz->device,
>> @@ -391,10 +385,8 @@ static void update_temperature(struct thermal_zone_device *tz)
>>    		return;
>>    	}
>>    
>> -	mutex_lock(&tz->lock);
>>    	tz->last_temperature = tz->temperature;
>>    	tz->temperature = temp;
>> -	mutex_unlock(&tz->lock);
>>    
>>    	trace_thermal_temperature(tz);
>>    
>> @@ -457,15 +449,9 @@ EXPORT_SYMBOL_GPL(thermal_zone_device_disable);
>>    
>>    int thermal_zone_device_is_enabled(struct thermal_zone_device *tz)
>>    {
>> -	enum thermal_device_mode mode;
>> -
>> -	mutex_lock(&tz->lock);
>> -
>> -	mode = tz->mode;
>> +	lockdep_assert_held(&tz->lock);
>>    
>> -	mutex_unlock(&tz->lock);
>> -
>> -	return mode == THERMAL_DEVICE_ENABLED;
>> +	return tz->mode == THERMAL_DEVICE_ENABLED;
>>    }
>>    
>>    void thermal_zone_device_update(struct thermal_zone_device *tz,
>> @@ -473,9 +459,6 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
>>    {
>>    	int count;
>>    
>> -	if (!thermal_zone_device_is_enabled(tz))
>> -		return;
>> -
>>    	if (atomic_read(&in_suspend))
>>    		return;
>>    
>> @@ -483,9 +466,14 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
>>    		      "'get_temp' ops set\n", __func__))
>>    		return;
>>    
>> +	mutex_lock(&tz->lock);
>> +
>> +	if (!thermal_zone_device_is_enabled(tz))
>> +		goto out;
>> +
>>    	update_temperature(tz);
>>    
>> -	thermal_zone_set_trips(tz);
>> +	__thermal_zone_set_trips(tz);
>>    
>>    	tz->notify_event = event;
>>    
>> @@ -493,6 +481,8 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
>>    		handle_thermal_trip(tz, count);
>>    
>>    	monitor_thermal_zone(tz);
>> +out:
>> +	mutex_unlock(&tz->lock);
>>    }
>>    EXPORT_SYMBOL_GPL(thermal_zone_device_update);
>>    
>> diff --git a/drivers/thermal/thermal_core.h b/drivers/thermal/thermal_core.h
>> index 2241d2dce017..1571917bd3c8 100644
>> --- a/drivers/thermal/thermal_core.h
>> +++ b/drivers/thermal/thermal_core.h
>> @@ -112,6 +112,8 @@ int thermal_build_list_of_policies(char *buf);
>>    
>>    /* Helpers */
>>    void thermal_zone_set_trips(struct thermal_zone_device *tz);
>> +void __thermal_zone_set_trips(struct thermal_zone_device *tz);
>> +int __thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp);
>>    
>>    /* sysfs I/F */
>>    int thermal_zone_create_device_groups(struct thermal_zone_device *, int);
>> diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
>> index 690890f054a3..702c70bdca48 100644
>> --- a/drivers/thermal/thermal_helpers.c
>> +++ b/drivers/thermal/thermal_helpers.c
>> @@ -64,27 +64,17 @@ get_thermal_instance(struct thermal_zone_device *tz,
>>    }
>>    EXPORT_SYMBOL(get_thermal_instance);
>>    
>> -/**
>> - * thermal_zone_get_temp() - returns the temperature of a thermal zone
>> - * @tz: a valid pointer to a struct thermal_zone_device
>> - * @temp: a valid pointer to where to store the resulting temperature.
>> - *
>> - * When a valid thermal zone reference is passed, it will fetch its
>> - * temperature and fill @temp.
>> - *
>> - * Return: On success returns 0, an error code otherwise
>> - */
>> -int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
>> +int __thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
>>    {
>>    	int ret = -EINVAL;
>>    	int count;
>>    	int crit_temp = INT_MAX;
>>    	enum thermal_trip_type type;
>>    
>> +	lockdep_assert_held(&tz->lock);
>> +	
>>    	if (!tz || IS_ERR(tz) || !tz->ops->get_temp)
>> -		goto exit;
>> -
>> -	mutex_lock(&tz->lock);
>> +		return -EINVAL;
>>    
>>    	ret = tz->ops->get_temp(tz, temp);
>>    
>> @@ -107,35 +97,42 @@ int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
>>    			*temp = tz->emul_temperature;
>>    	}
>>    
>> -	mutex_unlock(&tz->lock);
>> -exit:
>>    	return ret;
>>    }
>> -EXPORT_SYMBOL_GPL(thermal_zone_get_temp);
>>    
>>    /**
>> - * thermal_zone_set_trips - Computes the next trip points for the driver
>> - * @tz: a pointer to a thermal zone device structure
>> + * thermal_zone_get_temp() - returns the temperature of a thermal zone
>> + * @tz: a valid pointer to a struct thermal_zone_device
>> + * @temp: a valid pointer to where to store the resulting temperature.
>>     *
>> - * The function computes the next temperature boundaries by browsing
>> - * the trip points. The result is the closer low and high trip points
>> - * to the current temperature. These values are passed to the backend
>> - * driver to let it set its own notification mechanism (usually an
>> - * interrupt).
>> + * When a valid thermal zone reference is passed, it will fetch its
>> + * temperature and fill @temp.
>>     *
>> - * It does not return a value
>> + * Return: On success returns 0, an error code otherwise
>>     */
>> -void thermal_zone_set_trips(struct thermal_zone_device *tz)
>> +int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
>> +{
>> +	int ret;
>> +
>> +	mutex_lock(&tz->lock);
>> +	ret = __thermal_zone_get_temp(tz, temp);
>> +	mutex_unlock(&tz->lock);
>> +
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL_GPL(thermal_zone_get_temp);
>> +
>> +void __thermal_zone_set_trips(struct thermal_zone_device *tz)
>>    {
>>    	int low = -INT_MAX;
>>    	int high = INT_MAX;
>>    	int trip_temp, hysteresis;
>>    	int i, ret;
>>    
>> -	mutex_lock(&tz->lock);
>> -
>> +	lockdep_assert_held(&tz->lock);
>> +	
>>    	if (!tz->ops->set_trips || !tz->ops->get_trip_hyst)
>> -		goto exit;
>> +		return;
>>    
>>    	for (i = 0; i < tz->num_trips; i++) {
>>    		int trip_low;
>> @@ -154,7 +151,7 @@ void thermal_zone_set_trips(struct thermal_zone_device *tz)
>>    
>>    	/* No need to change trip points */
>>    	if (tz->prev_low_trip == low && tz->prev_high_trip == high)
>> -		goto exit;
>> +		return;
>>    
>>    	tz->prev_low_trip = low;
>>    	tz->prev_high_trip = high;
>> @@ -169,8 +166,24 @@ void thermal_zone_set_trips(struct thermal_zone_device *tz)
>>    	ret = tz->ops->set_trips(tz, low, high);
>>    	if (ret)
>>    		dev_err(&tz->device, "Failed to set trips: %d\n", ret);
>> +}
>>    
>> -exit:
>> +/**
>> + * thermal_zone_set_trips - Computes the next trip points for the driver
>> + * @tz: a pointer to a thermal zone device structure
>> + *
>> + * The function computes the next temperature boundaries by browsing
>> + * the trip points. The result is the closer low and high trip points
>> + * to the current temperature. These values are passed to the backend
>> + * driver to let it set its own notification mechanism (usually an
>> + * interrupt).
>> + *
>> + * It does not return a value
>> + */
>> +void thermal_zone_set_trips(struct thermal_zone_device *tz)
>> +{
>> +	mutex_lock(&tz->lock);
>> +	__thermal_zone_set_trips(tz);
>>    	mutex_unlock(&tz->lock);
>>    }
>>    
>> diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
>> index 3c513561d346..f094f7cbc455 100644
>> --- a/drivers/thermal/thermal_sysfs.c
>> +++ b/drivers/thermal/thermal_sysfs.c
>> @@ -49,7 +49,11 @@ static ssize_t
>>    mode_show(struct device *dev, struct device_attribute *attr, char *buf)
>>    {
>>    	struct thermal_zone_device *tz = to_thermal_zone(dev);
>> -	int enabled = thermal_zone_device_is_enabled(tz);
>> +	int enabled;
>> +
>> +	mutex_lock(&tz->lock);
>> +	enabled = thermal_zone_device_is_enabled(tz);
>> +	mutex_unlock(&tz->lock);
>>    
>>    	return sprintf(buf, "%s\n", enabled ? "enabled" : "disabled");
>>    }
> 
> Best regards


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] thermal/core: Fix lockdep_assert() warning
  2022-08-12 13:12       ` [PATCH] thermal/core: Fix lockdep_assert() warning Daniel Lezcano
@ 2022-08-12 13:34         ` Marek Szyprowski
  2022-08-12 13:54           ` Daniel Lezcano
  0 siblings, 1 reply; 20+ messages in thread
From: Marek Szyprowski @ 2022-08-12 13:34 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: rafael, krzk, rui.zhang, linux-pm, linux-kernel, amitk,
	linux-samsung-soc

On 12.08.2022 15:12, Daniel Lezcano wrote:
> The function thermal_zone_device_is_enabled() must be called with the
> thermal zone lock held. In the resume path, it is called without.
>
> As the thermal_zone_device_is_enabled() is also checked in
> thermal_zone_device_update(), do the check in resume() function is
> pointless, except for saving an extra initialization which does not
> hurt if it is done in all the cases.
>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

This fixes the warning I've reported. Feel free to add:

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>

> ---
>   drivers/thermal/thermal_core.c | 3 ---
>   1 file changed, 3 deletions(-)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 50814009339d..dc8ff6a84df1 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -1438,9 +1438,6 @@ static int thermal_pm_notify(struct notifier_block *nb,
>   	case PM_POST_SUSPEND:
>   		atomic_set(&in_suspend, 0);
>   		list_for_each_entry(tz, &thermal_tz_list, node) {
> -			if (!thermal_zone_device_is_enabled(tz))
> -				continue;
> -
>   			thermal_zone_device_init(tz);
>   			thermal_zone_device_update(tz,
>   						   THERMAL_EVENT_UNSPECIFIED);

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] thermal/core: Fix lockdep_assert() warning
  2022-08-12 13:34         ` Marek Szyprowski
@ 2022-08-12 13:54           ` Daniel Lezcano
  2022-08-20 11:57             ` Rafael J. Wysocki
  0 siblings, 1 reply; 20+ messages in thread
From: Daniel Lezcano @ 2022-08-12 13:54 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: rafael, krzk, rui.zhang, linux-pm, linux-kernel, amitk,
	linux-samsung-soc

On 12/08/2022 15:34, Marek Szyprowski wrote:
> On 12.08.2022 15:12, Daniel Lezcano wrote:
>> The function thermal_zone_device_is_enabled() must be called with the
>> thermal zone lock held. In the resume path, it is called without.
>>
>> As the thermal_zone_device_is_enabled() is also checked in
>> thermal_zone_device_update(), do the check in resume() function is
>> pointless, except for saving an extra initialization which does not
>> hurt if it is done in all the cases.
>>
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> 
> This fixes the warning I've reported. Feel free to add:
> 
> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
> 
> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>

Great, thanks for testing


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] thermal/core: Fix lockdep_assert() warning
  2022-08-12 13:54           ` Daniel Lezcano
@ 2022-08-20 11:57             ` Rafael J. Wysocki
  0 siblings, 0 replies; 20+ messages in thread
From: Rafael J. Wysocki @ 2022-08-20 11:57 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Marek Szyprowski, Rafael J. Wysocki, Krzysztof Kozlowski, Zhang,
	Rui, Linux PM, Linux Kernel Mailing List, Amit Kucheria,
	Linux Samsung SoC

On Fri, Aug 12, 2022 at 3:54 PM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
> On 12/08/2022 15:34, Marek Szyprowski wrote:
> > On 12.08.2022 15:12, Daniel Lezcano wrote:
> >> The function thermal_zone_device_is_enabled() must be called with the
> >> thermal zone lock held. In the resume path, it is called without.
> >>
> >> As the thermal_zone_device_is_enabled() is also checked in
> >> thermal_zone_device_update(), do the check in resume() function is
> >> pointless, except for saving an extra initialization which does not
> >> hurt if it is done in all the cases.
> >>
> >> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> >
> > This fixes the warning I've reported. Feel free to add:
> >
> > Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >
> > Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
>
> Great, thanks for testing

Do you want me to apply this for -rc3?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [thermal: thermal/next] thermal/core: Move the mutex inside the thermal_zone_device_update() function
  2022-08-05 15:38 ` [PATCH 5/5] thermal/core: Move the mutex inside the thermal_zone_device_update() function Daniel Lezcano
       [not found]   ` <CGME20220812103956eucas1p1849443855b3537c3f5be2891fa50a50b@eucas1p1.samsung.com>
@ 2022-08-23 12:42   ` thermal-bot for Daniel Lezcano
  1 sibling, 0 replies; 20+ messages in thread
From: thermal-bot for Daniel Lezcano @ 2022-08-23 12:42 UTC (permalink / raw)
  To: linux-pm; +Cc: Daniel Lezcano, rui.zhang, amitk

The following commit has been merged into the thermal/next branch of thermal:

Commit-ID:     a930da9bf583b2add01fb0e086913664dadaffd0
Gitweb:        https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git//a930da9bf583b2add01fb0e086913664dadaffd0
Author:        Daniel Lezcano <daniel.lezcano@linaro.org>
AuthorDate:    Fri, 05 Aug 2022 17:38:34 +02:00
Committer:     Daniel Lezcano <daniel.lezcano@linaro.org>
CommitterDate: Wed, 17 Aug 2022 14:09:39 +02:00

thermal/core: Move the mutex inside the thermal_zone_device_update() function

All the different calls inside the thermal_zone_device_update()
function take the mutex.

The previous changes move the mutex out of the different functions,
like the throttling ops. Now that the mutexes are all at the same
level in the call stack for the thermal_zone_device_update() function,
they can be moved inside this one.

That has the benefit of:

1. Simplify the code by not having a plethora of places where the lock is taken

2. Probably closes more race windows because releasing the lock from
one line to another can give the opportunity to the thermal zone to change
its state in the meantime. For example, the thermal zone can be
enabled right after checking it is disabled.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220805153834.2510142-5-daniel.lezcano@linaro.org
---
 drivers/thermal/thermal_core.c    | 32 ++++---------
 drivers/thermal/thermal_core.h    |  2 +-
 drivers/thermal/thermal_helpers.c | 73 +++++++++++++++++-------------
 drivers/thermal/thermal_sysfs.c   |  6 +-
 4 files changed, 61 insertions(+), 52 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index fcac28d..4812170 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -297,24 +297,18 @@ static void thermal_zone_device_set_polling(struct thermal_zone_device *tz,
 
 static void monitor_thermal_zone(struct thermal_zone_device *tz)
 {
-	mutex_lock(&tz->lock);
-
 	if (tz->mode != THERMAL_DEVICE_ENABLED)
 		thermal_zone_device_set_polling(tz, 0);
 	else if (tz->passive)
 		thermal_zone_device_set_polling(tz, tz->passive_delay_jiffies);
 	else if (tz->polling_delay_jiffies)
 		thermal_zone_device_set_polling(tz, tz->polling_delay_jiffies);
-
-	mutex_unlock(&tz->lock);
 }
 
 static void handle_non_critical_trips(struct thermal_zone_device *tz, int trip)
 {
-	mutex_lock(&tz->lock);
 	tz->governor ? tz->governor->throttle(tz, trip) :
 		       def_governor->throttle(tz, trip);
-	mutex_unlock(&tz->lock);
 }
 
 void thermal_zone_device_critical(struct thermal_zone_device *tz)
@@ -382,7 +376,7 @@ static void update_temperature(struct thermal_zone_device *tz)
 {
 	int temp, ret;
 
-	ret = thermal_zone_get_temp(tz, &temp);
+	ret = __thermal_zone_get_temp(tz, &temp);
 	if (ret) {
 		if (ret != -EAGAIN)
 			dev_warn(&tz->device,
@@ -391,10 +385,8 @@ static void update_temperature(struct thermal_zone_device *tz)
 		return;
 	}
 
-	mutex_lock(&tz->lock);
 	tz->last_temperature = tz->temperature;
 	tz->temperature = temp;
-	mutex_unlock(&tz->lock);
 
 	trace_thermal_temperature(tz);
 
@@ -457,15 +449,9 @@ EXPORT_SYMBOL_GPL(thermal_zone_device_disable);
 
 int thermal_zone_device_is_enabled(struct thermal_zone_device *tz)
 {
-	enum thermal_device_mode mode;
-
-	mutex_lock(&tz->lock);
-
-	mode = tz->mode;
+	lockdep_assert_held(&tz->lock);
 
-	mutex_unlock(&tz->lock);
-
-	return mode == THERMAL_DEVICE_ENABLED;
+	return tz->mode == THERMAL_DEVICE_ENABLED;
 }
 
 void thermal_zone_device_update(struct thermal_zone_device *tz,
@@ -473,9 +459,6 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
 {
 	int count;
 
-	if (!thermal_zone_device_is_enabled(tz))
-		return;
-
 	if (atomic_read(&in_suspend))
 		return;
 
@@ -483,9 +466,14 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
 		      "'get_temp' ops set\n", __func__))
 		return;
 
+	mutex_lock(&tz->lock);
+
+	if (!thermal_zone_device_is_enabled(tz))
+		goto out;
+
 	update_temperature(tz);
 
-	thermal_zone_set_trips(tz);
+	__thermal_zone_set_trips(tz);
 
 	tz->notify_event = event;
 
@@ -493,6 +481,8 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
 		handle_thermal_trip(tz, count);
 
 	monitor_thermal_zone(tz);
+out:
+	mutex_unlock(&tz->lock);
 }
 EXPORT_SYMBOL_GPL(thermal_zone_device_update);
 
diff --git a/drivers/thermal/thermal_core.h b/drivers/thermal/thermal_core.h
index 2241d2d..1571917 100644
--- a/drivers/thermal/thermal_core.h
+++ b/drivers/thermal/thermal_core.h
@@ -112,6 +112,8 @@ int thermal_build_list_of_policies(char *buf);
 
 /* Helpers */
 void thermal_zone_set_trips(struct thermal_zone_device *tz);
+void __thermal_zone_set_trips(struct thermal_zone_device *tz);
+int __thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp);
 
 /* sysfs I/F */
 int thermal_zone_create_device_groups(struct thermal_zone_device *, int);
diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
index 690890f..c65cdce 100644
--- a/drivers/thermal/thermal_helpers.c
+++ b/drivers/thermal/thermal_helpers.c
@@ -64,27 +64,17 @@ get_thermal_instance(struct thermal_zone_device *tz,
 }
 EXPORT_SYMBOL(get_thermal_instance);
 
-/**
- * thermal_zone_get_temp() - returns the temperature of a thermal zone
- * @tz: a valid pointer to a struct thermal_zone_device
- * @temp: a valid pointer to where to store the resulting temperature.
- *
- * When a valid thermal zone reference is passed, it will fetch its
- * temperature and fill @temp.
- *
- * Return: On success returns 0, an error code otherwise
- */
-int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
+int __thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
 {
 	int ret = -EINVAL;
 	int count;
 	int crit_temp = INT_MAX;
 	enum thermal_trip_type type;
 
-	if (!tz || IS_ERR(tz) || !tz->ops->get_temp)
-		goto exit;
+	lockdep_assert_held(&tz->lock);
 
-	mutex_lock(&tz->lock);
+	if (!tz || IS_ERR(tz) || !tz->ops->get_temp)
+		return -EINVAL;
 
 	ret = tz->ops->get_temp(tz, temp);
 
@@ -107,35 +97,42 @@ int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
 			*temp = tz->emul_temperature;
 	}
 
-	mutex_unlock(&tz->lock);
-exit:
 	return ret;
 }
-EXPORT_SYMBOL_GPL(thermal_zone_get_temp);
 
 /**
- * thermal_zone_set_trips - Computes the next trip points for the driver
- * @tz: a pointer to a thermal zone device structure
+ * thermal_zone_get_temp() - returns the temperature of a thermal zone
+ * @tz: a valid pointer to a struct thermal_zone_device
+ * @temp: a valid pointer to where to store the resulting temperature.
  *
- * The function computes the next temperature boundaries by browsing
- * the trip points. The result is the closer low and high trip points
- * to the current temperature. These values are passed to the backend
- * driver to let it set its own notification mechanism (usually an
- * interrupt).
+ * When a valid thermal zone reference is passed, it will fetch its
+ * temperature and fill @temp.
  *
- * It does not return a value
+ * Return: On success returns 0, an error code otherwise
  */
-void thermal_zone_set_trips(struct thermal_zone_device *tz)
+int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
+{
+	int ret;
+
+	mutex_lock(&tz->lock);
+	ret = __thermal_zone_get_temp(tz, temp);
+	mutex_unlock(&tz->lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(thermal_zone_get_temp);
+
+void __thermal_zone_set_trips(struct thermal_zone_device *tz)
 {
 	int low = -INT_MAX;
 	int high = INT_MAX;
 	int trip_temp, hysteresis;
 	int i, ret;
 
-	mutex_lock(&tz->lock);
+	lockdep_assert_held(&tz->lock);
 
 	if (!tz->ops->set_trips || !tz->ops->get_trip_hyst)
-		goto exit;
+		return;
 
 	for (i = 0; i < tz->num_trips; i++) {
 		int trip_low;
@@ -154,7 +151,7 @@ void thermal_zone_set_trips(struct thermal_zone_device *tz)
 
 	/* No need to change trip points */
 	if (tz->prev_low_trip == low && tz->prev_high_trip == high)
-		goto exit;
+		return;
 
 	tz->prev_low_trip = low;
 	tz->prev_high_trip = high;
@@ -169,8 +166,24 @@ void thermal_zone_set_trips(struct thermal_zone_device *tz)
 	ret = tz->ops->set_trips(tz, low, high);
 	if (ret)
 		dev_err(&tz->device, "Failed to set trips: %d\n", ret);
+}
 
-exit:
+/**
+ * thermal_zone_set_trips - Computes the next trip points for the driver
+ * @tz: a pointer to a thermal zone device structure
+ *
+ * The function computes the next temperature boundaries by browsing
+ * the trip points. The result is the closer low and high trip points
+ * to the current temperature. These values are passed to the backend
+ * driver to let it set its own notification mechanism (usually an
+ * interrupt).
+ *
+ * It does not return a value
+ */
+void thermal_zone_set_trips(struct thermal_zone_device *tz)
+{
+	mutex_lock(&tz->lock);
+	__thermal_zone_set_trips(tz);
 	mutex_unlock(&tz->lock);
 }
 
diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
index 0f82010..78c5841 100644
--- a/drivers/thermal/thermal_sysfs.c
+++ b/drivers/thermal/thermal_sysfs.c
@@ -49,7 +49,11 @@ static ssize_t
 mode_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
 	struct thermal_zone_device *tz = to_thermal_zone(dev);
-	int enabled = thermal_zone_device_is_enabled(tz);
+	int enabled;
+
+	mutex_lock(&tz->lock);
+	enabled = thermal_zone_device_is_enabled(tz);
+	mutex_unlock(&tz->lock);
 
 	return sprintf(buf, "%s\n", enabled ? "enabled" : "disabled");
 }

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [thermal: thermal/next] thermal/core: Move the thermal zone lock out of the governors
  2022-08-05 15:38 ` [PATCH 4/5] thermal/core: Move the thermal zone lock out of the governors Daniel Lezcano
@ 2022-08-23 12:42   ` thermal-bot for Daniel Lezcano
  2022-10-04 14:14   ` [PATCH 4/5] " Guenter Roeck
  1 sibling, 0 replies; 20+ messages in thread
From: thermal-bot for Daniel Lezcano @ 2022-08-23 12:42 UTC (permalink / raw)
  To: linux-pm; +Cc: Daniel Lezcano, rui.zhang, amitk

The following commit has been merged into the thermal/next branch of thermal:

Commit-ID:     670a5e356cb6dfc61b87b599eba483af6a3a99ad
Gitweb:        https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git//670a5e356cb6dfc61b87b599eba483af6a3a99ad
Author:        Daniel Lezcano <daniel.lezcano@linaro.org>
AuthorDate:    Fri, 05 Aug 2022 17:38:33 +02:00
Committer:     Daniel Lezcano <daniel.lezcano@linaro.org>
CommitterDate: Wed, 17 Aug 2022 14:09:39 +02:00

thermal/core: Move the thermal zone lock out of the governors

All the governors throttling ops are taking/releasing the lock at the
beginning and the end of the function.

We can move the mutex to the throttling call site instead.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220805153834.2510142-4-daniel.lezcano@linaro.org
---
 drivers/thermal/gov_bang_bang.c       |  4 +---
 drivers/thermal/gov_fair_share.c      |  4 +---
 drivers/thermal/gov_power_allocator.c | 16 ++++++----------
 drivers/thermal/gov_step_wise.c       |  4 +---
 drivers/thermal/thermal_core.c        |  2 ++
 5 files changed, 11 insertions(+), 19 deletions(-)

diff --git a/drivers/thermal/gov_bang_bang.c b/drivers/thermal/gov_bang_bang.c
index f0bff2e..a08bbe3 100644
--- a/drivers/thermal/gov_bang_bang.c
+++ b/drivers/thermal/gov_bang_bang.c
@@ -96,15 +96,13 @@ static int bang_bang_control(struct thermal_zone_device *tz, int trip)
 {
 	struct thermal_instance *instance;
 
-	mutex_lock(&tz->lock);
+	lockdep_assert_held(&tz->lock);
 
 	thermal_zone_trip_update(tz, trip);
 
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node)
 		thermal_cdev_update(instance->cdev);
 
-	mutex_unlock(&tz->lock);
-
 	return 0;
 }
 
diff --git a/drivers/thermal/gov_fair_share.c b/drivers/thermal/gov_fair_share.c
index 5d5ddd6..a4ee466 100644
--- a/drivers/thermal/gov_fair_share.c
+++ b/drivers/thermal/gov_fair_share.c
@@ -82,7 +82,7 @@ static int fair_share_throttle(struct thermal_zone_device *tz, int trip)
 	int total_instance = 0;
 	int cur_trip_level = get_trip_level(tz);
 
-	mutex_lock(&tz->lock);
+	lockdep_assert_held(&tz->lock);
 
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
 		if (instance->trip != trip)
@@ -112,8 +112,6 @@ static int fair_share_throttle(struct thermal_zone_device *tz, int trip)
 		mutex_unlock(&cdev->lock);
 	}
 
-	mutex_unlock(&tz->lock);
-
 	return 0;
 }
 
diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
index d3aca23..2d1aeab 100644
--- a/drivers/thermal/gov_power_allocator.c
+++ b/drivers/thermal/gov_power_allocator.c
@@ -697,19 +697,19 @@ static void power_allocator_unbind(struct thermal_zone_device *tz)
 
 static int power_allocator_throttle(struct thermal_zone_device *tz, int trip)
 {
-	int ret = 0;
+	int ret;
 	int switch_on_temp, control_temp;
 	struct power_allocator_params *params = tz->governor_data;
 	bool update;
 
-	mutex_lock(&tz->lock);
+	lockdep_assert_held(&tz->lock);
 
 	/*
 	 * We get called for every trip point but we only need to do
 	 * our calculations once
 	 */
 	if (trip != params->trip_max_desired_temperature)
-		goto out;
+		return 0;
 
 	ret = tz->ops->get_trip_temp(tz, params->trip_switch_on,
 				     &switch_on_temp);
@@ -718,7 +718,7 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip)
 		tz->passive = 0;
 		reset_pid_controller(params);
 		allow_maximum_power(tz, update);
-		goto out;
+		return 0;
 	}
 
 	tz->passive = 1;
@@ -729,14 +729,10 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip)
 		dev_warn(&tz->device,
 			 "Failed to get the maximum desired temperature: %d\n",
 			 ret);
-		goto out;
+		return ret;
 	}
 
-	ret = allocate_power(tz, control_temp);
-
-	mutex_unlock(&tz->lock);
-out:
-	return ret;
+	return allocate_power(tz, control_temp);
 }
 
 static struct thermal_governor thermal_gov_power_allocator = {
diff --git a/drivers/thermal/gov_step_wise.c b/drivers/thermal/gov_step_wise.c
index 597a0eb..cdd3354 100644
--- a/drivers/thermal/gov_step_wise.c
+++ b/drivers/thermal/gov_step_wise.c
@@ -160,15 +160,13 @@ static int step_wise_throttle(struct thermal_zone_device *tz, int trip)
 {
 	struct thermal_instance *instance;
 
-	mutex_lock(&tz->lock);
+	lockdep_assert_held(&tz->lock);
 
 	thermal_zone_trip_update(tz, trip);
 
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node)
 		thermal_cdev_update(instance->cdev);
 
-	mutex_unlock(&tz->lock);
-
 	return 0;
 }
 
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 5408e92..fcac28d 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -311,8 +311,10 @@ static void monitor_thermal_zone(struct thermal_zone_device *tz)
 
 static void handle_non_critical_trips(struct thermal_zone_device *tz, int trip)
 {
+	mutex_lock(&tz->lock);
 	tz->governor ? tz->governor->throttle(tz, trip) :
 		       def_governor->throttle(tz, trip);
+	mutex_unlock(&tz->lock);
 }
 
 void thermal_zone_device_critical(struct thermal_zone_device *tz)

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [thermal: thermal/next] thermal/governors: Group the thermal zone lock inside the throttle function
  2022-08-05 15:38 ` [PATCH 3/5] thermal/governors: Group the thermal zone lock inside the throttle function Daniel Lezcano
@ 2022-08-23 12:42   ` thermal-bot for Daniel Lezcano
  0 siblings, 0 replies; 20+ messages in thread
From: thermal-bot for Daniel Lezcano @ 2022-08-23 12:42 UTC (permalink / raw)
  To: linux-pm; +Cc: Daniel Lezcano, rui.zhang, amitk

The following commit has been merged into the thermal/next branch of thermal:

Commit-ID:     63561fe36b094729d3d4d274bafaa030b39e89f6
Gitweb:        https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git//63561fe36b094729d3d4d274bafaa030b39e89f6
Author:        Daniel Lezcano <daniel.lezcano@linaro.org>
AuthorDate:    Fri, 05 Aug 2022 17:38:32 +02:00
Committer:     Daniel Lezcano <daniel.lezcano@linaro.org>
CommitterDate: Wed, 17 Aug 2022 14:09:39 +02:00

thermal/governors: Group the thermal zone lock inside the throttle function

The thermal zone lock is taken in the different places in the
throttling path.

At the first glance it does not hurt to move them at the beginning and
the end of the 'throttle' function. That will allow a consolidation of
the lock in the next following changes.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220805153834.2510142-3-daniel.lezcano@linaro.org
---
 drivers/thermal/gov_bang_bang.c       |  8 +-----
 drivers/thermal/gov_fair_share.c      |  1 +-
 drivers/thermal/gov_power_allocator.c | 34 +++++++++++---------------
 drivers/thermal/gov_step_wise.c       |  8 +-----
 4 files changed, 20 insertions(+), 31 deletions(-)

diff --git a/drivers/thermal/gov_bang_bang.c b/drivers/thermal/gov_bang_bang.c
index 991a1c5..f0bff2e 100644
--- a/drivers/thermal/gov_bang_bang.c
+++ b/drivers/thermal/gov_bang_bang.c
@@ -31,8 +31,6 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 				trip, trip_temp, tz->temperature,
 				trip_hyst);
 
-	mutex_lock(&tz->lock);
-
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
 		if (instance->trip != trip)
 			continue;
@@ -65,8 +63,6 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 		instance->cdev->updated = false; /* cdev needs update */
 		mutex_unlock(&instance->cdev->lock);
 	}
-
-	mutex_unlock(&tz->lock);
 }
 
 /**
@@ -100,10 +96,10 @@ static int bang_bang_control(struct thermal_zone_device *tz, int trip)
 {
 	struct thermal_instance *instance;
 
-	thermal_zone_trip_update(tz, trip);
-
 	mutex_lock(&tz->lock);
 
+	thermal_zone_trip_update(tz, trip);
+
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node)
 		thermal_cdev_update(instance->cdev);
 
diff --git a/drivers/thermal/gov_fair_share.c b/drivers/thermal/gov_fair_share.c
index 6a2abcf..5d5ddd6 100644
--- a/drivers/thermal/gov_fair_share.c
+++ b/drivers/thermal/gov_fair_share.c
@@ -113,6 +113,7 @@ static int fair_share_throttle(struct thermal_zone_device *tz, int trip)
 	}
 
 	mutex_unlock(&tz->lock);
+
 	return 0;
 }
 
diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
index 1d50524..d3aca23 100644
--- a/drivers/thermal/gov_power_allocator.c
+++ b/drivers/thermal/gov_power_allocator.c
@@ -392,8 +392,6 @@ static int allocate_power(struct thermal_zone_device *tz,
 	int i, num_actors, total_weight, ret = 0;
 	int trip_max_desired_temperature = params->trip_max_desired_temperature;
 
-	mutex_lock(&tz->lock);
-
 	num_actors = 0;
 	total_weight = 0;
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
@@ -404,10 +402,8 @@ static int allocate_power(struct thermal_zone_device *tz,
 		}
 	}
 
-	if (!num_actors) {
-		ret = -ENODEV;
-		goto unlock;
-	}
+	if (!num_actors)
+		return -ENODEV;
 
 	/*
 	 * We need to allocate five arrays of the same size:
@@ -421,10 +417,8 @@ static int allocate_power(struct thermal_zone_device *tz,
 	BUILD_BUG_ON(sizeof(*req_power) != sizeof(*extra_actor_power));
 	BUILD_BUG_ON(sizeof(*req_power) != sizeof(*weighted_req_power));
 	req_power = kcalloc(num_actors * 5, sizeof(*req_power), GFP_KERNEL);
-	if (!req_power) {
-		ret = -ENOMEM;
-		goto unlock;
-	}
+	if (!req_power)
+		return -ENOMEM;
 
 	max_power = &req_power[num_actors];
 	granted_power = &req_power[2 * num_actors];
@@ -496,8 +490,6 @@ static int allocate_power(struct thermal_zone_device *tz,
 				      control_temp - tz->temperature);
 
 	kfree(req_power);
-unlock:
-	mutex_unlock(&tz->lock);
 
 	return ret;
 }
@@ -576,7 +568,6 @@ static void allow_maximum_power(struct thermal_zone_device *tz, bool update)
 	struct power_allocator_params *params = tz->governor_data;
 	u32 req_power;
 
-	mutex_lock(&tz->lock);
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
 		struct thermal_cooling_device *cdev = instance->cdev;
 
@@ -598,7 +589,6 @@ static void allow_maximum_power(struct thermal_zone_device *tz, bool update)
 
 		mutex_unlock(&instance->cdev->lock);
 	}
-	mutex_unlock(&tz->lock);
 }
 
 /**
@@ -707,17 +697,19 @@ static void power_allocator_unbind(struct thermal_zone_device *tz)
 
 static int power_allocator_throttle(struct thermal_zone_device *tz, int trip)
 {
-	int ret;
+	int ret = 0;
 	int switch_on_temp, control_temp;
 	struct power_allocator_params *params = tz->governor_data;
 	bool update;
 
+	mutex_lock(&tz->lock);
+
 	/*
 	 * We get called for every trip point but we only need to do
 	 * our calculations once
 	 */
 	if (trip != params->trip_max_desired_temperature)
-		return 0;
+		goto out;
 
 	ret = tz->ops->get_trip_temp(tz, params->trip_switch_on,
 				     &switch_on_temp);
@@ -726,7 +718,7 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip)
 		tz->passive = 0;
 		reset_pid_controller(params);
 		allow_maximum_power(tz, update);
-		return 0;
+		goto out;
 	}
 
 	tz->passive = 1;
@@ -737,10 +729,14 @@ static int power_allocator_throttle(struct thermal_zone_device *tz, int trip)
 		dev_warn(&tz->device,
 			 "Failed to get the maximum desired temperature: %d\n",
 			 ret);
-		return ret;
+		goto out;
 	}
 
-	return allocate_power(tz, control_temp);
+	ret = allocate_power(tz, control_temp);
+
+	mutex_unlock(&tz->lock);
+out:
+	return ret;
 }
 
 static struct thermal_governor thermal_gov_power_allocator = {
diff --git a/drivers/thermal/gov_step_wise.c b/drivers/thermal/gov_step_wise.c
index 9729b46..597a0eb 100644
--- a/drivers/thermal/gov_step_wise.c
+++ b/drivers/thermal/gov_step_wise.c
@@ -117,8 +117,6 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 	dev_dbg(&tz->device, "Trip%d[type=%d,temp=%d]:trend=%d,throttle=%d\n",
 				trip, trip_type, trip_temp, trend, throttle);
 
-	mutex_lock(&tz->lock);
-
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node) {
 		if (instance->trip != trip)
 			continue;
@@ -145,8 +143,6 @@ static void thermal_zone_trip_update(struct thermal_zone_device *tz, int trip)
 		instance->cdev->updated = false; /* cdev needs update */
 		mutex_unlock(&instance->cdev->lock);
 	}
-
-	mutex_unlock(&tz->lock);
 }
 
 /**
@@ -164,10 +160,10 @@ static int step_wise_throttle(struct thermal_zone_device *tz, int trip)
 {
 	struct thermal_instance *instance;
 
-	thermal_zone_trip_update(tz, trip);
-
 	mutex_lock(&tz->lock);
 
+	thermal_zone_trip_update(tz, trip);
+
 	list_for_each_entry(instance, &tz->thermal_instances, tz_node)
 		thermal_cdev_update(instance->cdev);
 

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [thermal: thermal/next] thermal/core: Rework the monitoring a bit
  2022-08-05 15:38 ` [PATCH 2/5] thermal/core: Rework the monitoring a bit Daniel Lezcano
@ 2022-08-23 12:42   ` thermal-bot for Daniel Lezcano
  0 siblings, 0 replies; 20+ messages in thread
From: thermal-bot for Daniel Lezcano @ 2022-08-23 12:42 UTC (permalink / raw)
  To: linux-pm; +Cc: Daniel Lezcano, rui.zhang, amitk

The following commit has been merged into the thermal/next branch of thermal:

Commit-ID:     15a73839e3ced8d418e6c34548f5e2789f9da619
Gitweb:        https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git//15a73839e3ced8d418e6c34548f5e2789f9da619
Author:        Daniel Lezcano <daniel.lezcano@linaro.org>
AuthorDate:    Fri, 05 Aug 2022 17:38:31 +02:00
Committer:     Daniel Lezcano <daniel.lezcano@linaro.org>
CommitterDate: Wed, 17 Aug 2022 14:09:39 +02:00

thermal/core: Rework the monitoring a bit

The should_stop_polling() function wraps the function
thermal_zone_device_is_enabled().

The monitor_thermal_zone() function checks if the thermal zone is
enabled via the should_stop_polling() function.

However, the instant after checking the thermal zone is enabled, this
one can be disabled, so even if that reduces the race window, it does
not prevent that and the monitoring can be set again with the thermal
zone disabled.

For this reason, the function should_stop_polling() is replaced by a
direct check of the thermal zone mode with the mutex locks held, that
prevents the situation described above.

As the semantic is clear with the thermal_zone_is_enabled() function,
we can remove the should_stop_polling() function and replace the check
with the former function.

While at it, reorder the checks to improve the readability of the
monitor_thermal_zone() function.

In the future, the thermal_zone_device_disable() and the
thermal_zone_device_enable() functions should unset / set the polling
timer directly instead of relying on the next
thermal_zone_device_update() call to do that. That will make a
synchronous thermal zone mode change but the locking scheme should be
double checked for that which out of the scope of this change.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220805153834.2510142-2-daniel.lezcano@linaro.org
---
 drivers/thermal/thermal_core.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index ea41ea6..5408e92 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -295,25 +295,16 @@ static void thermal_zone_device_set_polling(struct thermal_zone_device *tz,
 		cancel_delayed_work(&tz->poll_queue);
 }
 
-static inline bool should_stop_polling(struct thermal_zone_device *tz)
-{
-	return !thermal_zone_device_is_enabled(tz);
-}
-
 static void monitor_thermal_zone(struct thermal_zone_device *tz)
 {
-	bool stop;
-
-	stop = should_stop_polling(tz);
-
 	mutex_lock(&tz->lock);
 
-	if (!stop && tz->passive)
+	if (tz->mode != THERMAL_DEVICE_ENABLED)
+		thermal_zone_device_set_polling(tz, 0);
+	else if (tz->passive)
 		thermal_zone_device_set_polling(tz, tz->passive_delay_jiffies);
-	else if (!stop && tz->polling_delay_jiffies)
+	else if (tz->polling_delay_jiffies)
 		thermal_zone_device_set_polling(tz, tz->polling_delay_jiffies);
-	else
-		thermal_zone_device_set_polling(tz, 0);
 
 	mutex_unlock(&tz->lock);
 }
@@ -480,7 +471,7 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
 {
 	int count;
 
-	if (should_stop_polling(tz))
+	if (!thermal_zone_device_is_enabled(tz))
 		return;
 
 	if (atomic_read(&in_suspend))

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [thermal: thermal/next] thermal/core: Rearm the monitoring only one time
  2022-08-05 15:38 [PATCH 1/5] thermal/core: Rearm the monitoring only one time Daniel Lezcano
                   ` (4 preceding siblings ...)
  2022-08-05 16:29 ` [PATCH 1/5] thermal/core: Rearm the monitoring only one time Rafael J. Wysocki
@ 2022-08-23 12:42 ` thermal-bot for Daniel Lezcano
  5 siblings, 0 replies; 20+ messages in thread
From: thermal-bot for Daniel Lezcano @ 2022-08-23 12:42 UTC (permalink / raw)
  To: linux-pm; +Cc: Daniel Lezcano, rui.zhang, amitk

The following commit has been merged into the thermal/next branch of thermal:

Commit-ID:     9662756a9a1c34b3ee606dcddfda6a457f89b07f
Gitweb:        https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git//9662756a9a1c34b3ee606dcddfda6a457f89b07f
Author:        Daniel Lezcano <daniel.lezcano@linaro.org>
AuthorDate:    Fri, 05 Aug 2022 17:38:30 +02:00
Committer:     Daniel Lezcano <daniel.lezcano@linaro.org>
CommitterDate: Wed, 17 Aug 2022 14:09:39 +02:00

thermal/core: Rearm the monitoring only one time

The current code calls monitor_thermal_zone() inside the
handle_thermal_trip() function. But this one is called in a loop for
each trip point which means the monitoring is rearmed several times
for nothing (assuming there could be several passive and active trip
points).

Move the monitor_thermal_zone() function out of the
handle_thermal_trip() function and after the thermal trip loop, so the
timer will be disabled or rearmed one time.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20220805153834.2510142-1-daniel.lezcano@linaro.org
---
 drivers/thermal/thermal_core.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 69447ab..ea41ea6 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -383,11 +383,6 @@ static void handle_thermal_trip(struct thermal_zone_device *tz, int trip)
 		handle_critical_trips(tz, trip, trip_temp, type);
 	else
 		handle_non_critical_trips(tz, trip);
-	/*
-	 * Alright, we handled this trip successfully.
-	 * So, start monitoring again.
-	 */
-	monitor_thermal_zone(tz);
 }
 
 static void update_temperature(struct thermal_zone_device *tz)
@@ -503,6 +498,8 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
 
 	for (count = 0; count < tz->num_trips; count++)
 		handle_thermal_trip(tz, count);
+
+	monitor_thermal_zone(tz);
 }
 EXPORT_SYMBOL_GPL(thermal_zone_device_update);
 

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/5] thermal/core: Move the thermal zone lock out of the governors
  2022-08-05 15:38 ` [PATCH 4/5] thermal/core: Move the thermal zone lock out of the governors Daniel Lezcano
  2022-08-23 12:42   ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
@ 2022-10-04 14:14   ` Guenter Roeck
  2022-10-04 14:21     ` Daniel Lezcano
  1 sibling, 1 reply; 20+ messages in thread
From: Guenter Roeck @ 2022-10-04 14:14 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: rafael, rui.zhang, linux-pm, linux-kernel, Amit Kucheria, Lukasz Luba

On Fri, Aug 05, 2022 at 05:38:33PM +0200, Daniel Lezcano wrote:
> All the governors throttling ops are taking/releasing the lock at the
> beginning and the end of the function.
> 
> We can move the mutex to the throttling call site instead.
> 
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
>  drivers/thermal/gov_bang_bang.c       |  4 +---
>  drivers/thermal/gov_fair_share.c      |  4 +---
>  drivers/thermal/gov_power_allocator.c | 16 ++++++----------
>  drivers/thermal/gov_step_wise.c       |  4 +---
>  drivers/thermal/thermal_core.c        |  2 ++

This doesn't drop the lock from drivers/thermal/gov_user_space.c.
Is that on purpose ?

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/5] thermal/core: Move the thermal zone lock out of the governors
  2022-10-04 14:14   ` [PATCH 4/5] " Guenter Roeck
@ 2022-10-04 14:21     ` Daniel Lezcano
  0 siblings, 0 replies; 20+ messages in thread
From: Daniel Lezcano @ 2022-10-04 14:21 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: rafael, rui.zhang, linux-pm, linux-kernel, Amit Kucheria, Lukasz Luba

On 04/10/2022 16:14, Guenter Roeck wrote:
> On Fri, Aug 05, 2022 at 05:38:33PM +0200, Daniel Lezcano wrote:
>> All the governors throttling ops are taking/releasing the lock at the
>> beginning and the end of the function.
>>
>> We can move the mutex to the throttling call site instead.
>>
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>> ---
>>   drivers/thermal/gov_bang_bang.c       |  4 +---
>>   drivers/thermal/gov_fair_share.c      |  4 +---
>>   drivers/thermal/gov_power_allocator.c | 16 ++++++----------
>>   drivers/thermal/gov_step_wise.c       |  4 +---
>>   drivers/thermal/thermal_core.c        |  2 ++
> 
> This doesn't drop the lock from drivers/thermal/gov_user_space.c.
> Is that on purpose ?

No, it is an oversight. It was fixed after by Rafael

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2022-10-04 14:21 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-05 15:38 [PATCH 1/5] thermal/core: Rearm the monitoring only one time Daniel Lezcano
2022-08-05 15:38 ` [PATCH 2/5] thermal/core: Rework the monitoring a bit Daniel Lezcano
2022-08-23 12:42   ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-08-05 15:38 ` [PATCH 3/5] thermal/governors: Group the thermal zone lock inside the throttle function Daniel Lezcano
2022-08-23 12:42   ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-08-05 15:38 ` [PATCH 4/5] thermal/core: Move the thermal zone lock out of the governors Daniel Lezcano
2022-08-23 12:42   ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-10-04 14:14   ` [PATCH 4/5] " Guenter Roeck
2022-10-04 14:21     ` Daniel Lezcano
2022-08-05 15:38 ` [PATCH 5/5] thermal/core: Move the mutex inside the thermal_zone_device_update() function Daniel Lezcano
     [not found]   ` <CGME20220812103956eucas1p1849443855b3537c3f5be2891fa50a50b@eucas1p1.samsung.com>
2022-08-12 10:39     ` Marek Szyprowski
2022-08-12 13:12       ` [PATCH] thermal/core: Fix lockdep_assert() warning Daniel Lezcano
2022-08-12 13:34         ` Marek Szyprowski
2022-08-12 13:54           ` Daniel Lezcano
2022-08-20 11:57             ` Rafael J. Wysocki
2022-08-12 13:17       ` [PATCH 5/5] thermal/core: Move the mutex inside the thermal_zone_device_update() function Daniel Lezcano
2022-08-23 12:42   ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano
2022-08-05 16:29 ` [PATCH 1/5] thermal/core: Rearm the monitoring only one time Rafael J. Wysocki
2022-08-05 16:37   ` Daniel Lezcano
2022-08-23 12:42 ` [thermal: thermal/next] " thermal-bot for Daniel Lezcano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.