All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/9] thermal/core: Protect thermal device operations against removal
@ 2022-10-17 13:09 Guenter Roeck
  2022-10-17 13:09 ` [PATCH 1/9] thermal/core: Destroy thermal zone device mutex in release function Guenter Roeck
                   ` (9 more replies)
  0 siblings, 10 replies; 25+ messages in thread
From: Guenter Roeck @ 2022-10-17 13:09 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel,
	Guenter Roeck

Accesses to thermal zones, and with it to thermal zone device operations,
are still possible after the thermal zone device has been unregistered.
For example, thermal_zone_get_temp() can be called from temp_show()
in thermal_sysfs.c if the sysfs attribute was opened before the thermal
device was unregistered. This is problematic and may result in crashes
since the operations data structure and the underlying code may be gone
when the calls are made.

The following series solves the problem by protecting accesses to thermal
device operations with the thermal device mutex, and by verifying that the
thermal device is still registered after the mutex has been acquired.

This was previously sent as RFC/RFT as single patch [1]. The code was reworked
to match thermal subsystem changes made between v6.0 and v6.1, and it was
split into several patches to simplify review.

[1] https://lore.kernel.org/linux-pm/20221004033936.1047691-1-linux@roeck-us.net/

----------------------------------------------------------------
Guenter Roeck (9):
      thermal/core: Destroy thermal zone device mutex in release function
      thermal/core: Delete device under thermal device zone lock
      thermal/core: Ensure that thermal device is registered in thermal_zone_get_temp
      thermal/core: Move parameter validation from __thermal_zone_get_temp to thermal_zone_get_temp
      thermal/core: Introduce locked version of thermal_zone_device_update
      thermal/core: Protect hwmon accesses to thermal operations with thermal zone mutex
      thermal/core: Protect sysfs accesses to thermal operations with thermal zone mutex
      thermal/core: Remove thermal_zone_set_trips()
      thermal/core: Protect thermal device operations against thermal device removal

 drivers/thermal/thermal_core.c    | 76 ++++++++++++++++++++++++--------------
 drivers/thermal/thermal_core.h    |  3 +-
 drivers/thermal/thermal_helpers.c | 65 ++++++++++++++++++++++-----------
 drivers/thermal/thermal_hwmon.c   | 14 +++++--
 drivers/thermal/thermal_sysfs.c   | 77 +++++++++++++++++++++++++++++++++------
 5 files changed, 169 insertions(+), 66 deletions(-)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/9] thermal/core: Destroy thermal zone device mutex in release function
  2022-10-17 13:09 [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
@ 2022-10-17 13:09 ` Guenter Roeck
  2022-10-17 13:09 ` [PATCH 2/9] thermal/core: Delete device under thermal device zone lock Guenter Roeck
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 25+ messages in thread
From: Guenter Roeck @ 2022-10-17 13:09 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel,
	Guenter Roeck

Accesses to thermal zones, and with it the thermal zone device mutex,
are still possible after the thermal zone device has been unregistered.
For example, thermal_zone_get_temp() can be called from temp_show()
in thermal_sysfs.c if the sysfs attribute was opened before the thermal
device was unregistered.

Move the call to mutex_destroy from thermal_zone_device_unregister()
to thermal_release() to ensure that it is only destroyed after it is
guaranteed to be no longer accessed.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/thermal/thermal_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 117eeaf7dd24..f548875a016d 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -759,6 +759,7 @@ static void thermal_release(struct device *dev)
 		     sizeof("thermal_zone") - 1)) {
 		tz = to_thermal_zone(dev);
 		thermal_zone_destroy_device_groups(tz);
+		mutex_destroy(&tz->lock);
 		kfree(tz);
 	} else if (!strncmp(dev_name(dev), "cooling_device",
 			    sizeof("cooling_device") - 1)) {
@@ -1390,7 +1391,6 @@ void thermal_zone_device_unregister(struct thermal_zone_device *tz)
 	thermal_remove_hwmon_sysfs(tz);
 	ida_free(&thermal_tz_ida, tz->id);
 	ida_destroy(&tz->ida);
-	mutex_destroy(&tz->lock);
 	device_unregister(&tz->device);
 
 	thermal_notify_tz_delete(tz_id);
-- 
2.36.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/9] thermal/core: Delete device under thermal device zone lock
  2022-10-17 13:09 [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
  2022-10-17 13:09 ` [PATCH 1/9] thermal/core: Destroy thermal zone device mutex in release function Guenter Roeck
@ 2022-10-17 13:09 ` Guenter Roeck
  2022-10-17 13:09 ` [PATCH 3/9] thermal/core: Ensure that thermal device is registered in thermal_zone_get_temp Guenter Roeck
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 25+ messages in thread
From: Guenter Roeck @ 2022-10-17 13:09 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel,
	Guenter Roeck

Thermal device attributes may still be opened after unregistering
the thermal zone and deleting the thermal device.

Currently there is no protection against accessing thermal device
operations after unregistering a thermal zone. To enable adding
such protection, protect the device delete operation with the
thermal zone device mutex. This requires splitting the call to
device_unregister() into its components, device_del() and put_device().
Only the first call can be executed under mutex protection, since
put_device() may result in releasing the thermal zone device memory.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/thermal/thermal_core.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index f548875a016d..562ece8d16aa 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -1391,7 +1391,12 @@ void thermal_zone_device_unregister(struct thermal_zone_device *tz)
 	thermal_remove_hwmon_sysfs(tz);
 	ida_free(&thermal_tz_ida, tz->id);
 	ida_destroy(&tz->ida);
-	device_unregister(&tz->device);
+
+	mutex_lock(&tz->lock);
+	device_del(&tz->device);
+	mutex_unlock(&tz->lock);
+
+	put_device(&tz->device);
 
 	thermal_notify_tz_delete(tz_id);
 }
-- 
2.36.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/9] thermal/core: Ensure that thermal device is registered in thermal_zone_get_temp
  2022-10-17 13:09 [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
  2022-10-17 13:09 ` [PATCH 1/9] thermal/core: Destroy thermal zone device mutex in release function Guenter Roeck
  2022-10-17 13:09 ` [PATCH 2/9] thermal/core: Delete device under thermal device zone lock Guenter Roeck
@ 2022-10-17 13:09 ` Guenter Roeck
  2022-11-09 19:07   ` Rafael J. Wysocki
  2022-10-17 13:09 ` [PATCH 4/9] thermal/core: Move parameter validation from __thermal_zone_get_temp to thermal_zone_get_temp Guenter Roeck
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: Guenter Roeck @ 2022-10-17 13:09 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel,
	Guenter Roeck

Calls to thermal_zone_get_temp() are not protected against thermal zone
device removal. As result, it is possible that the thermal zone operations
callbacks are no longer valid when thermal_zone_get_temp() is called.
This may result in crashes such as

BUG: unable to handle page fault for address: ffffffffc04ef420
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
PGD 5d60e067 P4D 5d60e067 PUD 5d610067 PMD 110197067 PTE 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 3209 Comm: cat Tainted: G        W         5.10.136-19389-g615abc6eb807 #1 02df41ac0b12f3a64f4b34245188d8875bb3bce1
Hardware name: Google Coral/Coral, BIOS Google_Coral.10068.92.0 11/27/2018
RIP: 0010:thermal_zone_get_temp+0x26/0x73
Code: 89 c3 eb d3 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 53 48 85 ff 74 50 48 89 fb 48 81 ff 00 f0 ff ff 77 44 48 8b 83 98 03 00 00 <48> 83 78 10 00 74 36 49 89 f6 4c 8d bb d8 03 00 00 4c 89 ff e8 9f
RSP: 0018:ffffb3758138fd38 EFLAGS: 00010287
RAX: ffffffffc04ef410 RBX: ffff98f14d7fb000 RCX: 0000000000000000
RDX: ffff98f17cf90000 RSI: ffffb3758138fd64 RDI: ffff98f14d7fb000
RBP: ffffb3758138fd50 R08: 0000000000001000 R09: ffff98f17cf90000
R10: 0000000000000000 R11: ffffffff8dacad28 R12: 0000000000001000
R13: ffff98f1793a7d80 R14: ffff98f143231708 R15: ffff98f14d7fb018
FS:  00007ec166097800(0000) GS:ffff98f1bbd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffc04ef420 CR3: 000000010ee9a000 CR4: 00000000003506e0
Call Trace:
 temp_show+0x31/0x68
 dev_attr_show+0x1d/0x4f
 sysfs_kf_seq_show+0x92/0x107
 seq_read_iter+0xf5/0x3f2
 vfs_read+0x205/0x379
 __x64_sys_read+0x7c/0xe2
 do_syscall_64+0x43/0x55
 entry_SYSCALL_64_after_hwframe+0x61/0xc6

if a thermal device is removed while accesses to its device attributes
are ongoing.

The problem is exposed by code in iwl_op_mode_mvm_start(), which registers
a thermal zone device only to unregister it shortly afterwards if an
unrelated failure is encountered while accessing the hardware.

Check if the thermal zone device is registered after acquiring the
thermal zone device mutex to ensure this does not happen.

The code was tested by triggering the failure in iwl_op_mode_mvm_start()
on purpose. Without this patch, the kernel crashes reliably. The crash
is no longer observed after applying this and the preceding patches.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/thermal/thermal_helpers.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
index c65cdce8f856..3bac0b7a4c62 100644
--- a/drivers/thermal/thermal_helpers.c
+++ b/drivers/thermal/thermal_helpers.c
@@ -115,7 +115,12 @@ int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
 	int ret;
 
 	mutex_lock(&tz->lock);
+	if (!device_is_registered(&tz->device)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
 	ret = __thermal_zone_get_temp(tz, temp);
+unlock:
 	mutex_unlock(&tz->lock);
 
 	return ret;
-- 
2.36.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/9] thermal/core: Move parameter validation from __thermal_zone_get_temp to thermal_zone_get_temp
  2022-10-17 13:09 [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
                   ` (2 preceding siblings ...)
  2022-10-17 13:09 ` [PATCH 3/9] thermal/core: Ensure that thermal device is registered in thermal_zone_get_temp Guenter Roeck
@ 2022-10-17 13:09 ` Guenter Roeck
  2022-11-09 19:12   ` Rafael J. Wysocki
  2022-10-17 13:09 ` [PATCH 5/9] thermal/core: Introduce locked version of thermal_zone_device_update Guenter Roeck
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: Guenter Roeck @ 2022-10-17 13:09 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel,
	Guenter Roeck

All callers of __thermal_zone_get_temp() already validated the
thermal zone parameters. Move validation to thermal_zone_get_temp()
where it is actually needed.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/thermal/thermal_helpers.c | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
index 3bac0b7a4c62..0161d5fb1cf2 100644
--- a/drivers/thermal/thermal_helpers.c
+++ b/drivers/thermal/thermal_helpers.c
@@ -64,6 +64,20 @@ get_thermal_instance(struct thermal_zone_device *tz,
 }
 EXPORT_SYMBOL(get_thermal_instance);
 
+/**
+ * __thermal_zone_get_temp() - returns the temperature of a thermal zone
+ * @tz: a valid pointer to a struct thermal_zone_device
+ * @temp: a valid pointer to where to store the resulting temperature.
+ *
+ * When a valid thermal zone reference is passed, it will fetch its
+ * temperature and fill @temp.
+ *
+ * Both tz and tz->ops must be valid pointers when calling this function,
+ * and the tz->ops->get_temp callback must be provided.
+ * The function must be called under tz->lock.
+ *
+ * Return: On success returns 0, an error code otherwise
+ */
 int __thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
 {
 	int ret = -EINVAL;
@@ -73,9 +87,6 @@ int __thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
 
 	lockdep_assert_held(&tz->lock);
 
-	if (!tz || IS_ERR(tz) || !tz->ops->get_temp)
-		return -EINVAL;
-
 	ret = tz->ops->get_temp(tz, temp);
 
 	if (IS_ENABLED(CONFIG_THERMAL_EMULATION) && tz->emul_temperature) {
@@ -114,7 +125,16 @@ int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
 {
 	int ret;
 
+	if (!tz || IS_ERR(tz))
+		return -EINVAL;
+
 	mutex_lock(&tz->lock);
+
+	if (!tz->ops->get_temp) {
+		ret = -EINVAL;
+		goto unlock;
+	}
+
 	if (!device_is_registered(&tz->device)) {
 		ret = -ENODEV;
 		goto unlock;
-- 
2.36.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 5/9] thermal/core: Introduce locked version of thermal_zone_device_update
  2022-10-17 13:09 [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
                   ` (3 preceding siblings ...)
  2022-10-17 13:09 ` [PATCH 4/9] thermal/core: Move parameter validation from __thermal_zone_get_temp to thermal_zone_get_temp Guenter Roeck
@ 2022-10-17 13:09 ` Guenter Roeck
  2022-11-09 19:15   ` Rafael J. Wysocki
  2022-10-17 13:09 ` [PATCH 6/9] thermal/core: Protect hwmon accesses to thermal operations with thermal zone mutex Guenter Roeck
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: Guenter Roeck @ 2022-10-17 13:09 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel,
	Guenter Roeck

In thermal_zone_device_set_mode(), the thermal zone mutex is released only
to be reacquired in the subsequent call to thermal_zone_device_update().

Introduce __thermal_zone_device_update() as locked version of
thermal_zone_device_update() and call it from
thermal_zone_device_set_mode() without releasing the lock to avoid
the extra release/acuire sequence.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/thermal/thermal_core.c | 57 ++++++++++++++++++----------------
 1 file changed, 31 insertions(+), 26 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 562ece8d16aa..9facd9c5b70f 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -403,6 +403,34 @@ static void thermal_zone_device_init(struct thermal_zone_device *tz)
 		pos->initialized = false;
 }
 
+static void __thermal_zone_device_update(struct thermal_zone_device *tz,
+					 enum thermal_notify_event event)
+{
+	int count;
+
+	if (atomic_read(&in_suspend))
+		return;
+
+	if (WARN_ONCE(!tz->ops->get_temp,
+		      "'%s' must not be called without 'get_temp' ops set\n",
+		      __func__))
+		return;
+
+	if (!thermal_zone_device_is_enabled(tz))
+		return;
+
+	update_temperature(tz);
+
+	__thermal_zone_set_trips(tz);
+
+	tz->notify_event = event;
+
+	for (count = 0; count < tz->num_trips; count++)
+		handle_thermal_trip(tz, count);
+
+	monitor_thermal_zone(tz);
+}
+
 static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
 					enum thermal_device_mode mode)
 {
@@ -423,9 +451,9 @@ static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
 	if (!ret)
 		tz->mode = mode;
 
-	mutex_unlock(&tz->lock);
+	__thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
 
-	thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
+	mutex_unlock(&tz->lock);
 
 	if (mode == THERMAL_DEVICE_ENABLED)
 		thermal_notify_tz_enable(tz->id);
@@ -457,31 +485,8 @@ int thermal_zone_device_is_enabled(struct thermal_zone_device *tz)
 void thermal_zone_device_update(struct thermal_zone_device *tz,
 				enum thermal_notify_event event)
 {
-	int count;
-
-	if (atomic_read(&in_suspend))
-		return;
-
-	if (WARN_ONCE(!tz->ops->get_temp, "'%s' must not be called without "
-		      "'get_temp' ops set\n", __func__))
-		return;
-
 	mutex_lock(&tz->lock);
-
-	if (!thermal_zone_device_is_enabled(tz))
-		goto out;
-
-	update_temperature(tz);
-
-	__thermal_zone_set_trips(tz);
-
-	tz->notify_event = event;
-
-	for (count = 0; count < tz->num_trips; count++)
-		handle_thermal_trip(tz, count);
-
-	monitor_thermal_zone(tz);
-out:
+	__thermal_zone_device_update(tz, event);
 	mutex_unlock(&tz->lock);
 }
 EXPORT_SYMBOL_GPL(thermal_zone_device_update);
-- 
2.36.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 6/9] thermal/core: Protect hwmon accesses to thermal operations with thermal zone mutex
  2022-10-17 13:09 [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
                   ` (4 preceding siblings ...)
  2022-10-17 13:09 ` [PATCH 5/9] thermal/core: Introduce locked version of thermal_zone_device_update Guenter Roeck
@ 2022-10-17 13:09 ` Guenter Roeck
  2022-11-09 19:19   ` Rafael J. Wysocki
  2022-10-17 13:09 ` [PATCH 7/9] thermal/core: Protect sysfs " Guenter Roeck
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: Guenter Roeck @ 2022-10-17 13:09 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel,
	Guenter Roeck

In preparation to protecting access to thermal operations against thermal
zone device removal, protect hwmon accesses to thermal zone operations
with the thermal zone mutex. After acquiring the mutex, ensure that the
thermal zone device is registered before proceeding.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/thermal/thermal_hwmon.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/thermal_hwmon.c b/drivers/thermal/thermal_hwmon.c
index f53f4ceb6a5d..33bfbaed4236 100644
--- a/drivers/thermal/thermal_hwmon.c
+++ b/drivers/thermal/thermal_hwmon.c
@@ -77,11 +77,19 @@ temp_crit_show(struct device *dev, struct device_attribute *attr, char *buf)
 	int temperature;
 	int ret;
 
+	mutex_lock(&tz->lock);
+
+	if (!device_is_registered(&tz->device)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
+
 	ret = tz->ops->get_crit_temp(tz, &temperature);
-	if (ret)
-		return ret;
 
-	return sprintf(buf, "%d\n", temperature);
+unlock:
+	mutex_unlock(&tz->lock);
+
+	return ret ? ret : sprintf(buf, "%d\n", temperature);
 }
 
 
-- 
2.36.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 7/9] thermal/core: Protect sysfs accesses to thermal operations with thermal zone mutex
  2022-10-17 13:09 [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
                   ` (5 preceding siblings ...)
  2022-10-17 13:09 ` [PATCH 6/9] thermal/core: Protect hwmon accesses to thermal operations with thermal zone mutex Guenter Roeck
@ 2022-10-17 13:09 ` Guenter Roeck
  2022-11-09 19:26   ` Rafael J. Wysocki
  2022-10-17 13:09 ` [PATCH 8/9] thermal/core: Remove thermal_zone_set_trips() Guenter Roeck
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: Guenter Roeck @ 2022-10-17 13:09 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel,
	Guenter Roeck

Protect access to thermal operations against thermal zone removal by
acquiring the thermal zone device mutex. After acquiring the mutex, check
if the thermal zone device is registered and abort the operation if not.

With this change, we can call __thermal_zone_device_update() instead of
thermal_zone_device_update() from trip_point_temp_store() and from
emul_temp_store(). Similar, we can call __thermal_zone_set_trips() instead
of thermal_zone_set_trips() from trip_point_hyst_store().

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/thermal/thermal_core.c  |  4 +-
 drivers/thermal/thermal_core.h  |  2 +
 drivers/thermal/thermal_sysfs.c | 77 ++++++++++++++++++++++++++++-----
 3 files changed, 69 insertions(+), 14 deletions(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index 9facd9c5b70f..b8e3b262b2bd 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -403,8 +403,8 @@ static void thermal_zone_device_init(struct thermal_zone_device *tz)
 		pos->initialized = false;
 }
 
-static void __thermal_zone_device_update(struct thermal_zone_device *tz,
-					 enum thermal_notify_event event)
+void __thermal_zone_device_update(struct thermal_zone_device *tz,
+				  enum thermal_notify_event event)
 {
 	int count;
 
diff --git a/drivers/thermal/thermal_core.h b/drivers/thermal/thermal_core.h
index 1571917bd3c8..7b51b9a22e7e 100644
--- a/drivers/thermal/thermal_core.h
+++ b/drivers/thermal/thermal_core.h
@@ -109,6 +109,8 @@ int thermal_register_governor(struct thermal_governor *);
 void thermal_unregister_governor(struct thermal_governor *);
 int thermal_zone_device_set_policy(struct thermal_zone_device *, char *);
 int thermal_build_list_of_policies(char *buf);
+void __thermal_zone_device_update(struct thermal_zone_device *tz,
+				  enum thermal_notify_event event);
 
 /* Helpers */
 void thermal_zone_set_trips(struct thermal_zone_device *tz);
diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
index ec495c7dff03..68607e6070e8 100644
--- a/drivers/thermal/thermal_sysfs.c
+++ b/drivers/thermal/thermal_sysfs.c
@@ -92,7 +92,15 @@ trip_point_type_show(struct device *dev, struct device_attribute *attr,
 	if (sscanf(attr->attr.name, "trip_point_%d_type", &trip) != 1)
 		return -EINVAL;
 
+	mutex_lock(&tz->lock);
+
+	if (!device_is_registered(dev)) {
+		mutex_unlock(&tz->lock);
+		return -ENODEV;
+	}
+
 	result = tz->ops->get_trip_type(tz, trip, &type);
+	mutex_unlock(&tz->lock);
 	if (result)
 		return result;
 
@@ -128,10 +136,17 @@ trip_point_temp_store(struct device *dev, struct device_attribute *attr,
 	if (kstrtoint(buf, 10, &temperature))
 		return -EINVAL;
 
+	mutex_lock(&tz->lock);
+
+	if (!device_is_registered(dev)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
+
 	if (tz->ops->set_trip_temp) {
 		ret = tz->ops->set_trip_temp(tz, trip, temperature);
 		if (ret)
-			return ret;
+			goto unlock;
 	}
 
 	if (tz->trips)
@@ -140,18 +155,21 @@ trip_point_temp_store(struct device *dev, struct device_attribute *attr,
 	if (tz->ops->get_trip_hyst) {
 		ret = tz->ops->get_trip_hyst(tz, trip, &hyst);
 		if (ret)
-			return ret;
+			goto unlock;
 	}
 
 	ret = tz->ops->get_trip_type(tz, trip, &type);
 	if (ret)
-		return ret;
+		goto unlock;
 
 	thermal_notify_tz_trip_change(tz->id, trip, type, temperature, hyst);
 
-	thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
+	__thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
 
-	return count;
+unlock:
+	mutex_unlock(&tz->lock);
+
+	return ret ? ret : count;
 }
 
 static ssize_t
@@ -168,12 +186,19 @@ trip_point_temp_show(struct device *dev, struct device_attribute *attr,
 	if (sscanf(attr->attr.name, "trip_point_%d_temp", &trip) != 1)
 		return -EINVAL;
 
+	mutex_lock(&tz->lock);
+
+	if (!device_is_registered(dev)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
+
 	ret = tz->ops->get_trip_temp(tz, trip, &temperature);
 
-	if (ret)
-		return ret;
+unlock:
+	mutex_unlock(&tz->lock);
 
-	return sprintf(buf, "%d\n", temperature);
+	return ret ? ret : sprintf(buf, "%d\n", temperature);
 }
 
 static ssize_t
@@ -193,6 +218,13 @@ trip_point_hyst_store(struct device *dev, struct device_attribute *attr,
 	if (kstrtoint(buf, 10, &temperature))
 		return -EINVAL;
 
+	mutex_lock(&tz->lock);
+
+	if (!device_is_registered(dev)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
+
 	/*
 	 * We are not doing any check on the 'temperature' value
 	 * here. The driver implementing 'set_trip_hyst' has to
@@ -201,7 +233,10 @@ trip_point_hyst_store(struct device *dev, struct device_attribute *attr,
 	ret = tz->ops->set_trip_hyst(tz, trip, temperature);
 
 	if (!ret)
-		thermal_zone_set_trips(tz);
+		__thermal_zone_set_trips(tz);
+
+unlock:
+	mutex_unlock(&tz->lock);
 
 	return ret ? ret : count;
 }
@@ -220,8 +255,18 @@ trip_point_hyst_show(struct device *dev, struct device_attribute *attr,
 	if (sscanf(attr->attr.name, "trip_point_%d_hyst", &trip) != 1)
 		return -EINVAL;
 
+	mutex_lock(&tz->lock);
+
+	if (!device_is_registered(dev)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
+
 	ret = tz->ops->get_trip_hyst(tz, trip, &temperature);
 
+unlock:
+	mutex_unlock(&tz->lock);
+
 	return ret ? ret : sprintf(buf, "%d\n", temperature);
 }
 
@@ -269,16 +314,24 @@ emul_temp_store(struct device *dev, struct device_attribute *attr,
 	if (kstrtoint(buf, 10, &temperature))
 		return -EINVAL;
 
+	mutex_lock(&tz->lock);
+
+	if (!device_is_registered(dev)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
+
 	if (!tz->ops->set_emul_temp) {
-		mutex_lock(&tz->lock);
 		tz->emul_temperature = temperature;
-		mutex_unlock(&tz->lock);
 	} else {
 		ret = tz->ops->set_emul_temp(tz, temperature);
 	}
 
 	if (!ret)
-		thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
+		__thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
+
+unlock:
+	mutex_unlock(&tz->lock);
 
 	return ret ? ret : count;
 }
-- 
2.36.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 8/9] thermal/core: Remove thermal_zone_set_trips()
  2022-10-17 13:09 [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
                   ` (6 preceding siblings ...)
  2022-10-17 13:09 ` [PATCH 7/9] thermal/core: Protect sysfs " Guenter Roeck
@ 2022-10-17 13:09 ` Guenter Roeck
  2022-10-17 13:09 ` [PATCH 9/9] thermal/core: Protect thermal device operations against thermal device removal Guenter Roeck
  2022-11-02 18:50 ` [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
  9 siblings, 0 replies; 25+ messages in thread
From: Guenter Roeck @ 2022-10-17 13:09 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel,
	Guenter Roeck

Since no callers of thermal_zone_set_trips() are left, remove the function.
Document __thermal_zone_set_trips() instead. Explicitly state that the
thermal zone lock must be held when calling the function, and that the
pointer to the thermal zone must be valid.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/thermal/thermal_core.h    |  1 -
 drivers/thermal/thermal_helpers.c | 34 ++++++++++++++-----------------
 2 files changed, 15 insertions(+), 20 deletions(-)

diff --git a/drivers/thermal/thermal_core.h b/drivers/thermal/thermal_core.h
index 7b51b9a22e7e..b834cb273429 100644
--- a/drivers/thermal/thermal_core.h
+++ b/drivers/thermal/thermal_core.h
@@ -113,7 +113,6 @@ void __thermal_zone_device_update(struct thermal_zone_device *tz,
 				  enum thermal_notify_event event);
 
 /* Helpers */
-void thermal_zone_set_trips(struct thermal_zone_device *tz);
 void __thermal_zone_set_trips(struct thermal_zone_device *tz);
 int __thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp);
 
diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
index 0161d5fb1cf2..02b353c8d9aa 100644
--- a/drivers/thermal/thermal_helpers.c
+++ b/drivers/thermal/thermal_helpers.c
@@ -147,6 +147,21 @@ int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
 }
 EXPORT_SYMBOL_GPL(thermal_zone_get_temp);
 
+/**
+ * __thermal_zone_set_trips - Computes the next trip points for the driver
+ * @tz: a pointer to a thermal zone device structure
+ *
+ * The function computes the next temperature boundaries by browsing
+ * the trip points. The result is the closer low and high trip points
+ * to the current temperature. These values are passed to the backend
+ * driver to let it set its own notification mechanism (usually an
+ * interrupt).
+ *
+ * This function must be called with tz->lock held. Both tz and tz->ops
+ * must be valid pointers.
+ *
+ * It does not return a value
+ */
 void __thermal_zone_set_trips(struct thermal_zone_device *tz)
 {
 	int low = -INT_MAX;
@@ -193,25 +208,6 @@ void __thermal_zone_set_trips(struct thermal_zone_device *tz)
 		dev_err(&tz->device, "Failed to set trips: %d\n", ret);
 }
 
-/**
- * thermal_zone_set_trips - Computes the next trip points for the driver
- * @tz: a pointer to a thermal zone device structure
- *
- * The function computes the next temperature boundaries by browsing
- * the trip points. The result is the closer low and high trip points
- * to the current temperature. These values are passed to the backend
- * driver to let it set its own notification mechanism (usually an
- * interrupt).
- *
- * It does not return a value
- */
-void thermal_zone_set_trips(struct thermal_zone_device *tz)
-{
-	mutex_lock(&tz->lock);
-	__thermal_zone_set_trips(tz);
-	mutex_unlock(&tz->lock);
-}
-
 static void thermal_cdev_set_cur_state(struct thermal_cooling_device *cdev,
 				       int target)
 {
-- 
2.36.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 9/9] thermal/core: Protect thermal device operations against thermal device removal
  2022-10-17 13:09 [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
                   ` (7 preceding siblings ...)
  2022-10-17 13:09 ` [PATCH 8/9] thermal/core: Remove thermal_zone_set_trips() Guenter Roeck
@ 2022-10-17 13:09 ` Guenter Roeck
  2022-11-02 18:50 ` [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
  9 siblings, 0 replies; 25+ messages in thread
From: Guenter Roeck @ 2022-10-17 13:09 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel,
	Guenter Roeck

Thermal device operations may be called after thermal zone device removal.
After thermal zone device removal, thermal zond device operations must
no longer be called. To prevent such calls from happening, ensure that
the thermal device is registered before executing any thermal device
operations.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/thermal/thermal_core.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index b8e3b262b2bd..aa0107f11c98 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -203,6 +203,9 @@ int thermal_zone_device_set_policy(struct thermal_zone_device *tz,
 	mutex_lock(&thermal_governor_lock);
 	mutex_lock(&tz->lock);
 
+	if (!device_is_registered(&tz->device))
+		goto exit;
+
 	gov = __find_governor(strim(policy));
 	if (!gov)
 		goto exit;
@@ -445,6 +448,12 @@ static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
 		return ret;
 	}
 
+	if (!device_is_registered(&tz->device)) {
+		mutex_unlock(&tz->lock);
+
+		return -ENODEV;
+	}
+
 	if (tz->ops->change_mode)
 		ret = tz->ops->change_mode(tz, mode);
 
@@ -486,7 +495,8 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
 				enum thermal_notify_event event)
 {
 	mutex_lock(&tz->lock);
-	__thermal_zone_device_update(tz, event);
+	if (device_is_registered(&tz->device))
+		__thermal_zone_device_update(tz, event);
 	mutex_unlock(&tz->lock);
 }
 EXPORT_SYMBOL_GPL(thermal_zone_device_update);
-- 
2.36.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/9] thermal/core: Protect thermal device operations against removal
  2022-10-17 13:09 [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
                   ` (8 preceding siblings ...)
  2022-10-17 13:09 ` [PATCH 9/9] thermal/core: Protect thermal device operations against thermal device removal Guenter Roeck
@ 2022-11-02 18:50 ` Guenter Roeck
  2022-11-02 18:55   ` Daniel Lezcano
  2022-11-09 19:30   ` Rafael J. Wysocki
  9 siblings, 2 replies; 25+ messages in thread
From: Guenter Roeck @ 2022-11-02 18:50 UTC (permalink / raw)
  To: Rafael J . Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel

Hi,

On Mon, Oct 17, 2022 at 06:09:01AM -0700, Guenter Roeck wrote:
> Accesses to thermal zones, and with it to thermal zone device operations,
> are still possible after the thermal zone device has been unregistered.
> For example, thermal_zone_get_temp() can be called from temp_show()
> in thermal_sysfs.c if the sysfs attribute was opened before the thermal
> device was unregistered. This is problematic and may result in crashes
> since the operations data structure and the underlying code may be gone
> when the calls are made.
> 
> The following series solves the problem by protecting accesses to thermal
> device operations with the thermal device mutex, and by verifying that the
> thermal device is still registered after the mutex has been acquired.
> 
> This was previously sent as RFC/RFT as single patch [1]. The code was reworked
> to match thermal subsystem changes made between v6.0 and v6.1, and it was
> split into several patches to simplify review.
> 

Any thoughts / comments / feedback on this series ?

Thanks,
Guenter

> [1] https://lore.kernel.org/linux-pm/20221004033936.1047691-1-linux@roeck-us.net/
> 
> ----------------------------------------------------------------
> Guenter Roeck (9):
>       thermal/core: Destroy thermal zone device mutex in release function
>       thermal/core: Delete device under thermal device zone lock
>       thermal/core: Ensure that thermal device is registered in thermal_zone_get_temp
>       thermal/core: Move parameter validation from __thermal_zone_get_temp to thermal_zone_get_temp
>       thermal/core: Introduce locked version of thermal_zone_device_update
>       thermal/core: Protect hwmon accesses to thermal operations with thermal zone mutex
>       thermal/core: Protect sysfs accesses to thermal operations with thermal zone mutex
>       thermal/core: Remove thermal_zone_set_trips()
>       thermal/core: Protect thermal device operations against thermal device removal
> 
>  drivers/thermal/thermal_core.c    | 76 ++++++++++++++++++++++++--------------
>  drivers/thermal/thermal_core.h    |  3 +-
>  drivers/thermal/thermal_helpers.c | 65 ++++++++++++++++++++++-----------
>  drivers/thermal/thermal_hwmon.c   | 14 +++++--
>  drivers/thermal/thermal_sysfs.c   | 77 +++++++++++++++++++++++++++++++++------
>  5 files changed, 169 insertions(+), 66 deletions(-)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/9] thermal/core: Protect thermal device operations against removal
  2022-11-02 18:50 ` [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
@ 2022-11-02 18:55   ` Daniel Lezcano
  2022-11-09 19:30   ` Rafael J. Wysocki
  1 sibling, 0 replies; 25+ messages in thread
From: Daniel Lezcano @ 2022-11-02 18:55 UTC (permalink / raw)
  To: Guenter Roeck, Rafael J . Wysocki
  Cc: Amit Kucheria, Zhang Rui, linux-pm, linux-kernel

On 02/11/2022 19:50, Guenter Roeck wrote:
> Hi,
> 
> On Mon, Oct 17, 2022 at 06:09:01AM -0700, Guenter Roeck wrote:
>> Accesses to thermal zones, and with it to thermal zone device operations,
>> are still possible after the thermal zone device has been unregistered.
>> For example, thermal_zone_get_temp() can be called from temp_show()
>> in thermal_sysfs.c if the sysfs attribute was opened before the thermal
>> device was unregistered. This is problematic and may result in crashes
>> since the operations data structure and the underlying code may be gone
>> when the calls are made.
>>
>> The following series solves the problem by protecting accesses to thermal
>> device operations with the thermal device mutex, and by verifying that the
>> thermal device is still registered after the mutex has been acquired.
>>
>> This was previously sent as RFC/RFT as single patch [1]. The code was reworked
>> to match thermal subsystem changes made between v6.0 and v6.1, and it was
>> split into several patches to simplify review.
>>
> 
> Any thoughts / comments / feedback on this series ?

I'm out of the office ATM, I'll have a look on the series in a few days.


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/9] thermal/core: Ensure that thermal device is registered in thermal_zone_get_temp
  2022-10-17 13:09 ` [PATCH 3/9] thermal/core: Ensure that thermal device is registered in thermal_zone_get_temp Guenter Roeck
@ 2022-11-09 19:07   ` Rafael J. Wysocki
  2022-11-10 14:13     ` Guenter Roeck
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2022-11-09 19:07 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J . Wysocki, Daniel Lezcano, Amit Kucheria, Zhang Rui,
	linux-pm, linux-kernel

On Mon, Oct 17, 2022 at 3:09 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> Calls to thermal_zone_get_temp() are not protected against thermal zone
> device removal. As result, it is possible that the thermal zone operations
> callbacks are no longer valid when thermal_zone_get_temp() is called.
> This may result in crashes such as
>
> BUG: unable to handle page fault for address: ffffffffc04ef420
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
> PGD 5d60e067 P4D 5d60e067 PUD 5d610067 PMD 110197067 PTE 0
> Oops: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 1 PID: 3209 Comm: cat Tainted: G        W         5.10.136-19389-g615abc6eb807 #1 02df41ac0b12f3a64f4b34245188d8875bb3bce1
> Hardware name: Google Coral/Coral, BIOS Google_Coral.10068.92.0 11/27/2018
> RIP: 0010:thermal_zone_get_temp+0x26/0x73
> Code: 89 c3 eb d3 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 53 48 85 ff 74 50 48 89 fb 48 81 ff 00 f0 ff ff 77 44 48 8b 83 98 03 00 00 <48> 83 78 10 00 74 36 49 89 f6 4c 8d bb d8 03 00 00 4c 89 ff e8 9f
> RSP: 0018:ffffb3758138fd38 EFLAGS: 00010287
> RAX: ffffffffc04ef410 RBX: ffff98f14d7fb000 RCX: 0000000000000000
> RDX: ffff98f17cf90000 RSI: ffffb3758138fd64 RDI: ffff98f14d7fb000
> RBP: ffffb3758138fd50 R08: 0000000000001000 R09: ffff98f17cf90000
> R10: 0000000000000000 R11: ffffffff8dacad28 R12: 0000000000001000
> R13: ffff98f1793a7d80 R14: ffff98f143231708 R15: ffff98f14d7fb018
> FS:  00007ec166097800(0000) GS:ffff98f1bbd00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffc04ef420 CR3: 000000010ee9a000 CR4: 00000000003506e0
> Call Trace:
>  temp_show+0x31/0x68
>  dev_attr_show+0x1d/0x4f
>  sysfs_kf_seq_show+0x92/0x107
>  seq_read_iter+0xf5/0x3f2
>  vfs_read+0x205/0x379
>  __x64_sys_read+0x7c/0xe2
>  do_syscall_64+0x43/0x55
>  entry_SYSCALL_64_after_hwframe+0x61/0xc6
>
> if a thermal device is removed while accesses to its device attributes
> are ongoing.
>
> The problem is exposed by code in iwl_op_mode_mvm_start(), which registers
> a thermal zone device only to unregister it shortly afterwards if an
> unrelated failure is encountered while accessing the hardware.
>
> Check if the thermal zone device is registered after acquiring the
> thermal zone device mutex to ensure this does not happen.
>
> The code was tested by triggering the failure in iwl_op_mode_mvm_start()
> on purpose. Without this patch, the kernel crashes reliably. The crash
> is no longer observed after applying this and the preceding patches.
>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
>  drivers/thermal/thermal_helpers.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
> index c65cdce8f856..3bac0b7a4c62 100644
> --- a/drivers/thermal/thermal_helpers.c
> +++ b/drivers/thermal/thermal_helpers.c
> @@ -115,7 +115,12 @@ int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
>         int ret;
>
>         mutex_lock(&tz->lock);
> +       if (!device_is_registered(&tz->device)) {
> +               ret = -ENODEV;
> +               goto unlock;
> +       }
>         ret = __thermal_zone_get_temp(tz, temp);
> +unlock:

I would do it this way:

if (device_is_registered(&tz->device))
        ret = __thermal_zone_get_temp(tz, temp);
else
        ret = -ENODEV;

>         mutex_unlock(&tz->lock);
>
>         return ret;
> --

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/9] thermal/core: Move parameter validation from __thermal_zone_get_temp to thermal_zone_get_temp
  2022-10-17 13:09 ` [PATCH 4/9] thermal/core: Move parameter validation from __thermal_zone_get_temp to thermal_zone_get_temp Guenter Roeck
@ 2022-11-09 19:12   ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2022-11-09 19:12 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J . Wysocki, Daniel Lezcano, Amit Kucheria, Zhang Rui,
	linux-pm, linux-kernel

On Mon, Oct 17, 2022 at 3:09 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> All callers of __thermal_zone_get_temp() already validated the
> thermal zone parameters. Move validation to thermal_zone_get_temp()
> where it is actually needed.
>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
>  drivers/thermal/thermal_helpers.c | 26 +++++++++++++++++++++++---
>  1 file changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
> index 3bac0b7a4c62..0161d5fb1cf2 100644
> --- a/drivers/thermal/thermal_helpers.c
> +++ b/drivers/thermal/thermal_helpers.c
> @@ -64,6 +64,20 @@ get_thermal_instance(struct thermal_zone_device *tz,
>  }
>  EXPORT_SYMBOL(get_thermal_instance);
>
> +/**
> + * __thermal_zone_get_temp() - returns the temperature of a thermal zone
> + * @tz: a valid pointer to a struct thermal_zone_device
> + * @temp: a valid pointer to where to store the resulting temperature.
> + *
> + * When a valid thermal zone reference is passed, it will fetch its
> + * temperature and fill @temp.
> + *
> + * Both tz and tz->ops must be valid pointers when calling this function,
> + * and the tz->ops->get_temp callback must be provided.
> + * The function must be called under tz->lock.
> + *
> + * Return: On success returns 0, an error code otherwise
> + */

I would mention adding the kerneldoc in the patch changelog.

>  int __thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
>  {
>         int ret = -EINVAL;
> @@ -73,9 +87,6 @@ int __thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
>
>         lockdep_assert_held(&tz->lock);
>
> -       if (!tz || IS_ERR(tz) || !tz->ops->get_temp)
> -               return -EINVAL;
> -
>         ret = tz->ops->get_temp(tz, temp);
>
>         if (IS_ENABLED(CONFIG_THERMAL_EMULATION) && tz->emul_temperature) {
> @@ -114,7 +125,16 @@ int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
>  {
>         int ret;
>
> +       if (!tz || IS_ERR(tz))

IS_ERR_OR_NULL() ?

> +               return -EINVAL;
> +
>         mutex_lock(&tz->lock);
> +
> +       if (!tz->ops->get_temp) {
> +               ret = -EINVAL;
> +               goto unlock;
> +       }
> +
>         if (!device_is_registered(&tz->device)) {
>                 ret = -ENODEV;
>                 goto unlock;
> --

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/9] thermal/core: Introduce locked version of thermal_zone_device_update
  2022-10-17 13:09 ` [PATCH 5/9] thermal/core: Introduce locked version of thermal_zone_device_update Guenter Roeck
@ 2022-11-09 19:15   ` Rafael J. Wysocki
  2022-11-10  0:25     ` Guenter Roeck
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2022-11-09 19:15 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J . Wysocki, Daniel Lezcano, Amit Kucheria, Zhang Rui,
	linux-pm, linux-kernel

On Mon, Oct 17, 2022 at 3:09 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> In thermal_zone_device_set_mode(), the thermal zone mutex is released only
> to be reacquired in the subsequent call to thermal_zone_device_update().
>
> Introduce __thermal_zone_device_update() as locked version of

Did you mean "unlocked"?

> thermal_zone_device_update() and call it from
> thermal_zone_device_set_mode() without releasing the lock to avoid
> the extra release/acuire sequence.
>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
>  drivers/thermal/thermal_core.c | 57 ++++++++++++++++++----------------
>  1 file changed, 31 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 562ece8d16aa..9facd9c5b70f 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -403,6 +403,34 @@ static void thermal_zone_device_init(struct thermal_zone_device *tz)
>                 pos->initialized = false;
>  }
>
> +static void __thermal_zone_device_update(struct thermal_zone_device *tz,
> +                                        enum thermal_notify_event event)
> +{
> +       int count;
> +
> +       if (atomic_read(&in_suspend))
> +               return;
> +
> +       if (WARN_ONCE(!tz->ops->get_temp,
> +                     "'%s' must not be called without 'get_temp' ops set\n",
> +                     __func__))
> +               return;
> +
> +       if (!thermal_zone_device_is_enabled(tz))
> +               return;
> +
> +       update_temperature(tz);
> +
> +       __thermal_zone_set_trips(tz);
> +
> +       tz->notify_event = event;
> +
> +       for (count = 0; count < tz->num_trips; count++)
> +               handle_thermal_trip(tz, count);
> +
> +       monitor_thermal_zone(tz);
> +}
> +
>  static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
>                                         enum thermal_device_mode mode)
>  {
> @@ -423,9 +451,9 @@ static int thermal_zone_device_set_mode(struct thermal_zone_device *tz,
>         if (!ret)
>                 tz->mode = mode;
>
> -       mutex_unlock(&tz->lock);
> +       __thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
>
> -       thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
> +       mutex_unlock(&tz->lock);
>
>         if (mode == THERMAL_DEVICE_ENABLED)
>                 thermal_notify_tz_enable(tz->id);
> @@ -457,31 +485,8 @@ int thermal_zone_device_is_enabled(struct thermal_zone_device *tz)
>  void thermal_zone_device_update(struct thermal_zone_device *tz,
>                                 enum thermal_notify_event event)
>  {
> -       int count;
> -
> -       if (atomic_read(&in_suspend))
> -               return;
> -
> -       if (WARN_ONCE(!tz->ops->get_temp, "'%s' must not be called without "
> -                     "'get_temp' ops set\n", __func__))
> -               return;
> -
>         mutex_lock(&tz->lock);
> -
> -       if (!thermal_zone_device_is_enabled(tz))
> -               goto out;
> -
> -       update_temperature(tz);
> -
> -       __thermal_zone_set_trips(tz);
> -
> -       tz->notify_event = event;
> -
> -       for (count = 0; count < tz->num_trips; count++)
> -               handle_thermal_trip(tz, count);
> -
> -       monitor_thermal_zone(tz);
> -out:
> +       __thermal_zone_device_update(tz, event);
>         mutex_unlock(&tz->lock);
>  }
>  EXPORT_SYMBOL_GPL(thermal_zone_device_update);
> --
> 2.36.2
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9] thermal/core: Protect hwmon accesses to thermal operations with thermal zone mutex
  2022-10-17 13:09 ` [PATCH 6/9] thermal/core: Protect hwmon accesses to thermal operations with thermal zone mutex Guenter Roeck
@ 2022-11-09 19:19   ` Rafael J. Wysocki
  2022-11-10 14:21     ` Guenter Roeck
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2022-11-09 19:19 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J . Wysocki, Daniel Lezcano, Amit Kucheria, Zhang Rui,
	linux-pm, linux-kernel

On Mon, Oct 17, 2022 at 3:09 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> In preparation to protecting access to thermal operations against thermal
> zone device removal, protect hwmon accesses to thermal zone operations
> with the thermal zone mutex. After acquiring the mutex, ensure that the
> thermal zone device is registered before proceeding.
>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
>  drivers/thermal/thermal_hwmon.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/thermal/thermal_hwmon.c b/drivers/thermal/thermal_hwmon.c
> index f53f4ceb6a5d..33bfbaed4236 100644
> --- a/drivers/thermal/thermal_hwmon.c
> +++ b/drivers/thermal/thermal_hwmon.c
> @@ -77,11 +77,19 @@ temp_crit_show(struct device *dev, struct device_attribute *attr, char *buf)
>         int temperature;
>         int ret;
>
> +       mutex_lock(&tz->lock);
> +
> +       if (!device_is_registered(&tz->device)) {
> +               ret = -ENODEV;
> +               goto unlock;
> +       }
> +
>         ret = tz->ops->get_crit_temp(tz, &temperature);

Again, I would do it this way:

        if (device_is_registered(&tz->device))
                ret = tz->ops->get_crit_temp(tz, &temperature);
        else
                ret = -ENODEV;

And I wouldn't change the code below (the ternary operator is out of
fashion in particular).

> -       if (ret)
> -               return ret;
>
> -       return sprintf(buf, "%d\n", temperature);
> +unlock:
> +       mutex_unlock(&tz->lock);
> +
> +       return ret ? ret : sprintf(buf, "%d\n", temperature);
>  }
>
>
> --

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 7/9] thermal/core: Protect sysfs accesses to thermal operations with thermal zone mutex
  2022-10-17 13:09 ` [PATCH 7/9] thermal/core: Protect sysfs " Guenter Roeck
@ 2022-11-09 19:26   ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2022-11-09 19:26 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J . Wysocki, Daniel Lezcano, Amit Kucheria, Zhang Rui,
	linux-pm, linux-kernel

On Mon, Oct 17, 2022 at 3:09 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> Protect access to thermal operations against thermal zone removal by
> acquiring the thermal zone device mutex. After acquiring the mutex, check
> if the thermal zone device is registered and abort the operation if not.
>
> With this change, we can call __thermal_zone_device_update() instead of
> thermal_zone_device_update() from trip_point_temp_store() and from
> emul_temp_store(). Similar, we can call __thermal_zone_set_trips() instead
> of thermal_zone_set_trips() from trip_point_hyst_store().
>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
>  drivers/thermal/thermal_core.c  |  4 +-
>  drivers/thermal/thermal_core.h  |  2 +
>  drivers/thermal/thermal_sysfs.c | 77 ++++++++++++++++++++++++++++-----
>  3 files changed, 69 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 9facd9c5b70f..b8e3b262b2bd 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -403,8 +403,8 @@ static void thermal_zone_device_init(struct thermal_zone_device *tz)
>                 pos->initialized = false;
>  }
>
> -static void __thermal_zone_device_update(struct thermal_zone_device *tz,
> -                                        enum thermal_notify_event event)
> +void __thermal_zone_device_update(struct thermal_zone_device *tz,
> +                                 enum thermal_notify_event event)
>  {
>         int count;
>
> diff --git a/drivers/thermal/thermal_core.h b/drivers/thermal/thermal_core.h
> index 1571917bd3c8..7b51b9a22e7e 100644
> --- a/drivers/thermal/thermal_core.h
> +++ b/drivers/thermal/thermal_core.h
> @@ -109,6 +109,8 @@ int thermal_register_governor(struct thermal_governor *);
>  void thermal_unregister_governor(struct thermal_governor *);
>  int thermal_zone_device_set_policy(struct thermal_zone_device *, char *);
>  int thermal_build_list_of_policies(char *buf);
> +void __thermal_zone_device_update(struct thermal_zone_device *tz,
> +                                 enum thermal_notify_event event);
>
>  /* Helpers */
>  void thermal_zone_set_trips(struct thermal_zone_device *tz);
> diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
> index ec495c7dff03..68607e6070e8 100644
> --- a/drivers/thermal/thermal_sysfs.c
> +++ b/drivers/thermal/thermal_sysfs.c
> @@ -92,7 +92,15 @@ trip_point_type_show(struct device *dev, struct device_attribute *attr,
>         if (sscanf(attr->attr.name, "trip_point_%d_type", &trip) != 1)
>                 return -EINVAL;
>
> +       mutex_lock(&tz->lock);
> +
> +       if (!device_is_registered(dev)) {
> +               mutex_unlock(&tz->lock);
> +               return -ENODEV;
> +       }
> +
>         result = tz->ops->get_trip_type(tz, trip, &type);
> +       mutex_unlock(&tz->lock);
>         if (result)
>                 return result;
>
> @@ -128,10 +136,17 @@ trip_point_temp_store(struct device *dev, struct device_attribute *attr,
>         if (kstrtoint(buf, 10, &temperature))
>                 return -EINVAL;
>
> +       mutex_lock(&tz->lock);
> +
> +       if (!device_is_registered(dev)) {
> +               ret = -ENODEV;
> +               goto unlock;
> +       }
> +
>         if (tz->ops->set_trip_temp) {
>                 ret = tz->ops->set_trip_temp(tz, trip, temperature);
>                 if (ret)
> -                       return ret;
> +                       goto unlock;
>         }
>
>         if (tz->trips)
> @@ -140,18 +155,21 @@ trip_point_temp_store(struct device *dev, struct device_attribute *attr,
>         if (tz->ops->get_trip_hyst) {
>                 ret = tz->ops->get_trip_hyst(tz, trip, &hyst);
>                 if (ret)
> -                       return ret;
> +                       goto unlock;
>         }
>
>         ret = tz->ops->get_trip_type(tz, trip, &type);
>         if (ret)
> -               return ret;
> +               goto unlock;
>
>         thermal_notify_tz_trip_change(tz->id, trip, type, temperature, hyst);
>
> -       thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
> +       __thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
>
> -       return count;
> +unlock:
> +       mutex_unlock(&tz->lock);
> +
> +       return ret ? ret : count;
>  }
>
>  static ssize_t
> @@ -168,12 +186,19 @@ trip_point_temp_show(struct device *dev, struct device_attribute *attr,
>         if (sscanf(attr->attr.name, "trip_point_%d_temp", &trip) != 1)
>                 return -EINVAL;
>
> +       mutex_lock(&tz->lock);
> +
> +       if (!device_is_registered(dev)) {
> +               ret = -ENODEV;
> +               goto unlock;
> +       }
> +
>         ret = tz->ops->get_trip_temp(tz, trip, &temperature);
>

Well, this is a pattern I've already talked about twice, so let me
just mention it here.

> -       if (ret)
> -               return ret;
> +unlock:
> +       mutex_unlock(&tz->lock);
>
> -       return sprintf(buf, "%d\n", temperature);
> +       return ret ? ret : sprintf(buf, "%d\n", temperature);

And I wouldn't change this (like in the other patch I've commented).

>  }
>
>  static ssize_t
> @@ -193,6 +218,13 @@ trip_point_hyst_store(struct device *dev, struct device_attribute *attr,
>         if (kstrtoint(buf, 10, &temperature))
>                 return -EINVAL;
>
> +       mutex_lock(&tz->lock);
> +
> +       if (!device_is_registered(dev)) {
> +               ret = -ENODEV;
> +               goto unlock;
> +       }
> +
>         /*
>          * We are not doing any check on the 'temperature' value
>          * here. The driver implementing 'set_trip_hyst' has to
> @@ -201,7 +233,10 @@ trip_point_hyst_store(struct device *dev, struct device_attribute *attr,
>         ret = tz->ops->set_trip_hyst(tz, trip, temperature);
>
>         if (!ret)
> -               thermal_zone_set_trips(tz);
> +               __thermal_zone_set_trips(tz);
> +
> +unlock:
> +       mutex_unlock(&tz->lock);
>
>         return ret ? ret : count;
>  }
> @@ -220,8 +255,18 @@ trip_point_hyst_show(struct device *dev, struct device_attribute *attr,
>         if (sscanf(attr->attr.name, "trip_point_%d_hyst", &trip) != 1)
>                 return -EINVAL;
>
> +       mutex_lock(&tz->lock);
> +
> +       if (!device_is_registered(dev)) {
> +               ret = -ENODEV;
> +               goto unlock;
> +       }
> +
>         ret = tz->ops->get_trip_hyst(tz, trip, &temperature);

Same pattern again.

> +unlock:
> +       mutex_unlock(&tz->lock);
> +
>         return ret ? ret : sprintf(buf, "%d\n", temperature);
>  }
>
> @@ -269,16 +314,24 @@ emul_temp_store(struct device *dev, struct device_attribute *attr,
>         if (kstrtoint(buf, 10, &temperature))
>                 return -EINVAL;
>
> +       mutex_lock(&tz->lock);
> +
> +       if (!device_is_registered(dev)) {
> +               ret = -ENODEV;
> +               goto unlock;
> +       }
> +
>         if (!tz->ops->set_emul_temp) {
> -               mutex_lock(&tz->lock);
>                 tz->emul_temperature = temperature;
> -               mutex_unlock(&tz->lock);
>         } else {
>                 ret = tz->ops->set_emul_temp(tz, temperature);
>         }

Drop the leftover braces that are not necessary now?

>
>         if (!ret)
> -               thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
> +               __thermal_zone_device_update(tz, THERMAL_EVENT_UNSPECIFIED);
> +
> +unlock:
> +       mutex_unlock(&tz->lock);
>
>         return ret ? ret : count;
>  }
> --

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/9] thermal/core: Protect thermal device operations against removal
  2022-11-02 18:50 ` [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
  2022-11-02 18:55   ` Daniel Lezcano
@ 2022-11-09 19:30   ` Rafael J. Wysocki
  1 sibling, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2022-11-09 19:30 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J . Wysocki, Daniel Lezcano, Amit Kucheria, Zhang Rui,
	linux-pm, linux-kernel

Hi Guenter,

Sorry for the delay.

On Wed, Nov 2, 2022 at 7:50 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> Hi,
>
> On Mon, Oct 17, 2022 at 06:09:01AM -0700, Guenter Roeck wrote:
> > Accesses to thermal zones, and with it to thermal zone device operations,
> > are still possible after the thermal zone device has been unregistered.
> > For example, thermal_zone_get_temp() can be called from temp_show()
> > in thermal_sysfs.c if the sysfs attribute was opened before the thermal
> > device was unregistered. This is problematic and may result in crashes
> > since the operations data structure and the underlying code may be gone
> > when the calls are made.
> >
> > The following series solves the problem by protecting accesses to thermal
> > device operations with the thermal device mutex, and by verifying that the
> > thermal device is still registered after the mutex has been acquired.
> >
> > This was previously sent as RFC/RFT as single patch [1]. The code was reworked
> > to match thermal subsystem changes made between v6.0 and v6.1, and it was
> > split into several patches to simplify review.
> >
>
> Any thoughts / comments / feedback on this series ?

I have reviewed the series now and haven't found any major issues in it.

I've posted some minor comments in separate replies to individual
patches.  If they are addressed, I can queue up the series for 6.2.

Cheers!


> > [1] https://lore.kernel.org/linux-pm/20221004033936.1047691-1-linux@roeck-us.net/
> >
> > ----------------------------------------------------------------
> > Guenter Roeck (9):
> >       thermal/core: Destroy thermal zone device mutex in release function
> >       thermal/core: Delete device under thermal device zone lock
> >       thermal/core: Ensure that thermal device is registered in thermal_zone_get_temp
> >       thermal/core: Move parameter validation from __thermal_zone_get_temp to thermal_zone_get_temp
> >       thermal/core: Introduce locked version of thermal_zone_device_update
> >       thermal/core: Protect hwmon accesses to thermal operations with thermal zone mutex
> >       thermal/core: Protect sysfs accesses to thermal operations with thermal zone mutex
> >       thermal/core: Remove thermal_zone_set_trips()
> >       thermal/core: Protect thermal device operations against thermal device removal
> >
> >  drivers/thermal/thermal_core.c    | 76 ++++++++++++++++++++++++--------------
> >  drivers/thermal/thermal_core.h    |  3 +-
> >  drivers/thermal/thermal_helpers.c | 65 ++++++++++++++++++++++-----------
> >  drivers/thermal/thermal_hwmon.c   | 14 +++++--
> >  drivers/thermal/thermal_sysfs.c   | 77 +++++++++++++++++++++++++++++++++------
> >  5 files changed, 169 insertions(+), 66 deletions(-)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/9] thermal/core: Introduce locked version of thermal_zone_device_update
  2022-11-09 19:15   ` Rafael J. Wysocki
@ 2022-11-10  0:25     ` Guenter Roeck
  2022-11-10 13:01       ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Guenter Roeck @ 2022-11-10  0:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel

On Wed, Nov 09, 2022 at 08:15:17PM +0100, Rafael J. Wysocki wrote:
> On Mon, Oct 17, 2022 at 3:09 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > In thermal_zone_device_set_mode(), the thermal zone mutex is released only
> > to be reacquired in the subsequent call to thermal_zone_device_update().
> >
> > Introduce __thermal_zone_device_update() as locked version of
> 
> Did you mean "unlocked"?
> 
No, I did mean "locked", as in "must be called with thermal zone device
mutex acquired".

locked:

void __thermal_zone_device_update(struct thermal_zone_device *tz,
                                  enum thermal_notify_event event)
{
	...
}

unlocked:

void thermal_zone_device_update(struct thermal_zone_device *tz,
                                enum thermal_notify_event event)
{
        mutex_lock(&tz->lock);
        if (device_is_registered(&tz->device))
                __thermal_zone_device_update(tz, event);
        mutex_unlock(&tz->lock);
}

Should I phrase or explain it differently ?

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/9] thermal/core: Introduce locked version of thermal_zone_device_update
  2022-11-10  0:25     ` Guenter Roeck
@ 2022-11-10 13:01       ` Rafael J. Wysocki
  2022-11-10 14:11         ` Guenter Roeck
  0 siblings, 1 reply; 25+ messages in thread
From: Rafael J. Wysocki @ 2022-11-10 13:01 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J. Wysocki, Daniel Lezcano, Amit Kucheria, Zhang Rui,
	linux-pm, linux-kernel

On Thu, Nov 10, 2022 at 1:25 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On Wed, Nov 09, 2022 at 08:15:17PM +0100, Rafael J. Wysocki wrote:
> > On Mon, Oct 17, 2022 at 3:09 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > >
> > > In thermal_zone_device_set_mode(), the thermal zone mutex is released only
> > > to be reacquired in the subsequent call to thermal_zone_device_update().
> > >
> > > Introduce __thermal_zone_device_update() as locked version of
> >
> > Did you mean "unlocked"?
> >
> No, I did mean "locked", as in "must be called with thermal zone device
> mutex acquired".
>
> locked:
>
> void __thermal_zone_device_update(struct thermal_zone_device *tz,
>                                   enum thermal_notify_event event)
> {
>         ...
> }
>
> unlocked:
>
> void thermal_zone_device_update(struct thermal_zone_device *tz,
>                                 enum thermal_notify_event event)
> {
>         mutex_lock(&tz->lock);
>         if (device_is_registered(&tz->device))
>                 __thermal_zone_device_update(tz, event);
>         mutex_unlock(&tz->lock);
> }

Thanks for the explanation.

> Should I phrase or explain it differently ?

I would rather say "bare" or something like that so it is all clear to
people like me, but it is your call.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/9] thermal/core: Introduce locked version of thermal_zone_device_update
  2022-11-10 13:01       ` Rafael J. Wysocki
@ 2022-11-10 14:11         ` Guenter Roeck
  2022-11-10 14:14           ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Guenter Roeck @ 2022-11-10 14:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel

On Thu, Nov 10, 2022 at 02:01:49PM +0100, Rafael J. Wysocki wrote:
> On Thu, Nov 10, 2022 at 1:25 AM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > On Wed, Nov 09, 2022 at 08:15:17PM +0100, Rafael J. Wysocki wrote:
> > > On Mon, Oct 17, 2022 at 3:09 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > >
> > > > In thermal_zone_device_set_mode(), the thermal zone mutex is released only
> > > > to be reacquired in the subsequent call to thermal_zone_device_update().
> > > >
> > > > Introduce __thermal_zone_device_update() as locked version of
> > >
> > > Did you mean "unlocked"?
> > >
> > No, I did mean "locked", as in "must be called with thermal zone device
> > mutex acquired".
> >
> > locked:
> >
> > void __thermal_zone_device_update(struct thermal_zone_device *tz,
> >                                   enum thermal_notify_event event)
> > {
> >         ...
> > }
> >
> > unlocked:
> >
> > void thermal_zone_device_update(struct thermal_zone_device *tz,
> >                                 enum thermal_notify_event event)
> > {
> >         mutex_lock(&tz->lock);
> >         if (device_is_registered(&tz->device))
> >                 __thermal_zone_device_update(tz, event);
> >         mutex_unlock(&tz->lock);
> > }
> 
> Thanks for the explanation.
> 
> > Should I phrase or explain it differently ?
> 
> I would rather say "bare" or something like that so it is all clear to
> people like me, but it is your call.

I updated the commit description to use "must be called with thermal
device mutex held". I kept 'locked' in the subject; I don't think using
'bare' there would add any clarity. Hope that is ok.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/9] thermal/core: Ensure that thermal device is registered in thermal_zone_get_temp
  2022-11-09 19:07   ` Rafael J. Wysocki
@ 2022-11-10 14:13     ` Guenter Roeck
  0 siblings, 0 replies; 25+ messages in thread
From: Guenter Roeck @ 2022-11-10 14:13 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel

On Wed, Nov 09, 2022 at 08:07:48PM +0100, Rafael J. Wysocki wrote:
[ ... ]
> >
> > diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
> > index c65cdce8f856..3bac0b7a4c62 100644
> > --- a/drivers/thermal/thermal_helpers.c
> > +++ b/drivers/thermal/thermal_helpers.c
> > @@ -115,7 +115,12 @@ int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
> >         int ret;
> >
> >         mutex_lock(&tz->lock);
> > +       if (!device_is_registered(&tz->device)) {
> > +               ret = -ENODEV;
> > +               goto unlock;
> > +       }
> >         ret = __thermal_zone_get_temp(tz, temp);
> > +unlock:
> 
> I would do it this way:
> 
> if (device_is_registered(&tz->device))
>         ret = __thermal_zone_get_temp(tz, temp);
> else
>         ret = -ENODEV;
> 
Done in all patches. Agreed, that looks much better.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/9] thermal/core: Introduce locked version of thermal_zone_device_update
  2022-11-10 14:11         ` Guenter Roeck
@ 2022-11-10 14:14           ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2022-11-10 14:14 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J. Wysocki, Daniel Lezcano, Amit Kucheria, Zhang Rui,
	linux-pm, linux-kernel

On Thu, Nov 10, 2022 at 3:12 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On Thu, Nov 10, 2022 at 02:01:49PM +0100, Rafael J. Wysocki wrote:
> > On Thu, Nov 10, 2022 at 1:25 AM Guenter Roeck <linux@roeck-us.net> wrote:
> > >
> > > On Wed, Nov 09, 2022 at 08:15:17PM +0100, Rafael J. Wysocki wrote:
> > > > On Mon, Oct 17, 2022 at 3:09 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > > > >
> > > > > In thermal_zone_device_set_mode(), the thermal zone mutex is released only
> > > > > to be reacquired in the subsequent call to thermal_zone_device_update().
> > > > >
> > > > > Introduce __thermal_zone_device_update() as locked version of
> > > >
> > > > Did you mean "unlocked"?
> > > >
> > > No, I did mean "locked", as in "must be called with thermal zone device
> > > mutex acquired".
> > >
> > > locked:
> > >
> > > void __thermal_zone_device_update(struct thermal_zone_device *tz,
> > >                                   enum thermal_notify_event event)
> > > {
> > >         ...
> > > }
> > >
> > > unlocked:
> > >
> > > void thermal_zone_device_update(struct thermal_zone_device *tz,
> > >                                 enum thermal_notify_event event)
> > > {
> > >         mutex_lock(&tz->lock);
> > >         if (device_is_registered(&tz->device))
> > >                 __thermal_zone_device_update(tz, event);
> > >         mutex_unlock(&tz->lock);
> > > }
> >
> > Thanks for the explanation.
> >
> > > Should I phrase or explain it differently ?
> >
> > I would rather say "bare" or something like that so it is all clear to
> > people like me, but it is your call.
>
> I updated the commit description to use "must be called with thermal
> device mutex held". I kept 'locked' in the subject; I don't think using
> 'bare' there would add any clarity. Hope that is ok.

It is.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9] thermal/core: Protect hwmon accesses to thermal operations with thermal zone mutex
  2022-11-09 19:19   ` Rafael J. Wysocki
@ 2022-11-10 14:21     ` Guenter Roeck
  2022-11-10 14:24       ` Rafael J. Wysocki
  0 siblings, 1 reply; 25+ messages in thread
From: Guenter Roeck @ 2022-11-10 14:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Daniel Lezcano, Amit Kucheria, Zhang Rui, linux-pm, linux-kernel

On Wed, Nov 09, 2022 at 08:19:13PM +0100, Rafael J. Wysocki wrote:
> On Mon, Oct 17, 2022 at 3:09 PM Guenter Roeck <linux@roeck-us.net> wrote:
[ ... ]
> 
> And I wouldn't change the code below (the ternary operator is out of
> fashion in particular).
> 

I tried to introduce some consistency; the ternary operator is used
in some of the existing thermal code. Guess I went the wrong direction.
Never mind; I don't have a strong opinion either way.
I updated the series patches to no longer use ternary operators in
updated code, but I left existing code alone (changing that should not
be part of this patch set anyway).

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9] thermal/core: Protect hwmon accesses to thermal operations with thermal zone mutex
  2022-11-10 14:21     ` Guenter Roeck
@ 2022-11-10 14:24       ` Rafael J. Wysocki
  0 siblings, 0 replies; 25+ messages in thread
From: Rafael J. Wysocki @ 2022-11-10 14:24 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Rafael J. Wysocki, Daniel Lezcano, Amit Kucheria, Zhang Rui,
	linux-pm, linux-kernel

On Thu, Nov 10, 2022 at 3:21 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On Wed, Nov 09, 2022 at 08:19:13PM +0100, Rafael J. Wysocki wrote:
> > On Mon, Oct 17, 2022 at 3:09 PM Guenter Roeck <linux@roeck-us.net> wrote:
> [ ... ]
> >
> > And I wouldn't change the code below (the ternary operator is out of
> > fashion in particular).
> >
>
> I tried to introduce some consistency; the ternary operator is used
> in some of the existing thermal code. Guess I went the wrong direction.
> Never mind; I don't have a strong opinion either way.
> I updated the series patches to no longer use ternary operators in
> updated code, but I left existing code alone (changing that should not
> be part of this patch set anyway).

Thanks, that's what I would have done too.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2022-11-10 14:25 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-17 13:09 [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
2022-10-17 13:09 ` [PATCH 1/9] thermal/core: Destroy thermal zone device mutex in release function Guenter Roeck
2022-10-17 13:09 ` [PATCH 2/9] thermal/core: Delete device under thermal device zone lock Guenter Roeck
2022-10-17 13:09 ` [PATCH 3/9] thermal/core: Ensure that thermal device is registered in thermal_zone_get_temp Guenter Roeck
2022-11-09 19:07   ` Rafael J. Wysocki
2022-11-10 14:13     ` Guenter Roeck
2022-10-17 13:09 ` [PATCH 4/9] thermal/core: Move parameter validation from __thermal_zone_get_temp to thermal_zone_get_temp Guenter Roeck
2022-11-09 19:12   ` Rafael J. Wysocki
2022-10-17 13:09 ` [PATCH 5/9] thermal/core: Introduce locked version of thermal_zone_device_update Guenter Roeck
2022-11-09 19:15   ` Rafael J. Wysocki
2022-11-10  0:25     ` Guenter Roeck
2022-11-10 13:01       ` Rafael J. Wysocki
2022-11-10 14:11         ` Guenter Roeck
2022-11-10 14:14           ` Rafael J. Wysocki
2022-10-17 13:09 ` [PATCH 6/9] thermal/core: Protect hwmon accesses to thermal operations with thermal zone mutex Guenter Roeck
2022-11-09 19:19   ` Rafael J. Wysocki
2022-11-10 14:21     ` Guenter Roeck
2022-11-10 14:24       ` Rafael J. Wysocki
2022-10-17 13:09 ` [PATCH 7/9] thermal/core: Protect sysfs " Guenter Roeck
2022-11-09 19:26   ` Rafael J. Wysocki
2022-10-17 13:09 ` [PATCH 8/9] thermal/core: Remove thermal_zone_set_trips() Guenter Roeck
2022-10-17 13:09 ` [PATCH 9/9] thermal/core: Protect thermal device operations against thermal device removal Guenter Roeck
2022-11-02 18:50 ` [PATCH 0/9] thermal/core: Protect thermal device operations against removal Guenter Roeck
2022-11-02 18:55   ` Daniel Lezcano
2022-11-09 19:30   ` Rafael J. Wysocki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.