linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2]  Add a generic virtual thermal sensor
@ 2021-09-17  7:27 Alexandre Bailon
  2021-09-17  7:27 ` [PATCH v2 1/2] dt-bindings: Add bindings for the " Alexandre Bailon
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Alexandre Bailon @ 2021-09-17  7:27 UTC (permalink / raw)
  To: rui.zhang, daniel.lezcano, amitk
  Cc: linux-pm, linux-kernel, ben.tseng, khilman, mka, Alexandre Bailon

This series add a virtual thermal sensor.
It could be used to get a temperature using some thermal sensors.
Currently, the supported operations are max, min and avg.
The virtual sensor could be easily extended to support others operations.

Note:
Currently, thermal drivers must explicitly register their sensors to make them
available to the virtual sensor.
This doesn't seem a good solution to me and I think it would be preferable to
update the framework to register the list of each available sensors.

Changes in v2:
- Fix some warnings / errors reported by kernel test robot
- rename some struct and functions with a more accurate name
- update the dt bindings: rename type attribute to aggregation-function
- factorize a little bit the aggregation functions

Alexandre Bailon (2):
  dt-bindings: Add bindings for the virtual thermal sensor
  thermal: add a virtual sensor to aggregate temperatures

 .../thermal/virtual,thermal-sensor.yaml       |  67 +++
 drivers/thermal/Kconfig                       |   8 +
 drivers/thermal/Makefile                      |   1 +
 drivers/thermal/virtual-sensor.h              |  51 +++
 drivers/thermal/virtual_sensor.c              | 400 ++++++++++++++++++
 include/dt-bindings/thermal/virtual-sensor.h  |  15 +
 6 files changed, 542 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/thermal/virtual,thermal-sensor.yaml
 create mode 100644 drivers/thermal/virtual-sensor.h
 create mode 100644 drivers/thermal/virtual_sensor.c
 create mode 100644 include/dt-bindings/thermal/virtual-sensor.h

-- 
2.31.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/2] dt-bindings: Add bindings for the virtual thermal sensor
  2021-09-17  7:27 [PATCH v2 0/2] Add a generic virtual thermal sensor Alexandre Bailon
@ 2021-09-17  7:27 ` Alexandre Bailon
  2021-09-17  7:27 ` [PATCH v2 2/2] thermal: add a virtual sensor to aggregate temperatures Alexandre Bailon
  2021-09-17 12:41 ` [PATCH v2 0/2] Add a generic virtual thermal sensor Daniel Lezcano
  2 siblings, 0 replies; 19+ messages in thread
From: Alexandre Bailon @ 2021-09-17  7:27 UTC (permalink / raw)
  To: rui.zhang, daniel.lezcano, amitk
  Cc: linux-pm, linux-kernel, ben.tseng, khilman, mka, Alexandre Bailon

This adds the device tree bidings for the virtual thermal sensor.
The virtual sensor could be used to a temperature computed from
many thermal sensors.

Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
---
 .../thermal/virtual,thermal-sensor.yaml       | 67 +++++++++++++++++++
 include/dt-bindings/thermal/virtual-sensor.h  | 15 +++++
 2 files changed, 82 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/thermal/virtual,thermal-sensor.yaml
 create mode 100644 include/dt-bindings/thermal/virtual-sensor.h

diff --git a/Documentation/devicetree/bindings/thermal/virtual,thermal-sensor.yaml b/Documentation/devicetree/bindings/thermal/virtual,thermal-sensor.yaml
new file mode 100644
index 0000000000000..049620ed88a5b
--- /dev/null
+++ b/Documentation/devicetree/bindings/thermal/virtual,thermal-sensor.yaml
@@ -0,0 +1,67 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+# Copyright 2021 BayLibre
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/thermal/thermal-sensor.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Virtual thermal sensor binding
+
+description: |
+  The virtual thermal sensor devices provide temperature sensing capabilities
+  based on hardware thermal sensors. Basically, this could be used to get the
+  maximum, minimum or average temperature of the hardware thermal sensors.
+properties:
+  "#thermal-sensor-cells":
+    description:
+      Used to uniquely identify a thermal sensor instance within an IC. Will be
+      0 on sensor nodes with only a single sensor and at least 1 on nodes
+      containing several internal sensors.
+    enum: [0, 1]
+
+  aggregation-function:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    description:
+      Used to select the operations to perform on the sensors to get the virtual
+      sensor temperature.
+    enum:
+      - VIRTUAL_THERMAL_SENSOR_MIN
+      - VIRTUAL_THERMAL_SENSOR_MAX
+      - VIRTUAL_THERMAL_SENSOR_AVG
+
+  thermal-sensors:
+    $ref: /schemas/types.yaml#/definitions/phandle-array
+    description:
+      The thermal sensor phandle and sensor specifier used to monitor this
+      thermal zone.
+
+required:
+  - "#thermal-sensor-cells"
+  - aggregation-function
+  - thermal-sensors
+
+additionalProperties: true
+
+examples:
+  - |
+    #include <dt-bindings/thermal/thermal.h>
+    #include <dt-bindings/thermal/virtual-sensor.h>
+    #include <dt-bindings/interrupt-controller/arm-gic.h>
+    #include <dt-bindings/clock/mt8192-clk.h>
+
+    lvts: lvts@1100b000 {
+        compatible = "mediatek,mt6873-lvts";
+        reg = <0x1100b000 0x1000>;
+        clocks = <&infracfg CLK_INFRA_THERM>;
+        clock-names = "lvts_clk";
+        #thermal-sensor-cells = <0>;
+        interrupts = <GIC_SPI 147 IRQ_TYPE_LEVEL_HIGH>;
+    };
+
+    soc_max_sensor: soc_max_sensor {
+      compatible = "virtual,thermal-sensor";
+      #thermal-sensor-cells = <1>;
+      aggregation-function = <VIRTUAL_THERMAL_SENSOR_MAX>;
+      thermal-sensors = <&lvts 0>, <&lvts 1>;
+    };
+...
diff --git a/include/dt-bindings/thermal/virtual-sensor.h b/include/dt-bindings/thermal/virtual-sensor.h
new file mode 100644
index 0000000000000..93f6dc2453b85
--- /dev/null
+++ b/include/dt-bindings/thermal/virtual-sensor.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * This header provides constants for virtual thermal sensor bindings.
+ *
+ * Copyright (C) 2021 BayLibre
+ */
+
+#ifndef _DT_BINDINGS_THERMAL_VIRTUAL_SENSOR_H
+#define _DT_BINDINGS_THERMAL_VIRTUAL_SENSOR_H
+
+#define VIRTUAL_THERMAL_SENSOR_MIN 0
+#define VIRTUAL_THERMAL_SENSOR_MAX 1
+#define VIRTUAL_THERMAL_SENSOR_AVG 2
+
+#endif /* _DT_BINDINGS_THERMAL_VIRTUAL_SENSOR_H */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/2] thermal: add a virtual sensor to aggregate temperatures
  2021-09-17  7:27 [PATCH v2 0/2] Add a generic virtual thermal sensor Alexandre Bailon
  2021-09-17  7:27 ` [PATCH v2 1/2] dt-bindings: Add bindings for the " Alexandre Bailon
@ 2021-09-17  7:27 ` Alexandre Bailon
  2021-10-07 16:38   ` Rafael J. Wysocki
  2021-09-17 12:41 ` [PATCH v2 0/2] Add a generic virtual thermal sensor Daniel Lezcano
  2 siblings, 1 reply; 19+ messages in thread
From: Alexandre Bailon @ 2021-09-17  7:27 UTC (permalink / raw)
  To: rui.zhang, daniel.lezcano, amitk
  Cc: linux-pm, linux-kernel, ben.tseng, khilman, mka,
	Alexandre Bailon, kernel test robot

This adds a virtual thermal sensor that reads temperature from
hardware sensor and return an aggregated temperature.
Currently, this supports three operations:
the minimum, maximum and average temperature.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
---
 drivers/thermal/Kconfig                  |   8 +
 drivers/thermal/Makefile                 |   1 +
 drivers/thermal/virtual-thermal-sensor.h |  54 ++++
 drivers/thermal/virtual_thermal_sensor.c | 350 +++++++++++++++++++++++
 4 files changed, 413 insertions(+)
 create mode 100644 drivers/thermal/virtual-thermal-sensor.h
 create mode 100644 drivers/thermal/virtual_thermal_sensor.c

diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index d7f44deab5b1d..20bc93c48f5b1 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -228,6 +228,14 @@ config THERMAL_MMIO
 	  register or shared memory, is a potential candidate to work with this
 	  driver.
 
+config VIRTUAL_THERMAL
+	tristate "Virtual thermal sensor driver"
+	depends on THERMAL_OF || COMPILE_TEST
+	help
+	  This option enables the generic thermal sensor aggregator.
+	  This driver creates a thermal sensor that reads the hardware sensors
+	  and aggregate the temperature.
+
 config HISI_THERMAL
 	tristate "Hisilicon thermal driver"
 	depends on ARCH_HISI || COMPILE_TEST
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index 82fc3e616e54b..8bf55973059c5 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -60,3 +60,4 @@ obj-$(CONFIG_UNIPHIER_THERMAL)	+= uniphier_thermal.o
 obj-$(CONFIG_AMLOGIC_THERMAL)     += amlogic_thermal.o
 obj-$(CONFIG_SPRD_THERMAL)	+= sprd_thermal.o
 obj-$(CONFIG_KHADAS_MCU_FAN_THERMAL)	+= khadas_mcu_fan.o
+obj-$(CONFIG_VIRTUAL_THERMAL) += virtual_thermal_sensor.o
diff --git a/drivers/thermal/virtual-thermal-sensor.h b/drivers/thermal/virtual-thermal-sensor.h
new file mode 100644
index 0000000000000..3bbf7c324dddc
--- /dev/null
+++ b/drivers/thermal/virtual-thermal-sensor.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 BayLibre
+ */
+
+#ifndef __THERMAL_VIRTUAL_SENSOR_H__
+#define __THERMAL_VIRTUAL_SENSOR_H__
+
+#include <linux/device.h>
+#include <linux/thermal.h>
+
+struct virtual_thermal_sensor;
+struct thermal_sensor_data;
+
+#if IS_ENABLED(CONFIG_VIRTUAL_THERMAL)
+struct thermal_sensor_data *
+thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
+				const struct thermal_zone_of_device_ops *ops);
+void thermal_virtual_sensor_unregister(struct device *dev,
+				       struct thermal_sensor_data *sensor_data);
+struct thermal_sensor_data *
+devm_thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
+				     const struct thermal_zone_of_device_ops *ops);
+
+void devm_thermal_virtual_sensor_unregister(struct device *dev,
+					    struct virtual_thermal_sensor *sensor);
+#else
+static inline struct thermal_sensor_data *
+thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
+				const struct thermal_zone_of_device_ops *ops)
+{
+	return ERR_PTR(-ENODEV);
+}
+
+void thermal_virtual_sensor_unregister(struct device *dev,
+				       struct thermal_sensor_data *sensor_data)
+{
+}
+
+static inline struct thermal_sensor_data *
+devm_thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
+				     const struct thermal_zone_of_device_ops *ops)
+{
+	return ERR_PTR(-ENODEV);
+}
+
+static inline
+void devm_thermal_virtual_sensor_unregister(struct device *dev,
+					    struct virtual_thermal_sensor *sensor)
+{
+}
+#endif
+
+#endif /* __THERMAL_VIRTUAL_SENSOR_H__ */
diff --git a/drivers/thermal/virtual_thermal_sensor.c b/drivers/thermal/virtual_thermal_sensor.c
new file mode 100644
index 0000000000000..234563af6643e
--- /dev/null
+++ b/drivers/thermal/virtual_thermal_sensor.c
@@ -0,0 +1,350 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 BayLibre
+ */
+
+#include <linux/err.h>
+#include <linux/export.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/of_platform.h>
+#include <linux/slab.h>
+#include <linux/thermal.h>
+#include <linux/types.h>
+#include <linux/string.h>
+
+#include <dt-bindings/thermal/virtual-sensor.h>
+
+#include "virtual-thermal-sensor.h"
+
+struct thermal_sensor_data {
+	struct list_head node;
+
+	/* sensor interface */
+	int id;
+	void *sensor_data;
+	const struct thermal_zone_of_device_ops *ops;
+};
+
+struct virtual_thermal_sensor {
+	int count;
+	struct thermal_sensor_data *sensors;
+	struct thermal_zone_device *tzd;
+	int (*aggr_temp)(int temp1, int temp2);
+
+	struct list_head node;
+};
+
+static LIST_HEAD(thermal_sensors);
+static LIST_HEAD(virtual_sensors);
+
+static int max_temp(int temp1, int temp2)
+{
+	return max(temp1, temp2);
+}
+
+static int min_temp(int temp1, int temp2)
+{
+	return min(temp1, temp2);
+}
+
+static int avg_temp(int temp1, int temp2)
+{
+	return ((temp1) / 2) + ((temp2) / 2) + (((temp1) % 2 + (temp2) % 2) / 2);
+}
+
+static int virtual_thermal_sensor_get_temp(void *data, int *temperature)
+{
+	struct virtual_thermal_sensor *sensor = data;
+	int max_temp = INT_MIN;
+	int temp;
+	int i;
+
+	for (i = 0; i < sensor->count; i++) {
+		struct thermal_sensor_data *hw_sensor;
+
+		hw_sensor = &sensor->sensors[i];
+		if (!hw_sensor->ops)
+			return -ENODEV;
+
+		hw_sensor->ops->get_temp(hw_sensor->sensor_data, &temp);
+		max_temp = sensor->aggr_temp(max_temp, temp);
+	}
+
+	*temperature = max_temp;
+
+	return 0;
+}
+
+static const struct thermal_zone_of_device_ops virtual_thermal_sensor_ops = {
+	.get_temp = virtual_thermal_sensor_get_temp,
+};
+
+static int virtual_sensor_add_sensor(struct virtual_thermal_sensor *sensor,
+				     struct of_phandle_args args,
+				     int index)
+{
+	struct thermal_sensor_data *sensor_data;
+	int id;
+
+	list_for_each_entry(sensor_data, &thermal_sensors, node) {
+		id = args.args_count ? args.args[0] : 0;
+		if (sensor_data->id == id) {
+			memcpy(&sensor->sensors[index], sensor_data,
+				sizeof(*sensor_data));
+			return 0;
+		}
+	}
+
+	return -ENODEV;
+}
+
+static int virtual_thermal_sensor_probe(struct platform_device *pdev)
+{
+	struct virtual_thermal_sensor *sensor;
+	struct device *dev = &pdev->dev;
+	struct of_phandle_args args;
+	u32 type;
+	int ret;
+	int i;
+
+	sensor = devm_kzalloc(dev, sizeof(*sensor), GFP_KERNEL);
+	if (!sensor)
+		return -ENOMEM;
+
+	sensor->count = of_count_phandle_with_args(dev->of_node,
+						   "thermal-sensors",
+						   "#thermal-sensor-cells");
+	if (sensor->count <= 0)
+		return -EINVAL;
+
+	sensor->sensors = devm_kmalloc_array(dev, sensor->count,
+					     sizeof(*sensor->sensors),
+					     GFP_KERNEL);
+	if (!sensor->sensors)
+		return -ENOMEM;
+
+	for (i = 0; i < sensor->count; i++) {
+		ret = of_parse_phandle_with_args(dev->of_node,
+						 "thermal-sensors",
+						 "#thermal-sensor-cells",
+						 i,
+						 &args);
+		if (ret)
+			return ret;
+
+		ret = virtual_sensor_add_sensor(sensor, args, i);
+		if (ret)
+			return ret;
+	}
+
+	ret = of_property_read_u32(dev->of_node, "aggregation-function", &type);
+	if (ret)
+		return ret;
+
+	switch (type) {
+	case VIRTUAL_THERMAL_SENSOR_MAX:
+		sensor->aggr_temp = max_temp;
+		break;
+	case VIRTUAL_THERMAL_SENSOR_MIN:
+		sensor->aggr_temp = min_temp;
+		break;
+	case VIRTUAL_THERMAL_SENSOR_AVG:
+		sensor->aggr_temp = avg_temp;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	sensor->tzd = devm_thermal_zone_of_sensor_register(dev, 0, sensor,
+							   &virtual_thermal_sensor_ops);
+	if (IS_ERR(sensor->tzd))
+		return PTR_ERR(sensor->tzd);
+
+	platform_set_drvdata(pdev, sensor);
+	list_add(&sensor->node, &virtual_sensors);
+
+	return 0;
+}
+
+static int virtual_thermal_sensor_remove(struct platform_device *pdev)
+{
+	struct virtual_thermal_sensor *sensor;
+
+	sensor = platform_get_drvdata(pdev);
+	list_del(&sensor->node);
+
+	return 0;
+}
+
+static const struct of_device_id virtual_thermal_sensor_of_match[] = {
+	{
+		.compatible = "virtual,thermal-sensor",
+	},
+	{
+	},
+};
+MODULE_DEVICE_TABLE(of, virtual_thermal_sensor_of_match);
+
+static struct platform_driver virtual_thermal_sensor = {
+	.probe = virtual_thermal_sensor_probe,
+	.remove = virtual_thermal_sensor_remove,
+	.driver = {
+		.name = "virtual-thermal-sensor",
+		.of_match_table = virtual_thermal_sensor_of_match,
+	},
+};
+
+/**
+ * thermal_virtual_sensor_register - registers a sensor that could be a virtual
+ * sensor
+ * @dev: a valid struct device pointer of a sensor device. Must contain
+ *       a valid .of_node, for the sensor node.
+ * @sensor_id: a sensor identifier, in case the sensor IP has more
+ *             than one sensor
+ * @data: a private pointer (owned by the caller) that will be passed
+ *        back, when a temperature reading is needed.
+ * @ops: struct thermal_zone_of_device_ops *. Must contain at least .get_temp.
+ *
+ * This function will register a thermal sensor to make it available for later
+ * usage by a virtual sensor.
+ *
+ * The thermal zone temperature is provided by the @get_temp function
+ * pointer. When called, it will have the private pointer @data back.
+ *
+ * Return: On success returns a valid struct thermal_zone_device,
+ * otherwise, it returns a corresponding ERR_PTR(). Caller must
+ * check the return value with help of IS_ERR() helper.
+ */
+struct thermal_sensor_data *thermal_virtual_sensor_register(
+	struct device *dev, int sensor_id, void *data,
+	const struct thermal_zone_of_device_ops *ops)
+{
+	struct thermal_sensor_data *sensor_data;
+
+	sensor_data = devm_kzalloc(dev, sizeof(*sensor_data), GFP_KERNEL);
+	if (!sensor_data)
+		return ERR_PTR(-ENOMEM);
+
+	sensor_data->id = sensor_id;
+	sensor_data->sensor_data = data;
+	sensor_data->ops = ops;
+
+	list_add(&sensor_data->node, &thermal_sensors);
+
+	return sensor_data;
+}
+EXPORT_SYMBOL_GPL(thermal_virtual_sensor_register);
+
+/**
+ * thermal_virtual_sensor_unregister - unregisters a sensor
+ * @dev: a valid struct device pointer of a sensor device.
+ * @sensor_data: a pointer to struct thermal_sensor_data to unregister.
+ *
+ * This function removes the sensor from the list of available thermal sensors.
+ * If the sensor is in use, then the next call to .get_temp will return -ENODEV.
+ */
+void thermal_virtual_sensor_unregister(struct device *dev,
+				       struct thermal_sensor_data *sensor_data)
+{
+	struct thermal_sensor_data *sd;
+	struct virtual_thermal_sensor *sensor;
+	int i;
+
+	list_del(&sensor_data->node);
+
+	list_for_each_entry(sensor, &virtual_sensors, node) {
+		for (i = 0; i < sensor->count; i++) {
+			sd = &sensor->sensors[i];
+			if (sd->id == sensor_data->id &&
+				sd->sensor_data == sensor_data->sensor_data) {
+				sd->ops = NULL;
+			}
+		}
+	}
+}
+EXPORT_SYMBOL_GPL(thermal_virtual_sensor_unregister);
+
+static void devm_thermal_virtual_sensor_release(struct device *dev, void *res)
+{
+	thermal_virtual_sensor_unregister(dev,
+					  *(struct thermal_sensor_data **)res);
+}
+
+static int devm_thermal_virtual_sensor_match(struct device *dev, void *res,
+					     void *data)
+{
+	struct thermal_sensor_data **r = res;
+
+	if (WARN_ON(!r || !*r))
+		return 0;
+
+	return *r == data;
+}
+
+/**
+ * devm_thermal_virtual_sensor_register - Resource managed version of
+ *				thermal_virtual_sensor_register()
+ * @dev: a valid struct device pointer of a sensor device. Must contain
+ *       a valid .of_node, for the sensor node.
+ * @sensor_id: a sensor identifier, in case the sensor IP has more
+ *	       than one sensor
+ * @data: a private pointer (owned by the caller) that will be passed
+ *	  back, when a temperature reading is needed.
+ * @ops: struct thermal_zone_of_device_ops *. Must contain at least .get_temp.
+ *
+ * Refer thermal_zone_of_sensor_register() for more details.
+ *
+ * Return: On success returns a valid struct virtual_sensor_data,
+ * otherwise, it returns a corresponding ERR_PTR(). Caller must
+ * check the return value with help of IS_ERR() helper.
+ * Registered virtual_sensor_data device will automatically be
+ * released when device is unbound.
+ */
+struct thermal_sensor_data *devm_thermal_virtual_sensor_register(
+	struct device *dev, int sensor_id,
+	void *data, const struct thermal_zone_of_device_ops *ops)
+{
+	struct thermal_sensor_data **ptr, *sensor_data;
+
+	ptr = devres_alloc(devm_thermal_virtual_sensor_release, sizeof(*ptr),
+			   GFP_KERNEL);
+	if (!ptr)
+		return ERR_PTR(-ENOMEM);
+
+	sensor_data = thermal_virtual_sensor_register(dev, sensor_id, data, ops);
+	if (IS_ERR(sensor_data)) {
+		devres_free(ptr);
+		return sensor_data;
+	}
+
+	*ptr = sensor_data;
+	devres_add(dev, ptr);
+
+	return sensor_data;
+}
+EXPORT_SYMBOL_GPL(devm_thermal_virtual_sensor_register);
+
+/**
+ * devm_thermal_virtual_sensor_unregister - Resource managed version of
+ *				thermal_virtual_sensor_unregister().
+ * @dev: Device for which resource was allocated.
+ * @sensor: a pointer to struct thermal_zone_device where the sensor is registered.
+ *
+ * This function removes the sensor from the list of sensors registered with
+ * devm_thermal_virtual_sensor_register() API.
+ * Normally this function will not need to be called and the resource
+ * management code will ensure that the resource is freed.
+ */
+void devm_thermal_virtual_sensor_unregister(struct device *dev,
+					    struct virtual_thermal_sensor *sensor)
+{
+	WARN_ON(devres_release(dev, devm_thermal_virtual_sensor_release,
+			       devm_thermal_virtual_sensor_match, sensor));
+}
+EXPORT_SYMBOL_GPL(devm_thermal_virtual_sensor_unregister);
+
+module_platform_driver(virtual_thermal_sensor);
+MODULE_AUTHOR("Alexandre Bailon <abailon@baylibre.com>");
+MODULE_DESCRIPTION("Virtual thermal sensor");
+MODULE_LICENSE("GPL v2");
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/2] Add a generic virtual thermal sensor
  2021-09-17  7:27 [PATCH v2 0/2] Add a generic virtual thermal sensor Alexandre Bailon
  2021-09-17  7:27 ` [PATCH v2 1/2] dt-bindings: Add bindings for the " Alexandre Bailon
  2021-09-17  7:27 ` [PATCH v2 2/2] thermal: add a virtual sensor to aggregate temperatures Alexandre Bailon
@ 2021-09-17 12:41 ` Daniel Lezcano
  2021-09-17 13:33   ` Alexandre Bailon
  2 siblings, 1 reply; 19+ messages in thread
From: Daniel Lezcano @ 2021-09-17 12:41 UTC (permalink / raw)
  To: Alexandre Bailon, rui.zhang, amitk
  Cc: linux-pm, linux-kernel, ben.tseng, khilman, mka

On 17/09/2021 09:27, Alexandre Bailon wrote:
> This series add a virtual thermal sensor.
> It could be used to get a temperature using some thermal sensors.
> Currently, the supported operations are max, min and avg.
> The virtual sensor could be easily extended to support others operations.
> 
> Note:
> Currently, thermal drivers must explicitly register their sensors to make them
> available to the virtual sensor.
> This doesn't seem a good solution to me and I think it would be preferable to
> update the framework to register the list of each available sensors.

Why must the drivers do that ?

> Changes in v2:
> - Fix some warnings / errors reported by kernel test robot
> - rename some struct and functions with a more accurate name
> - update the dt bindings: rename type attribute to aggregation-function
> - factorize a little bit the aggregation functions
> 
> Alexandre Bailon (2):
>   dt-bindings: Add bindings for the virtual thermal sensor
>   thermal: add a virtual sensor to aggregate temperatures
> 
>  .../thermal/virtual,thermal-sensor.yaml       |  67 +++
>  drivers/thermal/Kconfig                       |   8 +
>  drivers/thermal/Makefile                      |   1 +
>  drivers/thermal/virtual-sensor.h              |  51 +++
>  drivers/thermal/virtual_sensor.c              | 400 ++++++++++++++++++
>  include/dt-bindings/thermal/virtual-sensor.h  |  15 +
>  6 files changed, 542 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/thermal/virtual,thermal-sensor.yaml
>  create mode 100644 drivers/thermal/virtual-sensor.h
>  create mode 100644 drivers/thermal/virtual_sensor.c
>  create mode 100644 include/dt-bindings/thermal/virtual-sensor.h
> 


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/2] Add a generic virtual thermal sensor
  2021-09-17 12:41 ` [PATCH v2 0/2] Add a generic virtual thermal sensor Daniel Lezcano
@ 2021-09-17 13:33   ` Alexandre Bailon
  2021-09-17 14:03     ` Daniel Lezcano
  0 siblings, 1 reply; 19+ messages in thread
From: Alexandre Bailon @ 2021-09-17 13:33 UTC (permalink / raw)
  To: Daniel Lezcano, rui.zhang, amitk
  Cc: linux-pm, linux-kernel, ben.tseng, khilman, mka

Hi Daniel,

On 9/17/21 2:41 PM, Daniel Lezcano wrote:
> On 17/09/2021 09:27, Alexandre Bailon wrote:
>> This series add a virtual thermal sensor.
>> It could be used to get a temperature using some thermal sensors.
>> Currently, the supported operations are max, min and avg.
>> The virtual sensor could be easily extended to support others operations.
>>
>> Note:
>> Currently, thermal drivers must explicitly register their sensors to make them
>> available to the virtual sensor.
>> This doesn't seem a good solution to me and I think it would be preferable to
>> update the framework to register the list of each available sensors.
> Why must the drivers do that ?
Because there are no central place where thermal sensor are registered.
The only other way I found was to update thermal_of.c,
to register the thermal sensors and make them available later to the 
virtual thermal sensor.

To work, the virtual thermal need to get the sensor_data the ops from 
the thermal sensor.
And as far I know, this is only registered in thermal_of.c, in the 
thermal zone data
but I can't access it directly from the virtual thermal sensor.

How would you do it ?

Thanks,
Alexandre
>
>> Changes in v2:
>> - Fix some warnings / errors reported by kernel test robot
>> - rename some struct and functions with a more accurate name
>> - update the dt bindings: rename type attribute to aggregation-function
>> - factorize a little bit the aggregation functions
>>
>> Alexandre Bailon (2):
>>    dt-bindings: Add bindings for the virtual thermal sensor
>>    thermal: add a virtual sensor to aggregate temperatures
>>
>>   .../thermal/virtual,thermal-sensor.yaml       |  67 +++
>>   drivers/thermal/Kconfig                       |   8 +
>>   drivers/thermal/Makefile                      |   1 +
>>   drivers/thermal/virtual-sensor.h              |  51 +++
>>   drivers/thermal/virtual_sensor.c              | 400 ++++++++++++++++++
>>   include/dt-bindings/thermal/virtual-sensor.h  |  15 +
>>   6 files changed, 542 insertions(+)
>>   create mode 100644 Documentation/devicetree/bindings/thermal/virtual,thermal-sensor.yaml
>>   create mode 100644 drivers/thermal/virtual-sensor.h
>>   create mode 100644 drivers/thermal/virtual_sensor.c
>>   create mode 100644 include/dt-bindings/thermal/virtual-sensor.h
>>
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/2] Add a generic virtual thermal sensor
  2021-09-17 13:33   ` Alexandre Bailon
@ 2021-09-17 14:03     ` Daniel Lezcano
  2021-09-20 13:12       ` Alexandre Bailon
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Lezcano @ 2021-09-17 14:03 UTC (permalink / raw)
  To: Alexandre Bailon, rui.zhang, amitk
  Cc: linux-pm, linux-kernel, ben.tseng, khilman, mka

On 17/09/2021 15:33, Alexandre Bailon wrote:
> Hi Daniel,
> 
> On 9/17/21 2:41 PM, Daniel Lezcano wrote:
>> On 17/09/2021 09:27, Alexandre Bailon wrote:
>>> This series add a virtual thermal sensor.
>>> It could be used to get a temperature using some thermal sensors.
>>> Currently, the supported operations are max, min and avg.
>>> The virtual sensor could be easily extended to support others
>>> operations.
>>>
>>> Note:
>>> Currently, thermal drivers must explicitly register their sensors to
>>> make them
>>> available to the virtual sensor.
>>> This doesn't seem a good solution to me and I think it would be
>>> preferable to
>>> update the framework to register the list of each available sensors.
>> Why must the drivers do that ?
> Because there are no central place where thermal sensor are registered.
> The only other way I found was to update thermal_of.c,
> to register the thermal sensors and make them available later to the
> virtual thermal sensor.
> 
> To work, the virtual thermal need to get the sensor_data the ops from
> the thermal sensor.
> And as far I know, this is only registered in thermal_of.c, in the
> thermal zone data
> but I can't access it directly from the virtual thermal sensor.
> 
> How would you do it ?

Via the phandles when registering the virtual sensor ?

Or is the problem you are mentioning related to the sensor module
loading after the virtual sensor ?

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/2] Add a generic virtual thermal sensor
  2021-09-17 14:03     ` Daniel Lezcano
@ 2021-09-20 13:12       ` Alexandre Bailon
  2021-09-22  8:10         ` Daniel Lezcano
  0 siblings, 1 reply; 19+ messages in thread
From: Alexandre Bailon @ 2021-09-20 13:12 UTC (permalink / raw)
  To: Daniel Lezcano, rui.zhang, amitk
  Cc: linux-pm, linux-kernel, ben.tseng, khilman, mka


On 9/17/21 4:03 PM, Daniel Lezcano wrote:
> On 17/09/2021 15:33, Alexandre Bailon wrote:
>> Hi Daniel,
>>
>> On 9/17/21 2:41 PM, Daniel Lezcano wrote:
>>> On 17/09/2021 09:27, Alexandre Bailon wrote:
>>>> This series add a virtual thermal sensor.
>>>> It could be used to get a temperature using some thermal sensors.
>>>> Currently, the supported operations are max, min and avg.
>>>> The virtual sensor could be easily extended to support others
>>>> operations.
>>>>
>>>> Note:
>>>> Currently, thermal drivers must explicitly register their sensors to
>>>> make them
>>>> available to the virtual sensor.
>>>> This doesn't seem a good solution to me and I think it would be
>>>> preferable to
>>>> update the framework to register the list of each available sensors.
>>> Why must the drivers do that ?
>> Because there are no central place where thermal sensor are registered.
>> The only other way I found was to update thermal_of.c,
>> to register the thermal sensors and make them available later to the
>> virtual thermal sensor.
>>
>> To work, the virtual thermal need to get the sensor_data the ops from
>> the thermal sensor.
>> And as far I know, this is only registered in thermal_of.c, in the
>> thermal zone data
>> but I can't access it directly from the virtual thermal sensor.
>>
>> How would you do it ?
> Via the phandles when registering the virtual sensor ?
As far I know, we can't get the ops or the sensor_data from the phandle 
of a thermal sensor.
The closest solution I found so far would be to aggregate the thermal 
zones instead of thermal sensors.
thermal_zone_device has the data needed and a thermal zone could be find 
easily using its name.

But, using a thermal_zone_device, I don't see how to handle module 
unloading.

Thanks,
Alexandre
> Or is the problem you are mentioning related to the sensor module
> loading after the virtual sensor ?
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/2] Add a generic virtual thermal sensor
  2021-09-20 13:12       ` Alexandre Bailon
@ 2021-09-22  8:10         ` Daniel Lezcano
  2021-10-04 10:24           ` Alexandre Bailon
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Lezcano @ 2021-09-22  8:10 UTC (permalink / raw)
  To: Alexandre Bailon, rui.zhang, amitk
  Cc: linux-pm, linux-kernel, ben.tseng, khilman, mka

On 20/09/2021 15:12, Alexandre Bailon wrote:
> 
> On 9/17/21 4:03 PM, Daniel Lezcano wrote:
>> On 17/09/2021 15:33, Alexandre Bailon wrote:
>>> Hi Daniel,
>>>
>>> On 9/17/21 2:41 PM, Daniel Lezcano wrote:
>>>> On 17/09/2021 09:27, Alexandre Bailon wrote:
>>>>> This series add a virtual thermal sensor.
>>>>> It could be used to get a temperature using some thermal sensors.
>>>>> Currently, the supported operations are max, min and avg.
>>>>> The virtual sensor could be easily extended to support others
>>>>> operations.
>>>>>
>>>>> Note:
>>>>> Currently, thermal drivers must explicitly register their sensors to
>>>>> make them
>>>>> available to the virtual sensor.
>>>>> This doesn't seem a good solution to me and I think it would be
>>>>> preferable to
>>>>> update the framework to register the list of each available sensors.
>>>> Why must the drivers do that ?
>>> Because there are no central place where thermal sensor are registered.
>>> The only other way I found was to update thermal_of.c,
>>> to register the thermal sensors and make them available later to the
>>> virtual thermal sensor.
>>>
>>> To work, the virtual thermal need to get the sensor_data the ops from
>>> the thermal sensor.
>>> And as far I know, this is only registered in thermal_of.c, in the
>>> thermal zone data
>>> but I can't access it directly from the virtual thermal sensor.
>>>
>>> How would you do it ?
>> Via the phandles when registering the virtual sensor ?
> As far I know, we can't get the ops or the sensor_data from the phandle
> of a thermal sensor.
> The closest solution I found so far would be to aggregate the thermal
> zones instead of thermal sensors.
> thermal_zone_device has the data needed and a thermal zone could be find
> easily using its name.

Yeah, the concept of the thermal zone and the sensor are very close.

There is the function in thermal_core.h:

 -> for_each_thermal_zone()

You should be able for each 'slave' sensor, do a lookup to find the
corresponding thermal_zone_device_ops.

> But, using a thermal_zone_device, I don't see how to handle module
> unloading.

I think try_module_get() / module_put() are adequate for this situation
as it is done on an external module and we can not rely on the exported
symbols.


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/2] Add a generic virtual thermal sensor
  2021-09-22  8:10         ` Daniel Lezcano
@ 2021-10-04 10:24           ` Alexandre Bailon
  2021-10-04 13:42             ` Daniel Lezcano
  0 siblings, 1 reply; 19+ messages in thread
From: Alexandre Bailon @ 2021-10-04 10:24 UTC (permalink / raw)
  To: Daniel Lezcano, rui.zhang, amitk
  Cc: linux-pm, linux-kernel, ben.tseng, khilman, mka


On 9/22/21 10:10 AM, Daniel Lezcano wrote:
> On 20/09/2021 15:12, Alexandre Bailon wrote:
>> On 9/17/21 4:03 PM, Daniel Lezcano wrote:
>>> On 17/09/2021 15:33, Alexandre Bailon wrote:
>>>> Hi Daniel,
>>>>
>>>> On 9/17/21 2:41 PM, Daniel Lezcano wrote:
>>>>> On 17/09/2021 09:27, Alexandre Bailon wrote:
>>>>>> This series add a virtual thermal sensor.
>>>>>> It could be used to get a temperature using some thermal sensors.
>>>>>> Currently, the supported operations are max, min and avg.
>>>>>> The virtual sensor could be easily extended to support others
>>>>>> operations.
>>>>>>
>>>>>> Note:
>>>>>> Currently, thermal drivers must explicitly register their sensors to
>>>>>> make them
>>>>>> available to the virtual sensor.
>>>>>> This doesn't seem a good solution to me and I think it would be
>>>>>> preferable to
>>>>>> update the framework to register the list of each available sensors.
>>>>> Why must the drivers do that ?
>>>> Because there are no central place where thermal sensor are registered.
>>>> The only other way I found was to update thermal_of.c,
>>>> to register the thermal sensors and make them available later to the
>>>> virtual thermal sensor.
>>>>
>>>> To work, the virtual thermal need to get the sensor_data the ops from
>>>> the thermal sensor.
>>>> And as far I know, this is only registered in thermal_of.c, in the
>>>> thermal zone data
>>>> but I can't access it directly from the virtual thermal sensor.
>>>>
>>>> How would you do it ?
>>> Via the phandles when registering the virtual sensor ?
>> As far I know, we can't get the ops or the sensor_data from the phandle
>> of a thermal sensor.
>> The closest solution I found so far would be to aggregate the thermal
>> zones instead of thermal sensors.
>> thermal_zone_device has the data needed and a thermal zone could be find
>> easily using its name.
> Yeah, the concept of the thermal zone and the sensor are very close.
>
> There is the function in thermal_core.h:
>
>   -> for_each_thermal_zone()
>
> You should be able for each 'slave' sensor, do a lookup to find the
> corresponding thermal_zone_device_ops.
>
>> But, using a thermal_zone_device, I don't see how to handle module
>> unloading.
> I think try_module_get() / module_put() are adequate for this situation
> as it is done on an external module and we can not rely on the exported
> symbols.
I don't see how it would be possible to use these functions.
The thermal zone doesn't have the data required to use it.

Maybe a more easier way is to use the thermal_zone_device mutex.
If I get a lock before to use the thermal_zone_device ops, I have the 
guaranty that module won't be unloaded.

When a "thermal of sensor" is unloaded, it calls 
thermal_zone_of_sensor_unregister which takes a lock before
update ops.

Thanks,
Alexandre

>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/2] Add a generic virtual thermal sensor
  2021-10-04 10:24           ` Alexandre Bailon
@ 2021-10-04 13:42             ` Daniel Lezcano
  2021-10-05 16:45               ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Lezcano @ 2021-10-04 13:42 UTC (permalink / raw)
  To: Alexandre Bailon, rui.zhang, amitk, Rafael J. Wysocki
  Cc: linux-pm, linux-kernel, ben.tseng, khilman, mka

On 04/10/2021 12:24, Alexandre Bailon wrote:
> 
> On 9/22/21 10:10 AM, Daniel Lezcano wrote:
>> On 20/09/2021 15:12, Alexandre Bailon wrote:
>>> On 9/17/21 4:03 PM, Daniel Lezcano wrote:
>>>> On 17/09/2021 15:33, Alexandre Bailon wrote:
>>>>> Hi Daniel,
>>>>>
>>>>> On 9/17/21 2:41 PM, Daniel Lezcano wrote:
>>>>>> On 17/09/2021 09:27, Alexandre Bailon wrote:
>>>>>>> This series add a virtual thermal sensor.
>>>>>>> It could be used to get a temperature using some thermal sensors.
>>>>>>> Currently, the supported operations are max, min and avg.
>>>>>>> The virtual sensor could be easily extended to support others
>>>>>>> operations.
>>>>>>>
>>>>>>> Note:
>>>>>>> Currently, thermal drivers must explicitly register their sensors to
>>>>>>> make them
>>>>>>> available to the virtual sensor.
>>>>>>> This doesn't seem a good solution to me and I think it would be
>>>>>>> preferable to
>>>>>>> update the framework to register the list of each available sensors.
>>>>>> Why must the drivers do that ?
>>>>> Because there are no central place where thermal sensor are
>>>>> registered.
>>>>> The only other way I found was to update thermal_of.c,
>>>>> to register the thermal sensors and make them available later to the
>>>>> virtual thermal sensor.
>>>>>
>>>>> To work, the virtual thermal need to get the sensor_data the ops from
>>>>> the thermal sensor.
>>>>> And as far I know, this is only registered in thermal_of.c, in the
>>>>> thermal zone data
>>>>> but I can't access it directly from the virtual thermal sensor.
>>>>>
>>>>> How would you do it ?
>>>> Via the phandles when registering the virtual sensor ?
>>> As far I know, we can't get the ops or the sensor_data from the phandle
>>> of a thermal sensor.
>>> The closest solution I found so far would be to aggregate the thermal
>>> zones instead of thermal sensors.
>>> thermal_zone_device has the data needed and a thermal zone could be find
>>> easily using its name.
>> Yeah, the concept of the thermal zone and the sensor are very close.
>>
>> There is the function in thermal_core.h:
>>
>>   -> for_each_thermal_zone()
>>
>> You should be able for each 'slave' sensor, do a lookup to find the
>> corresponding thermal_zone_device_ops.
>>
>>> But, using a thermal_zone_device, I don't see how to handle module
>>> unloading.
>> I think try_module_get() / module_put() are adequate for this situation
>> as it is done on an external module and we can not rely on the exported
>> symbols.
> I don't see how it would be possible to use these functions.
> The thermal zone doesn't have the data required to use it.

Actually I was able to crash the kernel by doing:

console 1:

while $(true); do insmod <module> && rmmod <module>; done

console 2:

while $(true); cat /sys/class/thermal/thermal_zone0/temp; done

So there is something wrong already in the thermal framework.

IMO, the first thing would be to fix this critical issue by getting the
sensor module refcount when the thermal zone is enabled and dropping it
when it is disabled.

With that fixed, perhaps it will possible to get the device associated
with the sensor and then try_module_get(dev->driver->owner)

> Maybe a more easier way is to use the thermal_zone_device mutex.
> If I get a lock before to use the thermal_zone_device ops, I have the
> guaranty that module won't be unloaded.
> 
> When a "thermal of sensor" is unloaded, it calls
> thermal_zone_of_sensor_unregister which takes a lock before
> update ops.

I'm not sure to understand. The goal is to have the refcount on the
modules to be incremented when the virtual sensor is using them.

Until the virtual sensor is registered, it will prevent the other
modules to be unloaded.

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/2] Add a generic virtual thermal sensor
  2021-10-04 13:42             ` Daniel Lezcano
@ 2021-10-05 16:45               ` Rafael J. Wysocki
  2021-10-06 16:06                 ` Daniel Lezcano
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2021-10-05 16:45 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Alexandre Bailon, Zhang, Rui, Amit Kucheria, Rafael J. Wysocki,
	Linux PM, Linux Kernel Mailing List, ben.tseng, Kevin Hilman,
	Matthias Kaehlcke

On Mon, Oct 4, 2021 at 3:42 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>
> On 04/10/2021 12:24, Alexandre Bailon wrote:
> >
> > On 9/22/21 10:10 AM, Daniel Lezcano wrote:
> >> On 20/09/2021 15:12, Alexandre Bailon wrote:
> >>> On 9/17/21 4:03 PM, Daniel Lezcano wrote:
> >>>> On 17/09/2021 15:33, Alexandre Bailon wrote:
> >>>>> Hi Daniel,
> >>>>>
> >>>>> On 9/17/21 2:41 PM, Daniel Lezcano wrote:
> >>>>>> On 17/09/2021 09:27, Alexandre Bailon wrote:
> >>>>>>> This series add a virtual thermal sensor.
> >>>>>>> It could be used to get a temperature using some thermal sensors.
> >>>>>>> Currently, the supported operations are max, min and avg.
> >>>>>>> The virtual sensor could be easily extended to support others
> >>>>>>> operations.
> >>>>>>>
> >>>>>>> Note:
> >>>>>>> Currently, thermal drivers must explicitly register their sensors to
> >>>>>>> make them
> >>>>>>> available to the virtual sensor.
> >>>>>>> This doesn't seem a good solution to me and I think it would be
> >>>>>>> preferable to
> >>>>>>> update the framework to register the list of each available sensors.
> >>>>>> Why must the drivers do that ?
> >>>>> Because there are no central place where thermal sensor are
> >>>>> registered.
> >>>>> The only other way I found was to update thermal_of.c,
> >>>>> to register the thermal sensors and make them available later to the
> >>>>> virtual thermal sensor.
> >>>>>
> >>>>> To work, the virtual thermal need to get the sensor_data the ops from
> >>>>> the thermal sensor.
> >>>>> And as far I know, this is only registered in thermal_of.c, in the
> >>>>> thermal zone data
> >>>>> but I can't access it directly from the virtual thermal sensor.
> >>>>>
> >>>>> How would you do it ?
> >>>> Via the phandles when registering the virtual sensor ?
> >>> As far I know, we can't get the ops or the sensor_data from the phandle
> >>> of a thermal sensor.
> >>> The closest solution I found so far would be to aggregate the thermal
> >>> zones instead of thermal sensors.
> >>> thermal_zone_device has the data needed and a thermal zone could be find
> >>> easily using its name.
> >> Yeah, the concept of the thermal zone and the sensor are very close.
> >>
> >> There is the function in thermal_core.h:
> >>
> >>   -> for_each_thermal_zone()
> >>
> >> You should be able for each 'slave' sensor, do a lookup to find the
> >> corresponding thermal_zone_device_ops.
> >>
> >>> But, using a thermal_zone_device, I don't see how to handle module
> >>> unloading.
> >> I think try_module_get() / module_put() are adequate for this situation
> >> as it is done on an external module and we can not rely on the exported
> >> symbols.
> > I don't see how it would be possible to use these functions.
> > The thermal zone doesn't have the data required to use it.
>
> Actually I was able to crash the kernel by doing:
>
> console 1:
>
> while $(true); do insmod <module> && rmmod <module>; done
>
> console 2:
>
> while $(true); cat /sys/class/thermal/thermal_zone0/temp; done
>
> So there is something wrong already in the thermal framework.

Hmmm.

> IMO, the first thing would be to fix this critical issue by getting the
> sensor module refcount when the thermal zone is enabled and dropping it
> when it is disabled.
>
> With that fixed, perhaps it will possible to get the device associated
> with the sensor and then try_module_get(dev->driver->owner)
>
> > Maybe a more easier way is to use the thermal_zone_device mutex.
> > If I get a lock before to use the thermal_zone_device ops, I have the
> > guaranty that module won't be unloaded.

That would be my approach too.

> > When a "thermal of sensor" is unloaded, it calls
> > thermal_zone_of_sensor_unregister which takes a lock before
> > update ops.
>
> I'm not sure to understand. The goal is to have the refcount on the
> modules to be incremented when the virtual sensor is using them.

IMO the goal is to prevent the code from crashing when modules get
unloaded.  I'm not really sure if refcounts alone are sufficient for
that.

> Until the virtual sensor is registered, it will prevent the other
> modules to be unloaded.

Unless they are forced to unload that is, AFAICS.

IMO it would be better to make the code survive unloading of a module.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/2] Add a generic virtual thermal sensor
  2021-10-05 16:45               ` Rafael J. Wysocki
@ 2021-10-06 16:06                 ` Daniel Lezcano
  2021-10-06 18:00                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Lezcano @ 2021-10-06 16:06 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alexandre Bailon, Zhang, Rui, Amit Kucheria, Linux PM,
	Linux Kernel Mailing List, ben.tseng, Kevin Hilman,
	Matthias Kaehlcke

On 05/10/2021 18:45, Rafael J. Wysocki wrote:
> On Mon, Oct 4, 2021 at 3:42 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>>
>> On 04/10/2021 12:24, Alexandre Bailon wrote:
>>>
>>> On 9/22/21 10:10 AM, Daniel Lezcano wrote:
>>>> On 20/09/2021 15:12, Alexandre Bailon wrote:
>>>>> On 9/17/21 4:03 PM, Daniel Lezcano wrote:
>>>>>> On 17/09/2021 15:33, Alexandre Bailon wrote:
>>>>>>> Hi Daniel,
>>>>>>>
>>>>>>> On 9/17/21 2:41 PM, Daniel Lezcano wrote:
>>>>>>>> On 17/09/2021 09:27, Alexandre Bailon wrote:
>>>>>>>>> This series add a virtual thermal sensor.
>>>>>>>>> It could be used to get a temperature using some thermal sensors.
>>>>>>>>> Currently, the supported operations are max, min and avg.
>>>>>>>>> The virtual sensor could be easily extended to support others
>>>>>>>>> operations.
>>>>>>>>>
>>>>>>>>> Note:
>>>>>>>>> Currently, thermal drivers must explicitly register their sensors to
>>>>>>>>> make them
>>>>>>>>> available to the virtual sensor.
>>>>>>>>> This doesn't seem a good solution to me and I think it would be
>>>>>>>>> preferable to
>>>>>>>>> update the framework to register the list of each available sensors.
>>>>>>>> Why must the drivers do that ?
>>>>>>> Because there are no central place where thermal sensor are
>>>>>>> registered.
>>>>>>> The only other way I found was to update thermal_of.c,
>>>>>>> to register the thermal sensors and make them available later to the
>>>>>>> virtual thermal sensor.
>>>>>>>
>>>>>>> To work, the virtual thermal need to get the sensor_data the ops from
>>>>>>> the thermal sensor.
>>>>>>> And as far I know, this is only registered in thermal_of.c, in the
>>>>>>> thermal zone data
>>>>>>> but I can't access it directly from the virtual thermal sensor.
>>>>>>>
>>>>>>> How would you do it ?
>>>>>> Via the phandles when registering the virtual sensor ?
>>>>> As far I know, we can't get the ops or the sensor_data from the phandle
>>>>> of a thermal sensor.
>>>>> The closest solution I found so far would be to aggregate the thermal
>>>>> zones instead of thermal sensors.
>>>>> thermal_zone_device has the data needed and a thermal zone could be find
>>>>> easily using its name.
>>>> Yeah, the concept of the thermal zone and the sensor are very close.
>>>>
>>>> There is the function in thermal_core.h:
>>>>
>>>>   -> for_each_thermal_zone()
>>>>
>>>> You should be able for each 'slave' sensor, do a lookup to find the
>>>> corresponding thermal_zone_device_ops.
>>>>
>>>>> But, using a thermal_zone_device, I don't see how to handle module
>>>>> unloading.
>>>> I think try_module_get() / module_put() are adequate for this situation
>>>> as it is done on an external module and we can not rely on the exported
>>>> symbols.
>>> I don't see how it would be possible to use these functions.
>>> The thermal zone doesn't have the data required to use it.
>>
>> Actually I was able to crash the kernel by doing:
>>
>> console 1:
>>
>> while $(true); do insmod <module> && rmmod <module>; done
>>
>> console 2:
>>
>> while $(true); cat /sys/class/thermal/thermal_zone0/temp; done
>>
>> So there is something wrong already in the thermal framework.
> 
> Hmmm.
> 
>> IMO, the first thing would be to fix this critical issue by getting the
>> sensor module refcount when the thermal zone is enabled and dropping it
>> when it is disabled.
>>
>> With that fixed, perhaps it will possible to get the device associated
>> with the sensor and then try_module_get(dev->driver->owner)
>>
>>> Maybe a more easier way is to use the thermal_zone_device mutex.
>>> If I get a lock before to use the thermal_zone_device ops, I have the
>>> guaranty that module won't be unloaded.
> 
> That would be my approach too.

The mutex is private to the thermal core. The virtual sensor should not
touch it :/

Perhaps, it can work with a private spin_lock with a try_spinlock() ?

>>> When a "thermal of sensor" is unloaded, it calls
>>> thermal_zone_of_sensor_unregister which takes a lock before
>>> update ops.
>>
>> I'm not sure to understand. The goal is to have the refcount on the
>> modules to be incremented when the virtual sensor is using them.
> 
> IMO the goal is to prevent the code from crashing when modules get
> unloaded.  I'm not really sure if refcounts alone are sufficient for
> that.

The problem is in the loop:

+static int virtual_thermal_sensor_get_temp(void *data, int *temperature)
+{
+	struct virtual_thermal_sensor *sensor = data;
+	int max_temp = INT_MIN;
+	int temp;
+	int i;
+
+	for (i = 0; i < sensor->count; i++) {
+		struct thermal_sensor_data *hw_sensor;
+
+		hw_sensor = &sensor->sensors[i];
+		if (!hw_sensor->ops)
+			return -ENODEV;
+
+		hw_sensor->ops->get_temp(hw_sensor->sensor_data, &temp);
+		max_temp = sensor->aggr_temp(max_temp, temp);
+	}
+
+	*temperature = max_temp;
+
+	return 0;
+}

If one of the sensor is unloaded when get_temp is called,
hw_sensor->ops->get_temp will crash.

So the proposal is virtual_sensor_add_sensor() does try_get_module()
and virtual_sensor_remove_sensor() does put_module().

The ref on the 'slave' modules will be release only if the virtual
sensor is unregistered.

So until, the virtual sensor is unregistered, the 'slaves' modules can
not be unloaded.

That is what we find with eg. the wifi modules.

>> Until the virtual sensor is registered, it will prevent the other
>> modules to be unloaded.
> 
> Unless they are forced to unload that is, AFAICS.
> 
> IMO it would be better to make the code survive unloading of a module.



-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/2] Add a generic virtual thermal sensor
  2021-10-06 16:06                 ` Daniel Lezcano
@ 2021-10-06 18:00                   ` Rafael J. Wysocki
  2021-10-06 19:51                     ` Daniel Lezcano
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2021-10-06 18:00 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Rafael J. Wysocki, Alexandre Bailon, Zhang, Rui, Amit Kucheria,
	Linux PM, Linux Kernel Mailing List, ben.tseng, Kevin Hilman,
	Matthias Kaehlcke

On Wed, Oct 6, 2021 at 6:06 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>
> On 05/10/2021 18:45, Rafael J. Wysocki wrote:
> > On Mon, Oct 4, 2021 at 3:42 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
> >>
> >> On 04/10/2021 12:24, Alexandre Bailon wrote:
> >>>
> >>> On 9/22/21 10:10 AM, Daniel Lezcano wrote:
> >>>> On 20/09/2021 15:12, Alexandre Bailon wrote:
> >>>>> On 9/17/21 4:03 PM, Daniel Lezcano wrote:
> >>>>>> On 17/09/2021 15:33, Alexandre Bailon wrote:
> >>>>>>> Hi Daniel,
> >>>>>>>
> >>>>>>> On 9/17/21 2:41 PM, Daniel Lezcano wrote:
> >>>>>>>> On 17/09/2021 09:27, Alexandre Bailon wrote:
> >>>>>>>>> This series add a virtual thermal sensor.
> >>>>>>>>> It could be used to get a temperature using some thermal sensors.
> >>>>>>>>> Currently, the supported operations are max, min and avg.
> >>>>>>>>> The virtual sensor could be easily extended to support others
> >>>>>>>>> operations.
> >>>>>>>>>
> >>>>>>>>> Note:
> >>>>>>>>> Currently, thermal drivers must explicitly register their sensors to
> >>>>>>>>> make them
> >>>>>>>>> available to the virtual sensor.
> >>>>>>>>> This doesn't seem a good solution to me and I think it would be
> >>>>>>>>> preferable to
> >>>>>>>>> update the framework to register the list of each available sensors.
> >>>>>>>> Why must the drivers do that ?
> >>>>>>> Because there are no central place where thermal sensor are
> >>>>>>> registered.
> >>>>>>> The only other way I found was to update thermal_of.c,
> >>>>>>> to register the thermal sensors and make them available later to the
> >>>>>>> virtual thermal sensor.
> >>>>>>>
> >>>>>>> To work, the virtual thermal need to get the sensor_data the ops from
> >>>>>>> the thermal sensor.
> >>>>>>> And as far I know, this is only registered in thermal_of.c, in the
> >>>>>>> thermal zone data
> >>>>>>> but I can't access it directly from the virtual thermal sensor.
> >>>>>>>
> >>>>>>> How would you do it ?
> >>>>>> Via the phandles when registering the virtual sensor ?
> >>>>> As far I know, we can't get the ops or the sensor_data from the phandle
> >>>>> of a thermal sensor.
> >>>>> The closest solution I found so far would be to aggregate the thermal
> >>>>> zones instead of thermal sensors.
> >>>>> thermal_zone_device has the data needed and a thermal zone could be find
> >>>>> easily using its name.
> >>>> Yeah, the concept of the thermal zone and the sensor are very close.
> >>>>
> >>>> There is the function in thermal_core.h:
> >>>>
> >>>>   -> for_each_thermal_zone()
> >>>>
> >>>> You should be able for each 'slave' sensor, do a lookup to find the
> >>>> corresponding thermal_zone_device_ops.
> >>>>
> >>>>> But, using a thermal_zone_device, I don't see how to handle module
> >>>>> unloading.
> >>>> I think try_module_get() / module_put() are adequate for this situation
> >>>> as it is done on an external module and we can not rely on the exported
> >>>> symbols.
> >>> I don't see how it would be possible to use these functions.
> >>> The thermal zone doesn't have the data required to use it.
> >>
> >> Actually I was able to crash the kernel by doing:
> >>
> >> console 1:
> >>
> >> while $(true); do insmod <module> && rmmod <module>; done
> >>
> >> console 2:
> >>
> >> while $(true); cat /sys/class/thermal/thermal_zone0/temp; done
> >>
> >> So there is something wrong already in the thermal framework.
> >
> > Hmmm.
> >
> >> IMO, the first thing would be to fix this critical issue by getting the
> >> sensor module refcount when the thermal zone is enabled and dropping it
> >> when it is disabled.
> >>
> >> With that fixed, perhaps it will possible to get the device associated
> >> with the sensor and then try_module_get(dev->driver->owner)
> >>
> >>> Maybe a more easier way is to use the thermal_zone_device mutex.
> >>> If I get a lock before to use the thermal_zone_device ops, I have the
> >>> guaranty that module won't be unloaded.
> >
> > That would be my approach too.
>
> The mutex is private to the thermal core. The virtual sensor should not
> touch it :/
>
> Perhaps, it can work with a private spin_lock with a try_spinlock() ?

IIUC this is a case when module A refers to some memory in module B
and if the latter goes away, an access to that memory from the former
is a use-after-free, so it is not sufficient to use a local spinlock.

This can be avoided by having a lock and a flag such that the flag is
set under the lock by module B when making the memory in question
available and cleared under the lock when freeing that memory.  Then,
module A needs to check the flag under the lock on every access to
that memory.  Also, the lock and the flag must be accessible all the
time to both modules (ie. must not go away along with any of them if
they don't depend on each other).

> >>> When a "thermal of sensor" is unloaded, it calls
> >>> thermal_zone_of_sensor_unregister which takes a lock before
> >>> update ops.
> >>
> >> I'm not sure to understand. The goal is to have the refcount on the
> >> modules to be incremented when the virtual sensor is using them.
> >
> > IMO the goal is to prevent the code from crashing when modules get
> > unloaded.  I'm not really sure if refcounts alone are sufficient for
> > that.
>
> The problem is in the loop:
>
> +static int virtual_thermal_sensor_get_temp(void *data, int *temperature)
> +{
> +       struct virtual_thermal_sensor *sensor = data;
> +       int max_temp = INT_MIN;
> +       int temp;
> +       int i;
> +
> +       for (i = 0; i < sensor->count; i++) {
> +               struct thermal_sensor_data *hw_sensor;
> +
> +               hw_sensor = &sensor->sensors[i];
> +               if (!hw_sensor->ops)
> +                       return -ENODEV;
> +
> +               hw_sensor->ops->get_temp(hw_sensor->sensor_data, &temp);
> +               max_temp = sensor->aggr_temp(max_temp, temp);
> +       }
> +
> +       *temperature = max_temp;
> +
> +       return 0;
> +}
>
> If one of the sensor is unloaded when get_temp is called,
> hw_sensor->ops->get_temp will crash.

Right.

Dereferencing hw_sensor itself is not safe in this loop if the module
holding the memory pointed to by it may go away.

However, presumably, the hw_sensor object needs to be registered with
the core in order to be used here and unregistered when it goes away,
so it looks like this loop could use a wrapper like
thermal_get_sensor_temp(hw_sensor, &temp) which would return a
negative error code if the sensor in question went away.

Of course, that would require the core to check the list of available
sensors every time, but that may be implemented efficiently with the
help of an xarray, for example.

> So the proposal is virtual_sensor_add_sensor() does try_get_module()
> and virtual_sensor_remove_sensor() does put_module().
>
> The ref on the 'slave' modules will be release only if the virtual
> sensor is unregistered.
>
> So until, the virtual sensor is unregistered, the 'slaves' modules can
> not be unloaded.
>
> That is what we find with eg. the wifi modules.

Yes, but that's a bit cumbersome from the sysadmin perspective IMO,
especially when one wants to unload one of the modules and doesn't
know exactly what other modules hold references to it.

IIUC ref_module() might be nicer, but it is still more convenient if
the modules can just go away independently.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/2] Add a generic virtual thermal sensor
  2021-10-06 18:00                   ` Rafael J. Wysocki
@ 2021-10-06 19:51                     ` Daniel Lezcano
  2021-10-07 14:17                       ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Daniel Lezcano @ 2021-10-06 19:51 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alexandre Bailon, Zhang, Rui, Amit Kucheria, Linux PM,
	Linux Kernel Mailing List, ben.tseng, Kevin Hilman,
	Matthias Kaehlcke

On 06/10/2021 20:00, Rafael J. Wysocki wrote:
> On Wed, Oct 6, 2021 at 6:06 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>>
>> On 05/10/2021 18:45, Rafael J. Wysocki wrote:
>>> On Mon, Oct 4, 2021 at 3:42 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>>>>
>>>> On 04/10/2021 12:24, Alexandre Bailon wrote:
>>>>>
>>>>> On 9/22/21 10:10 AM, Daniel Lezcano wrote:
>>>>>> On 20/09/2021 15:12, Alexandre Bailon wrote:
>>>>>>> On 9/17/21 4:03 PM, Daniel Lezcano wrote:
>>>>>>>> On 17/09/2021 15:33, Alexandre Bailon wrote:
>>>>>>>>> Hi Daniel,
>>>>>>>>>
>>>>>>>>> On 9/17/21 2:41 PM, Daniel Lezcano wrote:
>>>>>>>>>> On 17/09/2021 09:27, Alexandre Bailon wrote:
>>>>>>>>>>> This series add a virtual thermal sensor.
>>>>>>>>>>> It could be used to get a temperature using some thermal sensors.
>>>>>>>>>>> Currently, the supported operations are max, min and avg.
>>>>>>>>>>> The virtual sensor could be easily extended to support others
>>>>>>>>>>> operations.
>>>>>>>>>>>
>>>>>>>>>>> Note:
>>>>>>>>>>> Currently, thermal drivers must explicitly register their sensors to
>>>>>>>>>>> make them
>>>>>>>>>>> available to the virtual sensor.
>>>>>>>>>>> This doesn't seem a good solution to me and I think it would be
>>>>>>>>>>> preferable to
>>>>>>>>>>> update the framework to register the list of each available sensors.
>>>>>>>>>> Why must the drivers do that ?
>>>>>>>>> Because there are no central place where thermal sensor are
>>>>>>>>> registered.
>>>>>>>>> The only other way I found was to update thermal_of.c,
>>>>>>>>> to register the thermal sensors and make them available later to the
>>>>>>>>> virtual thermal sensor.
>>>>>>>>>
>>>>>>>>> To work, the virtual thermal need to get the sensor_data the ops from
>>>>>>>>> the thermal sensor.
>>>>>>>>> And as far I know, this is only registered in thermal_of.c, in the
>>>>>>>>> thermal zone data
>>>>>>>>> but I can't access it directly from the virtual thermal sensor.
>>>>>>>>>
>>>>>>>>> How would you do it ?
>>>>>>>> Via the phandles when registering the virtual sensor ?
>>>>>>> As far I know, we can't get the ops or the sensor_data from the phandle
>>>>>>> of a thermal sensor.
>>>>>>> The closest solution I found so far would be to aggregate the thermal
>>>>>>> zones instead of thermal sensors.
>>>>>>> thermal_zone_device has the data needed and a thermal zone could be find
>>>>>>> easily using its name.
>>>>>> Yeah, the concept of the thermal zone and the sensor are very close.
>>>>>>
>>>>>> There is the function in thermal_core.h:
>>>>>>
>>>>>>   -> for_each_thermal_zone()
>>>>>>
>>>>>> You should be able for each 'slave' sensor, do a lookup to find the
>>>>>> corresponding thermal_zone_device_ops.
>>>>>>
>>>>>>> But, using a thermal_zone_device, I don't see how to handle module
>>>>>>> unloading.
>>>>>> I think try_module_get() / module_put() are adequate for this situation
>>>>>> as it is done on an external module and we can not rely on the exported
>>>>>> symbols.
>>>>> I don't see how it would be possible to use these functions.
>>>>> The thermal zone doesn't have the data required to use it.
>>>>
>>>> Actually I was able to crash the kernel by doing:
>>>>
>>>> console 1:
>>>>
>>>> while $(true); do insmod <module> && rmmod <module>; done
>>>>
>>>> console 2:
>>>>
>>>> while $(true); cat /sys/class/thermal/thermal_zone0/temp; done
>>>>
>>>> So there is something wrong already in the thermal framework.
>>>
>>> Hmmm.
>>>
>>>> IMO, the first thing would be to fix this critical issue by getting the
>>>> sensor module refcount when the thermal zone is enabled and dropping it
>>>> when it is disabled.
>>>>
>>>> With that fixed, perhaps it will possible to get the device associated
>>>> with the sensor and then try_module_get(dev->driver->owner)
>>>>
>>>>> Maybe a more easier way is to use the thermal_zone_device mutex.
>>>>> If I get a lock before to use the thermal_zone_device ops, I have the
>>>>> guaranty that module won't be unloaded.
>>>
>>> That would be my approach too.
>>
>> The mutex is private to the thermal core. The virtual sensor should not
>> touch it :/
>>
>> Perhaps, it can work with a private spin_lock with a try_spinlock() ?
> 
> IIUC this is a case when module A refers to some memory in module B
> and if the latter goes away, an access to that memory from the former
> is a use-after-free, so it is not sufficient to use a local spinlock.

> This can be avoided by having a lock and a flag such that the flag is
> set under the lock by module B when making the memory in question
> available and cleared under the lock when freeing that memory.  Then,
> module A needs to check the flag under the lock on every access to
> that memory.  Also, the lock and the flag must be accessible all the
> time to both modules (ie. must not go away along with any of them if
> they don't depend on each other).

Just to clarify, the idea behind the virtual sensor is the same as the
network bridge: the network interfaces are not aware they are attached
to a bridge.

The virtual sensor should attach and detach the physical sensors.

The sensors should not un|register themselves from|to the virtual
sensor, nor having specific code related to the virtual sensor. De
facto, all the existing sensors will be compatible with the virtual sensor.

Do we have a solution other than grabbing a module reference and
self-contained to the virtual sensor ?

>>>>> When a "thermal of sensor" is unloaded, it calls
>>>>> thermal_zone_of_sensor_unregister which takes a lock before
>>>>> update ops.
>>>>
>>>> I'm not sure to understand. The goal is to have the refcount on the
>>>> modules to be incremented when the virtual sensor is using them.
>>>
>>> IMO the goal is to prevent the code from crashing when modules get
>>> unloaded.  I'm not really sure if refcounts alone are sufficient for
>>> that.
>>
>> The problem is in the loop:
>>
>> +static int virtual_thermal_sensor_get_temp(void *data, int *temperature)
>> +{
>> +       struct virtual_thermal_sensor *sensor = data;
>> +       int max_temp = INT_MIN;
>> +       int temp;
>> +       int i;
>> +
>> +       for (i = 0; i < sensor->count; i++) {
>> +               struct thermal_sensor_data *hw_sensor;
>> +
>> +               hw_sensor = &sensor->sensors[i];
>> +               if (!hw_sensor->ops)
>> +                       return -ENODEV;
>> +
>> +               hw_sensor->ops->get_temp(hw_sensor->sensor_data, &temp);
>> +               max_temp = sensor->aggr_temp(max_temp, temp);
>> +       }
>> +
>> +       *temperature = max_temp;
>> +
>> +       return 0;
>> +}
>>
>> If one of the sensor is unloaded when get_temp is called,
>> hw_sensor->ops->get_temp will crash.
> 
> Right.
> 
> Dereferencing hw_sensor itself is not safe in this loop if the module
> holding the memory pointed to by it may go away.
> 
> However, presumably, the hw_sensor object needs to be registered with
> the core in order to be used here and unregistered when it goes away,
> so it looks like this loop could use a wrapper like
> thermal_get_sensor_temp(hw_sensor, &temp) which would return a
> negative error code if the sensor in question went away.
> 
> Of course, that would require the core to check the list of available
> sensors every time, but that may be implemented efficiently with the
> help of an xarray, for example.
> 
>> So the proposal is virtual_sensor_add_sensor() does try_get_module()
>> and virtual_sensor_remove_sensor() does put_module().
>>
>> The ref on the 'slave' modules will be release only if the virtual
>> sensor is unregistered.
>>
>> So until, the virtual sensor is unregistered, the 'slaves' modules can
>> not be unloaded.
>>
>> That is what we find with eg. the wifi modules.
> 
> Yes, but that's a bit cumbersome from the sysadmin perspective IMO,
> especially when one wants to unload one of the modules and doesn't
> know exactly what other modules hold references to it.

Well, sysadmin are not supposed to unload a thermal sensor and I would
say if he unloads the module, which is not an usual operation, up to him
to 'lsmod' and check what module is holding the reference.

> IIUC ref_module() might be nicer, but it is still more convenient if
> the modules can just go away independently.

"for desesperate users" :)


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/2] Add a generic virtual thermal sensor
  2021-10-06 19:51                     ` Daniel Lezcano
@ 2021-10-07 14:17                       ` Rafael J. Wysocki
  0 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2021-10-07 14:17 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Rafael J. Wysocki, Alexandre Bailon, Zhang, Rui, Amit Kucheria,
	Linux PM, Linux Kernel Mailing List, ben.tseng, Kevin Hilman,
	Matthias Kaehlcke

On Wed, Oct 6, 2021 at 9:51 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>
> On 06/10/2021 20:00, Rafael J. Wysocki wrote:
> > On Wed, Oct 6, 2021 at 6:06 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
> >>
> >> On 05/10/2021 18:45, Rafael J. Wysocki wrote:
> >>> On Mon, Oct 4, 2021 at 3:42 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
> >>>>
> >>>> On 04/10/2021 12:24, Alexandre Bailon wrote:
> >>>>>
> >>>>> On 9/22/21 10:10 AM, Daniel Lezcano wrote:
> >>>>>> On 20/09/2021 15:12, Alexandre Bailon wrote:
> >>>>>>> On 9/17/21 4:03 PM, Daniel Lezcano wrote:
> >>>>>>>> On 17/09/2021 15:33, Alexandre Bailon wrote:
> >>>>>>>>> Hi Daniel,
> >>>>>>>>>
> >>>>>>>>> On 9/17/21 2:41 PM, Daniel Lezcano wrote:
> >>>>>>>>>> On 17/09/2021 09:27, Alexandre Bailon wrote:
> >>>>>>>>>>> This series add a virtual thermal sensor.
> >>>>>>>>>>> It could be used to get a temperature using some thermal sensors.
> >>>>>>>>>>> Currently, the supported operations are max, min and avg.
> >>>>>>>>>>> The virtual sensor could be easily extended to support others
> >>>>>>>>>>> operations.
> >>>>>>>>>>>
> >>>>>>>>>>> Note:
> >>>>>>>>>>> Currently, thermal drivers must explicitly register their sensors to
> >>>>>>>>>>> make them
> >>>>>>>>>>> available to the virtual sensor.
> >>>>>>>>>>> This doesn't seem a good solution to me and I think it would be
> >>>>>>>>>>> preferable to
> >>>>>>>>>>> update the framework to register the list of each available sensors.
> >>>>>>>>>> Why must the drivers do that ?
> >>>>>>>>> Because there are no central place where thermal sensor are
> >>>>>>>>> registered.
> >>>>>>>>> The only other way I found was to update thermal_of.c,
> >>>>>>>>> to register the thermal sensors and make them available later to the
> >>>>>>>>> virtual thermal sensor.
> >>>>>>>>>
> >>>>>>>>> To work, the virtual thermal need to get the sensor_data the ops from
> >>>>>>>>> the thermal sensor.
> >>>>>>>>> And as far I know, this is only registered in thermal_of.c, in the
> >>>>>>>>> thermal zone data
> >>>>>>>>> but I can't access it directly from the virtual thermal sensor.
> >>>>>>>>>
> >>>>>>>>> How would you do it ?
> >>>>>>>> Via the phandles when registering the virtual sensor ?
> >>>>>>> As far I know, we can't get the ops or the sensor_data from the phandle
> >>>>>>> of a thermal sensor.
> >>>>>>> The closest solution I found so far would be to aggregate the thermal
> >>>>>>> zones instead of thermal sensors.
> >>>>>>> thermal_zone_device has the data needed and a thermal zone could be find
> >>>>>>> easily using its name.
> >>>>>> Yeah, the concept of the thermal zone and the sensor are very close.
> >>>>>>
> >>>>>> There is the function in thermal_core.h:
> >>>>>>
> >>>>>>   -> for_each_thermal_zone()
> >>>>>>
> >>>>>> You should be able for each 'slave' sensor, do a lookup to find the
> >>>>>> corresponding thermal_zone_device_ops.
> >>>>>>
> >>>>>>> But, using a thermal_zone_device, I don't see how to handle module
> >>>>>>> unloading.
> >>>>>> I think try_module_get() / module_put() are adequate for this situation
> >>>>>> as it is done on an external module and we can not rely on the exported
> >>>>>> symbols.
> >>>>> I don't see how it would be possible to use these functions.
> >>>>> The thermal zone doesn't have the data required to use it.
> >>>>
> >>>> Actually I was able to crash the kernel by doing:
> >>>>
> >>>> console 1:
> >>>>
> >>>> while $(true); do insmod <module> && rmmod <module>; done
> >>>>
> >>>> console 2:
> >>>>
> >>>> while $(true); cat /sys/class/thermal/thermal_zone0/temp; done
> >>>>
> >>>> So there is something wrong already in the thermal framework.
> >>>
> >>> Hmmm.
> >>>
> >>>> IMO, the first thing would be to fix this critical issue by getting the
> >>>> sensor module refcount when the thermal zone is enabled and dropping it
> >>>> when it is disabled.
> >>>>
> >>>> With that fixed, perhaps it will possible to get the device associated
> >>>> with the sensor and then try_module_get(dev->driver->owner)
> >>>>
> >>>>> Maybe a more easier way is to use the thermal_zone_device mutex.
> >>>>> If I get a lock before to use the thermal_zone_device ops, I have the
> >>>>> guaranty that module won't be unloaded.
> >>>
> >>> That would be my approach too.
> >>
> >> The mutex is private to the thermal core. The virtual sensor should not
> >> touch it :/
> >>
> >> Perhaps, it can work with a private spin_lock with a try_spinlock() ?
> >
> > IIUC this is a case when module A refers to some memory in module B
> > and if the latter goes away, an access to that memory from the former
> > is a use-after-free, so it is not sufficient to use a local spinlock.
>
> > This can be avoided by having a lock and a flag such that the flag is
> > set under the lock by module B when making the memory in question
> > available and cleared under the lock when freeing that memory.  Then,
> > module A needs to check the flag under the lock on every access to
> > that memory.  Also, the lock and the flag must be accessible all the
> > time to both modules (ie. must not go away along with any of them if
> > they don't depend on each other).
>
> Just to clarify, the idea behind the virtual sensor is the same as the
> network bridge: the network interfaces are not aware they are attached
> to a bridge.

But there's no code handling the bridge that needs to access memory
controlled by network interface driver modules.

> The virtual sensor should attach and detach the physical sensors.
>
> The sensors should not un|register themselves from|to the virtual
> sensor, nor having specific code related to the virtual sensor. De
> facto, all the existing sensors will be compatible with the virtual sensor.

If I understand this patch series correctly, somebody is expected to
call thermal_virtual_sensor_register() (or the devm_ variant of it) in
order to add a sensor to the thermal_sensors list (which appears to be
racy given the lack of locking or any kind of synchronization
whatsoever).

This appears to assume that all of the sensors will be added before
running virtual_thermal_sensor_probe() which effectively is the only
reader of the thermal_sensors list and runs as a ->probe() callbacks
of virtual_thermal_sensor (there is a name clash between this and a
struct data type which isn't useful) which gets registered as a
platform driver when the module loads, as per
module_platform_driver().

So whoever called thermal_virtual_sensor_register() to register a
sensor, will be required to call thermal_virtual_sensor_unregister()
or devm_thermal_virtual_sensor_release() will be called for them when
they are going away (again, no locking there).  That should also
reconfigure all of the virtual sensors using the given sensor going
away and which should take care of the problem we are discussing,
shouldn't it?

Let me respond to the patches, though.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/2] thermal: add a virtual sensor to aggregate temperatures
  2021-09-17  7:27 ` [PATCH v2 2/2] thermal: add a virtual sensor to aggregate temperatures Alexandre Bailon
@ 2021-10-07 16:38   ` Rafael J. Wysocki
  2021-10-07 17:33     ` Rafael J. Wysocki
                       ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2021-10-07 16:38 UTC (permalink / raw)
  To: Alexandre Bailon
  Cc: Zhang, Rui, Daniel Lezcano, Amit Kucheria, Linux PM,
	Linux Kernel Mailing List, ben.tseng, Kevin Hilman,
	Matthias Kaehlcke, kernel test robot

On Fri, Sep 17, 2021 at 9:25 AM Alexandre Bailon <abailon@baylibre.com> wrote:
>
> This adds a virtual thermal sensor that reads temperature from

This should be "virtual sensor driver" I suppose.

> hardware sensor and return an aggregated temperature.

"returns"

Also wrapping it around a single hardware sensor makes a little sense
AFAICS, so I guess you mean "multiple hardware sensors" here.

> Currently, this supports three operations:
> the minimum, maximum and average temperature.

They are aggregation functions rather than "operations"

>
> Reported-by: kernel test robot <lkp@intel.com>

This is new code, so what does it fix?

> Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
> ---
>  drivers/thermal/Kconfig                  |   8 +
>  drivers/thermal/Makefile                 |   1 +
>  drivers/thermal/virtual-thermal-sensor.h |  54 ++++
>  drivers/thermal/virtual_thermal_sensor.c | 350 +++++++++++++++++++++++
>  4 files changed, 413 insertions(+)
>  create mode 100644 drivers/thermal/virtual-thermal-sensor.h
>  create mode 100644 drivers/thermal/virtual_thermal_sensor.c
>
> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> index d7f44deab5b1d..20bc93c48f5b1 100644
> --- a/drivers/thermal/Kconfig
> +++ b/drivers/thermal/Kconfig
> @@ -228,6 +228,14 @@ config THERMAL_MMIO
>           register or shared memory, is a potential candidate to work with this
>           driver.
>
> +config VIRTUAL_THERMAL
> +       tristate "Virtual thermal sensor driver"

This should be "DT-based virtual thermal sensor driver" I think.

> +       depends on THERMAL_OF || COMPILE_TEST
> +       help
> +         This option enables the generic thermal sensor aggregator.

"generic DT-based"

> +         This driver creates a thermal sensor that reads the hardware sensors

s/the/multiple/

> +         and aggregate the temperature.

"aggregates their output"

> +
>  config HISI_THERMAL
>         tristate "Hisilicon thermal driver"
>         depends on ARCH_HISI || COMPILE_TEST
> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
> index 82fc3e616e54b..8bf55973059c5 100644
> --- a/drivers/thermal/Makefile
> +++ b/drivers/thermal/Makefile
> @@ -60,3 +60,4 @@ obj-$(CONFIG_UNIPHIER_THERMAL)        += uniphier_thermal.o
>  obj-$(CONFIG_AMLOGIC_THERMAL)     += amlogic_thermal.o
>  obj-$(CONFIG_SPRD_THERMAL)     += sprd_thermal.o
>  obj-$(CONFIG_KHADAS_MCU_FAN_THERMAL)   += khadas_mcu_fan.o
> +obj-$(CONFIG_VIRTUAL_THERMAL) += virtual_thermal_sensor.o
> diff --git a/drivers/thermal/virtual-thermal-sensor.h b/drivers/thermal/virtual-thermal-sensor.h
> new file mode 100644
> index 0000000000000..3bbf7c324dddc
> --- /dev/null
> +++ b/drivers/thermal/virtual-thermal-sensor.h
> @@ -0,0 +1,54 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2021 BayLibre
> + */
> +
> +#ifndef __THERMAL_VIRTUAL_SENSOR_H__
> +#define __THERMAL_VIRTUAL_SENSOR_H__
> +
> +#include <linux/device.h>
> +#include <linux/thermal.h>
> +
> +struct virtual_thermal_sensor;
> +struct thermal_sensor_data;
> +
> +#if IS_ENABLED(CONFIG_VIRTUAL_THERMAL)
> +struct thermal_sensor_data *
> +thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
> +                               const struct thermal_zone_of_device_ops *ops);
> +void thermal_virtual_sensor_unregister(struct device *dev,
> +                                      struct thermal_sensor_data *sensor_data);
> +struct thermal_sensor_data *
> +devm_thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
> +                                    const struct thermal_zone_of_device_ops *ops);
> +
> +void devm_thermal_virtual_sensor_unregister(struct device *dev,
> +                                           struct virtual_thermal_sensor *sensor);
> +#else
> +static inline struct thermal_sensor_data *
> +thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
> +                               const struct thermal_zone_of_device_ops *ops)
> +{
> +       return ERR_PTR(-ENODEV);
> +}
> +
> +void thermal_virtual_sensor_unregister(struct device *dev,
> +                                      struct thermal_sensor_data *sensor_data)
> +{
> +}
> +
> +static inline struct thermal_sensor_data *
> +devm_thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
> +                                    const struct thermal_zone_of_device_ops *ops)
> +{
> +       return ERR_PTR(-ENODEV);
> +}
> +
> +static inline
> +void devm_thermal_virtual_sensor_unregister(struct device *dev,
> +                                           struct virtual_thermal_sensor *sensor)
> +{
> +}
> +#endif
> +
> +#endif /* __THERMAL_VIRTUAL_SENSOR_H__ */
> diff --git a/drivers/thermal/virtual_thermal_sensor.c b/drivers/thermal/virtual_thermal_sensor.c
> new file mode 100644
> index 0000000000000..234563af6643e
> --- /dev/null
> +++ b/drivers/thermal/virtual_thermal_sensor.c
> @@ -0,0 +1,350 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2021 BayLibre

Please add information on what's in this file to the preamble.

> + */
> +
> +#include <linux/err.h>
> +#include <linux/export.h>
> +#include <linux/module.h>
> +#include <linux/of_device.h>
> +#include <linux/of_platform.h>
> +#include <linux/slab.h>
> +#include <linux/thermal.h>
> +#include <linux/types.h>
> +#include <linux/string.h>
> +
> +#include <dt-bindings/thermal/virtual-sensor.h>
> +
> +#include "virtual-thermal-sensor.h"
> +
> +struct thermal_sensor_data {
> +       struct list_head node;
> +
> +       /* sensor interface */
> +       int id;
> +       void *sensor_data;
> +       const struct thermal_zone_of_device_ops *ops;
> +};
> +
> +struct virtual_thermal_sensor {
> +       int count;
> +       struct thermal_sensor_data *sensors;
> +       struct thermal_zone_device *tzd;
> +       int (*aggr_temp)(int temp1, int temp2);
> +
> +       struct list_head node;
> +};
> +
> +static LIST_HEAD(thermal_sensors);
> +static LIST_HEAD(virtual_sensors);

These lists don't seem to be protected against concurrent access and
they should be.

> +
> +static int max_temp(int temp1, int temp2)
> +{
> +       return max(temp1, temp2);
> +}
> +
> +static int min_temp(int temp1, int temp2)
> +{
> +       return min(temp1, temp2);
> +}
> +
> +static int avg_temp(int temp1, int temp2)
> +{
> +       return ((temp1) / 2) + ((temp2) / 2) + (((temp1) % 2 + (temp2) % 2) / 2);

There are a few redundant parens here.

And why not

return (temp1 + temp2) / 2;

> +}
> +
> +static int virtual_thermal_sensor_get_temp(void *data, int *temperature)
> +{
> +       struct virtual_thermal_sensor *sensor = data;
> +       int max_temp = INT_MIN;
> +       int temp;
> +       int i;
> +

Some synchronization (eg. locking) should be added to this.  It is racy as is.

> +       for (i = 0; i < sensor->count; i++) {
> +               struct thermal_sensor_data *hw_sensor;
> +
> +               hw_sensor = &sensor->sensors[i];
> +               if (!hw_sensor->ops)
> +                       return -ENODEV;
> +
> +               hw_sensor->ops->get_temp(hw_sensor->sensor_data, &temp);
> +               max_temp = sensor->aggr_temp(max_temp, temp);
> +       }
> +
> +       *temperature = max_temp;
> +
> +       return 0;
> +}
> +
> +static const struct thermal_zone_of_device_ops virtual_thermal_sensor_ops = {
> +       .get_temp = virtual_thermal_sensor_get_temp,
> +};
> +
> +static int virtual_sensor_add_sensor(struct virtual_thermal_sensor *sensor,
> +                                    struct of_phandle_args args,
> +                                    int index)
> +{
> +       struct thermal_sensor_data *sensor_data;
> +       int id;
> +
> +       list_for_each_entry(sensor_data, &thermal_sensors, node) {
> +               id = args.args_count ? args.args[0] : 0;
> +               if (sensor_data->id == id) {
> +                       memcpy(&sensor->sensors[index], sensor_data,
> +                               sizeof(*sensor_data));

The data object being copied includes pointers, so I wouldn't copy
them like this.  Especially the ops one annotated as const.

> +                       return 0;
> +               }
> +       }
> +
> +       return -ENODEV;
> +}
> +
> +static int virtual_thermal_sensor_probe(struct platform_device *pdev)
> +{
> +       struct virtual_thermal_sensor *sensor;
> +       struct device *dev = &pdev->dev;
> +       struct of_phandle_args args;
> +       u32 type;
> +       int ret;
> +       int i;
> +
> +       sensor = devm_kzalloc(dev, sizeof(*sensor), GFP_KERNEL);
> +       if (!sensor)
> +               return -ENOMEM;
> +
> +       sensor->count = of_count_phandle_with_args(dev->of_node,
> +                                                  "thermal-sensors",
> +                                                  "#thermal-sensor-cells");
> +       if (sensor->count <= 0)
> +               return -EINVAL;
> +
> +       sensor->sensors = devm_kmalloc_array(dev, sensor->count,
> +                                            sizeof(*sensor->sensors),
> +                                            GFP_KERNEL);
> +       if (!sensor->sensors)
> +               return -ENOMEM;
> +
> +       for (i = 0; i < sensor->count; i++) {
> +               ret = of_parse_phandle_with_args(dev->of_node,
> +                                                "thermal-sensors",
> +                                                "#thermal-sensor-cells",
> +                                                i,
> +                                                &args);
> +               if (ret)
> +                       return ret;
> +
> +               ret = virtual_sensor_add_sensor(sensor, args, i);
> +               if (ret)
> +                       return ret;
> +       }
> +
> +       ret = of_property_read_u32(dev->of_node, "aggregation-function", &type);
> +       if (ret)
> +               return ret;
> +
> +       switch (type) {
> +       case VIRTUAL_THERMAL_SENSOR_MAX:
> +               sensor->aggr_temp = max_temp;
> +               break;
> +       case VIRTUAL_THERMAL_SENSOR_MIN:
> +               sensor->aggr_temp = min_temp;
> +               break;
> +       case VIRTUAL_THERMAL_SENSOR_AVG:
> +               sensor->aggr_temp = avg_temp;
> +               break;
> +       default:
> +               return -EINVAL;
> +       }
> +
> +       sensor->tzd = devm_thermal_zone_of_sensor_register(dev, 0, sensor,
> +                                                          &virtual_thermal_sensor_ops);
> +       if (IS_ERR(sensor->tzd))
> +               return PTR_ERR(sensor->tzd);
> +
> +       platform_set_drvdata(pdev, sensor);
> +       list_add(&sensor->node, &virtual_sensors);
> +
> +       return 0;
> +}
> +
> +static int virtual_thermal_sensor_remove(struct platform_device *pdev)
> +{
> +       struct virtual_thermal_sensor *sensor;
> +
> +       sensor = platform_get_drvdata(pdev);
> +       list_del(&sensor->node);
> +
> +       return 0;
> +}
> +
> +static const struct of_device_id virtual_thermal_sensor_of_match[] = {
> +       {
> +               .compatible = "virtual,thermal-sensor",
> +       },
> +       {
> +       },
> +};
> +MODULE_DEVICE_TABLE(of, virtual_thermal_sensor_of_match);
> +
> +static struct platform_driver virtual_thermal_sensor = {
> +       .probe = virtual_thermal_sensor_probe,
> +       .remove = virtual_thermal_sensor_remove,
> +       .driver = {
> +               .name = "virtual-thermal-sensor",
> +               .of_match_table = virtual_thermal_sensor_of_match,
> +       },
> +};
> +
> +/**
> + * thermal_virtual_sensor_register - registers a sensor that could be a virtual
> + * sensor
> + * @dev: a valid struct device pointer of a sensor device. Must contain
> + *       a valid .of_node, for the sensor node.
> + * @sensor_id: a sensor identifier, in case the sensor IP has more
> + *             than one sensor
> + * @data: a private pointer (owned by the caller) that will be passed
> + *        back, when a temperature reading is needed.
> + * @ops: struct thermal_zone_of_device_ops *. Must contain at least .get_temp.
> + *
> + * This function will register a thermal sensor to make it available for later
> + * usage by a virtual sensor.

So who's expected to be calling it?

> + *
> + * The thermal zone temperature is provided by the @get_temp function
> + * pointer. When called, it will have the private pointer @data back.

It is unclear what you mean here.  I think you are describing a return
mechanism, but as is it is hard to follow.

> + *
> + * Return: On success returns a valid struct thermal_zone_device,

That doesn't seem to be the case.  It returns an address returned by
devm_kzalloc() and the memory at that address is expected to be a
thermal_sensor_data object.

> + * otherwise, it returns a corresponding ERR_PTR(). Caller must
> + * check the return value with help of IS_ERR() helper.
> + */
> +struct thermal_sensor_data *thermal_virtual_sensor_register(
> +       struct device *dev, int sensor_id, void *data,
> +       const struct thermal_zone_of_device_ops *ops)
> +{
> +       struct thermal_sensor_data *sensor_data;
> +
> +       sensor_data = devm_kzalloc(dev, sizeof(*sensor_data), GFP_KERNEL);
> +       if (!sensor_data)
> +               return ERR_PTR(-ENOMEM);
> +
> +       sensor_data->id = sensor_id;
> +       sensor_data->sensor_data = data;
> +       sensor_data->ops = ops;
> +
> +       list_add(&sensor_data->node, &thermal_sensors);
> +
> +       return sensor_data;

Overall, this is non-modular code squeezed into a module.

If this module goes away and whoever called it wants to unregister the
sensor, they will crash, so to a minimum it should do a
try_module_get() on the module containing it.

> +}
> +EXPORT_SYMBOL_GPL(thermal_virtual_sensor_register);
> +
> +/**
> + * thermal_virtual_sensor_unregister - unregisters a sensor
> + * @dev: a valid struct device pointer of a sensor device.
> + * @sensor_data: a pointer to struct thermal_sensor_data to unregister.
> + *
> + * This function removes the sensor from the list of available thermal sensors.
> + * If the sensor is in use, then the next call to .get_temp will return -ENODEV.
> + */
> +void thermal_virtual_sensor_unregister(struct device *dev,
> +                                      struct thermal_sensor_data *sensor_data)
> +{
> +       struct thermal_sensor_data *sd;
> +       struct virtual_thermal_sensor *sensor;
> +       int i;
> +

All of the code in this function is obviously racy.

> +       list_del(&sensor_data->node);
> +
> +       list_for_each_entry(sensor, &virtual_sensors, node) {
> +               for (i = 0; i < sensor->count; i++) {
> +                       sd = &sensor->sensors[i];
> +                       if (sd->id == sensor_data->id &&
> +                               sd->sensor_data == sensor_data->sensor_data) {
> +                               sd->ops = NULL;
> +                       }

The inner braces are not necessary.

> +               }
> +       }

And again, this code requires the module to be protected from
unloading until the last instance of
thermal_virtual_sensor_unregister() has returned.

> +}
> +EXPORT_SYMBOL_GPL(thermal_virtual_sensor_unregister);
> +
> +static void devm_thermal_virtual_sensor_release(struct device *dev, void *res)
> +{
> +       thermal_virtual_sensor_unregister(dev,
> +                                         *(struct thermal_sensor_data **)res);
> +}
> +
> +static int devm_thermal_virtual_sensor_match(struct device *dev, void *res,
> +                                            void *data)
> +{
> +       struct thermal_sensor_data **r = res;
> +
> +       if (WARN_ON(!r || !*r))
> +               return 0;
> +
> +       return *r == data;
> +}
> +
> +/**
> + * devm_thermal_virtual_sensor_register - Resource managed version of
> + *                             thermal_virtual_sensor_register()
> + * @dev: a valid struct device pointer of a sensor device. Must contain
> + *       a valid .of_node, for the sensor node.
> + * @sensor_id: a sensor identifier, in case the sensor IP has more
> + *            than one sensor
> + * @data: a private pointer (owned by the caller) that will be passed
> + *       back, when a temperature reading is needed.
> + * @ops: struct thermal_zone_of_device_ops *. Must contain at least .get_temp.
> + *
> + * Refer thermal_zone_of_sensor_register() for more details.
> + *
> + * Return: On success returns a valid struct virtual_sensor_data,
> + * otherwise, it returns a corresponding ERR_PTR(). Caller must
> + * check the return value with help of IS_ERR() helper.
> + * Registered virtual_sensor_data device will automatically be
> + * released when device is unbound.
> + */
> +struct thermal_sensor_data *devm_thermal_virtual_sensor_register(
> +       struct device *dev, int sensor_id,
> +       void *data, const struct thermal_zone_of_device_ops *ops)
> +{
> +       struct thermal_sensor_data **ptr, *sensor_data;
> +
> +       ptr = devres_alloc(devm_thermal_virtual_sensor_release, sizeof(*ptr),
> +                          GFP_KERNEL);
> +       if (!ptr)
> +               return ERR_PTR(-ENOMEM);
> +
> +       sensor_data = thermal_virtual_sensor_register(dev, sensor_id, data, ops);
> +       if (IS_ERR(sensor_data)) {
> +               devres_free(ptr);
> +               return sensor_data;
> +       }
> +
> +       *ptr = sensor_data;
> +       devres_add(dev, ptr);
> +
> +       return sensor_data;
> +}
> +EXPORT_SYMBOL_GPL(devm_thermal_virtual_sensor_register);
> +
> +/**
> + * devm_thermal_virtual_sensor_unregister - Resource managed version of
> + *                             thermal_virtual_sensor_unregister().
> + * @dev: Device for which resource was allocated.
> + * @sensor: a pointer to struct thermal_zone_device where the sensor is registered.
> + *
> + * This function removes the sensor from the list of sensors registered with
> + * devm_thermal_virtual_sensor_register() API.
> + * Normally this function will not need to be called and the resource
> + * management code will ensure that the resource is freed.
> + */
> +void devm_thermal_virtual_sensor_unregister(struct device *dev,
> +                                           struct virtual_thermal_sensor *sensor)
> +{
> +       WARN_ON(devres_release(dev, devm_thermal_virtual_sensor_release,
> +                              devm_thermal_virtual_sensor_match, sensor));
> +}
> +EXPORT_SYMBOL_GPL(devm_thermal_virtual_sensor_unregister);
> +
> +module_platform_driver(virtual_thermal_sensor);
> +MODULE_AUTHOR("Alexandre Bailon <abailon@baylibre.com>");
> +MODULE_DESCRIPTION("Virtual thermal sensor");
> +MODULE_LICENSE("GPL v2");
> --

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/2] thermal: add a virtual sensor to aggregate temperatures
  2021-10-07 16:38   ` Rafael J. Wysocki
@ 2021-10-07 17:33     ` Rafael J. Wysocki
  2021-10-09 14:46     ` Rafael J. Wysocki
  2021-10-25  9:18     ` Alexandre Bailon
  2 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2021-10-07 17:33 UTC (permalink / raw)
  To: Alexandre Bailon
  Cc: Zhang, Rui, Daniel Lezcano, Amit Kucheria, Linux PM,
	Linux Kernel Mailing List, ben.tseng, Kevin Hilman,
	Matthias Kaehlcke, kernel test robot

On Thu, Oct 7, 2021 at 6:38 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Fri, Sep 17, 2021 at 9:25 AM Alexandre Bailon <abailon@baylibre.com> wrote:
> >
> > This adds a virtual thermal sensor that reads temperature from
>
> This should be "virtual sensor driver" I suppose.
>
> > hardware sensor and return an aggregated temperature.
>
> "returns"
>
> Also wrapping it around a single hardware sensor makes a little sense
> AFAICS, so I guess you mean "multiple hardware sensors" here.
>
> > Currently, this supports three operations:
> > the minimum, maximum and average temperature.
>
> They are aggregation functions rather than "operations"
>
> >
> > Reported-by: kernel test robot <lkp@intel.com>
>
> This is new code, so what does it fix?
>
> > Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
> > ---
> >  drivers/thermal/Kconfig                  |   8 +
> >  drivers/thermal/Makefile                 |   1 +
> >  drivers/thermal/virtual-thermal-sensor.h |  54 ++++
> >  drivers/thermal/virtual_thermal_sensor.c | 350 +++++++++++++++++++++++
> >  4 files changed, 413 insertions(+)
> >  create mode 100644 drivers/thermal/virtual-thermal-sensor.h
> >  create mode 100644 drivers/thermal/virtual_thermal_sensor.c
> >
> > diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> > index d7f44deab5b1d..20bc93c48f5b1 100644
> > --- a/drivers/thermal/Kconfig
> > +++ b/drivers/thermal/Kconfig
> > @@ -228,6 +228,14 @@ config THERMAL_MMIO
> >           register or shared memory, is a potential candidate to work with this
> >           driver.
> >
> > +config VIRTUAL_THERMAL
> > +       tristate "Virtual thermal sensor driver"
>
> This should be "DT-based virtual thermal sensor driver" I think.
>
> > +       depends on THERMAL_OF || COMPILE_TEST
> > +       help
> > +         This option enables the generic thermal sensor aggregator.
>
> "generic DT-based"
>
> > +         This driver creates a thermal sensor that reads the hardware sensors
>
> s/the/multiple/
>
> > +         and aggregate the temperature.
>
> "aggregates their output"
>
> > +
> >  config HISI_THERMAL
> >         tristate "Hisilicon thermal driver"
> >         depends on ARCH_HISI || COMPILE_TEST
> > diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
> > index 82fc3e616e54b..8bf55973059c5 100644
> > --- a/drivers/thermal/Makefile
> > +++ b/drivers/thermal/Makefile
> > @@ -60,3 +60,4 @@ obj-$(CONFIG_UNIPHIER_THERMAL)        += uniphier_thermal.o
> >  obj-$(CONFIG_AMLOGIC_THERMAL)     += amlogic_thermal.o
> >  obj-$(CONFIG_SPRD_THERMAL)     += sprd_thermal.o
> >  obj-$(CONFIG_KHADAS_MCU_FAN_THERMAL)   += khadas_mcu_fan.o
> > +obj-$(CONFIG_VIRTUAL_THERMAL) += virtual_thermal_sensor.o
> > diff --git a/drivers/thermal/virtual-thermal-sensor.h b/drivers/thermal/virtual-thermal-sensor.h
> > new file mode 100644
> > index 0000000000000..3bbf7c324dddc
> > --- /dev/null
> > +++ b/drivers/thermal/virtual-thermal-sensor.h
> > @@ -0,0 +1,54 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * Copyright (c) 2021 BayLibre
> > + */
> > +
> > +#ifndef __THERMAL_VIRTUAL_SENSOR_H__
> > +#define __THERMAL_VIRTUAL_SENSOR_H__
> > +
> > +#include <linux/device.h>
> > +#include <linux/thermal.h>
> > +
> > +struct virtual_thermal_sensor;
> > +struct thermal_sensor_data;
> > +
> > +#if IS_ENABLED(CONFIG_VIRTUAL_THERMAL)
> > +struct thermal_sensor_data *
> > +thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
> > +                               const struct thermal_zone_of_device_ops *ops);
> > +void thermal_virtual_sensor_unregister(struct device *dev,
> > +                                      struct thermal_sensor_data *sensor_data);
> > +struct thermal_sensor_data *
> > +devm_thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
> > +                                    const struct thermal_zone_of_device_ops *ops);
> > +
> > +void devm_thermal_virtual_sensor_unregister(struct device *dev,
> > +                                           struct virtual_thermal_sensor *sensor);
> > +#else
> > +static inline struct thermal_sensor_data *
> > +thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
> > +                               const struct thermal_zone_of_device_ops *ops)
> > +{
> > +       return ERR_PTR(-ENODEV);
> > +}
> > +
> > +void thermal_virtual_sensor_unregister(struct device *dev,
> > +                                      struct thermal_sensor_data *sensor_data)
> > +{
> > +}
> > +
> > +static inline struct thermal_sensor_data *
> > +devm_thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
> > +                                    const struct thermal_zone_of_device_ops *ops)
> > +{
> > +       return ERR_PTR(-ENODEV);
> > +}
> > +
> > +static inline
> > +void devm_thermal_virtual_sensor_unregister(struct device *dev,
> > +                                           struct virtual_thermal_sensor *sensor)
> > +{
> > +}
> > +#endif
> > +
> > +#endif /* __THERMAL_VIRTUAL_SENSOR_H__ */
> > diff --git a/drivers/thermal/virtual_thermal_sensor.c b/drivers/thermal/virtual_thermal_sensor.c
> > new file mode 100644
> > index 0000000000000..234563af6643e
> > --- /dev/null
> > +++ b/drivers/thermal/virtual_thermal_sensor.c
> > @@ -0,0 +1,350 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (c) 2021 BayLibre
>
> Please add information on what's in this file to the preamble.
>
> > + */
> > +
> > +#include <linux/err.h>
> > +#include <linux/export.h>
> > +#include <linux/module.h>
> > +#include <linux/of_device.h>
> > +#include <linux/of_platform.h>
> > +#include <linux/slab.h>
> > +#include <linux/thermal.h>
> > +#include <linux/types.h>
> > +#include <linux/string.h>
> > +
> > +#include <dt-bindings/thermal/virtual-sensor.h>
> > +
> > +#include "virtual-thermal-sensor.h"
> > +
> > +struct thermal_sensor_data {
> > +       struct list_head node;
> > +
> > +       /* sensor interface */
> > +       int id;
> > +       void *sensor_data;
> > +       const struct thermal_zone_of_device_ops *ops;
> > +};
> > +
> > +struct virtual_thermal_sensor {
> > +       int count;
> > +       struct thermal_sensor_data *sensors;
> > +       struct thermal_zone_device *tzd;
> > +       int (*aggr_temp)(int temp1, int temp2);
> > +
> > +       struct list_head node;
> > +};
> > +
> > +static LIST_HEAD(thermal_sensors);
> > +static LIST_HEAD(virtual_sensors);
>
> These lists don't seem to be protected against concurrent access and
> they should be.
>
> > +
> > +static int max_temp(int temp1, int temp2)
> > +{
> > +       return max(temp1, temp2);
> > +}
> > +
> > +static int min_temp(int temp1, int temp2)
> > +{
> > +       return min(temp1, temp2);
> > +}
> > +
> > +static int avg_temp(int temp1, int temp2)
> > +{
> > +       return ((temp1) / 2) + ((temp2) / 2) + (((temp1) % 2 + (temp2) % 2) / 2);
>
> There are a few redundant parens here.
>
> And why not
>
> return (temp1 + temp2) / 2;
>
> > +}
> > +
> > +static int virtual_thermal_sensor_get_temp(void *data, int *temperature)
> > +{
> > +       struct virtual_thermal_sensor *sensor = data;
> > +       int max_temp = INT_MIN;
> > +       int temp;
> > +       int i;
> > +
>
> Some synchronization (eg. locking) should be added to this.  It is racy as is.
>
> > +       for (i = 0; i < sensor->count; i++) {
> > +               struct thermal_sensor_data *hw_sensor;
> > +
> > +               hw_sensor = &sensor->sensors[i];
> > +               if (!hw_sensor->ops)
> > +                       return -ENODEV;
> > +
> > +               hw_sensor->ops->get_temp(hw_sensor->sensor_data, &temp);
> > +               max_temp = sensor->aggr_temp(max_temp, temp);
> > +       }
> > +
> > +       *temperature = max_temp;
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct thermal_zone_of_device_ops virtual_thermal_sensor_ops = {
> > +       .get_temp = virtual_thermal_sensor_get_temp,
> > +};
> > +
> > +static int virtual_sensor_add_sensor(struct virtual_thermal_sensor *sensor,
> > +                                    struct of_phandle_args args,
> > +                                    int index)
> > +{
> > +       struct thermal_sensor_data *sensor_data;
> > +       int id;
> > +
> > +       list_for_each_entry(sensor_data, &thermal_sensors, node) {
> > +               id = args.args_count ? args.args[0] : 0;
> > +               if (sensor_data->id == id) {
> > +                       memcpy(&sensor->sensors[index], sensor_data,
> > +                               sizeof(*sensor_data));
>
> The data object being copied includes pointers, so I wouldn't copy
> them like this.  Especially the ops one annotated as const.
>
> > +                       return 0;
> > +               }
> > +       }
> > +
> > +       return -ENODEV;
> > +}
> > +
> > +static int virtual_thermal_sensor_probe(struct platform_device *pdev)
> > +{
> > +       struct virtual_thermal_sensor *sensor;
> > +       struct device *dev = &pdev->dev;
> > +       struct of_phandle_args args;
> > +       u32 type;
> > +       int ret;
> > +       int i;
> > +
> > +       sensor = devm_kzalloc(dev, sizeof(*sensor), GFP_KERNEL);
> > +       if (!sensor)
> > +               return -ENOMEM;
> > +
> > +       sensor->count = of_count_phandle_with_args(dev->of_node,
> > +                                                  "thermal-sensors",
> > +                                                  "#thermal-sensor-cells");
> > +       if (sensor->count <= 0)
> > +               return -EINVAL;
> > +
> > +       sensor->sensors = devm_kmalloc_array(dev, sensor->count,
> > +                                            sizeof(*sensor->sensors),
> > +                                            GFP_KERNEL);
> > +       if (!sensor->sensors)
> > +               return -ENOMEM;
> > +
> > +       for (i = 0; i < sensor->count; i++) {
> > +               ret = of_parse_phandle_with_args(dev->of_node,
> > +                                                "thermal-sensors",
> > +                                                "#thermal-sensor-cells",
> > +                                                i,
> > +                                                &args);
> > +               if (ret)
> > +                       return ret;
> > +
> > +               ret = virtual_sensor_add_sensor(sensor, args, i);
> > +               if (ret)
> > +                       return ret;
> > +       }
> > +
> > +       ret = of_property_read_u32(dev->of_node, "aggregation-function", &type);
> > +       if (ret)
> > +               return ret;
> > +
> > +       switch (type) {
> > +       case VIRTUAL_THERMAL_SENSOR_MAX:
> > +               sensor->aggr_temp = max_temp;
> > +               break;
> > +       case VIRTUAL_THERMAL_SENSOR_MIN:
> > +               sensor->aggr_temp = min_temp;
> > +               break;
> > +       case VIRTUAL_THERMAL_SENSOR_AVG:
> > +               sensor->aggr_temp = avg_temp;
> > +               break;
> > +       default:
> > +               return -EINVAL;
> > +       }
> > +
> > +       sensor->tzd = devm_thermal_zone_of_sensor_register(dev, 0, sensor,
> > +                                                          &virtual_thermal_sensor_ops);
> > +       if (IS_ERR(sensor->tzd))
> > +               return PTR_ERR(sensor->tzd);
> > +
> > +       platform_set_drvdata(pdev, sensor);
> > +       list_add(&sensor->node, &virtual_sensors);
> > +
> > +       return 0;
> > +}
> > +
> > +static int virtual_thermal_sensor_remove(struct platform_device *pdev)
> > +{
> > +       struct virtual_thermal_sensor *sensor;
> > +
> > +       sensor = platform_get_drvdata(pdev);
> > +       list_del(&sensor->node);
> > +
> > +       return 0;
> > +}
> > +
> > +static const struct of_device_id virtual_thermal_sensor_of_match[] = {
> > +       {
> > +               .compatible = "virtual,thermal-sensor",
> > +       },
> > +       {
> > +       },
> > +};
> > +MODULE_DEVICE_TABLE(of, virtual_thermal_sensor_of_match);
> > +
> > +static struct platform_driver virtual_thermal_sensor = {
> > +       .probe = virtual_thermal_sensor_probe,
> > +       .remove = virtual_thermal_sensor_remove,
> > +       .driver = {
> > +               .name = "virtual-thermal-sensor",
> > +               .of_match_table = virtual_thermal_sensor_of_match,
> > +       },
> > +};
> > +
> > +/**
> > + * thermal_virtual_sensor_register - registers a sensor that could be a virtual
> > + * sensor
> > + * @dev: a valid struct device pointer of a sensor device. Must contain
> > + *       a valid .of_node, for the sensor node.
> > + * @sensor_id: a sensor identifier, in case the sensor IP has more
> > + *             than one sensor
> > + * @data: a private pointer (owned by the caller) that will be passed
> > + *        back, when a temperature reading is needed.
> > + * @ops: struct thermal_zone_of_device_ops *. Must contain at least .get_temp.
> > + *
> > + * This function will register a thermal sensor to make it available for later
> > + * usage by a virtual sensor.
>
> So who's expected to be calling it?
>
> > + *
> > + * The thermal zone temperature is provided by the @get_temp function
> > + * pointer. When called, it will have the private pointer @data back.
>
> It is unclear what you mean here.  I think you are describing a return
> mechanism, but as is it is hard to follow.
>
> > + *
> > + * Return: On success returns a valid struct thermal_zone_device,
>
> That doesn't seem to be the case.  It returns an address returned by
> devm_kzalloc() and the memory at that address is expected to be a
> thermal_sensor_data object.
>
> > + * otherwise, it returns a corresponding ERR_PTR(). Caller must
> > + * check the return value with help of IS_ERR() helper.
> > + */
> > +struct thermal_sensor_data *thermal_virtual_sensor_register(
> > +       struct device *dev, int sensor_id, void *data,
> > +       const struct thermal_zone_of_device_ops *ops)
> > +{
> > +       struct thermal_sensor_data *sensor_data;
> > +
> > +       sensor_data = devm_kzalloc(dev, sizeof(*sensor_data), GFP_KERNEL);
> > +       if (!sensor_data)
> > +               return ERR_PTR(-ENOMEM);
> > +
> > +       sensor_data->id = sensor_id;
> > +       sensor_data->sensor_data = data;
> > +       sensor_data->ops = ops;
> > +
> > +       list_add(&sensor_data->node, &thermal_sensors);
> > +
> > +       return sensor_data;
>
> Overall, this is non-modular code squeezed into a module.
>
> If this module goes away and whoever called it wants to unregister the
> sensor, they will crash, so to a minimum it should do a
> try_module_get() on the module containing it.

And that wouldn't be sufficient either, because in theory the module
may go away before that try_module_get() gets called.  Sorry for the
confusion.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/2] thermal: add a virtual sensor to aggregate temperatures
  2021-10-07 16:38   ` Rafael J. Wysocki
  2021-10-07 17:33     ` Rafael J. Wysocki
@ 2021-10-09 14:46     ` Rafael J. Wysocki
  2021-10-25  9:18     ` Alexandre Bailon
  2 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2021-10-09 14:46 UTC (permalink / raw)
  To: Alexandre Bailon
  Cc: Zhang, Rui, Daniel Lezcano, Amit Kucheria, Linux PM,
	Linux Kernel Mailing List, ben.tseng, Kevin Hilman,
	Matthias Kaehlcke, kernel test robot

On Thu, Oct 7, 2021 at 6:38 PM Rafael J. Wysocki <rafael@kernel.org> wrote:

[cut]

> > +static int virtual_thermal_sensor_probe(struct platform_device *pdev)
> > +{
> > +       struct virtual_thermal_sensor *sensor;
> > +       struct device *dev = &pdev->dev;
> > +       struct of_phandle_args args;
> > +       u32 type;
> > +       int ret;
> > +       int i;
> > +
> > +       sensor = devm_kzalloc(dev, sizeof(*sensor), GFP_KERNEL);
> > +       if (!sensor)
> > +               return -ENOMEM;
> > +
> > +       sensor->count = of_count_phandle_with_args(dev->of_node,
> > +                                                  "thermal-sensors",
> > +                                                  "#thermal-sensor-cells");
> > +       if (sensor->count <= 0)
> > +               return -EINVAL;
> > +
> > +       sensor->sensors = devm_kmalloc_array(dev, sensor->count,
> > +                                            sizeof(*sensor->sensors),
> > +                                            GFP_KERNEL);
> > +       if (!sensor->sensors)
> > +               return -ENOMEM;
> > +
> > +       for (i = 0; i < sensor->count; i++) {
> > +               ret = of_parse_phandle_with_args(dev->of_node,
> > +                                                "thermal-sensors",
> > +                                                "#thermal-sensor-cells",
> > +                                                i,
> > +                                                &args);
> > +               if (ret)
> > +                       return ret;
> > +
> > +               ret = virtual_sensor_add_sensor(sensor, args, i);
> > +               if (ret)
> > +                       return ret;

One more issue that escaped me earlier is that if the hardware sensor
being looked for is not there in the thermal_sensors list at this
point (say because it has not been added to it yet by whoever is
expected to do that), the above will return -ENODEV and all probe will
fail without a way to retry.

It would be better to return -EPROBE_DEFER from here IIUC.

However, this still wouldn't address a more general issue that if the
hardware sensor gets unregistered and then registered again (eg. by
unloading and reloading a module managing it), it will not be added
back to the corresponding virtual sensor's data set.  This should not
be very hard to address and it would take care of the above
initialization ordering issue automatically.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/2] thermal: add a virtual sensor to aggregate temperatures
  2021-10-07 16:38   ` Rafael J. Wysocki
  2021-10-07 17:33     ` Rafael J. Wysocki
  2021-10-09 14:46     ` Rafael J. Wysocki
@ 2021-10-25  9:18     ` Alexandre Bailon
  2 siblings, 0 replies; 19+ messages in thread
From: Alexandre Bailon @ 2021-10-25  9:18 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Zhang, Rui, Daniel Lezcano, Amit Kucheria, Linux PM,
	Linux Kernel Mailing List, ben.tseng, Kevin Hilman,
	Matthias Kaehlcke, kernel test robot

Hi Rafael,

On 10/7/21 6:38 PM, Rafael J. Wysocki wrote:
> On Fri, Sep 17, 2021 at 9:25 AM Alexandre Bailon <abailon@baylibre.com> wrote:
>> This adds a virtual thermal sensor that reads temperature from
> This should be "virtual sensor driver" I suppose.
>
>> hardware sensor and return an aggregated temperature.
> "returns"
>
> Also wrapping it around a single hardware sensor makes a little sense
> AFAICS, so I guess you mean "multiple hardware sensors" here.
>
>> Currently, this supports three operations:
>> the minimum, maximum and average temperature.
> They are aggregation functions rather than "operations"
>
>> Reported-by: kernel test robot <lkp@intel.com>
> This is new code, so what does it fix?
The CI has found some build issues.
I was not sure if I had to add this or not.
>
>> Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
>> ---
>>   drivers/thermal/Kconfig                  |   8 +
>>   drivers/thermal/Makefile                 |   1 +
>>   drivers/thermal/virtual-thermal-sensor.h |  54 ++++
>>   drivers/thermal/virtual_thermal_sensor.c | 350 +++++++++++++++++++++++
>>   4 files changed, 413 insertions(+)
>>   create mode 100644 drivers/thermal/virtual-thermal-sensor.h
>>   create mode 100644 drivers/thermal/virtual_thermal_sensor.c
>>
>> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
>> index d7f44deab5b1d..20bc93c48f5b1 100644
>> --- a/drivers/thermal/Kconfig
>> +++ b/drivers/thermal/Kconfig
>> @@ -228,6 +228,14 @@ config THERMAL_MMIO
>>            register or shared memory, is a potential candidate to work with this
>>            driver.
>>
>> +config VIRTUAL_THERMAL
>> +       tristate "Virtual thermal sensor driver"
> This should be "DT-based virtual thermal sensor driver" I think.
>
>> +       depends on THERMAL_OF || COMPILE_TEST
>> +       help
>> +         This option enables the generic thermal sensor aggregator.
> "generic DT-based"
>
>> +         This driver creates a thermal sensor that reads the hardware sensors
> s/the/multiple/
>
>> +         and aggregate the temperature.
> "aggregates their output"
>
>> +
>>   config HISI_THERMAL
>>          tristate "Hisilicon thermal driver"
>>          depends on ARCH_HISI || COMPILE_TEST
>> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
>> index 82fc3e616e54b..8bf55973059c5 100644
>> --- a/drivers/thermal/Makefile
>> +++ b/drivers/thermal/Makefile
>> @@ -60,3 +60,4 @@ obj-$(CONFIG_UNIPHIER_THERMAL)        += uniphier_thermal.o
>>   obj-$(CONFIG_AMLOGIC_THERMAL)     += amlogic_thermal.o
>>   obj-$(CONFIG_SPRD_THERMAL)     += sprd_thermal.o
>>   obj-$(CONFIG_KHADAS_MCU_FAN_THERMAL)   += khadas_mcu_fan.o
>> +obj-$(CONFIG_VIRTUAL_THERMAL) += virtual_thermal_sensor.o
>> diff --git a/drivers/thermal/virtual-thermal-sensor.h b/drivers/thermal/virtual-thermal-sensor.h
>> new file mode 100644
>> index 0000000000000..3bbf7c324dddc
>> --- /dev/null
>> +++ b/drivers/thermal/virtual-thermal-sensor.h
>> @@ -0,0 +1,54 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * Copyright (c) 2021 BayLibre
>> + */
>> +
>> +#ifndef __THERMAL_VIRTUAL_SENSOR_H__
>> +#define __THERMAL_VIRTUAL_SENSOR_H__
>> +
>> +#include <linux/device.h>
>> +#include <linux/thermal.h>
>> +
>> +struct virtual_thermal_sensor;
>> +struct thermal_sensor_data;
>> +
>> +#if IS_ENABLED(CONFIG_VIRTUAL_THERMAL)
>> +struct thermal_sensor_data *
>> +thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
>> +                               const struct thermal_zone_of_device_ops *ops);
>> +void thermal_virtual_sensor_unregister(struct device *dev,
>> +                                      struct thermal_sensor_data *sensor_data);
>> +struct thermal_sensor_data *
>> +devm_thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
>> +                                    const struct thermal_zone_of_device_ops *ops);
>> +
>> +void devm_thermal_virtual_sensor_unregister(struct device *dev,
>> +                                           struct virtual_thermal_sensor *sensor);
>> +#else
>> +static inline struct thermal_sensor_data *
>> +thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
>> +                               const struct thermal_zone_of_device_ops *ops)
>> +{
>> +       return ERR_PTR(-ENODEV);
>> +}
>> +
>> +void thermal_virtual_sensor_unregister(struct device *dev,
>> +                                      struct thermal_sensor_data *sensor_data)
>> +{
>> +}
>> +
>> +static inline struct thermal_sensor_data *
>> +devm_thermal_virtual_sensor_register(struct device *dev, int sensor_id, void *data,
>> +                                    const struct thermal_zone_of_device_ops *ops)
>> +{
>> +       return ERR_PTR(-ENODEV);
>> +}
>> +
>> +static inline
>> +void devm_thermal_virtual_sensor_unregister(struct device *dev,
>> +                                           struct virtual_thermal_sensor *sensor)
>> +{
>> +}
>> +#endif
>> +
>> +#endif /* __THERMAL_VIRTUAL_SENSOR_H__ */
>> diff --git a/drivers/thermal/virtual_thermal_sensor.c b/drivers/thermal/virtual_thermal_sensor.c
>> new file mode 100644
>> index 0000000000000..234563af6643e
>> --- /dev/null
>> +++ b/drivers/thermal/virtual_thermal_sensor.c
>> @@ -0,0 +1,350 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (c) 2021 BayLibre
> Please add information on what's in this file to the preamble.
Does something like "DT-based virtual thermal sensor driver" would be 
enough,
or should I details a little bit more ?
>
>> + */
>> +
>> +#include <linux/err.h>
>> +#include <linux/export.h>
>> +#include <linux/module.h>
>> +#include <linux/of_device.h>
>> +#include <linux/of_platform.h>
>> +#include <linux/slab.h>
>> +#include <linux/thermal.h>
>> +#include <linux/types.h>
>> +#include <linux/string.h>
>> +
>> +#include <dt-bindings/thermal/virtual-sensor.h>
>> +
>> +#include "virtual-thermal-sensor.h"
>> +
>> +struct thermal_sensor_data {
>> +       struct list_head node;
>> +
>> +       /* sensor interface */
>> +       int id;
>> +       void *sensor_data;
>> +       const struct thermal_zone_of_device_ops *ops;
>> +};
>> +
>> +struct virtual_thermal_sensor {
>> +       int count;
>> +       struct thermal_sensor_data *sensors;
>> +       struct thermal_zone_device *tzd;
>> +       int (*aggr_temp)(int temp1, int temp2);
>> +
>> +       struct list_head node;
>> +};
>> +
>> +static LIST_HEAD(thermal_sensors);
>> +static LIST_HEAD(virtual_sensors);
> These lists don't seem to be protected against concurrent access and
> they should be.
I Reworked the driver and I am going to remove them.
>
>> +
>> +static int max_temp(int temp1, int temp2)
>> +{
>> +       return max(temp1, temp2);
>> +}
>> +
>> +static int min_temp(int temp1, int temp2)
>> +{
>> +       return min(temp1, temp2);
>> +}
>> +
>> +static int avg_temp(int temp1, int temp2)
>> +{
>> +       return ((temp1) / 2) + ((temp2) / 2) + (((temp1) % 2 + (temp2) % 2) / 2);
> There are a few redundant parens here.
>
> And why not
>
> return (temp1 + temp2) / 2;
>
>> +}
>> +
>> +static int virtual_thermal_sensor_get_temp(void *data, int *temperature)
>> +{
>> +       struct virtual_thermal_sensor *sensor = data;
>> +       int max_temp = INT_MIN;
>> +       int temp;
>> +       int i;
>> +
> Some synchronization (eg. locking) should be added to this.  It is racy as is.
>
I am not sure to figure out what is racy here.
Anyway, I will update the logic of the driver and simplify this 
function, so this would
probably be fixed.
>> +       for (i = 0; i < sensor->count; i++) {
>> +               struct thermal_sensor_data *hw_sensor;
>> +
>> +               hw_sensor = &sensor->sensors[i];
>> +               if (!hw_sensor->ops)
>> +                       return -ENODEV;
>> +
>> +               hw_sensor->ops->get_temp(hw_sensor->sensor_data, &temp);
>> +               max_temp = sensor->aggr_temp(max_temp, temp);
>> +       }
>> +
>> +       *temperature = max_temp;
>> +
>> +       return 0;
>> +}
>> +
>> +static const struct thermal_zone_of_device_ops virtual_thermal_sensor_ops = {
>> +       .get_temp = virtual_thermal_sensor_get_temp,
>> +};
>> +
>> +static int virtual_sensor_add_sensor(struct virtual_thermal_sensor *sensor,
>> +                                    struct of_phandle_args args,
>> +                                    int index)
>> +{
>> +       struct thermal_sensor_data *sensor_data;
>> +       int id;
>> +
>> +       list_for_each_entry(sensor_data, &thermal_sensors, node) {
>> +               id = args.args_count ? args.args[0] : 0;
>> +               if (sensor_data->id == id) {
>> +                       memcpy(&sensor->sensors[index], sensor_data,
>> +                               sizeof(*sensor_data));
> The data object being copied includes pointers, so I wouldn't copy
> them like this.  Especially the ops one annotated as const.
>
>> +                       return 0;
>> +               }
>> +       }
>> +
>> +       return -ENODEV;
>> +}
>> +
>> +static int virtual_thermal_sensor_probe(struct platform_device *pdev)
>> +{
>> +       struct virtual_thermal_sensor *sensor;
>> +       struct device *dev = &pdev->dev;
>> +       struct of_phandle_args args;
>> +       u32 type;
>> +       int ret;
>> +       int i;
>> +
>> +       sensor = devm_kzalloc(dev, sizeof(*sensor), GFP_KERNEL);
>> +       if (!sensor)
>> +               return -ENOMEM;
>> +
>> +       sensor->count = of_count_phandle_with_args(dev->of_node,
>> +                                                  "thermal-sensors",
>> +                                                  "#thermal-sensor-cells");
>> +       if (sensor->count <= 0)
>> +               return -EINVAL;
>> +
>> +       sensor->sensors = devm_kmalloc_array(dev, sensor->count,
>> +                                            sizeof(*sensor->sensors),
>> +                                            GFP_KERNEL);
>> +       if (!sensor->sensors)
>> +               return -ENOMEM;
>> +
>> +       for (i = 0; i < sensor->count; i++) {
>> +               ret = of_parse_phandle_with_args(dev->of_node,
>> +                                                "thermal-sensors",
>> +                                                "#thermal-sensor-cells",
>> +                                                i,
>> +                                                &args);
>> +               if (ret)
>> +                       return ret;
>> +
>> +               ret = virtual_sensor_add_sensor(sensor, args, i);
>> +               if (ret)
>> +                       return ret;
>> +       }
>> +
>> +       ret = of_property_read_u32(dev->of_node, "aggregation-function", &type);
>> +       if (ret)
>> +               return ret;
>> +
>> +       switch (type) {
>> +       case VIRTUAL_THERMAL_SENSOR_MAX:
>> +               sensor->aggr_temp = max_temp;
>> +               break;
>> +       case VIRTUAL_THERMAL_SENSOR_MIN:
>> +               sensor->aggr_temp = min_temp;
>> +               break;
>> +       case VIRTUAL_THERMAL_SENSOR_AVG:
>> +               sensor->aggr_temp = avg_temp;
>> +               break;
>> +       default:
>> +               return -EINVAL;
>> +       }
>> +
>> +       sensor->tzd = devm_thermal_zone_of_sensor_register(dev, 0, sensor,
>> +                                                          &virtual_thermal_sensor_ops);
>> +       if (IS_ERR(sensor->tzd))
>> +               return PTR_ERR(sensor->tzd);
>> +
>> +       platform_set_drvdata(pdev, sensor);
>> +       list_add(&sensor->node, &virtual_sensors);
>> +
>> +       return 0;
>> +}
>> +
>> +static int virtual_thermal_sensor_remove(struct platform_device *pdev)
>> +{
>> +       struct virtual_thermal_sensor *sensor;
>> +
>> +       sensor = platform_get_drvdata(pdev);
>> +       list_del(&sensor->node);
>> +
>> +       return 0;
>> +}
>> +
>> +static const struct of_device_id virtual_thermal_sensor_of_match[] = {
>> +       {
>> +               .compatible = "virtual,thermal-sensor",
>> +       },
>> +       {
>> +       },
>> +};
>> +MODULE_DEVICE_TABLE(of, virtual_thermal_sensor_of_match);
>> +
>> +static struct platform_driver virtual_thermal_sensor = {
>> +       .probe = virtual_thermal_sensor_probe,
>> +       .remove = virtual_thermal_sensor_remove,
>> +       .driver = {
>> +               .name = "virtual-thermal-sensor",
>> +               .of_match_table = virtual_thermal_sensor_of_match,
>> +       },
>> +};
>> +
>> +/**
>> + * thermal_virtual_sensor_register - registers a sensor that could be a virtual
>> + * sensor
>> + * @dev: a valid struct device pointer of a sensor device. Must contain
>> + *       a valid .of_node, for the sensor node.
>> + * @sensor_id: a sensor identifier, in case the sensor IP has more
>> + *             than one sensor
>> + * @data: a private pointer (owned by the caller) that will be passed
>> + *        back, when a temperature reading is needed.
>> + * @ops: struct thermal_zone_of_device_ops *. Must contain at least .get_temp.
>> + *
>> + * This function will register a thermal sensor to make it available for later
>> + * usage by a virtual sensor.
> So who's expected to be calling it?
Any driver that would like to expose its sensors to the virtual thermal 
sensor driver,
which was a bad solution ...
In the next series, I will remove this function, as well the other 
register /  unregister
functions, and use directly thermal_zone_device to get the temperature 
from the sensors.

Probe will take care of finding the thermal_zone_device listed in the 
virtual sensor device
tree node, will defer if a thermal_zone_device is not yet ready and use 
try_module_get()
to prevent any thermal_zone_device used by the virtual sensor to be removed.

Best Regards,
Alexandre

>
>> + *
>> + * The thermal zone temperature is provided by the @get_temp function
>> + * pointer. When called, it will have the private pointer @data back.
> It is unclear what you mean here.  I think you are describing a return
> mechanism, but as is it is hard to follow.
>
>> + *
>> + * Return: On success returns a valid struct thermal_zone_device,
> That doesn't seem to be the case.  It returns an address returned by
> devm_kzalloc() and the memory at that address is expected to be a
> thermal_sensor_data object.
>
>> + * otherwise, it returns a corresponding ERR_PTR(). Caller must
>> + * check the return value with help of IS_ERR() helper.
>> + */
>> +struct thermal_sensor_data *thermal_virtual_sensor_register(
>> +       struct device *dev, int sensor_id, void *data,
>> +       const struct thermal_zone_of_device_ops *ops)
>> +{
>> +       struct thermal_sensor_data *sensor_data;
>> +
>> +       sensor_data = devm_kzalloc(dev, sizeof(*sensor_data), GFP_KERNEL);
>> +       if (!sensor_data)
>> +               return ERR_PTR(-ENOMEM);
>> +
>> +       sensor_data->id = sensor_id;
>> +       sensor_data->sensor_data = data;
>> +       sensor_data->ops = ops;
>> +
>> +       list_add(&sensor_data->node, &thermal_sensors);
>> +
>> +       return sensor_data;
> Overall, this is non-modular code squeezed into a module.
>
> If this module goes away and whoever called it wants to unregister the
> sensor, they will crash, so to a minimum it should do a
> try_module_get() on the module containing it.
>
>> +}
>> +EXPORT_SYMBOL_GPL(thermal_virtual_sensor_register);
>> +
>> +/**
>> + * thermal_virtual_sensor_unregister - unregisters a sensor
>> + * @dev: a valid struct device pointer of a sensor device.
>> + * @sensor_data: a pointer to struct thermal_sensor_data to unregister.
>> + *
>> + * This function removes the sensor from the list of available thermal sensors.
>> + * If the sensor is in use, then the next call to .get_temp will return -ENODEV.
>> + */
>> +void thermal_virtual_sensor_unregister(struct device *dev,
>> +                                      struct thermal_sensor_data *sensor_data)
>> +{
>> +       struct thermal_sensor_data *sd;
>> +       struct virtual_thermal_sensor *sensor;
>> +       int i;
>> +
> All of the code in this function is obviously racy.
>
>> +       list_del(&sensor_data->node);
>> +
>> +       list_for_each_entry(sensor, &virtual_sensors, node) {
>> +               for (i = 0; i < sensor->count; i++) {
>> +                       sd = &sensor->sensors[i];
>> +                       if (sd->id == sensor_data->id &&
>> +                               sd->sensor_data == sensor_data->sensor_data) {
>> +                               sd->ops = NULL;
>> +                       }
> The inner braces are not necessary.
>
>> +               }
>> +       }
> And again, this code requires the module to be protected from
> unloading until the last instance of
> thermal_virtual_sensor_unregister() has returned.
>
>> +}
>> +EXPORT_SYMBOL_GPL(thermal_virtual_sensor_unregister);
>> +
>> +static void devm_thermal_virtual_sensor_release(struct device *dev, void *res)
>> +{
>> +       thermal_virtual_sensor_unregister(dev,
>> +                                         *(struct thermal_sensor_data **)res);
>> +}
>> +
>> +static int devm_thermal_virtual_sensor_match(struct device *dev, void *res,
>> +                                            void *data)
>> +{
>> +       struct thermal_sensor_data **r = res;
>> +
>> +       if (WARN_ON(!r || !*r))
>> +               return 0;
>> +
>> +       return *r == data;
>> +}
>> +
>> +/**
>> + * devm_thermal_virtual_sensor_register - Resource managed version of
>> + *                             thermal_virtual_sensor_register()
>> + * @dev: a valid struct device pointer of a sensor device. Must contain
>> + *       a valid .of_node, for the sensor node.
>> + * @sensor_id: a sensor identifier, in case the sensor IP has more
>> + *            than one sensor
>> + * @data: a private pointer (owned by the caller) that will be passed
>> + *       back, when a temperature reading is needed.
>> + * @ops: struct thermal_zone_of_device_ops *. Must contain at least .get_temp.
>> + *
>> + * Refer thermal_zone_of_sensor_register() for more details.
>> + *
>> + * Return: On success returns a valid struct virtual_sensor_data,
>> + * otherwise, it returns a corresponding ERR_PTR(). Caller must
>> + * check the return value with help of IS_ERR() helper.
>> + * Registered virtual_sensor_data device will automatically be
>> + * released when device is unbound.
>> + */
>> +struct thermal_sensor_data *devm_thermal_virtual_sensor_register(
>> +       struct device *dev, int sensor_id,
>> +       void *data, const struct thermal_zone_of_device_ops *ops)
>> +{
>> +       struct thermal_sensor_data **ptr, *sensor_data;
>> +
>> +       ptr = devres_alloc(devm_thermal_virtual_sensor_release, sizeof(*ptr),
>> +                          GFP_KERNEL);
>> +       if (!ptr)
>> +               return ERR_PTR(-ENOMEM);
>> +
>> +       sensor_data = thermal_virtual_sensor_register(dev, sensor_id, data, ops);
>> +       if (IS_ERR(sensor_data)) {
>> +               devres_free(ptr);
>> +               return sensor_data;
>> +       }
>> +
>> +       *ptr = sensor_data;
>> +       devres_add(dev, ptr);
>> +
>> +       return sensor_data;
>> +}
>> +EXPORT_SYMBOL_GPL(devm_thermal_virtual_sensor_register);
>> +
>> +/**
>> + * devm_thermal_virtual_sensor_unregister - Resource managed version of
>> + *                             thermal_virtual_sensor_unregister().
>> + * @dev: Device for which resource was allocated.
>> + * @sensor: a pointer to struct thermal_zone_device where the sensor is registered.
>> + *
>> + * This function removes the sensor from the list of sensors registered with
>> + * devm_thermal_virtual_sensor_register() API.
>> + * Normally this function will not need to be called and the resource
>> + * management code will ensure that the resource is freed.
>> + */
>> +void devm_thermal_virtual_sensor_unregister(struct device *dev,
>> +                                           struct virtual_thermal_sensor *sensor)
>> +{
>> +       WARN_ON(devres_release(dev, devm_thermal_virtual_sensor_release,
>> +                              devm_thermal_virtual_sensor_match, sensor));
>> +}
>> +EXPORT_SYMBOL_GPL(devm_thermal_virtual_sensor_unregister);
>> +
>> +module_platform_driver(virtual_thermal_sensor);
>> +MODULE_AUTHOR("Alexandre Bailon <abailon@baylibre.com>");
>> +MODULE_DESCRIPTION("Virtual thermal sensor");
>> +MODULE_LICENSE("GPL v2");
>> --

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2021-10-25  9:18 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-17  7:27 [PATCH v2 0/2] Add a generic virtual thermal sensor Alexandre Bailon
2021-09-17  7:27 ` [PATCH v2 1/2] dt-bindings: Add bindings for the " Alexandre Bailon
2021-09-17  7:27 ` [PATCH v2 2/2] thermal: add a virtual sensor to aggregate temperatures Alexandre Bailon
2021-10-07 16:38   ` Rafael J. Wysocki
2021-10-07 17:33     ` Rafael J. Wysocki
2021-10-09 14:46     ` Rafael J. Wysocki
2021-10-25  9:18     ` Alexandre Bailon
2021-09-17 12:41 ` [PATCH v2 0/2] Add a generic virtual thermal sensor Daniel Lezcano
2021-09-17 13:33   ` Alexandre Bailon
2021-09-17 14:03     ` Daniel Lezcano
2021-09-20 13:12       ` Alexandre Bailon
2021-09-22  8:10         ` Daniel Lezcano
2021-10-04 10:24           ` Alexandre Bailon
2021-10-04 13:42             ` Daniel Lezcano
2021-10-05 16:45               ` Rafael J. Wysocki
2021-10-06 16:06                 ` Daniel Lezcano
2021-10-06 18:00                   ` Rafael J. Wysocki
2021-10-06 19:51                     ` Daniel Lezcano
2021-10-07 14:17                       ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).