linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/6] thermal: core: Assorted improvements for v6.11
@ 2024-05-10 14:11 Rafael J. Wysocki
  2024-05-10 14:13 ` [PATCH v1 1/6] thermal: sysfs: Trigger zone temperature updates on sysfs reads Rafael J. Wysocki
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2024-05-10 14:11 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Rafael J. Wysocki, Lukasz Luba, Daniel Lezcano,
	Srinivas Pandruvada, Zhang Rui

Hi Everyone,

This series is for the 6.11 cycle, but since it is ready from my POV,
here it goes in case people have the time to look at it in the meantime.

The patches in the series address some shortcomings in the thermal
core code (including the Bang-Bang governor) and clean it up somewhat.

Please refer to the individual patch changelogs for details.

This series is independent of the thermal debugfs one posted yesterday:

https://lore.kernel.org/linux-pm/12438864.O9o76ZdvQC@kreacher/

At one point I'm going to put this series on a git branch for easier
access/testing.

Thanks!




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v1 1/6] thermal: sysfs: Trigger zone temperature updates on sysfs reads
  2024-05-10 14:11 [PATCH v1 0/6] thermal: core: Assorted improvements for v6.11 Rafael J. Wysocki
@ 2024-05-10 14:13 ` Rafael J. Wysocki
  2024-05-13  7:11   ` Lukasz Luba
  2024-05-10 14:14 ` [PATCH v1 2/6] thermal: trip: Rename __thermal_zone_set_trips() to thermal_zone_set_trips() Rafael J. Wysocki
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Rafael J. Wysocki @ 2024-05-10 14:13 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Rafael J. Wysocki, Lukasz Luba, Daniel Lezcano,
	Srinivas Pandruvada, Zhang Rui

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Reading the zone temperature via sysfs causes the driver callback to
be invoked, but it does not cause the thermal zone object to be updated.

This is problematic if the zone temperature read via sysfs differs from
the temperature value stored in the thermal zone object as it may cause
the kernel and user space to act against each other in some cases.

For this reason, make temp_show() trigger a zone temperature update if
the temperature returned by thermal_zone_get_temp() is different from
the temperature value stored in the thermal zone object.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/thermal/thermal_core.c  |    2 +-
 drivers/thermal/thermal_sysfs.c |    3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/thermal/thermal_sysfs.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_sysfs.c
+++ linux-pm/drivers/thermal/thermal_sysfs.c
@@ -42,6 +42,9 @@ temp_show(struct device *dev, struct dev
 	if (ret)
 		return ret;
 
+	if (temperature != READ_ONCE(tz->temperature))
+		thermal_zone_device_update(tz, THERMAL_EVENT_TEMP_SAMPLE);
+
 	return sprintf(buf, "%d\n", temperature);
 }
 
Index: linux-pm/drivers/thermal/thermal_core.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_core.c
+++ linux-pm/drivers/thermal/thermal_core.c
@@ -429,7 +429,7 @@ static void update_temperature(struct th
 	}
 
 	tz->last_temperature = tz->temperature;
-	tz->temperature = temp;
+	WRITE_ONCE(tz->temperature, temp);
 
 	trace_thermal_temperature(tz);
 




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v1 2/6] thermal: trip: Rename __thermal_zone_set_trips() to thermal_zone_set_trips()
  2024-05-10 14:11 [PATCH v1 0/6] thermal: core: Assorted improvements for v6.11 Rafael J. Wysocki
  2024-05-10 14:13 ` [PATCH v1 1/6] thermal: sysfs: Trigger zone temperature updates on sysfs reads Rafael J. Wysocki
@ 2024-05-10 14:14 ` Rafael J. Wysocki
  2024-05-10 14:16 ` [PATCH v1 3/6] thermal: trip: Make thermal_zone_set_trips() use trip thresholds Rafael J. Wysocki
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2024-05-10 14:14 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Rafael J. Wysocki, Lukasz Luba, Daniel Lezcano,
	Srinivas Pandruvada, Zhang Rui

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Drop the pointless double underline prefix from the function name as per
the subject.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/thermal/thermal_core.c |    2 +-
 drivers/thermal/thermal_core.h |    2 +-
 drivers/thermal/thermal_trip.c |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

Index: linux-pm/drivers/thermal/thermal_core.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_core.c
+++ linux-pm/drivers/thermal/thermal_core.c
@@ -498,7 +498,7 @@ void __thermal_zone_device_update(struct
 	if (tz->temperature == THERMAL_TEMP_INVALID)
 		return;
 
-	__thermal_zone_set_trips(tz);
+	thermal_zone_set_trips(tz);
 
 	tz->notify_event = event;
 
Index: linux-pm/drivers/thermal/thermal_core.h
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_core.h
+++ linux-pm/drivers/thermal/thermal_core.h
@@ -242,7 +242,7 @@ void thermal_governor_update_tz(struct t
 
 const char *thermal_trip_type_name(enum thermal_trip_type trip_type);
 
-void __thermal_zone_set_trips(struct thermal_zone_device *tz);
+void thermal_zone_set_trips(struct thermal_zone_device *tz);
 int thermal_zone_trip_id(const struct thermal_zone_device *tz,
 			 const struct thermal_trip *trip);
 void thermal_zone_trip_updated(struct thermal_zone_device *tz,
Index: linux-pm/drivers/thermal/thermal_trip.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_trip.c
+++ linux-pm/drivers/thermal/thermal_trip.c
@@ -76,7 +76,7 @@ EXPORT_SYMBOL_GPL(thermal_zone_get_num_t
  *
  * It does not return a value
  */
-void __thermal_zone_set_trips(struct thermal_zone_device *tz)
+void thermal_zone_set_trips(struct thermal_zone_device *tz)
 {
 	const struct thermal_trip_desc *td;
 	int low = -INT_MAX, high = INT_MAX;




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v1 3/6] thermal: trip: Make thermal_zone_set_trips() use trip thresholds
  2024-05-10 14:11 [PATCH v1 0/6] thermal: core: Assorted improvements for v6.11 Rafael J. Wysocki
  2024-05-10 14:13 ` [PATCH v1 1/6] thermal: sysfs: Trigger zone temperature updates on sysfs reads Rafael J. Wysocki
  2024-05-10 14:14 ` [PATCH v1 2/6] thermal: trip: Rename __thermal_zone_set_trips() to thermal_zone_set_trips() Rafael J. Wysocki
@ 2024-05-10 14:16 ` Rafael J. Wysocki
  2024-05-10 14:18 ` [PATCH v1 4/6] thermal: trip: Use READ_ONCE() for lockless access to trip properties Rafael J. Wysocki
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2024-05-10 14:16 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Rafael J. Wysocki, Lukasz Luba, Daniel Lezcano,
	Srinivas Pandruvada, Zhang Rui

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Modify thermal_zone_set_trips() to use trip thresholds instead of
computing the low temperature for each trip to avoid deriving both
the high and low temperature levels from the same trip (which may
happen if the zone temperature falls into the hysteresis range of
one trip).

Accordingly, make __thermal_zone_device_update() call
thermal_zone_set_trips() later, when threshold values have
been updated for all trips.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/thermal/thermal_core.c |    4 ++--
 drivers/thermal/thermal_trip.c |   14 ++++----------
 2 files changed, 6 insertions(+), 12 deletions(-)

Index: linux-pm/drivers/thermal/thermal_core.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_core.c
+++ linux-pm/drivers/thermal/thermal_core.c
@@ -498,13 +498,13 @@ void __thermal_zone_device_update(struct
 	if (tz->temperature == THERMAL_TEMP_INVALID)
 		return;
 
-	thermal_zone_set_trips(tz);
-
 	tz->notify_event = event;
 
 	for_each_trip_desc(tz, td)
 		handle_thermal_trip(tz, td, &way_up_list, &way_down_list);
 
+	thermal_zone_set_trips(tz);
+
 	list_sort(&way_up_list, &way_up_list, thermal_trip_notify_cmp);
 	list_for_each_entry(td, &way_up_list, notify_list_node) {
 		thermal_notify_tz_trip_up(tz, &td->trip);
Index: linux-pm/drivers/thermal/thermal_trip.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_trip.c
+++ linux-pm/drivers/thermal/thermal_trip.c
@@ -88,17 +88,11 @@ void thermal_zone_set_trips(struct therm
 		return;
 
 	for_each_trip_desc(tz, td) {
-		const struct thermal_trip *trip = &td->trip;
-		int trip_low;
+		if (td->threshold < tz->temperature && td->threshold > low)
+			low = td->threshold;
 
-		trip_low = trip->temperature - trip->hysteresis;
-
-		if (trip_low < tz->temperature && trip_low > low)
-			low = trip_low;
-
-		if (trip->temperature > tz->temperature &&
-		    trip->temperature < high)
-			high = trip->temperature;
+		if (td->threshold > tz->temperature && td->threshold < high)
+			high = td->threshold;
 	}
 
 	/* No need to change trip points */




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v1 4/6] thermal: trip: Use READ_ONCE() for lockless access to trip properties
  2024-05-10 14:11 [PATCH v1 0/6] thermal: core: Assorted improvements for v6.11 Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2024-05-10 14:16 ` [PATCH v1 3/6] thermal: trip: Make thermal_zone_set_trips() use trip thresholds Rafael J. Wysocki
@ 2024-05-10 14:18 ` Rafael J. Wysocki
  2024-05-10 14:19 ` [PATCH v1 5/6] thermal: gov_bang_bang: Drop unnecessary cooling device target state checks Rafael J. Wysocki
  2024-05-10 14:20 ` [PATCH v1 6/6] thermal: core: Avoid calling .trip_crossed() for critical and hot trips Rafael J. Wysocki
  5 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2024-05-10 14:18 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Rafael J. Wysocki, Lukasz Luba, Daniel Lezcano,
	Srinivas Pandruvada, Zhang Rui

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

When accessing trip temperature and hysteresis without locking, it is
better to use READ_ONCE() to prevent compiler optimizations possibly
affecting the read from being applied.

Of course, for the READ_ONCE() to be effective, WRITE_ONCE() needs to
be used when updating their values.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/thermal/thermal_sysfs.c |    6 +++---
 drivers/thermal/thermal_trip.c  |    2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

Index: linux-pm/drivers/thermal/thermal_sysfs.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_sysfs.c
+++ linux-pm/drivers/thermal/thermal_sysfs.c
@@ -141,7 +141,7 @@ trip_point_temp_show(struct device *dev,
 	if (sscanf(attr->attr.name, "trip_point_%d_temp", &trip_id) != 1)
 		return -EINVAL;
 
-	return sprintf(buf, "%d\n", tz->trips[trip_id].trip.temperature);
+	return sprintf(buf, "%d\n", READ_ONCE(tz->trips[trip_id].trip.temperature));
 }
 
 static ssize_t
@@ -165,7 +165,7 @@ trip_point_hyst_store(struct device *dev
 	trip = &tz->trips[trip_id].trip;
 
 	if (hyst != trip->hysteresis) {
-		trip->hysteresis = hyst;
+		WRITE_ONCE(trip->hysteresis, hyst);
 
 		thermal_zone_trip_updated(tz, trip);
 	}
@@ -185,7 +185,7 @@ trip_point_hyst_show(struct device *dev,
 	if (sscanf(attr->attr.name, "trip_point_%d_hyst", &trip_id) != 1)
 		return -EINVAL;
 
-	return sprintf(buf, "%d\n", tz->trips[trip_id].trip.hysteresis);
+	return sprintf(buf, "%d\n", READ_ONCE(tz->trips[trip_id].trip.hysteresis));
 }
 
 static ssize_t
Index: linux-pm/drivers/thermal/thermal_trip.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_trip.c
+++ linux-pm/drivers/thermal/thermal_trip.c
@@ -161,7 +161,7 @@ void thermal_zone_set_trip_temp(struct t
 	if (trip->temperature == temp)
 		return;
 
-	trip->temperature = temp;
+	WRITE_ONCE(trip->temperature, temp);
 	thermal_notify_tz_trip_change(tz, trip);
 }
 EXPORT_SYMBOL_GPL(thermal_zone_set_trip_temp);




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v1 5/6] thermal: gov_bang_bang: Drop unnecessary cooling device target state checks
  2024-05-10 14:11 [PATCH v1 0/6] thermal: core: Assorted improvements for v6.11 Rafael J. Wysocki
                   ` (3 preceding siblings ...)
  2024-05-10 14:18 ` [PATCH v1 4/6] thermal: trip: Use READ_ONCE() for lockless access to trip properties Rafael J. Wysocki
@ 2024-05-10 14:19 ` Rafael J. Wysocki
  2024-05-10 14:20 ` [PATCH v1 6/6] thermal: core: Avoid calling .trip_crossed() for critical and hot trips Rafael J. Wysocki
  5 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2024-05-10 14:19 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Rafael J. Wysocki, Lukasz Luba, Daniel Lezcano,
	Srinivas Pandruvada, Zhang Rui

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Some cooling device target state checks in bang_bang_control()
done before setting the new target state are not necessary after
recent changes, so drop them.

Also avoid updating the target state before checking it for
unexpected values.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/thermal/gov_bang_bang.c |   14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

Index: linux-pm/drivers/thermal/gov_bang_bang.c
===================================================================
--- linux-pm.orig/drivers/thermal/gov_bang_bang.c
+++ linux-pm/drivers/thermal/gov_bang_bang.c
@@ -57,24 +57,16 @@ static void bang_bang_control(struct the
 		if (instance->trip != trip)
 			continue;
 
-		if (instance->target == THERMAL_NO_TARGET)
-			instance->target = 0;
-
-		if (instance->target != 0 && instance->target != 1) {
+		if (instance->target != 0 && instance->target != 1 &&
+		    instance->target != THERMAL_NO_TARGET)
 			pr_debug("Unexpected state %ld of thermal instance %s in bang-bang\n",
 				 instance->target, instance->name);
 
-			instance->target = 1;
-		}
-
 		/*
 		 * Enable the fan when the trip is crossed on the way up and
 		 * disable it when the trip is crossed on the way down.
 		 */
-		if (instance->target == 0 && crossed_up)
-			instance->target = 1;
-		else if (instance->target == 1 && !crossed_up)
-			instance->target = 0;
+		instance->target = crossed_up;
 
 		dev_dbg(&instance->cdev->device, "target=%ld\n", instance->target);
 




^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v1 6/6] thermal: core: Avoid calling .trip_crossed() for critical and hot trips
  2024-05-10 14:11 [PATCH v1 0/6] thermal: core: Assorted improvements for v6.11 Rafael J. Wysocki
                   ` (4 preceding siblings ...)
  2024-05-10 14:19 ` [PATCH v1 5/6] thermal: gov_bang_bang: Drop unnecessary cooling device target state checks Rafael J. Wysocki
@ 2024-05-10 14:20 ` Rafael J. Wysocki
  5 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2024-05-10 14:20 UTC (permalink / raw)
  To: Linux PM
  Cc: LKML, Rafael J. Wysocki, Lukasz Luba, Daniel Lezcano,
	Srinivas Pandruvada, Zhang Rui

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Invoking the governor .trip_crossed() callback for critical and hot
trips is pointless because they are handled directly by the core,
so make thermal_governor_trip_crossed() avoid doing that.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/thermal/thermal_core.c |    3 +++
 1 file changed, 3 insertions(+)

Index: linux-pm/drivers/thermal/thermal_core.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_core.c
+++ linux-pm/drivers/thermal/thermal_core.c
@@ -463,6 +463,9 @@ static void thermal_governor_trip_crosse
 					  const struct thermal_trip *trip,
 					  bool crossed_up)
 {
+	if (trip->type == THERMAL_TRIP_HOT || trip->type == THERMAL_TRIP_CRITICAL)
+		return;
+
 	if (governor->trip_crossed)
 		governor->trip_crossed(tz, trip, crossed_up);
 }




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1 1/6] thermal: sysfs: Trigger zone temperature updates on sysfs reads
  2024-05-10 14:13 ` [PATCH v1 1/6] thermal: sysfs: Trigger zone temperature updates on sysfs reads Rafael J. Wysocki
@ 2024-05-13  7:11   ` Lukasz Luba
  2024-05-16  9:04     ` Rafael J. Wysocki
  0 siblings, 1 reply; 12+ messages in thread
From: Lukasz Luba @ 2024-05-13  7:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: LKML, Linux PM, Rafael J. Wysocki, Daniel Lezcano,
	Srinivas Pandruvada, Zhang Rui

Hi Rafael,

On 5/10/24 15:13, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Reading the zone temperature via sysfs causes the driver callback to
> be invoked, but it does not cause the thermal zone object to be updated.
> 
> This is problematic if the zone temperature read via sysfs differs from
> the temperature value stored in the thermal zone object as it may cause
> the kernel and user space to act against each other in some cases.
> 
> For this reason, make temp_show() trigger a zone temperature update if
> the temperature returned by thermal_zone_get_temp() is different from
> the temperature value stored in the thermal zone object.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>   drivers/thermal/thermal_core.c  |    2 +-
>   drivers/thermal/thermal_sysfs.c |    3 +++
>   2 files changed, 4 insertions(+), 1 deletion(-)
> 
> Index: linux-pm/drivers/thermal/thermal_sysfs.c
> ===================================================================
> --- linux-pm.orig/drivers/thermal/thermal_sysfs.c
> +++ linux-pm/drivers/thermal/thermal_sysfs.c
> @@ -42,6 +42,9 @@ temp_show(struct device *dev, struct dev
>   	if (ret)
>   		return ret;
>   
> +	if (temperature != READ_ONCE(tz->temperature))
> +		thermal_zone_device_update(tz, THERMAL_EVENT_TEMP_SAMPLE);

That's a bit problematic because it will trigger
governor->manage()

In case of IPA governor we relay on constant polling
period. We estimate the past power usage and current
thermal budget, to derive the next period power budget
for devices. I don't know if the internal PID algorithm
will be resilient enough to compensate this asynchronous
trigger caused from user-space.

We choose the period to be at least 1 frame (e.g. ~16ms)
to have good avg usage of CPUs and GPU. TBH I don't know
what would happen if someone reads the temp after e.g. 1ms
of last IPA trigger, but some devices (e.g. GPU) wasn't
loaded in that last 1ms delta...
I'm a bit more relaxed about CPUs because we use utilization
signal from runqueues (like the TEO util gov). That's a moving
avg signal which should keep some history, like low-pass
filter, so information is more resilient in that case.

Could we think about that IPA constant period usage?
I think I understand the proposal of your patch.
We might add a filter inside IPA to ignore such async
triggers in the .manage() callback.
What do you think?

Regards,
Lukasz

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1 1/6] thermal: sysfs: Trigger zone temperature updates on sysfs reads
  2024-05-13  7:11   ` Lukasz Luba
@ 2024-05-16  9:04     ` Rafael J. Wysocki
  2024-05-16  9:45       ` Daniel Lezcano
  0 siblings, 1 reply; 12+ messages in thread
From: Rafael J. Wysocki @ 2024-05-16  9:04 UTC (permalink / raw)
  To: Lukasz Luba
  Cc: Rafael J. Wysocki, LKML, Linux PM, Rafael J. Wysocki,
	Daniel Lezcano, Srinivas Pandruvada, Zhang Rui

Hi Lukasz,

On Mon, May 13, 2024 at 9:11 AM Lukasz Luba <lukasz.luba@arm.com> wrote:
>
> Hi Rafael,
>
> On 5/10/24 15:13, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Reading the zone temperature via sysfs causes the driver callback to
> > be invoked, but it does not cause the thermal zone object to be updated.
> >
> > This is problematic if the zone temperature read via sysfs differs from
> > the temperature value stored in the thermal zone object as it may cause
> > the kernel and user space to act against each other in some cases.
> >
> > For this reason, make temp_show() trigger a zone temperature update if
> > the temperature returned by thermal_zone_get_temp() is different from
> > the temperature value stored in the thermal zone object.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >   drivers/thermal/thermal_core.c  |    2 +-
> >   drivers/thermal/thermal_sysfs.c |    3 +++
> >   2 files changed, 4 insertions(+), 1 deletion(-)
> >
> > Index: linux-pm/drivers/thermal/thermal_sysfs.c
> > ===================================================================
> > --- linux-pm.orig/drivers/thermal/thermal_sysfs.c
> > +++ linux-pm/drivers/thermal/thermal_sysfs.c
> > @@ -42,6 +42,9 @@ temp_show(struct device *dev, struct dev
> >       if (ret)
> >               return ret;
> >
> > +     if (temperature != READ_ONCE(tz->temperature))
> > +             thermal_zone_device_update(tz, THERMAL_EVENT_TEMP_SAMPLE);
>
> That's a bit problematic because it will trigger
> governor->manage()
>
> In case of IPA governor we relay on constant polling
> period. We estimate the past power usage and current
> thermal budget, to derive the next period power budget
> for devices. I don't know if the internal PID algorithm
> will be resilient enough to compensate this asynchronous
> trigger caused from user-space.
>
> We choose the period to be at least 1 frame (e.g. ~16ms)
> to have good avg usage of CPUs and GPU. TBH I don't know
> what would happen if someone reads the temp after e.g. 1ms
> of last IPA trigger, but some devices (e.g. GPU) wasn't
> loaded in that last 1ms delta...
> I'm a bit more relaxed about CPUs because we use utilization
> signal from runqueues (like the TEO util gov). That's a moving
> avg signal which should keep some history, like low-pass
> filter, so information is more resilient in that case.
>
> Could we think about that IPA constant period usage?
> I think I understand the proposal of your patch.
> We might add a filter inside IPA to ignore such async
> triggers in the .manage() callback.
> What do you think?

Thanks for pointing this out.

Actually, the target audience for this change are thermal zones that
are not updated on a regular basis, so what about storing the last
zone temperature update time in the zone object and only running an
update from temp_show() if sufficient time has elapsed since the last
update (say 1 sec)?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1 1/6] thermal: sysfs: Trigger zone temperature updates on sysfs reads
  2024-05-16  9:04     ` Rafael J. Wysocki
@ 2024-05-16  9:45       ` Daniel Lezcano
  2024-05-16 10:02         ` Rafael J. Wysocki
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Lezcano @ 2024-05-16  9:45 UTC (permalink / raw)
  To: Rafael J. Wysocki, Lukasz Luba
  Cc: Rafael J. Wysocki, LKML, Linux PM, Srinivas Pandruvada, Zhang Rui


Hi Rafael,

On 16/05/2024 11:04, Rafael J. Wysocki wrote:
> Hi Lukasz,
> 
> On Mon, May 13, 2024 at 9:11 AM Lukasz Luba <lukasz.luba@arm.com> wrote:
>>
>> Hi Rafael,
>>
>> On 5/10/24 15:13, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>
>>> Reading the zone temperature via sysfs causes the driver callback to
>>> be invoked, but it does not cause the thermal zone object to be updated.
>>>
>>> This is problematic if the zone temperature read via sysfs differs from
>>> the temperature value stored in the thermal zone object as it may cause
>>> the kernel and user space to act against each other in some cases.
>>>
>>> For this reason, make temp_show() trigger a zone temperature update if
>>> the temperature returned by thermal_zone_get_temp() is different from
>>> the temperature value stored in the thermal zone object.


The hwmon system is doing something similar and I'm not sure we want to 
mimic the same behavior.

Just to summarize:

1. There is a polling delay set

This polling delay gives the sampling rate the thermal zone is 
monitored. The temperature is updated at each 'delay' tick

2. There is no polling delay set

The system relies on the interrupts to tell when a temperature reaches a 
threshold.


On the other side, if the governor is in-kernel, then we should not read 
the temperature of the thermal zones because it is the job of the kernel 
to do that.

Actually we can assume the temperature information exported to the 
userspace is a courtesy of the kernel when this one is managing the 
thermal zone.

If there is no governor associated to the thermal zone because there is 
no cooling device associated to the defined trip points, then we can 
assume it is up to the userspace to monitor the thermal zone.

Furthermore, the hwmon gives the temperature information with the 
caching and because of that it is not possible for a thermal daemon to 
correctly handle a thermal zone.

That said, I would say we don't want the userspace to influence the 
thermal zone monitoring in any manner.

 From my POV, we should keep the code as it is.

The description of the change says "it may cause the kernel and user 
space to act against each other in some cases". Is it possible to give 
the cases when that can happen ?




-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1 1/6] thermal: sysfs: Trigger zone temperature updates on sysfs reads
  2024-05-16  9:45       ` Daniel Lezcano
@ 2024-05-16 10:02         ` Rafael J. Wysocki
  2024-05-16 12:56           ` Rafael J. Wysocki
  0 siblings, 1 reply; 12+ messages in thread
From: Rafael J. Wysocki @ 2024-05-16 10:02 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Rafael J. Wysocki, Lukasz Luba, Rafael J. Wysocki, LKML,
	Linux PM, Srinivas Pandruvada, Zhang Rui

Hi Daniel,

On Thu, May 16, 2024 at 11:46 AM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
>
> Hi Rafael,
>
> On 16/05/2024 11:04, Rafael J. Wysocki wrote:
> > Hi Lukasz,
> >
> > On Mon, May 13, 2024 at 9:11 AM Lukasz Luba <lukasz.luba@arm.com> wrote:
> >>
> >> Hi Rafael,
> >>
> >> On 5/10/24 15:13, Rafael J. Wysocki wrote:
> >>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>
> >>> Reading the zone temperature via sysfs causes the driver callback to
> >>> be invoked, but it does not cause the thermal zone object to be updated.
> >>>
> >>> This is problematic if the zone temperature read via sysfs differs from
> >>> the temperature value stored in the thermal zone object as it may cause
> >>> the kernel and user space to act against each other in some cases.
> >>>
> >>> For this reason, make temp_show() trigger a zone temperature update if
> >>> the temperature returned by thermal_zone_get_temp() is different from
> >>> the temperature value stored in the thermal zone object.
>
>
> The hwmon system is doing something similar and I'm not sure we want to
> mimic the same behavior.
>
> Just to summarize:
>
> 1. There is a polling delay set
>
> This polling delay gives the sampling rate the thermal zone is
> monitored. The temperature is updated at each 'delay' tick
>
> 2. There is no polling delay set
>
> The system relies on the interrupts to tell when a temperature reaches a
> threshold.

So this is a bit of a problem if the interrupts are not coming.

At least from the debugfs perspective, there are "mitigation episodes"
that last forever if the zone temperature happens to be above a trip
at the system resume time, say, and is never updated afterward.

> On the other side, if the governor is in-kernel, then we should not read
> the temperature of the thermal zones because it is the job of the kernel
> to do that.
>
> Actually we can assume the temperature information exported to the
> userspace is a courtesy of the kernel when this one is managing the
> thermal zone.

It is not the case right now, though, as sysfs temperature reads
effectively bypass the whole in-kernel thermal management.

> If there is no governor associated to the thermal zone because there is
> no cooling device associated to the defined trip points, then we can
> assume it is up to the userspace to monitor the thermal zone.

Well, in that case trips should not be taken into account, but they are now.

> Furthermore, the hwmon gives the temperature information with the
> caching and because of that it is not possible for a thermal daemon to
> correctly handle a thermal zone.
>
> That said, I would say we don't want the userspace to influence the
> thermal zone monitoring in any manner.
>
>  From my POV, we should keep the code as it is.

Well, it is problematic as is.

> The description of the change says "it may cause the kernel and user
> space to act against each other in some cases". Is it possible to give
> the cases when that can happen ?

This is mostly theoretical, but if user space knows that the
temperature has fallen below a trip, but the kernel doesn't know that,
they may decide to put a cooling device into different states.

In any case, the issue at hand is that thermal_zone_device_update()
processes passive and active trip points without attached cooling
devices which doesn't make much sense, so this needs to be addressed
in the first place and the $subject patch may not make any difference
if that happens, so regard it as withdrawn.

Thanks!

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v1 1/6] thermal: sysfs: Trigger zone temperature updates on sysfs reads
  2024-05-16 10:02         ` Rafael J. Wysocki
@ 2024-05-16 12:56           ` Rafael J. Wysocki
  0 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2024-05-16 12:56 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Lukasz Luba, Rafael J. Wysocki, LKML, Linux PM,
	Srinivas Pandruvada, Zhang Rui

On Thu, May 16, 2024 at 12:02 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> Hi Daniel,
>
> On Thu, May 16, 2024 at 11:46 AM Daniel Lezcano
> <daniel.lezcano@linaro.org> wrote:
> >
> >
> > Hi Rafael,
> >
> > On 16/05/2024 11:04, Rafael J. Wysocki wrote:
> > > Hi Lukasz,
> > >
> > > On Mon, May 13, 2024 at 9:11 AM Lukasz Luba <lukasz.luba@arm.com> wrote:
> > >>
> > >> Hi Rafael,
> > >>
> > >> On 5/10/24 15:13, Rafael J. Wysocki wrote:
> > >>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >>>
> > >>> Reading the zone temperature via sysfs causes the driver callback to
> > >>> be invoked, but it does not cause the thermal zone object to be updated.
> > >>>
> > >>> This is problematic if the zone temperature read via sysfs differs from
> > >>> the temperature value stored in the thermal zone object as it may cause
> > >>> the kernel and user space to act against each other in some cases.
> > >>>
> > >>> For this reason, make temp_show() trigger a zone temperature update if
> > >>> the temperature returned by thermal_zone_get_temp() is different from
> > >>> the temperature value stored in the thermal zone object.
> >
> >
> > The hwmon system is doing something similar and I'm not sure we want to
> > mimic the same behavior.
> >
> > Just to summarize:
> >
> > 1. There is a polling delay set
> >
> > This polling delay gives the sampling rate the thermal zone is
> > monitored. The temperature is updated at each 'delay' tick
> >
> > 2. There is no polling delay set
> >
> > The system relies on the interrupts to tell when a temperature reaches a
> > threshold.
>
> So this is a bit of a problem if the interrupts are not coming.
>
> At least from the debugfs perspective, there are "mitigation episodes"
> that last forever if the zone temperature happens to be above a trip
> at the system resume time, say, and is never updated afterward.
>
> > On the other side, if the governor is in-kernel, then we should not read
> > the temperature of the thermal zones because it is the job of the kernel
> > to do that.
> >
> > Actually we can assume the temperature information exported to the
> > userspace is a courtesy of the kernel when this one is managing the
> > thermal zone.
>
> It is not the case right now, though, as sysfs temperature reads
> effectively bypass the whole in-kernel thermal management.
>
> > If there is no governor associated to the thermal zone because there is
> > no cooling device associated to the defined trip points, then we can
> > assume it is up to the userspace to monitor the thermal zone.
>
> Well, in that case trips should not be taken into account, but they are now.

Actually, there is always a governor associated with a thermal zone
AFAICS.  It is set before binding any cooling devices to the thermal
zone.

However, there are thermal zones without any cooling devices bound to
them and they have the governor set to user_space, so it appears that
they want to receive trip crossing notifications, but then they don't
have a way to trigger zone temperature updates which is inconsistent.

IMO at least for these thermal zones temp_store() should trigger a
zone temperature update.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-05-16 12:56 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-10 14:11 [PATCH v1 0/6] thermal: core: Assorted improvements for v6.11 Rafael J. Wysocki
2024-05-10 14:13 ` [PATCH v1 1/6] thermal: sysfs: Trigger zone temperature updates on sysfs reads Rafael J. Wysocki
2024-05-13  7:11   ` Lukasz Luba
2024-05-16  9:04     ` Rafael J. Wysocki
2024-05-16  9:45       ` Daniel Lezcano
2024-05-16 10:02         ` Rafael J. Wysocki
2024-05-16 12:56           ` Rafael J. Wysocki
2024-05-10 14:14 ` [PATCH v1 2/6] thermal: trip: Rename __thermal_zone_set_trips() to thermal_zone_set_trips() Rafael J. Wysocki
2024-05-10 14:16 ` [PATCH v1 3/6] thermal: trip: Make thermal_zone_set_trips() use trip thresholds Rafael J. Wysocki
2024-05-10 14:18 ` [PATCH v1 4/6] thermal: trip: Use READ_ONCE() for lockless access to trip properties Rafael J. Wysocki
2024-05-10 14:19 ` [PATCH v1 5/6] thermal: gov_bang_bang: Drop unnecessary cooling device target state checks Rafael J. Wysocki
2024-05-10 14:20 ` [PATCH v1 6/6] thermal: core: Avoid calling .trip_crossed() for critical and hot trips Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).