* [PATCH v2 1/2] thermal: int340x: Fix unexpected shutdown at critical temperature
@ 2020-12-21 17:23 Kai-Heng Feng
  2020-12-21 17:23 ` [PATCH v2 2/2] thermal: intel: pch: " Kai-Heng Feng
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Kai-Heng Feng @ 2020-12-21 17:23 UTC (permalink / raw)
  To: rui.zhang, daniel.lezcano, amitk
  Cc: andrzej.p, mjg59, srinivas.pandruvada, Kai-Heng Feng,
	Akinobu Mita, Andrew Morton, Andy Shevchenko, open list:THERMAL,
	open list

We are seeing thermal shutdowns on Intel-based mobile workstations. The
shutdown happens while the first trip is handled in
thermal_zone_device_register():
kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down

However, we shouldn't do a thermal shutdown here, since
1) We may want to use a dedicated daemon, Intel's thermald in this case,
to handle thermal shutdown.

2) For ACPI-based systems, _CRT doesn't mean shutdown unless it's inside
the ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
"... If this object it present under a device, the device’s driver
evaluates this object to determine the device’s critical cooling
temperature trip point. This value may then be used by the device’s
driver to program an internal device temperature sensor trip point."

So a "critical trip" here merely means we should take a more aggressive
cooling method.

As the int340x device isn't present under an ACPI ThermalZone, override
the default .critical callback to prevent a surprising thermal shutdown.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
v2:
 - Amend subject.
 - Remove int3400 device.

 .../thermal/intel/int340x_thermal/int340x_thermal_zone.c    | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
index 6e479deff76b..d1248ba943a4 100644
--- a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
+++ b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
@@ -146,12 +146,18 @@ static int int340x_thermal_get_trip_hyst(struct thermal_zone_device *zone,
 	return 0;
 }
 
+static void int340x_thermal_critical(struct thermal_zone_device *zone)
+{
+	dev_dbg(&zone->device, "%s: critical temperature reached\n", zone->type);
+}
+
 static struct thermal_zone_device_ops int340x_thermal_zone_ops = {
 	.get_temp       = int340x_thermal_get_zone_temp,
 	.get_trip_temp	= int340x_thermal_get_trip_temp,
 	.get_trip_type	= int340x_thermal_get_trip_type,
 	.set_trip_temp	= int340x_thermal_set_trip_temp,
 	.get_trip_hyst =  int340x_thermal_get_trip_hyst,
+	.critical	= int340x_thermal_critical,
 };
 
 static int int340x_thermal_get_trip_config(acpi_handle handle, char *name,
-- 
2.29.2
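
For readers following the thread, the effect of the override can be
sketched in plain userspace C. This is a simplified model, not kernel
code: struct zone, struct zone_ops, zone_register(),
handle_critical_trip() and the shutdown_requested flag are hypothetical
stand-ins. It assumes, as the commit message implies, that the thermal
core falls back to a shutdown-requesting default when a driver leaves
.critical unset.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel structures, for illustration only. */
struct zone;

struct zone_ops {
	void (*critical)(struct zone *z);	/* optional driver override */
};

struct zone {
	const char *type;
	struct zone_ops ops;
	bool shutdown_requested;	/* models the default handler's side effect */
};

/* Models the core's default handler: log and request a shutdown. */
static void default_critical(struct zone *z)
{
	printf("%s: critical temperature reached, shutting down\n", z->type);
	z->shutdown_requested = true;
}

/* Models a driver override like the ones in this series: log only. */
static void log_only_critical(struct zone *z)
{
	printf("%s: critical temperature reached\n", z->type);
}

/* Models registration: install the default only if the driver set none. */
static void zone_register(struct zone *z)
{
	if (!z->ops.critical)
		z->ops.critical = default_critical;
}

/* Models trip handling: call whichever handler ended up installed. */
static void handle_critical_trip(struct zone *z, int temp, int crit_temp)
{
	if (temp >= crit_temp)
		z->ops.critical(z);
}
```

In this model, a zone registered without a .critical handler ends up
with shutdown_requested set once the trip is crossed, while a zone that
supplies log_only_critical only logs. That is the behaviour change the
patch aims for on int340x zones.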



* [PATCH v2 2/2] thermal: intel: pch: Fix unexpected shutdown at critical temperature
  2020-12-21 17:23 [PATCH v2 1/2] thermal: int340x: Fix unexpected shutdown at critical temperature Kai-Heng Feng
@ 2020-12-21 17:23 ` Kai-Heng Feng
  2021-02-04  7:40   ` [thermal: thermal/next] " thermal-bot for Kai-Heng Feng
  2021-01-11 16:18 ` [PATCH v2 1/2] thermal: int340x: " Kai-Heng Feng
  2021-02-04  7:40 ` [thermal: thermal/next] " thermal-bot for Kai-Heng Feng
  2 siblings, 1 reply; 6+ messages in thread
From: Kai-Heng Feng @ 2020-12-21 17:23 UTC (permalink / raw)
  To: rui.zhang, daniel.lezcano, amitk
  Cc: andrzej.p, mjg59, srinivas.pandruvada, Kai-Heng Feng,
	Sumeet Pawnikar, Andy Shevchenko, Gayatri Kammela, Randy Dunlap,
	Andres Freund, Chuhong Yuan, Akinobu Mita, open list:THERMAL,
	open list

Like the previous patch, the intel_pch_thermal device is not in the ACPI
ThermalZone namespace, so a critical trip doesn't mean shutdown.

Override the default .critical callback to prevent a surprising thermal
shutdown.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
v2:
 - Amend subject.

 drivers/thermal/intel/intel_pch_thermal.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/thermal/intel/intel_pch_thermal.c b/drivers/thermal/intel/intel_pch_thermal.c
index 41723c6c6c0c..527c91f5960b 100644
--- a/drivers/thermal/intel/intel_pch_thermal.c
+++ b/drivers/thermal/intel/intel_pch_thermal.c
@@ -326,10 +326,16 @@ static int pch_get_trip_temp(struct thermal_zone_device *tzd, int trip, int *tem
 	return 0;
 }
 
+static void pch_critical(struct thermal_zone_device *tzd)
+{
+	dev_dbg(&tzd->device, "%s: critical temperature reached\n", tzd->type);
+}
+
 static struct thermal_zone_device_ops tzd_ops = {
 	.get_temp = pch_thermal_get_temp,
 	.get_trip_type = pch_get_trip_type,
 	.get_trip_temp = pch_get_trip_temp,
+	.critical = pch_critical,
 };
 
 enum board_ids {
-- 
2.29.2



* Re: [PATCH v2 1/2] thermal: int340x: Fix unexpected shutdown at critical temperature
  2020-12-21 17:23 [PATCH v2 1/2] thermal: int340x: Fix unexpected shutdown at critical temperature Kai-Heng Feng
  2020-12-21 17:23 ` [PATCH v2 2/2] thermal: intel: pch: " Kai-Heng Feng
@ 2021-01-11 16:18 ` Kai-Heng Feng
  2021-01-11 16:43   ` Daniel Lezcano
  2021-02-04  7:40 ` [thermal: thermal/next] " thermal-bot for Kai-Heng Feng
  2 siblings, 1 reply; 6+ messages in thread
From: Kai-Heng Feng @ 2021-01-11 16:18 UTC (permalink / raw)
  To: Zhang, Rui, Daniel Lezcano, amitk
  Cc: Andrzej Pietrasiewicz, Matthew Garrett, Srinivas Pandruvada,
	Akinobu Mita, Andrew Morton, Andy Shevchenko, open list:THERMAL,
	open list

On Tue, Dec 22, 2020 at 1:23 AM Kai-Heng Feng
<kai.heng.feng@canonical.com> wrote:
>
> We are seeing thermal shutdown on Intel based mobile workstations, the
> shutdown happens during the first trip handle in
> thermal_zone_device_register():
> kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down
>
> However, we shouldn't do a thermal shutdown here, since
> 1) We may want to use a dedicated daemon, Intel's thermald in this case,
> to handle thermal shutdown.
>
> 2) For ACPI based system, _CRT doesn't mean shutdown unless it's inside
> ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
> "... If this object it present under a device, the device’s driver
> evaluates this object to determine the device’s critical cooling
> temperature trip point. This value may then be used by the device’s
> driver to program an internal device temperature sensor trip point."
>
> So a "critical trip" here merely means we should take a more aggressive
> cooling method.
>
> As int340x device isn't present under ACPI ThermalZone, override the
> default .critical callback to prevent surprising thermal shutdown.
>
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

A gentle ping...



* Re: [PATCH v2 1/2] thermal: int340x: Fix unexpected shutdown at critical temperature
  2021-01-11 16:18 ` [PATCH v2 1/2] thermal: int340x: " Kai-Heng Feng
@ 2021-01-11 16:43   ` Daniel Lezcano
  0 siblings, 0 replies; 6+ messages in thread
From: Daniel Lezcano @ 2021-01-11 16:43 UTC (permalink / raw)
  To: Kai-Heng Feng, Zhang, Rui, amitk
  Cc: Andrzej Pietrasiewicz, Matthew Garrett, Srinivas Pandruvada,
	Akinobu Mita, Andrew Morton, Andy Shevchenko, open list:THERMAL,
	open list

On 11/01/2021 17:18, Kai-Heng Feng wrote:
> On Tue, Dec 22, 2020 at 1:23 AM Kai-Heng Feng
> <kai.heng.feng@canonical.com> wrote:
>>
>> We are seeing thermal shutdown on Intel based mobile workstations, the
>> shutdown happens during the first trip handle in
>> thermal_zone_device_register():
>> kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down
>>
>> However, we shouldn't do a thermal shutdown here, since
>> 1) We may want to use a dedicated daemon, Intel's thermald in this case,
>> to handle thermal shutdown.
>>
>> 2) For ACPI based system, _CRT doesn't mean shutdown unless it's inside
>> ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
>> "... If this object it present under a device, the device’s driver
>> evaluates this object to determine the device’s critical cooling
>> temperature trip point. This value may then be used by the device’s
>> driver to program an internal device temperature sensor trip point."
>>
>> So a "critical trip" here merely means we should take a more aggressive
>> cooling method.
>>
>> As int340x device isn't present under ACPI ThermalZone, override the
>> default .critical callback to prevent surprising thermal shutdown.
>>
>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> 
> A gentle ping...

Applied, they are in the testing branch now. They will be in linux-next
in a couple of days.

Thanks
  -- Daniel




* [thermal: thermal/next] thermal: int340x: Fix unexpected shutdown at critical temperature
  2020-12-21 17:23 [PATCH v2 1/2] thermal: int340x: Fix unexpected shutdown at critical temperature Kai-Heng Feng
  2020-12-21 17:23 ` [PATCH v2 2/2] thermal: intel: pch: " Kai-Heng Feng
  2021-01-11 16:18 ` [PATCH v2 1/2] thermal: int340x: " Kai-Heng Feng
@ 2021-02-04  7:40 ` thermal-bot for Kai-Heng Feng
  2 siblings, 0 replies; 6+ messages in thread
From: thermal-bot for Kai-Heng Feng @ 2021-02-04  7:40 UTC (permalink / raw)
  To: linux-pm; +Cc: Kai-Heng Feng, Daniel Lezcano, rui.zhang, amitk

The following commit has been merged into the thermal/next branch of thermal:

Commit-ID:     dd47366aaa9b93ac3d97cb4ee7641d38a28a771e
Gitweb:        https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git//dd47366aaa9b93ac3d97cb4ee7641d38a28a771e
Author:        Kai-Heng Feng <kai.heng.feng@canonical.com>
AuthorDate:    Tue, 22 Dec 2020 01:23:43 +08:00
Committer:     Daniel Lezcano <daniel.lezcano@linaro.org>
CommitterDate: Tue, 19 Jan 2021 22:30:25 +01:00

thermal: int340x: Fix unexpected shutdown at critical temperature

We are seeing thermal shutdowns on Intel-based mobile workstations. The
shutdown happens while the first trip is handled in
thermal_zone_device_register():
kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down

However, we shouldn't do a thermal shutdown here, since
1) We may want to use a dedicated daemon, Intel's thermald in this case,
to handle thermal shutdown.

2) For ACPI-based systems, _CRT doesn't mean shutdown unless it's inside
the ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
"... If this object it present under a device, the device’s driver
evaluates this object to determine the device’s critical cooling
temperature trip point. This value may then be used by the device’s
driver to program an internal device temperature sensor trip point."

So a "critical trip" here merely means we should take a more aggressive
cooling method.

As the int340x device isn't present under an ACPI ThermalZone, override
the default .critical callback to prevent a surprising thermal shutdown.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201221172345.36976-1-kai.heng.feng@canonical.com
---
 drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
index 6e479de..d1248ba 100644
--- a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
+++ b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
@@ -146,12 +146,18 @@ static int int340x_thermal_get_trip_hyst(struct thermal_zone_device *zone,
 	return 0;
 }
 
+static void int340x_thermal_critical(struct thermal_zone_device *zone)
+{
+	dev_dbg(&zone->device, "%s: critical temperature reached\n", zone->type);
+}
+
 static struct thermal_zone_device_ops int340x_thermal_zone_ops = {
 	.get_temp       = int340x_thermal_get_zone_temp,
 	.get_trip_temp	= int340x_thermal_get_trip_temp,
 	.get_trip_type	= int340x_thermal_get_trip_type,
 	.set_trip_temp	= int340x_thermal_set_trip_temp,
 	.get_trip_hyst =  int340x_thermal_get_trip_hyst,
+	.critical	= int340x_thermal_critical,
 };
 
 static int int340x_thermal_get_trip_config(acpi_handle handle, char *name,


* [thermal: thermal/next] thermal: intel: pch: Fix unexpected shutdown at critical temperature
  2020-12-21 17:23 ` [PATCH v2 2/2] thermal: intel: pch: " Kai-Heng Feng
@ 2021-02-04  7:40   ` thermal-bot for Kai-Heng Feng
  0 siblings, 0 replies; 6+ messages in thread
From: thermal-bot for Kai-Heng Feng @ 2021-02-04  7:40 UTC (permalink / raw)
  To: linux-pm; +Cc: Kai-Heng Feng, Daniel Lezcano, rui.zhang, amitk

The following commit has been merged into the thermal/next branch of thermal:

Commit-ID:     03671968d0bf2db598f7e3aa98f190b76c1bb4ff
Gitweb:        https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git//03671968d0bf2db598f7e3aa98f190b76c1bb4ff
Author:        Kai-Heng Feng <kai.heng.feng@canonical.com>
AuthorDate:    Tue, 22 Dec 2020 01:23:44 +08:00
Committer:     Daniel Lezcano <daniel.lezcano@linaro.org>
CommitterDate: Tue, 19 Jan 2021 22:30:25 +01:00

thermal: intel: pch: Fix unexpected shutdown at critical temperature

Like the previous patch, the intel_pch_thermal device is not in the ACPI
ThermalZone namespace, so a critical trip doesn't mean shutdown.

Override the default .critical callback to prevent a surprising thermal
shutdown.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20201221172345.36976-2-kai.heng.feng@canonical.com
---
 drivers/thermal/intel/intel_pch_thermal.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/thermal/intel/intel_pch_thermal.c b/drivers/thermal/intel/intel_pch_thermal.c
index 41723c6..527c91f 100644
--- a/drivers/thermal/intel/intel_pch_thermal.c
+++ b/drivers/thermal/intel/intel_pch_thermal.c
@@ -326,10 +326,16 @@ static int pch_get_trip_temp(struct thermal_zone_device *tzd, int trip, int *tem
 	return 0;
 }
 
+static void pch_critical(struct thermal_zone_device *tzd)
+{
+	dev_dbg(&tzd->device, "%s: critical temperature reached\n", tzd->type);
+}
+
 static struct thermal_zone_device_ops tzd_ops = {
 	.get_temp = pch_thermal_get_temp,
 	.get_trip_type = pch_get_trip_type,
 	.get_trip_temp = pch_get_trip_temp,
+	.critical = pch_critical,
 };
 
 enum board_ids {

