All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] thermal: int340x: Add critical callback to override default shutdown behavior
@ 2020-12-21 13:52 Kai-Heng Feng
  2020-12-21 13:52 ` [PATCH 2/2] thermal: intel_pch_thermal: " Kai-Heng Feng
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Kai-Heng Feng @ 2020-12-21 13:52 UTC (permalink / raw)
  To: rui.zhang, daniel.lezcano, amitk
  Cc: andrzej.p, mjg59, srinivas.pandruvada, Kai-Heng Feng,
	Bartlomiej Zolnierkiewicz, Peter Kaestle, Gayatri Kammela,
	Akinobu Mita, Andy Shevchenko, Andrew Morton, open list:THERMAL,
	open list

We are seeing thermal shutdown on Intel based mobile workstations, the
shutdown happens during the first trip handle in
thermal_zone_device_register():
kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down

However, we shouldn't do a thermal shutdown here, since
1) We may want to use a dedicated daemon, Intel's thermald in this case,
to handle thermal shutdown.

2) For ACPI based system, _CRT doesn't mean shutdown unless it's inside
ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
"... If this object it present under a device, the device’s driver
evaluates this object to determine the device’s critical cooling
temperature trip point. This value may then be used by the device’s
driver to program an internal device temperature sensor trip point."

So a "critical trip" here merely means we should take a more aggressive
cooling method.

As int340x device isn't present under ACPI ThermalZone, override the
default .critical callback to prevent surprising thermal shutdown.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
 drivers/thermal/intel/int340x_thermal/int3400_thermal.c     | 6 ++++++
 .../thermal/intel/int340x_thermal/int340x_thermal_zone.c    | 6 ++++++
 2 files changed, 12 insertions(+)

diff --git a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
index 823354a1a91a..9778a6dba939 100644
--- a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
+++ b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
@@ -431,9 +431,15 @@ static int int3400_thermal_change_mode(struct thermal_zone_device *thermal,
 	return result;
 }
 
+static void int3400_thermal_critical(struct thermal_zone_device *thermal)
+{
+	dev_dbg(&thermal->device, "%s: critical temperature reached\n", thermal->type);
+}
+
 static struct thermal_zone_device_ops int3400_thermal_ops = {
 	.get_temp = int3400_thermal_get_temp,
 	.change_mode = int3400_thermal_change_mode,
+	.critical = int3400_thermal_critical,
 };
 
 static struct thermal_zone_params int3400_thermal_params = {
diff --git a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
index 6e479deff76b..d1248ba943a4 100644
--- a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
+++ b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
@@ -146,12 +146,18 @@ static int int340x_thermal_get_trip_hyst(struct thermal_zone_device *zone,
 	return 0;
 }
 
+static void int340x_thermal_critical(struct thermal_zone_device *zone)
+{
+	dev_dbg(&zone->device, "%s: critical temperature reached\n", zone->type);
+}
+
 static struct thermal_zone_device_ops int340x_thermal_zone_ops = {
 	.get_temp       = int340x_thermal_get_zone_temp,
 	.get_trip_temp	= int340x_thermal_get_trip_temp,
 	.get_trip_type	= int340x_thermal_get_trip_type,
 	.set_trip_temp	= int340x_thermal_set_trip_temp,
 	.get_trip_hyst =  int340x_thermal_get_trip_hyst,
+	.critical	= int340x_thermal_critical,
 };
 
 static int int340x_thermal_get_trip_config(acpi_handle handle, char *name,
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] thermal: intel_pch_thermal: Add critical callback to override default shutdown behavior
  2020-12-21 13:52 [PATCH 1/2] thermal: int340x: Add critical callback to override default shutdown behavior Kai-Heng Feng
@ 2020-12-21 13:52 ` Kai-Heng Feng
  2020-12-21 13:59 ` [PATCH 1/2] thermal: int340x: " Daniel Lezcano
  2020-12-21 16:55 ` Srinivas Pandruvada
  2 siblings, 0 replies; 8+ messages in thread
From: Kai-Heng Feng @ 2020-12-21 13:52 UTC (permalink / raw)
  To: rui.zhang, daniel.lezcano, amitk
  Cc: andrzej.p, mjg59, srinivas.pandruvada, Kai-Heng Feng,
	Sumeet Pawnikar, Andy Shevchenko, Andres Freund, Chuhong Yuan,
	Randy Dunlap, Gayatri Kammela, Akinobu Mita, open list:THERMAL,
	open list

Like previous patch, the intel_pch_thermal device is not in ACPI
ThermalZone namespace, so a critical trip doesn't mean shutdown.

Override the default .critical callback to prevent surprising thermal
shutdoown.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
---
 drivers/thermal/intel/intel_pch_thermal.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/thermal/intel/intel_pch_thermal.c b/drivers/thermal/intel/intel_pch_thermal.c
index 41723c6c6c0c..527c91f5960b 100644
--- a/drivers/thermal/intel/intel_pch_thermal.c
+++ b/drivers/thermal/intel/intel_pch_thermal.c
@@ -326,10 +326,16 @@ static int pch_get_trip_temp(struct thermal_zone_device *tzd, int trip, int *tem
 	return 0;
 }
 
+static void pch_critical(struct thermal_zone_device *tzd)
+{
+	dev_dbg(&tzd->device, "%s: critical temperature reached\n", tzd->type);
+}
+
 static struct thermal_zone_device_ops tzd_ops = {
 	.get_temp = pch_thermal_get_temp,
 	.get_trip_type = pch_get_trip_type,
 	.get_trip_temp = pch_get_trip_temp,
+	.critical = pch_critical,
 };
 
 enum board_ids {
-- 
2.29.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] thermal: int340x: Add critical callback to override default shutdown behavior
  2020-12-21 13:52 [PATCH 1/2] thermal: int340x: Add critical callback to override default shutdown behavior Kai-Heng Feng
  2020-12-21 13:52 ` [PATCH 2/2] thermal: intel_pch_thermal: " Kai-Heng Feng
@ 2020-12-21 13:59 ` Daniel Lezcano
  2020-12-21 15:00   ` Kai-Heng Feng
  2020-12-21 16:55 ` Srinivas Pandruvada
  2 siblings, 1 reply; 8+ messages in thread
From: Daniel Lezcano @ 2020-12-21 13:59 UTC (permalink / raw)
  To: Kai-Heng Feng, rui.zhang, amitk
  Cc: andrzej.p, mjg59, srinivas.pandruvada, Bartlomiej Zolnierkiewicz,
	Peter Kaestle, Gayatri Kammela, Akinobu Mita, Andy Shevchenko,
	Andrew Morton, open list:THERMAL, open list

On 21/12/2020 14:52, Kai-Heng Feng wrote:
> We are seeing thermal shutdown on Intel based mobile workstations, the
> shutdown happens during the first trip handle in
> thermal_zone_device_register():
> kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down
> 
> However, we shouldn't do a thermal shutdown here, since
> 1) We may want to use a dedicated daemon, Intel's thermald in this case,
> to handle thermal shutdown.
> 
> 2) For ACPI based system, _CRT doesn't mean shutdown unless it's inside
> ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
> "... If this object it present under a device, the device’s driver
> evaluates this object to determine the device’s critical cooling
> temperature trip point. This value may then be used by the device’s
> driver to program an internal device temperature sensor trip point."
> 
> So a "critical trip" here merely means we should take a more aggressive
> cooling method.
> 
> As int340x device isn't present under ACPI ThermalZone, override the
> default .critical callback to prevent surprising thermal shutdown.
> 
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

I'll submit those changes for v5.11-rc1 and change the subject by:

thermal: int340x: Fix unexpected shutdown at critical temperature
thermal: pch: Fix unexpected shutdown at critical temperature

Sounds good ?

> ---
>  drivers/thermal/intel/int340x_thermal/int3400_thermal.c     | 6 ++++++
>  .../thermal/intel/int340x_thermal/int340x_thermal_zone.c    | 6 ++++++
>  2 files changed, 12 insertions(+)
> 
> diff --git a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> index 823354a1a91a..9778a6dba939 100644
> --- a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> +++ b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> @@ -431,9 +431,15 @@ static int int3400_thermal_change_mode(struct thermal_zone_device *thermal,
>  	return result;
>  }
>  
> +static void int3400_thermal_critical(struct thermal_zone_device *thermal)
> +{
> +	dev_dbg(&thermal->device, "%s: critical temperature reached\n", thermal->type);
> +}
> +
>  static struct thermal_zone_device_ops int3400_thermal_ops = {
>  	.get_temp = int3400_thermal_get_temp,
>  	.change_mode = int3400_thermal_change_mode,
> +	.critical = int3400_thermal_critical,
>  };
>  
>  static struct thermal_zone_params int3400_thermal_params = {
> diff --git a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> index 6e479deff76b..d1248ba943a4 100644
> --- a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> +++ b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> @@ -146,12 +146,18 @@ static int int340x_thermal_get_trip_hyst(struct thermal_zone_device *zone,
>  	return 0;
>  }
>  
> +static void int340x_thermal_critical(struct thermal_zone_device *zone)
> +{
> +	dev_dbg(&zone->device, "%s: critical temperature reached\n", zone->type);
> +}
> +
>  static struct thermal_zone_device_ops int340x_thermal_zone_ops = {
>  	.get_temp       = int340x_thermal_get_zone_temp,
>  	.get_trip_temp	= int340x_thermal_get_trip_temp,
>  	.get_trip_type	= int340x_thermal_get_trip_type,
>  	.set_trip_temp	= int340x_thermal_set_trip_temp,
>  	.get_trip_hyst =  int340x_thermal_get_trip_hyst,
> +	.critical	= int340x_thermal_critical,
>  };
>  
>  static int int340x_thermal_get_trip_config(acpi_handle handle, char *name,
> 


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] thermal: int340x: Add critical callback to override default shutdown behavior
  2020-12-21 13:59 ` [PATCH 1/2] thermal: int340x: " Daniel Lezcano
@ 2020-12-21 15:00   ` Kai-Heng Feng
  2020-12-21 16:03     ` Daniel Lezcano
  0 siblings, 1 reply; 8+ messages in thread
From: Kai-Heng Feng @ 2020-12-21 15:00 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Zhang, Rui, amitk, Andrzej Pietrasiewicz, Matthew Garrett,
	Srinivas Pandruvada, Bartlomiej Zolnierkiewicz, Peter Kaestle,
	Gayatri Kammela, Akinobu Mita, Andy Shevchenko, Andrew Morton,
	open list:THERMAL, open list

On Mon, Dec 21, 2020 at 9:59 PM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
> On 21/12/2020 14:52, Kai-Heng Feng wrote:
> > We are seeing thermal shutdown on Intel based mobile workstations, the
> > shutdown happens during the first trip handle in
> > thermal_zone_device_register():
> > kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down
> >
> > However, we shouldn't do a thermal shutdown here, since
> > 1) We may want to use a dedicated daemon, Intel's thermald in this case,
> > to handle thermal shutdown.
> >
> > 2) For ACPI based system, _CRT doesn't mean shutdown unless it's inside
> > ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
> > "... If this object it present under a device, the device’s driver
> > evaluates this object to determine the device’s critical cooling
> > temperature trip point. This value may then be used by the device’s
> > driver to program an internal device temperature sensor trip point."
> >
> > So a "critical trip" here merely means we should take a more aggressive
> > cooling method.
> >
> > As int340x device isn't present under ACPI ThermalZone, override the
> > default .critical callback to prevent surprising thermal shutdown.
> >
> > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>
> I'll submit those changes for v5.11-rc1 and change the subject by:
>
> thermal: int340x: Fix unexpected shutdown at critical temperature
> thermal: pch: Fix unexpected shutdown at critical temperature
>
> Sounds good ?

Sounds good to me. Thanks!

Kai-Heng

>
> > ---
> >  drivers/thermal/intel/int340x_thermal/int3400_thermal.c     | 6 ++++++
> >  .../thermal/intel/int340x_thermal/int340x_thermal_zone.c    | 6 ++++++
> >  2 files changed, 12 insertions(+)
> >
> > diff --git a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> > index 823354a1a91a..9778a6dba939 100644
> > --- a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> > +++ b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> > @@ -431,9 +431,15 @@ static int int3400_thermal_change_mode(struct thermal_zone_device *thermal,
> >       return result;
> >  }
> >
> > +static void int3400_thermal_critical(struct thermal_zone_device *thermal)
> > +{
> > +     dev_dbg(&thermal->device, "%s: critical temperature reached\n", thermal->type);
> > +}
> > +
> >  static struct thermal_zone_device_ops int3400_thermal_ops = {
> >       .get_temp = int3400_thermal_get_temp,
> >       .change_mode = int3400_thermal_change_mode,
> > +     .critical = int3400_thermal_critical,
> >  };
> >
> >  static struct thermal_zone_params int3400_thermal_params = {
> > diff --git a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> > index 6e479deff76b..d1248ba943a4 100644
> > --- a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> > +++ b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> > @@ -146,12 +146,18 @@ static int int340x_thermal_get_trip_hyst(struct thermal_zone_device *zone,
> >       return 0;
> >  }
> >
> > +static void int340x_thermal_critical(struct thermal_zone_device *zone)
> > +{
> > +     dev_dbg(&zone->device, "%s: critical temperature reached\n", zone->type);
> > +}
> > +
> >  static struct thermal_zone_device_ops int340x_thermal_zone_ops = {
> >       .get_temp       = int340x_thermal_get_zone_temp,
> >       .get_trip_temp  = int340x_thermal_get_trip_temp,
> >       .get_trip_type  = int340x_thermal_get_trip_type,
> >       .set_trip_temp  = int340x_thermal_set_trip_temp,
> >       .get_trip_hyst =  int340x_thermal_get_trip_hyst,
> > +     .critical       = int340x_thermal_critical,
> >  };
> >
> >  static int int340x_thermal_get_trip_config(acpi_handle handle, char *name,
> >
>
>
> --
> <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
>
> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
> <http://twitter.com/#!/linaroorg> Twitter |
> <http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] thermal: int340x: Add critical callback to override default shutdown behavior
  2020-12-21 15:00   ` Kai-Heng Feng
@ 2020-12-21 16:03     ` Daniel Lezcano
  2020-12-21 16:57       ` Srinivas Pandruvada
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel Lezcano @ 2020-12-21 16:03 UTC (permalink / raw)
  To: Kai-Heng Feng
  Cc: Zhang, Rui, amitk, Andrzej Pietrasiewicz, Matthew Garrett,
	Srinivas Pandruvada, Bartlomiej Zolnierkiewicz, Peter Kaestle,
	Gayatri Kammela, Akinobu Mita, Andy Shevchenko, Andrew Morton,
	open list:THERMAL, open list

On 21/12/2020 16:00, Kai-Heng Feng wrote:
> On Mon, Dec 21, 2020 at 9:59 PM Daniel Lezcano
> <daniel.lezcano@linaro.org> wrote:
>>
>> On 21/12/2020 14:52, Kai-Heng Feng wrote:
>>> We are seeing thermal shutdown on Intel based mobile workstations, the
>>> shutdown happens during the first trip handle in
>>> thermal_zone_device_register():
>>> kernel: thermal thermal_zone15: critical temperature reached (101 C), shutting down
>>>
>>> However, we shouldn't do a thermal shutdown here, since
>>> 1) We may want to use a dedicated daemon, Intel's thermald in this case,
>>> to handle thermal shutdown.
>>>
>>> 2) For ACPI based system, _CRT doesn't mean shutdown unless it's inside
>>> ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
>>> "... If this object it present under a device, the device’s driver
>>> evaluates this object to determine the device’s critical cooling
>>> temperature trip point. This value may then be used by the device’s
>>> driver to program an internal device temperature sensor trip point."
>>>
>>> So a "critical trip" here merely means we should take a more aggressive
>>> cooling method.
>>>
>>> As int340x device isn't present under ACPI ThermalZone, override the
>>> default .critical callback to prevent surprising thermal shutdown.
>>>
>>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>>
>> I'll submit those changes for v5.11-rc1 and change the subject by:
>>
>> thermal: int340x: Fix unexpected shutdown at critical temperature
>> thermal: pch: Fix unexpected shutdown at critical temperature
>>
>> Sounds good ?
> 
> Sounds good to me. Thanks!
> 
> Kai-Heng

Rui, Srinivas? Are you ok with the changes ?


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] thermal: int340x: Add critical callback to override default shutdown behavior
  2020-12-21 13:52 [PATCH 1/2] thermal: int340x: Add critical callback to override default shutdown behavior Kai-Heng Feng
  2020-12-21 13:52 ` [PATCH 2/2] thermal: intel_pch_thermal: " Kai-Heng Feng
  2020-12-21 13:59 ` [PATCH 1/2] thermal: int340x: " Daniel Lezcano
@ 2020-12-21 16:55 ` Srinivas Pandruvada
  2020-12-21 16:58   ` Kai-Heng Feng
  2 siblings, 1 reply; 8+ messages in thread
From: Srinivas Pandruvada @ 2020-12-21 16:55 UTC (permalink / raw)
  To: Kai-Heng Feng, rui.zhang, daniel.lezcano, amitk
  Cc: andrzej.p, mjg59, Bartlomiej Zolnierkiewicz, Peter Kaestle,
	Gayatri Kammela, Akinobu Mita, Andy Shevchenko, Andrew Morton,
	open list:THERMAL, open list

On Mon, 2020-12-21 at 21:52 +0800, Kai-Heng Feng wrote:
> We are seeing thermal shutdown on Intel based mobile workstations,
> the
> shutdown happens during the first trip handle in
> thermal_zone_device_register():
> kernel: thermal thermal_zone15: critical temperature reached (101 C),
> shutting down
> 
> However, we shouldn't do a thermal shutdown here, since
> 1) We may want to use a dedicated daemon, Intel's thermald in this
> case,
> to handle thermal shutdown.
> 
> 2) For ACPI based system, _CRT doesn't mean shutdown unless it's
> inside
> ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
> "... If this object it present under a device, the device’s driver
> evaluates this object to determine the device’s critical cooling
> temperature trip point. This value may then be used by the device’s
> driver to program an internal device temperature sensor trip point."
> 
> So a "critical trip" here merely means we should take a more
> aggressive
> cooling method.
> 
> As int340x device isn't present under ACPI ThermalZone, override the
> default .critical callback to prevent surprising thermal shutdown.
> 
> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> ---
>  drivers/thermal/intel/int340x_thermal/int3400_thermal.c     | 6
> ++++++
>  .../thermal/intel/int340x_thermal/int340x_thermal_zone.c    | 6
> ++++++
>  2 files changed, 12 insertions(+)
> 
> diff --git a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> index 823354a1a91a..9778a6dba939 100644
> --- a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> +++ b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> @@ -431,9 +431,15 @@ static int int3400_thermal_change_mode(struct
> thermal_zone_device *thermal,
>         return result;
>  }
>  
> +static void int3400_thermal_critical(struct thermal_zone_device
> *thermal)
> +{
> +       dev_dbg(&thermal->device, "%s: critical temperature
> reached\n", thermal->type);
> +}
> +
>  static struct thermal_zone_device_ops int3400_thermal_ops = {
>         .get_temp = int3400_thermal_get_temp,
>         .change_mode = int3400_thermal_change_mode,
> +       .critical = int3400_thermal_critical,
>  };

You don't need for int3400 device. This is a fake sensor.

>  
>  static struct thermal_zone_params int3400_thermal_params = {
> diff --git
> a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> index 6e479deff76b..d1248ba943a4 100644
> --- a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> +++ b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> @@ -146,12 +146,18 @@ static int int340x_thermal_get_trip_hyst(struct
> thermal_zone_device *zone,
>         return 0;
>  }
>  
> +static void int340x_thermal_critical(struct thermal_zone_device
> *zone)
> +{
> +       dev_dbg(&zone->device, "%s: critical temperature reached\n",
> zone->type);
> +}
> +
>  static struct thermal_zone_device_ops int340x_thermal_zone_ops = {
>         .get_temp       = int340x_thermal_get_zone_temp,
>         .get_trip_temp  = int340x_thermal_get_trip_temp,
>         .get_trip_type  = int340x_thermal_get_trip_type,
>         .set_trip_temp  = int340x_thermal_set_trip_temp,
>         .get_trip_hyst =  int340x_thermal_get_trip_hyst,
> +       .critical       = int340x_thermal_critical,
>  };
>  
>  static int int340x_thermal_get_trip_config(acpi_handle handle, char
> *name,

Thanks,
Srinivas



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] thermal: int340x: Add critical callback to override default shutdown behavior
  2020-12-21 16:03     ` Daniel Lezcano
@ 2020-12-21 16:57       ` Srinivas Pandruvada
  0 siblings, 0 replies; 8+ messages in thread
From: Srinivas Pandruvada @ 2020-12-21 16:57 UTC (permalink / raw)
  To: Daniel Lezcano, Kai-Heng Feng
  Cc: Zhang, Rui, amitk, Andrzej Pietrasiewicz, Matthew Garrett,
	Bartlomiej Zolnierkiewicz, Peter Kaestle, Gayatri Kammela,
	Akinobu Mita, Andy Shevchenko, Andrew Morton, open list:THERMAL,
	open list

On Mon, 2020-12-21 at 17:03 +0100, Daniel Lezcano wrote:
> On 21/12/2020 16:00, Kai-Heng Feng wrote:
> > On Mon, Dec 21, 2020 at 9:59 PM Daniel Lezcano
> > <daniel.lezcano@linaro.org> wrote:
> > > 
> > > On 21/12/2020 14:52, Kai-Heng Feng wrote:
> > > > We are seeing thermal shutdown on Intel based mobile
> > > > workstations, the
> > > > shutdown happens during the first trip handle in
> > > > thermal_zone_device_register():
> > > > kernel: thermal thermal_zone15: critical temperature reached
> > > > (101 C), shutting down
> > > > 
> > > > However, we shouldn't do a thermal shutdown here, since
> > > > 1) We may want to use a dedicated daemon, Intel's thermald in
> > > > this case,
> > > > to handle thermal shutdown.
> > > > 
> > > > 2) For ACPI based system, _CRT doesn't mean shutdown unless
> > > > it's inside
> > > > ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical
> > > > Temperature):
> > > > "... If this object it present under a device, the device’s
> > > > driver
> > > > evaluates this object to determine the device’s critical
> > > > cooling
> > > > temperature trip point. This value may then be used by the
> > > > device’s
> > > > driver to program an internal device temperature sensor trip
> > > > point."
> > > > 
> > > > So a "critical trip" here merely means we should take a more
> > > > aggressive
> > > > cooling method.
> > > > 
> > > > As int340x device isn't present under ACPI ThermalZone,
> > > > override the
> > > > default .critical callback to prevent surprising thermal
> > > > shutdown.
> > > > 
> > > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > > 
> > > I'll submit those changes for v5.11-rc1 and change the subject
> > > by:
> > > 
> > > thermal: int340x: Fix unexpected shutdown at critical temperature
> > > thermal: pch: Fix unexpected shutdown at critical temperature
> > > 
> > > Sounds good ?
> > 
> > Sounds good to me. Thanks!
> > 
> > Kai-Heng
> 
> Rui, Srinivas? Are you ok with the changes ?
I have minor comment in one of the patch. But that is harmless.
So changes are fine.

Thanks,
Srinivas


> 
> 



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/2] thermal: int340x: Add critical callback to override default shutdown behavior
  2020-12-21 16:55 ` Srinivas Pandruvada
@ 2020-12-21 16:58   ` Kai-Heng Feng
  0 siblings, 0 replies; 8+ messages in thread
From: Kai-Heng Feng @ 2020-12-21 16:58 UTC (permalink / raw)
  To: Srinivas Pandruvada
  Cc: Zhang, Rui, Daniel Lezcano, amitk, Andrzej Pietrasiewicz,
	Matthew Garrett, Bartlomiej Zolnierkiewicz, Peter Kaestle,
	Gayatri Kammela, Akinobu Mita, Andy Shevchenko, Andrew Morton,
	open list:THERMAL, open list

On Tue, Dec 22, 2020 at 12:55 AM Srinivas Pandruvada
<srinivas.pandruvada@linux.intel.com> wrote:
>
> On Mon, 2020-12-21 at 21:52 +0800, Kai-Heng Feng wrote:
> > We are seeing thermal shutdown on Intel based mobile workstations,
> > the
> > shutdown happens during the first trip handle in
> > thermal_zone_device_register():
> > kernel: thermal thermal_zone15: critical temperature reached (101 C),
> > shutting down
> >
> > However, we shouldn't do a thermal shutdown here, since
> > 1) We may want to use a dedicated daemon, Intel's thermald in this
> > case,
> > to handle thermal shutdown.
> >
> > 2) For ACPI based system, _CRT doesn't mean shutdown unless it's
> > inside
> > ThermalZone namespace. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
> > "... If this object it present under a device, the device’s driver
> > evaluates this object to determine the device’s critical cooling
> > temperature trip point. This value may then be used by the device’s
> > driver to program an internal device temperature sensor trip point."
> >
> > So a "critical trip" here merely means we should take a more
> > aggressive
> > cooling method.
> >
> > As int340x device isn't present under ACPI ThermalZone, override the
> > default .critical callback to prevent surprising thermal shutdown.
> >
> > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > ---
> >  drivers/thermal/intel/int340x_thermal/int3400_thermal.c     | 6
> > ++++++
> >  .../thermal/intel/int340x_thermal/int340x_thermal_zone.c    | 6
> > ++++++
> >  2 files changed, 12 insertions(+)
> >
> > diff --git a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> > b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> > index 823354a1a91a..9778a6dba939 100644
> > --- a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> > +++ b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
> > @@ -431,9 +431,15 @@ static int int3400_thermal_change_mode(struct
> > thermal_zone_device *thermal,
> >         return result;
> >  }
> >
> > +static void int3400_thermal_critical(struct thermal_zone_device
> > *thermal)
> > +{
> > +       dev_dbg(&thermal->device, "%s: critical temperature
> > reached\n", thermal->type);
> > +}
> > +
> >  static struct thermal_zone_device_ops int3400_thermal_ops = {
> >         .get_temp = int3400_thermal_get_temp,
> >         .change_mode = int3400_thermal_change_mode,
> > +       .critical = int3400_thermal_critical,
> >  };
>
> You don't need for int3400 device. This is a fake sensor.

Thanks. Let me send a v2.

Kai-Heng

>
> >
> >  static struct thermal_zone_params int3400_thermal_params = {
> > diff --git
> > a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> > b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> > index 6e479deff76b..d1248ba943a4 100644
> > --- a/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> > +++ b/drivers/thermal/intel/int340x_thermal/int340x_thermal_zone.c
> > @@ -146,12 +146,18 @@ static int int340x_thermal_get_trip_hyst(struct
> > thermal_zone_device *zone,
> >         return 0;
> >  }
> >
> > +static void int340x_thermal_critical(struct thermal_zone_device
> > *zone)
> > +{
> > +       dev_dbg(&zone->device, "%s: critical temperature reached\n",
> > zone->type);
> > +}
> > +
> >  static struct thermal_zone_device_ops int340x_thermal_zone_ops = {
> >         .get_temp       = int340x_thermal_get_zone_temp,
> >         .get_trip_temp  = int340x_thermal_get_trip_temp,
> >         .get_trip_type  = int340x_thermal_get_trip_type,
> >         .set_trip_temp  = int340x_thermal_set_trip_temp,
> >         .get_trip_hyst =  int340x_thermal_get_trip_hyst,
> > +       .critical       = int340x_thermal_critical,
> >  };
> >
> >  static int int340x_thermal_get_trip_config(acpi_handle handle, char
> > *name,
>
> Thanks,
> Srinivas
>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-12-21 18:38 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-21 13:52 [PATCH 1/2] thermal: int340x: Add critical callback to override default shutdown behavior Kai-Heng Feng
2020-12-21 13:52 ` [PATCH 2/2] thermal: intel_pch_thermal: " Kai-Heng Feng
2020-12-21 13:59 ` [PATCH 1/2] thermal: int340x: " Daniel Lezcano
2020-12-21 15:00   ` Kai-Heng Feng
2020-12-21 16:03     ` Daniel Lezcano
2020-12-21 16:57       ` Srinivas Pandruvada
2020-12-21 16:55 ` Srinivas Pandruvada
2020-12-21 16:58   ` Kai-Heng Feng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.