All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action
@ 2023-08-29 19:41 Fabio Estevam
  2023-08-29 19:41 ` [PATCH v5 2/3] reboot: Introduce hw_protection_reboot() Fabio Estevam
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Fabio Estevam @ 2023-08-29 19:41 UTC (permalink / raw)
  To: daniel.lezcano
  Cc: rafael, amitk, rui.zhang, linux-pm, krzysztof.kozlowski+dt,
	robh+dt, conor+dt, devicetree, Fabio Estevam,
	Krzysztof Kozlowski

From: Fabio Estevam <festevam@denx.de>

Document the critical-action property to describe the thermal action
the OS should perform after the critical temperature is reached.

The possible values are "shutdown" and "reboot".

The motivation for introducing the critical-action property is that
different systems may need different thermal actions when the critical
temperature is reached.

For example, a desktop PC may want the OS to trigger a shutdown
when the critical temperature is reached.

However, in some embedded cases, such behavior does not suit well,
as the board may be unattended in the field and rebooting may be a
better approach.

The bootloader may also benefit from this new property as it can check
the SoC temperature and in case the temperature is above the critical
point, it can trigger a shutdown or reboot accordingly.

Signed-off-by: Fabio Estevam <festevam@denx.de>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
---
Changes since v4:
- None.

 .../devicetree/bindings/thermal/thermal-zones.yaml       | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
index 4f3acdc4dec0..c2e4d28f885b 100644
--- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
+++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
@@ -75,6 +75,15 @@ patternProperties:
           framework and assumes that the thermal sensors in this zone
           support interrupts.
 
+      critical-action:
+        $ref: /schemas/types.yaml#/definitions/string
+        description:
+          The action the OS should perform after the critical temperature is reached.
+
+        enum:
+          - shutdown
+          - reboot
+
       thermal-sensors:
         $ref: /schemas/types.yaml#/definitions/phandle-array
         maxItems: 1
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 2/3] reboot: Introduce hw_protection_reboot()
  2023-08-29 19:41 [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action Fabio Estevam
@ 2023-08-29 19:41 ` Fabio Estevam
  2023-08-30 11:33   ` Rafael J. Wysocki
  2023-08-29 19:42 ` [PATCH v5 3/3] thermal: thermal_core: Allow rebooting after critical temp Fabio Estevam
  2023-08-30 11:37 ` [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action Rafael J. Wysocki
  2 siblings, 1 reply; 12+ messages in thread
From: Fabio Estevam @ 2023-08-29 19:41 UTC (permalink / raw)
  To: daniel.lezcano
  Cc: rafael, amitk, rui.zhang, linux-pm, krzysztof.kozlowski+dt,
	robh+dt, conor+dt, devicetree, Fabio Estevam

From: Fabio Estevam <festevam@denx.de>

Introduce hw_protection_reboot() to trigger an emergency reboot.

It is a counterpart of hw_protection_shutdown() with the difference
that it will force a reboot instead of shutdown.

The motivation for doing this is to allow the thermal subystem
to trigger a reboot when the temperature reaches the critical
temperature.

Signed-off-by: Fabio Estevam <festevam@denx.de>
---
Changes since v4:
- None

 include/linux/reboot.h |  1 +
 kernel/reboot.c        | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index 2b6bb593be5b..4a319bc24f6a 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -174,6 +174,7 @@ void ctrl_alt_del(void);
 
 extern void orderly_poweroff(bool force);
 extern void orderly_reboot(void);
+void hw_protection_reboot(const char *reason, int ms_until_forced);
 void hw_protection_shutdown(const char *reason, int ms_until_forced);
 
 /*
diff --git a/kernel/reboot.c b/kernel/reboot.c
index 3bba88c7ffc6..05333ae8bc6b 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -952,6 +952,40 @@ static void hw_failure_emergency_poweroff(int poweroff_delay_ms)
 			      msecs_to_jiffies(poweroff_delay_ms));
 }
 
+/**
+ * hw_protection_reboot - Trigger an emergency system reboot
+ *
+ * @reason:		Reason of emergency reboot to be printed.
+ * @ms_until_forced:	Time to wait for orderly reboot before tiggering a
+ *			forced reboot. Negative value disables the forced
+ *			reboot.
+ *
+ * Initiate an emergency system reboot in order to protect hardware from
+ * further damage. Usage examples include a thermal protection.
+ *
+ * NOTE: The request is ignored if protection reboot is already pending even
+ * if the previous request has given a large timeout for forced reboot.
+ * Can be called from any context.
+ */
+void hw_protection_reboot(const char *reason, int ms_until_forced)
+{
+	static atomic_t allow_proceed = ATOMIC_INIT(1);
+
+	pr_emerg("HARDWARE PROTECTION reboot (%s)\n", reason);
+
+	/* Reboot should be initiated only once. */
+	if (!atomic_dec_and_test(&allow_proceed))
+		return;
+
+	/*
+	 * Queue a backup emergency reboot in the event of
+	 * orderly_reboot failure
+	 */
+	hw_failure_emergency_poweroff(ms_until_forced);
+	orderly_reboot();
+}
+EXPORT_SYMBOL_GPL(hw_protection_reboot);
+
 /**
  * hw_protection_shutdown - Trigger an emergency system poweroff
  *
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 3/3] thermal: thermal_core: Allow rebooting after critical temp
  2023-08-29 19:41 [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action Fabio Estevam
  2023-08-29 19:41 ` [PATCH v5 2/3] reboot: Introduce hw_protection_reboot() Fabio Estevam
@ 2023-08-29 19:42 ` Fabio Estevam
  2023-08-30 11:18   ` Rafael J. Wysocki
  2023-08-30 11:37 ` [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action Rafael J. Wysocki
  2 siblings, 1 reply; 12+ messages in thread
From: Fabio Estevam @ 2023-08-29 19:42 UTC (permalink / raw)
  To: daniel.lezcano
  Cc: rafael, amitk, rui.zhang, linux-pm, krzysztof.kozlowski+dt,
	robh+dt, conor+dt, devicetree, Fabio Estevam

From: Fabio Estevam <festevam@denx.de>

Currently, the default mechanism is to trigger a shutdown after the
critical temperature is reached.

In some embedded cases, such behavior does not suit well, as the board may
be unattended in the field and rebooting may be a better approach.

The bootloader may also check the temperature and only allow the boot to
proceed when the temperature is below a certain threshold.

Introduce support for allowing a reboot to be triggered after the
critical temperature is reached.

If the "critical-action" devicetree property is not found, fall back to
the shutdown action to preserve the existing default behavior.

Tested on a i.MX8MM board with the following devicetree changes:

	thermal-zones {
		cpu-thermal {
			critical-action = "reboot";
		};
	};

Signed-off-by: Fabio Estevam <festevam@denx.de>
---
Changes since v4:
- Simplify the logic inside thermal_zone_device_critical(). (Rafael)
- Declare THERMAL_CRITICAL_ACTION_SHUTDOWN = 0 so it is clear what happens
on non-DT platforms. (Rafael)

 drivers/thermal/thermal_core.c |  6 +++++-
 drivers/thermal/thermal_of.c   | 27 +++++++++++++++++++++++++++
 include/linux/thermal.h        |  6 ++++++
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index a59700593d32..062114608667 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -320,11 +320,15 @@ void thermal_zone_device_critical(struct thermal_zone_device *tz)
 	 * Its a must for forced_emergency_poweroff_work to be scheduled.
 	 */
 	int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS;
+	static const char *msg = "Temperature too high";
 
 	dev_emerg(&tz->device, "%s: critical temperature reached, "
 		  "shutting down\n", tz->type);
 
-	hw_protection_shutdown("Temperature too high", poweroff_delay_ms);
+	if (tz->action == THERMAL_CRITICAL_ACTION_REBOOT)
+		hw_protection_reboot(msg, poweroff_delay_ms);
+	else
+		hw_protection_shutdown(msg, poweroff_delay_ms);
 }
 EXPORT_SYMBOL(thermal_zone_device_critical);
 
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index 4ca905723429..8bc28cba7406 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -218,6 +218,31 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
 	return tz;
 }
 
+static const char * const critical_actions[] = {
+	[THERMAL_CRITICAL_ACTION_SHUTDOWN]	= "shutdown",
+	[THERMAL_CRITICAL_ACTION_REBOOT]	= "reboot",
+};
+
+static void thermal_of_get_critical_action(struct device_node *np,
+					   enum thermal_action *action)
+{
+	const char *action_string;
+	int i, ret;
+
+	ret = of_property_read_string(np, "critical-action", &action_string);
+	if (ret < 0)
+		goto out_default_action;
+
+	for (i = 0; i < ARRAY_SIZE(critical_actions); i++)
+		if (!strcasecmp(action_string, critical_actions[i])) {
+			*action = i;
+			return;
+		}
+
+out_default_action:
+	*action = THERMAL_CRITICAL_ACTION_SHUTDOWN;
+}
+
 static int thermal_of_monitor_init(struct device_node *np, int *delay, int *pdelay)
 {
 	int ret;
@@ -516,6 +541,8 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
 		goto out_kfree_trips;
 	}
 
+	thermal_of_get_critical_action(np, &tz->action);
+
 	ret = thermal_zone_device_enable(tz);
 	if (ret) {
 		pr_err("Failed to enabled thermal zone '%s', id=%d: %d\n",
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index b449a46766f5..b68e5734823d 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -34,6 +34,11 @@ struct thermal_cooling_device;
 struct thermal_instance;
 struct thermal_attr;
 
+enum thermal_action {
+	THERMAL_CRITICAL_ACTION_SHUTDOWN = 0, /* shutdown when crit temperature is reached */
+	THERMAL_CRITICAL_ACTION_REBOOT, /* reboot when crit temperature is reached */
+};
+
 enum thermal_trend {
 	THERMAL_TREND_STABLE, /* temperature is stable */
 	THERMAL_TREND_RAISING, /* temperature is raising */
@@ -187,6 +192,7 @@ struct thermal_zone_device {
 	struct list_head node;
 	struct delayed_work poll_queue;
 	enum thermal_notify_event notify_event;
+	enum thermal_action action;
 };
 
 /**
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 3/3] thermal: thermal_core: Allow rebooting after critical temp
  2023-08-29 19:42 ` [PATCH v5 3/3] thermal: thermal_core: Allow rebooting after critical temp Fabio Estevam
@ 2023-08-30 11:18   ` Rafael J. Wysocki
  0 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2023-08-30 11:18 UTC (permalink / raw)
  To: Fabio Estevam
  Cc: daniel.lezcano, rafael, amitk, rui.zhang, linux-pm,
	krzysztof.kozlowski+dt, robh+dt, conor+dt, devicetree,
	Fabio Estevam

On Tue, Aug 29, 2023 at 9:42 PM Fabio Estevam <festevam@gmail.com> wrote:
>
> From: Fabio Estevam <festevam@denx.de>
>
> Currently, the default mechanism is to trigger a shutdown after the
> critical temperature is reached.
>
> In some embedded cases, such behavior does not suit well, as the board may
> be unattended in the field and rebooting may be a better approach.
>
> The bootloader may also check the temperature and only allow the boot to
> proceed when the temperature is below a certain threshold.
>
> Introduce support for allowing a reboot to be triggered after the
> critical temperature is reached.
>
> If the "critical-action" devicetree property is not found, fall back to
> the shutdown action to preserve the existing default behavior.
>
> Tested on a i.MX8MM board with the following devicetree changes:
>
>         thermal-zones {
>                 cpu-thermal {
>                         critical-action = "reboot";
>                 };
>         };
>
> Signed-off-by: Fabio Estevam <festevam@denx.de>
> ---
> Changes since v4:
> - Simplify the logic inside thermal_zone_device_critical(). (Rafael)
> - Declare THERMAL_CRITICAL_ACTION_SHUTDOWN = 0 so it is clear what happens
> on non-DT platforms. (Rafael)
>
>  drivers/thermal/thermal_core.c |  6 +++++-
>  drivers/thermal/thermal_of.c   | 27 +++++++++++++++++++++++++++
>  include/linux/thermal.h        |  6 ++++++
>  3 files changed, 38 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index a59700593d32..062114608667 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -320,11 +320,15 @@ void thermal_zone_device_critical(struct thermal_zone_device *tz)
>          * Its a must for forced_emergency_poweroff_work to be scheduled.
>          */
>         int poweroff_delay_ms = CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS;
> +       static const char *msg = "Temperature too high";
>
>         dev_emerg(&tz->device, "%s: critical temperature reached, "
>                   "shutting down\n", tz->type);
>
> -       hw_protection_shutdown("Temperature too high", poweroff_delay_ms);
> +       if (tz->action == THERMAL_CRITICAL_ACTION_REBOOT)
> +               hw_protection_reboot(msg, poweroff_delay_ms);
> +       else
> +               hw_protection_shutdown(msg, poweroff_delay_ms);
>  }
>  EXPORT_SYMBOL(thermal_zone_device_critical);
>
> diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
> index 4ca905723429..8bc28cba7406 100644
> --- a/drivers/thermal/thermal_of.c
> +++ b/drivers/thermal/thermal_of.c
> @@ -218,6 +218,31 @@ static struct device_node *of_thermal_zone_find(struct device_node *sensor, int
>         return tz;
>  }
>
> +static const char * const critical_actions[] = {
> +       [THERMAL_CRITICAL_ACTION_SHUTDOWN]      = "shutdown",
> +       [THERMAL_CRITICAL_ACTION_REBOOT]        = "reboot",
> +};
> +
> +static void thermal_of_get_critical_action(struct device_node *np,
> +                                          enum thermal_action *action)
> +{
> +       const char *action_string;
> +       int i, ret;
> +
> +       ret = of_property_read_string(np, "critical-action", &action_string);
> +       if (ret < 0)
> +               goto out_default_action;
> +
> +       for (i = 0; i < ARRAY_SIZE(critical_actions); i++)
> +               if (!strcasecmp(action_string, critical_actions[i])) {
> +                       *action = i;
> +                       return;
> +               }

This looks somewhat artificial and is a bit questionable (the index
variable should arguably start at THERMAL_CRITICAL_ACTION_SHUTDOWN,
for instance, and the "shutdown" item is redundant).

There are only two values and you want to carry out an emergency
shutdown anyway if the value is not "reboot".

I would just do

    if (!strcasecmp(action_string, "reboot")) {
            *action = THERMAL_CRITICAL_ACTION_REBOOT;
            return;
    }

> +
> +out_default_action:
> +       *action = THERMAL_CRITICAL_ACTION_SHUTDOWN;
> +}
> +
>  static int thermal_of_monitor_init(struct device_node *np, int *delay, int *pdelay)
>  {
>         int ret;
> @@ -516,6 +541,8 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
>                 goto out_kfree_trips;
>         }
>
> +       thermal_of_get_critical_action(np, &tz->action);
> +
>         ret = thermal_zone_device_enable(tz);
>         if (ret) {
>                 pr_err("Failed to enabled thermal zone '%s', id=%d: %d\n",
> diff --git a/include/linux/thermal.h b/include/linux/thermal.h
> index b449a46766f5..b68e5734823d 100644
> --- a/include/linux/thermal.h
> +++ b/include/linux/thermal.h
> @@ -34,6 +34,11 @@ struct thermal_cooling_device;
>  struct thermal_instance;
>  struct thermal_attr;
>
> +enum thermal_action {
> +       THERMAL_CRITICAL_ACTION_SHUTDOWN = 0, /* shutdown when crit temperature is reached */
> +       THERMAL_CRITICAL_ACTION_REBOOT, /* reboot when crit temperature is reached */
> +};
> +
>  enum thermal_trend {
>         THERMAL_TREND_STABLE, /* temperature is stable */
>         THERMAL_TREND_RAISING, /* temperature is raising */
> @@ -187,6 +192,7 @@ struct thermal_zone_device {
>         struct list_head node;
>         struct delayed_work poll_queue;
>         enum thermal_notify_event notify_event;
> +       enum thermal_action action;
>  };
>
>  /**
> --

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 2/3] reboot: Introduce hw_protection_reboot()
  2023-08-29 19:41 ` [PATCH v5 2/3] reboot: Introduce hw_protection_reboot() Fabio Estevam
@ 2023-08-30 11:33   ` Rafael J. Wysocki
  0 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2023-08-30 11:33 UTC (permalink / raw)
  To: Fabio Estevam
  Cc: daniel.lezcano, rafael, amitk, rui.zhang, linux-pm,
	krzysztof.kozlowski+dt, robh+dt, conor+dt, devicetree,
	Fabio Estevam

On Tue, Aug 29, 2023 at 9:42 PM Fabio Estevam <festevam@gmail.com> wrote:
>
> From: Fabio Estevam <festevam@denx.de>
>
> Introduce hw_protection_reboot() to trigger an emergency reboot.
>
> It is a counterpart of hw_protection_shutdown() with the difference
> that it will force a reboot instead of shutdown.
>
> The motivation for doing this is to allow the thermal subystem
> to trigger a reboot when the temperature reaches the critical
> temperature.
>
> Signed-off-by: Fabio Estevam <festevam@denx.de>
> ---
> Changes since v4:
> - None
>
>  include/linux/reboot.h |  1 +
>  kernel/reboot.c        | 34 ++++++++++++++++++++++++++++++++++
>  2 files changed, 35 insertions(+)
>
> diff --git a/include/linux/reboot.h b/include/linux/reboot.h
> index 2b6bb593be5b..4a319bc24f6a 100644
> --- a/include/linux/reboot.h
> +++ b/include/linux/reboot.h
> @@ -174,6 +174,7 @@ void ctrl_alt_del(void);
>
>  extern void orderly_poweroff(bool force);
>  extern void orderly_reboot(void);
> +void hw_protection_reboot(const char *reason, int ms_until_forced);
>  void hw_protection_shutdown(const char *reason, int ms_until_forced);
>
>  /*
> diff --git a/kernel/reboot.c b/kernel/reboot.c
> index 3bba88c7ffc6..05333ae8bc6b 100644
> --- a/kernel/reboot.c
> +++ b/kernel/reboot.c
> @@ -952,6 +952,40 @@ static void hw_failure_emergency_poweroff(int poweroff_delay_ms)
>                               msecs_to_jiffies(poweroff_delay_ms));
>  }
>
> +/**
> + * hw_protection_reboot - Trigger an emergency system reboot
> + *
> + * @reason:            Reason of emergency reboot to be printed.
> + * @ms_until_forced:   Time to wait for orderly reboot before tiggering a
> + *                     forced reboot. Negative value disables the forced
> + *                     reboot.
> + *
> + * Initiate an emergency system reboot in order to protect hardware from
> + * further damage. Usage examples include a thermal protection.
> + *
> + * NOTE: The request is ignored if protection reboot is already pending even
> + * if the previous request has given a large timeout for forced reboot.
> + * Can be called from any context.
> + */
> +void hw_protection_reboot(const char *reason, int ms_until_forced)

Some code and kerneldoc comment duplication could be avoided by
changing hw_protection_shutdown() definition to take an additional
argument specifying the final action to carry out, so there would be

void __hw_protection_shutdown(const char *reason, int ms_until_forced,
bool reboot)
{
...
   if (reboot)
       orderly_reboot();
   else
      orderly_poweroff(true);
}

and both hw_protection_shutdown() and hw_protection_reboot() can be
defined as static inline wrappers around this.

> +{
> +       static atomic_t allow_proceed = ATOMIC_INIT(1);
> +
> +       pr_emerg("HARDWARE PROTECTION reboot (%s)\n", reason);
> +
> +       /* Reboot should be initiated only once. */
> +       if (!atomic_dec_and_test(&allow_proceed))
> +               return;
> +
> +       /*
> +        * Queue a backup emergency reboot in the event of
> +        * orderly_reboot failure
> +        */
> +       hw_failure_emergency_poweroff(ms_until_forced);
> +       orderly_reboot();
> +}
> +EXPORT_SYMBOL_GPL(hw_protection_reboot);
> +
>  /**
>   * hw_protection_shutdown - Trigger an emergency system poweroff
>   *
> --
> 2.34.1
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action
  2023-08-29 19:41 [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action Fabio Estevam
  2023-08-29 19:41 ` [PATCH v5 2/3] reboot: Introduce hw_protection_reboot() Fabio Estevam
  2023-08-29 19:42 ` [PATCH v5 3/3] thermal: thermal_core: Allow rebooting after critical temp Fabio Estevam
@ 2023-08-30 11:37 ` Rafael J. Wysocki
  2023-08-30 13:06   ` Krzysztof Kozlowski
  2 siblings, 1 reply; 12+ messages in thread
From: Rafael J. Wysocki @ 2023-08-30 11:37 UTC (permalink / raw)
  To: Fabio Estevam
  Cc: daniel.lezcano, rafael, amitk, rui.zhang, linux-pm,
	krzysztof.kozlowski+dt, robh+dt, conor+dt, devicetree,
	Fabio Estevam, Krzysztof Kozlowski

On Tue, Aug 29, 2023 at 9:42 PM Fabio Estevam <festevam@gmail.com> wrote:
>
> From: Fabio Estevam <festevam@denx.de>
>
> Document the critical-action property to describe the thermal action
> the OS should perform after the critical temperature is reached.
>
> The possible values are "shutdown" and "reboot".
>
> The motivation for introducing the critical-action property is that
> different systems may need different thermal actions when the critical
> temperature is reached.
>
> For example, a desktop PC may want the OS to trigger a shutdown
> when the critical temperature is reached.
>
> However, in some embedded cases, such behavior does not suit well,
> as the board may be unattended in the field and rebooting may be a
> better approach.
>
> The bootloader may also benefit from this new property as it can check
> the SoC temperature and in case the temperature is above the critical
> point, it can trigger a shutdown or reboot accordingly.
>
> Signed-off-by: Fabio Estevam <festevam@denx.de>
> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
> ---
> Changes since v4:
> - None.
>
>  .../devicetree/bindings/thermal/thermal-zones.yaml       | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> index 4f3acdc4dec0..c2e4d28f885b 100644
> --- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> +++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> @@ -75,6 +75,15 @@ patternProperties:
>            framework and assumes that the thermal sensors in this zone
>            support interrupts.
>
> +      critical-action:
> +        $ref: /schemas/types.yaml#/definitions/string
> +        description:
> +          The action the OS should perform after the critical temperature is reached.
> +
> +        enum:
> +          - shutdown
> +          - reboot
> +
>        thermal-sensors:
>          $ref: /schemas/types.yaml#/definitions/phandle-array
>          maxItems: 1
> --

I'm wondering if this should be a bool property called
"critical-reboot", say, which when present would mean to carry out a
reboot instead of a shutdown in an emergency.

As defined above, the "shutdown" value is simply redundant, because
the code will effectively convert any other value to "shutdown"
anyway.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action
  2023-08-30 11:37 ` [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action Rafael J. Wysocki
@ 2023-08-30 13:06   ` Krzysztof Kozlowski
  2023-08-30 13:54     ` Rafael J. Wysocki
  0 siblings, 1 reply; 12+ messages in thread
From: Krzysztof Kozlowski @ 2023-08-30 13:06 UTC (permalink / raw)
  To: Rafael J. Wysocki, Fabio Estevam
  Cc: daniel.lezcano, amitk, rui.zhang, linux-pm,
	krzysztof.kozlowski+dt, robh+dt, conor+dt, devicetree,
	Fabio Estevam

On 30/08/2023 13:37, Rafael J. Wysocki wrote:
> On Tue, Aug 29, 2023 at 9:42 PM Fabio Estevam <festevam@gmail.com> wrote:
>>
>> From: Fabio Estevam <festevam@denx.de>
>>
>> Document the critical-action property to describe the thermal action
>> the OS should perform after the critical temperature is reached.
>>
>> The possible values are "shutdown" and "reboot".
>>
>> The motivation for introducing the critical-action property is that
>> different systems may need different thermal actions when the critical
>> temperature is reached.
>>
>> For example, a desktop PC may want the OS to trigger a shutdown
>> when the critical temperature is reached.
>>
>> However, in some embedded cases, such behavior does not suit well,
>> as the board may be unattended in the field and rebooting may be a
>> better approach.
>>
>> The bootloader may also benefit from this new property as it can check
>> the SoC temperature and in case the temperature is above the critical
>> point, it can trigger a shutdown or reboot accordingly.
>>
>> Signed-off-by: Fabio Estevam <festevam@denx.de>
>> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
>> ---
>> Changes since v4:
>> - None.
>>
>>  .../devicetree/bindings/thermal/thermal-zones.yaml       | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
>> index 4f3acdc4dec0..c2e4d28f885b 100644
>> --- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
>> +++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
>> @@ -75,6 +75,15 @@ patternProperties:
>>            framework and assumes that the thermal sensors in this zone
>>            support interrupts.
>>
>> +      critical-action:
>> +        $ref: /schemas/types.yaml#/definitions/string
>> +        description:
>> +          The action the OS should perform after the critical temperature is reached.
>> +
>> +        enum:
>> +          - shutdown
>> +          - reboot
>> +
>>        thermal-sensors:
>>          $ref: /schemas/types.yaml#/definitions/phandle-array
>>          maxItems: 1
>> --
> 
> I'm wondering if this should be a bool property called
> "critical-reboot", say, which when present would mean to carry out a
> reboot instead of a shutdown in an emergency.
> 
> As defined above, the "shutdown" value is simply redundant, because
> the code will effectively convert any other value to "shutdown"
> anyway.

We covered this at v1. Bool does not allow this property to change in
the future, e.g. for a third mode. And how would you change the action
to shutdown if default action in the system was reboot?

Best regards,
Krzysztof


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action
  2023-08-30 13:06   ` Krzysztof Kozlowski
@ 2023-08-30 13:54     ` Rafael J. Wysocki
  2023-08-30 15:14       ` Fabio Estevam
  2023-08-30 15:32       ` Krzysztof Kozlowski
  0 siblings, 2 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2023-08-30 13:54 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Rafael J. Wysocki, Fabio Estevam, daniel.lezcano, amitk,
	rui.zhang, linux-pm, krzysztof.kozlowski+dt, robh+dt, conor+dt,
	devicetree, Fabio Estevam

On Wed, Aug 30, 2023 at 3:07 PM Krzysztof Kozlowski
<krzysztof.kozlowski@linaro.org> wrote:
>
> On 30/08/2023 13:37, Rafael J. Wysocki wrote:
> > On Tue, Aug 29, 2023 at 9:42 PM Fabio Estevam <festevam@gmail.com> wrote:
> >>
> >> From: Fabio Estevam <festevam@denx.de>
> >>
> >> Document the critical-action property to describe the thermal action
> >> the OS should perform after the critical temperature is reached.
> >>
> >> The possible values are "shutdown" and "reboot".
> >>
> >> The motivation for introducing the critical-action property is that
> >> different systems may need different thermal actions when the critical
> >> temperature is reached.
> >>
> >> For example, a desktop PC may want the OS to trigger a shutdown
> >> when the critical temperature is reached.
> >>
> >> However, in some embedded cases, such behavior does not suit well,
> >> as the board may be unattended in the field and rebooting may be a
> >> better approach.
> >>
> >> The bootloader may also benefit from this new property as it can check
> >> the SoC temperature and in case the temperature is above the critical
> >> point, it can trigger a shutdown or reboot accordingly.
> >>
> >> Signed-off-by: Fabio Estevam <festevam@denx.de>
> >> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
> >> ---
> >> Changes since v4:
> >> - None.
> >>
> >>  .../devicetree/bindings/thermal/thermal-zones.yaml       | 9 +++++++++
> >>  1 file changed, 9 insertions(+)
> >>
> >> diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> >> index 4f3acdc4dec0..c2e4d28f885b 100644
> >> --- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> >> +++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> >> @@ -75,6 +75,15 @@ patternProperties:
> >>            framework and assumes that the thermal sensors in this zone
> >>            support interrupts.
> >>
> >> +      critical-action:
> >> +        $ref: /schemas/types.yaml#/definitions/string
> >> +        description:
> >> +          The action the OS should perform after the critical temperature is reached.
> >> +
> >> +        enum:
> >> +          - shutdown
> >> +          - reboot
> >> +
> >>        thermal-sensors:
> >>          $ref: /schemas/types.yaml#/definitions/phandle-array
> >>          maxItems: 1
> >> --
> >
> > I'm wondering if this should be a bool property called
> > "critical-reboot", say, which when present would mean to carry out a
> > reboot instead of a shutdown in an emergency.
> >
> > As defined above, the "shutdown" value is simply redundant, because
> > the code will effectively convert any other value to "shutdown"
> > anyway.
>
> We covered this at v1. Bool does not allow this property to change in
> the future, e.g. for a third mode. And how would you change the action
> to shutdown if default action in the system was reboot?

Well, as a matter of fact, it isn't, so I'm not sure where this is going.

Bool definitely allows the property to be not present, which means
that the default behavior is intended and this is all about overriding
a known default behavior.

Anyway, if the maintainers of DT bindings are fine with the current
definition, I'm not going to block this.  I just wanted to make a note
that it wasn't looking particularly straightforward to me.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action
  2023-08-30 13:54     ` Rafael J. Wysocki
@ 2023-08-30 15:14       ` Fabio Estevam
  2023-08-30 15:33         ` Rafael J. Wysocki
  2023-08-30 15:32       ` Krzysztof Kozlowski
  1 sibling, 1 reply; 12+ messages in thread
From: Fabio Estevam @ 2023-08-30 15:14 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Krzysztof Kozlowski, Fabio Estevam, daniel.lezcano, amitk,
	rui.zhang, linux-pm, krzysztof.kozlowski+dt, robh+dt, conor+dt,
	devicetree

Hi Rafael,

On 30/08/2023 10:54, Rafael J. Wysocki wrote:

> Well, as a matter of fact, it isn't, so I'm not sure where this is 
> going.
> 
> Bool definitely allows the property to be not present, which means
> that the default behavior is intended and this is all about overriding
> a known default behavior.

This devicetree property can be used on non-Linux environments, such as 
bootloaders.

Bootloaders may have different default behavior than Linux, so 
explicitly listing
"shutdown" or "reboot" make it clearer, independent of the OS.

Thanks

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action
  2023-08-30 13:54     ` Rafael J. Wysocki
  2023-08-30 15:14       ` Fabio Estevam
@ 2023-08-30 15:32       ` Krzysztof Kozlowski
  2023-08-30 15:38         ` Rafael J. Wysocki
  1 sibling, 1 reply; 12+ messages in thread
From: Krzysztof Kozlowski @ 2023-08-30 15:32 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Fabio Estevam, daniel.lezcano, amitk, rui.zhang, linux-pm,
	krzysztof.kozlowski+dt, robh+dt, conor+dt, devicetree,
	Fabio Estevam

On 30/08/2023 15:54, Rafael J. Wysocki wrote:
> On Wed, Aug 30, 2023 at 3:07 PM Krzysztof Kozlowski
> <krzysztof.kozlowski@linaro.org> wrote:
>>
>> On 30/08/2023 13:37, Rafael J. Wysocki wrote:
>>> On Tue, Aug 29, 2023 at 9:42 PM Fabio Estevam <festevam@gmail.com> wrote:
>>>>
>>>> From: Fabio Estevam <festevam@denx.de>
>>>>
>>>> Document the critical-action property to describe the thermal action
>>>> the OS should perform after the critical temperature is reached.
>>>>
>>>> The possible values are "shutdown" and "reboot".
>>>>
>>>> The motivation for introducing the critical-action property is that
>>>> different systems may need different thermal actions when the critical
>>>> temperature is reached.
>>>>
>>>> For example, a desktop PC may want the OS to trigger a shutdown
>>>> when the critical temperature is reached.
>>>>
>>>> However, in some embedded cases, such behavior does not suit well,
>>>> as the board may be unattended in the field and rebooting may be a
>>>> better approach.
>>>>
>>>> The bootloader may also benefit from this new property as it can check
>>>> the SoC temperature and in case the temperature is above the critical
>>>> point, it can trigger a shutdown or reboot accordingly.
>>>>
>>>> Signed-off-by: Fabio Estevam <festevam@denx.de>
>>>> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
>>>> ---
>>>> Changes since v4:
>>>> - None.
>>>>
>>>>  .../devicetree/bindings/thermal/thermal-zones.yaml       | 9 +++++++++
>>>>  1 file changed, 9 insertions(+)
>>>>
>>>> diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
>>>> index 4f3acdc4dec0..c2e4d28f885b 100644
>>>> --- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
>>>> +++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
>>>> @@ -75,6 +75,15 @@ patternProperties:
>>>>            framework and assumes that the thermal sensors in this zone
>>>>            support interrupts.
>>>>
>>>> +      critical-action:
>>>> +        $ref: /schemas/types.yaml#/definitions/string
>>>> +        description:
>>>> +          The action the OS should perform after the critical temperature is reached.
>>>> +
>>>> +        enum:
>>>> +          - shutdown
>>>> +          - reboot
>>>> +
>>>>        thermal-sensors:
>>>>          $ref: /schemas/types.yaml#/definitions/phandle-array
>>>>          maxItems: 1
>>>> --
>>>
>>> I'm wondering if this should be a bool property called
>>> "critical-reboot", say, which when present would mean to carry out a
>>> reboot instead of a shutdown in an emergency.
>>>
>>> As defined above, the "shutdown" value is simply redundant, because
>>> the code will effectively convert any other value to "shutdown"
>>> anyway.
>>
>> We covered this at v1. Bool does not allow this property to change in
>> the future, e.g. for a third mode. And how would you change the action
>> to shutdown if default action in the system was reboot?
> 
> Well, as a matter of fact, it isn't, so I'm not sure where this is going.

It isn't in which system and firmware? Maybe most, but how can you know
for sure. Bindings are independent of given OS implementation, thus
relying on default OS choice is shortsighted.

> 
> Bool definitely allows the property to be not present, which means
> that the default behavior is intended and this is all about overriding
> a known default behavior.
> 
> Anyway, if the maintainers of DT bindings are fine with the current
> definition, I'm not going to block this.  I just wanted to make a note
> that it wasn't looking particularly straightforward to me.

Sure. It has one DT's maintainer Ack (mine) and maybe also Rob will
comment on it.

Best regards,
Krzysztof


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action
  2023-08-30 15:14       ` Fabio Estevam
@ 2023-08-30 15:33         ` Rafael J. Wysocki
  0 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2023-08-30 15:33 UTC (permalink / raw)
  To: Fabio Estevam
  Cc: Rafael J. Wysocki, Krzysztof Kozlowski, Fabio Estevam,
	daniel.lezcano, amitk, rui.zhang, linux-pm,
	krzysztof.kozlowski+dt, robh+dt, conor+dt, devicetree

Hi Fabio,

On Wed, Aug 30, 2023 at 5:14 PM Fabio Estevam <festevam@denx.de> wrote:
>
> Hi Rafael,
>
> On 30/08/2023 10:54, Rafael J. Wysocki wrote:
>
> > Well, as a matter of fact, it isn't, so I'm not sure where this is
> > going.
> >
> > Bool definitely allows the property to be not present, which means
> > that the default behavior is intended and this is all about overriding
> > a known default behavior.
>
> This devicetree property can be used on non-Linux environments, such as
> bootloaders.
>
> Bootloaders may have different default behavior than Linux, so
> explicitly listing
> "shutdown" or "reboot" make it clearer, independent of the OS.

This property is defined in the Linux kernel source tree and it will
only turn out whether or not any other projects adopt it, so I'm not
really convinced by this argument.

What I'm saying is that from the Linux kernel's POV, the definition is
more complex than it needs to be.

Moreover, to me it merely represents the preference of a system
integrator and so in practice, it can be ignored.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action
  2023-08-30 15:32       ` Krzysztof Kozlowski
@ 2023-08-30 15:38         ` Rafael J. Wysocki
  0 siblings, 0 replies; 12+ messages in thread
From: Rafael J. Wysocki @ 2023-08-30 15:38 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Rafael J. Wysocki, Fabio Estevam, daniel.lezcano, amitk,
	rui.zhang, linux-pm, krzysztof.kozlowski+dt, robh+dt, conor+dt,
	devicetree, Fabio Estevam

On Wed, Aug 30, 2023 at 5:33 PM Krzysztof Kozlowski
<krzysztof.kozlowski@linaro.org> wrote:
>
> On 30/08/2023 15:54, Rafael J. Wysocki wrote:
> > On Wed, Aug 30, 2023 at 3:07 PM Krzysztof Kozlowski
> > <krzysztof.kozlowski@linaro.org> wrote:
> >>
> >> On 30/08/2023 13:37, Rafael J. Wysocki wrote:
> >>> On Tue, Aug 29, 2023 at 9:42 PM Fabio Estevam <festevam@gmail.com> wrote:
> >>>>
> >>>> From: Fabio Estevam <festevam@denx.de>
> >>>>
> >>>> Document the critical-action property to describe the thermal action
> >>>> the OS should perform after the critical temperature is reached.
> >>>>
> >>>> The possible values are "shutdown" and "reboot".
> >>>>
> >>>> The motivation for introducing the critical-action property is that
> >>>> different systems may need different thermal actions when the critical
> >>>> temperature is reached.
> >>>>
> >>>> For example, a desktop PC may want the OS to trigger a shutdown
> >>>> when the critical temperature is reached.
> >>>>
> >>>> However, in some embedded cases, such behavior does not suit well,
> >>>> as the board may be unattended in the field and rebooting may be a
> >>>> better approach.
> >>>>
> >>>> The bootloader may also benefit from this new property as it can check
> >>>> the SoC temperature and in case the temperature is above the critical
> >>>> point, it can trigger a shutdown or reboot accordingly.
> >>>>
> >>>> Signed-off-by: Fabio Estevam <festevam@denx.de>
> >>>> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
> >>>> ---
> >>>> Changes since v4:
> >>>> - None.
> >>>>
> >>>>  .../devicetree/bindings/thermal/thermal-zones.yaml       | 9 +++++++++
> >>>>  1 file changed, 9 insertions(+)
> >>>>
> >>>> diff --git a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> >>>> index 4f3acdc4dec0..c2e4d28f885b 100644
> >>>> --- a/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> >>>> +++ b/Documentation/devicetree/bindings/thermal/thermal-zones.yaml
> >>>> @@ -75,6 +75,15 @@ patternProperties:
> >>>>            framework and assumes that the thermal sensors in this zone
> >>>>            support interrupts.
> >>>>
> >>>> +      critical-action:
> >>>> +        $ref: /schemas/types.yaml#/definitions/string
> >>>> +        description:
> >>>> +          The action the OS should perform after the critical temperature is reached.
> >>>> +
> >>>> +        enum:
> >>>> +          - shutdown
> >>>> +          - reboot
> >>>> +
> >>>>        thermal-sensors:
> >>>>          $ref: /schemas/types.yaml#/definitions/phandle-array
> >>>>          maxItems: 1
> >>>> --
> >>>
> >>> I'm wondering if this should be a bool property called
> >>> "critical-reboot", say, which when present would mean to carry out a
> >>> reboot instead of a shutdown in an emergency.
> >>>
> >>> As defined above, the "shutdown" value is simply redundant, because
> >>> the code will effectively convert any other value to "shutdown"
> >>> anyway.
> >>
> >> We covered this at v1. Bool does not allow this property to change in
> >> the future, e.g. for a third mode. And how would you change the action
> >> to shutdown if default action in the system was reboot?
> >
> > Well, as a matter of fact, it isn't, so I'm not sure where this is going.
>
> It isn't in which system and firmware?

There is a specific default behavior present in the Linux kernel
today.  The addition of this property isn't going to change it AFAICS.

> Maybe most, but how can you know
> for sure. Bindings are independent of given OS implementation, thus
> relying on default OS choice is shortsighted.

So can you point me to any project other than the Linux kernel that
will support this property once it gets to the DT bindings
documentation in the kernel source?

> >
> > Bool definitely allows the property to be not present, which means
> > that the default behavior is intended and this is all about overriding
> > a known default behavior.
> >
> > Anyway, if the maintainers of DT bindings are fine with the current
> > definition, I'm not going to block this.  I just wanted to make a note
> > that it wasn't looking particularly straightforward to me.
>
> Sure. It has one DT's maintainer Ack (mine) and maybe also Rob will
> comment on it.

OK

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-08-30 18:35 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-29 19:41 [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action Fabio Estevam
2023-08-29 19:41 ` [PATCH v5 2/3] reboot: Introduce hw_protection_reboot() Fabio Estevam
2023-08-30 11:33   ` Rafael J. Wysocki
2023-08-29 19:42 ` [PATCH v5 3/3] thermal: thermal_core: Allow rebooting after critical temp Fabio Estevam
2023-08-30 11:18   ` Rafael J. Wysocki
2023-08-30 11:37 ` [PATCH v5 1/3] dt-bindings: thermal-zones: Document critical-action Rafael J. Wysocki
2023-08-30 13:06   ` Krzysztof Kozlowski
2023-08-30 13:54     ` Rafael J. Wysocki
2023-08-30 15:14       ` Fabio Estevam
2023-08-30 15:33         ` Rafael J. Wysocki
2023-08-30 15:32       ` Krzysztof Kozlowski
2023-08-30 15:38         ` Rafael J. Wysocki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.