All of lore.kernel.org
 help / color / mirror / Atom feed
From: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
To: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>,
	Zhang Rui <rui.zhang@intel.com>,
	amitk@kernel.org, "open list:THERMAL" <linux-pm@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/3] thermal: core: Add indication for userspace usage
Date: Mon, 30 Nov 2020 10:13:08 -0800	[thread overview]
Message-ID: <585bb5d3ee5bea063795682108576c3464ba72b6.camel@linux.intel.com> (raw)
In-Reply-To: <34348B03-5E27-49A0-A704-6332BAC00758@canonical.com>

On Tue, 2020-12-01 at 02:04 +0800, Kai-Heng Feng wrote:
> > On Dec 1, 2020, at 00:19, Srinivas Pandruvada <
> > srinivas.pandruvada@linux.intel.com> wrote:
> > 
> > On Mon, 2020-11-30 at 16:23 +0800, Kai-Heng Feng wrote:
> > > > On Nov 30, 2020, at 15:57, Daniel Lezcano <
> > > > daniel.lezcano@linaro.org> wrote:
> > > > 
> > > > 
> > > > [Added Srinivas]
> > > > 
> > > > On 28/11/2020 18:54, Kai-Heng Feng wrote:
> > > > > We are seeing thermal shutdown on Intel based mobile
> > > > > workstations, the
> > > > > shutdown happens during the first trip handle in
> > > > > thermal_zone_device_register():
> > > > > kernel: thermal thermal_zone15: critical temperature reached
> > > > > (101
> > > > > C), shutting down
> > > > > 
> > > > > However, we shouldn't do a thermal shutdown here, since
> > > > > 1) We may want to use a dedicated daemon, Intel's thermald in
> > > > > this case,
> > > > > to handle thermal shutdown.
> > > > > 
> > > > > 2) For ACPI based system, _CRT doesn't mean shutdown unless
> > > > > it's
> > > > > inside
> > > > > ThermalZone. ACPI Spec, 11.4.4 _CRT (Critical Temperature):
> > > > > "... If this object it present under a device, the device’s
> > > > > driver
> > > > > evaluates this object to determine the device’s critical
> > > > > cooling
> > > > > temperature trip point. This value may then be used by the
> > > > > device’s
> > > > > driver to program an internal device temperature sensor trip
> > > > > point."
> > > > > 
> > > > > So a "critical trip" here merely means we should take a more
> > > > > aggressive
> > > > > cooling method.
> > > > 
> > > > Well, actually it is stated before:
> > > > 
> > > > "This object, when defined under a thermal zone, returns the
> > > > critical
> > > > temperature at which OSPM must shutdown the system".
> > > 
> > > This means specifically for the ACPI ThermalZone in AML, e.g.:
> > > 
> > > ThermalZone (TZ0) {
> > > ....
> > >    Method(_CRT) { ... }
> > > } // end of TZ0
> > > 
> > > However the device is not under any ACPI ThermalZone.
> > > 
> > > > That is what does the thermal subsystem, no ?
> > > > 
> > > > > So add an indication to let thermal core know it should leave
> > > > > thermal
> > > > > device to userspace to handle.
> > > > 
> > > > You may want to check the 'HOT' trip point and then use the
> > > > notification
> > > > mechanism to get notified in userspace and take action from
> > > > there
> > > > (eg.
> > > > offline some CPUs).
> > > 
> > > For this particular issue we are facing, the thermal shutdown
> > > happens
> > > in thermal_zone_device_register() and userspace isn't up yet.
> > 
> > What about creating an new callback
> > 
> > enum thermal_trip_status {
> > 	THERMAL_TRIP_DISABLED = 0,
> > 	THERMAL_TRIP_ENABLED,
> > };
> > 
> > int get_trip_status(struct thermal_zone_device *, int trip, enum
> > thermal_trip_status *state);
> > 
> > Then in 
> > static void handle_thermal_trip(struct thermal_zone_device *tz, int
> > trip)
> > {
> > 
> > /* before tz->ops->get_trip_temp(tz, trip, &trip_temp); */
> > if (tz->ops->get_trip_status) {
> > 	enum thermal_trip_status *status;
> > 
> > 	if (!tz->ops->get_trip_status(tz, trip, &status)) {
> > 		if (status == THERMAL_TRIP_DISABLED)
> > 			return;	
> > 	}
> > }
> > ...
> > ...
> > 
> > }
> > 
> > 
> > This callback will help the cases:
> > - Allows drivers to selectively disable certain trips during init
> > state
> > or system resume where there can be spikes or always. int340x
> > drivers
> > can disable always.
> 
> This sounds really great. This is indeed can happen on system resume,
> before userspace process thaw.
> 
> > - Still give options for drivers to handle critical trip even if
> > they
> > are bound to user space governors. User space process may be dead,
> > so
> > still allow kernel to process graceful shutdown
> 
> To make the scenario happen, do we need a new sysfs to let usespace
> enable it with THERMAL_TRIP_ENABLED?
This should be drivers call not user space.

Thanks,
Srinivas


> 
> Kai-Heng
> 
> > Thanks,
> > Srinivas
> > 
> > > Kai-Heng
> > > 
> > > > > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > > > > ---
> > > > > drivers/thermal/thermal_core.c | 3 +++
> > > > > include/linux/thermal.h        | 2 ++
> > > > > 2 files changed, 5 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/thermal/thermal_core.c
> > > > > b/drivers/thermal/thermal_core.c
> > > > > index c6d74bc1c90b..6561e3767529 100644
> > > > > --- a/drivers/thermal/thermal_core.c
> > > > > +++ b/drivers/thermal/thermal_core.c
> > > > > @@ -1477,6 +1477,9 @@ thermal_zone_device_register(const char
> > > > > *type, int trips, int mask,
> > > > > 			goto unregister;
> > > > > 	}
> > > > > 
> > > > > +	if (tz->tzp && tz->tzp->userspace)
> > > > > +		thermal_zone_device_disable(tz);
> > > > > +
> > > > > 	mutex_lock(&thermal_list_lock);
> > > > > 	list_add_tail(&tz->node, &thermal_tz_list);
> > > > > 	mutex_unlock(&thermal_list_lock);
> > > > > diff --git a/include/linux/thermal.h
> > > > > b/include/linux/thermal.h
> > > > > index d07ea27e72a9..e8e8fac78fc8 100644
> > > > > --- a/include/linux/thermal.h
> > > > > +++ b/include/linux/thermal.h
> > > > > @@ -247,6 +247,8 @@ struct thermal_zone_params {
> > > > > 	 */
> > > > > 	bool no_hwmon;
> > > > > 
> > > > > +	bool userspace;
> > > > > +
> > > > > 	int num_tbps;	/* Number of tbp entries */
> > > > > 	struct thermal_bind_params *tbp;
> > > > > 
> > > > > 
> > > > 
> > > > -- 
> > > > <http://www.linaro.org/> Linaro.org │ Open source software for
> > > > ARM
> > > > SoCs
> > > > 
> > > > Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook
> > > > |
> > > > <http://twitter.com/#!/linaroorg> Twitter |
> > > > <http://www.linaro.org/linaro-blog/> Blog


  reply	other threads:[~2020-11-30 18:15 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-28 17:54 [PATCH 1/3] thermal: core: Add indication for userspace usage Kai-Heng Feng
2020-11-28 17:54 ` [PATCH 2/3] thermal: int340x: Indicate " Kai-Heng Feng
2020-11-30  5:29   ` Srinivas Pandruvada
2020-11-30  5:46     ` Kai-Heng Feng
2020-11-28 17:54 ` [PATCH 3/3] thermal: intel: intel_pch_thermal: " Kai-Heng Feng
2020-11-30  7:57 ` [PATCH 1/3] thermal: core: Add indication for " Daniel Lezcano
2020-11-30  8:23   ` Kai-Heng Feng
2020-11-30 16:19     ` Srinivas Pandruvada
2020-11-30 18:04       ` Kai-Heng Feng
2020-11-30 18:13         ` Srinivas Pandruvada [this message]
2020-11-30 18:22           ` Kai-Heng Feng
2020-11-30 18:39             ` Srinivas Pandruvada
2020-12-07  5:36               ` Kai-Heng Feng
2020-12-09  9:30                 ` Daniel Lezcano
2020-12-09 16:10                   ` Srinivas Pandruvada
2020-11-30  5:36 Kai-Heng Feng
2020-12-14 18:21 ` Matthew Garrett
2020-12-15 12:49   ` Kai-Heng Feng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=585bb5d3ee5bea063795682108576c3464ba72b6.camel@linux.intel.com \
    --to=srinivas.pandruvada@linux.intel.com \
    --cc=amitk@kernel.org \
    --cc=daniel.lezcano@linaro.org \
    --cc=kai.heng.feng@canonical.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=rui.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.