From: Vadim Pasternak <vadimp@nvidia.com>
To: Daniel Lezcano <daniel.lezcano@linaro.org>,
"rui.zhang@intel.com" <rui.zhang@intel.com>
Cc: "linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
Ido Schimmel <idosch@nvidia.com>,
"Rafael J. Wysocki" <rjw@rjwysocki.net>
Subject: RE: [PATCH thermal 1/1] thermal/core: Skip cooling device statistics update for configuration operation
Date: Mon, 27 Sep 2021 13:29:16 +0000 [thread overview]
Message-ID: <BN9PR12MB5381EADD601B87E4F2C60EBFAFA79@BN9PR12MB5381.namprd12.prod.outlook.com> (raw)
In-Reply-To: <942558b3-e884-a907-0cc6-5eddf07c358a@linaro.org>
> -----Original Message-----
> From: Daniel Lezcano <daniel.lezcano@linaro.org>
> Sent: Monday, September 27, 2021 3:32 PM
> To: Vadim Pasternak <vadimp@nvidia.com>; rui.zhang@intel.com
> Cc: linux-pm@vger.kernel.org; Ido Schimmel <idosch@nvidia.com>; Rafael J.
> Wysocki <rjw@rjwysocki.net>
> Subject: Re: [PATCH thermal 1/1] thermal/core: Skip cooling device statistics
> update for configuration operation
>
>
>
>
> On 27/09/2021 13:22, Vadim Pasternak wrote:
> > Hi Daniel,
> >
> > Thank you for quick reply.
> >
> >> -----Original Message-----
> >> From: Daniel Lezcano <daniel.lezcano@linaro.org>
> >> Sent: Monday, September 27, 2021 1:42 PM
> >> To: Vadim Pasternak <vadimp@nvidia.com>; rui.zhang@intel.com
> >> Cc: =idosch@nvidia.com; linux-pm@vger.kernel.org
> >> Subject: Re: [PATCH thermal 1/1] thermal/core: Skip cooling device
> >> statistics update for configuration operation
> >>
> >>
> >> Hi Vadim,
> >>
> >>
> >> On 27/09/2021 10:24, Vadim Pasternak wrote:
> >>> The thermal subsystem maintains a transition table between states
> >>> that is allocated according to the maximum state supported by the
> >>> cooling device.
> >>>
> >>> When the table needs to be updated, the thermal subsystem does not
> >>> validate that the new state does not exceed the maximum state,
> >>> leading to out-of-bounds memory accesses [1].
> >>
> >> Actually, thermal_cooling_device_stats_update() is called if the
> >> set_cur_state is successful.
> >>
> >> With a state greater than the max state, the set_cur_state should
> >> fail and
> >> thermal_cooling_device_stats_update() is not called.
> >>
> >> Perhaps the problem is in mlxsw_thermal_set_cur_state() ?
> >
> > "mlxsw" thermal drivers has additional use of 'sysfs' 'cur_state' for
> > configuration purpose to limit minimum fan speed.
> > Fan speed minimum is enforced by setting 'cur_state' with value
> > exceeding actual fan speed maximum.
>
> Yes, and that is the problem because the driver is doing weird things with the
> cooling device state resulting in an abuse of the sysfs API and conflicting with
> the thermal internals.
>
>
> > This feature provides ability to limit fan speed according to some
> > system wise considerations, like absence of some replaceable units or
> > high system ambient temperature, or some other factors which
> > indirectly impacts system airflow.
>
> Is that a static thermal profile depending on the platform set by userspace or
> something which can be changed dynamically at runtime via eg. a daemon ?
Yes, this is some profiles/rules, which are system specific and according to these
rules userspace can limit fan speed. Like, for example:
- if one of power supplies is removed, system fan should be enforced to full
speed, because it makes a hole in a box, and it has hard impact on airflow.
- If port side ambient temperature reaches some threshold X1, fan speed should
be limited by Y1%, X2 - Y2%, etcetera.
- if temperature fault is detected for any optical transceivers - some limit is
required.
>
> > For example, if cooling devices operates at cooling levels from 1 to
> > 10
> > (1 for 10% fan speed, 10 for 100% fan speed), cooling device minimal
> > speed can be limited by setting 'cur_state' attribute through 'sysfs'
> > to the values from 'max_state' + 1 to 'max_state * 2' (from 11 to 20).
> > Following this example if value is set to 14 (40%) cooling levels
> > vector will be set to 4, 4, 4, 4, 4, 5, 6, 7, 8, 9, 10 for setting
> > device speed cooling states respectively in 40, 40, 40, 40, 40, 50,
> > 60. 70, 80, 90,
> > 100 percent. And it limits cooling device to operate only at 40% speed
> > and above.
> >
> > Maybe it would be worth adding earlier some dedicated 'cur_state_limit'
> > attribute for this feature, but it was not done.
> >
> > We have another driver required this feature and one new we are
> > developing now, which require fan minim speed limit as well.
>
> The use case is valid but I think the approach is wrong. Probably the simplest
> thing to do is to set a low trip point with a minimal fan speed.
For "trip_point_0_temp" there is the below definition:
{ /* In range - 0-40% PWM */
.type = THERMAL_TRIP_ACTIVE,
.temp = MLXSW_THERMAL_ASIC_TEMP_NORM,
.hyst = MLXSW_THERMAL_HYSTERESIS_TEMP,
.min_state = 0,
.max_state = (4 * MLXSW_THERMAL_MAX_STATE) / 10, (40%)
},
For "trip_point_1_temp":
{
/* In range - 40-100% PWM */
.type = THERMAL_TRIP_ACTIVE,
.temp = MLXSW_THERMAL_ASIC_TEMP_HIGH,
.hyst = MLXSW_THERMAL_HYSTERESIS_TEMP,
.min_state = (4 * MLXSW_THERMAL_MAX_STATE) / 10, (100%)
.max_state = MLXSW_THERMAL_MAX_STATE,
},
To limit cooling device by f.e 70%, I should change dynamically 'min_state' and 'max_state'
for both trips to (70%, 70%) and (70%, 100%). I am not sure I can do it?
And we have many customers, using this user space interface, it would be not so good to
change it.
I understand it is possible to handle this issue inside driver's set_cur_stat() callback by
returning positive value for configuration request.
But maybe this feature could you useful for other developers and it could be some common
interface to support it?
>
>
> --
> <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
>
> Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
> <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-
> blog/> Blog
next prev parent reply other threads:[~2021-09-27 13:29 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-27 8:24 [PATCH thermal 1/1] thermal/core: Skip cooling device statistics update for configuration operation Vadim Pasternak
2021-09-27 10:41 ` Daniel Lezcano
2021-09-27 11:22 ` Vadim Pasternak
2021-09-27 12:31 ` Daniel Lezcano
2021-09-27 13:29 ` Vadim Pasternak [this message]
2021-09-27 13:55 ` Daniel Lezcano
2021-09-27 17:52 ` Vadim Pasternak
2021-09-27 21:12 ` Daniel Lezcano
2021-09-28 11:35 ` Rafael J. Wysocki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=BN9PR12MB5381EADD601B87E4F2C60EBFAFA79@BN9PR12MB5381.namprd12.prod.outlook.com \
--to=vadimp@nvidia.com \
--cc=daniel.lezcano@linaro.org \
--cc=idosch@nvidia.com \
--cc=linux-pm@vger.kernel.org \
--cc=rjw@rjwysocki.net \
--cc=rui.zhang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).