From: "Zhang, Rui" <rui.zhang@intel.com>
To: "viresh.kumar@linaro.org" <viresh.kumar@linaro.org>,
"Wysocki, Rafael J" <rafael.j.wysocki@intel.com>,
"daniel.lezcano@linaro.org" <daniel.lezcano@linaro.org>
Cc: "srinivas.pandruvada@linux.intel.com"
<srinivas.pandruvada@linux.intel.com>,
"linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: [Regression] cached max_state breaks ACPI processor cooling device
Date: Sat, 25 Feb 2023 05:29:59 +0000 [thread overview]
Message-ID: <53ec1f06f61c984100868926f282647e57ecfb2d.camel@intel.com> (raw)
Hi, All,
Starting from commit c408b3d1d9bb("thermal: Validate new state in
cur_state_store()") and commit a365105c685c("thermal: sysfs: Reuse
cdev->max_state"), the cdev->get_max_state() is evaluated only once
during cooling device registration.
This is done to fix the below Smatch static checker warning:
drivers/thermal/thermal_sysfs.c:656 thermal_cooling_device_stats_update()
warn: potential integer overflow from user 'stats->state *
stats->max_states + new_state'
reported here https://lore.kernel.org/all/Y0ltRJRjO7AkawvE@kili/.
But this actually breaks cooling devices which could have dynamic max
cooling state, like ACPI processor cooling device.
acpi_processor_driver_init
driver_register(&acpi_processor_driver);
acpi_processor_start
acpi_processor_thermal_init
thermal_cooling_device_register
processor_get_max_state
acpi_processor_cpufreq_init = true
The driver doesn't count cpufreq as cooling state until
acpi_processor_cpufreq_init is set later.
As a result, without the commits above,
/sys/class/thermal/cooling_device10/cur_state:0
/sys/class/thermal/cooling_device10/max_state:3
/sys/class/thermal/cooling_device10/type:Processor
after the commits above,
/sys/class/thermal/cooling_device10/cur_state:0
/sys/class/thermal/cooling_device10/max_state:0
/sys/class/thermal/cooling_device10/type:Processor
In order to fix this, there are something worth clarification IMO.
1. should we support dynamic max_state or not?
ACPI processor cooling state is a combination of processor p-states
and t-states.
t-states are static, but p-states can vary depends on processor
frequency driver loaded or not.
I'm not sure if there is any other user like this, but still this
is a valid use case of dynamic max_state.
2. how to handle dynamic max_state for cooling device statistics in
sysfs?
IMO, when max_state changes, the definition of previous cooling
device is changed as well, thus we should abandon the previous
statistics and restart counting.
3. anything else needs to be handled when max_state changes?
Say, as the definition of each cooling state is changed when
max_state changes, should we invalidate and re-update the
thermal instances of this cdev in each thermal zone device?
4. how to detect/handle max cooling states changes?
Should we do this as before, which invokes .get_max_state()
everywhere and do updates when necessary , or
a. cache max_state like we do today
b. introduce a new API for max_state change
c. invoke the new API in the cooling device driver explicitly when
max_state changes
d. update cached max_state value, statistics sysfs and
thermal_instances in the API
e. remove .get_max_state() callback as we register and update the
max_state with a fixed value each time.
thanks,
rui
next reply other threads:[~2023-02-25 5:30 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-02-25 5:29 Zhang, Rui [this message]
2023-02-25 5:34 ` [Regression] cached max_state breaks ACPI processor cooling device Zhang, Rui
2023-02-27 6:49 ` srinivas pandruvada
2023-02-27 12:03 ` Linux regression tracking #adding (Thorsten Leemhuis)
2023-03-02 19:29 ` Rafael J. Wysocki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53ec1f06f61c984100868926f282647e57ecfb2d.camel@intel.com \
--to=rui.zhang@intel.com \
--cc=daniel.lezcano@linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=rafael.j.wysocki@intel.com \
--cc=srinivas.pandruvada@linux.intel.com \
--cc=viresh.kumar@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).