* [PATCH] thermal: add ratelimited thermal and power logging
@ 2018-10-22 20:29 Ross Zwisler
2018-10-24 7:22 ` Viresh Kumar
0 siblings, 1 reply; 3+ messages in thread
From: Ross Zwisler @ 2018-10-22 20:29 UTC (permalink / raw)
To: linux-kernel
Cc: Ricky Liang, Amit Daniel Kachhap, Daniel Lezcano,
Eduardo Valentin, Javi Merino, Viresh Kumar, Zhang Rui, linux-pm,
Stephen Barber, Ross Zwisler
From: Ricky Liang <jcliang@chromium.org>
Add thermal logs in devfreq_cooling and cpu_cooling. Also add logging to
power_allocator when it starts to control power.
These changes can lead to excessive log spam when running up against
thermal limits, so have this logging ratelimited to allow only 1 log each
30 seconds from each of those subsystems.
Signed-off-by: Ricky Liang <jcliang@chromium.org>
Co-authored-by: Stephen Barber <smbarber@chromium.org>
Signed-off-by: Stephen Barber <smbarber@chromium.org>
Reviewed-by: Daniel Kurtz <djkurtz@chromium.org>
[ rez: squashed initial implementation & fixes, updated changelog for
upstream. ]
Signed-off-by: Ross Zwisler <zwisler@google.com>
---
drivers/thermal/cpu_cooling.c | 16 ++++++++++++++++
drivers/thermal/devfreq_cooling.c | 12 ++++++++++++
drivers/thermal/power_allocator.c | 17 +++++++++++++++++
3 files changed, 45 insertions(+)
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index dfd23245f778a..d8d1855d7d991 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -31,9 +31,17 @@
#include <linux/slab.h>
#include <linux/cpu.h>
#include <linux/cpu_cooling.h>
+#include <linux/ratelimit.h>
#include <trace/events/thermal.h>
+static DEFINE_RATELIMIT_STATE(cpu_cooling_ratelimit_state, 30 * HZ, 1);
+
+static int cpu_cooling_ratelimit(void)
+{
+ return __ratelimit(&cpu_cooling_ratelimit_state);
+}
+
/*
* Cooling state <-> CPUFreq frequency
*
@@ -389,6 +397,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
{
struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
unsigned int clip_freq;
+ struct device *cpu_dev;
/* Request state should be less than max_level */
if (WARN_ON(state > cpufreq_cdev->max_level))
@@ -404,6 +413,13 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
cpufreq_update_policy(cpufreq_cdev->policy->cpu);
+ if (cpu_cooling_ratelimit()) {
+ cpu_dev = get_cpu_device(cpufreq_cdev->policy->cpu);
+ dev_info(cpu_dev,
+ "Cooling state set to %lu. New max freq = %u\n",
+ state, clip_freq);
+ }
+
return 0;
}
diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
index ef59256887ff6..f95c7f513f05a 100644
--- a/drivers/thermal/devfreq_cooling.c
+++ b/drivers/thermal/devfreq_cooling.c
@@ -25,6 +25,7 @@
#include <linux/slab.h>
#include <linux/pm_opp.h>
#include <linux/thermal.h>
+#include <linux/ratelimit.h>
#include <trace/events/thermal.h>
@@ -32,6 +33,13 @@
static DEFINE_IDA(devfreq_ida);
+static DEFINE_RATELIMIT_STATE(devfreq_cooling_ratelimit_state, 30 * HZ, 1);
+
+static int devfreq_cooling_ratelimit(void)
+{
+ return __ratelimit(&devfreq_cooling_ratelimit_state);
+}
+
/**
* struct devfreq_cooling_device - Devfreq cooling device
* @id: unique integer value corresponding to each
@@ -150,6 +158,10 @@ static int devfreq_cooling_set_cur_state(struct thermal_cooling_device *cdev,
dfc->cooling_state = state;
+ if (devfreq_cooling_ratelimit())
+ dev_info(dev, "Cooling state set to %lu. New max freq = %u\n",
+ state, dfc->freq_table[state]);
+
return 0;
}
diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal/power_allocator.c
index 3055f9a12a170..5140a07fe60aa 100644
--- a/drivers/thermal/power_allocator.c
+++ b/drivers/thermal/power_allocator.c
@@ -15,6 +15,7 @@
#define pr_fmt(fmt) "Power allocator: " fmt
+#include <linux/ratelimit.h>
#include <linux/rculist.h>
#include <linux/slab.h>
#include <linux/thermal.h>
@@ -30,6 +31,13 @@
#define int_to_frac(x) ((x) << FRAC_BITS)
#define frac_to_int(x) ((x) >> FRAC_BITS)
+static DEFINE_RATELIMIT_STATE(power_allocator_ratelimit_state, 30 * HZ, 1);
+
+static int power_allocator_ratelimit(void)
+{
+ return __ratelimit(&power_allocator_ratelimit_state);
+}
+
/**
* mul_frac() - multiply two fixed-point numbers
* @x: first multiplicand
@@ -443,6 +451,15 @@ static int allocate_power(struct thermal_zone_device *tz,
max_allocatable_power, tz->temperature,
control_temp - tz->temperature);
+ if (total_granted_power < total_req_power &&
+ power_allocator_ratelimit()) {
+ dev_info(&tz->device, "Controlling power: control_temp=%d "
+ "last_temp=%d, curr_temp=%d total_requested_power=%d "
+ "total_granted_power=%d\n", control_temp,
+ tz->last_temperature, tz->temperature,
+ total_req_power, total_granted_power);
+ }
+
kfree(req_power);
unlock:
mutex_unlock(&tz->lock);
--
2.19.1.568.g152ad8e336-goog
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] thermal: add ratelimited thermal and power logging
2018-10-22 20:29 [PATCH] thermal: add ratelimited thermal and power logging Ross Zwisler
@ 2018-10-24 7:22 ` Viresh Kumar
2018-10-29 13:21 ` Ross Zwisler
0 siblings, 1 reply; 3+ messages in thread
From: Viresh Kumar @ 2018-10-24 7:22 UTC (permalink / raw)
To: Ross Zwisler
Cc: linux-kernel, Ricky Liang, Amit Daniel Kachhap, Daniel Lezcano,
Eduardo Valentin, Javi Merino, Zhang Rui, linux-pm,
Stephen Barber
On 22-10-18, 14:29, Ross Zwisler wrote:
> From: Ricky Liang <jcliang@chromium.org>
>
> Add thermal logs in devfreq_cooling and cpu_cooling.
Why should we add them ?
> Also add logging to
> power_allocator when it starts to control power.
>
> These changes can lead to excessive log spam when running up against
> thermal limits, so have this logging ratelimited to allow only 1 log each
> 30 seconds from each of those subsystems.
What's the use of these logs when we are going to print them only once every 30
seconds ?
I recently extended thermal sysfs support to share more stats.
commit 8ea229511e06 ("thermal: Add cooling device's statistics in sysfs")
Will that be helpful in your case ?
Otherwise we should probably add trace points instead.
--
viresh
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] thermal: add ratelimited thermal and power logging
2018-10-24 7:22 ` Viresh Kumar
@ 2018-10-29 13:21 ` Ross Zwisler
0 siblings, 0 replies; 3+ messages in thread
From: Ross Zwisler @ 2018-10-29 13:21 UTC (permalink / raw)
To: viresh.kumar
Cc: linux-kernel, Ricky Liang, amit.kachhap, daniel.lezcano,
edubezval, javi.merino, rui.zhang, linux-pm, Stephen Barber
On Wed, Oct 24, 2018 at 1:22 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 22-10-18, 14:29, Ross Zwisler wrote:
> > From: Ricky Liang <jcliang@chromium.org>
> >
> > Add thermal logs in devfreq_cooling and cpu_cooling.
>
> Why should we add them ?
>
> > Also add logging to
> > power_allocator when it starts to control power.
> >
> > These changes can lead to excessive log spam when running up against
> > thermal limits, so have this logging ratelimited to allow only 1 log each
> > 30 seconds from each of those subsystems.
>
> What's the use of these logs when we are going to print them only once every 30
> seconds ?
>
> I recently extended thermal sysfs support to share more stats.
>
> commit 8ea229511e06 ("thermal: Add cooling device's statistics in sysfs")
>
> Will that be helpful in your case ?
>
> Otherwise we should probably add trace points instead.
Thank you for the review. Basically we use these prints to get a
notification when a system is having thermal issues. It's easy to
look in dmesg and see the prints and know that something temperature
related is going on.
However, I agree that the current solution is a bit hacky, and in
looking at it a bit further we don't even cover all the paths that we
need to. The processor_set_cur_state() function in
drivers/acpi/processor_thermal.c, for example, is used on the x86_64
systems I'm testing with and wasn't augmented with prints.
I'm going to take a step back and try and find another solution. The
info you added to sysfs looks very promising, thank you for pointing
it out.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2018-10-29 13:21 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-22 20:29 [PATCH] thermal: add ratelimited thermal and power logging Ross Zwisler
2018-10-24 7:22 ` Viresh Kumar
2018-10-29 13:21 ` Ross Zwisler
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.