* [PATCH] arch_topology: Trace the update thermal pressure
@ 2022-04-19 16:48 Lukasz Luba
2022-04-20 8:34 ` kernel test robot
2022-04-20 23:44 ` Steven Rostedt
0 siblings, 2 replies; 6+ messages in thread
From: Lukasz Luba @ 2022-04-19 16:48 UTC (permalink / raw)
To: linux-kernel
Cc: sudeep.holla, dietmar.eggemann, vincent.guittot, gregkh, rafael,
rostedt, mingo, lukasz.luba
Add trace event to capture the moment of the call for updating the thermal
pressure value. It's helpful to investigate how often those events occurs
in a system dealing with throttling. This trace event is needed since the
old 'cdev_update' might not be used by some drivers. Also, the new trace
event shows capacity value, not a cooling state.
We already have a trace point 'pelt_thermal_tp' which after a change to
trace event can be paired with this new 'thermal_pressure_update' and
derive more insight what is going on in the system under pressure
(and why).
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
---
drivers/base/arch_topology.c | 4 ++++
include/trace/events/thermal.h | 19 +++++++++++++++++++
2 files changed, 23 insertions(+)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1d6636ebaac5..4f0392de3081 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -19,6 +19,8 @@
#include <linux/rcupdate.h>
#include <linux/sched.h>
+#include <trace/events/thermal.h>
+
static DEFINE_PER_CPU(struct scale_freq_data __rcu *, sft_data);
static struct cpumask scale_freq_counters_mask;
static bool scale_freq_invariant;
@@ -195,6 +197,8 @@ void topology_update_thermal_pressure(const struct cpumask *cpus,
th_pressure = max_capacity - capacity;
+ trace_thermal_pressure_update(cpu, th_pressure);
+
for_each_cpu(cpu, cpus)
WRITE_ONCE(per_cpu(thermal_pressure, cpu), th_pressure);
}
diff --git a/include/trace/events/thermal.h b/include/trace/events/thermal.h
index 8a5f04888abd..1bf08ee1a25b 100644
--- a/include/trace/events/thermal.h
+++ b/include/trace/events/thermal.h
@@ -65,6 +65,25 @@ TRACE_EVENT(cdev_update,
TP_printk("type=%s target=%lu", __get_str(type), __entry->target)
);
+TRACE_EVENT(thermal_pressure_update,
+
+ TP_PROTO(int cpu, unsigned long thermal_pressure),
+
+ TP_ARGS(cpu, thermal_pressure),
+
+ TP_STRUCT__entry(
+ __field(int, cpu)
+ __field(unsigned long, thermal_pressure)
+ ),
+
+ TP_fast_assign(
+ __entry->cpu = cpu;
+ __entry->thermal_pressure = thermal_pressure;
+ ),
+
+ TP_printk("cpu=%d thermal_pressure=%lu", __entry->cpu, __entry->thermal_pressure)
+);
+
TRACE_EVENT(thermal_zone_trip,
TP_PROTO(struct thermal_zone_device *tz, int trip,
--
2.17.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] arch_topology: Trace the update thermal pressure
2022-04-19 16:48 [PATCH] arch_topology: Trace the update thermal pressure Lukasz Luba
@ 2022-04-20 8:34 ` kernel test robot
2022-04-25 10:05 ` Lukasz Luba
2022-04-20 23:44 ` Steven Rostedt
1 sibling, 1 reply; 6+ messages in thread
From: kernel test robot @ 2022-04-20 8:34 UTC (permalink / raw)
To: Lukasz Luba, linux-kernel
Cc: kbuild-all, sudeep.holla, dietmar.eggemann, vincent.guittot,
gregkh, rafael, rostedt, mingo, lukasz.luba
Hi Lukasz,
I love your patch! Yet something to improve:
[auto build test ERROR on driver-core/driver-core-testing]
[also build test ERROR on rostedt-trace/for-next linus/master linux/master v5.18-rc3 next-20220419]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/intel-lab-lkp/linux/commits/Lukasz-Luba/arch_topology-Trace-the-update-thermal-pressure/20220420-005845
base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git 3123109284176b1532874591f7c81f3837bbdc17
config: riscv-randconfig-c024-20220420 (https://download.01.org/0day-ci/archive/20220420/202204201654.vcszVDGb-lkp@intel.com/config)
compiler: riscv64-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/aeff700a88be6a7469acfc312155bd213f76de95
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Lukasz-Luba/arch_topology-Trace-the-update-thermal-pressure/20220420-005845
git checkout aeff700a88be6a7469acfc312155bd213f76de95
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
riscv64-linux-ld: drivers/base/arch_topology.o: in function `.L18':
>> arch_topology.c:(.text+0x136): undefined reference to `__traceiter_thermal_pressure_update'
riscv64-linux-ld: drivers/base/arch_topology.o: in function `.L20':
>> arch_topology.c:(.text+0x168): undefined reference to `__tracepoint_thermal_pressure_update'
>> riscv64-linux-ld: arch_topology.c:(.text+0x16c): undefined reference to `__tracepoint_thermal_pressure_update'
riscv64-linux-ld: drivers/base/arch_topology.o: in function `.L0 ':
>> arch_topology.c:(__jump_table+0x8): undefined reference to `__tracepoint_thermal_pressure_update'
--
0-DAY CI Kernel Test Service
https://01.org/lkp
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] arch_topology: Trace the update thermal pressure
2022-04-19 16:48 [PATCH] arch_topology: Trace the update thermal pressure Lukasz Luba
2022-04-20 8:34 ` kernel test robot
@ 2022-04-20 23:44 ` Steven Rostedt
2022-04-21 6:26 ` Lukasz Luba
1 sibling, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2022-04-20 23:44 UTC (permalink / raw)
To: Lukasz Luba
Cc: linux-kernel, sudeep.holla, dietmar.eggemann, vincent.guittot,
gregkh, rafael, mingo
On Tue, 19 Apr 2022 17:48:01 +0100
Lukasz Luba <lukasz.luba@arm.com> wrote:
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 1d6636ebaac5..4f0392de3081 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -19,6 +19,8 @@
> #include <linux/rcupdate.h>
> #include <linux/sched.h>
>
> +#include <trace/events/thermal.h>
> +
> static DEFINE_PER_CPU(struct scale_freq_data __rcu *, sft_data);
> static struct cpumask scale_freq_counters_mask;
> static bool scale_freq_invariant;
> @@ -195,6 +197,8 @@ void topology_update_thermal_pressure(const struct cpumask *cpus,
>
> th_pressure = max_capacity - capacity;
>
> + trace_thermal_pressure_update(cpu, th_pressure);
> +
> for_each_cpu(cpu, cpus)
> WRITE_ONCE(per_cpu(thermal_pressure, cpu), th_pressure);
> }
> diff --git a/include/trace/events/thermal.h b/include/trace/events/thermal.h
> index 8a5f04888abd..1bf08ee1a25b 100644
> --- a/include/trace/events/thermal.h
> +++ b/include/trace/events/thermal.h
> @@ -65,6 +65,25 @@ TRACE_EVENT(cdev_update,
> TP_printk("type=%s target=%lu", __get_str(type), __entry->target)
> );
>
> +TRACE_EVENT(thermal_pressure_update,
> +
> + TP_PROTO(int cpu, unsigned long thermal_pressure),
> +
> + TP_ARGS(cpu, thermal_pressure),
> +
> + TP_STRUCT__entry(
> + __field(int, cpu)
> + __field(unsigned long, thermal_pressure)
Note, it is always best to place the bigger object before the smaller one
(when not properly aligned), as that will help to prevent structure
"holes". That is:
__field(unsigned long, thermal_pressure)
__field(int, cpu)
Otherwise, you are pretty much guaranteed to have a 4 byte hole between cpu
and thermal_pressure on 64 bit machines.
Also, for the warning you got from the test robot, if you are using this in
a module and defining it in the core kernel, you need to add:
EXPORT_TRACEPOINT_SYMBOL_GPL(thermal_pressure_update);
Somewhere in the C file that includes this file and defines
CREATE_TRACE_POINTS.
-- Steve
> + ),
> +
> + TP_fast_assign(
> + __entry->cpu = cpu;
> + __entry->thermal_pressure = thermal_pressure;
> + ),
> +
> + TP_printk("cpu=%d thermal_pressure=%lu", __entry->cpu, __entry->thermal_pressure)
> +);
> +
> TRACE_EVENT(thermal_zone_trip,
>
> TP_PROTO(struct thermal_zone_device *tz, int trip,
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] arch_topology: Trace the update thermal pressure
2022-04-20 23:44 ` Steven Rostedt
@ 2022-04-21 6:26 ` Lukasz Luba
0 siblings, 0 replies; 6+ messages in thread
From: Lukasz Luba @ 2022-04-21 6:26 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, sudeep.holla, dietmar.eggemann, vincent.guittot,
gregkh, rafael, mingo
On 4/21/22 00:44, Steven Rostedt wrote:
> On Tue, 19 Apr 2022 17:48:01 +0100
> Lukasz Luba <lukasz.luba@arm.com> wrote:
>
>> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
>> index 1d6636ebaac5..4f0392de3081 100644
>> --- a/drivers/base/arch_topology.c
>> +++ b/drivers/base/arch_topology.c
>> @@ -19,6 +19,8 @@
>> #include <linux/rcupdate.h>
>> #include <linux/sched.h>
>>
>> +#include <trace/events/thermal.h>
>> +
>> static DEFINE_PER_CPU(struct scale_freq_data __rcu *, sft_data);
>> static struct cpumask scale_freq_counters_mask;
>> static bool scale_freq_invariant;
>> @@ -195,6 +197,8 @@ void topology_update_thermal_pressure(const struct cpumask *cpus,
>>
>> th_pressure = max_capacity - capacity;
>>
>> + trace_thermal_pressure_update(cpu, th_pressure);
>> +
>> for_each_cpu(cpu, cpus)
>> WRITE_ONCE(per_cpu(thermal_pressure, cpu), th_pressure);
>> }
>> diff --git a/include/trace/events/thermal.h b/include/trace/events/thermal.h
>> index 8a5f04888abd..1bf08ee1a25b 100644
>> --- a/include/trace/events/thermal.h
>> +++ b/include/trace/events/thermal.h
>> @@ -65,6 +65,25 @@ TRACE_EVENT(cdev_update,
>> TP_printk("type=%s target=%lu", __get_str(type), __entry->target)
>> );
>>
>> +TRACE_EVENT(thermal_pressure_update,
>> +
>> + TP_PROTO(int cpu, unsigned long thermal_pressure),
>> +
>> + TP_ARGS(cpu, thermal_pressure),
>> +
>> + TP_STRUCT__entry(
>> + __field(int, cpu)
>> + __field(unsigned long, thermal_pressure)
>
> Note, it is always best to place the bigger object before the smaller one
> (when not properly aligned), as that will help to prevent structure
> "holes". That is:
>
> __field(unsigned long, thermal_pressure)
> __field(int, cpu)
>
>
> Otherwise, you are pretty much guaranteed to have a 4 byte hole between cpu
> and thermal_pressure on 64 bit machines.
>
> Also, for the warning you got from the test robot, if you are using this in
> a module and defining it in the core kernel, you need to add:
>
> EXPORT_TRACEPOINT_SYMBOL_GPL(thermal_pressure_update);
>
> Somewhere in the C file that includes this file and defines
> CREATE_TRACE_POINTS.
>
> -- Steve
>
>
Thank you Steve, I'll swap those fields and add the needed export.
I have to go through those older trace events in that file and
understand if they are correct, since I just followed the pattern
in 'cdev_update'.
Regards,
Lukasz
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] arch_topology: Trace the update thermal pressure
2022-04-20 8:34 ` kernel test robot
@ 2022-04-25 10:05 ` Lukasz Luba
0 siblings, 0 replies; 6+ messages in thread
From: Lukasz Luba @ 2022-04-25 10:05 UTC (permalink / raw)
To: kernel test robot, linux-kernel
Cc: kbuild-all, sudeep.holla, dietmar.eggemann, vincent.guittot,
gregkh, rafael, rostedt, mingo
On 4/20/22 09:34, kernel test robot wrote:
> Hi Lukasz,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on driver-core/driver-core-testing]
> [also build test ERROR on rostedt-trace/for-next linus/master linux/master v5.18-rc3 next-20220419]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Lukasz-Luba/arch_topology-Trace-the-update-thermal-pressure/20220420-005845
> base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git 3123109284176b1532874591f7c81f3837bbdc17
> config: riscv-randconfig-c024-20220420 (https://download.01.org/0day-ci/archive/20220420/202204201654.vcszVDGb-lkp@intel.com/config)
> compiler: riscv64-linux-gcc (GCC) 11.2.0
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # https://github.com/intel-lab-lkp/linux/commit/aeff700a88be6a7469acfc312155bd213f76de95
> git remote add linux-review https://github.com/intel-lab-lkp/linux
> git fetch --no-tags linux-review Lukasz-Luba/arch_topology-Trace-the-update-thermal-pressure/20220420-005845
> git checkout aeff700a88be6a7469acfc312155bd213f76de95
> # save the config file
> mkdir build_dir && cp config build_dir/.config
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All errors (new ones prefixed by >>):
>
> riscv64-linux-ld: drivers/base/arch_topology.o: in function `.L18':
>>> arch_topology.c:(.text+0x136): undefined reference to `__traceiter_thermal_pressure_update'
> riscv64-linux-ld: drivers/base/arch_topology.o: in function `.L20':
>>> arch_topology.c:(.text+0x168): undefined reference to `__tracepoint_thermal_pressure_update'
>>> riscv64-linux-ld: arch_topology.c:(.text+0x16c): undefined reference to `__tracepoint_thermal_pressure_update'
> riscv64-linux-ld: drivers/base/arch_topology.o: in function `.L0 ':
>>> arch_topology.c:(__jump_table+0x8): undefined reference to `__tracepoint_thermal_pressure_update'
>
I've reproduced this issue and experimented with a few solutions.
that config file doesn't set CONFIG_THERMAL, where normally we have
in thermal_core.c the:
24 #define CREATE_TRACE_POINTS
25 #include <trace/events/thermal.h>
(similar mechanism we have for thermal_power_allocator.h and IPA.)
We normally test w/ this config, but I'll also start checking the build
w/o thermal subsystem.
I cannot add those two lines in the arch_topology.c since it complains
about multiple definitions for many entries.
Thus, I'll create a separate header thermal_pressure.h to put it in the
arch_topology.c.
I'll also export the symbol as suggested by Steve, so some potential
cpufreq modules could just us it from there.
I'll also re-visit the trace events that we have in thermal subsystem,
since they are not exported but some modules might would like
to use (some of) them...
Regards,
Lukasz
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] arch_topology: Trace the update thermal pressure
@ 2022-04-25 10:05 ` Lukasz Luba
0 siblings, 0 replies; 6+ messages in thread
From: Lukasz Luba @ 2022-04-25 10:05 UTC (permalink / raw)
To: kbuild-all
[-- Attachment #1: Type: text/plain, Size: 3415 bytes --]
On 4/20/22 09:34, kernel test robot wrote:
> Hi Lukasz,
>
> I love your patch! Yet something to improve:
>
> [auto build test ERROR on driver-core/driver-core-testing]
> [also build test ERROR on rostedt-trace/for-next linus/master linux/master v5.18-rc3 next-20220419]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Lukasz-Luba/arch_topology-Trace-the-update-thermal-pressure/20220420-005845
> base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git 3123109284176b1532874591f7c81f3837bbdc17
> config: riscv-randconfig-c024-20220420 (https://download.01.org/0day-ci/archive/20220420/202204201654.vcszVDGb-lkp(a)intel.com/config)
> compiler: riscv64-linux-gcc (GCC) 11.2.0
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # https://github.com/intel-lab-lkp/linux/commit/aeff700a88be6a7469acfc312155bd213f76de95
> git remote add linux-review https://github.com/intel-lab-lkp/linux
> git fetch --no-tags linux-review Lukasz-Luba/arch_topology-Trace-the-update-thermal-pressure/20220420-005845
> git checkout aeff700a88be6a7469acfc312155bd213f76de95
> # save the config file
> mkdir build_dir && cp config build_dir/.config
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All errors (new ones prefixed by >>):
>
> riscv64-linux-ld: drivers/base/arch_topology.o: in function `.L18':
>>> arch_topology.c:(.text+0x136): undefined reference to `__traceiter_thermal_pressure_update'
> riscv64-linux-ld: drivers/base/arch_topology.o: in function `.L20':
>>> arch_topology.c:(.text+0x168): undefined reference to `__tracepoint_thermal_pressure_update'
>>> riscv64-linux-ld: arch_topology.c:(.text+0x16c): undefined reference to `__tracepoint_thermal_pressure_update'
> riscv64-linux-ld: drivers/base/arch_topology.o: in function `.L0 ':
>>> arch_topology.c:(__jump_table+0x8): undefined reference to `__tracepoint_thermal_pressure_update'
>
I've reproduced this issue and experimented with a few solutions.
that config file doesn't set CONFIG_THERMAL, where normally we have
in thermal_core.c the:
24 #define CREATE_TRACE_POINTS
25 #include <trace/events/thermal.h>
(similar mechanism we have for thermal_power_allocator.h and IPA.)
We normally test w/ this config, but I'll also start checking the build
w/o thermal subsystem.
I cannot add those two lines in the arch_topology.c since it complains
about multiple definitions for many entries.
Thus, I'll create a separate header thermal_pressure.h to put it in the
arch_topology.c.
I'll also export the symbol as suggested by Steve, so some potential
cpufreq modules could just us it from there.
I'll also re-visit the trace events that we have in thermal subsystem,
since they are not exported but some modules might would like
to use (some of) them...
Regards,
Lukasz
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2022-04-25 10:06 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-19 16:48 [PATCH] arch_topology: Trace the update thermal pressure Lukasz Luba
2022-04-20 8:34 ` kernel test robot
2022-04-25 10:05 ` Lukasz Luba
2022-04-25 10:05 ` Lukasz Luba
2022-04-20 23:44 ` Steven Rostedt
2022-04-21 6:26 ` Lukasz Luba
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.