* Cleanup and fixes for power trace events
@ 2010-10-26 23:43 Thomas Renninger
2010-10-26 23:43 ` [PATCH 1/4] PERF: Do not export power_frequency, but power_start event Thomas Renninger
` (7 more replies)
0 siblings, 8 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-10-26 23:43 UTC (permalink / raw)
To: arjan, jean.pihet, mingo, rjw, linux-omap, linux-pm, linux-trace-users
[PATCH 1/4] PERF: Do not export power_frequency, but power_start event
[PATCH 2/4] PERF(kernel): Cleanup power events V3
[PATCH 3/4] PERF(userspace): Adjust perf timechart to the new power events V3
[PATCH 4/4] PERF: fix power:cpu_idle double end events
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/4] PERF: Do not export power_frequency, but power_start event
2010-10-26 23:43 Cleanup and fixes for power trace events Thomas Renninger
2010-10-26 23:43 ` [PATCH 1/4] PERF: Do not export power_frequency, but power_start event Thomas Renninger
@ 2010-10-26 23:43 ` Thomas Renninger
2010-10-26 23:43 ` [PATCH 2/4] PERF(kernel): Cleanup power events V3 Thomas Renninger
` (5 subsequent siblings)
7 siblings, 0 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-10-26 23:43 UTC (permalink / raw)
To: arjan, jean.pihet, mingo, rjw, linux-omap, linux-pm, linux-trace-users
power_frequency moved to drivers/cpufreq/cpufreq.c which has
to be compiled in, no need to export it.
intel_idle can a be module though...
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: linux-omap@vger.kernel.org
CC: linux-pm@lists.linux-foundation.org
CC: linux-trace-users@vger.kernel.org
CC: Jean Pihet <jean.pihet@newoldbits.com>
CC: Arjan van de Ven <arjan@linux.intel.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: rjw@sisk.pl
---
drivers/idle/intel_idle.c | 2 --
kernel/trace/power-traces.c | 2 +-
2 files changed, 1 insertions(+), 3 deletions(-)
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index c37ef64..21ac077 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -201,9 +201,7 @@ static int intel_idle(struct cpuidle_device *dev, struct cpuidle_state *state)
kt_before = ktime_get_real();
stop_critical_timings();
-#ifndef MODULE
trace_power_start(POWER_CSTATE, (eax >> 4) + 1, cpu);
-#endif
if (!need_resched()) {
__monitor((void *)¤t_thread_info()->flags, 0, 0);
diff --git a/kernel/trace/power-traces.c b/kernel/trace/power-traces.c
index a22582a..0e0497d 100644
--- a/kernel/trace/power-traces.c
+++ b/kernel/trace/power-traces.c
@@ -13,5 +13,5 @@
#define CREATE_TRACE_POINTS
#include <trace/events/power.h>
-EXPORT_TRACEPOINT_SYMBOL_GPL(power_frequency);
+EXPORT_TRACEPOINT_SYMBOL_GPL(power_start);
--
1.6.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 1/4] PERF: Do not export power_frequency, but power_start event
2010-10-26 23:43 Cleanup and fixes for power trace events Thomas Renninger
@ 2010-10-26 23:43 ` Thomas Renninger
2010-10-26 23:43 ` Thomas Renninger
` (6 subsequent siblings)
7 siblings, 0 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-10-26 23:43 UTC (permalink / raw)
To: arjan, jean.pihet, mingo, rjw, linux-omap, linux-pm, linux-trace-users
Cc: Thomas Renninger
power_frequency moved to drivers/cpufreq/cpufreq.c which has
to be compiled in, no need to export it.
intel_idle can a be module though...
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: linux-omap@vger.kernel.org
CC: linux-pm@lists.linux-foundation.org
CC: linux-trace-users@vger.kernel.org
CC: Jean Pihet <jean.pihet@newoldbits.com>
CC: Arjan van de Ven <arjan@linux.intel.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: rjw@sisk.pl
---
drivers/idle/intel_idle.c | 2 --
kernel/trace/power-traces.c | 2 +-
2 files changed, 1 insertions(+), 3 deletions(-)
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index c37ef64..21ac077 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -201,9 +201,7 @@ static int intel_idle(struct cpuidle_device *dev, struct cpuidle_state *state)
kt_before = ktime_get_real();
stop_critical_timings();
-#ifndef MODULE
trace_power_start(POWER_CSTATE, (eax >> 4) + 1, cpu);
-#endif
if (!need_resched()) {
__monitor((void *)¤t_thread_info()->flags, 0, 0);
diff --git a/kernel/trace/power-traces.c b/kernel/trace/power-traces.c
index a22582a..0e0497d 100644
--- a/kernel/trace/power-traces.c
+++ b/kernel/trace/power-traces.c
@@ -13,5 +13,5 @@
#define CREATE_TRACE_POINTS
#include <trace/events/power.h>
-EXPORT_TRACEPOINT_SYMBOL_GPL(power_frequency);
+EXPORT_TRACEPOINT_SYMBOL_GPL(power_start);
--
1.6.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 2/4] PERF(kernel): Cleanup power events V3
2010-10-26 23:43 Cleanup and fixes for power trace events Thomas Renninger
2010-10-26 23:43 ` [PATCH 1/4] PERF: Do not export power_frequency, but power_start event Thomas Renninger
2010-10-26 23:43 ` Thomas Renninger
@ 2010-10-26 23:43 ` Thomas Renninger
2010-10-26 23:43 ` Thomas Renninger
` (4 subsequent siblings)
7 siblings, 0 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-10-26 23:43 UTC (permalink / raw)
To: arjan, jean.pihet, mingo, rjw, linux-omap, linux-pm, linux-trace-users
Changes in V3:
- PWR_EVENT_EXIT is -1 now instead of 0xFFFFFFFF
- Use cpu_{idle,frequency} instead of processor_{idle,frequency}
events
- Fixed a copy and paste bug (poll_idle should throw and event
state of zero, not 1)
Changes in V2:
- Introduce PWR_EVENT_EXIT instead of 0 to mark non-power state
- Use u32 instead of u64 for cpuid, state which is by far enough
New power trace events:
power:cpu_idle
power:cpu_frequency
power:machine_suspend
C-state/idle accounting events:
power:power_start
power:power_end
are replaced with:
power:cpu_idle
and
power:power_frequency
is replaced with:
power:cpu_frequency
power:machine_suspend
is newly introduced, a first implementation
comes from the ARM side, but it's easy to add these events
in X86 as well if needed.
the type= field got removed from both, it was never
used and the type is differed by the event type itself.
perf timechart
userspace tool gets adjusted in a separate patch.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
CC: linux-omap@vger.kernel.org
CC: linux-pm@lists.linux-foundation.org
CC: linux-trace-users@vger.kernel.org
CC: Jean Pihet <jean.pihet@newoldbits.com>
CC: Arjan van de Ven <arjan@linux.intel.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: rjw@sisk.pl
---
arch/x86/kernel/process.c | 7 +++-
arch/x86/kernel/process_32.c | 2 +-
arch/x86/kernel/process_64.c | 2 +
drivers/cpufreq/cpufreq.c | 1 +
drivers/cpuidle/cpuidle.c | 1 +
drivers/idle/intel_idle.c | 1 +
include/trace/events/power.h | 87 +++++++++++++++++++++++++++++++++++++++++-
kernel/trace/Kconfig | 14 +++++++
kernel/trace/power-traces.c | 3 +
9 files changed, 115 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 57d1868..155d975 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -374,6 +374,7 @@ void default_idle(void)
{
if (hlt_use_halt()) {
trace_power_start(POWER_CSTATE, 1, smp_processor_id());
+ trace_cpu_idle(1, smp_processor_id());
current_thread_info()->status &= ~TS_POLLING;
/*
* TS_POLLING-cleared state must be visible before we
@@ -444,6 +445,7 @@ EXPORT_SYMBOL_GPL(cpu_idle_wait);
void mwait_idle_with_hints(unsigned long ax, unsigned long cx)
{
trace_power_start(POWER_CSTATE, (ax>>4)+1, smp_processor_id());
+ trace_cpu_idle((ax>>4)+1, smp_processor_id());
if (!need_resched()) {
if (cpu_has(¤t_cpu_data, X86_FEATURE_CLFLUSH_MONITOR))
clflush((void *)¤t_thread_info()->flags);
@@ -460,6 +462,7 @@ static void mwait_idle(void)
{
if (!need_resched()) {
trace_power_start(POWER_CSTATE, 1, smp_processor_id());
+ trace_cpu_idle(1, smp_processor_id());
if (cpu_has(¤t_cpu_data, X86_FEATURE_CLFLUSH_MONITOR))
clflush((void *)¤t_thread_info()->flags);
@@ -481,10 +484,12 @@ static void mwait_idle(void)
static void poll_idle(void)
{
trace_power_start(POWER_CSTATE, 0, smp_processor_id());
+ trace_cpu_idle(0, smp_processor_id());
local_irq_enable();
while (!need_resched())
cpu_relax();
- trace_power_end(0);
+ trace_power_end(smp_processor_id());
+ trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
/*
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 96586c3..4b9befa 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -113,8 +113,8 @@ void cpu_idle(void)
stop_critical_timings();
pm_idle();
start_critical_timings();
-
trace_power_end(smp_processor_id());
+ trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
tick_nohz_restart_sched_tick();
preempt_enable_no_resched();
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 3d9ea53..28153a9 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -142,6 +142,8 @@ void cpu_idle(void)
start_critical_timings();
trace_power_end(smp_processor_id());
+ trace_cpu_idle(PWR_EVENT_EXIT,
+ smp_processor_id());
/* In many cases the interrupt that ended idle
has already called exit_idle. But some idle
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 199dcb9..ed4919e 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -355,6 +355,7 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
dprintk("FREQ: %lu - CPU: %lu", (unsigned long)freqs->new,
(unsigned long)freqs->cpu);
trace_power_frequency(POWER_PSTATE, freqs->new, freqs->cpu);
+ trace_cpu_frequency(freqs->new, freqs->cpu);
srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
CPUFREQ_POSTCHANGE, freqs);
if (likely(policy) && likely(policy->cpu == freqs->cpu))
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index a507108..08d5f05 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -107,6 +107,7 @@ static void cpuidle_idle_call(void)
if (cpuidle_curr_governor->reflect)
cpuidle_curr_governor->reflect(dev);
trace_power_end(smp_processor_id());
+ trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
/**
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 21ac077..d3701bf 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -202,6 +202,7 @@ static int intel_idle(struct cpuidle_device *dev, struct cpuidle_state *state)
stop_critical_timings();
trace_power_start(POWER_CSTATE, (eax >> 4) + 1, cpu);
+ trace_cpu_idle((eax >> 4) + 1, cpu);
if (!need_resched()) {
__monitor((void *)¤t_thread_info()->flags, 0, 0);
diff --git a/include/trace/events/power.h b/include/trace/events/power.h
index 35a2a6e..f10de41 100644
--- a/include/trace/events/power.h
+++ b/include/trace/events/power.h
@@ -7,6 +7,67 @@
#include <linux/ktime.h>
#include <linux/tracepoint.h>
+DECLARE_EVENT_CLASS(cpu,
+
+ TP_PROTO(unsigned int state, unsigned int cpu_id),
+
+ TP_ARGS(state, cpu_id),
+
+ TP_STRUCT__entry(
+ __field( u32, state )
+ __field( u32, cpu_id )
+ ),
+
+ TP_fast_assign(
+ __entry->state = state;
+ __entry->cpu_id = cpu_id;
+ ),
+
+ TP_printk("state=%lu cpu_id=%lu", (unsigned long)__entry->state,
+ (unsigned long)__entry->cpu_id)
+);
+
+DEFINE_EVENT(cpu, cpu_idle,
+
+ TP_PROTO(unsigned int state, unsigned int cpu_id),
+
+ TP_ARGS(state, cpu_id)
+);
+
+/* This file can get included multiple times, TRACE_HEADER_MULTI_READ at top */
+#ifndef _PWR_EVENT_AVOID_DOUBLE_DEFINING
+#define _PWR_EVENT_AVOID_DOUBLE_DEFINING
+
+#define PWR_EVENT_EXIT -1
+
+#endif
+
+DEFINE_EVENT(cpu, cpu_frequency,
+
+ TP_PROTO(unsigned int frequency, unsigned int cpu_id),
+
+ TP_ARGS(frequency, cpu_id)
+);
+
+TRACE_EVENT(machine_suspend,
+
+ TP_PROTO(unsigned int state),
+
+ TP_ARGS(state),
+
+ TP_STRUCT__entry(
+ __field( u32, state )
+ ),
+
+ TP_fast_assign(
+ __entry->state = state;
+ ),
+
+ TP_printk("state=%lu", (unsigned long)__entry->state)
+);
+
+#ifdef CONFIG_EVENT_POWER_TRACING_DEPRECATED
+
#ifndef _TRACE_POWER_ENUM_
#define _TRACE_POWER_ENUM_
enum {
@@ -69,8 +130,32 @@ TRACE_EVENT(power_end,
TP_printk("cpu_id=%lu", (unsigned long)__entry->cpu_id)
);
-
+#endif /* CONFIG_EVENT_POWER_TRACING_DEPRECATED */
#endif /* _TRACE_POWER_H */
+/* Deprecated dummy functions must be protected against multi-declartion */
+#ifndef EVENT_POWER_TRACING_DEPRECATED_PART_H
+#define EVENT_POWER_TRACING_DEPRECATED_PART_H
+
+#ifndef CONFIG_EVENT_POWER_TRACING_DEPRECATED
+
+#ifndef _TRACE_POWER_ENUM_
+#define _TRACE_POWER_ENUM_
+enum {
+ POWER_NONE = 0,
+ POWER_CSTATE = 1,
+ POWER_PSTATE = 2,
+};
+#endif
+
+static inline void trace_power_start(u64 type, u64 state, u64 cpuid) {};
+static inline void trace_power_end(u64 cpuid) {};
+static inline void trace_power_frequency(u64 type, u64 state, u64 cpuid) {};
+#endif /* CONFIG_EVENT_POWER_TRACING_DEPRECATED */
+
+#endif /* EVENT_POWER_TRACING_DEPRECATED_PART_H */
+
+
+
/* This part must be outside protection */
#include <trace/define_trace.h>
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 538501c..0b5c841 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -64,6 +64,20 @@ config EVENT_TRACING
select CONTEXT_SWITCH_TRACER
bool
+config EVENT_POWER_TRACING_DEPRECATED
+ depends on EVENT_TRACING
+ bool
+ help
+ Provides old power event types:
+ C-state/idle accounting events:
+ power:power_start
+ power:power_end
+ and old cpufreq accounting event:
+ power:power_frequency
+ This is for userspace compatibility
+ and will vanish after 5 kernel iterations,
+ namely 2.6.41.
+
config CONTEXT_SWITCH_TRACER
bool
diff --git a/kernel/trace/power-traces.c b/kernel/trace/power-traces.c
index 0e0497d..f55fcf6 100644
--- a/kernel/trace/power-traces.c
+++ b/kernel/trace/power-traces.c
@@ -13,5 +13,8 @@
#define CREATE_TRACE_POINTS
#include <trace/events/power.h>
+#ifdef EVENT_POWER_TRACING_DEPRECATED
EXPORT_TRACEPOINT_SYMBOL_GPL(power_start);
+#endif
+EXPORT_TRACEPOINT_SYMBOL_GPL(cpu_idle);
--
1.6.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 2/4] PERF(kernel): Cleanup power events V3
2010-10-26 23:43 Cleanup and fixes for power trace events Thomas Renninger
` (2 preceding siblings ...)
2010-10-26 23:43 ` [PATCH 2/4] PERF(kernel): Cleanup power events V3 Thomas Renninger
@ 2010-10-26 23:43 ` Thomas Renninger
2010-10-26 23:43 ` [PATCH 3/4] PERF(userspace): Adjust perf timechart to the new " Thomas Renninger
` (3 subsequent siblings)
7 siblings, 0 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-10-26 23:43 UTC (permalink / raw)
To: arjan, jean.pihet, mingo, rjw, linux-omap, linux-pm, linux-trace-users
Cc: Thomas Renninger
Changes in V3:
- PWR_EVENT_EXIT is -1 now instead of 0xFFFFFFFF
- Use cpu_{idle,frequency} instead of processor_{idle,frequency}
events
- Fixed a copy and paste bug (poll_idle should throw and event
state of zero, not 1)
Changes in V2:
- Introduce PWR_EVENT_EXIT instead of 0 to mark non-power state
- Use u32 instead of u64 for cpuid, state which is by far enough
New power trace events:
power:cpu_idle
power:cpu_frequency
power:machine_suspend
C-state/idle accounting events:
power:power_start
power:power_end
are replaced with:
power:cpu_idle
and
power:power_frequency
is replaced with:
power:cpu_frequency
power:machine_suspend
is newly introduced, a first implementation
comes from the ARM side, but it's easy to add these events
in X86 as well if needed.
the type= field got removed from both, it was never
used and the type is differed by the event type itself.
perf timechart
userspace tool gets adjusted in a separate patch.
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
CC: linux-omap@vger.kernel.org
CC: linux-pm@lists.linux-foundation.org
CC: linux-trace-users@vger.kernel.org
CC: Jean Pihet <jean.pihet@newoldbits.com>
CC: Arjan van de Ven <arjan@linux.intel.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: rjw@sisk.pl
---
arch/x86/kernel/process.c | 7 +++-
arch/x86/kernel/process_32.c | 2 +-
arch/x86/kernel/process_64.c | 2 +
drivers/cpufreq/cpufreq.c | 1 +
drivers/cpuidle/cpuidle.c | 1 +
drivers/idle/intel_idle.c | 1 +
include/trace/events/power.h | 87 +++++++++++++++++++++++++++++++++++++++++-
kernel/trace/Kconfig | 14 +++++++
kernel/trace/power-traces.c | 3 +
9 files changed, 115 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 57d1868..155d975 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -374,6 +374,7 @@ void default_idle(void)
{
if (hlt_use_halt()) {
trace_power_start(POWER_CSTATE, 1, smp_processor_id());
+ trace_cpu_idle(1, smp_processor_id());
current_thread_info()->status &= ~TS_POLLING;
/*
* TS_POLLING-cleared state must be visible before we
@@ -444,6 +445,7 @@ EXPORT_SYMBOL_GPL(cpu_idle_wait);
void mwait_idle_with_hints(unsigned long ax, unsigned long cx)
{
trace_power_start(POWER_CSTATE, (ax>>4)+1, smp_processor_id());
+ trace_cpu_idle((ax>>4)+1, smp_processor_id());
if (!need_resched()) {
if (cpu_has(¤t_cpu_data, X86_FEATURE_CLFLUSH_MONITOR))
clflush((void *)¤t_thread_info()->flags);
@@ -460,6 +462,7 @@ static void mwait_idle(void)
{
if (!need_resched()) {
trace_power_start(POWER_CSTATE, 1, smp_processor_id());
+ trace_cpu_idle(1, smp_processor_id());
if (cpu_has(¤t_cpu_data, X86_FEATURE_CLFLUSH_MONITOR))
clflush((void *)¤t_thread_info()->flags);
@@ -481,10 +484,12 @@ static void mwait_idle(void)
static void poll_idle(void)
{
trace_power_start(POWER_CSTATE, 0, smp_processor_id());
+ trace_cpu_idle(0, smp_processor_id());
local_irq_enable();
while (!need_resched())
cpu_relax();
- trace_power_end(0);
+ trace_power_end(smp_processor_id());
+ trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
/*
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 96586c3..4b9befa 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -113,8 +113,8 @@ void cpu_idle(void)
stop_critical_timings();
pm_idle();
start_critical_timings();
-
trace_power_end(smp_processor_id());
+ trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
tick_nohz_restart_sched_tick();
preempt_enable_no_resched();
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 3d9ea53..28153a9 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -142,6 +142,8 @@ void cpu_idle(void)
start_critical_timings();
trace_power_end(smp_processor_id());
+ trace_cpu_idle(PWR_EVENT_EXIT,
+ smp_processor_id());
/* In many cases the interrupt that ended idle
has already called exit_idle. But some idle
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 199dcb9..ed4919e 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -355,6 +355,7 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
dprintk("FREQ: %lu - CPU: %lu", (unsigned long)freqs->new,
(unsigned long)freqs->cpu);
trace_power_frequency(POWER_PSTATE, freqs->new, freqs->cpu);
+ trace_cpu_frequency(freqs->new, freqs->cpu);
srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
CPUFREQ_POSTCHANGE, freqs);
if (likely(policy) && likely(policy->cpu == freqs->cpu))
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index a507108..08d5f05 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -107,6 +107,7 @@ static void cpuidle_idle_call(void)
if (cpuidle_curr_governor->reflect)
cpuidle_curr_governor->reflect(dev);
trace_power_end(smp_processor_id());
+ trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
/**
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 21ac077..d3701bf 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -202,6 +202,7 @@ static int intel_idle(struct cpuidle_device *dev, struct cpuidle_state *state)
stop_critical_timings();
trace_power_start(POWER_CSTATE, (eax >> 4) + 1, cpu);
+ trace_cpu_idle((eax >> 4) + 1, cpu);
if (!need_resched()) {
__monitor((void *)¤t_thread_info()->flags, 0, 0);
diff --git a/include/trace/events/power.h b/include/trace/events/power.h
index 35a2a6e..f10de41 100644
--- a/include/trace/events/power.h
+++ b/include/trace/events/power.h
@@ -7,6 +7,67 @@
#include <linux/ktime.h>
#include <linux/tracepoint.h>
+DECLARE_EVENT_CLASS(cpu,
+
+ TP_PROTO(unsigned int state, unsigned int cpu_id),
+
+ TP_ARGS(state, cpu_id),
+
+ TP_STRUCT__entry(
+ __field( u32, state )
+ __field( u32, cpu_id )
+ ),
+
+ TP_fast_assign(
+ __entry->state = state;
+ __entry->cpu_id = cpu_id;
+ ),
+
+ TP_printk("state=%lu cpu_id=%lu", (unsigned long)__entry->state,
+ (unsigned long)__entry->cpu_id)
+);
+
+DEFINE_EVENT(cpu, cpu_idle,
+
+ TP_PROTO(unsigned int state, unsigned int cpu_id),
+
+ TP_ARGS(state, cpu_id)
+);
+
+/* This file can get included multiple times, TRACE_HEADER_MULTI_READ at top */
+#ifndef _PWR_EVENT_AVOID_DOUBLE_DEFINING
+#define _PWR_EVENT_AVOID_DOUBLE_DEFINING
+
+#define PWR_EVENT_EXIT -1
+
+#endif
+
+DEFINE_EVENT(cpu, cpu_frequency,
+
+ TP_PROTO(unsigned int frequency, unsigned int cpu_id),
+
+ TP_ARGS(frequency, cpu_id)
+);
+
+TRACE_EVENT(machine_suspend,
+
+ TP_PROTO(unsigned int state),
+
+ TP_ARGS(state),
+
+ TP_STRUCT__entry(
+ __field( u32, state )
+ ),
+
+ TP_fast_assign(
+ __entry->state = state;
+ ),
+
+ TP_printk("state=%lu", (unsigned long)__entry->state)
+);
+
+#ifdef CONFIG_EVENT_POWER_TRACING_DEPRECATED
+
#ifndef _TRACE_POWER_ENUM_
#define _TRACE_POWER_ENUM_
enum {
@@ -69,8 +130,32 @@ TRACE_EVENT(power_end,
TP_printk("cpu_id=%lu", (unsigned long)__entry->cpu_id)
);
-
+#endif /* CONFIG_EVENT_POWER_TRACING_DEPRECATED */
#endif /* _TRACE_POWER_H */
+/* Deprecated dummy functions must be protected against multi-declartion */
+#ifndef EVENT_POWER_TRACING_DEPRECATED_PART_H
+#define EVENT_POWER_TRACING_DEPRECATED_PART_H
+
+#ifndef CONFIG_EVENT_POWER_TRACING_DEPRECATED
+
+#ifndef _TRACE_POWER_ENUM_
+#define _TRACE_POWER_ENUM_
+enum {
+ POWER_NONE = 0,
+ POWER_CSTATE = 1,
+ POWER_PSTATE = 2,
+};
+#endif
+
+static inline void trace_power_start(u64 type, u64 state, u64 cpuid) {};
+static inline void trace_power_end(u64 cpuid) {};
+static inline void trace_power_frequency(u64 type, u64 state, u64 cpuid) {};
+#endif /* CONFIG_EVENT_POWER_TRACING_DEPRECATED */
+
+#endif /* EVENT_POWER_TRACING_DEPRECATED_PART_H */
+
+
+
/* This part must be outside protection */
#include <trace/define_trace.h>
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 538501c..0b5c841 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -64,6 +64,20 @@ config EVENT_TRACING
select CONTEXT_SWITCH_TRACER
bool
+config EVENT_POWER_TRACING_DEPRECATED
+ depends on EVENT_TRACING
+ bool
+ help
+ Provides old power event types:
+ C-state/idle accounting events:
+ power:power_start
+ power:power_end
+ and old cpufreq accounting event:
+ power:power_frequency
+ This is for userspace compatibility
+ and will vanish after 5 kernel iterations,
+ namely 2.6.41.
+
config CONTEXT_SWITCH_TRACER
bool
diff --git a/kernel/trace/power-traces.c b/kernel/trace/power-traces.c
index 0e0497d..f55fcf6 100644
--- a/kernel/trace/power-traces.c
+++ b/kernel/trace/power-traces.c
@@ -13,5 +13,8 @@
#define CREATE_TRACE_POINTS
#include <trace/events/power.h>
+#ifdef EVENT_POWER_TRACING_DEPRECATED
EXPORT_TRACEPOINT_SYMBOL_GPL(power_start);
+#endif
+EXPORT_TRACEPOINT_SYMBOL_GPL(cpu_idle);
--
1.6.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 3/4] PERF(userspace): Adjust perf timechart to the new power events V3
2010-10-26 23:43 Cleanup and fixes for power trace events Thomas Renninger
` (4 preceding siblings ...)
2010-10-26 23:43 ` [PATCH 3/4] PERF(userspace): Adjust perf timechart to the new " Thomas Renninger
@ 2010-10-26 23:43 ` Thomas Renninger
2010-10-26 23:43 ` [PATCH 4/4] PERF: fix power:cpu_idle double end events Thomas Renninger
2010-10-26 23:43 ` Thomas Renninger
7 siblings, 0 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-10-26 23:43 UTC (permalink / raw)
To: arjan, jean.pihet, mingo, rjw, linux-omap, linux-pm, linux-trace-users
Changes in V3:
- PWR_EVENT_EXIT is -1 now instead of 0xFFFFFFFF
- Use cpu_{idle,frequency} instead of processor_{idle,frequency}
events
Changes in V2:
- Hanlde PWR_EVENT_EXIT instead of 0 to recon non-power state
The transition was rather smooth, only part I had to fiddle
some time was the check whether a tracepoint/event is
supported by the running kernel.
builtin-timechart must only pass -e power:xy events which
are supported by the running kernel.
For this I added the tiny helper function:
int is_valid_tracepoint(const char *event_string)
to parse-events.[hc]
which could be more generic as an interface and support
hardware/software/... events, not only tracepoints, but someone
else could extend that if needed...
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: linux-omap@vger.kernel.org
CC: linux-pm@lists.linux-foundation.org
CC: linux-trace-users@vger.kernel.org
CC: Jean Pihet <jean.pihet@newoldbits.com>
CC: Arjan van de Ven <arjan@linux.intel.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: rjw@sisk.pl
---
tools/perf/builtin-timechart.c | 89 ++++++++++++++++++++++++++++++++-------
tools/perf/util/parse-events.c | 43 +++++++++++++++++++-
tools/perf/util/parse-events.h | 1 +
3 files changed, 116 insertions(+), 17 deletions(-)
diff --git a/tools/perf/builtin-timechart.c b/tools/perf/builtin-timechart.c
index 9bcc38f..bcc16b0 100644
--- a/tools/perf/builtin-timechart.c
+++ b/tools/perf/builtin-timechart.c
@@ -32,6 +32,10 @@
#include "util/session.h"
#include "util/svghelper.h"
+#define SUPPORT_OLD_POWER_EVENTS 1
+#define PWR_EVENT_EXIT 0xFFFFFFFF
+
+
static char const *input_name = "perf.data";
static char const *output_name = "output.svg";
@@ -298,12 +302,25 @@ struct trace_entry {
int lock_depth;
};
-struct power_entry {
+#if defined(SUPPORT_OLD_POWER_EVENTS)
+struct power_entry_old {
struct trace_entry te;
u64 type;
u64 value;
u64 cpu_id;
};
+#endif
+
+struct power_processor_entry {
+ struct trace_entry te;
+ u64 state;
+ u64 cpu_id;
+};
+
+struct power_suspend_entry {
+ struct trace_entry te;
+ u64 state;
+};
#define TASK_COMM_LEN 16
struct wakeup_entry {
@@ -489,29 +506,46 @@ static int process_sample_event(event_t *event, struct perf_session *session)
te = (void *)data.raw_data;
if (session->sample_type & PERF_SAMPLE_RAW && data.raw_size > 0) {
char *event_str;
- struct power_entry *pe;
-
- pe = (void *)te;
+#if defined(SUPPORT_OLD_POWER_EVENTS)
+ struct power_entry_old *peo;
+ peo = (void *)te;
+#endif
event_str = perf_header__find_event(te->type);
if (!event_str)
return 0;
- if (strcmp(event_str, "power:power_start") == 0)
- c_state_start(pe->cpu_id, data.time, pe->value);
-
- if (strcmp(event_str, "power:power_end") == 0)
- c_state_end(pe->cpu_id, data.time);
+ if (strcmp(event_str, "power:cpu_idle") == 0) {
+ struct power_processor_entry *ppe = (void *)te;
+ if (ppe->state == PWR_EVENT_EXIT)
+ c_state_end(ppe->cpu_id, data.time);
+ else
+ c_state_start(ppe->cpu_id, data.time,
+ ppe->state);
+ }
- if (strcmp(event_str, "power:power_frequency") == 0)
- p_state_change(pe->cpu_id, data.time, pe->value);
+ else if (strcmp(event_str, "power:cpu_frequency") == 0) {
+ struct power_processor_entry *ppe = (void *)te;
+ p_state_change(ppe->cpu_id, data.time, ppe->state);
+ }
- if (strcmp(event_str, "sched:sched_wakeup") == 0)
+ else if (strcmp(event_str, "sched:sched_wakeup") == 0)
sched_wakeup(data.cpu, data.time, data.pid, te);
- if (strcmp(event_str, "sched:sched_switch") == 0)
+ else if (strcmp(event_str, "sched:sched_switch") == 0)
sched_switch(data.cpu, data.time, te);
+
+#if defined(SUPPORT_OLD_POWER_EVENTS)
+ else if (strcmp(event_str, "power:power_start") == 0)
+ c_state_start(peo->cpu_id, data.time, peo->value);
+
+ else if (strcmp(event_str, "power:power_end") == 0)
+ c_state_end(peo->cpu_id, data.time);
+
+ else if (strcmp(event_str, "power:power_frequency") == 0)
+ p_state_change(peo->cpu_id, data.time, peo->value);
+#endif
}
return 0;
}
@@ -968,7 +1002,8 @@ static const char * const timechart_usage[] = {
NULL
};
-static const char *record_args[] = {
+#if defined(SUPPORT_OLD_POWER_EVENTS)
+static const char *record_old_args[] = {
"record",
"-a",
"-R",
@@ -980,16 +1015,38 @@ static const char *record_args[] = {
"-e", "sched:sched_wakeup",
"-e", "sched:sched_switch",
};
+#endif
+
+static const char *record_new_args[] = {
+ "record",
+ "-a",
+ "-R",
+ "-f",
+ "-c", "1",
+ "-e", "power:cpu_frequency",
+ "-e", "power:cpu_idle",
+ "-e", "sched:sched_wakeup",
+ "-e", "sched:sched_switch",
+};
static int __cmd_record(int argc, const char **argv)
{
unsigned int rec_argc, i, j;
const char **rec_argv;
+ const char **record_args = record_new_args;
+ unsigned int record_elems = ARRAY_SIZE(record_new_args);
- rec_argc = ARRAY_SIZE(record_args) + argc - 1;
+#if defined(SUPPORT_OLD_POWER_EVENTS)
+ if (is_valid_tracepoint("power:power_start")) {
+ record_args = record_old_args;
+ record_elems = ARRAY_SIZE(record_old_args);
+ }
+#endif
+
+ rec_argc = record_elems + argc - 1;
rec_argv = calloc(rec_argc + 1, sizeof(char *));
- for (i = 0; i < ARRAY_SIZE(record_args); i++)
+ for (i = 0; i < record_elems; i++)
rec_argv[i] = strdup(record_args[i]);
for (j = 1; j < (unsigned int)argc; j++, i++)
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4af5bd5..d706dcb 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -824,7 +824,7 @@ int parse_events(const struct option *opt __used, const char *str, int unset __u
if (ret != EVT_HANDLED_ALL) {
attrs[nr_counters] = attr;
nr_counters++;
- }
+ }
if (*str == 0)
break;
@@ -906,6 +906,47 @@ static void print_tracepoint_events(void)
}
/*
+ * Check whether event is in <debugfs_mount_point>/tracing/events
+ */
+
+int is_valid_tracepoint(const char *event_string)
+{
+ DIR *sys_dir, *evt_dir;
+ struct dirent *sys_next, *evt_next, sys_dirent, evt_dirent;
+ char evt_path[MAXPATHLEN];
+ char dir_path[MAXPATHLEN];
+
+ if (debugfs_valid_mountpoint(debugfs_path))
+ return 0;
+
+ sys_dir = opendir(debugfs_path);
+ if (!sys_dir)
+ return 0;
+
+ for_each_subsystem(sys_dir, sys_dirent, sys_next) {
+
+ snprintf(dir_path, MAXPATHLEN, "%s/%s", debugfs_path,
+ sys_dirent.d_name);
+ evt_dir = opendir(dir_path);
+ if (!evt_dir)
+ continue;
+
+ for_each_event(sys_dirent, evt_dir, evt_dirent, evt_next) {
+ snprintf(evt_path, MAXPATHLEN, "%s:%s",
+ sys_dirent.d_name, evt_dirent.d_name);
+ if (!strcmp(evt_path, event_string)) {
+ closedir(evt_dir);
+ closedir(sys_dir);
+ return 1;
+ }
+ }
+ closedir(evt_dir);
+ }
+ closedir(sys_dir);
+ return 0;
+}
+
+/*
* Print the help text for the event symbols:
*/
void print_events(void)
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index fc4ab3f..7ab4685 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -29,6 +29,7 @@ extern int parse_filter(const struct option *opt, const char *str, int unset);
#define EVENTS_HELP_MAX (128*1024)
extern void print_events(void);
+extern int is_valid_tracepoint(const char *event_string);
extern char debugfs_path[];
extern int valid_debugfs_mount(const char *debugfs);
--
1.6.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 3/4] PERF(userspace): Adjust perf timechart to the new power events V3
2010-10-26 23:43 Cleanup and fixes for power trace events Thomas Renninger
` (3 preceding siblings ...)
2010-10-26 23:43 ` Thomas Renninger
@ 2010-10-26 23:43 ` Thomas Renninger
2010-10-26 23:43 ` Thomas Renninger
` (2 subsequent siblings)
7 siblings, 0 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-10-26 23:43 UTC (permalink / raw)
To: arjan, jean.pihet, mingo, rjw, linux-omap, linux-pm, linux-trace-users
Cc: Thomas Renninger
Changes in V3:
- PWR_EVENT_EXIT is -1 now instead of 0xFFFFFFFF
- Use cpu_{idle,frequency} instead of processor_{idle,frequency}
events
Changes in V2:
- Hanlde PWR_EVENT_EXIT instead of 0 to recon non-power state
The transition was rather smooth, only part I had to fiddle
some time was the check whether a tracepoint/event is
supported by the running kernel.
builtin-timechart must only pass -e power:xy events which
are supported by the running kernel.
For this I added the tiny helper function:
int is_valid_tracepoint(const char *event_string)
to parse-events.[hc]
which could be more generic as an interface and support
hardware/software/... events, not only tracepoints, but someone
else could extend that if needed...
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: linux-omap@vger.kernel.org
CC: linux-pm@lists.linux-foundation.org
CC: linux-trace-users@vger.kernel.org
CC: Jean Pihet <jean.pihet@newoldbits.com>
CC: Arjan van de Ven <arjan@linux.intel.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: rjw@sisk.pl
---
tools/perf/builtin-timechart.c | 89 ++++++++++++++++++++++++++++++++-------
tools/perf/util/parse-events.c | 43 +++++++++++++++++++-
tools/perf/util/parse-events.h | 1 +
3 files changed, 116 insertions(+), 17 deletions(-)
diff --git a/tools/perf/builtin-timechart.c b/tools/perf/builtin-timechart.c
index 9bcc38f..bcc16b0 100644
--- a/tools/perf/builtin-timechart.c
+++ b/tools/perf/builtin-timechart.c
@@ -32,6 +32,10 @@
#include "util/session.h"
#include "util/svghelper.h"
+#define SUPPORT_OLD_POWER_EVENTS 1
+#define PWR_EVENT_EXIT 0xFFFFFFFF
+
+
static char const *input_name = "perf.data";
static char const *output_name = "output.svg";
@@ -298,12 +302,25 @@ struct trace_entry {
int lock_depth;
};
-struct power_entry {
+#if defined(SUPPORT_OLD_POWER_EVENTS)
+struct power_entry_old {
struct trace_entry te;
u64 type;
u64 value;
u64 cpu_id;
};
+#endif
+
+struct power_processor_entry {
+ struct trace_entry te;
+ u64 state;
+ u64 cpu_id;
+};
+
+struct power_suspend_entry {
+ struct trace_entry te;
+ u64 state;
+};
#define TASK_COMM_LEN 16
struct wakeup_entry {
@@ -489,29 +506,46 @@ static int process_sample_event(event_t *event, struct perf_session *session)
te = (void *)data.raw_data;
if (session->sample_type & PERF_SAMPLE_RAW && data.raw_size > 0) {
char *event_str;
- struct power_entry *pe;
-
- pe = (void *)te;
+#if defined(SUPPORT_OLD_POWER_EVENTS)
+ struct power_entry_old *peo;
+ peo = (void *)te;
+#endif
event_str = perf_header__find_event(te->type);
if (!event_str)
return 0;
- if (strcmp(event_str, "power:power_start") == 0)
- c_state_start(pe->cpu_id, data.time, pe->value);
-
- if (strcmp(event_str, "power:power_end") == 0)
- c_state_end(pe->cpu_id, data.time);
+ if (strcmp(event_str, "power:cpu_idle") == 0) {
+ struct power_processor_entry *ppe = (void *)te;
+ if (ppe->state == PWR_EVENT_EXIT)
+ c_state_end(ppe->cpu_id, data.time);
+ else
+ c_state_start(ppe->cpu_id, data.time,
+ ppe->state);
+ }
- if (strcmp(event_str, "power:power_frequency") == 0)
- p_state_change(pe->cpu_id, data.time, pe->value);
+ else if (strcmp(event_str, "power:cpu_frequency") == 0) {
+ struct power_processor_entry *ppe = (void *)te;
+ p_state_change(ppe->cpu_id, data.time, ppe->state);
+ }
- if (strcmp(event_str, "sched:sched_wakeup") == 0)
+ else if (strcmp(event_str, "sched:sched_wakeup") == 0)
sched_wakeup(data.cpu, data.time, data.pid, te);
- if (strcmp(event_str, "sched:sched_switch") == 0)
+ else if (strcmp(event_str, "sched:sched_switch") == 0)
sched_switch(data.cpu, data.time, te);
+
+#if defined(SUPPORT_OLD_POWER_EVENTS)
+ else if (strcmp(event_str, "power:power_start") == 0)
+ c_state_start(peo->cpu_id, data.time, peo->value);
+
+ else if (strcmp(event_str, "power:power_end") == 0)
+ c_state_end(peo->cpu_id, data.time);
+
+ else if (strcmp(event_str, "power:power_frequency") == 0)
+ p_state_change(peo->cpu_id, data.time, peo->value);
+#endif
}
return 0;
}
@@ -968,7 +1002,8 @@ static const char * const timechart_usage[] = {
NULL
};
-static const char *record_args[] = {
+#if defined(SUPPORT_OLD_POWER_EVENTS)
+static const char *record_old_args[] = {
"record",
"-a",
"-R",
@@ -980,16 +1015,38 @@ static const char *record_args[] = {
"-e", "sched:sched_wakeup",
"-e", "sched:sched_switch",
};
+#endif
+
+static const char *record_new_args[] = {
+ "record",
+ "-a",
+ "-R",
+ "-f",
+ "-c", "1",
+ "-e", "power:cpu_frequency",
+ "-e", "power:cpu_idle",
+ "-e", "sched:sched_wakeup",
+ "-e", "sched:sched_switch",
+};
static int __cmd_record(int argc, const char **argv)
{
unsigned int rec_argc, i, j;
const char **rec_argv;
+ const char **record_args = record_new_args;
+ unsigned int record_elems = ARRAY_SIZE(record_new_args);
- rec_argc = ARRAY_SIZE(record_args) + argc - 1;
+#if defined(SUPPORT_OLD_POWER_EVENTS)
+ if (is_valid_tracepoint("power:power_start")) {
+ record_args = record_old_args;
+ record_elems = ARRAY_SIZE(record_old_args);
+ }
+#endif
+
+ rec_argc = record_elems + argc - 1;
rec_argv = calloc(rec_argc + 1, sizeof(char *));
- for (i = 0; i < ARRAY_SIZE(record_args); i++)
+ for (i = 0; i < record_elems; i++)
rec_argv[i] = strdup(record_args[i]);
for (j = 1; j < (unsigned int)argc; j++, i++)
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4af5bd5..d706dcb 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -824,7 +824,7 @@ int parse_events(const struct option *opt __used, const char *str, int unset __u
if (ret != EVT_HANDLED_ALL) {
attrs[nr_counters] = attr;
nr_counters++;
- }
+ }
if (*str == 0)
break;
@@ -906,6 +906,47 @@ static void print_tracepoint_events(void)
}
/*
+ * Check whether event is in <debugfs_mount_point>/tracing/events
+ */
+
+int is_valid_tracepoint(const char *event_string)
+{
+ DIR *sys_dir, *evt_dir;
+ struct dirent *sys_next, *evt_next, sys_dirent, evt_dirent;
+ char evt_path[MAXPATHLEN];
+ char dir_path[MAXPATHLEN];
+
+ if (debugfs_valid_mountpoint(debugfs_path))
+ return 0;
+
+ sys_dir = opendir(debugfs_path);
+ if (!sys_dir)
+ return 0;
+
+ for_each_subsystem(sys_dir, sys_dirent, sys_next) {
+
+ snprintf(dir_path, MAXPATHLEN, "%s/%s", debugfs_path,
+ sys_dirent.d_name);
+ evt_dir = opendir(dir_path);
+ if (!evt_dir)
+ continue;
+
+ for_each_event(sys_dirent, evt_dir, evt_dirent, evt_next) {
+ snprintf(evt_path, MAXPATHLEN, "%s:%s",
+ sys_dirent.d_name, evt_dirent.d_name);
+ if (!strcmp(evt_path, event_string)) {
+ closedir(evt_dir);
+ closedir(sys_dir);
+ return 1;
+ }
+ }
+ closedir(evt_dir);
+ }
+ closedir(sys_dir);
+ return 0;
+}
+
+/*
* Print the help text for the event symbols:
*/
void print_events(void)
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index fc4ab3f..7ab4685 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -29,6 +29,7 @@ extern int parse_filter(const struct option *opt, const char *str, int unset);
#define EVENTS_HELP_MAX (128*1024)
extern void print_events(void);
+extern int is_valid_tracepoint(const char *event_string);
extern char debugfs_path[];
extern int valid_debugfs_mount(const char *debugfs);
--
1.6.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 4/4] PERF: fix power:cpu_idle double end events
2010-10-26 23:43 Cleanup and fixes for power trace events Thomas Renninger
` (5 preceding siblings ...)
2010-10-26 23:43 ` Thomas Renninger
@ 2010-10-26 23:43 ` Thomas Renninger
2010-10-26 23:43 ` Thomas Renninger
7 siblings, 0 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-10-26 23:43 UTC (permalink / raw)
To: arjan, jean.pihet, mingo, rjw, linux-omap, linux-pm, linux-trace-users
Cc: Robert Schoene
cpu_idle events (transition into sleep state and exiting) are
both fired in pm_idle().
Entering sleep state and exiting should always get fired inside
pm_idle() already.
This is a revert of commit c882e0feb937af4e5b991cbd1c
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Robert Schoene <robert.schoene@tu-dresden.de>
CC: Arjan van de Ven <arjan@linux.intel.com>
CC: Ingo Molnar <mingo@elte.hu>
---
arch/x86/kernel/process_32.c | 4 ----
arch/x86/kernel/process_64.c | 6 ------
2 files changed, 0 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 4b9befa..8d12878 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -57,8 +57,6 @@
#include <asm/syscalls.h>
#include <asm/debugreg.h>
-#include <trace/events/power.h>
-
asmlinkage void ret_from_fork(void) __asm__("ret_from_fork");
/*
@@ -113,8 +111,6 @@ void cpu_idle(void)
stop_critical_timings();
pm_idle();
start_critical_timings();
- trace_power_end(smp_processor_id());
- trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
tick_nohz_restart_sched_tick();
preempt_enable_no_resched();
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 28153a9..d7b3e95 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -51,8 +51,6 @@
#include <asm/syscalls.h>
#include <asm/debugreg.h>
-#include <trace/events/power.h>
-
asmlinkage extern void ret_from_fork(void);
DEFINE_PER_CPU(unsigned long, old_rsp);
@@ -141,10 +139,6 @@ void cpu_idle(void)
pm_idle();
start_critical_timings();
- trace_power_end(smp_processor_id());
- trace_cpu_idle(PWR_EVENT_EXIT,
- smp_processor_id());
-
/* In many cases the interrupt that ended idle
has already called exit_idle. But some idle
loops can be woken up without interrupt. */
--
1.6.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 4/4] PERF: fix power:cpu_idle double end events
2010-10-26 23:43 Cleanup and fixes for power trace events Thomas Renninger
` (6 preceding siblings ...)
2010-10-26 23:43 ` [PATCH 4/4] PERF: fix power:cpu_idle double end events Thomas Renninger
@ 2010-10-26 23:43 ` Thomas Renninger
2010-10-27 15:42 ` Thomas Renninger
2010-10-27 15:42 ` Thomas Renninger
7 siblings, 2 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-10-26 23:43 UTC (permalink / raw)
To: arjan, jean.pihet, mingo, rjw, linux-omap, linux-pm, linux-trace-users
Cc: Thomas Renninger, Robert Schoene
cpu_idle events (transition into sleep state and exiting) are
both fired in pm_idle().
Entering sleep state and exiting should always get fired inside
pm_idle() already.
This is a revert of commit c882e0feb937af4e5b991cbd1c
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Robert Schoene <robert.schoene@tu-dresden.de>
CC: Arjan van de Ven <arjan@linux.intel.com>
CC: Ingo Molnar <mingo@elte.hu>
---
arch/x86/kernel/process_32.c | 4 ----
arch/x86/kernel/process_64.c | 6 ------
2 files changed, 0 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 4b9befa..8d12878 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -57,8 +57,6 @@
#include <asm/syscalls.h>
#include <asm/debugreg.h>
-#include <trace/events/power.h>
-
asmlinkage void ret_from_fork(void) __asm__("ret_from_fork");
/*
@@ -113,8 +111,6 @@ void cpu_idle(void)
stop_critical_timings();
pm_idle();
start_critical_timings();
- trace_power_end(smp_processor_id());
- trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
tick_nohz_restart_sched_tick();
preempt_enable_no_resched();
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 28153a9..d7b3e95 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -51,8 +51,6 @@
#include <asm/syscalls.h>
#include <asm/debugreg.h>
-#include <trace/events/power.h>
-
asmlinkage extern void ret_from_fork(void);
DEFINE_PER_CPU(unsigned long, old_rsp);
@@ -141,10 +139,6 @@ void cpu_idle(void)
pm_idle();
start_critical_timings();
- trace_power_end(smp_processor_id());
- trace_cpu_idle(PWR_EVENT_EXIT,
- smp_processor_id());
-
/* In many cases the interrupt that ended idle
has already called exit_idle. But some idle
loops can be woken up without interrupt. */
--
1.6.3
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] PERF: fix power:cpu_idle double end events
2010-10-26 23:43 ` Thomas Renninger
@ 2010-10-27 15:42 ` Thomas Renninger
2010-10-27 15:42 ` Thomas Renninger
1 sibling, 0 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-10-27 15:42 UTC (permalink / raw)
To: arjan
Cc: jean.pihet, Robert Schoene, linux-trace-users, mingo, linux-omap,
linux-pm
On Wednesday 27 October 2010 01:43:25 Thomas Renninger wrote:
> cpu_idle events (transition into sleep state and exiting) are
> both fired in pm_idle().
>
> Entering sleep state and exiting should always get fired inside
> pm_idle() already.
>
> This is a revert of commit c882e0feb937af4e5b991cbd1c
Robert: I expect you tested this on a machine with no cpuidle
driver registered?
I should have had a deeper look at this at once, done so now:
Current cpu_idle, power_start/end (same before my changes) behavior
on X86 is rather weird (without this patch):
if pm_idle is:
poll_idle -> should throw double end events
default_idle -> only throws power_start, your patch
fixed that, but in the generic cpu_idle
thread function which always gets executed
also if pm_idle != default_idle
So Robert fixed this case, but at the wrong place.
cpuidle_idle_call -> depends whether intel_idle or acpi_idle
driver registered:
intel_idle -> throws a cpu idle state event, still double end
events (one from cpuidle, one from
process_{32,64}.c, due to Robert's patch
acpi_idle -> whether a power_start event is thrown at all
depends on (cmp with acpi_idle_do_entry()):
cx->entry_method == ACPI_CSTATE_FFH
will end up in a power_start event via:
acpi_processor_ffh_cstate_enter(cx)
mwait_idle_with_hints()
but it will not in case of:
cx->entry_method == ACPI_CSTATE_HALT
or IO based switching (the else path there):
/* IO port based C-state */
Again without this(my) patch you get:
poll_idle -> double end events
default_idle -> all is fine (with your patch)
cpuidle_idle_call
intel_idle registered -> double end events
acpi_idle registered -> double end events
start events may not be thrown
at all.
perf timechart can handle double end events, this may be
the reason this was overseen.
Argh, I tried to come up with patches, but run out of
time. I will send something soon.
I also found a bug in my userspace stuff: I forgot to convert
u64 to u32..., sorry about that.
I try to resend everything tomorrow.
Thomas
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] PERF: fix power:cpu_idle double end events
2010-10-26 23:43 ` Thomas Renninger
2010-10-27 15:42 ` Thomas Renninger
@ 2010-10-27 15:42 ` Thomas Renninger
2010-10-28 0:46 ` [RFC] PERF: fix power:cpu_idle double end events and missing acpi_idle events - make cpu_idle power events cpuidle sysfs conform Thomas Renninger
` (3 more replies)
1 sibling, 4 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-10-27 15:42 UTC (permalink / raw)
To: arjan
Cc: jean.pihet, mingo, rjw, linux-omap, linux-pm, linux-trace-users,
Robert Schoene
On Wednesday 27 October 2010 01:43:25 Thomas Renninger wrote:
> cpu_idle events (transition into sleep state and exiting) are
> both fired in pm_idle().
>
> Entering sleep state and exiting should always get fired inside
> pm_idle() already.
>
> This is a revert of commit c882e0feb937af4e5b991cbd1c
Robert: I expect you tested this on a machine with no cpuidle
driver registered?
I should have had a deeper look at this at once, done so now:
Current cpu_idle, power_start/end (same before my changes) behavior
on X86 is rather weird (without this patch):
if pm_idle is:
poll_idle -> should throw double end events
default_idle -> only throws power_start, your patch
fixed that, but in the generic cpu_idle
thread function which always gets executed
also if pm_idle != default_idle
So Robert fixed this case, but at the wrong place.
cpuidle_idle_call -> depends whether intel_idle or acpi_idle
driver registered:
intel_idle -> throws a cpu idle state event, still double end
events (one from cpuidle, one from
process_{32,64}.c, due to Robert's patch
acpi_idle -> whether a power_start event is thrown at all
depends on (cmp with acpi_idle_do_entry()):
cx->entry_method == ACPI_CSTATE_FFH
will end up in a power_start event via:
acpi_processor_ffh_cstate_enter(cx)
mwait_idle_with_hints()
but it will not in case of:
cx->entry_method == ACPI_CSTATE_HALT
or IO based switching (the else path there):
/* IO port based C-state */
Again without this(my) patch you get:
poll_idle -> double end events
default_idle -> all is fine (with your patch)
cpuidle_idle_call
intel_idle registered -> double end events
acpi_idle registered -> double end events
start events may not be thrown
at all.
perf timechart can handle double end events, this may be
the reason this was overseen.
Argh, I tried to come up with patches, but run out of
time. I will send something soon.
I also found a bug in my userspace stuff: I forgot to convert
u64 to u32..., sorry about that.
I try to resend everything tomorrow.
Thomas
^ permalink raw reply [flat|nested] 17+ messages in thread
* [RFC] PERF: fix power:cpu_idle double end events and missing acpi_idle events - make cpu_idle power events cpuidle sysfs conform
2010-10-27 15:42 ` Thomas Renninger
2010-10-28 0:46 ` [RFC] PERF: fix power:cpu_idle double end events and missing acpi_idle events - make cpu_idle power events cpuidle sysfs conform Thomas Renninger
@ 2010-10-28 0:46 ` Thomas Renninger
2010-11-01 8:11 ` [PATCH 4/4] PERF: fix power:cpu_idle double end events Robert Schöne
2010-11-01 8:11 ` Robert Schöne
3 siblings, 0 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-10-28 0:46 UTC (permalink / raw)
To: arjan
Cc: jean.pihet, mingo, rjw, linux-pm, linux-trace-users,
Robert Schoene, Len Brown, linux-acpi
On Wednesday 27 October 2010 05:42:04 pm Thomas Renninger wrote:
> On Wednesday 27 October 2010 01:43:25 Thomas Renninger wrote:
> > cpu_idle events (transition into sleep state and exiting) are
> > both fired in pm_idle().
> >
> > Entering sleep state and exiting should always get fired inside
> > pm_idle() already.
> >
> > This is a revert of commit c882e0feb937af4e5b991cbd1c
..
> Argh, I tried to come up with patches, but run out of
> time. I will send something soon.
Below patch should fix and clean up current cpu_idle (former power_start/end)
event throwing policy.
But it introduces a change which state number is thrown in acpi_idle
mwait case. This should be done anyway, but needs further adjustings
in userspace.
If a cpuidle driver is registered, now the event throws the state which
maps to the registered cpuidle state.
This is an important fix, so that the whole cpu_idle event (state)
interface gets architecture independent.
For the intel_idle driver it currently does not matter because the cpuidle
and mwait C1/2/3 states by luck exactly match.
But on my machine with acpi_idle I only get 2 states: C1/C3.
ls /sys/devices/system/cpu/cpu0/cpuidle/state
state0/ state1/ state2/
(poll idle) (C1) (C3)
So with this change it would throw state=2, but it is C3, which has to be
grabbed from:
/sys/devices/system/cpu/cpu0/cpuidle/state2/name
But in intel_idle case the state names are bit long to be displayed nicely
in perf timechart:
{ /* MWAIT C1 */
.name = "NHM-C1",
.desc = "MWAIT 0x00",
{ /* MWAIT C3 */
.name = "NHM-C6",
.desc = "MWAIT 0x20",
Therefore I suggest to:
Shorten .name (also by cutting the char array down) to 2 characters
1) Either introduce convention that first word in .desc is the longer name,
arbitrary flags (as strings) follow, like:
{ /* MWAIT C3 */
.name = "C3",
.desc = "NHM-C6 MWAIT 0x20",
2) Or introduce another sysfs file, however it's named:
{ /* MWAIT C3 */
.name = "NHM-C6",
.desc = "MWAIT 0x20",
.abbr = "C3"
The latter is fully userspace (sysfs) compatible, I'd go for that.
If nobody objects, I may have time to implement this (will take more than a week) which includes:
- place cpu_idle events correctly, so that they always get thrown when idle
and no double end marker events can happen
- Make above cpuidle name/sysfs changes
- Let perf timechart grab the correct idle state name from sysfs.
-> Mapping between the cpu_idle event and the sysfs cpuidle state.
Thanks,
Thomas
--------
PERF: fix power:cpu_idle double end events - always throw events with acpi_idle
Convention should be:
Fire cpu_idle events inside the current pm_idle function (not somewhere
down the the callee tree) to keep things easy.
This is not always possible, only exception is:
c1e_idle -> calls default_idle which throws events -> easy and obvious.
Current possible pm_idle functions in X86:
c1e_idle, poll_idle, cpuidle_idle_call, mwait_idle, default_idle
-> this is really easy is now (if I haven't overseen something).
This also introduces another cleanup which may affect userspace:
The type field of the cpu_idle power event can now direclty get
mapped to:
/sys/devices/system/cpu/cpuX/cpuidle/stateX/{name,desc,usage,time,...}
instead of throwing very CPU/mwait specific numbers.
Fortunately this change is not visible for the intel_idle driver.
For the acpi_idle driver it should only be visible if the vendor
misses out C-states in his BIOS.
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Robert Schoene <robert.schoene@tu-dresden.de>
CC: Arjan van de Ven <arjan@linux.intel.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: linux-trace-users@vger.kernel.org
CC: linux-pm@lists.linux-foundation.org
CC: Len Brown <lenb@kernel.org>
CC: linux-acpi@vger.kernel.org
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 155d975..b618548 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -387,6 +387,8 @@ void default_idle(void)
else
local_irq_enable();
current_thread_info()->status |= TS_POLLING;
+ trace_power_end(smp_processor_id());
+ trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
} else {
local_irq_enable();
/* loop is done by the caller */
@@ -444,8 +446,6 @@ EXPORT_SYMBOL_GPL(cpu_idle_wait);
*/
void mwait_idle_with_hints(unsigned long ax, unsigned long cx)
{
- trace_power_start(POWER_CSTATE, (ax>>4)+1, smp_processor_id());
- trace_cpu_idle((ax>>4)+1, smp_processor_id());
if (!need_resched()) {
if (cpu_has(¤t_cpu_data, X86_FEATURE_CLFLUSH_MONITOR))
clflush((void *)¤t_thread_info()->flags);
@@ -472,6 +472,8 @@ static void mwait_idle(void)
__sti_mwait(0, 0);
else
local_irq_enable();
+ trace_power_end(smp_processor_id());
+ trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
} else
local_irq_enable();
}
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 4b9befa..8d12878 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -57,8 +57,6 @@
#include <asm/syscalls.h>
#include <asm/debugreg.h>
-#include <trace/events/power.h>
-
asmlinkage void ret_from_fork(void) __asm__("ret_from_fork");
/*
@@ -113,8 +111,6 @@ void cpu_idle(void)
stop_critical_timings();
pm_idle();
start_critical_timings();
- trace_power_end(smp_processor_id());
- trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
tick_nohz_restart_sched_tick();
preempt_enable_no_resched();
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 28153a9..d7b3e95 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -51,8 +51,6 @@
#include <asm/syscalls.h>
#include <asm/debugreg.h>
-#include <trace/events/power.h>
-
asmlinkage extern void ret_from_fork(void);
DEFINE_PER_CPU(unsigned long, old_rsp);
@@ -141,10 +139,6 @@ void cpu_idle(void)
pm_idle();
start_critical_timings();
- trace_power_end(smp_processor_id());
- trace_cpu_idle(PWR_EVENT_EXIT,
- smp_processor_id());
-
/* In many cases the interrupt that ended idle
has already called exit_idle. But some idle
loops can be woken up without interrupt. */
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 08d5f05..491a4af 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -96,7 +96,15 @@ static void cpuidle_idle_call(void)
/* enter the state and update stats */
dev->last_state = target_state;
+
+ trace_power_start(POWER_CSTATE, next_state, dev->cpu);
+ trace_cpu_idle(next_state, dev->cpu);
+
dev->last_residency = target_state->enter(dev, target_state);
+
+ trace_power_end(dev->cpu);
+ trace_cpu_idle(PWR_EVENT_EXIT, dev->cpu);
+
if (dev->last_state)
target_state = dev->last_state;
@@ -106,8 +114,6 @@ static void cpuidle_idle_call(void)
/* give the governor an opportunity to reflect on the outcome */
if (cpuidle_curr_governor->reflect)
cpuidle_curr_governor->reflect(dev);
- trace_power_end(smp_processor_id());
- trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
/**
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index d3701bf..5539cff 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -201,8 +201,6 @@ static int intel_idle(struct cpuidle_device *dev, struct cpuidle_state *state)
kt_before = ktime_get_real();
stop_critical_timings();
- trace_power_start(POWER_CSTATE, (eax >> 4) + 1, cpu);
- trace_cpu_idle((eax >> 4) + 1, cpu);
if (!need_resched()) {
__monitor((void *)¤t_thread_info()->flags, 0, 0);
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [RFC] PERF: fix power:cpu_idle double end events and missing acpi_idle events - make cpu_idle power events cpuidle sysfs conform
2010-10-27 15:42 ` Thomas Renninger
@ 2010-10-28 0:46 ` Thomas Renninger
2010-10-28 0:46 ` Thomas Renninger
` (2 subsequent siblings)
3 siblings, 0 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-10-28 0:46 UTC (permalink / raw)
To: arjan
Cc: linux-acpi, jean.pihet, Robert Schoene, linux-trace-users, mingo,
linux-pm
On Wednesday 27 October 2010 05:42:04 pm Thomas Renninger wrote:
> On Wednesday 27 October 2010 01:43:25 Thomas Renninger wrote:
> > cpu_idle events (transition into sleep state and exiting) are
> > both fired in pm_idle().
> >
> > Entering sleep state and exiting should always get fired inside
> > pm_idle() already.
> >
> > This is a revert of commit c882e0feb937af4e5b991cbd1c
..
> Argh, I tried to come up with patches, but run out of
> time. I will send something soon.
Below patch should fix and clean up current cpu_idle (former power_start/end)
event throwing policy.
But it introduces a change which state number is thrown in acpi_idle
mwait case. This should be done anyway, but needs further adjustings
in userspace.
If a cpuidle driver is registered, now the event throws the state which
maps to the registered cpuidle state.
This is an important fix, so that the whole cpu_idle event (state)
interface gets architecture independent.
For the intel_idle driver it currently does not matter because the cpuidle
and mwait C1/2/3 states by luck exactly match.
But on my machine with acpi_idle I only get 2 states: C1/C3.
ls /sys/devices/system/cpu/cpu0/cpuidle/state
state0/ state1/ state2/
(poll idle) (C1) (C3)
So with this change it would throw state=2, but it is C3, which has to be
grabbed from:
/sys/devices/system/cpu/cpu0/cpuidle/state2/name
But in intel_idle case the state names are bit long to be displayed nicely
in perf timechart:
{ /* MWAIT C1 */
.name = "NHM-C1",
.desc = "MWAIT 0x00",
{ /* MWAIT C3 */
.name = "NHM-C6",
.desc = "MWAIT 0x20",
Therefore I suggest to:
Shorten .name (also by cutting the char array down) to 2 characters
1) Either introduce convention that first word in .desc is the longer name,
arbitrary flags (as strings) follow, like:
{ /* MWAIT C3 */
.name = "C3",
.desc = "NHM-C6 MWAIT 0x20",
2) Or introduce another sysfs file, however it's named:
{ /* MWAIT C3 */
.name = "NHM-C6",
.desc = "MWAIT 0x20",
.abbr = "C3"
The latter is fully userspace (sysfs) compatible, I'd go for that.
If nobody objects, I may have time to implement this (will take more than a week) which includes:
- place cpu_idle events correctly, so that they always get thrown when idle
and no double end marker events can happen
- Make above cpuidle name/sysfs changes
- Let perf timechart grab the correct idle state name from sysfs.
-> Mapping between the cpu_idle event and the sysfs cpuidle state.
Thanks,
Thomas
--------
PERF: fix power:cpu_idle double end events - always throw events with acpi_idle
Convention should be:
Fire cpu_idle events inside the current pm_idle function (not somewhere
down the the callee tree) to keep things easy.
This is not always possible, only exception is:
c1e_idle -> calls default_idle which throws events -> easy and obvious.
Current possible pm_idle functions in X86:
c1e_idle, poll_idle, cpuidle_idle_call, mwait_idle, default_idle
-> this is really easy is now (if I haven't overseen something).
This also introduces another cleanup which may affect userspace:
The type field of the cpu_idle power event can now direclty get
mapped to:
/sys/devices/system/cpu/cpuX/cpuidle/stateX/{name,desc,usage,time,...}
instead of throwing very CPU/mwait specific numbers.
Fortunately this change is not visible for the intel_idle driver.
For the acpi_idle driver it should only be visible if the vendor
misses out C-states in his BIOS.
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Robert Schoene <robert.schoene@tu-dresden.de>
CC: Arjan van de Ven <arjan@linux.intel.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: linux-trace-users@vger.kernel.org
CC: linux-pm@lists.linux-foundation.org
CC: Len Brown <lenb@kernel.org>
CC: linux-acpi@vger.kernel.org
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 155d975..b618548 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -387,6 +387,8 @@ void default_idle(void)
else
local_irq_enable();
current_thread_info()->status |= TS_POLLING;
+ trace_power_end(smp_processor_id());
+ trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
} else {
local_irq_enable();
/* loop is done by the caller */
@@ -444,8 +446,6 @@ EXPORT_SYMBOL_GPL(cpu_idle_wait);
*/
void mwait_idle_with_hints(unsigned long ax, unsigned long cx)
{
- trace_power_start(POWER_CSTATE, (ax>>4)+1, smp_processor_id());
- trace_cpu_idle((ax>>4)+1, smp_processor_id());
if (!need_resched()) {
if (cpu_has(¤t_cpu_data, X86_FEATURE_CLFLUSH_MONITOR))
clflush((void *)¤t_thread_info()->flags);
@@ -472,6 +472,8 @@ static void mwait_idle(void)
__sti_mwait(0, 0);
else
local_irq_enable();
+ trace_power_end(smp_processor_id());
+ trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
} else
local_irq_enable();
}
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 4b9befa..8d12878 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -57,8 +57,6 @@
#include <asm/syscalls.h>
#include <asm/debugreg.h>
-#include <trace/events/power.h>
-
asmlinkage void ret_from_fork(void) __asm__("ret_from_fork");
/*
@@ -113,8 +111,6 @@ void cpu_idle(void)
stop_critical_timings();
pm_idle();
start_critical_timings();
- trace_power_end(smp_processor_id());
- trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
tick_nohz_restart_sched_tick();
preempt_enable_no_resched();
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 28153a9..d7b3e95 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -51,8 +51,6 @@
#include <asm/syscalls.h>
#include <asm/debugreg.h>
-#include <trace/events/power.h>
-
asmlinkage extern void ret_from_fork(void);
DEFINE_PER_CPU(unsigned long, old_rsp);
@@ -141,10 +139,6 @@ void cpu_idle(void)
pm_idle();
start_critical_timings();
- trace_power_end(smp_processor_id());
- trace_cpu_idle(PWR_EVENT_EXIT,
- smp_processor_id());
-
/* In many cases the interrupt that ended idle
has already called exit_idle. But some idle
loops can be woken up without interrupt. */
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 08d5f05..491a4af 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -96,7 +96,15 @@ static void cpuidle_idle_call(void)
/* enter the state and update stats */
dev->last_state = target_state;
+
+ trace_power_start(POWER_CSTATE, next_state, dev->cpu);
+ trace_cpu_idle(next_state, dev->cpu);
+
dev->last_residency = target_state->enter(dev, target_state);
+
+ trace_power_end(dev->cpu);
+ trace_cpu_idle(PWR_EVENT_EXIT, dev->cpu);
+
if (dev->last_state)
target_state = dev->last_state;
@@ -106,8 +114,6 @@ static void cpuidle_idle_call(void)
/* give the governor an opportunity to reflect on the outcome */
if (cpuidle_curr_governor->reflect)
cpuidle_curr_governor->reflect(dev);
- trace_power_end(smp_processor_id());
- trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
}
/**
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index d3701bf..5539cff 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -201,8 +201,6 @@ static int intel_idle(struct cpuidle_device *dev, struct cpuidle_state *state)
kt_before = ktime_get_real();
stop_critical_timings();
- trace_power_start(POWER_CSTATE, (eax >> 4) + 1, cpu);
- trace_cpu_idle((eax >> 4) + 1, cpu);
if (!need_resched()) {
__monitor((void *)¤t_thread_info()->flags, 0, 0);
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] PERF: fix power:cpu_idle double end events
2010-10-27 15:42 ` Thomas Renninger
2010-10-28 0:46 ` [RFC] PERF: fix power:cpu_idle double end events and missing acpi_idle events - make cpu_idle power events cpuidle sysfs conform Thomas Renninger
2010-10-28 0:46 ` Thomas Renninger
@ 2010-11-01 8:11 ` Robert Schöne
2010-11-01 8:11 ` Robert Schöne
3 siblings, 0 replies; 17+ messages in thread
From: Robert Schöne @ 2010-11-01 8:11 UTC (permalink / raw)
To: Thomas Renninger
Cc: jean.pihet, linux-trace-users, mingo, linux-omap, arjan, linux-pm
Am Mittwoch, den 27.10.2010, 17:42 +0200 schrieb Thomas Renninger:
> Robert: I expect you tested this on a machine with no cpuidle
> driver registered?
You're right, there was no idle driver, but the idle process from
process_64.c which called the idle routine.
I reported my thoughts on this on 14th of May this year 2010, mostly
claiming for a standard on where to report these events.
You're also missing the other idle routines from x86/kernel/process.c
mwait_idle_with_hints and mwait_idle only throw start events, so they
should behave like default_idle. poll_idle on the other hand reports the
end event itself.
Robert
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] PERF: fix power:cpu_idle double end events
2010-10-27 15:42 ` Thomas Renninger
` (2 preceding siblings ...)
2010-11-01 8:11 ` [PATCH 4/4] PERF: fix power:cpu_idle double end events Robert Schöne
@ 2010-11-01 8:11 ` Robert Schöne
2010-11-04 8:57 ` Thomas Renninger
2010-11-04 8:57 ` Thomas Renninger
3 siblings, 2 replies; 17+ messages in thread
From: Robert Schöne @ 2010-11-01 8:11 UTC (permalink / raw)
To: Thomas Renninger
Cc: arjan, jean.pihet, mingo, rjw, linux-omap, linux-pm, linux-trace-users
Am Mittwoch, den 27.10.2010, 17:42 +0200 schrieb Thomas Renninger:
> Robert: I expect you tested this on a machine with no cpuidle
> driver registered?
You're right, there was no idle driver, but the idle process from
process_64.c which called the idle routine.
I reported my thoughts on this on 14th of May this year 2010, mostly
claiming for a standard on where to report these events.
You're also missing the other idle routines from x86/kernel/process.c
mwait_idle_with_hints and mwait_idle only throw start events, so they
should behave like default_idle. poll_idle on the other hand reports the
end event itself.
Robert
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] PERF: fix power:cpu_idle double end events
2010-11-01 8:11 ` Robert Schöne
2010-11-04 8:57 ` Thomas Renninger
@ 2010-11-04 8:57 ` Thomas Renninger
1 sibling, 0 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-11-04 8:57 UTC (permalink / raw)
To: Robert Schöne
Cc: jean.pihet, linux-trace-users, mingo, linux-omap, arjan, linux-pm
Hi Robert,
On Monday 01 November 2010 09:11:28 Robert Schöne wrote:
> Am Mittwoch, den 27.10.2010, 17:42 +0200 schrieb Thomas Renninger:
>
> > Robert: I expect you tested this on a machine with no cpuidle
> > driver registered?
>
> You're right, there was no idle driver, but the idle process from
> process_64.c which called the idle routine.
> I reported my thoughts on this on 14th of May this year 2010, mostly
> claiming for a standard on where to report these events.
>
> You're also missing the other idle routines from x86/kernel/process.c
> mwait_idle_with_hints and mwait_idle only throw start events, so they
> should behave like default_idle. poll_idle on the other hand reports
> the end event itself.
I added you to CC of a patch that fixes the issue (and all other double
or missing (for acpi_idle driver) events) in a nice generic way.
It needs further userspace adjustings and I still wait for this
separate patch series to go into some branch.
Thanks,
Thomas
_______________________________________________
linux-pm mailing list
linux-pm@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/linux-pm
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] PERF: fix power:cpu_idle double end events
2010-11-01 8:11 ` Robert Schöne
@ 2010-11-04 8:57 ` Thomas Renninger
2010-11-04 8:57 ` Thomas Renninger
1 sibling, 0 replies; 17+ messages in thread
From: Thomas Renninger @ 2010-11-04 8:57 UTC (permalink / raw)
To: Robert Schöne
Cc: arjan, jean.pihet, mingo, rjw, linux-omap, linux-pm, linux-trace-users
Hi Robert,
On Monday 01 November 2010 09:11:28 Robert Schöne wrote:
> Am Mittwoch, den 27.10.2010, 17:42 +0200 schrieb Thomas Renninger:
>
> > Robert: I expect you tested this on a machine with no cpuidle
> > driver registered?
>
> You're right, there was no idle driver, but the idle process from
> process_64.c which called the idle routine.
> I reported my thoughts on this on 14th of May this year 2010, mostly
> claiming for a standard on where to report these events.
>
> You're also missing the other idle routines from x86/kernel/process.c
> mwait_idle_with_hints and mwait_idle only throw start events, so they
> should behave like default_idle. poll_idle on the other hand reports
> the end event itself.
I added you to CC of a patch that fixes the issue (and all other double
or missing (for acpi_idle driver) events) in a nice generic way.
It needs further userspace adjustings and I still wait for this
separate patch series to go into some branch.
Thanks,
Thomas
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2010-11-04 8:57 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-10-26 23:43 Cleanup and fixes for power trace events Thomas Renninger
2010-10-26 23:43 ` [PATCH 1/4] PERF: Do not export power_frequency, but power_start event Thomas Renninger
2010-10-26 23:43 ` Thomas Renninger
2010-10-26 23:43 ` [PATCH 2/4] PERF(kernel): Cleanup power events V3 Thomas Renninger
2010-10-26 23:43 ` Thomas Renninger
2010-10-26 23:43 ` [PATCH 3/4] PERF(userspace): Adjust perf timechart to the new " Thomas Renninger
2010-10-26 23:43 ` Thomas Renninger
2010-10-26 23:43 ` [PATCH 4/4] PERF: fix power:cpu_idle double end events Thomas Renninger
2010-10-26 23:43 ` Thomas Renninger
2010-10-27 15:42 ` Thomas Renninger
2010-10-27 15:42 ` Thomas Renninger
2010-10-28 0:46 ` [RFC] PERF: fix power:cpu_idle double end events and missing acpi_idle events - make cpu_idle power events cpuidle sysfs conform Thomas Renninger
2010-10-28 0:46 ` Thomas Renninger
2010-11-01 8:11 ` [PATCH 4/4] PERF: fix power:cpu_idle double end events Robert Schöne
2010-11-01 8:11 ` Robert Schöne
2010-11-04 8:57 ` Thomas Renninger
2010-11-04 8:57 ` Thomas Renninger
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.