All of lore.kernel.org
* [PATCH 0/7] perf updates and fixes
@ 2010-03-26  1:52 Frederic Weisbecker
  2010-03-26  1:52 ` [PATCH 1/7] perf: Drop the frame reliability check Frederic Weisbecker
                   ` (7 more replies)
  0 siblings, 8 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2010-03-26  1:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, David Miller,
	Steven Rostedt, Archs, H . Peter Anvin, Thomas Gleixner,
	Stephane Eranian

Hi,

The series is not yet mergeable because it would break PowerPC
(the hot regs snapshot API has changed, and I don't know how
to update PowerPC for that).

But if you're fine with the ideas, I can integrate the necessary
changes to fix this, and also separate fixes and updates.

Thanks.

Frederic Weisbecker (7):
  perf: Drop the frame reliability check
  perf: Fetch hot regs from the template caller
  x86: Unify dumpstack.h and stacktrace.h
  perf: Move perf_arch_fetch_caller_regs into a macro
  perf: Make perf_fetch_caller_regs rewind to the first caller only
  perf: Use hot regs with software sched/migrate events
  perf: Correctly align perf event tracing buffer

 arch/x86/include/asm/perf_event.h |   10 ++++++-
 arch/x86/include/asm/stacktrace.h |   45 ++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/perf_event.c  |   18 +------------
 arch/x86/kernel/dumpstack.c       |    1 -
 arch/x86/kernel/dumpstack.h       |   47 ----------------------------------
 arch/x86/kernel/dumpstack_32.c    |    2 -
 arch/x86/kernel/dumpstack_64.c    |    2 -
 arch/x86/kernel/stacktrace.c      |    7 +++--
 include/linux/perf_event.h        |   51 ++++++++++++++----------------------
 include/trace/ftrace.h            |   23 ++++++++--------
 kernel/perf_event.c               |   10 +------
 kernel/trace/trace_event_perf.c   |   13 ++++++---
 12 files changed, 101 insertions(+), 128 deletions(-)
 delete mode 100644 arch/x86/kernel/dumpstack.h


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/7] perf: Drop the frame reliability check
  2010-03-26  1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
@ 2010-03-26  1:52 ` Frederic Weisbecker
  2010-03-26  1:52 ` [PATCH 2/7] perf: Fetch hot regs from the template caller Frederic Weisbecker
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2010-03-26  1:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra, Stephane Eranian,
	Arnaldo Carvalho de Melo, Paul Mackerras, Ingo Molnar

It is useless now that we have a pure stack frame
walker, as the given addresses are always reliable.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/kernel/cpu/perf_event.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index f571f51..9bc6550 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1597,8 +1597,7 @@ static void backtrace_address(void *data, unsigned long addr, int reliable)
 {
 	struct perf_callchain_entry *entry = data;
 
-	if (reliable)
-		callchain_store(entry, addr);
+	callchain_store(entry, addr);
 }
 
 static const struct stacktrace_ops backtrace_ops = {
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/7] perf: Fetch hot regs from the template caller
  2010-03-26  1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
  2010-03-26  1:52 ` [PATCH 1/7] perf: Drop the frame reliability check Frederic Weisbecker
@ 2010-03-26  1:52 ` Frederic Weisbecker
  2010-03-26  1:52 ` [PATCH 3/7] x86: Unify dumpstack.h and stacktrace.h Frederic Weisbecker
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2010-03-26  1:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Ingo Molnar

Trace events can be defined from a template using
DECLARE_EVENT_CLASS/DEFINE_EVENT or directly with TRACE_EVENT.

In both cases we have a template tracepoint handler, used to
record the trace, to which we pass our ftrace event instance.

At the function level, if the class is named "foo" and the event
is named "blah", we have the following chain of calls:

perf_trace_blah() -> perf_trace_templ_foo()

If several events share the class "foo", we'll have multiple
users of perf_trace_templ_foo(), and it won't be inlined by the
compiler. This is usually what happens with definitions based on
DECLARE_EVENT_CLASS/DEFINE_EVENT (see the sketch below).

But if perf_trace_blah() is the only caller of perf_trace_templ_foo(),
there is a fair chance that it will be inlined.
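
For reference, here is a minimal sketch of such a class/event pair
(the "foo"/"blah" names are purely illustrative, they don't exist in
the tree):

	DECLARE_EVENT_CLASS(foo,
		TP_PROTO(int arg),
		TP_ARGS(arg),
		TP_STRUCT__entry(__field(int, arg)),
		TP_fast_assign(__entry->arg = arg;),
		TP_printk("arg=%d", __entry->arg)
	);

	/* Each DEFINE_EVENT expands to a perf_trace_<event>() stub
	 * that calls the shared perf_trace_templ_foo() handler. */
	DEFINE_EVENT(foo, blah, TP_PROTO(int arg), TP_ARGS(arg));
	DEFINE_EVENT(foo, blah2, TP_PROTO(int arg), TP_ARGS(arg));

With two events defined, perf_trace_templ_foo() has two callers and
stays out of line; with a single DEFINE_EVENT (or a plain TRACE_EVENT),
the compiler may well inline it into perf_trace_blah().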

The problem is that we fetch the regs from perf_trace_templ_foo()
after rewinding the frame pointer to the second caller: we want to
reach the caller of perf_trace_blah() to get the right source of
the event. Doing so always assumes that perf_trace_templ_foo() is
not inlined. But as shown above this is not always true, and if it
is inlined we miss the first caller, losing the most important
level of precision.

We get:
	    61.31%       ls  [kernel.kallsyms]  [k] do_softirq
                         |
                         --- do_softirq
                             irq_exit
                             do_IRQ
                             common_interrupt
                            |
                            |--25.00%-- tty_buffer_request_room

Instead of:
	    61.31%       ls  [kernel.kallsyms]  [k] __do_softirq
                         |
                         --- __do_softirq
                             do_softirq
                             irq_exit
                             do_IRQ
                             common_interrupt
                            |
                            |--25.00%-- tty_buffer_request_room

To fix this, we fetch the regs from perf_trace_blah() rather than
perf_trace_templ_foo() so that we don't have to deal with inlining
surprises.

That also brings us the advantage of having the true source of the
event even if we don't have frame pointers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
---
 include/trace/ftrace.h |   23 ++++++++++++-----------
 1 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index ea6f9d4..882c648 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -758,13 +758,12 @@ __attribute__((section("_ftrace_events"))) event_##call = {		\
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
 static notrace void							\
 perf_trace_templ_##call(struct ftrace_event_call *event_call,		\
-			    proto)					\
+			struct pt_regs *__regs, proto)			\
 {									\
 	struct ftrace_data_offsets_##call __maybe_unused __data_offsets;\
 	struct ftrace_raw_##call *entry;				\
 	u64 __addr = 0, __count = 1;					\
 	unsigned long irq_flags;					\
-	struct pt_regs *__regs;						\
 	int __entry_size;						\
 	int __data_size;						\
 	int rctx;							\
@@ -785,20 +784,22 @@ perf_trace_templ_##call(struct ftrace_event_call *event_call,		\
 									\
 	{ assign; }							\
 									\
-	__regs = &__get_cpu_var(perf_trace_regs);			\
-	perf_fetch_caller_regs(__regs, 2);				\
-									\
 	perf_trace_buf_submit(entry, __entry_size, rctx, __addr,	\
 			       __count, irq_flags, __regs);		\
 }
 
 #undef DEFINE_EVENT
-#define DEFINE_EVENT(template, call, proto, args)		\
-static notrace void perf_trace_##call(proto)			\
-{								\
-	struct ftrace_event_call *event_call = &event_##call;	\
-								\
-	perf_trace_templ_##template(event_call, args);		\
+#define DEFINE_EVENT(template, call, proto, args)			\
+static notrace void perf_trace_##call(proto)				\
+{									\
+	struct ftrace_event_call *event_call = &event_##call;		\
+	struct pt_regs *__regs = &get_cpu_var(perf_trace_regs);		\
+									\
+	perf_fetch_caller_regs(__regs, 1);				\
+									\
+	perf_trace_templ_##template(event_call, __regs, args);		\
+									\
+	put_cpu_var(perf_trace_regs);					\
 }
 
 #undef DEFINE_EVENT_PRINT
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/7] x86: Unify dumpstack.h and stacktrace.h
  2010-03-26  1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
  2010-03-26  1:52 ` [PATCH 1/7] perf: Drop the frame reliability check Frederic Weisbecker
  2010-03-26  1:52 ` [PATCH 2/7] perf: Fetch hot regs from the template caller Frederic Weisbecker
@ 2010-03-26  1:52 ` Frederic Weisbecker
  2010-03-26  1:52   ` Frederic Weisbecker
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2010-03-26  1:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Ingo Molnar, H. Peter Anvin, Thomas Gleixner

arch/x86/include/asm/stacktrace.h and arch/x86/kernel/dumpstack.h
declare prototypes for closely related facilities. In fact, most of
the files that include stacktrace.h also include dumpstack.h.

Although dumpstack.h seems reserved for stack trace internals, those
internals are quite often needed to define specialized stack trace
operations, and the perf event arch headers are going to need access
to such low level operations anyway. So stop bothering with
dumpstack.h, as it is no longer about isolated deep internals.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/stacktrace.h |   46 ++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/perf_event.c  |    2 -
 arch/x86/kernel/dumpstack.c       |    1 -
 arch/x86/kernel/dumpstack.h       |   47 -------------------------------------
 arch/x86/kernel/dumpstack_32.c    |    2 -
 arch/x86/kernel/dumpstack_64.c    |    2 -
 arch/x86/kernel/stacktrace.c      |    7 +++--
 7 files changed, 50 insertions(+), 57 deletions(-)
 delete mode 100644 arch/x86/kernel/dumpstack.h

diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h
index 4dab78e..8fb70b7 100644
--- a/arch/x86/include/asm/stacktrace.h
+++ b/arch/x86/include/asm/stacktrace.h
@@ -1,3 +1,8 @@
+/*
+ *  Copyright (C) 1991, 1992  Linus Torvalds
+ *  Copyright (C) 2000, 2001, 2002 Andi Kleen, SuSE Labs
+ */
+
 #ifndef _ASM_X86_STACKTRACE_H
 #define _ASM_X86_STACKTRACE_H
 
@@ -38,8 +43,49 @@ struct stacktrace_ops {
 	walk_stack_t	walk_stack;
 };
 
+struct pt_regs;
+struct task_struct;
+
 void dump_trace(struct task_struct *tsk, struct pt_regs *regs,
 		unsigned long *stack, unsigned long bp,
 		const struct stacktrace_ops *ops, void *data);
 
+#ifdef CONFIG_X86_32
+#define STACKSLOTS_PER_LINE 8
+#define get_bp(bp) asm("movl %%ebp, %0" : "=r" (bp) :)
+#else
+#define STACKSLOTS_PER_LINE 4
+#define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :)
+#endif
+
+extern void
+show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
+		unsigned long *stack, unsigned long bp, char *log_lvl);
+
+extern void
+show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
+		unsigned long *sp, unsigned long bp, char *log_lvl);
+
+extern unsigned int code_bytes;
+
+/* The form of the top of the frame on the stack */
+struct stack_frame {
+	struct stack_frame *next_frame;
+	unsigned long return_address;
+};
+
+static inline unsigned long rewind_frame_pointer(int n)
+{
+	struct stack_frame *frame;
+
+	get_bp(frame);
+
+#ifdef CONFIG_FRAME_POINTER
+	while (n--)
+		frame = frame->next_frame;
+#endif
+
+	return (unsigned long)frame;
+}
+
 #endif /* _ASM_X86_STACKTRACE_H */
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 9bc6550..c3f203e 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1608,8 +1608,6 @@ static const struct stacktrace_ops backtrace_ops = {
 	.walk_stack		= print_context_stack_bp,
 };
 
-#include "../dumpstack.h"
-
 static void
 perf_callchain_kernel(struct pt_regs *regs, struct perf_callchain_entry *entry)
 {
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 6d81755..b5329df 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -18,7 +18,6 @@
 
 #include <asm/stacktrace.h>
 
-#include "dumpstack.h"
 
 int panic_on_unrecovered_nmi;
 int panic_on_io_nmi;
diff --git a/arch/x86/kernel/dumpstack.h b/arch/x86/kernel/dumpstack.h
deleted file mode 100644
index 29e5f7c..0000000
--- a/arch/x86/kernel/dumpstack.h
+++ /dev/null
@@ -1,47 +0,0 @@
-/*
- *  Copyright (C) 1991, 1992  Linus Torvalds
- *  Copyright (C) 2000, 2001, 2002 Andi Kleen, SuSE Labs
- */
-
-#ifndef DUMPSTACK_H
-#define DUMPSTACK_H
-
-#ifdef CONFIG_X86_32
-#define STACKSLOTS_PER_LINE 8
-#define get_bp(bp) asm("movl %%ebp, %0" : "=r" (bp) :)
-#else
-#define STACKSLOTS_PER_LINE 4
-#define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :)
-#endif
-
-extern void
-show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
-		unsigned long *stack, unsigned long bp, char *log_lvl);
-
-extern void
-show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
-		unsigned long *sp, unsigned long bp, char *log_lvl);
-
-extern unsigned int code_bytes;
-
-/* The form of the top of the frame on the stack */
-struct stack_frame {
-	struct stack_frame *next_frame;
-	unsigned long return_address;
-};
-
-static inline unsigned long rewind_frame_pointer(int n)
-{
-	struct stack_frame *frame;
-
-	get_bp(frame);
-
-#ifdef CONFIG_FRAME_POINTER
-	while (n--)
-		frame = frame->next_frame;
-#endif
-
-	return (unsigned long)frame;
-}
-
-#endif /* DUMPSTACK_H */
diff --git a/arch/x86/kernel/dumpstack_32.c b/arch/x86/kernel/dumpstack_32.c
index 11540a1..0f6376f 100644
--- a/arch/x86/kernel/dumpstack_32.c
+++ b/arch/x86/kernel/dumpstack_32.c
@@ -16,8 +16,6 @@
 
 #include <asm/stacktrace.h>
 
-#include "dumpstack.h"
-
 
 void dump_trace(struct task_struct *task, struct pt_regs *regs,
 		unsigned long *stack, unsigned long bp,
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index 272c9f1..77fc784 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -16,8 +16,6 @@
 
 #include <asm/stacktrace.h>
 
-#include "dumpstack.h"
-
 #define N_EXCEPTION_STACKS_END \
 		(N_EXCEPTION_STACKS + DEBUG_STKSZ/EXCEPTION_STKSZ - 2)
 
diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
index 922eefb..ea54d02 100644
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -96,12 +96,13 @@ EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
 
 /* Userspace stacktrace - based on kernel/trace/trace_sysprof.c */
 
-struct stack_frame {
+struct stack_frame_user {
 	const void __user	*next_fp;
 	unsigned long		ret_addr;
 };
 
-static int copy_stack_frame(const void __user *fp, struct stack_frame *frame)
+static int
+copy_stack_frame(const void __user *fp, struct stack_frame_user *frame)
 {
 	int ret;
 
@@ -126,7 +127,7 @@ static inline void __save_stack_trace_user(struct stack_trace *trace)
 		trace->entries[trace->nr_entries++] = regs->ip;
 
 	while (trace->nr_entries < trace->max_entries) {
-		struct stack_frame frame;
+		struct stack_frame_user frame;
 
 		frame.next_fp = NULL;
 		frame.ret_addr = 0;
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/7] perf: Move perf_arch_fetch_caller_regs into a macro
  2010-03-26  1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
  2010-03-26  1:52 ` [PATCH 1/7] perf: Drop the frame reliability check Frederic Weisbecker
@ 2010-03-26  1:52   ` Frederic Weisbecker
  2010-03-26  1:52 ` [PATCH 3/7] x86: Unify dumpstack.h and stacktrace.h Frederic Weisbecker
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2010-03-26  1:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Ingo Molnar,
	David Miller, Archs

Having perf_arch_fetch_caller_regs() as a weak function overridable
by arch implementations makes it hard to export its symbol for
modules, as we need to export the symbol in a file other than the one
holding the generic weak definition to work around compiler bugs.

Currently that is not a problem because we can export the symbol from
trace_event_perf.c, as only trace events use it. But we are going to
make some generic software events use this new facility, so the symbol
must be available even if trace events are not built.

Fix the problem by making perf_arch_fetch_caller_regs a macro that
archs can define in their <asm/perf_event.h>.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Archs <linux-arch@vger.kernel.org>
---
 arch/x86/include/asm/perf_event.h |   10 +++++++++-
 arch/x86/kernel/cpu/perf_event.c  |   13 -------------
 include/linux/perf_event.h        |    8 +++++---
 kernel/perf_event.c               |    7 -------
 kernel/trace/trace_event_perf.c   |    2 --
 5 files changed, 14 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 124dddd..4bf3d37 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -155,7 +155,15 @@ extern void perf_events_lapic_init(void);
 
 #define perf_instruction_pointer(regs)	((regs)->ip)
 
-#else
+#include <asm/stacktrace.h>
+
+#define perf_arch_fetch_caller_regs(regs, ip, skip)	\
+	(regs)->ip = ip;				\
+	(regs)->bp = rewind_frame_pointer(skip);	\
+	(regs)->cs = __KERNEL_CS;			\
+	local_save_flags((regs)->flags);
+
+#else /* !CONFIG_PERF_EVENTS */
 static inline void init_hw_perf_events(void)		{ }
 static inline void perf_events_lapic_init(void)	{ }
 #endif
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index c3f203e..43fae02 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1691,16 +1691,3 @@ struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
 	return entry;
 }
 
-#ifdef CONFIG_EVENT_TRACING
-void perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip)
-{
-	regs->ip = ip;
-	/*
-	 * perf_arch_fetch_caller_regs adds another call, we need to increment
-	 * the skip level
-	 */
-	regs->bp = rewind_frame_pointer(skip + 1);
-	regs->cs = __KERNEL_CS;
-	local_save_flags(regs->flags);
-}
-#endif
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 2bccb7b..76b680f 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -867,8 +867,10 @@ perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr)
 		__perf_sw_event(event_id, nr, nmi, regs, addr);
 }
 
-extern void
-perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip);
+#ifndef perf_arch_fetch_caller_regs
+static inline void
+perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip){ }
+#endif
 
 /*
  * Take a snapshot of the regs. Skip ip and frame pointer to
@@ -902,7 +904,7 @@ static inline void perf_fetch_caller_regs(struct pt_regs *regs, int skip)
 		ip = 0;
 	}
 
-	return perf_arch_fetch_caller_regs(regs, ip, skip);
+	perf_arch_fetch_caller_regs(regs, ip, skip);
 }
 
 extern void __perf_event_mmap(struct vm_area_struct *vma);
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 455393e..bbed6f0 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -2790,13 +2790,6 @@ __weak struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
 	return NULL;
 }
 
-#ifdef CONFIG_EVENT_TRACING
-__weak
-void perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip)
-{
-}
-#endif
-
 /*
  * Output
  */
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 81f691e..8e9edcd 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -12,8 +12,6 @@
 DEFINE_PER_CPU(struct pt_regs, perf_trace_regs);
 EXPORT_PER_CPU_SYMBOL_GPL(perf_trace_regs);
 
-EXPORT_SYMBOL_GPL(perf_arch_fetch_caller_regs);
-
 static char *perf_trace_buf;
 static char *perf_trace_buf_nmi;
 
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 5/7] perf: Make perf_fetch_caller_regs rewind to the first caller only
  2010-03-26  1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
  2010-03-26  1:52 ` [PATCH 1/7] perf: Drop the frame reliability check Frederic Weisbecker
@ 2010-03-26  1:52   ` Frederic Weisbecker
  2010-03-26  1:52 ` [PATCH 3/7] x86: Unify dumpstack.h and stacktrace.h Frederic Weisbecker
                     ` (5 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2010-03-26  1:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Ingo Molnar,
	David Miller, Archs

Trace events now only need to rewind the regs to the immediate
caller, so we no longer need to implement support for walking
further up the stack.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Archs <linux-arch@vger.kernel.org>
---
 arch/x86/include/asm/perf_event.h |    6 +++---
 arch/x86/include/asm/stacktrace.h |    5 ++---
 include/linux/perf_event.h        |   30 +++++-------------------------
 include/trace/ftrace.h            |    2 +-
 4 files changed, 11 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 4bf3d37..1df2317 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -157,9 +157,9 @@ extern void perf_events_lapic_init(void);
 
 #include <asm/stacktrace.h>
 
-#define perf_arch_fetch_caller_regs(regs, ip, skip)	\
-	(regs)->ip = ip;				\
-	(regs)->bp = rewind_frame_pointer(skip);	\
+#define perf_arch_fetch_caller_regs(regs, __ip)		\
+	(regs)->ip = __ip;				\
+	(regs)->bp = caller_frame_pointer();		\
 	(regs)->cs = __KERNEL_CS;			\
 	local_save_flags((regs)->flags);
 
diff --git a/arch/x86/include/asm/stacktrace.h b/arch/x86/include/asm/stacktrace.h
index 8fb70b7..62c9897 100644
--- a/arch/x86/include/asm/stacktrace.h
+++ b/arch/x86/include/asm/stacktrace.h
@@ -74,15 +74,14 @@ struct stack_frame {
 	unsigned long return_address;
 };
 
-static inline unsigned long rewind_frame_pointer(int n)
+static inline unsigned long caller_frame_pointer(void)
 {
 	struct stack_frame *frame;
 
 	get_bp(frame);
 
 #ifdef CONFIG_FRAME_POINTER
-	while (n--)
-		frame = frame->next_frame;
+	frame = frame->next_frame;
 #endif
 
 	return (unsigned long)frame;
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 76b680f..3b59cf7 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -869,42 +869,22 @@ perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr)
 
 #ifndef perf_arch_fetch_caller_regs
 static inline void
-perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip, int skip){ }
+perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip) { }
 #endif
 
 /*
- * Take a snapshot of the regs. Skip ip and frame pointer to
- * the nth caller. We only need a few of the regs:
+ * Take a snapshot of the regs. Rewind ip and frame pointer to
+ * the caller. We only need a few of the regs:
  * - ip for PERF_SAMPLE_IP
  * - cs for user_mode() tests
  * - bp for callchains
  * - eflags, for future purposes, just in case
  */
-static inline void perf_fetch_caller_regs(struct pt_regs *regs, int skip)
+static inline void perf_fetch_caller_regs(struct pt_regs *regs)
 {
-	unsigned long ip;
-
 	memset(regs, 0, sizeof(*regs));
 
-	switch (skip) {
-	case 1 :
-		ip = CALLER_ADDR0;
-		break;
-	case 2 :
-		ip = CALLER_ADDR1;
-		break;
-	case 3 :
-		ip = CALLER_ADDR2;
-		break;
-	case 4:
-		ip = CALLER_ADDR3;
-		break;
-	/* No need to support further for now */
-	default:
-		ip = 0;
-	}
-
-	perf_arch_fetch_caller_regs(regs, ip, skip);
+	perf_arch_fetch_caller_regs(regs, CALLER_ADDR0);
 }
 
 extern void __perf_event_mmap(struct vm_area_struct *vma);
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 882c648..82e9977 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -795,7 +795,7 @@ static notrace void perf_trace_##call(proto)				\
 	struct ftrace_event_call *event_call = &event_##call;		\
 	struct pt_regs *__regs = &get_cpu_var(perf_trace_regs);		\
 									\
-	perf_fetch_caller_regs(__regs, 1);				\
+	perf_fetch_caller_regs(__regs);					\
 									\
 	perf_trace_templ_##template(event_call, __regs, args);		\
 									\
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 6/7] perf: Use hot regs with software sched/migrate events
  2010-03-26  1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
                   ` (4 preceding siblings ...)
  2010-03-26  1:52   ` Frederic Weisbecker
@ 2010-03-26  1:52 ` Frederic Weisbecker
  2010-03-26  1:52 ` [PATCH 7/7] perf: Correctly align perf event tracing buffer Frederic Weisbecker
  2010-03-26  6:02 ` [PATCH 0/7] perf updates and fixes Paul Mackerras
  7 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2010-03-26  1:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Ingo Molnar,
	David Miller

The scheduler's task migration events don't work because they always
pass NULL regs to perf_sw_event(). The event hence gets filtered
out in perf_swevent_add().
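
For illustration, a sketch of such a call site (the exact location in
the scheduler may differ, this isn't a quote from the tree):

	/* No interrupt regs are available at this point, so NULL is
	 * passed and the event used to be dropped. */
	perf_sw_event(PERF_COUNT_SW_CPU_MIGRATIONS, 1, 1, NULL, 0);

With this patch, a NULL regs pointer makes perf_sw_event() take a hot
regs snapshot of the current context instead of dropping the event.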

The scheduler's context switch events use task_pt_regs() to get
the context where the event occurred, which is the wrong thing to
do as it doesn't give us the place in the kernel where we went
to sleep but the place where we left userspace. The result is
even more wrong if we switch from a kernel thread.

Use the hot regs snapshot for both events, as they belong to the
family of events that are not interrupt/exception based. Unlike
page faults and the like, which provide the regs matching the exact
origin of the event, here we need to save the current context.

This makes the task migration event work and fixes the context
switch callchains and origin ip.

Example: perf record -a -e cs

Before:

    10.91%      ksoftirqd/0                  0  [k] 0000000000000000
                |
                --- (nil)
                    perf_callchain
                    perf_prepare_sample
                    __perf_event_overflow
                    perf_swevent_overflow
                    perf_swevent_add
                    perf_swevent_ctx_event
                    do_perf_sw_event
                    __perf_sw_event
                    perf_event_task_sched_out
                    schedule
                    run_ksoftirqd
                    kthread
                    kernel_thread_helper

After:

    23.77%  hald-addon-stor  [kernel.kallsyms]  [k] schedule
            |
            --- schedule
               |
               |--60.00%-- schedule_timeout
               |          wait_for_common
               |          wait_for_completion
               |          blk_execute_rq
               |          scsi_execute
               |          scsi_execute_req
               |          sr_test_unit_ready
               |          |
               |          |--66.67%-- sr_media_change
               |          |          media_changed
               |          |          cdrom_media_changed
               |          |          sr_block_media_changed
               |          |          check_disk_change
               |          |          cdrom_open

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
---
 include/linux/perf_event.h |   21 ++++++++++++++-------
 kernel/perf_event.c        |    3 +--
 2 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 3b59cf7..613b419 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -860,13 +860,6 @@ extern atomic_t perf_swevent_enabled[PERF_COUNT_SW_MAX];
 
 extern void __perf_sw_event(u32, u64, int, struct pt_regs *, u64);
 
-static inline void
-perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr)
-{
-	if (atomic_read(&perf_swevent_enabled[event_id]))
-		__perf_sw_event(event_id, nr, nmi, regs, addr);
-}
-
 #ifndef perf_arch_fetch_caller_regs
 static inline void
 perf_arch_fetch_caller_regs(struct pt_regs *regs, unsigned long ip) { }
@@ -887,6 +880,20 @@ static inline void perf_fetch_caller_regs(struct pt_regs *regs)
 	perf_arch_fetch_caller_regs(regs, CALLER_ADDR0);
 }
 
+static inline void
+perf_sw_event(u32 event_id, u64 nr, int nmi, struct pt_regs *regs, u64 addr)
+{
+	if (atomic_read(&perf_swevent_enabled[event_id])) {
+		struct pt_regs caller_regs;
+
+		if (!regs) {
+			perf_fetch_caller_regs(&caller_regs);
+			regs = &caller_regs;
+		}
+		__perf_sw_event(event_id, nr, nmi, regs, addr);
+	}
+}
+
 extern void __perf_event_mmap(struct vm_area_struct *vma);
 
 static inline void perf_event_mmap(struct vm_area_struct *vma)
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index bbed6f0..8b392c3 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -1167,8 +1167,7 @@ void perf_event_task_sched_out(struct task_struct *task,
 	struct pt_regs *regs;
 	int do_switch = 1;
 
-	regs = task_pt_regs(task);
-	perf_sw_event(PERF_COUNT_SW_CONTEXT_SWITCHES, 1, 1, regs, 0);
+	perf_sw_event(PERF_COUNT_SW_CONTEXT_SWITCHES, 1, 1, NULL, 0);
 
 	if (likely(!ctx || !cpuctx->task_ctx))
 		return;
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 7/7] perf: Correctly align perf event tracing buffer
  2010-03-26  1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
                   ` (5 preceding siblings ...)
  2010-03-26  1:52 ` [PATCH 6/7] perf: Use hot regs with software sched/migrate events Frederic Weisbecker
@ 2010-03-26  1:52 ` Frederic Weisbecker
  2010-03-26  6:02 ` [PATCH 0/7] perf updates and fixes Paul Mackerras
  7 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2010-03-26  1:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Ingo Molnar,
	David Miller, Tejun Heo, Steven Rostedt

The trace event buffer used by perf to record raw sample events
is typed as an array of char and may therefore not be 8-byte
aligned by alloc_percpu().

But we need it to be 8-byte aligned on sparc64 because we cast
this buffer into an arbitrary structure type built by the
TRACE_EVENT() macro to store the traces. So if a 64-bit field is
accessed inside, it may not have the expected alignment.
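
A sketch of the problematic pattern ("blah" and its u64 field are
hypothetical, the real struct is whatever TRACE_EVENT() generated for
the sampled event):

	/* raw_data points into the per cpu perf trace buffer */
	struct ftrace_raw_blah *entry = (struct ftrace_raw_blah *)raw_data;

	entry->some_u64 = val;	/* misaligned unless raw_data is 8-byte aligned */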

Use an array of long instead to force the appropriate alignment, and
perform a compile time check to ensure the size in bytes of the buffer
is a multiple of sizeof(long), so that its actual size doesn't get
shrunk under us.

This fixes unaligned accesses reported while using perf lock
on sparc64.

Suggested-by: David Miller <davem@davemloft.net>
Suggested-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_event_perf.c |   11 +++++++++--
 1 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 8e9edcd..30314f5 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -15,7 +15,12 @@ EXPORT_PER_CPU_SYMBOL_GPL(perf_trace_regs);
 static char *perf_trace_buf;
 static char *perf_trace_buf_nmi;
 
-typedef typeof(char [PERF_MAX_TRACE_SIZE]) perf_trace_t ;
+/*
+ * Force it to be aligned to unsigned long to avoid misaligned accesses
+ * surprises
+ */
+typedef typeof(unsigned long [PERF_MAX_TRACE_SIZE / sizeof(unsigned long)])
+	perf_trace_t;
 
 /* Count the events in use (per event id, not per instance) */
 static int	total_ref_count;
@@ -128,6 +133,8 @@ __kprobes void *perf_trace_buf_prepare(int size, unsigned short type,
 	char *trace_buf, *raw_data;
 	int pc, cpu;
 
+	BUILD_BUG_ON(PERF_MAX_TRACE_SIZE % sizeof(unsigned long));
+
 	pc = preempt_count();
 
 	/* Protect the per cpu buffer, begin the rcu read side */
@@ -150,7 +157,7 @@ __kprobes void *perf_trace_buf_prepare(int size, unsigned short type,
 	raw_data = per_cpu_ptr(trace_buf, cpu);
 
 	/* zero the dead bytes from align to not leak stack to user */
-	*(u64 *)(&raw_data[size - sizeof(u64)]) = 0ULL;
+	memset(&raw_data[size - sizeof(u64)], 0, sizeof(u64));
 
 	entry = (struct trace_entry *)raw_data;
 	tracing_generic_entry_update(entry, *irq_flags, pc);
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/7] perf updates and fixes
  2010-03-26  1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
                   ` (6 preceding siblings ...)
  2010-03-26  1:52 ` [PATCH 7/7] perf: Correctly align perf event tracing buffer Frederic Weisbecker
@ 2010-03-26  6:02 ` Paul Mackerras
  2010-03-26  7:58   ` Ingo Molnar
  2010-03-26 17:45   ` Frederic Weisbecker
  7 siblings, 2 replies; 23+ messages in thread
From: Paul Mackerras @ 2010-03-26  6:02 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Peter Zijlstra, Arnaldo Carvalho de Melo,
	David Miller, Steven Rostedt, Archs, H . Peter Anvin,
	Thomas Gleixner, Stephane Eranian

On Fri, Mar 26, 2010 at 02:52:35AM +0100, Frederic Weisbecker wrote:

> The series is not yet mergeable because it would break PowerPc
> (hot regs snapshot API has been changed, and I don't know how
> to update PowerPc for that).
> 
> But if you're fine with the ideas, I can integrate the necessary
> changes to fix this, and also separate fixes and updates.

The patch below adds the necessary stuff for powerpc.  You could fold
it into your "perf: Move perf_arch_fetch_caller_regs into a macro"
patch, or keep it as a separate patch in the series (though that would
make preserving bisectability more difficult).

There is a little difficulty in that you first create a 3-argument
form of perf_arch_fetch_caller_regs and then remove one argument in a
later patch, whereas my patch below just creates the 2-argument form.
I'm sure you can resolve that one way or another.

The patch to add the old-style perf_arch_fetch_caller_regs for powerpc
is sitting in the tip/perf/urgent branch but hasn't gone anywhere.
I suppose we need to get Ingo to pull that branch into tip/perf/core
and then include a revert patch in the series that switches to the new
interface.  Alternatively, if the urgent branch isn't append-only then
we could disappear it (the urgent branch would need to be rewound by
one commit).

Paul.
----------
[PATCH] perf/powerpc: Implement perf_arch_fetch_caller_regs for powerpc

This adds the ability to get a hot register snapshot for powerpc,
enabling us to get meaningful call chains for tracepoints and context
switch events.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/include/asm/perf_event.h |   12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/powerpc/include/asm/perf_event.h b/arch/powerpc/include/asm/perf_event.h
index e6d4ce6..5c16b89 100644
--- a/arch/powerpc/include/asm/perf_event.h
+++ b/arch/powerpc/include/asm/perf_event.h
@@ -21,3 +21,15 @@
 #ifdef CONFIG_FSL_EMB_PERF_EVENT
 #include <asm/perf_event_fsl_emb.h>
 #endif
+
+#ifdef CONFIG_PERF_EVENTS
+#include <asm/ptrace.h>
+#include <asm/reg.h>
+
+#define perf_arch_fetch_caller_regs(regs, __ip)			\
+	do {							\
+		(regs)->nip = __ip;				\
+		(regs)->gpr[1] = *(unsigned long *)__get_SP();	\
+		asm volatile("mfmsr %0" : "=r" ((regs)->msr));	\
+	} while (0)
+#endif

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/7] perf updates and fixes
  2010-03-26  6:02 ` [PATCH 0/7] perf updates and fixes Paul Mackerras
@ 2010-03-26  7:58   ` Ingo Molnar
  2010-03-26 17:38     ` Frederic Weisbecker
  2010-03-26 17:45   ` Frederic Weisbecker
  1 sibling, 1 reply; 23+ messages in thread
From: Ingo Molnar @ 2010-03-26  7:58 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Frederic Weisbecker, LKML, Peter Zijlstra,
	Arnaldo Carvalho de Melo, David Miller, Steven Rostedt, Archs,
	H . Peter Anvin, Thomas Gleixner, Stephane Eranian


* Paul Mackerras <paulus@samba.org> wrote:

> On Fri, Mar 26, 2010 at 02:52:35AM +0100, Frederic Weisbecker wrote:
> 
> > The series is not yet mergeable because it would break PowerPc (hot regs 
> > snapshot API has been changed, and I don't know how to update PowerPc for 
> > that).
> > 
> > But if you're fine with the ideas, I can integrate the necessary changes 
> > to fix this, and also separate fixes and updates.
> 
> The patch below adds the necessary stuff for powerpc.  You could fold it 
> into your "perf: Move perf_arch_fetch_caller_regs into a macro" patch, or 
> keep it as a separate patch in the series (though that would make preserving 
> bisectability more difficult).

Since the series needs a resend anyway, folding it back ought to be fine I think.

I'm wondering whether this should go into tip:perf/urgent or into
tip:perf/core for 2.6.35.

It fixes sw event call trace ugliness, but is that a 2.6.34 regression? Is 
there any other aspect of the series that points towards accelerating this 
into .34?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/7] perf updates and fixes
  2010-03-26  7:58   ` Ingo Molnar
@ 2010-03-26 17:38     ` Frederic Weisbecker
  0 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2010-03-26 17:38 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul Mackerras, LKML, Peter Zijlstra, Arnaldo Carvalho de Melo,
	David Miller, Steven Rostedt, Archs, H . Peter Anvin,
	Thomas Gleixner, Stephane Eranian

On Fri, Mar 26, 2010 at 08:58:17AM +0100, Ingo Molnar wrote:
> 
> * Paul Mackerras <paulus@samba.org> wrote:
> 
> > On Fri, Mar 26, 2010 at 02:52:35AM +0100, Frederic Weisbecker wrote:
> > 
> > > The series is not yet mergeable because it would break PowerPc (hot regs 
> > > snapshot API has been changed, and I don't know how to update PowerPc for 
> > > that).
> > > 
> > > But if you're fine with the ideas, I can integrate the necessary changes 
> > > to fix this, and also separate fixes and updates.
> > 
> > The patch below adds the necessary stuff for powerpc.  You could fold it 
> > into your "perf: Move perf_arch_fetch_caller_regs into a macro" patch, or 
> > keep it as a separate patch in the series (though that would make preserving 
> > bisectability more difficult).
> 
> Since the series needs a resend anyway, folding it back in ought to be fine, I think.
> 
> I'm wondering whether this should go into tip:perf/urgent, or into
> tip:perf/core for 2.6.35.
> 
> It fixes sw event call trace ugliness, but is that a 2.6.34 regression? Is 
> there any other aspect of the series that points towards accelerating this 
> into .34?


Let's have a look:

	perf: Correctly align perf event tracing buffer

This one should probably go into urgent. The change is not invasive
at all; it doesn't fix a regression, but it's still an important fix.

	The rest

It depends. The whole bunch is rather invasive.
The callchains of context switch events never worked correctly,
I think, and I couldn't tell whether the cpu migration event has
ever worked. If it ever did, this is a regression fix, but it sits
in the middle of too many hot regs improvements. So I can cook a
specific fix to make the cpu migration event work, and keep the
rest for perf/core.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/7] perf updates and fixes
  2010-03-26  6:02 ` [PATCH 0/7] perf updates and fixes Paul Mackerras
  2010-03-26  7:58   ` Ingo Molnar
@ 2010-03-26 17:45   ` Frederic Weisbecker
  1 sibling, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2010-03-26 17:45 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Ingo Molnar, LKML, Peter Zijlstra, Arnaldo Carvalho de Melo,
	David Miller, Steven Rostedt, Archs, H . Peter Anvin,
	Thomas Gleixner, Stephane Eranian

On Fri, Mar 26, 2010 at 05:02:30PM +1100, Paul Mackerras wrote:
> On Fri, Mar 26, 2010 at 02:52:35AM +0100, Frederic Weisbecker wrote:
> 
> > The series is not yet mergeable because it would break PowerPc
> > (hot regs snapshot API has been changed, and I don't know how
> > to update PowerPc for that).
> > 
> > But if you're fine with the ideas, I can integrate the necessary
> > changes to fix this, and also separate fixes and updates.
> 
> The patch below adds the necessary stuff for powerpc.  You could fold
> it into your "perf: Move perf_arch_fetch_caller_regs into a macro"
> patch, or keep it as a separate patch in the series (though that would
> make preserving bisectability more difficult).
> 
> There is a little difficulty in that you first create a 3-argument
> form of perf_arch_fetch_caller_regs and then remove one argument in a
> later patch, whereas my patch below just creates the 2-argument form.
> I'm sure you can resolve that one way or another.
> 
> The patch to add the old-style perf_arch_fetch_caller_regs for powerpc
> is sitting in the tip/perf/urgent branch but hasn't gone anywhere.
> I suppose we need to get Ingo to pull that branch into tip/perf/core
> and then include a revert patch in the series that switches to the new
> interface.  Alternatively, if the urgent branch isn't append-only, we
> could simply drop it (the urgent branch would need to be rewound by
> one commit).



Thanks a lot, that's exactly what I needed.
I'll sort it out so as not to break things in the middle :)



> 
> Paul.
> ----------
> [PATCH] perf/powerpc: Implement perf_arch_fetch_caller_regs for powerpc
> 
> This adds the ability to get a hot register snapshot for powerpc,
> enabling us to get meaningful call chains for tracepoints and context
> switch events.
> 
> Signed-off-by: Paul Mackerras <paulus@samba.org>
> ---
>  arch/powerpc/include/asm/perf_event.h |   12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/perf_event.h b/arch/powerpc/include/asm/perf_event.h
> index e6d4ce6..5c16b89 100644
> --- a/arch/powerpc/include/asm/perf_event.h
> +++ b/arch/powerpc/include/asm/perf_event.h
> @@ -21,3 +21,15 @@
>  #ifdef CONFIG_FSL_EMB_PERF_EVENT
>  #include <asm/perf_event_fsl_emb.h>
>  #endif
> +
> +#ifdef CONFIG_PERF_EVENTS
> +#include <asm/ptrace.h>
> +#include <asm/reg.h>
> +
> +#define perf_arch_fetch_caller_regs(regs, __ip)			\
> +	do {							\
> +		(regs)->nip = __ip;				\
> +		(regs)->gpr[1] = *(unsigned long *)__get_SP();	\
> +		asm volatile("mfmsr %0" : "=r" ((regs)->msr));	\
> +	} while (0)
> +#endif


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [BUG perf] perf_fetch_caller_regs / rewind_frame_pointer can panic
  2010-03-26  1:52   ` Frederic Weisbecker
  (?)
  (?)
@ 2010-04-08  9:57   ` Eric Dumazet
  2010-04-08 10:59     ` Frederic Weisbecker
  2010-04-08 12:32       ` Frederic Weisbecker
  -1 siblings, 2 replies; 23+ messages in thread
From: Eric Dumazet @ 2010-04-08  9:57 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Paul Mackerras, David Miller, Archs

Hello

Current linux-2.6 tree panics on my dev machine

64 bit kernel, 32bit user land
CONFIG_FRAME_POINTER=y

perf timechart record &

Instant crash

Call Trace:
 perf_trace_sched_switch+0xd5/0x120
 schedule+0x6b5/0x860
 retint_careful+0xd/0x21
 
RIP ffffffff81010955 perf_arch_fetch_caller_regs+0x15/0x40
CR2: 00000000d21f1422


rewind_frame_pointer() is probably wrong.

No test performed to check frame is in current stack, or
that (!user_mode_vm(regs))


static inline unsigned long rewind_frame_pointer(int n)
{
	struct stack_frame *frame;

	get_bp(frame);

#ifdef CONFIG_FRAME_POINTER
	while (n--)
		frame = frame->next_frame;
#endif

	return (unsigned long)frame;
}
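
For illustration, the kind of guard that appears to be missing would be
to only follow the saved frame pointer while it still points into the
current kernel stack.  A rough sketch (helper name and bounds are
illustrative assumptions, not a proposed patch; struct stack_frame is
the one used above):

#include <linux/sched.h>	/* current_thread_info(), THREAD_SIZE */
#include <linux/types.h>	/* bool */

/* Sketch: true if bp points at a stack_frame fully contained in the
 * current task's kernel stack. */
static inline bool bp_on_kernel_stack(unsigned long bp)
{
	unsigned long stack = (unsigned long)current_thread_info();

	return bp >= stack + sizeof(struct thread_info) &&
	       bp <= stack + THREAD_SIZE - sizeof(struct stack_frame);
}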




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [BUG perf] perf_fetch_caller_regs / rewind_frame_pointer can panic
  2010-04-08  9:57   ` [BUG perf] perf_fetch_caller_regs / rewind_frame_pointer can panic Eric Dumazet
@ 2010-04-08 10:59     ` Frederic Weisbecker
  2010-04-08 12:32       ` Frederic Weisbecker
  1 sibling, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2010-04-08 10:59 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Ingo Molnar, LKML, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Paul Mackerras, David Miller, Archs

On Thu, Apr 08, 2010 at 11:57:20AM +0200, Eric Dumazet wrote:
> Hello
> 
> Current linux-2.6 tree panics on my dev machine
> 
> 64 bit kernel, 32bit user land
> CONFIG_FRAME_POINTER=y
> 
> perf timechart record &
> 
> Instant crash
> 
> Call Trace:
>  perf_trace_sched_switch+0xd5/0x120
>  schedule+0x6b5/0x860
>  retint_careful+0xd/0x21
>  
> RIP ffffffff81010955 perf_arch_fetch_caller_regs+0x15/0x40
> CR2: 00000000d21f1422
> 
> 
> rewind_frame_pointer() is probably wrong.
> 
> No test performed to check frame is in current stack, or
> that (!user_mode_vm(regs))



user_mode_vm() cannot work here, as we are actually filling the
regs from scratch.

But we do indeed need a safe dereference to avoid such crashes.
A simple probe_kernel_address() should do the trick.

This API is going to change in the next cycle, as it won't need to
rewind further than the first caller. So I'm going to do a rough
probe_kernel_address() fix for the current version; the next one
won't have this problem.
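
For clarity on the contract: probe_kernel_address(addr, var) copies
*addr into var through probe_kernel_read() and returns -EFAULT instead
of taking a fault, so a rewind step can simply bail out on the first
unreadable frame.  A minimal sketch (helper name is only for
illustration; the actual fix belongs in rewind_frame_pointer() itself):

#include <linux/uaccess.h>	/* probe_kernel_address() */

/* Sketch: follow one ->next_frame link, or stay put on a fault.
 * struct stack_frame is the one from the x86 dumpstack code. */
static inline struct stack_frame *rewind_one_frame(struct stack_frame *frame)
{
	struct stack_frame *next;

	if (probe_kernel_address(&frame->next_frame, next))
		return frame;	/* unreadable: keep the last good frame */
	return next;
}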



> 
> 
> static inline unsigned long rewind_frame_pointer(int n)
> {
> 	struct stack_frame *frame;
> 
> 	get_bp(frame);
> 
> #ifdef CONFIG_FRAME_POINTER
> 	while (n--)
> 		frame = frame->next_frame;
> #endif
> 
> 	return (unsigned long)frame;
> }
> 
> 
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] perf: Fix unsafe frame rewinding with hot regs fetching
  2010-04-08  9:57   ` [BUG perf] perf_fetch_caller_regs / rewind_frame_pointer can panic Eric Dumazet
@ 2010-04-08 12:32       ` Frederic Weisbecker
  2010-04-08 12:32       ` Frederic Weisbecker
  1 sibling, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2010-04-08 12:32 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Ingo Molnar, LKML, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Paul Mackerras, David Miller, Archs

On Thu, Apr 08, 2010 at 11:57:20AM +0200, Eric Dumazet wrote:
> Hello
> 
> Current linux-2.6 tree panics on my dev machine
> 
> 64 bit kernel, 32bit user land
> CONFIG_FRAME_POINTER=y
> 
> perf timechart record &
> 
> Instant crash
> 
> Call Trace:
>  perf_trace_sched_switch+0xd5/0x120
>  schedule+0x6b5/0x860
>  retint_careful+0xd/0x21
>  
> RIP ffffffff81010955 perf_arch_fetch_caller_regs+0x15/0x40
> CR2: 00000000d21f1422
> 
> 
> rewind_frame_pointer() is probably wrong.
> 
> No test performed to check frame is in current stack, or
> that (!user_mode_vm(regs))
> 
> 
> static inline unsigned long rewind_frame_pointer(int n)
> {
> 	struct stack_frame *frame;
> 
> 	get_bp(frame);
> 
> #ifdef CONFIG_FRAME_POINTER
> 	while (n--)
> 		frame = frame->next_frame;
> #endif
> 
> 	return (unsigned long)frame;
> }
> 
> 
> 


Can you please test this fix?

Thanks.

---
From 60d5c4e8498efc4a01abceef54ad3bc91993bf41 Mon Sep 17 00:00:00 2001
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Thu, 8 Apr 2010 14:05:50 +0200
Subject: [PATCH] perf: Fix unsafe frame rewinding with hot regs fetching

When we fetch the hot regs and rewind to the nth caller, it
might happen that we dereference a frame pointer outside the
kernel stack boundaries, like in this example:

	perf_trace_sched_switch+0xd5/0x120
        schedule+0x6b5/0x860
        retint_careful+0xd/0x21

Since we directly dereference a userspace frame pointer here while
rewinding behind retint_careful, this may end up in a crash.

Fix this by simply using probe_kernel_address() when we rewind the
frame pointer.

This issue will have a much more proper fix in the next version of the
perf_arch_fetch_caller_regs() API that will only need to rewind to the
first caller.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: David Miller <davem@davemloft.net>
Cc: Archs <linux-arch@vger.kernel.org>
---
 arch/x86/kernel/dumpstack.h |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/dumpstack.h b/arch/x86/kernel/dumpstack.h
index e39e771..e1a93be 100644
--- a/arch/x86/kernel/dumpstack.h
+++ b/arch/x86/kernel/dumpstack.h
@@ -14,6 +14,8 @@
 #define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :)
 #endif
 
+#include <linux/uaccess.h>
+
 extern void
 show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
 		unsigned long *stack, unsigned long bp, char *log_lvl);
@@ -42,8 +44,10 @@ static inline unsigned long rewind_frame_pointer(int n)
 	get_bp(frame);
 
 #ifdef CONFIG_FRAME_POINTER
-	while (n--)
-		frame = frame->next_frame;
+	while (n--) {
+		if (probe_kernel_address(&frame->next_frame, frame))
+			break;
+	}
 #endif
 
 	return (unsigned long)frame;
-- 
1.6.2.3




^ permalink raw reply related	[flat|nested] 23+ messages in thread


* Re: [PATCH] perf: Fix unsafe frame rewinding with hot regs fetching
  2010-04-08 12:32       ` Frederic Weisbecker
  (?)
@ 2010-04-08 13:52       ` Eric Dumazet
  2010-04-08 17:31         ` [GIT PULL] perf fix Frederic Weisbecker
  -1 siblings, 1 reply; 23+ messages in thread
From: Eric Dumazet @ 2010-04-08 13:52 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Paul Mackerras, David Miller, Archs

On Thursday 08 April 2010 at 14:32 +0200, Frederic Weisbecker wrote:


> 
> Can you please test this fix?
> 
> Thanks.
> 
> ---
> From 60d5c4e8498efc4a01abceef54ad3bc91993bf41 Mon Sep 17 00:00:00 2001
> From: Frederic Weisbecker <fweisbec@gmail.com>
> Date: Thu, 8 Apr 2010 14:05:50 +0200
> Subject: [PATCH] perf: Fix unsafe frame rewinding with hot regs fetching
> 
> When we fetch the hot regs and rewind to the nth caller, it
> might happen that we dereference a frame pointer outside the
> kernel stack boundaries, like in this example:
> 
> 	perf_trace_sched_switch+0xd5/0x120
>         schedule+0x6b5/0x860
>         retint_careful+0xd/0x21
> 
> Since we directly dereference a userspace frame pointer here while
> rewinding behind retint_careful, this may end up in a crash.
> 
> Fix this by simply using probe_kernel_address() when we rewind the
> frame pointer.
> 
> This issue will have a much more proper fix in the next version of the
> perf_arch_fetch_caller_regs() API that will only need to rewind to the
> first caller.
> 
> Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: David Miller <davem@davemloft.net>
> Cc: Archs <linux-arch@vger.kernel.org>
> ---
>  arch/x86/kernel/dumpstack.h |    8 ++++++--
>  1 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/dumpstack.h b/arch/x86/kernel/dumpstack.h
> index e39e771..e1a93be 100644
> --- a/arch/x86/kernel/dumpstack.h
> +++ b/arch/x86/kernel/dumpstack.h
> @@ -14,6 +14,8 @@
>  #define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :)
>  #endif
>  
> +#include <linux/uaccess.h>
> +
>  extern void
>  show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
>  		unsigned long *stack, unsigned long bp, char *log_lvl);
> @@ -42,8 +44,10 @@ static inline unsigned long rewind_frame_pointer(int n)
>  	get_bp(frame);
>  
>  #ifdef CONFIG_FRAME_POINTER
> -	while (n--)
> -		frame = frame->next_frame;
> +	while (n--) {
> +		if (probe_kernel_address(&frame->next_frame, frame))
> +			break;
> +	}
>  #endif
>  
>  	return (unsigned long)frame;

Thanks, no more crash :)

Tested-by: Eric Dumazet <eric.dumazet@gmail.com>



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [GIT PULL] perf fix
  2010-04-08 13:52       ` Eric Dumazet
@ 2010-04-08 17:31         ` Frederic Weisbecker
  2010-04-13 22:51           ` Ingo Molnar
  0 siblings, 1 reply; 23+ messages in thread
From: Frederic Weisbecker @ 2010-04-08 17:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Eric Dumazet, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, David Miller, Archs

Ingo,

Please pull the perf/urgent branch that can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
	perf/urgent

Thanks,
	Frederic
---

Frederic Weisbecker (1):
      perf: Fix unsafe frame rewinding with hot regs fetching


 arch/x86/kernel/dumpstack.h |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

---
commit ab285f2b5290d92b7ec1a6f9aad54308dadf6157
Author: Frederic Weisbecker <fweisbec@gmail.com>
Date:   Thu Apr 8 14:05:50 2010 +0200

    perf: Fix unsafe frame rewinding with hot regs fetching
    
    When we fetch the hot regs and rewind to the nth caller, it
    might happen that we dereference a frame pointer outside the
    kernel stack boundaries, like in this example:
    
    	perf_trace_sched_switch+0xd5/0x120
            schedule+0x6b5/0x860
            retint_careful+0xd/0x21
    
    Since we directly dereference a userspace frame pointer here while
    rewinding behind retint_careful, this may end up in a crash.
    
    Fix this by simply using probe_kernel_address() when we rewind the
    frame pointer.
    
    This issue will have a much more proper fix in the next version of the
    perf_arch_fetch_caller_regs() API that will only need to rewind to the
    first caller.
    
    Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: David Miller <davem@davemloft.net>
    Cc: Archs <linux-arch@vger.kernel.org>

diff --git a/arch/x86/kernel/dumpstack.h b/arch/x86/kernel/dumpstack.h
index e39e771..e1a93be 100644
--- a/arch/x86/kernel/dumpstack.h
+++ b/arch/x86/kernel/dumpstack.h
@@ -14,6 +14,8 @@
 #define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :)
 #endif
 
+#include <linux/uaccess.h>
+
 extern void
 show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
 		unsigned long *stack, unsigned long bp, char *log_lvl);
@@ -42,8 +44,10 @@ static inline unsigned long rewind_frame_pointer(int n)
 	get_bp(frame);
 
 #ifdef CONFIG_FRAME_POINTER
-	while (n--)
-		frame = frame->next_frame;
+	while (n--) {
+		if (probe_kernel_address(&frame->next_frame, frame))
+			break;
+	}
 #endif
 
 	return (unsigned long)frame;

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [GIT PULL] perf fix
  2010-04-08 17:31         ` [GIT PULL] perf fix Frederic Weisbecker
@ 2010-04-13 22:51           ` Ingo Molnar
  0 siblings, 0 replies; 23+ messages in thread
From: Ingo Molnar @ 2010-04-13 22:51 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Eric Dumazet, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Paul Mackerras, David Miller, Archs


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> Ingo,
> 
> Please pull the perf/urgent branch that can be found at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
> 	perf/urgent
> 
> Thanks,
> 	Frederic
> ---
> 
> Frederic Weisbecker (1):
>       perf: Fix unsafe frame rewinding with hot regs fetching
> 
> 
>  arch/x86/kernel/dumpstack.h |    8 ++++++--
>  1 files changed, 6 insertions(+), 2 deletions(-)

Pulled, thanks a lot Frederic!

	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2010-04-13 22:51 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-26  1:52 [PATCH 0/7] perf updates and fixes Frederic Weisbecker
2010-03-26  1:52 ` [PATCH 1/7] perf: Drop the frame reliablity check Frederic Weisbecker
2010-03-26  1:52 ` [PATCH 2/7] perf: Fetch hot regs from the template caller Frederic Weisbecker
2010-03-26  1:52 ` [PATCH 3/7] x86: Unify dumpstack.h and stacktrace.h Frederic Weisbecker
2010-03-26  1:52 ` [PATCH 4/7] perf: Move perf_arch_fetch_caller_regs into a macro Frederic Weisbecker
2010-03-26  1:52   ` Frederic Weisbecker
2010-03-26  1:52   ` Frederic Weisbecker
2010-03-26  1:52 ` [PATCH 5/7] perf: Make perf_fetch_caller_regs rewind to the first caller only Frederic Weisbecker
2010-03-26  1:52   ` Frederic Weisbecker
2010-03-26  1:52   ` Frederic Weisbecker
2010-04-08  9:57   ` [BUG perf] perf_fetch_caller_regs / rewind_frame_pointer can panic Eric Dumazet
2010-04-08 10:59     ` Frederic Weisbecker
2010-04-08 12:32     ` [PATCH] perf: Fix unsafe frame rewinding with hot regs fetching Frederic Weisbecker
2010-04-08 12:32       ` Frederic Weisbecker
2010-04-08 13:52       ` Eric Dumazet
2010-04-08 17:31         ` [GIT PULL] perf fix Frederic Weisbecker
2010-04-13 22:51           ` Ingo Molnar
2010-03-26  1:52 ` [PATCH 6/7] perf: Use hot regs with software sched/migrate events Frederic Weisbecker
2010-03-26  1:52 ` [PATCH 7/7] perf: Correctly align perf event tracing buffer Frederic Weisbecker
2010-03-26  6:02 ` [PATCH 0/7] perf updates and fixes Paul Mackerras
2010-03-26  7:58   ` Ingo Molnar
2010-03-26 17:38     ` Frederic Weisbecker
2010-03-26 17:45   ` Frederic Weisbecker
