linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 0/6] perf: cleanup and fixes
@ 2010-07-01 15:35 Frederic Weisbecker
  2010-07-01 15:35 ` [RFC PATCH 1/6] perf: Drop unappropriate tests on arch callchains Frederic Weisbecker
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-07-01 15:35 UTC (permalink / raw)
  To: LKML; +Cc: LKML, Frederic Weisbecker

Cleanups and fixes for perf core, although the last one could
go to perf/urgent.

It has only been tested on x86.

Frederic Weisbecker (6):
  perf: Drop unappropriate tests on arch callchains
  perf: Generalize callchain_store()
  perf: Generalize some arch callchain code
  perf: Factorize callchain context handling
  perf: Fix race in callchains
  perf: Fix double put_ctx

 arch/arm/kernel/perf_event.c         |   62 +--------
 arch/powerpc/kernel/perf_callchain.c |   86 ++++---------
 arch/sh/kernel/perf_callchain.c      |   50 +-------
 arch/sparc/kernel/perf_event.c       |   69 ++++-------
 arch/x86/kernel/cpu/perf_event.c     |   82 +++----------
 include/linux/perf_event.h           |   16 +++-
 kernel/perf_event.c                  |  228 +++++++++++++++++++++++++++++-----
 7 files changed, 289 insertions(+), 304 deletions(-)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC PATCH 1/6] perf: Drop unappropriate tests on arch callchains
  2010-07-01 15:35 [RFC PATCH 0/6] perf: cleanup and fixes Frederic Weisbecker
@ 2010-07-01 15:35 ` Frederic Weisbecker
  2010-07-01 15:35 ` [RFC PATCH 2/6] perf: Generalize callchain_store() Frederic Weisbecker
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-07-01 15:35 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Ingo Molnar, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Stephane Eranian,
	David Miller, Paul Mundt, Will Deacon, Borislav Petkov

Drop the TASK_RUNNING test on user tasks for callchains as
this check doesn't seem to make any sense.

Also remove the test for !current, which is not supposed to
happen there, and the test on current->pid: filtering out idle
tasks should be handled at the generic level, using the
exclude_idle attribute.
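
For illustration, the kind of generic-level filter implied here could
look something like this (hypothetical sketch only, not the actual
perf core code):

    static int exclude_idle_sample(struct perf_event *event)
    {
            /* idle tasks have pid 0, which is what the arch code tested */
            return event->attr.exclude_idle && !current->pid;
    }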

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Borislav Petkov <bp@amd64.org>
---
 arch/arm/kernel/perf_event.c     |    6 ------
 arch/sh/kernel/perf_callchain.c  |    3 ---
 arch/x86/kernel/cpu/perf_event.c |    3 ---
 3 files changed, 0 insertions(+), 12 deletions(-)

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 5b7cfaf..16519f6 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -3107,12 +3107,6 @@ perf_do_callchain(struct pt_regs *regs,
 
 	is_user = user_mode(regs);
 
-	if (!current || !current->pid)
-		return;
-
-	if (is_user && current->state != TASK_RUNNING)
-		return;
-
 	if (!is_user)
 		perf_callchain_kernel(regs, entry);
 
diff --git a/arch/sh/kernel/perf_callchain.c b/arch/sh/kernel/perf_callchain.c
index a9dd3ab..1d6dbce 100644
--- a/arch/sh/kernel/perf_callchain.c
+++ b/arch/sh/kernel/perf_callchain.c
@@ -68,9 +68,6 @@ perf_do_callchain(struct pt_regs *regs, struct perf_callchain_entry *entry)
 
 	is_user = user_mode(regs);
 
-	if (is_user && current->state != TASK_RUNNING)
-		return;
-
 	/*
 	 * Only the kernel side is implemented for now.
 	 */
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index f2da20f..4a4d191 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1703,9 +1703,6 @@ perf_do_callchain(struct pt_regs *regs, struct perf_callchain_entry *entry)
 
 	is_user = user_mode(regs);
 
-	if (is_user && current->state != TASK_RUNNING)
-		return;
-
 	if (!is_user)
 		perf_callchain_kernel(regs, entry);
 
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH 2/6] perf: Generalize callchain_store()
  2010-07-01 15:35 [RFC PATCH 0/6] perf: cleanup and fixes Frederic Weisbecker
  2010-07-01 15:35 ` [RFC PATCH 1/6] perf: Drop unappropriate tests on arch callchains Frederic Weisbecker
@ 2010-07-01 15:35 ` Frederic Weisbecker
  2010-07-01 15:35 ` [RFC PATCH 3/6] perf: Generalize some arch callchain code Frederic Weisbecker
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-07-01 15:35 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Ingo Molnar, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Stephane Eranian,
	David Miller, Paul Mundt, Will Deacon, Borislav Petkov

callchain_store() is the same on every arch; inline it in
perf_event.h and rename it to perf_callchain_store() to avoid
any collision.

This removes repetitive code.
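
For reference, the arch backends then just call the helper directly,
e.g. (sketch taken from the x86 hunks below):

    perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
    perf_callchain_store(entry, regs->ip);

The helper silently stops storing once PERF_MAX_STACK_DEPTH entries
have been recorded.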

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Borislav Petkov <bp@amd64.org>
---
 arch/arm/kernel/perf_event.c         |   15 +++---------
 arch/powerpc/kernel/perf_callchain.c |   40 ++++++++++++----------------------
 arch/sh/kernel/perf_callchain.c      |   11 ++------
 arch/sparc/kernel/perf_event.c       |   26 ++++++++-------------
 arch/x86/kernel/cpu/perf_event.c     |   20 ++++++-----------
 include/linux/perf_event.h           |    7 ++++++
 6 files changed, 45 insertions(+), 74 deletions(-)

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 16519f6..41de647 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -3001,13 +3001,6 @@ arch_initcall(init_hw_perf_events);
 /*
  * Callchain handling code.
  */
-static inline void
-callchain_store(struct perf_callchain_entry *entry,
-		u64 ip)
-{
-	if (entry->nr < PERF_MAX_STACK_DEPTH)
-		entry->ip[entry->nr++] = ip;
-}
 
 /*
  * The registers we're interested in are at the end of the variable
@@ -3039,7 +3032,7 @@ user_backtrace(struct frame_tail *tail,
 	if (__copy_from_user_inatomic(&buftail, tail, sizeof(buftail)))
 		return NULL;
 
-	callchain_store(entry, buftail.lr);
+	perf_callchain_store(entry, buftail.lr);
 
 	/*
 	 * Frame pointers should strictly progress back up the stack
@@ -3057,7 +3050,7 @@ perf_callchain_user(struct pt_regs *regs,
 {
 	struct frame_tail *tail;
 
-	callchain_store(entry, PERF_CONTEXT_USER);
+	perf_callchain_store(entry, PERF_CONTEXT_USER);
 
 	if (!user_mode(regs))
 		regs = task_pt_regs(current);
@@ -3078,7 +3071,7 @@ callchain_trace(struct stackframe *fr,
 		void *data)
 {
 	struct perf_callchain_entry *entry = data;
-	callchain_store(entry, fr->pc);
+	perf_callchain_store(entry, fr->pc);
 	return 0;
 }
 
@@ -3088,7 +3081,7 @@ perf_callchain_kernel(struct pt_regs *regs,
 {
 	struct stackframe fr;
 
-	callchain_store(entry, PERF_CONTEXT_KERNEL);
+	perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
 	fr.fp = regs->ARM_fp;
 	fr.sp = regs->ARM_sp;
 	fr.lr = regs->ARM_lr;
diff --git a/arch/powerpc/kernel/perf_callchain.c b/arch/powerpc/kernel/perf_callchain.c
index 95ad9da..a286c2e 100644
--- a/arch/powerpc/kernel/perf_callchain.c
+++ b/arch/powerpc/kernel/perf_callchain.c
@@ -23,18 +23,6 @@
 #include "ppc32.h"
 #endif
 
-/*
- * Store another value in a callchain_entry.
- */
-static inline void callchain_store(struct perf_callchain_entry *entry, u64 ip)
-{
-	unsigned int nr = entry->nr;
-
-	if (nr < PERF_MAX_STACK_DEPTH) {
-		entry->ip[nr] = ip;
-		entry->nr = nr + 1;
-	}
-}
 
 /*
  * Is sp valid as the address of the next kernel stack frame after prev_sp?
@@ -69,8 +57,8 @@ static void perf_callchain_kernel(struct pt_regs *regs,
 
 	lr = regs->link;
 	sp = regs->gpr[1];
-	callchain_store(entry, PERF_CONTEXT_KERNEL);
-	callchain_store(entry, regs->nip);
+	perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
+	perf_callchain_store(entry, regs->nip);
 
 	if (!validate_sp(sp, current, STACK_FRAME_OVERHEAD))
 		return;
@@ -89,7 +77,7 @@ static void perf_callchain_kernel(struct pt_regs *regs,
 			next_ip = regs->nip;
 			lr = regs->link;
 			level = 0;
-			callchain_store(entry, PERF_CONTEXT_KERNEL);
+			perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
 
 		} else {
 			if (level == 0)
@@ -111,7 +99,7 @@ static void perf_callchain_kernel(struct pt_regs *regs,
 			++level;
 		}
 
-		callchain_store(entry, next_ip);
+		perf_callchain_store(entry, next_ip);
 		if (!valid_next_sp(next_sp, sp))
 			return;
 		sp = next_sp;
@@ -246,8 +234,8 @@ static void perf_callchain_user_64(struct pt_regs *regs,
 	next_ip = regs->nip;
 	lr = regs->link;
 	sp = regs->gpr[1];
-	callchain_store(entry, PERF_CONTEXT_USER);
-	callchain_store(entry, next_ip);
+	perf_callchain_store(entry, PERF_CONTEXT_USER);
+	perf_callchain_store(entry, next_ip);
 
 	for (;;) {
 		fp = (unsigned long __user *) sp;
@@ -276,14 +264,14 @@ static void perf_callchain_user_64(struct pt_regs *regs,
 			    read_user_stack_64(&uregs[PT_R1], &sp))
 				return;
 			level = 0;
-			callchain_store(entry, PERF_CONTEXT_USER);
-			callchain_store(entry, next_ip);
+			perf_callchain_store(entry, PERF_CONTEXT_USER);
+			perf_callchain_store(entry, next_ip);
 			continue;
 		}
 
 		if (level == 0)
 			next_ip = lr;
-		callchain_store(entry, next_ip);
+		perf_callchain_store(entry, next_ip);
 		++level;
 		sp = next_sp;
 	}
@@ -447,8 +435,8 @@ static void perf_callchain_user_32(struct pt_regs *regs,
 	next_ip = regs->nip;
 	lr = regs->link;
 	sp = regs->gpr[1];
-	callchain_store(entry, PERF_CONTEXT_USER);
-	callchain_store(entry, next_ip);
+	perf_callchain_store(entry, PERF_CONTEXT_USER);
+	perf_callchain_store(entry, next_ip);
 
 	while (entry->nr < PERF_MAX_STACK_DEPTH) {
 		fp = (unsigned int __user *) (unsigned long) sp;
@@ -470,14 +458,14 @@ static void perf_callchain_user_32(struct pt_regs *regs,
 			    read_user_stack_32(&uregs[PT_R1], &sp))
 				return;
 			level = 0;
-			callchain_store(entry, PERF_CONTEXT_USER);
-			callchain_store(entry, next_ip);
+			perf_callchain_store(entry, PERF_CONTEXT_USER);
+			perf_callchain_store(entry, next_ip);
 			continue;
 		}
 
 		if (level == 0)
 			next_ip = lr;
-		callchain_store(entry, next_ip);
+		perf_callchain_store(entry, next_ip);
 		++level;
 		sp = next_sp;
 	}
diff --git a/arch/sh/kernel/perf_callchain.c b/arch/sh/kernel/perf_callchain.c
index 1d6dbce..00143f3 100644
--- a/arch/sh/kernel/perf_callchain.c
+++ b/arch/sh/kernel/perf_callchain.c
@@ -14,11 +14,6 @@
 #include <asm/unwinder.h>
 #include <asm/ptrace.h>
 
-static inline void callchain_store(struct perf_callchain_entry *entry, u64 ip)
-{
-	if (entry->nr < PERF_MAX_STACK_DEPTH)
-		entry->ip[entry->nr++] = ip;
-}
 
 static void callchain_warning(void *data, char *msg)
 {
@@ -39,7 +34,7 @@ static void callchain_address(void *data, unsigned long addr, int reliable)
 	struct perf_callchain_entry *entry = data;
 
 	if (reliable)
-		callchain_store(entry, addr);
+		perf_callchain_store(entry, addr);
 }
 
 static const struct stacktrace_ops callchain_ops = {
@@ -52,8 +47,8 @@ static const struct stacktrace_ops callchain_ops = {
 static void
 perf_callchain_kernel(struct pt_regs *regs, struct perf_callchain_entry *entry)
 {
-	callchain_store(entry, PERF_CONTEXT_KERNEL);
-	callchain_store(entry, regs->pc);
+	perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
+	perf_callchain_store(entry, regs->pc);
 
 	unwind_stack(NULL, regs, NULL, &callchain_ops, entry);
 }
diff --git a/arch/sparc/kernel/perf_event.c b/arch/sparc/kernel/perf_event.c
index 8a6660d..fbac5ad 100644
--- a/arch/sparc/kernel/perf_event.c
+++ b/arch/sparc/kernel/perf_event.c
@@ -1282,12 +1282,6 @@ void __init init_hw_perf_events(void)
 	register_die_notifier(&perf_event_nmi_notifier);
 }
 
-static inline void callchain_store(struct perf_callchain_entry *entry, u64 ip)
-{
-	if (entry->nr < PERF_MAX_STACK_DEPTH)
-		entry->ip[entry->nr++] = ip;
-}
-
 static void perf_callchain_kernel(struct pt_regs *regs,
 				  struct perf_callchain_entry *entry)
 {
@@ -1296,8 +1290,8 @@ static void perf_callchain_kernel(struct pt_regs *regs,
 	int graph = 0;
 #endif
 
-	callchain_store(entry, PERF_CONTEXT_KERNEL);
-	callchain_store(entry, regs->tpc);
+	perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
+	perf_callchain_store(entry, regs->tpc);
 
 	ksp = regs->u_regs[UREG_I6];
 	fp = ksp + STACK_BIAS;
@@ -1321,13 +1315,13 @@ static void perf_callchain_kernel(struct pt_regs *regs,
 			pc = sf->callers_pc;
 			fp = (unsigned long)sf->fp + STACK_BIAS;
 		}
-		callchain_store(entry, pc);
+		perf_callchain_store(entry, pc);
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 		if ((pc + 8UL) == (unsigned long) &return_to_handler) {
 			int index = current->curr_ret_stack;
 			if (current->ret_stack && index >= graph) {
 				pc = current->ret_stack[index - graph].ret;
-				callchain_store(entry, pc);
+				perf_callchain_store(entry, pc);
 				graph++;
 			}
 		}
@@ -1340,8 +1334,8 @@ static void perf_callchain_user_64(struct pt_regs *regs,
 {
 	unsigned long ufp;
 
-	callchain_store(entry, PERF_CONTEXT_USER);
-	callchain_store(entry, regs->tpc);
+	perf_callchain_store(entry, PERF_CONTEXT_USER);
+	perf_callchain_store(entry, regs->tpc);
 
 	ufp = regs->u_regs[UREG_I6] + STACK_BIAS;
 	do {
@@ -1354,7 +1348,7 @@ static void perf_callchain_user_64(struct pt_regs *regs,
 
 		pc = sf.callers_pc;
 		ufp = (unsigned long)sf.fp + STACK_BIAS;
-		callchain_store(entry, pc);
+		perf_callchain_store(entry, pc);
 	} while (entry->nr < PERF_MAX_STACK_DEPTH);
 }
 
@@ -1363,8 +1357,8 @@ static void perf_callchain_user_32(struct pt_regs *regs,
 {
 	unsigned long ufp;
 
-	callchain_store(entry, PERF_CONTEXT_USER);
-	callchain_store(entry, regs->tpc);
+	perf_callchain_store(entry, PERF_CONTEXT_USER);
+	perf_callchain_store(entry, regs->tpc);
 
 	ufp = regs->u_regs[UREG_I6] & 0xffffffffUL;
 	do {
@@ -1377,7 +1371,7 @@ static void perf_callchain_user_32(struct pt_regs *regs,
 
 		pc = sf.callers_pc;
 		ufp = (unsigned long)sf.fp;
-		callchain_store(entry, pc);
+		perf_callchain_store(entry, pc);
 	} while (entry->nr < PERF_MAX_STACK_DEPTH);
 }
 
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 4a4d191..8af28ca 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1571,12 +1571,6 @@ const struct pmu *hw_perf_event_init(struct perf_event *event)
  * callchain support
  */
 
-static inline
-void callchain_store(struct perf_callchain_entry *entry, u64 ip)
-{
-	if (entry->nr < PERF_MAX_STACK_DEPTH)
-		entry->ip[entry->nr++] = ip;
-}
 
 static DEFINE_PER_CPU(struct perf_callchain_entry, pmc_irq_entry);
 static DEFINE_PER_CPU(struct perf_callchain_entry, pmc_nmi_entry);
@@ -1602,7 +1596,7 @@ static void backtrace_address(void *data, unsigned long addr, int reliable)
 {
 	struct perf_callchain_entry *entry = data;
 
-	callchain_store(entry, addr);
+	perf_callchain_store(entry, addr);
 }
 
 static const struct stacktrace_ops backtrace_ops = {
@@ -1616,8 +1610,8 @@ static const struct stacktrace_ops backtrace_ops = {
 static void
 perf_callchain_kernel(struct pt_regs *regs, struct perf_callchain_entry *entry)
 {
-	callchain_store(entry, PERF_CONTEXT_KERNEL);
-	callchain_store(entry, regs->ip);
+	perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
+	perf_callchain_store(entry, regs->ip);
 
 	dump_trace(NULL, regs, NULL, regs->bp, &backtrace_ops, entry);
 }
@@ -1646,7 +1640,7 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
 		if (fp < compat_ptr(regs->sp))
 			break;
 
-		callchain_store(entry, frame.return_address);
+		perf_callchain_store(entry, frame.return_address);
 		fp = compat_ptr(frame.next_frame);
 	}
 	return 1;
@@ -1670,8 +1664,8 @@ perf_callchain_user(struct pt_regs *regs, struct perf_callchain_entry *entry)
 
 	fp = (void __user *)regs->bp;
 
-	callchain_store(entry, PERF_CONTEXT_USER);
-	callchain_store(entry, regs->ip);
+	perf_callchain_store(entry, PERF_CONTEXT_USER);
+	perf_callchain_store(entry, regs->ip);
 
 	if (perf_callchain_user32(regs, entry))
 		return;
@@ -1688,7 +1682,7 @@ perf_callchain_user(struct pt_regs *regs, struct perf_callchain_entry *entry)
 		if ((unsigned long)fp < regs->sp)
 			break;
 
-		callchain_store(entry, frame.return_address);
+		perf_callchain_store(entry, frame.return_address);
 		fp = frame.next_frame;
 	}
 }
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 937495c..3588804 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -978,6 +978,13 @@ extern void perf_event_fork(struct task_struct *tsk);
 
 extern struct perf_callchain_entry *perf_callchain(struct pt_regs *regs);
 
+static inline void
+perf_callchain_store(struct perf_callchain_entry *entry, u64 ip)
+{
+	if (entry->nr < PERF_MAX_STACK_DEPTH)
+		entry->ip[entry->nr++] = ip;
+}
+
 extern int sysctl_perf_event_paranoid;
 extern int sysctl_perf_event_mlock;
 extern int sysctl_perf_event_sample_rate;
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH 3/6] perf: Generalize some arch callchain code
  2010-07-01 15:35 [RFC PATCH 0/6] perf: cleanup and fixes Frederic Weisbecker
  2010-07-01 15:35 ` [RFC PATCH 1/6] perf: Drop unappropriate tests on arch callchains Frederic Weisbecker
  2010-07-01 15:35 ` [RFC PATCH 2/6] perf: Generalize callchain_store() Frederic Weisbecker
@ 2010-07-01 15:35 ` Frederic Weisbecker
  2010-07-01 15:46   ` Peter Zijlstra
  2010-07-01 15:36 ` [RFC PATCH 4/6] perf: Factorize callchain context handling Frederic Weisbecker
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Frederic Weisbecker @ 2010-07-01 15:35 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Ingo Molnar, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Stephane Eranian,
	David Miller, Paul Mundt, Will Deacon, Borislav Petkov

- Most archs use one callchain buffer per cpu, except x86 that needs
  to deal with NMIs. Provide a default perf_callchain_buffer()
  implementation that x86 overrides.

- Centralize all the kernel/user regs handling and invoke the new arch
  handlers from there: perf_callchain_user() / perf_callchain_kernel().
  This avoids duplicating the user_mode() and current->mm checks in
  every arch.

- Invert some parameters in the perf_callchain_*() helpers: entry to the
  left, regs to the right, following the traditional (dst, src) order.

I'm not sure what stack_trace_flush() and flushw_user() do on sparc;
I've just moved them inside the callchain handlers, and I'm not sure
this is the right thing to do.
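
To illustrate the new arch-side contract, a backend now only provides
the two walkers with the inverted signature (rough sketch, not a real
backend):

    void perf_callchain_kernel(struct perf_callchain_entry *entry,
                               struct pt_regs *regs)
    {
            perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
            perf_callchain_store(entry, instruction_pointer(regs));
            /* walk the kernel stack, storing each return address */
    }

    void perf_callchain_user(struct perf_callchain_entry *entry,
                             struct pt_regs *regs)
    {
            perf_callchain_store(entry, PERF_CONTEXT_USER);
            perf_callchain_store(entry, instruction_pointer(regs));
            /* walk the user stack with __copy_from_user_inatomic() */
    }

The generic perf_callchain() takes care of the user_mode() and
current->mm checks and of picking the buffer, and x86 overrides
perf_callchain_buffer() to use a separate entry in NMI context.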

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Borislav Petkov <bp@amd64.org>
---
 arch/arm/kernel/perf_event.c         |   43 +++---------------------------
 arch/powerpc/kernel/perf_callchain.c |   49 +++++++++------------------------
 arch/sh/kernel/perf_callchain.c      |   37 +------------------------
 arch/sparc/kernel/perf_event.c       |   46 ++++++++++---------------------
 arch/x86/kernel/cpu/perf_event.c     |   45 +++++-------------------------
 include/linux/perf_event.h           |   10 ++++++-
 kernel/perf_event.c                  |   40 ++++++++++++++++++++++++++-
 7 files changed, 90 insertions(+), 180 deletions(-)

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 41de647..ddb5134 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -3044,17 +3044,13 @@ user_backtrace(struct frame_tail *tail,
 	return buftail.fp - 1;
 }
 
-static void
-perf_callchain_user(struct pt_regs *regs,
-		    struct perf_callchain_entry *entry)
+void
+perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
 {
 	struct frame_tail *tail;
 
 	perf_callchain_store(entry, PERF_CONTEXT_USER);
 
-	if (!user_mode(regs))
-		regs = task_pt_regs(current);
-
 	tail = (struct frame_tail *)regs->ARM_fp - 1;
 
 	while (tail && !((unsigned long)tail & 0x3))
@@ -3075,9 +3071,8 @@ callchain_trace(struct stackframe *fr,
 	return 0;
 }
 
-static void
-perf_callchain_kernel(struct pt_regs *regs,
-		      struct perf_callchain_entry *entry)
+void
+perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
 {
 	struct stackframe fr;
 
@@ -3088,33 +3083,3 @@ perf_callchain_kernel(struct pt_regs *regs,
 	fr.pc = regs->ARM_pc;
 	walk_stackframe(&fr, callchain_trace, entry);
 }
-
-static void
-perf_do_callchain(struct pt_regs *regs,
-		  struct perf_callchain_entry *entry)
-{
-	int is_user;
-
-	if (!regs)
-		return;
-
-	is_user = user_mode(regs);
-
-	if (!is_user)
-		perf_callchain_kernel(regs, entry);
-
-	if (current->mm)
-		perf_callchain_user(regs, entry);
-}
-
-static DEFINE_PER_CPU(struct perf_callchain_entry, pmc_irq_entry);
-
-struct perf_callchain_entry *
-perf_callchain(struct pt_regs *regs)
-{
-	struct perf_callchain_entry *entry = &__get_cpu_var(pmc_irq_entry);
-
-	entry->nr = 0;
-	perf_do_callchain(regs, entry);
-	return entry;
-}
diff --git a/arch/powerpc/kernel/perf_callchain.c b/arch/powerpc/kernel/perf_callchain.c
index a286c2e..f7a85ed 100644
--- a/arch/powerpc/kernel/perf_callchain.c
+++ b/arch/powerpc/kernel/perf_callchain.c
@@ -46,8 +46,8 @@ static int valid_next_sp(unsigned long sp, unsigned long prev_sp)
 	return 0;
 }
 
-static void perf_callchain_kernel(struct pt_regs *regs,
-				  struct perf_callchain_entry *entry)
+void
+perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
 {
 	unsigned long sp, next_sp;
 	unsigned long next_ip;
@@ -221,8 +221,8 @@ static int sane_signal_64_frame(unsigned long sp)
 		puc == (unsigned long) &sf->uc;
 }
 
-static void perf_callchain_user_64(struct pt_regs *regs,
-				   struct perf_callchain_entry *entry)
+static void perf_callchain_user_64(struct perf_callchain_entry *entry,
+				   struct pt_regs *regs)
 {
 	unsigned long sp, next_sp;
 	unsigned long next_ip;
@@ -303,8 +303,8 @@ static int read_user_stack_32(unsigned int __user *ptr, unsigned int *ret)
 	return __get_user_inatomic(*ret, ptr);
 }
 
-static inline void perf_callchain_user_64(struct pt_regs *regs,
-					  struct perf_callchain_entry *entry)
+static inline void perf_callchain_user_64(struct perf_callchain_entry *entry,
+					  struct pt_regs *regs)
 {
 }
 
@@ -423,8 +423,8 @@ static unsigned int __user *signal_frame_32_regs(unsigned int sp,
 	return mctx->mc_gregs;
 }
 
-static void perf_callchain_user_32(struct pt_regs *regs,
-				   struct perf_callchain_entry *entry)
+static void perf_callchain_user_32(struct perf_callchain_entry *entry,
+				   struct pt_regs *regs)
 {
 	unsigned int sp, next_sp;
 	unsigned int next_ip;
@@ -471,32 +471,11 @@ static void perf_callchain_user_32(struct pt_regs *regs,
 	}
 }
 
-/*
- * Since we can't get PMU interrupts inside a PMU interrupt handler,
- * we don't need separate irq and nmi entries here.
- */
-static DEFINE_PER_CPU(struct perf_callchain_entry, cpu_perf_callchain);
-
-struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
+void
+perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
 {
-	struct perf_callchain_entry *entry = &__get_cpu_var(cpu_perf_callchain);
-
-	entry->nr = 0;
-
-	if (!user_mode(regs)) {
-		perf_callchain_kernel(regs, entry);
-		if (current->mm)
-			regs = task_pt_regs(current);
-		else
-			regs = NULL;
-	}
-
-	if (regs) {
-		if (current_is_64bit())
-			perf_callchain_user_64(regs, entry);
-		else
-			perf_callchain_user_32(regs, entry);
-	}
-
-	return entry;
+	if (current_is_64bit())
+		perf_callchain_user_64(entry, regs);
+	else
+		perf_callchain_user_32(entry, regs);
 }
diff --git a/arch/sh/kernel/perf_callchain.c b/arch/sh/kernel/perf_callchain.c
index 00143f3..ef076a9 100644
--- a/arch/sh/kernel/perf_callchain.c
+++ b/arch/sh/kernel/perf_callchain.c
@@ -44,44 +44,11 @@ static const struct stacktrace_ops callchain_ops = {
 	.address	= callchain_address,
 };
 
-static void
-perf_callchain_kernel(struct pt_regs *regs, struct perf_callchain_entry *entry)
+void
+perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
 {
 	perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
 	perf_callchain_store(entry, regs->pc);
 
 	unwind_stack(NULL, regs, NULL, &callchain_ops, entry);
 }
-
-static void
-perf_do_callchain(struct pt_regs *regs, struct perf_callchain_entry *entry)
-{
-	int is_user;
-
-	if (!regs)
-		return;
-
-	is_user = user_mode(regs);
-
-	/*
-	 * Only the kernel side is implemented for now.
-	 */
-	if (!is_user)
-		perf_callchain_kernel(regs, entry);
-}
-
-/*
- * No need for separate IRQ and NMI entries.
- */
-static DEFINE_PER_CPU(struct perf_callchain_entry, callchain);
-
-struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
-{
-	struct perf_callchain_entry *entry = &__get_cpu_var(callchain);
-
-	entry->nr = 0;
-
-	perf_do_callchain(regs, entry);
-
-	return entry;
-}
diff --git a/arch/sparc/kernel/perf_event.c b/arch/sparc/kernel/perf_event.c
index fbac5ad..7f0e44e 100644
--- a/arch/sparc/kernel/perf_event.c
+++ b/arch/sparc/kernel/perf_event.c
@@ -1282,14 +1282,16 @@ void __init init_hw_perf_events(void)
 	register_die_notifier(&perf_event_nmi_notifier);
 }
 
-static void perf_callchain_kernel(struct pt_regs *regs,
-				  struct perf_callchain_entry *entry)
+void perf_callchain_kernel(struct perf_callchain_entry *entry,
+			   struct pt_regs *regs)
 {
 	unsigned long ksp, fp;
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	int graph = 0;
 #endif
 
+	stack_trace_flush();
+
 	perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
 	perf_callchain_store(entry, regs->tpc);
 
@@ -1329,8 +1331,8 @@ static void perf_callchain_kernel(struct pt_regs *regs,
 	} while (entry->nr < PERF_MAX_STACK_DEPTH);
 }
 
-static void perf_callchain_user_64(struct pt_regs *regs,
-				   struct perf_callchain_entry *entry)
+static void perf_callchain_user_64(struct perf_callchain_entry *entry,
+				   struct pt_regs *regs)
 {
 	unsigned long ufp;
 
@@ -1352,8 +1354,8 @@ static void perf_callchain_user_64(struct pt_regs *regs,
 	} while (entry->nr < PERF_MAX_STACK_DEPTH);
 }
 
-static void perf_callchain_user_32(struct pt_regs *regs,
-				   struct perf_callchain_entry *entry)
+static void perf_callchain_user_32(struct perf_callchain_entry *entry,
+				   struct pt_regs *regs)
 {
 	unsigned long ufp;
 
@@ -1375,30 +1377,12 @@ static void perf_callchain_user_32(struct pt_regs *regs,
 	} while (entry->nr < PERF_MAX_STACK_DEPTH);
 }
 
-/* Like powerpc we can't get PMU interrupts within the PMU handler,
- * so no need for separate NMI and IRQ chains as on x86.
- */
-static DEFINE_PER_CPU(struct perf_callchain_entry, callchain);
-
-struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
+void
+perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
 {
-	struct perf_callchain_entry *entry = &__get_cpu_var(callchain);
-
-	entry->nr = 0;
-	if (!user_mode(regs)) {
-		stack_trace_flush();
-		perf_callchain_kernel(regs, entry);
-		if (current->mm)
-			regs = task_pt_regs(current);
-		else
-			regs = NULL;
-	}
-	if (regs) {
-		flushw_user();
-		if (test_thread_flag(TIF_32BIT))
-			perf_callchain_user_32(regs, entry);
-		else
-			perf_callchain_user_64(regs, entry);
-	}
-	return entry;
+	flushw_user();
+	if (test_thread_flag(TIF_32BIT))
+		perf_callchain_user_32(entry, regs);
+	else
+		perf_callchain_user_64(entry, regs);
 }
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 8af28ca..39f8421 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1571,9 +1571,7 @@ const struct pmu *hw_perf_event_init(struct perf_event *event)
  * callchain support
  */
 
-
-static DEFINE_PER_CPU(struct perf_callchain_entry, pmc_irq_entry);
-static DEFINE_PER_CPU(struct perf_callchain_entry, pmc_nmi_entry);
+static DEFINE_PER_CPU(struct perf_callchain_entry, perf_callchain_entry_nmi);
 
 
 static void
@@ -1607,8 +1605,8 @@ static const struct stacktrace_ops backtrace_ops = {
 	.walk_stack		= print_context_stack_bp,
 };
 
-static void
-perf_callchain_kernel(struct pt_regs *regs, struct perf_callchain_entry *entry)
+void
+perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
 {
 	perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
 	perf_callchain_store(entry, regs->ip);
@@ -1653,14 +1651,12 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
 }
 #endif
 
-static void
-perf_callchain_user(struct pt_regs *regs, struct perf_callchain_entry *entry)
+void
+perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
 {
 	struct stack_frame frame;
 	const void __user *fp;
 
-	if (!user_mode(regs))
-		regs = task_pt_regs(current);
 
 	fp = (void __user *)regs->bp;
 
@@ -1687,42 +1683,17 @@ perf_callchain_user(struct pt_regs *regs, struct perf_callchain_entry *entry)
 	}
 }
 
-static void
-perf_do_callchain(struct pt_regs *regs, struct perf_callchain_entry *entry)
-{
-	int is_user;
-
-	if (!regs)
-		return;
-
-	is_user = user_mode(regs);
-
-	if (!is_user)
-		perf_callchain_kernel(regs, entry);
-
-	if (current->mm)
-		perf_callchain_user(regs, entry);
-}
-
-struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
+struct perf_callchain_entry *perf_callchain_buffer(void)
 {
-	struct perf_callchain_entry *entry;
-
 	if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
 		/* TODO: We don't support guest os callchain now */
 		return NULL;
 	}
 
 	if (in_nmi())
-		entry = &__get_cpu_var(pmc_nmi_entry);
-	else
-		entry = &__get_cpu_var(pmc_irq_entry);
-
-	entry->nr = 0;
-
-	perf_do_callchain(regs, entry);
+		return &__get_cpu_var(perf_callchain_entry_nmi);
 
-	return entry;
+	return &__get_cpu_var(perf_callchain_entry);
 }
 
 unsigned long perf_instruction_pointer(struct pt_regs *regs)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 3588804..4db61dd 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -976,7 +976,15 @@ extern int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks
 extern void perf_event_comm(struct task_struct *tsk);
 extern void perf_event_fork(struct task_struct *tsk);
 
-extern struct perf_callchain_entry *perf_callchain(struct pt_regs *regs);
+/* Callchains */
+DECLARE_PER_CPU(struct perf_callchain_entry, perf_callchain_entry);
+
+extern void perf_callchain_user(struct perf_callchain_entry *entry,
+				struct pt_regs *regs);
+extern void perf_callchain_kernel(struct perf_callchain_entry *entry,
+				  struct pt_regs *regs);
+extern struct perf_callchain_entry *perf_callchain_buffer(void);
+
 
 static inline void
 perf_callchain_store(struct perf_callchain_entry *entry, u64 ip)
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index c772a3d..02efde6 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -2937,13 +2937,49 @@ void perf_event_do_pending(void)
 	__perf_pending_run();
 }
 
+DEFINE_PER_CPU(struct perf_callchain_entry, perf_callchain_entry);
+
 /*
  * Callchain support -- arch specific
  */
 
-__weak struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
+__weak struct perf_callchain_entry *perf_callchain_buffer(void)
 {
-	return NULL;
+	return &__get_cpu_var(perf_callchain_entry);
+}
+
+__weak void perf_callchain_kernel(struct perf_callchain_entry *entry,
+				  struct pt_regs *regs)
+{
+}
+
+__weak void perf_callchain_user(struct perf_callchain_entry *entry,
+				struct pt_regs *regs)
+{
+}
+
+static struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
+{
+	struct perf_callchain_entry *entry;
+
+	entry = perf_callchain_buffer();
+	if (!entry)
+		return NULL;
+
+	entry->nr = 0;
+
+	if (!user_mode(regs)) {
+		perf_callchain_kernel(entry, regs);
+		if (current->mm)
+			regs = task_pt_regs(current);
+		else
+			regs = NULL;
+	}
+
+	if (regs)
+		perf_callchain_user(entry, regs);
+
+	return entry;
 }
 
 
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH 4/6] perf: Factorize callchain context handling
  2010-07-01 15:35 [RFC PATCH 0/6] perf: cleanup and fixes Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2010-07-01 15:35 ` [RFC PATCH 3/6] perf: Generalize some arch callchain code Frederic Weisbecker
@ 2010-07-01 15:36 ` Frederic Weisbecker
  2010-07-01 15:36 ` [RFC PATCH 5/6] perf: Fix race in callchains Frederic Weisbecker
  2010-07-01 15:36 ` [RFC PATCH 6/6] perf: Fix double put_ctx Frederic Weisbecker
  5 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-07-01 15:36 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Ingo Molnar, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Stephane Eranian,
	David Miller, Paul Mundt, Will Deacon, Borislav Petkov

Store the kernel and user context markers from the generic layer
instead of from each arch; this factors out some repetitive code.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Borislav Petkov <bp@amd64.org>
---
 arch/arm/kernel/perf_event.c         |    2 --
 arch/powerpc/kernel/perf_callchain.c |    3 ---
 arch/sh/kernel/perf_callchain.c      |    1 -
 arch/sparc/kernel/perf_event.c       |    3 ---
 arch/x86/kernel/cpu/perf_event.c     |    2 --
 kernel/perf_event.c                  |    5 ++++-
 6 files changed, 4 insertions(+), 12 deletions(-)

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index ddb5134..8df0f8c 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -3049,7 +3049,6 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
 {
 	struct frame_tail *tail;
 
-	perf_callchain_store(entry, PERF_CONTEXT_USER);
 
 	tail = (struct frame_tail *)regs->ARM_fp - 1;
 
@@ -3076,7 +3075,6 @@ perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
 {
 	struct stackframe fr;
 
-	perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
 	fr.fp = regs->ARM_fp;
 	fr.sp = regs->ARM_sp;
 	fr.lr = regs->ARM_lr;
diff --git a/arch/powerpc/kernel/perf_callchain.c b/arch/powerpc/kernel/perf_callchain.c
index f7a85ed..d05ae42 100644
--- a/arch/powerpc/kernel/perf_callchain.c
+++ b/arch/powerpc/kernel/perf_callchain.c
@@ -57,7 +57,6 @@ perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
 
 	lr = regs->link;
 	sp = regs->gpr[1];
-	perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
 	perf_callchain_store(entry, regs->nip);
 
 	if (!validate_sp(sp, current, STACK_FRAME_OVERHEAD))
@@ -234,7 +233,6 @@ static void perf_callchain_user_64(struct perf_callchain_entry *entry,
 	next_ip = regs->nip;
 	lr = regs->link;
 	sp = regs->gpr[1];
-	perf_callchain_store(entry, PERF_CONTEXT_USER);
 	perf_callchain_store(entry, next_ip);
 
 	for (;;) {
@@ -435,7 +433,6 @@ static void perf_callchain_user_32(struct perf_callchain_entry *entry,
 	next_ip = regs->nip;
 	lr = regs->link;
 	sp = regs->gpr[1];
-	perf_callchain_store(entry, PERF_CONTEXT_USER);
 	perf_callchain_store(entry, next_ip);
 
 	while (entry->nr < PERF_MAX_STACK_DEPTH) {
diff --git a/arch/sh/kernel/perf_callchain.c b/arch/sh/kernel/perf_callchain.c
index ef076a9..d5ca1ef 100644
--- a/arch/sh/kernel/perf_callchain.c
+++ b/arch/sh/kernel/perf_callchain.c
@@ -47,7 +47,6 @@ static const struct stacktrace_ops callchain_ops = {
 void
 perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
 {
-	perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
 	perf_callchain_store(entry, regs->pc);
 
 	unwind_stack(NULL, regs, NULL, &callchain_ops, entry);
diff --git a/arch/sparc/kernel/perf_event.c b/arch/sparc/kernel/perf_event.c
index 7f0e44e..c93bcac 100644
--- a/arch/sparc/kernel/perf_event.c
+++ b/arch/sparc/kernel/perf_event.c
@@ -1292,7 +1292,6 @@ void perf_callchain_kernel(struct perf_callchain_entry *entry,
 
 	stack_trace_flush();
 
-	perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
 	perf_callchain_store(entry, regs->tpc);
 
 	ksp = regs->u_regs[UREG_I6];
@@ -1336,7 +1335,6 @@ static void perf_callchain_user_64(struct perf_callchain_entry *entry,
 {
 	unsigned long ufp;
 
-	perf_callchain_store(entry, PERF_CONTEXT_USER);
 	perf_callchain_store(entry, regs->tpc);
 
 	ufp = regs->u_regs[UREG_I6] + STACK_BIAS;
@@ -1359,7 +1357,6 @@ static void perf_callchain_user_32(struct perf_callchain_entry *entry,
 {
 	unsigned long ufp;
 
-	perf_callchain_store(entry, PERF_CONTEXT_USER);
 	perf_callchain_store(entry, regs->tpc);
 
 	ufp = regs->u_regs[UREG_I6] & 0xffffffffUL;
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 39f8421..a3c9222 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1608,7 +1608,6 @@ static const struct stacktrace_ops backtrace_ops = {
 void
 perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
 {
-	perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
 	perf_callchain_store(entry, regs->ip);
 
 	dump_trace(NULL, regs, NULL, regs->bp, &backtrace_ops, entry);
@@ -1660,7 +1659,6 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
 
 	fp = (void __user *)regs->bp;
 
-	perf_callchain_store(entry, PERF_CONTEXT_USER);
 	perf_callchain_store(entry, regs->ip);
 
 	if (perf_callchain_user32(regs, entry))
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 02efde6..615d024 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -2969,6 +2969,7 @@ static struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
 	entry->nr = 0;
 
 	if (!user_mode(regs)) {
+		perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
 		perf_callchain_kernel(entry, regs);
 		if (current->mm)
 			regs = task_pt_regs(current);
@@ -2976,8 +2977,10 @@ static struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
 			regs = NULL;
 	}
 
-	if (regs)
+	if (regs) {
+		perf_callchain_store(entry, PERF_CONTEXT_USER);
 		perf_callchain_user(entry, regs);
+	}
 
 	return entry;
 }
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH 5/6] perf: Fix race in callchains
  2010-07-01 15:35 [RFC PATCH 0/6] perf: cleanup and fixes Frederic Weisbecker
                   ` (3 preceding siblings ...)
  2010-07-01 15:36 ` [RFC PATCH 4/6] perf: Factorize callchain context handling Frederic Weisbecker
@ 2010-07-01 15:36 ` Frederic Weisbecker
  2010-07-01 15:42   ` Frederic Weisbecker
  2010-07-02 18:07   ` Peter Zijlstra
  2010-07-01 15:36 ` [RFC PATCH 6/6] perf: Fix double put_ctx Frederic Weisbecker
  5 siblings, 2 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-07-01 15:36 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Ingo Molnar, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Stephane Eranian,
	Will Deacon, Paul Mundt, David Miller, Borislav Petkov

Now that software events no longer run with interrupts disabled in
the event path, callchains can nest on any context. So separating
NMI and the other contexts into two buffers has become racy.

Fix this by providing one buffer per nesting level. Given the size
of the callchain entries (2040 bytes * 4), we now need to allocate
them dynamically.
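
For scale (assuming PERF_MAX_STACK_DEPTH is 255 here, which is where
the 2040 figure comes from):

    255 slots * 8 bytes = 2040 bytes per callchain entry
    4 nesting levels (task, softirq, irq, nmi) -> about 8k per cpu

so the buffers are only allocated once the first callchain event is
created, and released (via RCU) when the last one goes away.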

(The guest checks in x86 should probably be moved elsewhere).

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David Miller <davem@davemloft.net>
Cc: Borislav Petkov <bp@amd64.org>
---
 arch/x86/kernel/cpu/perf_event.c |   22 ++--
 include/linux/perf_event.h       |    1 -
 kernel/perf_event.c              |  265 ++++++++++++++++++++++++++++----------
 3 files changed, 205 insertions(+), 83 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index a3c9222..8e91cf3 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1608,6 +1608,11 @@ static const struct stacktrace_ops backtrace_ops = {
 void
 perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
 {
+	if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
+		/* TODO: We don't support guest os callchain now */
+		return;
+	}
+
 	perf_callchain_store(entry, regs->ip);
 
 	dump_trace(NULL, regs, NULL, regs->bp, &backtrace_ops, entry);
@@ -1656,6 +1661,10 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
 	struct stack_frame frame;
 	const void __user *fp;
 
+	if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
+		/* TODO: We don't support guest os callchain now */
+		return;
+	}
 
 	fp = (void __user *)regs->bp;
 
@@ -1681,19 +1690,6 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
 	}
 }
 
-struct perf_callchain_entry *perf_callchain_buffer(void)
-{
-	if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
-		/* TODO: We don't support guest os callchain now */
-		return NULL;
-	}
-
-	if (in_nmi())
-		return &__get_cpu_var(perf_callchain_entry_nmi);
-
-	return &__get_cpu_var(perf_callchain_entry);
-}
-
 unsigned long perf_instruction_pointer(struct pt_regs *regs)
 {
 	unsigned long ip;
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 4db61dd..d7e8ea6 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -983,7 +983,6 @@ extern void perf_callchain_user(struct perf_callchain_entry *entry,
 				struct pt_regs *regs);
 extern void perf_callchain_kernel(struct perf_callchain_entry *entry,
 				  struct pt_regs *regs);
-extern struct perf_callchain_entry *perf_callchain_buffer(void);
 
 
 static inline void
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 615d024..b6e854f 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -1764,6 +1764,183 @@ static u64 perf_event_read(struct perf_event *event)
 }
 
 /*
+ * Callchain support
+ */
+
+struct perf_callchain_entry_cpus {
+	struct perf_callchain_entry	__percpu *entries;
+	struct rcu_head				  rcu_head;
+};
+
+static DEFINE_PER_CPU(int, callchain_recursion);
+static int nr_callchain_events;
+static DEFINE_MUTEX(callchain_mutex);
+static struct perf_callchain_entry_cpus *callchain_entries[4];
+
+__weak void perf_callchain_kernel(struct perf_callchain_entry *entry,
+				  struct pt_regs *regs)
+{
+}
+
+__weak void perf_callchain_user(struct perf_callchain_entry *entry,
+				struct pt_regs *regs)
+{
+}
+
+static int get_callchain_buffers(void)
+{
+	int i;
+	int err = 0;
+	struct perf_callchain_entry_cpus *buf;
+
+	mutex_lock(&callchain_mutex);
+
+	if (WARN_ON_ONCE(++nr_callchain_events < 1)) {
+		err = -EINVAL;
+		goto exit;
+	}
+
+	if (nr_callchain_events > 1)
+		goto exit;
+
+	for (i = 0; i < 4; i++) {
+		buf = kzalloc(sizeof(*buf), GFP_KERNEL);
+		/* free_event() will clean the rest */
+		if (!buf) {
+			err = -ENOMEM;
+			goto exit;
+		}
+		buf->entries = alloc_percpu(struct perf_callchain_entry);
+		if (!buf->entries) {
+			kfree(buf);
+			err = -ENOMEM;
+			goto exit;
+		}
+		rcu_assign_pointer(callchain_entries[i], buf);
+	}
+
+exit:
+	mutex_unlock(&callchain_mutex);
+
+	return err;
+}
+
+static void release_callchain_buffers(struct rcu_head *head)
+{
+	struct perf_callchain_entry_cpus *entry;
+
+	entry = container_of(head, struct perf_callchain_entry_cpus, rcu_head);
+	free_percpu(entry->entries);
+	kfree(entry);
+}
+
+static void put_callchain_buffers(void)
+{
+	int i;
+	struct perf_callchain_entry_cpus *entry;
+
+	mutex_lock(&callchain_mutex);
+
+	if (WARN_ON_ONCE(--nr_callchain_events < 0))
+		goto exit;
+
+	if (nr_callchain_events > 0)
+		goto exit;
+
+	for (i = 0; i < 4; i++) {
+		entry = callchain_entries[i];
+		if (entry) {
+			callchain_entries[i] = NULL;
+			call_rcu(&entry->rcu_head, release_callchain_buffers);
+		}
+	}
+
+exit:
+	mutex_unlock(&callchain_mutex);
+}
+
+static int get_recursion_context(int *recursion)
+{
+	int rctx;
+
+	if (in_nmi())
+		rctx = 3;
+	else if (in_irq())
+		rctx = 2;
+	else if (in_softirq())
+		rctx = 1;
+	else
+		rctx = 0;
+
+	if (recursion[rctx])
+		return -1;
+
+	recursion[rctx]++;
+	barrier();
+
+	return rctx;
+}
+
+static inline void put_recursion_context(int *recursion, int rctx)
+{
+	barrier();
+	recursion[rctx]--;
+}
+
+static struct perf_callchain_entry *get_callchain_entry(int *rctx)
+{
+	struct perf_callchain_entry_cpus *cpu_entries;
+
+	*rctx = get_recursion_context(&__get_cpu_var(callchain_recursion));
+	if (*rctx == -1)
+		return NULL;
+
+	cpu_entries = rcu_dereference(callchain_entries[*rctx]);
+	if (!cpu_entries)
+		return NULL;
+
+	return this_cpu_ptr(cpu_entries->entries);
+}
+
+static void
+put_callchain_entry(int rctx)
+{
+	put_recursion_context(&__get_cpu_var(callchain_recursion), rctx);
+}
+
+static struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
+{
+	int rctx;
+	struct perf_callchain_entry *entry;
+
+
+	entry = get_callchain_entry(&rctx);
+	if (!entry)
+		goto exit_put;
+
+	entry->nr = 0;
+
+	if (!user_mode(regs)) {
+		perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
+		perf_callchain_kernel(entry, regs);
+		if (current->mm)
+			regs = task_pt_regs(current);
+		else
+			regs = NULL;
+	}
+
+	if (regs) {
+		perf_callchain_store(entry, PERF_CONTEXT_USER);
+		perf_callchain_user(entry, regs);
+	}
+
+exit_put:
+	put_callchain_entry(rctx);
+
+	return entry;
+}
+
+/*
  * Initialize the perf_event context in a task_struct:
  */
 static void
@@ -1895,6 +2072,8 @@ static void free_event(struct perf_event *event)
 			atomic_dec(&nr_comm_events);
 		if (event->attr.task)
 			atomic_dec(&nr_task_events);
+		if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)
+			put_callchain_buffers();
 	}
 
 	if (event->buffer) {
@@ -2937,55 +3116,6 @@ void perf_event_do_pending(void)
 	__perf_pending_run();
 }
 
-DEFINE_PER_CPU(struct perf_callchain_entry, perf_callchain_entry);
-
-/*
- * Callchain support -- arch specific
- */
-
-__weak struct perf_callchain_entry *perf_callchain_buffer(void)
-{
-	return &__get_cpu_var(perf_callchain_entry);
-}
-
-__weak void perf_callchain_kernel(struct perf_callchain_entry *entry,
-				  struct pt_regs *regs)
-{
-}
-
-__weak void perf_callchain_user(struct perf_callchain_entry *entry,
-				struct pt_regs *regs)
-{
-}
-
-static struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
-{
-	struct perf_callchain_entry *entry;
-
-	entry = perf_callchain_buffer();
-	if (!entry)
-		return NULL;
-
-	entry->nr = 0;
-
-	if (!user_mode(regs)) {
-		perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
-		perf_callchain_kernel(entry, regs);
-		if (current->mm)
-			regs = task_pt_regs(current);
-		else
-			regs = NULL;
-	}
-
-	if (regs) {
-		perf_callchain_store(entry, PERF_CONTEXT_USER);
-		perf_callchain_user(entry, regs);
-	}
-
-	return entry;
-}
-
-
 /*
  * We assume there is only KVM supporting the callbacks.
  * Later on, we might change it to a list if there is
@@ -3480,14 +3610,20 @@ static void perf_event_output(struct perf_event *event, int nmi,
 	struct perf_output_handle handle;
 	struct perf_event_header header;
 
+	/* protect the callchain buffers */
+	rcu_read_lock();
+
 	perf_prepare_sample(&header, data, event, regs);
 
 	if (perf_output_begin(&handle, event, header.size, nmi, 1))
-		return;
+		goto exit;
 
 	perf_output_sample(&handle, &header, data, event);
 
 	perf_output_end(&handle);
+
+exit:
+	rcu_read_unlock();
 }
 
 /*
@@ -4243,32 +4379,16 @@ end:
 int perf_swevent_get_recursion_context(void)
 {
 	struct perf_cpu_context *cpuctx = &__get_cpu_var(perf_cpu_context);
-	int rctx;
 
-	if (in_nmi())
-		rctx = 3;
-	else if (in_irq())
-		rctx = 2;
-	else if (in_softirq())
-		rctx = 1;
-	else
-		rctx = 0;
-
-	if (cpuctx->recursion[rctx])
-		return -1;
-
-	cpuctx->recursion[rctx]++;
-	barrier();
-
-	return rctx;
+	return get_recursion_context(cpuctx->recursion);
 }
 EXPORT_SYMBOL_GPL(perf_swevent_get_recursion_context);
 
 void inline perf_swevent_put_recursion_context(int rctx)
 {
 	struct perf_cpu_context *cpuctx = &__get_cpu_var(perf_cpu_context);
-	barrier();
-	cpuctx->recursion[rctx]--;
+
+	put_recursion_context(cpuctx->recursion, rctx);
 }
 
 void __perf_sw_event(u32 event_id, u64 nr, int nmi,
@@ -4968,6 +5088,13 @@ done:
 			atomic_inc(&nr_comm_events);
 		if (event->attr.task)
 			atomic_inc(&nr_task_events);
+		if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN) {
+			err = get_callchain_buffers();
+			if (err) {
+				free_event(event);
+				return ERR_PTR(err);
+			}
+		}
 	}
 
 	return event;
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC PATCH 6/6] perf: Fix double put_ctx
  2010-07-01 15:35 [RFC PATCH 0/6] perf: cleanup and fixes Frederic Weisbecker
                   ` (4 preceding siblings ...)
  2010-07-01 15:36 ` [RFC PATCH 5/6] perf: Fix race in callchains Frederic Weisbecker
@ 2010-07-01 15:36 ` Frederic Weisbecker
  5 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-07-01 15:36 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Ingo Molnar, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Stephane Eranian,
	Will Deacon, David Miller, Paul Mundt, Borislav Petkov

If we call free_event() in the failure path of event creation, it
already puts the context. The fallthrough goto, though, also does a
put_ctx(), which might dereference a freed context.
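
In other words, the failing path used to end up doing roughly this
(simplified sketch of the flow, not a verbatim copy of
perf_event_open()):

    free_event(event);                   /* already drops the ctx reference */
    fput_light(group_file, fput_needed);
    put_ctx(ctx);                        /* second put on a ctx that may be gone */

so the fix makes the free_event() error label bail out before reaching
the extra put_ctx().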

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Borislav Petkov <bp@amd64.org>
---
 kernel/perf_event.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index b6e854f..925b53e 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -5364,6 +5364,8 @@ SYSCALL_DEFINE5(perf_event_open,
 
 err_free_put_context:
 	free_event(event);
+	fput_light(group_file, fput_needed);
+	goto err_fd;
 err_put_context:
 	fput_light(group_file, fput_needed);
 	put_ctx(ctx);
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 5/6] perf: Fix race in callchains
  2010-07-01 15:36 ` [RFC PATCH 5/6] perf: Fix race in callchains Frederic Weisbecker
@ 2010-07-01 15:42   ` Frederic Weisbecker
  2010-07-02 18:07   ` Peter Zijlstra
  1 sibling, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-07-01 15:42 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Paul Mackerras, Stephane Eranian, Will Deacon, Paul Mundt,
	David Miller, Borislav Petkov

On Thu, Jul 01, 2010 at 05:36:01PM +0200, Frederic Weisbecker wrote:
> +static struct perf_callchain_entry *get_callchain_entry(int *rctx)
> +{
> +	struct perf_callchain_entry_cpus *cpu_entries;
> +
> +	*rctx = get_recursion_context(&__get_cpu_var(callchain_recursion));
> +	if (*rctx == -1)
> +		return NULL;
> +
> +	cpu_entries = rcu_dereference(callchain_entries[*rctx]);
> +	if (!cpu_entries)
> +		return NULL;
> +
> +	return this_cpu_ptr(cpu_entries->entries);
> +}
> +
> +static void
> +put_callchain_entry(int rctx)
> +{
> +	put_recursion_context(&__get_cpu_var(callchain_recursion), rctx);
> +}
> +
> +static struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
> +{
> +	int rctx;
> +	struct perf_callchain_entry *entry;
> +
> +
> +	entry = get_callchain_entry(&rctx);
> +	if (!entry)
> +		goto exit_put;


I realize this should skip put_callchain_entry() when rctx == -1, but you
get the idea...
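
Something along these lines, i.e. only drop the recursion context when
we actually got one (sketch):

    exit_put:
            if (rctx != -1)
                    put_callchain_entry(rctx);

            return entry;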



> +
> +	entry->nr = 0;
> +
> +	if (!user_mode(regs)) {
> +		perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
> +		perf_callchain_kernel(entry, regs);
> +		if (current->mm)
> +			regs = task_pt_regs(current);
> +		else
> +			regs = NULL;
> +	}
> +
> +	if (regs) {
> +		perf_callchain_store(entry, PERF_CONTEXT_USER);
> +		perf_callchain_user(entry, regs);
> +	}
> +
> +exit_put:
> +	put_callchain_entry(rctx);
> +
> +	return entry;
> +}


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 3/6] perf: Generalize some arch callchain code
  2010-07-01 15:35 ` [RFC PATCH 3/6] perf: Generalize some arch callchain code Frederic Weisbecker
@ 2010-07-01 15:46   ` Peter Zijlstra
  2010-07-01 15:47     ` Frederic Weisbecker
  2010-07-01 15:49     ` Frederic Weisbecker
  0 siblings, 2 replies; 16+ messages in thread
From: Peter Zijlstra @ 2010-07-01 15:46 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Ingo Molnar, Arnaldo Carvalho de Melo, Paul Mackerras,
	Stephane Eranian, David Miller, Paul Mundt, Will Deacon,
	Borislav Petkov

On Thu, 2010-07-01 at 17:35 +0200, Frederic Weisbecker wrote:
> - Most archs use one callchain buffer per cpu, except x86 that needs
>   to deal with NMIs. Provide a default perf_callchain_buffer()
>   implementation that x86 overrides. 

sparc and power also have NMI like contexts.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 3/6] perf: Generalize some arch callchain code
  2010-07-01 15:46   ` Peter Zijlstra
@ 2010-07-01 15:47     ` Frederic Weisbecker
  2010-07-01 15:49     ` Frederic Weisbecker
  1 sibling, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-07-01 15:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Ingo Molnar, Arnaldo Carvalho de Melo, Paul Mackerras,
	Stephane Eranian, David Miller, Paul Mundt, Will Deacon,
	Borislav Petkov

On Thu, Jul 01, 2010 at 05:46:01PM +0200, Peter Zijlstra wrote:
> On Thu, 2010-07-01 at 17:35 +0200, Frederic Weisbecker wrote:
> > - Most archs use one callchain buffer per cpu, except x86 that needs
> >   to deal with NMIs. Provide a default perf_callchain_buffer()
> >   implementation that x86 overrides. 
> 
> sparc and power also have NMI like contexts.


Right, but they didn't split that into different nested buffers.
The 5th patch does that anyway.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 3/6] perf: Generalize some arch callchain code
  2010-07-01 15:46   ` Peter Zijlstra
  2010-07-01 15:47     ` Frederic Weisbecker
@ 2010-07-01 15:49     ` Frederic Weisbecker
  2010-07-01 15:51       ` Peter Zijlstra
  1 sibling, 1 reply; 16+ messages in thread
From: Frederic Weisbecker @ 2010-07-01 15:49 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Ingo Molnar, Arnaldo Carvalho de Melo, Paul Mackerras,
	Stephane Eranian, David Miller, Paul Mundt, Will Deacon,
	Borislav Petkov

On Thu, Jul 01, 2010 at 05:46:01PM +0200, Peter Zijlstra wrote:
> On Thu, 2010-07-01 at 17:35 +0200, Frederic Weisbecker wrote:
> > - Most archs use one callchain buffer per cpu, except x86 that needs
> >   to deal with NMIs. Provide a default perf_callchain_buffer()
> >   implementation that x86 overrides. 
> 
> sparc and power also have NMI like contexts.


Ah, and the comments suggest it's because PMU interrupts can't nest, or
something like that. Anyway, that's notwithstanding the race that the
5th patch fixes.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 3/6] perf: Generalize some arch callchain code
  2010-07-01 15:49     ` Frederic Weisbecker
@ 2010-07-01 15:51       ` Peter Zijlstra
  2010-07-01 15:53         ` Frederic Weisbecker
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2010-07-01 15:51 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Ingo Molnar, Arnaldo Carvalho de Melo, Paul Mackerras,
	Stephane Eranian, David Miller, Paul Mundt, Will Deacon,
	Borislav Petkov

On Thu, 2010-07-01 at 17:49 +0200, Frederic Weisbecker wrote:
> On Thu, Jul 01, 2010 at 05:46:01PM +0200, Peter Zijlstra wrote:
> > On Thu, 2010-07-01 at 17:35 +0200, Frederic Weisbecker wrote:
> > > - Most archs use one callchain buffer per cpu, except x86 that needs
> > >   to deal with NMIs. Provide a default perf_callchain_buffer()
> > >   implementation that x86 overrides. 
> > 
> > sparc and power also have NMI like contexts.
> 
> 
> Ah and the comments suggest it's because pmu interrupts can't nest or so.
> Anyway, that's notwithstanding the race that the 5th patch fixes.

Right, but they could interrupt a software event or the like.

But yeah, the whole callchain-can-nest issue is real, thanks for
fixing that.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 3/6] perf: Generalize some arch callchain code
  2010-07-01 15:51       ` Peter Zijlstra
@ 2010-07-01 15:53         ` Frederic Weisbecker
  0 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-07-01 15:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Ingo Molnar, Arnaldo Carvalho de Melo, Paul Mackerras,
	Stephane Eranian, David Miller, Paul Mundt, Will Deacon,
	Borislav Petkov

On Thu, Jul 01, 2010 at 05:51:22PM +0200, Peter Zijlstra wrote:
> On Thu, 2010-07-01 at 17:49 +0200, Frederic Weisbecker wrote:
> > On Thu, Jul 01, 2010 at 05:46:01PM +0200, Peter Zijlstra wrote:
> > > On Thu, 2010-07-01 at 17:35 +0200, Frederic Weisbecker wrote:
> > > > - Most archs use one callchain buffer per cpu, except x86 that needs
> > > >   to deal with NMIs. Provide a default perf_callchain_buffer()
> > > >   implementation that x86 overrides. 
> > > 
> > > sparc and power also have NMI like contexts.
> > 
> > 
> > Ah and the comments suggest it's because pmu interrupts can't nest or so.
> > Anyway, that's notwithstanding the race that the 5th patch fixes.
> 
> Right, but they could interrupt a software event or the like.


Yeah indeed.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 5/6] perf: Fix race in callchains
  2010-07-01 15:36 ` [RFC PATCH 5/6] perf: Fix race in callchains Frederic Weisbecker
  2010-07-01 15:42   ` Frederic Weisbecker
@ 2010-07-02 18:07   ` Peter Zijlstra
  2010-07-03 20:28     ` Frederic Weisbecker
  1 sibling, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2010-07-02 18:07 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Ingo Molnar, Arnaldo Carvalho de Melo, Paul Mackerras,
	Stephane Eranian, Will Deacon, Paul Mundt, David Miller,
	Borislav Petkov

On Thu, 2010-07-01 at 17:36 +0200, Frederic Weisbecker wrote:
> Now that software events don't have interrupts disabled anymore in
> the event path, callchains can nest in any context. So separating
> NMI and other contexts into two buffers has become racy.
> 
> Fix this by providing one buffer per nesting level. Given the size
> of the callchain entries (2040 bytes * 4), we now need to allocate
> them dynamically.

OK so I guess you want to allocate them dynamically because 8k per cpu
is too much to always have around?

> +static int get_callchain_buffers(void)
> +{
> +	int i;
> +	int err = 0;
> +	struct perf_callchain_entry_cpus *buf;
> +
> +	mutex_lock(&callchain_mutex);
> +
> +	if (WARN_ON_ONCE(++nr_callchain_events < 1)) {
> +		err = -EINVAL;
> +		goto exit;
> +	}
> +
> +	if (nr_callchain_events > 1)
> +		goto exit;
> +
> +	for (i = 0; i < 4; i++) {
> +		buf = kzalloc(sizeof(*buf), GFP_KERNEL);
> +		/* free_event() will clean the rest */
> +		if (!buf) {
> +			err = -ENOMEM;
> +			goto exit;
> +		}
> +		buf->entries = alloc_percpu(struct perf_callchain_entry);
> +		if (!buf->entries) {
> +			kfree(buf);
> +			err = -ENOMEM;
> +			goto exit;
> +		}
> +		rcu_assign_pointer(callchain_entries[i], buf);
> +	}
> +
> +exit:
> +	mutex_unlock(&callchain_mutex);
> +
> +	return err;
> +}

> +static void put_callchain_buffers(void)
> +{
> +	int i;
> +	struct perf_callchain_entry_cpus *entry;
> +
> +	mutex_lock(&callchain_mutex);
> +
> +	if (WARN_ON_ONCE(--nr_callchain_events < 0))
> +		goto exit;
> +
> +	if (nr_callchain_events > 0)
> +		goto exit;
> +
> +	for (i = 0; i < 4; i++) {
> +		entry = callchain_entries[i];
> +		if (entry) {
> +			callchain_entries[i] = NULL;
> +			call_rcu(&entry->rcu_head, release_callchain_buffers);
> +		}
> +	}
> +
> +exit:
> +	mutex_unlock(&callchain_mutex);
> +}

If you make nr_callchain_events an atomic_t, then you can do the
refcounting outside the mutex. See the existing user of
atomic_dec_and_mutex_lock().

I would also split it into get/put and alloc/free functions for clarity.
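
Roughly like this (completely untested sketch; alloc_callchain_buffers()
and release_callchain_buffers() stand in for whatever the alloc/free
halves end up being called):

static int get_callchain_buffers(void)
{
	int err = 0;

	mutex_lock(&callchain_mutex);

	/* the first user allocates the per-cpu buffers */
	if (atomic_inc_return(&nr_callchain_events) == 1)
		err = alloc_callchain_buffers();

	mutex_unlock(&callchain_mutex);

	return err;
}

static void put_callchain_buffers(void)
{
	/* only the last user takes the mutex and frees the buffers */
	if (atomic_dec_and_mutex_lock(&nr_callchain_events,
				      &callchain_mutex)) {
		release_callchain_buffers();
		mutex_unlock(&callchain_mutex);
	}
}

That way the common put path doesn't touch the mutex at all.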

I'm not at all sure why you're using RCU though.

> @@ -1895,6 +2072,8 @@ static void free_event(struct perf_event *event)
>  			atomic_dec(&nr_comm_events);
>  		if (event->attr.task)
>  			atomic_dec(&nr_task_events);
> +		if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)
> +			put_callchain_buffers();
>  	}
>  
>  	if (event->buffer) {

If this was the last event, there's no callchain user left, so nobody can
be here:

> @@ -3480,14 +3610,20 @@ static void perf_event_output(struct perf_event *event, int nmi,
>  	struct perf_output_handle handle;
>  	struct perf_event_header header;
>  
> +	/* protect the callchain buffers */
> +	rcu_read_lock();
> +
>  	perf_prepare_sample(&header, data, event, regs);
>  
>  	if (perf_output_begin(&handle, event, header.size, nmi, 1))
> -		return;
> +		goto exit;
>  
>  	perf_output_sample(&handle, &header, data, event);
>  
>  	perf_output_end(&handle);
> +
> +exit:
> +	rcu_read_unlock();
>  }

Rendering that RCU stuff superfluous.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC PATCH 5/6] perf: Fix race in callchains
  2010-07-02 18:07   ` Peter Zijlstra
@ 2010-07-03 20:28     ` Frederic Weisbecker
  0 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-07-03 20:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Ingo Molnar, Arnaldo Carvalho de Melo, Paul Mackerras,
	Stephane Eranian, Will Deacon, Paul Mundt, David Miller,
	Borislav Petkov

On Fri, Jul 02, 2010 at 08:07:35PM +0200, Peter Zijlstra wrote:
> On Thu, 2010-07-01 at 17:36 +0200, Frederic Weisbecker wrote:
> > Now that software events don't have interrupts disabled anymore in
> > the event path, callchains can nest in any context. So separating
> > NMI and other contexts into two buffers has become racy.
> > 
> > Fix this by providing one buffer per nesting level. Given the size
> > of the callchain entries (2040 bytes * 4), we now need to allocate
> > them dynamically.
> 
> OK so I guess you want to allocate them dynamically because 8k per cpu
> is too much to always have around?



Right. I know that really adds complexity, and I hesitated a lot before
doing so. But I think it's quite necessary.


 
> > +static int get_callchain_buffers(void)
> > +{
> > +	int i;
> > +	int err = 0;
> > +	struct perf_callchain_entry_cpus *buf;
> > +
> > +	mutex_lock(&callchain_mutex);
> > +
> > +	if (WARN_ON_ONCE(++nr_callchain_events < 1)) {
> > +		err = -EINVAL;
> > +		goto exit;
> > +	}
> > +
> > +	if (nr_callchain_events > 1)
> > +		goto exit;
> > +
> > +	for (i = 0; i < 4; i++) {
> > +		buf = kzalloc(sizeof(*buf), GFP_KERNEL);
> > +		/* free_event() will clean the rest */
> > +		if (!buf) {
> > +			err = -ENOMEM;
> > +			goto exit;
> > +		}
> > +		buf->entries = alloc_percpu(struct perf_callchain_entry);
> > +		if (!buf->entries) {
> > +			kfree(buf);
> > +			err = -ENOMEM;
> > +			goto exit;
> > +		}
> > +		rcu_assign_pointer(callchain_entries[i], buf);
> > +	}
> > +
> > +exit:
> > +	mutex_unlock(&callchain_mutex);
> > +
> > +	return err;
> > +}
> 
> > +static void put_callchain_buffers(void)
> > +{
> > +	int i;
> > +	struct perf_callchain_entry_cpus *entry;
> > +
> > +	mutex_lock(&callchain_mutex);
> > +
> > +	if (WARN_ON_ONCE(--nr_callchain_events < 0))
> > +		goto exit;
> > +
> > +	if (nr_callchain_events > 0)
> > +		goto exit;
> > +
> > +	for (i = 0; i < 4; i++) {
> > +		entry = callchain_entries[i];
> > +		if (entry) {
> > +			callchain_entries[i] = NULL;
> > +			call_rcu(&entry->rcu_head, release_callchain_buffers);
> > +		}
> > +	}
> > +
> > +exit:
> > +	mutex_unlock(&callchain_mutex);
> > +}
> 
> If you make nr_callchain_events an atomic_t, then you can do the
> refcounting outside the mutex. See the existing user of
> atomic_dec_and_mutex_lock().
> 
> I would also split it into get/put and alloc/free functions for clarity.



Ok I will.




> I'm not at all sure why you're using RCU though.
> 
> > @@ -1895,6 +2072,8 @@ static void free_event(struct perf_event *event)
> >  			atomic_dec(&nr_comm_events);
> >  		if (event->attr.task)
> >  			atomic_dec(&nr_task_events);
> > +		if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)
> > +			put_callchain_buffers();
> >  	}
> >  
> >  	if (event->buffer) {
> 
> If this was the last event, there's no callchain user left, so nobody can
> be here:
> 
> > @@ -3480,14 +3610,20 @@ static void perf_event_output(struct perf_event *event, int nmi,
> >  	struct perf_output_handle handle;
> >  	struct perf_event_header header;
> >  
> > +	/* protect the callchain buffers */
> > +	rcu_read_lock();
> > +
> >  	perf_prepare_sample(&header, data, event, regs);
> >  
> >  	if (perf_output_begin(&handle, event, header.size, nmi, 1))
> > -		return;
> > +		goto exit;
> >  
> >  	perf_output_sample(&handle, &header, data, event);
> >  
> >  	perf_output_end(&handle);
> > +
> > +exit:
> > +	rcu_read_unlock();
> >  }
> 
> Rendering that RCU stuff superfluous.


Maybe I'm omitting something that would make it non-rcu-safe.

But consider a perf event running on CPU 1 while you close the fd on
CPU 0. CPU 1 has started to use a callchain buffer when it receives the
IPI to retire the event from the cpu, but it has yet to finish its
callchain processing.

If CPU 0 releases the callchain buffers right after that, CPU 1 may
crash in the middle.

So you need to wait for the grace period to end.
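
To make it concrete, the ordering I rely on is roughly this (hand-written
sketch; fill_callchain() is just a placeholder for the arch callchain walk):

	/* CPU 1: writing out a sample, possibly from NMI */
	rcu_read_lock();
	entry = rcu_dereference(callchain_entries[i]);
	if (entry)
		fill_callchain(entry, regs);	/* placeholder, still uses the buffer */
	rcu_read_unlock();

	/* CPU 0: last event has been freed, tear the buffers down */
	entry = callchain_entries[i];
	rcu_assign_pointer(callchain_entries[i], NULL);
	if (entry)
		call_rcu(&entry->rcu_head, release_callchain_buffers);

The buffer can only be actually freed once CPU 1 has left its rcu read
side, so it can't disappear under its feet.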


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [RFC PATCH 5/6] perf: Fix race in callchains
  2010-08-16 20:48 [RFC PATCH 0/0 v3] callchain fixes and cleanups Frederic Weisbecker
@ 2010-08-16 20:48 ` Frederic Weisbecker
  0 siblings, 0 replies; 16+ messages in thread
From: Frederic Weisbecker @ 2010-08-16 20:48 UTC (permalink / raw)
  To: LKML
  Cc: LKML, Frederic Weisbecker, Ingo Molnar, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Paul Mackerras, Stephane Eranian,
	Will Deacon, Paul Mundt, David Miller, Borislav Petkov

Now that software events don't have interrupts disabled anymore in
the event path, callchains can nest in any context. So separating
NMI and other contexts into two buffers has become racy.

Fix this by providing one buffer per nesting level. Given the size
of the callchain entries (2040 bytes * 4), we now need to allocate
them dynamically.

v2: Fixed the put_callchain_entry() call after recursion.
    Fixed the type of the recursion counter; it must be an array.

v3: Use a manual per cpu allocation (temporary solution until NMIs
    can safely access vmalloc'ed memory).
    Do a better separation between callchain reference tracking and
    allocation. Make the "put" path lockless for non-release cases.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David Miller <davem@davemloft.net>
Cc: Borislav Petkov <bp@amd64.org>
---
 arch/x86/kernel/cpu/perf_event.c |   22 ++--
 include/linux/perf_event.h       |    1 -
 kernel/perf_event.c              |  265 ++++++++++++++++++++++++++++----------
 3 files changed, 205 insertions(+), 83 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index a3c9222..8e91cf3 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1608,6 +1608,11 @@ static const struct stacktrace_ops backtrace_ops = {
 void
 perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
 {
+	if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
+		/* TODO: We don't support guest os callchain now */
+		return;
+	}
+
 	perf_callchain_store(entry, regs->ip);
 
 	dump_trace(NULL, regs, NULL, regs->bp, &backtrace_ops, entry);
@@ -1656,6 +1661,10 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
 	struct stack_frame frame;
 	const void __user *fp;
 
+	if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
+		/* TODO: We don't support guest os callchain now */
+		return;
+	}
 
 	fp = (void __user *)regs->bp;
 
@@ -1681,19 +1690,6 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
 	}
 }
 
-struct perf_callchain_entry *perf_callchain_buffer(void)
-{
-	if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
-		/* TODO: We don't support guest os callchain now */
-		return NULL;
-	}
-
-	if (in_nmi())
-		return &__get_cpu_var(perf_callchain_entry_nmi);
-
-	return &__get_cpu_var(perf_callchain_entry);
-}
-
 unsigned long perf_instruction_pointer(struct pt_regs *regs)
 {
 	unsigned long ip;
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 4db61dd..d7e8ea6 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -983,7 +983,6 @@ extern void perf_callchain_user(struct perf_callchain_entry *entry,
 				struct pt_regs *regs);
 extern void perf_callchain_kernel(struct perf_callchain_entry *entry,
 				  struct pt_regs *regs);
-extern struct perf_callchain_entry *perf_callchain_buffer(void);
 
 
 static inline void
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 615d024..6ad61fb 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -1764,6 +1764,183 @@ static u64 perf_event_read(struct perf_event *event)
 }
 
 /*
+ * Callchain support
+ */
+
+static DEFINE_PER_CPU(int, callchain_recursion[4]);
+static atomic_t nr_callchain_events;
+static DEFINE_MUTEX(callchain_mutex);
+static struct perf_callchain_entry **callchain_cpu_entries;
+
+__weak void perf_callchain_kernel(struct perf_callchain_entry *entry,
+				  struct pt_regs *regs)
+{
+}
+
+__weak void perf_callchain_user(struct perf_callchain_entry *entry,
+				struct pt_regs *regs)
+{
+}
+
+static void release_callchain_buffers(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		kfree(callchain_cpu_entries[cpu]);
+		callchain_cpu_entries[cpu] = NULL;
+	}
+
+	kfree(callchain_cpu_entries);
+	callchain_cpu_entries = NULL;
+}
+
+static int alloc_callchain_buffers(void)
+{
+	int cpu;
+	struct perf_callchain_entry **entries;
+
+	/*
+	 * We can't use the percpu allocation API for data that can be
+	 * accessed from NMI. Use a temporary manual per cpu allocation
+	 * until that gets sorted out.
+	 */
+	entries = kzalloc(sizeof(*entries) * num_possible_cpus(), GFP_KERNEL);
+	if (!entries)
+		return -ENOMEM;
+
+	callchain_cpu_entries = entries;
+
+	for_each_possible_cpu(cpu) {
+		entries[cpu] = kmalloc(sizeof(struct perf_callchain_entry) * 4,
+					GFP_KERNEL);
+		if (!entries[cpu])
+			return -ENOMEM;
+	}
+
+	return 0;
+}
+
+static int get_callchain_buffers(void)
+{
+	int err = 0;
+	int count;
+
+	mutex_lock(&callchain_mutex);
+
+	count = atomic_inc_return(&nr_callchain_events);
+	if (WARN_ON_ONCE(count < 1)) {
+		err = -EINVAL;
+		goto exit;
+	}
+
+	if (count > 1) {
+		/* If the allocation failed, give up */
+		if (!callchain_cpu_entries)
+			err = -ENOMEM;
+		goto exit;
+	}
+
+	err = alloc_callchain_buffers();
+	if (err)
+		release_callchain_buffers();
+exit:
+	mutex_unlock(&callchain_mutex);
+
+	return err;
+}
+
+static void put_callchain_buffers(void)
+{
+	if (atomic_dec_and_mutex_lock(&nr_callchain_events, &callchain_mutex)) {
+		release_callchain_buffers();
+		mutex_unlock(&callchain_mutex);
+	}
+}
+
+static int get_recursion_context(int *recursion)
+{
+	int rctx;
+
+	if (in_nmi())
+		rctx = 3;
+	else if (in_irq())
+		rctx = 2;
+	else if (in_softirq())
+		rctx = 1;
+	else
+		rctx = 0;
+
+	if (recursion[rctx])
+		return -1;
+
+	recursion[rctx]++;
+	barrier();
+
+	return rctx;
+}
+
+static inline void put_recursion_context(int *recursion, int rctx)
+{
+	barrier();
+	recursion[rctx]--;
+}
+
+static struct perf_callchain_entry *get_callchain_entry(int *rctx)
+{
+	int cpu;
+
+	*rctx = get_recursion_context(__get_cpu_var(callchain_recursion));
+	if (*rctx == -1)
+		return NULL;
+
+	cpu = smp_processor_id();
+
+	return &callchain_cpu_entries[cpu][*rctx];
+}
+
+static void
+put_callchain_entry(int rctx)
+{
+	put_recursion_context(__get_cpu_var(callchain_recursion), rctx);
+}
+
+static struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
+{
+	int rctx;
+	struct perf_callchain_entry *entry;
+
+
+	entry = get_callchain_entry(&rctx);
+	if (rctx == -1)
+		return NULL;
+
+	if (!entry)
+		goto exit_put;
+
+	entry->nr = 0;
+
+	if (!user_mode(regs)) {
+		perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
+		perf_callchain_kernel(entry, regs);
+		if (current->mm)
+			regs = task_pt_regs(current);
+		else
+			regs = NULL;
+	}
+
+	if (regs) {
+		perf_callchain_store(entry, PERF_CONTEXT_USER);
+		perf_callchain_user(entry, regs);
+	}
+
+exit_put:
+	put_callchain_entry(rctx);
+
+	return entry;
+}
+
+/*
  * Initialize the perf_event context in a task_struct:
  */
 static void
@@ -1895,6 +2072,8 @@ static void free_event(struct perf_event *event)
 			atomic_dec(&nr_comm_events);
 		if (event->attr.task)
 			atomic_dec(&nr_task_events);
+		if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)
+			put_callchain_buffers();
 	}
 
 	if (event->buffer) {
@@ -2937,55 +3116,6 @@ void perf_event_do_pending(void)
 	__perf_pending_run();
 }
 
-DEFINE_PER_CPU(struct perf_callchain_entry, perf_callchain_entry);
-
-/*
- * Callchain support -- arch specific
- */
-
-__weak struct perf_callchain_entry *perf_callchain_buffer(void)
-{
-	return &__get_cpu_var(perf_callchain_entry);
-}
-
-__weak void perf_callchain_kernel(struct perf_callchain_entry *entry,
-				  struct pt_regs *regs)
-{
-}
-
-__weak void perf_callchain_user(struct perf_callchain_entry *entry,
-				struct pt_regs *regs)
-{
-}
-
-static struct perf_callchain_entry *perf_callchain(struct pt_regs *regs)
-{
-	struct perf_callchain_entry *entry;
-
-	entry = perf_callchain_buffer();
-	if (!entry)
-		return NULL;
-
-	entry->nr = 0;
-
-	if (!user_mode(regs)) {
-		perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
-		perf_callchain_kernel(entry, regs);
-		if (current->mm)
-			regs = task_pt_regs(current);
-		else
-			regs = NULL;
-	}
-
-	if (regs) {
-		perf_callchain_store(entry, PERF_CONTEXT_USER);
-		perf_callchain_user(entry, regs);
-	}
-
-	return entry;
-}
-
-
 /*
  * We assume there is only KVM supporting the callbacks.
  * Later on, we might change it to a list if there is
@@ -3480,14 +3610,20 @@ static void perf_event_output(struct perf_event *event, int nmi,
 	struct perf_output_handle handle;
 	struct perf_event_header header;
 
+	/* protect the callchain buffers */
+	rcu_read_lock();
+
 	perf_prepare_sample(&header, data, event, regs);
 
 	if (perf_output_begin(&handle, event, header.size, nmi, 1))
-		return;
+		goto exit;
 
 	perf_output_sample(&handle, &header, data, event);
 
 	perf_output_end(&handle);
+
+exit:
+	rcu_read_unlock();
 }
 
 /*
@@ -4243,32 +4379,16 @@ end:
 int perf_swevent_get_recursion_context(void)
 {
 	struct perf_cpu_context *cpuctx = &__get_cpu_var(perf_cpu_context);
-	int rctx;
 
-	if (in_nmi())
-		rctx = 3;
-	else if (in_irq())
-		rctx = 2;
-	else if (in_softirq())
-		rctx = 1;
-	else
-		rctx = 0;
-
-	if (cpuctx->recursion[rctx])
-		return -1;
-
-	cpuctx->recursion[rctx]++;
-	barrier();
-
-	return rctx;
+	return get_recursion_context(cpuctx->recursion);
 }
 EXPORT_SYMBOL_GPL(perf_swevent_get_recursion_context);
 
 void inline perf_swevent_put_recursion_context(int rctx)
 {
 	struct perf_cpu_context *cpuctx = &__get_cpu_var(perf_cpu_context);
-	barrier();
-	cpuctx->recursion[rctx]--;
+
+	put_recursion_context(cpuctx->recursion, rctx);
 }
 
 void __perf_sw_event(u32 event_id, u64 nr, int nmi,
@@ -4968,6 +5088,13 @@ done:
 			atomic_inc(&nr_comm_events);
 		if (event->attr.task)
 			atomic_inc(&nr_task_events);
+		if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN) {
+			err = get_callchain_buffers();
+			if (err) {
+				free_event(event);
+				return ERR_PTR(err);
+			}
+		}
 	}
 
 	return event;
-- 
1.6.2.3


^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2010-08-16 20:49 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-07-01 15:35 [RFC PATCH 0/6] perf: cleanup and fixes Frederic Weisbecker
2010-07-01 15:35 ` [RFC PATCH 1/6] perf: Drop unappropriate tests on arch callchains Frederic Weisbecker
2010-07-01 15:35 ` [RFC PATCH 2/6] perf: Generalize callchain_store() Frederic Weisbecker
2010-07-01 15:35 ` [RFC PATCH 3/6] perf: Generalize some arch callchain code Frederic Weisbecker
2010-07-01 15:46   ` Peter Zijlstra
2010-07-01 15:47     ` Frederic Weisbecker
2010-07-01 15:49     ` Frederic Weisbecker
2010-07-01 15:51       ` Peter Zijlstra
2010-07-01 15:53         ` Frederic Weisbecker
2010-07-01 15:36 ` [RFC PATCH 4/6] perf: Factorize callchain context handling Frederic Weisbecker
2010-07-01 15:36 ` [RFC PATCH 5/6] perf: Fix race in callchains Frederic Weisbecker
2010-07-01 15:42   ` Frederic Weisbecker
2010-07-02 18:07   ` Peter Zijlstra
2010-07-03 20:28     ` Frederic Weisbecker
2010-07-01 15:36 ` [RFC PATCH 6/6] perf: Fix double put_ctx Frederic Weisbecker
2010-08-16 20:48 [RFC PATCH 0/0 v3] callchain fixes and cleanups Frederic Weisbecker
2010-08-16 20:48 ` [RFC PATCH 5/6] perf: Fix race in callchains Frederic Weisbecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).