linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/7] ftrace: updates for tip
@ 2009-02-06 15:21 Steven Rostedt
  2009-02-06 15:21 ` [PATCH v2 1/7] ring-buffer: add NMI protection for spinlocks Steven Rostedt
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Steven Rostedt @ 2009-02-06 15:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
	Mathieu Desnoyers

Ingo,

This is still based on tip/master since it is not all focused
on ftrace (it also contains generic NMI code). I'll let you cherry pick and
figure out what goes where.

I tried to cherry pick them into tracing/ftrace and hit a few conflicts.
Next time, I'll just develop against tip/tracing/ftrace or
tip/tracing/core.

Changes since the first version:

 - Added Mathieu Desnoyers credit in the in_nmi change log.

 - Included the original ring buffer changes that these are based on,
   which I do not see in tip/master.

 - Added a comment fix patch by Wenji.

The following patches are in:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git

    branch: tip/devel


Steven Rostedt (6):
      ring-buffer: add NMI protection for spinlocks
      ring-buffer: allow tracing_off to be used in core kernel code
      ftrace, x86: rename in_nmi variable
      nmi: add generic nmi tracking state
      ftrace: change function graph tracer to use new in_nmi
      ring-buffer: use generic version of in_nmi

Wenji Huang (1):
      trace: trivial fixes in comment typos.

----
 arch/x86/Kconfig            |    1 +
 arch/x86/kernel/ftrace.c    |   35 ++++++++---------------------------
 include/linux/ftrace.h      |    2 +-
 include/linux/ftrace_irq.h  |    2 +-
 include/linux/hardirq.h     |   15 +++++++++++++++
 include/linux/ring_buffer.h |    9 +++++++++
 kernel/trace/Kconfig        |    8 ++++++++
 kernel/trace/ftrace.c       |    6 +++---
 kernel/trace/ring_buffer.c  |   31 +++++++++++++++++++++++++++++--
 kernel/trace/trace.h        |    6 +++---
 10 files changed, 78 insertions(+), 37 deletions(-)
-- 


* [PATCH v2 1/7] ring-buffer: add NMI protection for spinlocks
  2009-02-06 15:21 [PATCH v2 0/7] ftrace: updates for tip Steven Rostedt
@ 2009-02-06 15:21 ` Steven Rostedt
  2009-02-06 19:40   ` Mathieu Desnoyers
  2009-02-06 15:21 ` [PATCH v2 2/7] ring-buffer: allow tracing_off to be used in core kernel code Steven Rostedt
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Steven Rostedt @ 2009-02-06 15:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
	Mathieu Desnoyers, Steven Rostedt

[-- Attachment #1: 0001-ring-buffer-add-NMI-protection-for-spinlocks.patch --]
[-- Type: text/plain, Size: 7067 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Impact: prevent deadlock in NMI

The ring buffers are not yet totally lockless with writing to
the buffer. When a writer crosses a page, it grabs a per cpu spinlock
to protect against a reader. The spinlocks taken by a writer are not
to protect against other writers, since a writer can only write to
its own per cpu buffer. The spinlocks protect against readers that
can touch any cpu buffer. The writers are kept reentrant-safe by
having the spinlocks disable interrupts while they are held.

The problem arises when an NMI writes to the buffer, and that write
crosses a page boundary. If it grabs a spinlock, it can be racing
with another writer (since disabling interrupts does not protect
against NMIs) or with a reader on the same CPU. Luckily, most of the
users are not reentrant and so avoid this issue. But if a user of the
ring buffer becomes reentrant (which is what the ring buffers do
allow) and the NMI also writes to the ring buffer, then we risk
a deadlock.
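
To make the hazard concrete, here is an illustrative sequence (not part
of the patch; the names are the ones used in the code below):

	/*
	 * CPU 0, normal context:
	 *   local_irq_save(flags);
	 *   __raw_spin_lock(&cpu_buffer->lock);   <- lock is now held
	 *     ... NMI fires (disabling interrupts does not mask NMIs) ...
	 * CPU 0, NMI context:
	 *   a write crosses a page boundary
	 *   __raw_spin_lock(&cpu_buffer->lock);   <- spins forever: the
	 *        lock owner cannot run again until the NMI returns
	 */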

This patch moves the ftrace_nmi_enter called by nmi_enter() into the
ring buffer code. The ftrace_nmi_enter currently used by the arch
specific code is renamed to arch_ftrace_nmi_enter, and the Kconfig is
updated to handle it.

When an NMI triggers, it will set a per cpu variable in the ring buffer
code and clear it when the NMI exits. If a write to the ring buffer
crosses a page boundary inside an NMI, a trylock is used on the spin
lock instead. If the spinlock cannot be acquired, the entry
is discarded.

This bug appeared in the ftrace work in the RT tree, where event tracing
is reentrant. This workaround solved the deadlocks that appeared there.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 arch/x86/Kconfig           |    1 +
 arch/x86/kernel/ftrace.c   |    8 +++---
 include/linux/ftrace_irq.h |   10 ++++++++-
 kernel/trace/Kconfig       |    8 +++++++
 kernel/trace/ring_buffer.c |   48 ++++++++++++++++++++++++++++++++++++++++++-
 5 files changed, 68 insertions(+), 7 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4277640..3da7121 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -34,6 +34,7 @@ config X86
 	select HAVE_FUNCTION_TRACER
 	select HAVE_FUNCTION_GRAPH_TRACER
 	select HAVE_FUNCTION_TRACE_MCOUNT_TEST
+	select HAVE_FTRACE_NMI_ENTER if DYNAMIC_FTRACE || FUNCTION_GRAPH_TRACER
 	select HAVE_KVM
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_TRACEHOOK
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 4d33224..4c68358 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -113,7 +113,7 @@ static void ftrace_mod_code(void)
 					     MCOUNT_INSN_SIZE);
 }
 
-void ftrace_nmi_enter(void)
+void arch_ftrace_nmi_enter(void)
 {
 	atomic_inc(&in_nmi);
 	/* Must have in_nmi seen before reading write flag */
@@ -124,7 +124,7 @@ void ftrace_nmi_enter(void)
 	}
 }
 
-void ftrace_nmi_exit(void)
+void arch_ftrace_nmi_exit(void)
 {
 	/* Finish all executions before clearing in_nmi */
 	smp_wmb();
@@ -376,12 +376,12 @@ int ftrace_disable_ftrace_graph_caller(void)
  */
 static atomic_t in_nmi;
 
-void ftrace_nmi_enter(void)
+void arch_ftrace_nmi_enter(void)
 {
 	atomic_inc(&in_nmi);
 }
 
-void ftrace_nmi_exit(void)
+void arch_ftrace_nmi_exit(void)
 {
 	atomic_dec(&in_nmi);
 }
diff --git a/include/linux/ftrace_irq.h b/include/linux/ftrace_irq.h
index 366a054..29de677 100644
--- a/include/linux/ftrace_irq.h
+++ b/include/linux/ftrace_irq.h
@@ -2,7 +2,15 @@
 #define _LINUX_FTRACE_IRQ_H
 
 
-#if defined(CONFIG_DYNAMIC_FTRACE) || defined(CONFIG_FUNCTION_GRAPH_TRACER)
+#ifdef CONFIG_FTRACE_NMI_ENTER
+extern void arch_ftrace_nmi_enter(void);
+extern void arch_ftrace_nmi_exit(void);
+#else
+static inline void arch_ftrace_nmi_enter(void) { }
+static inline void arch_ftrace_nmi_exit(void) { }
+#endif
+
+#ifdef CONFIG_RING_BUFFER
 extern void ftrace_nmi_enter(void);
 extern void ftrace_nmi_exit(void);
 #else
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 28f2644..25131a5 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -9,6 +9,9 @@ config USER_STACKTRACE_SUPPORT
 config NOP_TRACER
 	bool
 
+config HAVE_FTRACE_NMI_ENTER
+	bool
+
 config HAVE_FUNCTION_TRACER
 	bool
 
@@ -37,6 +40,11 @@ config TRACER_MAX_TRACE
 config RING_BUFFER
 	bool
 
+config FTRACE_NMI_ENTER
+       bool
+       depends on HAVE_FTRACE_NMI_ENTER
+       default y
+
 config TRACING
 	bool
 	select DEBUG_FS
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index b36d737..a60a6a8 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -4,6 +4,7 @@
  * Copyright (C) 2008 Steven Rostedt <srostedt@redhat.com>
  */
 #include <linux/ring_buffer.h>
+#include <linux/ftrace_irq.h>
 #include <linux/spinlock.h>
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
@@ -19,6 +20,35 @@
 #include "trace.h"
 
 /*
+ * Since the write to the buffer is still not fully lockless,
+ * we must be careful with NMIs. The locks in the writers
+ * are taken when a write crosses to a new page. The locks
+ * protect against races with the readers (this will soon
+ * be fixed with a lockless solution).
+ *
+ * Because we can not protect against NMIs, and we want to
+ * keep traces reentrant, we need to manage what happens
+ * when we are in an NMI.
+ */
+static DEFINE_PER_CPU(int, rb_in_nmi);
+
+void ftrace_nmi_enter(void)
+{
+	__get_cpu_var(rb_in_nmi)++;
+	/* call arch specific handler too */
+	arch_ftrace_nmi_enter();
+}
+
+void ftrace_nmi_exit(void)
+{
+	arch_ftrace_nmi_exit();
+	__get_cpu_var(rb_in_nmi)--;
+	/* NMIs are not recursive */
+	WARN_ON_ONCE(__get_cpu_var(rb_in_nmi));
+}
+
+
+/*
  * A fast way to enable or disable all ring buffers is to
  * call tracing_on or tracing_off. Turning off the ring buffers
  * prevents all ring buffers from being recorded to.
@@ -982,6 +1012,7 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
 	struct ring_buffer *buffer = cpu_buffer->buffer;
 	struct ring_buffer_event *event;
 	unsigned long flags;
+	bool lock_taken = false;
 
 	commit_page = cpu_buffer->commit_page;
 	/* we just need to protect against interrupts */
@@ -995,7 +1026,19 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
 		struct buffer_page *next_page = tail_page;
 
 		local_irq_save(flags);
-		__raw_spin_lock(&cpu_buffer->lock);
+		/*
+		 * NMIs can happen after we take the lock.
+		 * If we are in an NMI, only take the lock
+		 * if it is not already taken. Otherwise
+		 * simply fail.
+		 */
+		if (unlikely(__get_cpu_var(rb_in_nmi))) {
+			if (!__raw_spin_trylock(&cpu_buffer->lock))
+				goto out_unlock;
+		} else
+			__raw_spin_lock(&cpu_buffer->lock);
+
+		lock_taken = true;
 
 		rb_inc_page(cpu_buffer, &next_page);
 
@@ -1097,7 +1140,8 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
 	if (tail <= BUF_PAGE_SIZE)
 		local_set(&tail_page->write, tail);
 
-	__raw_spin_unlock(&cpu_buffer->lock);
+	if (likely(lock_taken))
+		__raw_spin_unlock(&cpu_buffer->lock);
 	local_irq_restore(flags);
 	return NULL;
 }
-- 
1.5.6.5

-- 


* [PATCH v2 2/7] ring-buffer: allow tracing_off to be used in core kernel code
  2009-02-06 15:21 [PATCH v2 0/7] ftrace: updates for tip Steven Rostedt
  2009-02-06 15:21 ` [PATCH v2 1/7] ring-buffer: add NMI protection for spinlocks Steven Rostedt
@ 2009-02-06 15:21 ` Steven Rostedt
  2009-02-06 15:21 ` [PATCH v2 3/7] ftrace, x86: rename in_nmi variable Steven Rostedt
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2009-02-06 15:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
	Mathieu Desnoyers, Steven Rostedt

[-- Attachment #1: 0002-ring-buffer-allow-tracing_off-to-be-used-in-core-ke.patch --]
[-- Type: text/plain, Size: 1337 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

tracing_off() is the fastest way to stop recording to the ring buffers.
This may be used in places like panic and die, just before
ftrace_dump() is called.

This patch adds the appropriate CPP conditionals to make it a stub
function when the ring buffer is not configured in.
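
For illustration only (hypothetical caller, not part of this patch), a
panic path could use it roughly like this, assuming ftrace_dump() is
available in that context:

	/* sketch: stop recording, then dump what was already recorded */
	tracing_off();
	ftrace_dump();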

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 include/linux/ring_buffer.h |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index b3b3596..ac94c06 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -124,9 +124,18 @@ unsigned long ring_buffer_overrun_cpu(struct ring_buffer *buffer, int cpu);
 u64 ring_buffer_time_stamp(int cpu);
 void ring_buffer_normalize_time_stamp(int cpu, u64 *ts);
 
+/*
+ * The below functions are fine to use outside the tracing facility.
+ */
+#ifdef CONFIG_RING_BUFFER
 void tracing_on(void);
 void tracing_off(void);
 void tracing_off_permanent(void);
+#else
+static inline void tracing_on(void) { }
+static inline void tracing_off(void) { }
+static inline void tracing_off_permanent(void) { }
+#endif
 
 void *ring_buffer_alloc_read_page(struct ring_buffer *buffer);
 void ring_buffer_free_read_page(struct ring_buffer *buffer, void *data);
-- 
1.5.6.5

-- 


* [PATCH v2 3/7] ftrace, x86: rename in_nmi variable
  2009-02-06 15:21 [PATCH v2 0/7] ftrace: updates for tip Steven Rostedt
  2009-02-06 15:21 ` [PATCH v2 1/7] ring-buffer: add NMI protection for spinlocks Steven Rostedt
  2009-02-06 15:21 ` [PATCH v2 2/7] ring-buffer: allow tracing_off to be used in core kernel code Steven Rostedt
@ 2009-02-06 15:21 ` Steven Rostedt
  2009-02-06 15:21 ` [PATCH v2 4/7] nmi: add generic nmi tracking state Steven Rostedt
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2009-02-06 15:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
	Mathieu Desnoyers, Steven Rostedt

[-- Attachment #1: 0003-ftrace-x86-rename-in_nmi-variable.patch --]
[-- Type: text/plain, Size: 2698 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Impact: clean up

The in_nmi variable in x86 arch ftrace.c is a misnomer.
Andrew Morton pointed out that the in_nmi variable is incremented
by all CPUs. It can be set even when another CPU is running an NMI.

Since this is actually intentional, the fix is to rename it to
what it really is: "nmi_running"

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 arch/x86/kernel/ftrace.c |   22 +++++++++++-----------
 1 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 4c68358..e3fad2e 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -82,7 +82,7 @@ static unsigned char *ftrace_call_replace(unsigned long ip, unsigned long addr)
  * are the same as what exists.
  */
 
-static atomic_t in_nmi = ATOMIC_INIT(0);
+static atomic_t nmi_running = ATOMIC_INIT(0);
 static int mod_code_status;		/* holds return value of text write */
 static int mod_code_write;		/* set when NMI should do the write */
 static void *mod_code_ip;		/* holds the IP to write to */
@@ -115,8 +115,8 @@ static void ftrace_mod_code(void)
 
 void arch_ftrace_nmi_enter(void)
 {
-	atomic_inc(&in_nmi);
-	/* Must have in_nmi seen before reading write flag */
+	atomic_inc(&nmi_running);
+	/* Must have nmi_running seen before reading write flag */
 	smp_mb();
 	if (mod_code_write) {
 		ftrace_mod_code();
@@ -126,19 +126,19 @@ void arch_ftrace_nmi_enter(void)
 
 void arch_ftrace_nmi_exit(void)
 {
-	/* Finish all executions before clearing in_nmi */
+	/* Finish all executions before clearing nmi_running */
 	smp_wmb();
-	atomic_dec(&in_nmi);
+	atomic_dec(&nmi_running);
 }
 
 static void wait_for_nmi(void)
 {
-	if (!atomic_read(&in_nmi))
+	if (!atomic_read(&nmi_running))
 		return;
 
 	do {
 		cpu_relax();
-	} while(atomic_read(&in_nmi));
+	} while (atomic_read(&nmi_running));
 
 	nmi_wait_count++;
 }
@@ -374,16 +374,16 @@ int ftrace_disable_ftrace_graph_caller(void)
  * this page for dynamic ftrace. They have been
  * simplified to ignore all traces in NMI context.
  */
-static atomic_t in_nmi;
+static atomic_t nmi_running;
 
 void arch_ftrace_nmi_enter(void)
 {
-	atomic_inc(&in_nmi);
+	atomic_inc(&nmi_running);
 }
 
 void arch_ftrace_nmi_exit(void)
 {
-	atomic_dec(&in_nmi);
+	atomic_dec(&nmi_running);
 }
 
 #endif /* !CONFIG_DYNAMIC_FTRACE */
@@ -475,7 +475,7 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr)
 				&return_to_handler;
 
 	/* Nmi's are currently unsupported */
-	if (unlikely(atomic_read(&in_nmi)))
+	if (unlikely(atomic_read(&nmi_running)))
 		return;
 
 	if (unlikely(atomic_read(&current->tracing_graph_pause)))
-- 
1.5.6.5

-- 


* [PATCH v2 4/7] nmi: add generic nmi tracking state
  2009-02-06 15:21 [PATCH v2 0/7] ftrace: updates for tip Steven Rostedt
                   ` (2 preceding siblings ...)
  2009-02-06 15:21 ` [PATCH v2 3/7] ftrace, x86: rename in_nmi variable Steven Rostedt
@ 2009-02-06 15:21 ` Steven Rostedt
  2009-02-06 15:21 ` [PATCH v2 5/7] ftrace: change function graph tracer to use new in_nmi Steven Rostedt
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2009-02-06 15:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
	Mathieu Desnoyers, Steven Rostedt

[-- Attachment #1: 0004-nmi-add-generic-nmi-tracking-state.patch --]
[-- Type: text/plain, Size: 1890 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

This code adds an in_nmi() macro that uses the current task's preempt
count to track when it is in NMI context. Other parts of the kernel can
use this to determine whether they are currently in NMI context.

This code was inspired by the in_nmi version in the -rt patch, which
was written by Peter Zijlstra, who borrowed that code from
Mathieu Desnoyers.
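
As a usage sketch (the lock and failure path are hypothetical, not part
of this patch), code that must never spin on a lock in NMI context can
now check the generic state:

	if (in_nmi()) {
		/* this CPU is in NMI context: never spin on the lock */
		if (!__raw_spin_trylock(&some_lock))
			goto drop;	/* hypothetical failure path */
	} else
		__raw_spin_lock(&some_lock);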

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 include/linux/hardirq.h |   15 +++++++++++++++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
index f832883..f3cf86e 100644
--- a/include/linux/hardirq.h
+++ b/include/linux/hardirq.h
@@ -61,6 +61,12 @@
 #error PREEMPT_ACTIVE is too low!
 #endif
 
+#define NMI_OFFSET	(PREEMPT_ACTIVE << 1)
+
+#if NMI_OFFSET >= 0x80000000
+#error PREEMPT_ACTIVE too high!
+#endif
+
 #define hardirq_count()	(preempt_count() & HARDIRQ_MASK)
 #define softirq_count()	(preempt_count() & SOFTIRQ_MASK)
 #define irq_count()	(preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK))
@@ -73,6 +79,11 @@
 #define in_softirq()		(softirq_count())
 #define in_interrupt()		(irq_count())
 
+/*
+ * Are we in NMI context?
+ */
+#define in_nmi()	(preempt_count() & NMI_OFFSET)
+
 #if defined(CONFIG_PREEMPT)
 # define PREEMPT_INATOMIC_BASE kernel_locked()
 # define PREEMPT_CHECK_OFFSET 1
@@ -167,6 +178,8 @@ extern void irq_exit(void);
 #define nmi_enter()				\
 	do {					\
 		ftrace_nmi_enter();		\
+		BUG_ON(in_nmi());		\
+		add_preempt_count(NMI_OFFSET);	\
 		lockdep_off();			\
 		rcu_nmi_enter();		\
 		__irq_enter();			\
@@ -177,6 +190,8 @@ extern void irq_exit(void);
 		__irq_exit();			\
 		rcu_nmi_exit();			\
 		lockdep_on();			\
+		BUG_ON(!in_nmi());		\
+		sub_preempt_count(NMI_OFFSET);	\
 		ftrace_nmi_exit();		\
 	} while (0)
 
-- 
1.5.6.5

-- 


* [PATCH v2 5/7] ftrace: change function graph tracer to use new in_nmi
  2009-02-06 15:21 [PATCH v2 0/7] ftrace: updates for tip Steven Rostedt
                   ` (3 preceding siblings ...)
  2009-02-06 15:21 ` [PATCH v2 4/7] nmi: add generic nmi tracking state Steven Rostedt
@ 2009-02-06 15:21 ` Steven Rostedt
  2009-02-06 15:21 ` [PATCH v2 6/7] ring-buffer: use generic version of in_nmi Steven Rostedt
  2009-02-06 15:21 ` [PATCH v2 7/7] trace: trivial fixes in comment typos Steven Rostedt
  6 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2009-02-06 15:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
	Mathieu Desnoyers, Steven Rostedt

[-- Attachment #1: 0005-ftrace-change-function-graph-tracer-to-use-new-in_n.patch --]
[-- Type: text/plain, Size: 2321 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

The function graph tracer piggybacked onto the dynamic ftrace code
to use its custom in_nmi code. The problem was (as Andrew Morton
pointed out) that it really only wanted to bail out if the current
CPU was in NMI context. But the dynamic ftrace custom in_nmi code
was true if _any_ CPU happened to be in NMI context.

Now that we have a generic in_nmi interface, this patch changes
the function graph code to use it instead of the dynamic ftrace
custom code.
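
To make the distinction concrete (illustrative only, not from the patch):

	/*
	 * Old check -- global: true if _any_ CPU is currently in an NMI,
	 * because every CPU's NMI handler increments the same atomic:
	 *
	 *	if (atomic_read(&nmi_running))
	 *		return;
	 *
	 * New check -- per context: true only if the current CPU's
	 * preempt_count() has NMI_OFFSET set, i.e. we ourselves are in
	 * an NMI:
	 *
	 *	if (in_nmi())
	 *		return;
	 */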

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 arch/x86/Kconfig         |    2 +-
 arch/x86/kernel/ftrace.c |   21 +--------------------
 2 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3da7121..0f47315 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -34,7 +34,7 @@ config X86
 	select HAVE_FUNCTION_TRACER
 	select HAVE_FUNCTION_GRAPH_TRACER
 	select HAVE_FUNCTION_TRACE_MCOUNT_TEST
-	select HAVE_FTRACE_NMI_ENTER if DYNAMIC_FTRACE || FUNCTION_GRAPH_TRACER
+	select HAVE_FTRACE_NMI_ENTER if DYNAMIC_FTRACE
 	select HAVE_KVM
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_TRACEHOOK
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index e3fad2e..918073c 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -367,25 +367,6 @@ int ftrace_disable_ftrace_graph_caller(void)
 	return ftrace_mod_jmp(ip, old_offset, new_offset);
 }
 
-#else /* CONFIG_DYNAMIC_FTRACE */
-
-/*
- * These functions are picked from those used on
- * this page for dynamic ftrace. They have been
- * simplified to ignore all traces in NMI context.
- */
-static atomic_t nmi_running;
-
-void arch_ftrace_nmi_enter(void)
-{
-	atomic_inc(&nmi_running);
-}
-
-void arch_ftrace_nmi_exit(void)
-{
-	atomic_dec(&nmi_running);
-}
-
 #endif /* !CONFIG_DYNAMIC_FTRACE */
 
 /* Add a function return address to the trace stack on thread info.*/
@@ -475,7 +456,7 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr)
 				&return_to_handler;
 
 	/* Nmi's are currently unsupported */
-	if (unlikely(atomic_read(&nmi_running)))
+	if (unlikely(in_nmi()))
 		return;
 
 	if (unlikely(atomic_read(&current->tracing_graph_pause)))
-- 
1.5.6.5

-- 


* [PATCH v2 6/7] ring-buffer: use generic version of in_nmi
  2009-02-06 15:21 [PATCH v2 0/7] ftrace: updates for tip Steven Rostedt
                   ` (4 preceding siblings ...)
  2009-02-06 15:21 ` [PATCH v2 5/7] ftrace: change function graph tracer to use new in_nmi Steven Rostedt
@ 2009-02-06 15:21 ` Steven Rostedt
  2009-02-06 15:21 ` [PATCH v2 7/7] trace: trivial fixes in comment typos Steven Rostedt
  6 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2009-02-06 15:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
	Mathieu Desnoyers, Steven Rostedt

[-- Attachment #1: 0006-ring-buffer-use-generic-version-of-in_nmi.patch --]
[-- Type: text/plain, Size: 4035 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Impact: clean up

Now that a generic in_nmi is available, this patch removes the
special code in the ring_buffer and uses the generic in_nmi()
version instead.

With this change, I was also able to rename the "arch_ftrace_nmi_enter"
back to "ftrace_nmi_enter" and remove the code from the ring buffer.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 arch/x86/kernel/ftrace.c   |    4 ++--
 include/linux/ftrace_irq.h |    8 --------
 kernel/trace/ring_buffer.c |   43 +++++++++++++------------------------------
 3 files changed, 15 insertions(+), 40 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 918073c..d74d75e 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -113,7 +113,7 @@ static void ftrace_mod_code(void)
 					     MCOUNT_INSN_SIZE);
 }
 
-void arch_ftrace_nmi_enter(void)
+void ftrace_nmi_enter(void)
 {
 	atomic_inc(&nmi_running);
 	/* Must have nmi_running seen before reading write flag */
@@ -124,7 +124,7 @@ void arch_ftrace_nmi_enter(void)
 	}
 }
 
-void arch_ftrace_nmi_exit(void)
+void ftrace_nmi_exit(void)
 {
 	/* Finish all executions before clearing nmi_running */
 	smp_wmb();
diff --git a/include/linux/ftrace_irq.h b/include/linux/ftrace_irq.h
index 29de677..dca7bf8 100644
--- a/include/linux/ftrace_irq.h
+++ b/include/linux/ftrace_irq.h
@@ -3,14 +3,6 @@
 
 
 #ifdef CONFIG_FTRACE_NMI_ENTER
-extern void arch_ftrace_nmi_enter(void);
-extern void arch_ftrace_nmi_exit(void);
-#else
-static inline void arch_ftrace_nmi_enter(void) { }
-static inline void arch_ftrace_nmi_exit(void) { }
-#endif
-
-#ifdef CONFIG_RING_BUFFER
 extern void ftrace_nmi_enter(void);
 extern void ftrace_nmi_exit(void);
 #else
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index a60a6a8..5ee3444 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -8,6 +8,7 @@
 #include <linux/spinlock.h>
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
+#include <linux/hardirq.h>
 #include <linux/module.h>
 #include <linux/percpu.h>
 #include <linux/mutex.h>
@@ -20,35 +21,6 @@
 #include "trace.h"
 
 /*
- * Since the write to the buffer is still not fully lockless,
- * we must be careful with NMIs. The locks in the writers
- * are taken when a write crosses to a new page. The locks
- * protect against races with the readers (this will soon
- * be fixed with a lockless solution).
- *
- * Because we can not protect against NMIs, and we want to
- * keep traces reentrant, we need to manage what happens
- * when we are in an NMI.
- */
-static DEFINE_PER_CPU(int, rb_in_nmi);
-
-void ftrace_nmi_enter(void)
-{
-	__get_cpu_var(rb_in_nmi)++;
-	/* call arch specific handler too */
-	arch_ftrace_nmi_enter();
-}
-
-void ftrace_nmi_exit(void)
-{
-	arch_ftrace_nmi_exit();
-	__get_cpu_var(rb_in_nmi)--;
-	/* NMIs are not recursive */
-	WARN_ON_ONCE(__get_cpu_var(rb_in_nmi));
-}
-
-
-/*
  * A fast way to enable or disable all ring buffers is to
  * call tracing_on or tracing_off. Turning off the ring buffers
  * prevents all ring buffers from being recorded to.
@@ -1027,12 +999,23 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
 
 		local_irq_save(flags);
 		/*
+		 * Since the write to the buffer is still not
+		 * fully lockless, we must be careful with NMIs.
+		 * The locks in the writers are taken when a write
+		 * crosses to a new page. The locks protect against
+		 * races with the readers (this will soon be fixed
+		 * with a lockless solution).
+		 *
+		 * Because we can not protect against NMIs, and we
+		 * want to keep traces reentrant, we need to manage
+		 * what happens when we are in an NMI.
+		 *
 		 * NMIs can happen after we take the lock.
 		 * If we are in an NMI, only take the lock
 		 * if it is not already taken. Otherwise
 		 * simply fail.
 		 */
-		if (unlikely(__get_cpu_var(rb_in_nmi))) {
+		if (unlikely(in_nmi())) {
 			if (!__raw_spin_trylock(&cpu_buffer->lock))
 				goto out_unlock;
 		} else
-- 
1.5.6.5

-- 


* [PATCH v2 7/7] trace: trivial fixes in comment typos.
  2009-02-06 15:21 [PATCH v2 0/7] ftrace: updates for tip Steven Rostedt
                   ` (5 preceding siblings ...)
  2009-02-06 15:21 ` [PATCH v2 6/7] ring-buffer: use generic version of in_nmi Steven Rostedt
@ 2009-02-06 15:21 ` Steven Rostedt
  6 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2009-02-06 15:21 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Frederic Weisbecker,
	Mathieu Desnoyers, Wenji Huang, Steven Rostedt

[-- Attachment #1: 0007-trace-trivial-fixes-in-comment-typos.patch --]
[-- Type: text/plain, Size: 3303 bytes --]

From: Wenji Huang <wenji.huang@oracle.com>

Impact: clean up

Fixed several typos in the comments.

Signed-off-by: Wenji Huang <wenji.huang@oracle.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 include/linux/ftrace.h |    2 +-
 kernel/trace/ftrace.c  |    6 +++---
 kernel/trace/trace.h   |    6 +++---
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 7840e71..5e302d6 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -140,7 +140,7 @@ static inline int ftrace_disable_ftrace_graph_caller(void) { return 0; }
 #endif
 
 /**
- * ftrace_make_nop - convert code into top
+ * ftrace_make_nop - convert code into nop
  * @mod: module structure if called by module load initialization
  * @rec: the mcount call site record
  * @addr: the address that the call site should be calling
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 6861003..1796e01 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -465,7 +465,7 @@ __ftrace_replace_code(struct dyn_ftrace *rec, int enable)
 	 * it is not enabled then do nothing.
 	 *
 	 * If this record is not to be traced and
-	 * it is enabled then disabled it.
+	 * it is enabled then disable it.
 	 *
 	 */
 	if (rec->flags & FTRACE_FL_NOTRACE) {
@@ -485,7 +485,7 @@ __ftrace_replace_code(struct dyn_ftrace *rec, int enable)
 		if (fl == (FTRACE_FL_FILTER | FTRACE_FL_ENABLED))
 			return 0;
 
-		/* Record is not filtered and is not enabled do nothing */
+		/* Record is not filtered or enabled, do nothing */
 		if (!fl)
 			return 0;
 
@@ -507,7 +507,7 @@ __ftrace_replace_code(struct dyn_ftrace *rec, int enable)
 
 		} else {
 
-			/* if record is not enabled do nothing */
+			/* if record is not enabled, do nothing */
 			if (!(rec->flags & FTRACE_FL_ENABLED))
 				return 0;
 
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index df627a9..5f2f170 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -613,12 +613,12 @@ extern struct tracer nop_trace;
  * preempt_enable (after a disable), a schedule might take place
  * causing an infinite recursion.
  *
- * To prevent this, we read the need_recshed flag before
+ * To prevent this, we read the need_resched flag before
  * disabling preemption. When we want to enable preemption we
  * check the flag, if it is set, then we call preempt_enable_no_resched.
  * Otherwise, we call preempt_enable.
  *
- * The rational for doing the above is that if need resched is set
+ * The rational for doing the above is that if need_resched is set
  * and we have yet to reschedule, we are either in an atomic location
  * (where we do not need to check for scheduling) or we are inside
  * the scheduler and do not want to resched.
@@ -639,7 +639,7 @@ static inline int ftrace_preempt_disable(void)
  *
  * This is a scheduler safe way to enable preemption and not miss
  * any preemption checks. The disabled saved the state of preemption.
- * If resched is set, then we were either inside an atomic or
+ * If resched is set, then we are either inside an atomic or
  * are inside the scheduler (we would have already scheduled
  * otherwise). In this case, we do not want to call normal
  * preempt_enable, but preempt_enable_no_resched instead.
-- 
1.5.6.5

-- 


* Re: [PATCH v2 1/7] ring-buffer: add NMI protection for spinlocks
  2009-02-06 15:21 ` [PATCH v2 1/7] ring-buffer: add NMI protection for spinlocks Steven Rostedt
@ 2009-02-06 19:40   ` Mathieu Desnoyers
  2009-02-06 19:49     ` Steven Rostedt
  0 siblings, 1 reply; 10+ messages in thread
From: Mathieu Desnoyers @ 2009-02-06 19:40 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Peter Zijlstra,
	Frederic Weisbecker, Steven Rostedt

* Steven Rostedt (rostedt@goodmis.org) wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> Impact: prevent deadlock in NMI
> 
> The ring buffers are not yet totally lockless with writing to
> the buffer. When a writer crosses a page, it grabs a per cpu spinlock
> to protect against a reader. The spinlocks taken by a writer are not
> to protect against other writers, since a writer can only write to
> its own per cpu buffer. The spinlocks protect against readers that
> can touch any cpu buffer. The writers are made to be reentrant
> with the spinlocks disabling interrupts.
> 
> The problem arises when an NMI writes to the buffer, and that write
> crosses a page boundary. If it grabs a spinlock, it can be racing
> with another writer (since disabling interrupts does not protect
> against NMIs) or with a reader on the same CPU. Luckily, most of the
> users are not reentrant and protects against this issue. But if a
> user of the ring buffer becomes reentrant (which is what the ring
> buffers do allow), if the NMI also writes to the ring buffer then
> we risk the chance of a deadlock.
> 
> This patch moves the ftrace_nmi_enter called by nmi_enter() to the
> ring buffer code. It replaces the current ftrace_nmi_enter that is
> used by arch specific code to arch_ftrace_nmi_enter and updates
> the Kconfig to handle it.
> 
> When an NMI is called, it will set a per cpu variable in the ring buffer
> code and will clear it when the NMI exits. If a write to the ring buffer
> crosses page boundaries inside an NMI, a trylock is used on the spin
> lock instead. If the spinlock fails to be acquired, then the entry
> is discarded.
> 
> This bug appeared in the ftrace work in the RT tree, where event tracing
> is reentrant. This workaround solved the deadlocks that appeared there.
> 
> Signed-off-by: Steven Rostedt <srostedt@redhat.com>
> ---
>  arch/x86/Kconfig           |    1 +
>  arch/x86/kernel/ftrace.c   |    8 +++---
>  include/linux/ftrace_irq.h |   10 ++++++++-
>  kernel/trace/Kconfig       |    8 +++++++
>  kernel/trace/ring_buffer.c |   48 ++++++++++++++++++++++++++++++++++++++++++-
>  5 files changed, 68 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 4277640..3da7121 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -34,6 +34,7 @@ config X86
>  	select HAVE_FUNCTION_TRACER
>  	select HAVE_FUNCTION_GRAPH_TRACER
>  	select HAVE_FUNCTION_TRACE_MCOUNT_TEST
> +	select HAVE_FTRACE_NMI_ENTER if DYNAMIC_FTRACE || FUNCTION_GRAPH_TRACER
>  	select HAVE_KVM
>  	select HAVE_ARCH_KGDB
>  	select HAVE_ARCH_TRACEHOOK
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index 4d33224..4c68358 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -113,7 +113,7 @@ static void ftrace_mod_code(void)
>  					     MCOUNT_INSN_SIZE);
>  }
>  
> -void ftrace_nmi_enter(void)
> +void arch_ftrace_nmi_enter(void)
>  {
>  	atomic_inc(&in_nmi);
>  	/* Must have in_nmi seen before reading write flag */
> @@ -124,7 +124,7 @@ void ftrace_nmi_enter(void)
>  	}
>  }
>  
> -void ftrace_nmi_exit(void)
> +void arch_ftrace_nmi_exit(void)
>  {
>  	/* Finish all executions before clearing in_nmi */
>  	smp_wmb();
> @@ -376,12 +376,12 @@ int ftrace_disable_ftrace_graph_caller(void)
>   */
>  static atomic_t in_nmi;
>  
> -void ftrace_nmi_enter(void)
> +void arch_ftrace_nmi_enter(void)
>  {
>  	atomic_inc(&in_nmi);
>  }
>  
> -void ftrace_nmi_exit(void)
> +void arch_ftrace_nmi_exit(void)
>  {
>  	atomic_dec(&in_nmi);
>  }
> diff --git a/include/linux/ftrace_irq.h b/include/linux/ftrace_irq.h
> index 366a054..29de677 100644
> --- a/include/linux/ftrace_irq.h
> +++ b/include/linux/ftrace_irq.h
> @@ -2,7 +2,15 @@
>  #define _LINUX_FTRACE_IRQ_H
>  
>  
> -#if defined(CONFIG_DYNAMIC_FTRACE) || defined(CONFIG_FUNCTION_GRAPH_TRACER)
> +#ifdef CONFIG_FTRACE_NMI_ENTER
> +extern void arch_ftrace_nmi_enter(void);
> +extern void arch_ftrace_nmi_exit(void);
> +#else
> +static inline void arch_ftrace_nmi_enter(void) { }
> +static inline void arch_ftrace_nmi_exit(void) { }
> +#endif
> +
> +#ifdef CONFIG_RING_BUFFER
>  extern void ftrace_nmi_enter(void);
>  extern void ftrace_nmi_exit(void);
>  #else
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index 28f2644..25131a5 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -9,6 +9,9 @@ config USER_STACKTRACE_SUPPORT
>  config NOP_TRACER
>  	bool
>  
> +config HAVE_FTRACE_NMI_ENTER
> +	bool
> +
>  config HAVE_FUNCTION_TRACER
>  	bool
>  
> @@ -37,6 +40,11 @@ config TRACER_MAX_TRACE
>  config RING_BUFFER
>  	bool
>  
> +config FTRACE_NMI_ENTER
> +       bool
> +       depends on HAVE_FTRACE_NMI_ENTER
> +       default y
> +
>  config TRACING
>  	bool
>  	select DEBUG_FS
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index b36d737..a60a6a8 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -4,6 +4,7 @@
>   * Copyright (C) 2008 Steven Rostedt <srostedt@redhat.com>
>   */
>  #include <linux/ring_buffer.h>
> +#include <linux/ftrace_irq.h>
>  #include <linux/spinlock.h>
>  #include <linux/debugfs.h>
>  #include <linux/uaccess.h>
> @@ -19,6 +20,35 @@
>  #include "trace.h"
>  
>  /*
> + * Since the write to the buffer is still not fully lockless,
> + * we must be careful with NMIs. The locks in the writers
> + * are taken when a write crosses to a new page. The locks
> + * protect against races with the readers (this will soon
> + * be fixed with a lockless solution).
> + *
> + * Because we can not protect against NMIs, and we want to
> + * keep traces reentrant, we need to manage what happens
> + * when we are in an NMI.
> + */
> +static DEFINE_PER_CPU(int, rb_in_nmi);
> +
> +void ftrace_nmi_enter(void)
> +{
> +	__get_cpu_var(rb_in_nmi)++;
> +	/* call arch specific handler too */
> +	arch_ftrace_nmi_enter();
> +}
> +
> +void ftrace_nmi_exit(void)
> +{
> +	arch_ftrace_nmi_exit();
> +	__get_cpu_var(rb_in_nmi)--;
> +	/* NMIs are not recursive */
> +	WARN_ON_ONCE(__get_cpu_var(rb_in_nmi));
> +}
> +
> +
> +/*
>   * A fast way to enable or disable all ring buffers is to
>   * call tracing_on or tracing_off. Turning off the ring buffers
>   * prevents all ring buffers from being recorded to.
> @@ -982,6 +1012,7 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
>  	struct ring_buffer *buffer = cpu_buffer->buffer;
>  	struct ring_buffer_event *event;
>  	unsigned long flags;
> +	bool lock_taken = false;
>  
>  	commit_page = cpu_buffer->commit_page;
>  	/* we just need to protect against interrupts */
> @@ -995,7 +1026,19 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
>  		struct buffer_page *next_page = tail_page;
>  
>  		local_irq_save(flags);
> -		__raw_spin_lock(&cpu_buffer->lock);
> +		/*
> +		 * NMIs can happen after we take the lock.
> +		 * If we are in an NMI, only take the lock
> +		 * if it is not already taken. Otherwise
> +		 * simply fail.
> +		 */
> +		if (unlikely(__get_cpu_var(rb_in_nmi))) {
> +			if (!__raw_spin_trylock(&cpu_buffer->lock))
> +				goto out_unlock;

Hi Steven,

You're not silently discarding an event, are you? ;)

Please keep at least a count of "events_lost" so it can be reported
later by a printk().

Mathieu


> +		} else
> +			__raw_spin_lock(&cpu_buffer->lock);
> +
> +		lock_taken = true;
>  
>  		rb_inc_page(cpu_buffer, &next_page);
>  
> @@ -1097,7 +1140,8 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
>  	if (tail <= BUF_PAGE_SIZE)
>  		local_set(&tail_page->write, tail);
>  
> -	__raw_spin_unlock(&cpu_buffer->lock);
> +	if (likely(lock_taken))
> +		__raw_spin_unlock(&cpu_buffer->lock);
>  	local_irq_restore(flags);
>  	return NULL;
>  }
> -- 
> 1.5.6.5
> 
> -- 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68


* Re: [PATCH v2 1/7] ring-buffer: add NMI protection for spinlocks
  2009-02-06 19:40   ` Mathieu Desnoyers
@ 2009-02-06 19:49     ` Steven Rostedt
  0 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2009-02-06 19:49 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Peter Zijlstra,
	Frederic Weisbecker, Steven Rostedt


On Fri, 6 Feb 2009, Mathieu Desnoyers wrote:
> >  
> >  	commit_page = cpu_buffer->commit_page;
> >  	/* we just need to protect against interrupts */
> > @@ -995,7 +1026,19 @@ __rb_reserve_next(struct ring_buffer_per_cpu *cpu_buffer,
> >  		struct buffer_page *next_page = tail_page;
> >  
> >  		local_irq_save(flags);
> > -		__raw_spin_lock(&cpu_buffer->lock);
> > +		/*
> > +		 * NMIs can happen after we take the lock.
> > +		 * If we are in an NMI, only take the lock
> > +		 * if it is not already taken. Otherwise
> > +		 * simply fail.
> > +		 */
> > +		if (unlikely(__get_cpu_var(rb_in_nmi))) {
> > +			if (!__raw_spin_trylock(&cpu_buffer->lock))
> > +				goto out_unlock;
> 
> Hi Steven,
> 
> You're not silently discarding an event, aren't you ? ;)
> 
> Please keep at least a count of "events_lost" so it can be reported
> later by a printk().

I could add a counter here, but it is not a silent discard. The tracer 
would receive a "NULL" pointer in this case. Then the tracer can do what 
it wants with this information. The ring buffer will never discard data 
(otherwise it is a bug) without letting the tracer know that it did so.
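
For example, a tracer could account for the drop itself (sketch only;
the entry and counter names are made up, the reserve call reflects the
API as of this thread):

	event = ring_buffer_lock_reserve(buffer, sizeof(*entry), &irq_flags);
	if (!event) {
		/* write was dropped, e.g. the NMI-path trylock failed */
		dropped_events++;	/* hypothetical per-tracer counter */
		return;
	}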

-- Steve



end of thread (newest message: 2009-02-06 19:49 UTC)

Thread overview: 10+ messages
2009-02-06 15:21 [PATCH v2 0/7] ftrace: updates for tip Steven Rostedt
2009-02-06 15:21 ` [PATCH v2 1/7] ring-buffer: add NMI protection for spinlocks Steven Rostedt
2009-02-06 19:40   ` Mathieu Desnoyers
2009-02-06 19:49     ` Steven Rostedt
2009-02-06 15:21 ` [PATCH v2 2/7] ring-buffer: allow tracing_off to be used in core kernel code Steven Rostedt
2009-02-06 15:21 ` [PATCH v2 3/7] ftrace, x86: rename in_nmi variable Steven Rostedt
2009-02-06 15:21 ` [PATCH v2 4/7] nmi: add generic nmi tracking state Steven Rostedt
2009-02-06 15:21 ` [PATCH v2 5/7] ftrace: change function graph tracer to use new in_nmi Steven Rostedt
2009-02-06 15:21 ` [PATCH v2 6/7] ring-buffer: use generic version of in_nmi Steven Rostedt
2009-02-06 15:21 ` [PATCH v2 7/7] trace: trivial fixes in comment typos Steven Rostedt
