* [PATCH 00/12 -next] [for linux-next] ftrace: Recursion protection and speed up
@ 2013-01-23 20:30 Steven Rostedt
From: Steven Rostedt @ 2013-01-23 20:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker

This change set replaces the function tracing recursion protection
with a per-context scheme that tracks normal, softirq, irq and NMI
contexts separately. With it we no longer need to disable
interrupts, and we no longer miss any events; the old recursion
protection had a race where interrupt events could be dropped.

This also speeds up function tracing by 15% or more.
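
(For reference: taking the hackbench averages quoted in patches 10
and 11, 11.9512s before and 9.876s after, the combined reduction is
(11.9512 - 9.876) / 11.9512, or about 17.4%.)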

I'm posting these for a general review of the changes I'll be
pushing into linux-next, with the intention of pushing them for 3.9.

These patches have passed a series of tests I've run on a few
machines with several different configs.

You can find the patches here. I'll add these to linux-next tomorrow.

-- Steve

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
tip/perf/core

Head SHA1: 0b07436d95b5404134da4d661fd183eac863513e


Steven Rostedt (12):
      tracing: Remove trace.h header from trace_clock.c
      tracing: Fix race with max_tr and changing tracers
      tracing: Fix selftest function recursion accounting
      ftrace: Fix global function tracers that are not recursion safe
      ftrace: Fix function tracing recursion self test
      ftrace: Optimize the function tracer list loop
      ftrace: Add context level recursion bit checking
      tracing: Make the trace recursion bits into enums
      tracing: Avoid unnecessary multiple recursion checks
      ftrace: Use only the preempt version of function tracing
      ring-buffer: User context bit recursion checking
      ring-buffer: Remove trace.h from ring_buffer.c

----
 kernel/trace/ftrace.c          |   82 ++++++++++++++++---------
 kernel/trace/ring_buffer.c     |   88 ++++++++++++++++++--------
 kernel/trace/trace.c           |   29 ++++++---
 kernel/trace/trace.h           |  133 ++++++++++++++++++++++++++++++++++++----
 kernel/trace/trace_clock.c     |    2 -
 kernel/trace/trace_functions.c |   61 +++++-------------
 kernel/trace/trace_selftest.c  |   19 ++----
 7 files changed, 275 insertions(+), 139 deletions(-)


* [PATCH 01/12 -next] tracing: Remove trace.h header from trace_clock.c
From: Steven Rostedt @ 2013-01-23 20:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker


From: Steven Rostedt <srostedt@redhat.com>

As trace_clock is used by other things besides tracing, and it
does not require anything from trace.h, it is best not to include
the header file in trace_clock.c.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_clock.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/trace/trace_clock.c b/kernel/trace/trace_clock.c
index 3947835..22b638b 100644
--- a/kernel/trace/trace_clock.c
+++ b/kernel/trace/trace_clock.c
@@ -21,8 +21,6 @@
 #include <linux/ktime.h>
 #include <linux/trace_clock.h>
 
-#include "trace.h"
-
 /*
  * trace_clock_local(): the simplest and least coherent tracing clock.
  *
-- 
1.7.10.4




* [PATCH 02/12 -next] tracing: Fix race with max_tr and changing tracers
From: Steven Rostedt @ 2013-01-23 20:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker


From: Steven Rostedt <srostedt@redhat.com>

There's a race condition between the setting of a new tracer and
the update of the max trace buffers (the swap). When a new tracer
is added, it sets current_trace to nop_trace before disabling
the old tracer. At this moment, if the old tracer uses update_max_tr(),
the update may trigger the warning against !current_trace->use_max_tr,
as nop_trace doesn't have that set.

As update_max_tr() requires that interrupts be disabled, we can
add a check to see if current_trace == nop_trace and bail out if
so. Then, when disabling the current_trace, set it to nop_trace
and run synchronize_sched(). This will make sure all calls to
update_max_tr() have completed (it was called with interrupts disabled).

As a cleanup, this commit also removes the shrinking and recreating
of the max_tr buffer when the old and new tracers both have use_max_tr
set. The old way used to always shrink the buffer and then expand it
for the next tracer, which was a waste of time.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace.c |   29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index a387bd2..d2a6583 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -709,10 +709,14 @@ update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
 		return;
 
 	WARN_ON_ONCE(!irqs_disabled());
-	if (!current_trace->use_max_tr) {
-		WARN_ON_ONCE(1);
+
+	/* If we disabled the tracer, stop now */
+	if (current_trace == &nop_trace)
 		return;
-	}
+
+	if (WARN_ON_ONCE(!current_trace->use_max_tr))
+		return;
+
 	arch_spin_lock(&ftrace_max_lock);
 
 	tr->buffer = max_tr.buffer;
@@ -3185,6 +3189,7 @@ static int tracing_set_tracer(const char *buf)
 	static struct trace_option_dentry *topts;
 	struct trace_array *tr = &global_trace;
 	struct tracer *t;
+	bool had_max_tr;
 	int ret = 0;
 
 	mutex_lock(&trace_types_lock);
@@ -3211,7 +3216,19 @@ static int tracing_set_tracer(const char *buf)
 	trace_branch_disable();
 	if (current_trace && current_trace->reset)
 		current_trace->reset(tr);
-	if (current_trace && current_trace->use_max_tr) {
+
+	had_max_tr = current_trace && current_trace->use_max_tr;
+	current_trace = &nop_trace;
+
+	if (had_max_tr && !t->use_max_tr) {
+		/*
+		 * We need to make sure that the update_max_tr sees that
+		 * current_trace changed to nop_trace to keep it from
+		 * swapping the buffers after we resize it.
+		 * The update_max_tr is called from interrupts disabled
+		 * so a synchronized_sched() is sufficient.
+		 */
+		synchronize_sched();
 		/*
 		 * We don't free the ring buffer. instead, resize it because
 		 * The max_tr ring buffer has some state (e.g. ring->clock) and
@@ -3222,10 +3239,8 @@ static int tracing_set_tracer(const char *buf)
 	}
 	destroy_trace_option_files(topts);
 
-	current_trace = &nop_trace;
-
 	topts = create_trace_option_files(t);
-	if (t->use_max_tr) {
+	if (t->use_max_tr && !had_max_tr) {
 		/* we need to make per cpu buffer sizes equivalent */
 		ret = resize_buffer_duplicate_size(&max_tr, &global_trace,
 						   RING_BUFFER_ALL_CPUS);
-- 
1.7.10.4




* [PATCH 03/12 -next] tracing: Fix selftest function recursion accounting
From: Steven Rostedt @ 2013-01-23 20:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker


From: Steven Rostedt <srostedt@redhat.com>

The test that checks function recursion does things differently
if the arch does not support all ftrace features. But that really
doesn't make a difference with how the test runs, and either way
the count variable should be 2 at the end.

Currently the test wrongly fails for archs that don't support all
the ftrace features.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_selftest.c |   16 +++-------------
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index 6c62d58..adb008a 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -452,7 +452,6 @@ trace_selftest_function_recursion(void)
 	char *func_name;
 	int len;
 	int ret;
-	int cnt;
 
 	/* The previous test PASSED */
 	pr_cont("PASSED\n");
@@ -510,19 +509,10 @@ trace_selftest_function_recursion(void)
 
 	unregister_ftrace_function(&test_recsafe_probe);
 
-	/*
-	 * If arch supports all ftrace features, and no other task
-	 * was on the list, we should be fine.
-	 */
-	if (!ftrace_nr_registered_ops() && !FTRACE_FORCE_LIST_FUNC)
-		cnt = 2; /* Should have recursed */
-	else
-		cnt = 1;
-
 	ret = -1;
-	if (trace_selftest_recursion_cnt != cnt) {
-		pr_cont("*callback not called expected %d times (%d)* ",
-			cnt, trace_selftest_recursion_cnt);
+	if (trace_selftest_recursion_cnt != 2) {
+		pr_cont("*callback not called expected 2 times (%d)* ",
+			trace_selftest_recursion_cnt);
 		goto out;
 	}
 
-- 
1.7.10.4




* [PATCH 04/12 -next] ftrace: Fix global function tracers that are not recursion safe
From: Steven Rostedt @ 2013-01-23 20:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker


From: Steven Rostedt <srostedt@redhat.com>

If one of the function tracers set by the global ops is not recursion
safe, it can still be called directly without the added recursion
protection supplied by the ftrace infrastructure.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c |   18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 6e34dc1..789cbec 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -221,10 +221,24 @@ static void update_global_ops(void)
 	 * registered callers.
 	 */
 	if (ftrace_global_list == &ftrace_list_end ||
-	    ftrace_global_list->next == &ftrace_list_end)
+	    ftrace_global_list->next == &ftrace_list_end) {
 		func = ftrace_global_list->func;
-	else
+		/*
+		 * As we are calling the function directly.
+		 * If it does not have recursion protection,
+		 * the function_trace_op needs to be updated
+		 * accordingly.
+		 */
+		if (ftrace_global_list->flags & FTRACE_OPS_FL_RECURSION_SAFE)
+			global_ops.flags |= FTRACE_OPS_FL_RECURSION_SAFE;
+		else
+			global_ops.flags &= ~FTRACE_OPS_FL_RECURSION_SAFE;
+	} else {
 		func = ftrace_global_list_func;
+		/* The list has its own recursion protection. */
+		global_ops.flags |= FTRACE_OPS_FL_RECURSION_SAFE;
+	}
+
 
 	/* If we filter on pids, update to use the pid function */
 	if (!list_empty(&ftrace_pids)) {
-- 
1.7.10.4




* [PATCH 05/12 -next] ftrace: Fix function tracing recursion self test
From: Steven Rostedt @ 2013-01-23 20:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker


From: Steven Rostedt <srostedt@redhat.com>

The function tracing recursion self test should not crash
the machine if the recursion test fails. If it detects that
function tracing is recursing when it should not be, then
bail out instead of going into an infinite recursive loop.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_selftest.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index adb008a..51c819c 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -415,7 +415,8 @@ static void trace_selftest_test_recursion_func(unsigned long ip,
 	 * The ftrace infrastructure should provide the recursion
 	 * protection. If not, this will crash the kernel!
 	 */
-	trace_selftest_recursion_cnt++;
+	if (trace_selftest_recursion_cnt++ > 10)
+		return;
 	DYN_FTRACE_TEST_NAME();
 }
 
-- 
1.7.10.4




* [PATCH 06/12 -next] ftrace: Optimize the function tracer list loop
From: Steven Rostedt @ 2013-01-23 20:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker


From: Steven Rostedt <srostedt@redhat.com>

There are lots of places that perform:

       op = rcu_dereference_raw(ftrace_control_list);
       while (op != &ftrace_list_end) {

Add a helper macro to do this, and also optimize for a single
entity. That is, gcc will optimize a loop for either no iterations
or more than one iteration. But usually only a single callback
is registered to the function tracer, thus the optimized case
should be a single pass. To do this we now do:

	op = rcu_dereference_raw(list);
	do {
		[...]
	} while (likely(op = rcu_dereference_raw((op)->next)) &&
	       unlikely((op) != &ftrace_list_end));

An op is always registered (ftrace_list_end when no callback is
registered), thus when a single callback is registered, the linked
list looks like:

 top => callback => ftrace_list_end => NULL.

The likely(op = op->next) still must be performed due to the race
of removing the callback, where the first op assignment could
equal ftrace_list_end. In that case, the op->next would be NULL.
But this is unlikely (only happens in a race condition when
removing the callback).

But it is very likely that the next op would be ftrace_list_end,
unless more than one callback has been registered. This tells
gcc what the most common case is, and keeps the fast path down to
the fewest possible branches.
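
For illustration only (this is not from the patch), here is a minimal
user-space sketch of the same loop shape. The names are made up, a
dummy sentinel stands in for ftrace_list_end, and rcu_dereference_raw()
plus the likely()/unlikely() hints are dropped since they are kernel
primitives:

#include <stdio.h>

struct op {
	struct op *next;
	void (*func)(const char *name);
	const char *name;
};

static void stub(const char *name) { (void)name; }	/* like ftrace_stub */
static void callback(const char *name) { printf("called %s\n", name); }

/* Always-present end marker, never removed from the list. */
static struct op list_end = { .next = NULL, .func = stub, .name = "list_end" };

/* One registered callback: head -> cb -> list_end -> NULL */
static struct op cb = { .next = &list_end, .func = callback, .name = "cb" };
static struct op *head = &cb;

int main(void)
{
	struct op *op = head;

	/*
	 * Run the body at least once, then stop when we either fall off
	 * the list (op becomes NULL, the removal race) or reach the
	 * sentinel.  With a single registered callback this is exactly
	 * one pass, which is the case the kernel macro's likely() and
	 * unlikely() hints tell gcc to lay out as the fast path.
	 */
	do {
		op->func(op->name);
	} while ((op = op->next) && op != &list_end);

	return 0;
}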

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c |   48 ++++++++++++++++++++++++++----------------------
 1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 789cbec..1330969 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -111,6 +111,26 @@ static void ftrace_ops_no_ops(unsigned long ip, unsigned long parent_ip);
 #define ftrace_ops_list_func ((ftrace_func_t)ftrace_ops_no_ops)
 #endif
 
+/*
+ * Traverse the ftrace_global_list, invoking all entries.  The reason that we
+ * can use rcu_dereference_raw() is that elements removed from this list
+ * are simply leaked, so there is no need to interact with a grace-period
+ * mechanism.  The rcu_dereference_raw() calls are needed to handle
+ * concurrent insertions into the ftrace_global_list.
+ *
+ * Silly Alpha and silly pointer-speculation compiler optimizations!
+ */
+#define do_for_each_ftrace_op(op, list)			\
+	op = rcu_dereference_raw(list);			\
+	do
+
+/*
+ * Optimized for just a single item in the list (as that is the normal case).
+ */
+#define while_for_each_ftrace_op(op)				\
+	while (likely(op = rcu_dereference_raw((op)->next)) &&	\
+	       unlikely((op) != &ftrace_list_end))
+
 /**
  * ftrace_nr_registered_ops - return number of ops registered
  *
@@ -132,15 +152,6 @@ int ftrace_nr_registered_ops(void)
 	return cnt;
 }
 
-/*
- * Traverse the ftrace_global_list, invoking all entries.  The reason that we
- * can use rcu_dereference_raw() is that elements removed from this list
- * are simply leaked, so there is no need to interact with a grace-period
- * mechanism.  The rcu_dereference_raw() calls are needed to handle
- * concurrent insertions into the ftrace_global_list.
- *
- * Silly Alpha and silly pointer-speculation compiler optimizations!
- */
 static void
 ftrace_global_list_func(unsigned long ip, unsigned long parent_ip,
 			struct ftrace_ops *op, struct pt_regs *regs)
@@ -149,11 +160,9 @@ ftrace_global_list_func(unsigned long ip, unsigned long parent_ip,
 		return;
 
 	trace_recursion_set(TRACE_GLOBAL_BIT);
-	op = rcu_dereference_raw(ftrace_global_list); /*see above*/
-	while (op != &ftrace_list_end) {
+	do_for_each_ftrace_op(op, ftrace_global_list) {
 		op->func(ip, parent_ip, op, regs);
-		op = rcu_dereference_raw(op->next); /*see above*/
-	};
+	} while_for_each_ftrace_op(op);
 	trace_recursion_clear(TRACE_GLOBAL_BIT);
 }
 
@@ -4104,14 +4113,11 @@ ftrace_ops_control_func(unsigned long ip, unsigned long parent_ip,
 	 */
 	preempt_disable_notrace();
 	trace_recursion_set(TRACE_CONTROL_BIT);
-	op = rcu_dereference_raw(ftrace_control_list);
-	while (op != &ftrace_list_end) {
+	do_for_each_ftrace_op(op, ftrace_control_list) {
 		if (!ftrace_function_local_disabled(op) &&
 		    ftrace_ops_test(op, ip))
 			op->func(ip, parent_ip, op, regs);
-
-		op = rcu_dereference_raw(op->next);
-	};
+	} while_for_each_ftrace_op(op);
 	trace_recursion_clear(TRACE_CONTROL_BIT);
 	preempt_enable_notrace();
 }
@@ -4139,12 +4145,10 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
 	 * they must be freed after a synchronize_sched().
 	 */
 	preempt_disable_notrace();
-	op = rcu_dereference_raw(ftrace_ops_list);
-	while (op != &ftrace_list_end) {
+	do_for_each_ftrace_op(op, ftrace_ops_list) {
 		if (ftrace_ops_test(op, ip))
 			op->func(ip, parent_ip, op, regs);
-		op = rcu_dereference_raw(op->next);
-	};
+	} while_for_each_ftrace_op(op);
 	preempt_enable_notrace();
 	trace_recursion_clear(TRACE_INTERNAL_BIT);
 }
-- 
1.7.10.4




* [PATCH 07/12 -next] ftrace: Add context level recursion bit checking
From: Steven Rostedt @ 2013-01-23 20:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker


From: Steven Rostedt <srostedt@redhat.com>

Currently for recursion checking in the function tracer, ftrace
tests a task_struct bit to determine if the function tracer had
recursed or not. If it has, then it will return without going
further.

But this leads to races. If an interrupt came in after the bit
was set, the functions being traced would see that bit set and
think that the function tracer recursed on itself, and would return.

Instead add a bit for each context (normal, softirq, irq and nmi).

A check of which context the task is in is made before testing the
associated bit. Now if an interrupt preempts the function tracer
after the previous context has been set, the interrupt functions
can still be traced.
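
As an aside (not part of the patch), here is a minimal user-space
sketch of the idea: one recursion bit per context, selected before
testing. The real code picks the bit with in_nmi()/in_irq()/
in_interrupt() and keeps the bits in current->trace_recursion; here
the context is passed in explicitly and the names are invented for
illustration:

#include <stdio.h>

enum context { CTX_NMI, CTX_IRQ, CTX_SOFTIRQ, CTX_NORMAL };

/* Stands in for current->trace_recursion. */
static unsigned long trace_recursion;

/* Returns 1 if tracing may proceed (bit was clear, now set), 0 on recursion. */
static int recursion_try_enter(enum context ctx)
{
	if (trace_recursion & (1UL << ctx))
		return 0;		/* recursing within the same context */

	trace_recursion |= 1UL << ctx;
	return 1;
}

static void recursion_exit(enum context ctx)
{
	trace_recursion &= ~(1UL << ctx);
}

int main(void)
{
	/* Tracing in normal context... */
	printf("normal: %s\n", recursion_try_enter(CTX_NORMAL) ? "traced" : "skipped");
	/*
	 * ...an interrupt arrives before normal context clears its bit.
	 * With a per-context bit the irq handler is still traced.
	 */
	printf("irq:    %s\n", recursion_try_enter(CTX_IRQ) ? "traced" : "skipped");
	/* A recursive call within irq context is still caught. */
	printf("irq(2): %s\n", recursion_try_enter(CTX_IRQ) ? "traced" : "skipped");

	recursion_exit(CTX_IRQ);
	recursion_exit(CTX_NORMAL);
	return 0;
}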

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c |   40 +++++++++++++++++++++++++++++++++-------
 kernel/trace/trace.h  |   12 +++++++++---
 2 files changed, 42 insertions(+), 10 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 1330969..639b6ab 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -156,14 +156,27 @@ static void
 ftrace_global_list_func(unsigned long ip, unsigned long parent_ip,
 			struct ftrace_ops *op, struct pt_regs *regs)
 {
-	if (unlikely(trace_recursion_test(TRACE_GLOBAL_BIT)))
+	int bit;
+
+	if (in_interrupt()) {
+		if (in_nmi())
+			bit = TRACE_GLOBAL_NMI_BIT;
+
+		else if (in_irq())
+			bit = TRACE_GLOBAL_IRQ_BIT;
+		else
+			bit = TRACE_GLOBAL_SIRQ_BIT;
+	} else
+		bit = TRACE_GLOBAL_BIT;
+
+	if (unlikely(trace_recursion_test(bit)))
 		return;
 
-	trace_recursion_set(TRACE_GLOBAL_BIT);
+	trace_recursion_set(bit);
 	do_for_each_ftrace_op(op, ftrace_global_list) {
 		op->func(ip, parent_ip, op, regs);
 	} while_for_each_ftrace_op(op);
-	trace_recursion_clear(TRACE_GLOBAL_BIT);
+	trace_recursion_clear(bit);
 }
 
 static void ftrace_pid_func(unsigned long ip, unsigned long parent_ip,
@@ -4132,14 +4145,27 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
 		       struct ftrace_ops *ignored, struct pt_regs *regs)
 {
 	struct ftrace_ops *op;
+	unsigned int bit;
 
 	if (function_trace_stop)
 		return;
 
-	if (unlikely(trace_recursion_test(TRACE_INTERNAL_BIT)))
-		return;
+	if (in_interrupt()) {
+		if (in_nmi())
+			bit = TRACE_INTERNAL_NMI_BIT;
+
+		else if (in_irq())
+			bit = TRACE_INTERNAL_IRQ_BIT;
+		else
+			bit = TRACE_INTERNAL_SIRQ_BIT;
+	} else
+		bit = TRACE_INTERNAL_BIT;
+
+	if (unlikely(trace_recursion_test(bit)))
+			return;
+
+	trace_recursion_set(bit);
 
-	trace_recursion_set(TRACE_INTERNAL_BIT);
 	/*
 	 * Some of the ops may be dynamically allocated,
 	 * they must be freed after a synchronize_sched().
@@ -4150,7 +4176,7 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
 			op->func(ip, parent_ip, op, regs);
 	} while_for_each_ftrace_op(op);
 	preempt_enable_notrace();
-	trace_recursion_clear(TRACE_INTERNAL_BIT);
+	trace_recursion_clear(bit);
 }
 
 /*
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index c75d798..fe6ccff 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -299,8 +299,14 @@ struct tracer {
 
 /* for function tracing recursion */
 #define TRACE_INTERNAL_BIT		(1<<11)
-#define TRACE_GLOBAL_BIT		(1<<12)
-#define TRACE_CONTROL_BIT		(1<<13)
+#define TRACE_INTERNAL_NMI_BIT		(1<<12)
+#define TRACE_INTERNAL_IRQ_BIT		(1<<13)
+#define TRACE_INTERNAL_SIRQ_BIT		(1<<14)
+#define TRACE_GLOBAL_BIT		(1<<15)
+#define TRACE_GLOBAL_NMI_BIT		(1<<16)
+#define TRACE_GLOBAL_IRQ_BIT		(1<<17)
+#define TRACE_GLOBAL_SIRQ_BIT		(1<<18)
+#define TRACE_CONTROL_BIT		(1<<19)
 
 /*
  * Abuse of the trace_recursion.
@@ -309,7 +315,7 @@ struct tracer {
  * was called in irq context but we have irq tracing off. Since this
  * can only be modified by current, we can reuse trace_recursion.
  */
-#define TRACE_IRQ_BIT			(1<<13)
+#define TRACE_IRQ_BIT			(1<<20)
 
 #define trace_recursion_set(bit)	do { (current)->trace_recursion |= (bit); } while (0)
 #define trace_recursion_clear(bit)	do { (current)->trace_recursion &= ~(bit); } while (0)
-- 
1.7.10.4




* [PATCH 08/12 -next] tracing: Make the trace recursion bits into enums
From: Steven Rostedt @ 2013-01-23 20:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker


From: Steven Rostedt <srostedt@redhat.com>

Convert the bits into enums which makes the code a little easier
to maintain.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace.h |   30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index fe6ccff..5a095d6 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -298,15 +298,18 @@ struct tracer {
 #define trace_recursion_buffer() ((current)->trace_recursion & 0x3ff)
 
 /* for function tracing recursion */
-#define TRACE_INTERNAL_BIT		(1<<11)
-#define TRACE_INTERNAL_NMI_BIT		(1<<12)
-#define TRACE_INTERNAL_IRQ_BIT		(1<<13)
-#define TRACE_INTERNAL_SIRQ_BIT		(1<<14)
-#define TRACE_GLOBAL_BIT		(1<<15)
-#define TRACE_GLOBAL_NMI_BIT		(1<<16)
-#define TRACE_GLOBAL_IRQ_BIT		(1<<17)
-#define TRACE_GLOBAL_SIRQ_BIT		(1<<18)
-#define TRACE_CONTROL_BIT		(1<<19)
+enum {
+	TRACE_INTERNAL_BIT = 11,
+	TRACE_INTERNAL_NMI_BIT,
+	TRACE_INTERNAL_IRQ_BIT,
+	TRACE_INTERNAL_SIRQ_BIT,
+
+	TRACE_GLOBAL_BIT,
+	TRACE_GLOBAL_NMI_BIT,
+	TRACE_GLOBAL_IRQ_BIT,
+	TRACE_GLOBAL_SIRQ_BIT,
+
+	TRACE_CONTROL_BIT,
 
 /*
  * Abuse of the trace_recursion.
@@ -315,11 +318,12 @@ struct tracer {
  * was called in irq context but we have irq tracing off. Since this
  * can only be modified by current, we can reuse trace_recursion.
  */
-#define TRACE_IRQ_BIT			(1<<20)
+	TRACE_IRQ_BIT,
+};
 
-#define trace_recursion_set(bit)	do { (current)->trace_recursion |= (bit); } while (0)
-#define trace_recursion_clear(bit)	do { (current)->trace_recursion &= ~(bit); } while (0)
-#define trace_recursion_test(bit)	((current)->trace_recursion & (bit))
+#define trace_recursion_set(bit)	do { (current)->trace_recursion |= (1<<(bit)); } while (0)
+#define trace_recursion_clear(bit)	do { (current)->trace_recursion &= ~(1<<(bit)); } while (0)
+#define trace_recursion_test(bit)	((current)->trace_recursion & (1<<(bit)))
 
 #define TRACE_PIPE_ALL_CPU	-1
 
-- 
1.7.10.4




* [PATCH 09/12 -next] tracing: Avoid unnecessary multiple recursion checks
From: Steven Rostedt @ 2013-01-23 20:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker


From: Steven Rostedt <srostedt@redhat.com>

When function tracing occurs, the following steps are made:
  If arch does not support a ftrace feature:
   call internal function (uses INTERNAL bits) which calls...
  If callback is registered to the "global" list, the list
   function is called and recursion checks the GLOBAL bits.
   then this function calls...
  The function callback, which can use the FTRACE bits to
   check for recursion.

Now if the arch does not support a feature, and it calls
the global list function, which in turn calls the ftrace callback,
all three of these steps will each do their own recursion check.
There's no reason to do one if the previous caller already
did. The recursion that we are protecting against will
go through the same steps again.

To prevent the multiple recursion checks, if a recursion
bit is set that is higher than the MAX bit of the current
check, then we know that the check was made by the previous
caller, and we can skip the current check.
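
A standalone sketch of that layering, for illustration only (the bit
positions, names and the explicit 'context' variable are invented;
the real helpers are trace_test_and_set_recursion() and
trace_clear_recursion() operating on current->trace_recursion):

#include <stdio.h>

/* Four context bits per layer; layers that run earlier use higher bits. */
#define CONTEXT_BITS	4
#define FTRACE_START	4	/* callback layer (the real enum starts higher) */
#define GLOBAL_START	(FTRACE_START + CONTEXT_BITS)
#define LIST_START	(GLOBAL_START + CONTEXT_BITS)

#define FTRACE_MAX	((1u << (FTRACE_START + CONTEXT_BITS)) - 1)
#define GLOBAL_MAX	((1u << (GLOBAL_START + CONTEXT_BITS)) - 1)
#define LIST_MAX	((1u << (LIST_START + CONTEXT_BITS)) - 1)
#define CONTEXT_MASK	LIST_MAX

static unsigned int recursion;	/* stands in for current->trace_recursion */
static int context = 3;		/* 0 = nmi, 1 = irq, 2 = softirq, 3 = normal */

/*
 * Returns the bit taken, 0 if a higher layer already holds a bit (skip
 * the check, take no bit), or -1 on genuine recursion.
 */
static int test_and_set_recursion(int start, unsigned int max)
{
	int bit;

	if ((recursion & CONTEXT_MASK) > max)
		return 0;		/* previous caller already checked */

	bit = context + start;
	if (recursion & (1u << bit))
		return -1;		/* recursion within this layer/context */

	recursion |= 1u << bit;
	return bit;
}

static void clear_recursion(int bit)
{
	if (bit > 0)
		recursion &= ~(1u << bit);
}

int main(void)
{
	/* The internal list function checks first and takes a LIST bit... */
	int list = test_and_set_recursion(LIST_START, LIST_MAX);
	/* ...so the global and callback layers it calls skip their checks. */
	int global = test_and_set_recursion(GLOBAL_START, GLOBAL_MAX);
	int ftrace = test_and_set_recursion(FTRACE_START, FTRACE_MAX);

	printf("list bit=%d, global bit=%d, ftrace bit=%d (0 means skipped)\n",
	       list, global, ftrace);

	clear_recursion(ftrace);
	clear_recursion(global);
	clear_recursion(list);
	return 0;
}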

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c |   40 +++++--------------
 kernel/trace/trace.h  |  106 ++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 110 insertions(+), 36 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 639b6ab..ce8c3d6 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -158,25 +158,15 @@ ftrace_global_list_func(unsigned long ip, unsigned long parent_ip,
 {
 	int bit;
 
-	if (in_interrupt()) {
-		if (in_nmi())
-			bit = TRACE_GLOBAL_NMI_BIT;
-
-		else if (in_irq())
-			bit = TRACE_GLOBAL_IRQ_BIT;
-		else
-			bit = TRACE_GLOBAL_SIRQ_BIT;
-	} else
-		bit = TRACE_GLOBAL_BIT;
-
-	if (unlikely(trace_recursion_test(bit)))
+	bit = trace_test_and_set_recursion(TRACE_GLOBAL_START, TRACE_GLOBAL_MAX);
+	if (bit < 0)
 		return;
 
-	trace_recursion_set(bit);
 	do_for_each_ftrace_op(op, ftrace_global_list) {
 		op->func(ip, parent_ip, op, regs);
 	} while_for_each_ftrace_op(op);
-	trace_recursion_clear(bit);
+
+	trace_clear_recursion(bit);
 }
 
 static void ftrace_pid_func(unsigned long ip, unsigned long parent_ip,
@@ -4145,26 +4135,14 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
 		       struct ftrace_ops *ignored, struct pt_regs *regs)
 {
 	struct ftrace_ops *op;
-	unsigned int bit;
+	int bit;
 
 	if (function_trace_stop)
 		return;
 
-	if (in_interrupt()) {
-		if (in_nmi())
-			bit = TRACE_INTERNAL_NMI_BIT;
-
-		else if (in_irq())
-			bit = TRACE_INTERNAL_IRQ_BIT;
-		else
-			bit = TRACE_INTERNAL_SIRQ_BIT;
-	} else
-		bit = TRACE_INTERNAL_BIT;
-
-	if (unlikely(trace_recursion_test(bit)))
-			return;
-
-	trace_recursion_set(bit);
+	bit = trace_test_and_set_recursion(TRACE_LIST_START, TRACE_LIST_MAX);
+	if (bit < 0)
+		return;
 
 	/*
 	 * Some of the ops may be dynamically allocated,
@@ -4176,7 +4154,7 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
 			op->func(ip, parent_ip, op, regs);
 	} while_for_each_ftrace_op(op);
 	preempt_enable_notrace();
-	trace_recursion_clear(bit);
+	trace_clear_recursion(bit);
 }
 
 /*
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 5a095d6..c203a51 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -297,18 +297,49 @@ struct tracer {
 /* Ring buffer has the 10 LSB bits to count */
 #define trace_recursion_buffer() ((current)->trace_recursion & 0x3ff)
 
-/* for function tracing recursion */
+/*
+ * For function tracing recursion:
+ *  The order of these bits are important.
+ *
+ *  When function tracing occurs, the following steps are made:
+ *   If arch does not support a ftrace feature:
+ *    call internal function (uses INTERNAL bits) which calls...
+ *   If callback is registered to the "global" list, the list
+ *    function is called and recursion checks the GLOBAL bits.
+ *    then this function calls...
+ *   The function callback, which can use the FTRACE bits to
+ *    check for recursion.
+ *
+ * Now if the arch does not suppport a feature, and it calls
+ * the global list function which calls the ftrace callback
+ * all three of these steps will do a recursion protection.
+ * There's no reason to do one if the previous caller already
+ * did. The recursion that we are protecting against will
+ * go through the same steps again.
+ *
+ * To prevent the multiple recursion checks, if a recursion
+ * bit is set that is higher than the MAX bit of the current
+ * check, then we know that the check was made by the previous
+ * caller, and we can skip the current check.
+ */
 enum {
-	TRACE_INTERNAL_BIT = 11,
-	TRACE_INTERNAL_NMI_BIT,
-	TRACE_INTERNAL_IRQ_BIT,
-	TRACE_INTERNAL_SIRQ_BIT,
+	TRACE_FTRACE_BIT = 11,
+	TRACE_FTRACE_NMI_BIT,
+	TRACE_FTRACE_IRQ_BIT,
+	TRACE_FTRACE_SIRQ_BIT,
 
+	/* GLOBAL_BITs must be greater than FTRACE_BITs */
 	TRACE_GLOBAL_BIT,
 	TRACE_GLOBAL_NMI_BIT,
 	TRACE_GLOBAL_IRQ_BIT,
 	TRACE_GLOBAL_SIRQ_BIT,
 
+	/* INTERNAL_BITs must be greater than GLOBAL_BITs */
+	TRACE_INTERNAL_BIT,
+	TRACE_INTERNAL_NMI_BIT,
+	TRACE_INTERNAL_IRQ_BIT,
+	TRACE_INTERNAL_SIRQ_BIT,
+
 	TRACE_CONTROL_BIT,
 
 /*
@@ -325,6 +356,71 @@ enum {
 #define trace_recursion_clear(bit)	do { (current)->trace_recursion &= ~(1<<(bit)); } while (0)
 #define trace_recursion_test(bit)	((current)->trace_recursion & (1<<(bit)))
 
+#define TRACE_CONTEXT_BITS	4
+
+#define TRACE_FTRACE_START	TRACE_FTRACE_BIT
+#define TRACE_FTRACE_MAX	((1 << (TRACE_FTRACE_START + TRACE_CONTEXT_BITS)) - 1)
+
+#define TRACE_GLOBAL_START	TRACE_GLOBAL_BIT
+#define TRACE_GLOBAL_MAX	((1 << (TRACE_GLOBAL_START + TRACE_CONTEXT_BITS)) - 1)
+
+#define TRACE_LIST_START	TRACE_INTERNAL_BIT
+#define TRACE_LIST_MAX		((1 << (TRACE_LIST_START + TRACE_CONTEXT_BITS)) - 1)
+
+#define TRACE_CONTEXT_MASK	TRACE_LIST_MAX
+
+static __always_inline int trace_get_context_bit(void)
+{
+	int bit;
+
+	if (in_interrupt()) {
+		if (in_nmi())
+			bit = 0;
+
+		else if (in_irq())
+			bit = 1;
+		else
+			bit = 2;
+	} else
+		bit = 3;
+
+	return bit;
+}
+
+static __always_inline int trace_test_and_set_recursion(int start, int max)
+{
+	unsigned int val = current->trace_recursion;
+	int bit;
+
+	/* A previous recursion check was made */
+	if ((val & TRACE_CONTEXT_MASK) > max)
+		return 0;
+
+	bit = trace_get_context_bit() + start;
+	if (unlikely(val & (1 << bit)))
+		return -1;
+
+	val |= 1 << bit;
+	current->trace_recursion = val;
+	barrier();
+
+	return bit;
+}
+
+static __always_inline void trace_clear_recursion(int bit)
+{
+	unsigned int val = current->trace_recursion;
+
+	if (!bit)
+		return;
+
+	bit = 1 << bit;
+	val &= ~bit;
+
+	barrier();
+	current->trace_recursion = val;
+}
+
 #define TRACE_PIPE_ALL_CPU	-1
 
 static inline struct ring_buffer_iter *
-- 
1.7.10.4




* [PATCH 10/12 -next] ftrace: Use only the preempt version of function tracing
From: Steven Rostedt @ 2013-01-23 20:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker


From: Steven Rostedt <srostedt@redhat.com>

The function tracer had two different versions of function tracing.

The disabling of irqs version and the preempt disable version.

As function tracing is very intrusive and can cause nasty recursion
issues, it has its own recursion protection. But the old method to
do this was a flat layer. If it detected that a recursion was
happening, then it would just return without recording.

This made the preempt version (much faster than the irq disabling one)
not very useful, because if an interrupt were to occur after the
recursion flag was set, the interrupt would not be traced at all,
because every function that was traced would think it recursed on
itself (due to the context it preempted setting the recursive flag).

Now that we have a recursion flag for every context level, we
no longer need to worry about that. We can disable preemption,
set the current context recursion check bit, and go on. If an
interrupt were to come along, it would check its own context bit
and happily continue to trace.

As the preempt version is faster than the irq disable version,
there's no more reason to keep the irq disable version around.
And the irq disable version still had an issue with missing
out on tracing NMI code.

Remove the irq disable function tracer version and have the
preempt disable version be the default (and only version).

Before this patch, we had the following results from running:

 # echo function > /debug/tracing/current_tracer
 # for i in `seq 10`; do ./hackbench 50; done
Time: 12.028
Time: 11.945
Time: 11.925
Time: 11.964
Time: 12.002
Time: 11.910
Time: 11.944
Time: 11.929
Time: 11.941
Time: 11.924

(average: 11.9512)

Now we have:

 # echo function > /debug/tracing/current_tracer
 # for i in `seq 10`; do ./hackbench 50; done
Time: 10.285
Time: 10.407
Time: 10.243
Time: 10.372
Time: 10.380
Time: 10.198
Time: 10.272
Time: 10.354
Time: 10.248
Time: 10.253

(average: 10.3012)

 a 13.8% savings!
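
(That works out to (11.9512 - 10.3012) / 11.9512, or roughly 13.8%
less time per run.)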

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_functions.c |   61 +++++++++-------------------------------
 1 file changed, 14 insertions(+), 47 deletions(-)

diff --git a/kernel/trace/trace_functions.c b/kernel/trace/trace_functions.c
index 8e3ad80..1c327ef 100644
--- a/kernel/trace/trace_functions.c
+++ b/kernel/trace/trace_functions.c
@@ -47,34 +47,6 @@ static void function_trace_start(struct trace_array *tr)
 	tracing_reset_online_cpus(tr);
 }
 
-static void
-function_trace_call_preempt_only(unsigned long ip, unsigned long parent_ip,
-				 struct ftrace_ops *op, struct pt_regs *pt_regs)
-{
-	struct trace_array *tr = func_trace;
-	struct trace_array_cpu *data;
-	unsigned long flags;
-	long disabled;
-	int cpu;
-	int pc;
-
-	if (unlikely(!ftrace_function_enabled))
-		return;
-
-	pc = preempt_count();
-	preempt_disable_notrace();
-	local_save_flags(flags);
-	cpu = raw_smp_processor_id();
-	data = tr->data[cpu];
-	disabled = atomic_inc_return(&data->disabled);
-
-	if (likely(disabled == 1))
-		trace_function(tr, ip, parent_ip, flags, pc);
-
-	atomic_dec(&data->disabled);
-	preempt_enable_notrace();
-}
-
 /* Our option */
 enum {
 	TRACE_FUNC_OPT_STACK	= 0x1,
@@ -85,34 +57,34 @@ static struct tracer_flags func_flags;
 static void
 function_trace_call(unsigned long ip, unsigned long parent_ip,
 		    struct ftrace_ops *op, struct pt_regs *pt_regs)
-
 {
 	struct trace_array *tr = func_trace;
 	struct trace_array_cpu *data;
 	unsigned long flags;
-	long disabled;
+	unsigned int bit;
 	int cpu;
 	int pc;
 
 	if (unlikely(!ftrace_function_enabled))
 		return;
 
-	/*
-	 * Need to use raw, since this must be called before the
-	 * recursive protection is performed.
-	 */
-	local_irq_save(flags);
-	cpu = raw_smp_processor_id();
-	data = tr->data[cpu];
-	disabled = atomic_inc_return(&data->disabled);
+	pc = preempt_count();
+	preempt_disable_notrace();
 
-	if (likely(disabled == 1)) {
-		pc = preempt_count();
+	bit = trace_test_and_set_recursion(TRACE_FTRACE_START, TRACE_FTRACE_MAX);
+	if (bit < 0)
+		goto out;
+
+	cpu = smp_processor_id();
+	data = tr->data[cpu];
+	if (!atomic_read(&data->disabled)) {
+		local_save_flags(flags);
 		trace_function(tr, ip, parent_ip, flags, pc);
 	}
+	trace_clear_recursion(bit);
 
-	atomic_dec(&data->disabled);
-	local_irq_restore(flags);
+ out:
+	preempt_enable_notrace();
 }
 
 static void
@@ -185,11 +157,6 @@ static void tracing_start_function_trace(void)
 {
 	ftrace_function_enabled = 0;
 
-	if (trace_flags & TRACE_ITER_PREEMPTONLY)
-		trace_ops.func = function_trace_call_preempt_only;
-	else
-		trace_ops.func = function_trace_call;
-
 	if (func_flags.val & TRACE_FUNC_OPT_STACK)
 		register_ftrace_function(&trace_stack_ops);
 	else
-- 
1.7.10.4




* [PATCH 11/12 -next] ring-buffer: User context bit recursion checking
From: Steven Rostedt @ 2013-01-23 20:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker


From: Steven Rostedt <srostedt@redhat.com>

Using context bit recursion checking helps increase the performance
of the ring buffer.

Before this patch:

 # echo function > /debug/tracing/current_tracer
 # for i in `seq 10`; do ./hackbench 50; done
Time: 10.285
Time: 10.407
Time: 10.243
Time: 10.372
Time: 10.380
Time: 10.198
Time: 10.272
Time: 10.354
Time: 10.248
Time: 10.253

(average: 10.3012)

Now we have:

 # echo function > /debug/tracing/current_tracer
 # for i in `seq 10`; do ./hackbench 50; done
Time: 9.712
Time: 9.824
Time: 9.861
Time: 9.827
Time: 9.962
Time: 9.905
Time: 9.886
Time: 10.088
Time: 9.861
Time: 9.834

(average: 9.876)

 a 4% savings!
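
As an aside (not part of the patch), the unlock trick described in
the comment added to ring_buffer.c below can be checked with a tiny
standalone program. Subtracting one and then ANDing with the
original value clears the lowest set bit, which is the bit of the
innermost (current) context:

#include <stdio.h>

int main(void)
{
	/* Normal context (bit 3) was interrupted by an IRQ (bit 1). */
	unsigned int val = (1u << 3) | (1u << 1);	/* binary 1010 */

	printf("before unlock: 0x%x\n", val);	/* 0xa */

	val = (val - 1) & val;			/* irq unlock clears bit 1 */
	printf("irq unlocked:  0x%x\n", val);	/* 0x8 */

	val = (val - 1) & val;			/* normal unlock clears bit 3 */
	printf("all unlocked:  0x%x\n", val);	/* 0x0 */

	return 0;
}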

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ring_buffer.c |   85 +++++++++++++++++++++++++++++++-------------
 kernel/trace/trace.h       |   13 +++----
 2 files changed, 67 insertions(+), 31 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 6ff9cc4..481e262 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2432,41 +2432,76 @@ rb_reserve_next_event(struct ring_buffer *buffer,
 
 #ifdef CONFIG_TRACING
 
-#define TRACE_RECURSIVE_DEPTH 16
+/*
+ * The lock and unlock are done within a preempt disable section.
+ * The current_context per_cpu variable can only be modified
+ * by the current task between lock and unlock. But it can
+ * be modified more than once via an interrupt. To pass this
+ * information from the lock to the unlock without having to
+ * access the 'in_interrupt()' functions again (which do show
+ * a bit of overhead in something as critical as function tracing,
+ * we use a bitmask trick.
+ *
+ *  bit 0 =  NMI context
+ *  bit 1 =  IRQ context
+ *  bit 2 =  SoftIRQ context
+ *  bit 3 =  normal context.
+ *
+ * This works because this is the order of contexts that can
+ * preempt other contexts. A SoftIRQ never preempts an IRQ
+ * context.
+ *
+ * When the context is determined, the corresponding bit is
+ * checked and set (if it was set, then a recursion of that context
+ * happened).
+ *
+ * On unlock, we need to clear this bit. To do so, just subtract
+ * 1 from the current_context and AND it to itself.
+ *
+ * (binary)
+ *  101 - 1 = 100
+ *  101 & 100 = 100 (clearing bit zero)
+ *
+ *  1010 - 1 = 1001
+ *  1010 & 1001 = 1000 (clearing bit 1)
+ *
+ * The least significant bit can be cleared this way, and it
+ * just so happens that it is the same bit corresponding to
+ * the current context.
+ */
+static DEFINE_PER_CPU(unsigned int, current_context);
 
-/* Keep this code out of the fast path cache */
-static noinline void trace_recursive_fail(void)
+static __always_inline int trace_recursive_lock(void)
 {
-	/* Disable all tracing before we do anything else */
-	tracing_off_permanent();
-
-	printk_once(KERN_WARNING "Tracing recursion: depth[%ld]:"
-		    "HC[%lu]:SC[%lu]:NMI[%lu]\n",
-		    trace_recursion_buffer(),
-		    hardirq_count() >> HARDIRQ_SHIFT,
-		    softirq_count() >> SOFTIRQ_SHIFT,
-		    in_nmi());
+	unsigned int val = this_cpu_read(current_context);
+	int bit;
 
-	WARN_ON_ONCE(1);
-}
-
-static inline int trace_recursive_lock(void)
-{
-	trace_recursion_inc();
+	if (in_interrupt()) {
+		if (in_nmi())
+			bit = 0;
+		else if (in_irq())
+			bit = 1;
+		else
+			bit = 2;
+	} else
+		bit = 3;
 
-	if (likely(trace_recursion_buffer() < TRACE_RECURSIVE_DEPTH))
-		return 0;
+	if (unlikely(val & (1 << bit)))
+		return 1;
 
-	trace_recursive_fail();
+	val |= (1 << bit);
+	this_cpu_write(current_context, val);
 
-	return -1;
+	return 0;
 }
 
-static inline void trace_recursive_unlock(void)
+static __always_inline void trace_recursive_unlock(void)
 {
-	WARN_ON_ONCE(!trace_recursion_buffer());
+	unsigned int val = this_cpu_read(current_context);
 
-	trace_recursion_dec();
+	val--;
+	val &= this_cpu_read(current_context);
+	this_cpu_write(current_context, val);
 }
 
 #else
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index c203a51..04a2c7a 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -291,11 +291,6 @@ struct tracer {
 
 
 /* Only current can touch trace_recursion */
-#define trace_recursion_inc() do { (current)->trace_recursion++; } while (0)
-#define trace_recursion_dec() do { (current)->trace_recursion--; } while (0)
-
-/* Ring buffer has the 10 LSB bits to count */
-#define trace_recursion_buffer() ((current)->trace_recursion & 0x3ff)
 
 /*
  * For function tracing recursion:
@@ -323,7 +318,13 @@ struct tracer {
  * caller, and we can skip the current check.
  */
 enum {
-	TRACE_FTRACE_BIT = 11,
+	TRACE_BUFFER_BIT,
+	TRACE_BUFFER_NMI_BIT,
+	TRACE_BUFFER_IRQ_BIT,
+	TRACE_BUFFER_SIRQ_BIT,
+
+	/* Start of function recursion bits */
+	TRACE_FTRACE_BIT,
 	TRACE_FTRACE_NMI_BIT,
 	TRACE_FTRACE_IRQ_BIT,
 	TRACE_FTRACE_SIRQ_BIT,
-- 
1.7.10.4




* [PATCH 12/12 -next] ring-buffer: Remove trace.h from ring_buffer.c
From: Steven Rostedt @ 2013-01-23 20:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker


From: Steven Rostedt <srostedt@redhat.com>

ring_buffer.c used to require declarations from trace.h, but
these have moved to the generic header files. There's nothing
in trace.h that ring_buffer.c requires.

There are some headers that trace.h included that ring_buffer.c
needs, but it's best that ring_buffer.c includes them directly
rather than pulling them in through trace.h.

Also, some things may use ring_buffer.c without having tracing
configured. This removes a dependency that could otherwise cause
problems in the future.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ring_buffer.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 481e262..13950d9 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3,8 +3,10 @@
  *
  * Copyright (C) 2008 Steven Rostedt <srostedt@redhat.com>
  */
+#include <linux/ftrace_event.h>
 #include <linux/ring_buffer.h>
 #include <linux/trace_clock.h>
+#include <linux/trace_seq.h>
 #include <linux/spinlock.h>
 #include <linux/debugfs.h>
 #include <linux/uaccess.h>
@@ -21,7 +23,6 @@
 #include <linux/fs.h>
 
 #include <asm/local.h>
-#include "trace.h"
 
 static void update_pages_handler(struct work_struct *work);
 
-- 
1.7.10.4



