From: Steven Rostedt <rostedt@goodmis.org>
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Paul E. McKenney" <paulmck@us.ibm.com>,
	Tejun Heo <tj@kernel.org>, Jiri Olsa <jolsa@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: [for-next][PATCH 05/12] ftrace: Use schedule_on_each_cpu() as a heavy synchronize_sched()
Date: Wed, 19 Jun 2013 23:35:21 -0400
Message-ID: <20130620033639.241465107@goodmis.org>
In-Reply-To: 20130620033516.003166252@goodmis.org

[-- Attachment #1: 0005-ftrace-Use-schedule_on_each_cpu-as-a-heavy-synchroni.patch --]
[-- Type: text/plain, Size: 3623 bytes --]

From: Steven Rostedt <rostedt@goodmis.org>

The function tracer uses preempt_disable/enable_notrace() for
synchronization between reading registered ftrace_ops and unregistering
them.

Most ftrace_ops are global, permanent structures that do not require
this synchronization. That is, the ops may be added to and removed from
the hlist but are never freed, so no harm is done if a synchronization
is missed.

But this is not true for dynamically created ftrace_ops or control_ops,
which are used by the perf function tracing.

The problem here is that the function tracer can be used to trace
kernel/user context switches as well as going to and from idle.
Basically, it can be used to trace blind spots of the RCU subsystem.
This means that even though preempt_disable() is done, a
synchronize_sched() will ignore CPUs that haven't made it out of user
space or idle. These can include functions that are being traced just
before entering or exiting the kernel sections.

To implement the RCU synchronization, use schedule_on_each_cpu()
instead of synchronize_sched(). This means that when a dynamically
allocated ftrace_ops or a control ops is being unregistered, every CPU
must be touched and must execute the ftrace_sync() stub function via
the work queues. This will rip CPUs out of idle or dynamic tick mode.
It only happens when a user disables perf function tracing or another
dynamically allocated function tracer, but it allows us to continue to
debug RCU and context tracking with function tracing.
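
As a rough userspace analogy only (this is not kernel code and not part
of the patch), the idea of using a no-op work item as a heavy barrier
can be sketched with pthreads. Every name below (NR_WORKERS, struct
worker, sync_stub(), sync_all_workers(), start_workers(),
stop_workers()) is hypothetical. Each worker thread stands in for a
CPU's work queue: once a worker has run the stub, whatever it was doing
before the stub was queued has finished.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_WORKERS 4	/* hypothetical stand-in for the number of CPUs */

struct worker {
	pthread_t thread;
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool sync_pending;	/* a no-op work item is queued */
	bool stop;
	unsigned long syncs_run;	/* how many stubs this worker has run */
};

static struct worker workers[NR_WORKERS];

/* Stub work item: does nothing, in the spirit of ftrace_sync(). */
static void sync_stub(void)
{
}

static void *worker_loop(void *arg)
{
	struct worker *w = arg;

	pthread_mutex_lock(&w->lock);
	while (!w->stop) {
		if (w->sync_pending) {
			sync_stub();
			w->syncs_run++;
			w->sync_pending = false;
			/* Wake the caller waiting in sync_all_workers(). */
			pthread_cond_broadcast(&w->cond);
		} else {
			pthread_cond_wait(&w->cond, &w->lock);
		}
	}
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

/* Queue the stub on every worker, then wait until each one has run it. */
static void sync_all_workers(void)
{
	size_t i;

	for (i = 0; i < NR_WORKERS; i++) {
		pthread_mutex_lock(&workers[i].lock);
		workers[i].sync_pending = true;
		pthread_cond_broadcast(&workers[i].cond);
		pthread_mutex_unlock(&workers[i].lock);
	}
	for (i = 0; i < NR_WORKERS; i++) {
		pthread_mutex_lock(&workers[i].lock);
		while (workers[i].sync_pending)
			pthread_cond_wait(&workers[i].cond, &workers[i].lock);
		pthread_mutex_unlock(&workers[i].lock);
	}
}

/* Returns 0 on success, -1 if a thread could not be created. */
static int start_workers(void)
{
	size_t i;

	for (i = 0; i < NR_WORKERS; i++) {
		pthread_mutex_init(&workers[i].lock, NULL);
		pthread_cond_init(&workers[i].cond, NULL);
		workers[i].sync_pending = false;
		workers[i].stop = false;
		workers[i].syncs_run = 0;
		if (pthread_create(&workers[i].thread, NULL, worker_loop,
				   &workers[i]) != 0)
			return -1;
	}
	return 0;
}

static void stop_workers(void)
{
	size_t i;

	for (i = 0; i < NR_WORKERS; i++) {
		pthread_mutex_lock(&workers[i].lock);
		workers[i].stop = true;
		pthread_cond_broadcast(&workers[i].cond);
		pthread_mutex_unlock(&workers[i].lock);
		pthread_join(workers[i].thread, NULL);
	}
}
```

A caller would run start_workers(), call sync_all_workers() before
freeing data the workers might still reference, and finish with
stop_workers(). The sketch only models the "touch every worker and wait"
step; it does not model preemption, idle, or RCU itself.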

Link: http://lkml.kernel.org/r/1369785676.15552.55.camel@gandalf.local.home

Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c |   23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 6c508ff..800a8a2 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -413,6 +413,17 @@ static int __register_ftrace_function(struct ftrace_ops *ops)
 	return 0;
 }
 
+static void ftrace_sync(struct work_struct *work)
+{
+	/*
+	 * This function is just a stub to implement a hard force
+	 * of synchronize_sched(). This requires synchronizing
+	 * tasks even in userspace and idle.
+	 *
+	 * Yes, function tracing is rude.
+	 */
+}
+
 static int __unregister_ftrace_function(struct ftrace_ops *ops)
 {
 	int ret;
@@ -440,8 +451,12 @@ static int __unregister_ftrace_function(struct ftrace_ops *ops)
 			 * so there'll be no new users. We must ensure
 			 * all current users are done before we free
 			 * the control data.
+			 * Note synchronize_sched() is not enough, as we
+			 * use preempt_disable() to do RCU, but the function
+			 * tracer can be called where RCU is not active
+			 * (before user_exit()).
 			 */
-			synchronize_sched();
+			schedule_on_each_cpu(ftrace_sync);
 			control_ops_free(ops);
 		}
 	} else
@@ -456,9 +471,13 @@ static int __unregister_ftrace_function(struct ftrace_ops *ops)
 	/*
 	 * Dynamic ops may be freed, we must make sure that all
 	 * callers are done before leaving this function.
+	 *
+	 * Again, normal synchronize_sched() is not good enough.
+	 * We need to do a hard force of sched synchronization.
 	 */
 	if (ops->flags & FTRACE_OPS_FL_DYNAMIC)
-		synchronize_sched();
+		schedule_on_each_cpu(ftrace_sync);
+
 
 	return 0;
 }
-- 
1.7.10.4
