linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/5] [GIT PULL] ftrace: jprobe, syscall and other fixes
@ 2015-01-15 15:28 Steven Rostedt
  2015-01-15 15:28 ` [PATCH 1/5] ftrace: Fix updating of filters for shared global_ops filters Steven Rostedt
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Steven Rostedt @ 2015-01-15 15:28 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Masami Hiramatsu


Linus,

This holds a few fixes to the ftrace infrastructure, as well as to
the interaction of function graph tracing and kprobes.

When jprobes and function graph tracing are enabled at the same time,
the system will crash:

  # modprobe jprobe_example
  # echo function_graph > /sys/kernel/debug/tracing/current_tracer

After the first fork (jprobe_example probes it), the system will crash.
This is due to the way jprobes copy the stack frame and do not
do a normal function return. This interferes with the function graph
tracing accounting, which hijacks the return address from the stack
and replaces it with a hook function. It saves the real return addresses
on a separate stack so the correct return address can be put back when
the function is done. But because the jprobe handlers do not do a normal
return, their saved return addresses are not put back until the function
they probe is called, which means that the probed function will get the
return address of the jprobe handler instead of its own.
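
To make this concrete, here is a rough sketch of what the entry side
of the function graph tracer does (simplified; fgraph_entry() and
push_ret_stack() are invented names standing in for the real code in
prepare_ftrace_return()):

  /* Sketch only: invented helpers, no locking or error handling. */
  void fgraph_entry(unsigned long *parent)
  {
          /* save the real return address on a per-task shadow stack */
          push_ret_stack(current, *parent);
          /* make the traced function return into the tracer's hook */
          *parent = (unsigned long)return_to_handler;
  }

Because jprobe_return() bypasses the normal return path, the matching
pop of the shadow stack never happens for the jprobe handler, and the
next function to return pops the wrong entry.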

The simple fix here was to disable function graph tracing while the
jprobe handler is being called.

While debugging this I found two minor bugs with the function graph
tracing.

The first was about the function graph tracer sharing its function hash
with the function tracer (they both get filtered by the same input).
Changing set_ftrace_filter would not sync the ftrace function records
after a change if the function tracer was disabled but the function
graph tracer was enabled. This was due to the update only checking
one of the ops instead of the shared ops to see if they were enabled and
should perform the sync. This broke the ftrace accounting, and an
ftrace_bug() would be triggered, disabling ftrace until a reboot.

The second was that the check to update records only checked one of the
filter hashes. It needs to test both the "filter" and "notrace" hashes.
The "filter" hash determines what functions to trace, whereas the "notrace"
hash determines what functions not to trace (trace all but these).
Both hashes need to be passed to the update code to find out what change
is being made during the update. This also broke the ftrace record
accounting and triggered an ftrace_bug().
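
For reference, the combined semantics of the two hashes look roughly
like this (a sketch with invented helpers, not the actual ftrace code):

  /* Sketch only: is_empty() and hash_contains() are invented names. */
  static bool should_trace(struct ftrace_hash *filter,
                           struct ftrace_hash *notrace,
                           unsigned long ip)
  {
          /* an empty filter hash means "trace all functions" */
          if (!is_empty(filter) && !hash_contains(filter, ip))
                  return false;
          /* the notrace hash then removes functions from that set */
          return !hash_contains(notrace, ip);
  }

An update that looks only at the filter hash can therefore miss a
change that touches only the notrace hash.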

This patch set also includes two more fixes that were reported separately
from the kprobe issue.

One was that init_ftrace_syscalls() was called twice at boot up.
This is not a major bug, but that call performs a rather large kmalloc
(NR_syscalls * sizeof(*syscalls_metadata)). The second call made the
first one a memory leak, wasting memory.
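
The allocation in question looks roughly like this (a simplified sketch
based on the description above; the exact call in init_ftrace_syscalls()
may differ):

  /* Roughly: a second call overwrites the pointer, so the array
   * allocated by the first call is leaked. */
  syscalls_metadata = kcalloc(NR_syscalls,
                              sizeof(*syscalls_metadata), GFP_KERNEL);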

The other fix is a regression caused by an update in the v3.19 merge window.
The change to enable events early moved the enabling to before PID 1 was
created. The syscall events require setting the TIF_SYSCALL_TRACEPOINT
flag for all tasks. But for_each_process_thread() does not include the
swapper task (PID 0), so the enabling ended up being a nop. A suggested
fix was to set the flag on init_task, but I didn't really want to mess
with PID 0 for this minor bug. Instead I disable and re-enable events again
at early_initcall(), where they used to be enabled. This also handles any
other event that might have its own reg function that could break at early
boot up.
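
For context, the reg function for the syscall tracepoints does roughly
the following (a simplified sketch of syscall_regfunc(); locking and
reference counting omitted):

  static void sketch_syscall_regfunc(void)
  {
          struct task_struct *g, *t;

          /* The swapper task (PID 0) is not visited by this walk, so
           * running it before PID 1 exists flags nothing. */
          for_each_process_thread(g, t)
                  set_tsk_thread_flag(t, TIF_SYSCALL_TRACEPOINT);
  }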

Please pull the latest trace-fixes-v3.19-rc3 tree, which can be found at:


  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
trace-fixes-v3.19-rc3

Tag SHA1: 73a577e76939fb74bc8de1e2a816b3d8a0fe850d
Head SHA1: ce1039bd3a89e99e4f624e75fb1777fc92d76eb3


Steven Rostedt (Red Hat) (5):
      ftrace: Fix updating of filters for shared global_ops filters
      ftrace: Check both notrace and filter for old hash
      ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing
      tracing: Remove extra call to init_ftrace_syscalls()
      tracing: Fix enabling of syscall events on the command line

----
 arch/x86/kernel/kprobes/core.c | 20 +++++++++---
 kernel/trace/ftrace.c          | 53 +++++++++++++++++++++++++++-----
 kernel/trace/trace.c           |  1 -
 kernel/trace/trace_events.c    | 69 +++++++++++++++++++++++++++++++++---------
 4 files changed, 115 insertions(+), 28 deletions(-)


* [PATCH 1/5] ftrace: Fix updating of filters for shared global_ops filters
  2015-01-15 15:28 [PATCH 0/5] [GIT PULL] ftrace: jprobe, syscall and other fixes Steven Rostedt
@ 2015-01-15 15:28 ` Steven Rostedt
  2015-01-15 15:28 ` [PATCH 2/5] ftrace: Check both notrace and filter for old hash Steven Rostedt
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2015-01-15 15:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Masami Hiramatsu, stable

[-- Attachment #1: 0001-ftrace-Fix-updating-of-filters-for-shared-global_ops.patch --]
[-- Type: text/plain, Size: 2369 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

As the set_ftrace_filter affects both the function tracer as well as the
function graph tracer, the ops that represent each have a shared
ftrace_ops_hash structure. This allows both to be updated when the filter
files are updated.

But if function graph is enabled and the global_ops (function tracing) ops
is not, then it is possible for the filter to be changed without the
update happening for the function graph ops. This will cause the changes
to not take place and may even cause an ftrace_bug to occur, as it could
mess up the trampoline accounting.

The solution is to check whether the ops uses the shared global_ops filter
and, if the ops itself is not enabled, to check whether there is another
ops that is enabled and also shares the global_ops filter. In that case,
the modification still needs to be executed.

Link: http://lkml.kernel.org/r/20150114154329.055980438@goodmis.org

Cc: stable@vger.kernel.org # 3.17+
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 929a733d302e..2b35d0ba578d 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -4008,8 +4008,32 @@ ftrace_match_addr(struct ftrace_hash *hash, unsigned long ip, int remove)
 static void ftrace_ops_update_code(struct ftrace_ops *ops,
 				   struct ftrace_hash *old_hash)
 {
-	if (ops->flags & FTRACE_OPS_FL_ENABLED && ftrace_enabled)
+	struct ftrace_ops *op;
+
+	if (!ftrace_enabled)
+		return;
+
+	if (ops->flags & FTRACE_OPS_FL_ENABLED) {
 		ftrace_run_modify_code(ops, FTRACE_UPDATE_CALLS, old_hash);
+		return;
+	}
+
+	/*
+	 * If this is the shared global_ops filter, then we need to
+	 * check if there is another ops that shares it and is enabled.
+	 * If so, we still need to run the modify code.
+	 */
+	if (ops->func_hash != &global_ops.local_hash)
+		return;
+
+	do_for_each_ftrace_op(op, ftrace_ops_list) {
+		if (op->func_hash == &global_ops.local_hash &&
+		    op->flags & FTRACE_OPS_FL_ENABLED) {
+			ftrace_run_modify_code(op, FTRACE_UPDATE_CALLS, old_hash);
+			/* Only need to do this once */
+			return;
+		}
+	} while_for_each_ftrace_op(op);
 }
 
 static int
-- 
2.1.4




* [PATCH 2/5] ftrace: Check both notrace and filter for old hash
  2015-01-15 15:28 [PATCH 0/5] [GIT PULL] ftrace: jprobe, syscall and other fixes Steven Rostedt
  2015-01-15 15:28 ` [PATCH 1/5] ftrace: Fix updating of filters for shared global_ops filters Steven Rostedt
@ 2015-01-15 15:28 ` Steven Rostedt
  2015-01-15 15:28 ` [PATCH 3/5] ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing Steven Rostedt
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2015-01-15 15:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Masami Hiramatsu, stable

[-- Attachment #1: 0002-ftrace-Check-both-notrace-and-filter-for-old-hash.patch --]
[-- Type: text/plain, Size: 5003 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

Using just the filter hash to check for trampolines or regs is not enough
when updating the code against the records that represent all functions.
Both the filter hash and the notrace hash need to be checked.

To trigger this bug (using trace-cmd and perf):

 # perf probe -a do_fork
 # trace-cmd start -B foo -e probe
 # trace-cmd record -p function_graph -n do_fork sleep 1

The trace-cmd record at the end clears the filter before it disables
function_graph tracing, which causes the accounting of the ftrace
function records to become incorrect and triggers an ftrace_bug().

Link: http://lkml.kernel.org/r/20150114154329.358378039@goodmis.org

Cc: stable@vger.kernel.org
[ still need to switch old_hash_ops to old_ops_hash ]
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 2b35d0ba578d..224e768bdc73 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2497,12 +2497,14 @@ static void ftrace_run_update_code(int command)
 }
 
 static void ftrace_run_modify_code(struct ftrace_ops *ops, int command,
-				   struct ftrace_hash *old_hash)
+				   struct ftrace_ops_hash *old_hash)
 {
 	ops->flags |= FTRACE_OPS_FL_MODIFYING;
-	ops->old_hash.filter_hash = old_hash;
+	ops->old_hash.filter_hash = old_hash->filter_hash;
+	ops->old_hash.notrace_hash = old_hash->notrace_hash;
 	ftrace_run_update_code(command);
 	ops->old_hash.filter_hash = NULL;
+	ops->old_hash.notrace_hash = NULL;
 	ops->flags &= ~FTRACE_OPS_FL_MODIFYING;
 }
 
@@ -3579,7 +3581,7 @@ static struct ftrace_ops trace_probe_ops __read_mostly =
 
 static int ftrace_probe_registered;
 
-static void __enable_ftrace_function_probe(struct ftrace_hash *old_hash)
+static void __enable_ftrace_function_probe(struct ftrace_ops_hash *old_hash)
 {
 	int ret;
 	int i;
@@ -3637,6 +3639,7 @@ int
 register_ftrace_function_probe(char *glob, struct ftrace_probe_ops *ops,
 			      void *data)
 {
+	struct ftrace_ops_hash old_hash_ops;
 	struct ftrace_func_probe *entry;
 	struct ftrace_hash **orig_hash = &trace_probe_ops.func_hash->filter_hash;
 	struct ftrace_hash *old_hash = *orig_hash;
@@ -3658,6 +3661,10 @@ register_ftrace_function_probe(char *glob, struct ftrace_probe_ops *ops,
 
 	mutex_lock(&trace_probe_ops.func_hash->regex_lock);
 
+	old_hash_ops.filter_hash = old_hash;
+	/* Probes only have filters */
+	old_hash_ops.notrace_hash = NULL;
+
 	hash = alloc_and_copy_ftrace_hash(FTRACE_HASH_DEFAULT_BITS, old_hash);
 	if (!hash) {
 		count = -ENOMEM;
@@ -3718,7 +3725,7 @@ register_ftrace_function_probe(char *glob, struct ftrace_probe_ops *ops,
 
 	ret = ftrace_hash_move(&trace_probe_ops, 1, orig_hash, hash);
 
-	__enable_ftrace_function_probe(old_hash);
+	__enable_ftrace_function_probe(&old_hash_ops);
 
 	if (!ret)
 		free_ftrace_hash_rcu(old_hash);
@@ -4006,7 +4013,7 @@ ftrace_match_addr(struct ftrace_hash *hash, unsigned long ip, int remove)
 }
 
 static void ftrace_ops_update_code(struct ftrace_ops *ops,
-				   struct ftrace_hash *old_hash)
+				   struct ftrace_ops_hash *old_hash)
 {
 	struct ftrace_ops *op;
 
@@ -4041,6 +4048,7 @@ ftrace_set_hash(struct ftrace_ops *ops, unsigned char *buf, int len,
 		unsigned long ip, int remove, int reset, int enable)
 {
 	struct ftrace_hash **orig_hash;
+	struct ftrace_ops_hash old_hash_ops;
 	struct ftrace_hash *old_hash;
 	struct ftrace_hash *hash;
 	int ret;
@@ -4077,9 +4085,11 @@ ftrace_set_hash(struct ftrace_ops *ops, unsigned char *buf, int len,
 
 	mutex_lock(&ftrace_lock);
 	old_hash = *orig_hash;
+	old_hash_ops.filter_hash = ops->func_hash->filter_hash;
+	old_hash_ops.notrace_hash = ops->func_hash->notrace_hash;
 	ret = ftrace_hash_move(ops, enable, orig_hash, hash);
 	if (!ret) {
-		ftrace_ops_update_code(ops, old_hash);
+		ftrace_ops_update_code(ops, &old_hash_ops);
 		free_ftrace_hash_rcu(old_hash);
 	}
 	mutex_unlock(&ftrace_lock);
@@ -4291,6 +4301,7 @@ static void __init set_ftrace_early_filters(void)
 int ftrace_regex_release(struct inode *inode, struct file *file)
 {
 	struct seq_file *m = (struct seq_file *)file->private_data;
+	struct ftrace_ops_hash old_hash_ops;
 	struct ftrace_iterator *iter;
 	struct ftrace_hash **orig_hash;
 	struct ftrace_hash *old_hash;
@@ -4324,10 +4335,12 @@ int ftrace_regex_release(struct inode *inode, struct file *file)
 
 		mutex_lock(&ftrace_lock);
 		old_hash = *orig_hash;
+		old_hash_ops.filter_hash = iter->ops->func_hash->filter_hash;
+		old_hash_ops.notrace_hash = iter->ops->func_hash->notrace_hash;
 		ret = ftrace_hash_move(iter->ops, filter_hash,
 				       orig_hash, iter->hash);
 		if (!ret) {
-			ftrace_ops_update_code(iter->ops, old_hash);
+			ftrace_ops_update_code(iter->ops, &old_hash_ops);
 			free_ftrace_hash_rcu(old_hash);
 		}
 		mutex_unlock(&ftrace_lock);
-- 
2.1.4




* [PATCH 3/5] ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing
  2015-01-15 15:28 [PATCH 0/5] [GIT PULL] ftrace: jprobe, syscall and other fixes Steven Rostedt
  2015-01-15 15:28 ` [PATCH 1/5] ftrace: Fix updating of filters for shared global_ops filters Steven Rostedt
  2015-01-15 15:28 ` [PATCH 2/5] ftrace: Check both notrace and filter for old hash Steven Rostedt
@ 2015-01-15 15:28 ` Steven Rostedt
  2015-01-15 15:28 ` [PATCH 4/5] tracing: Remove extra call to init_ftrace_syscalls() Steven Rostedt
  2015-01-15 15:28 ` [PATCH 5/5] tracing: Fix enabling of syscall events on the command line Steven Rostedt
  4 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2015-01-15 15:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Masami Hiramatsu, stable

[-- Attachment #1: 0003-ftrace-jprobes-x86-Fix-conflict-between-jprobes-and-.patch --]
[-- Type: text/plain, Size: 4421 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

If the function graph tracer traces a jprobe callback, the system will
crash. This can easily be demonstrated by compiling the jprobe
sample module that is in the kernel tree, loading it and running the
function graph tracer.

 # modprobe jprobe_example
 # echo function_graph > /sys/kernel/debug/tracing/current_tracer
 # ls

The first two commands end up in a nice crash after the first fork.
(do_fork has a jprobe attached to it, so "ls" just triggers that fork)

The problem is caused by the jprobe_return() that all jprobe callbacks
must end with. The way jprobes works is that the function a jprobe
is attached to has a breakpoint placed at the start of it (or it uses
ftrace if fentry is supported). The breakpoint handler (or ftrace callback)
will copy the stack frame and change the ip address to return to the
jprobe handler instead of the function. The jprobe handler must end
with jprobe_return() which swaps the stack and does an int3 (breakpoint).
This breakpoint handler will then put back the saved stack frame,
simulate the instruction at the beginning of the function it added
a breakpoint to, and then continue on.

For function graph tracing to work, it hijacks the return address from the
stack frame, and replaces it with a hook function that will trace
the end of the call. This hook function restores the return
address of the function call.

If the function graph tracer traces the jprobe handler, the hook function
for that handler will not be called, and its saved return address
will be used for the next function. This will result in a kernel crash.

To solve this, pause function graph tracing before the jprobe handler is
called and unpause it before it returns to the function it probed.

Some other updates:

Used a variable "saved_sp" to hold kcb->jprobe_saved_sp. This makes the
code look a bit cleaner and easier to understand (various attempts to fix
this bug required this change).

Note, if fentry is being used, jprobes will change the ip address before
the function graph tracer runs and it will not be able to trace the
function that the jprobe is probing.

Link: http://lkml.kernel.org/r/20150114154329.552437962@goodmis.org

Cc: stable@vger.kernel.org # 2.6.30+
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/kprobes/core.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index f7e3cd50ece0..98f654d466e5 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -1020,6 +1020,15 @@ int setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
 	regs->flags &= ~X86_EFLAGS_IF;
 	trace_hardirqs_off();
 	regs->ip = (unsigned long)(jp->entry);
+
+	/*
+	 * jprobes use jprobe_return() which skips the normal return
+	 * path of the function, and this messes up the accounting
+	 * of the function graph tracer.
+	 *
+	 * Pause function graph tracing while performing the jprobe function.
+	 */
+	pause_graph_tracing();
 	return 1;
 }
 NOKPROBE_SYMBOL(setjmp_pre_handler);
@@ -1048,24 +1057,25 @@ int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
 	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
 	u8 *addr = (u8 *) (regs->ip - 1);
 	struct jprobe *jp = container_of(p, struct jprobe, kp);
+	void *saved_sp = kcb->jprobe_saved_sp;
 
 	if ((addr > (u8 *) jprobe_return) &&
 	    (addr < (u8 *) jprobe_return_end)) {
-		if (stack_addr(regs) != kcb->jprobe_saved_sp) {
+		if (stack_addr(regs) != saved_sp) {
 			struct pt_regs *saved_regs = &kcb->jprobe_saved_regs;
 			printk(KERN_ERR
 			       "current sp %p does not match saved sp %p\n",
-			       stack_addr(regs), kcb->jprobe_saved_sp);
+			       stack_addr(regs), saved_sp);
 			printk(KERN_ERR "Saved registers for jprobe %p\n", jp);
 			show_regs(saved_regs);
 			printk(KERN_ERR "Current registers\n");
 			show_regs(regs);
 			BUG();
 		}
+		/* It's OK to start function graph tracing again */
+		unpause_graph_tracing();
 		*regs = kcb->jprobe_saved_regs;
-		memcpy((kprobe_opcode_t *)(kcb->jprobe_saved_sp),
-		       kcb->jprobes_stack,
-		       MIN_STACK_SIZE(kcb->jprobe_saved_sp));
+		memcpy(saved_sp, kcb->jprobes_stack, MIN_STACK_SIZE(saved_sp));
 		preempt_enable_no_resched();
 		return 1;
 	}
-- 
2.1.4




* [PATCH 4/5] tracing: Remove extra call to init_ftrace_syscalls()
  2015-01-15 15:28 [PATCH 0/5] [GIT PULL] ftrace: jprobe, syscall and other fixes Steven Rostedt
                   ` (2 preceding siblings ...)
  2015-01-15 15:28 ` [PATCH 3/5] ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing Steven Rostedt
@ 2015-01-15 15:28 ` Steven Rostedt
  2015-01-15 15:28 ` [PATCH 5/5] tracing: Fix enabling of syscall events on the command line Steven Rostedt
  4 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2015-01-15 15:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Masami Hiramatsu, Wang Nan

[-- Attachment #1: 0004-tracing-Remove-extra-call-to-init_ftrace_syscalls.patch --]
[-- Type: text/plain, Size: 955 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

trace_init() calls init_ftrace_syscalls() and then calls trace_event_init()
which also calls init_ftrace_syscalls(). It makes more sense to only
call it from trace_event_init().

Calling it twice wastes memory, as it allocates the syscall metadata twice
and loses the first copy.

Link: http://lkml.kernel.org/r/54AF53BD.5070303@huawei.com
Link: http://lkml.kernel.org/r/20150115040505.930398632@goodmis.org

Reported-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 2e767972e99c..4a9079b9f082 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -6918,7 +6918,6 @@ void __init trace_init(void)
 			tracepoint_printk = 0;
 	}
 	tracer_alloc_buffers();
-	init_ftrace_syscalls();
 	trace_event_init();	
 }
 
-- 
2.1.4




* [PATCH 5/5] tracing: Fix enabling of syscall events on the command line
  2015-01-15 15:28 [PATCH 0/5] [GIT PULL] ftrace: jprobe, syscall and other fixes Steven Rostedt
                   ` (3 preceding siblings ...)
  2015-01-15 15:28 ` [PATCH 4/5] tracing: Remove extra call to init_ftrace_syscalls() Steven Rostedt
@ 2015-01-15 15:28 ` Steven Rostedt
  4 siblings, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2015-01-15 15:28 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Michael Ellerman

[-- Attachment #1: 0005-tracing-Fix-enabling-of-syscall-events-on-the-comman.patch --]
[-- Type: text/plain, Size: 3881 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

Commit 5f893b2639b2 "tracing: Move enabling tracepoints to just after
rcu_init()" broke the enabling of system call events from the command
line. The reason was that the enabling of command line trace events
was moved to before PID 1 started, and the syscall tracepoints require
that all tasks have the TIF_SYSCALL_TRACEPOINT flag set. But the
swapper task (PID 0) is not part of that walk. Since the swapper task is
the only task running this early in boot, no task gets the flag set, and
the tracepoint is never reached.

Instead of setting the swapper task's flag (there should be no reason to
do that), re-enable trace events again after the init thread (PID 1)
has been started. This requires disabling all command line events and
re-enabling them, as just enabling them again will not reset the logic
that sets the TIF_SYSCALL_TRACEPOINT flag: the syscall tracepoint will
be fooled into thinking the flag was already set, and won't try setting
it again. For this reason, the events must first be disabled and then
re-enabled.

Link: http://lkml.kernel.org/r/1421188517-18312-1-git-send-email-mpe@ellerman.id.au
Link: http://lkml.kernel.org/r/20150115040506.216066449@goodmis.org

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/trace_events.c | 69 ++++++++++++++++++++++++++++++++++++---------
 1 file changed, 55 insertions(+), 14 deletions(-)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 366a78a3e61e..b03a0ea77b99 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2429,12 +2429,39 @@ static __init int event_trace_memsetup(void)
 	return 0;
 }
 
+static __init void
+early_enable_events(struct trace_array *tr, bool disable_first)
+{
+	char *buf = bootup_event_buf;
+	char *token;
+	int ret;
+
+	while (true) {
+		token = strsep(&buf, ",");
+
+		if (!token)
+			break;
+		if (!*token)
+			continue;
+
+		/* Restarting syscalls requires that we stop them first */
+		if (disable_first)
+			ftrace_set_clr_event(tr, token, 0);
+
+		ret = ftrace_set_clr_event(tr, token, 1);
+		if (ret)
+			pr_warn("Failed to enable trace event: %s\n", token);
+
+		/* Put back the comma to allow this to be called again */
+		if (buf)
+			*(buf - 1) = ',';
+	}
+}
+
 static __init int event_trace_enable(void)
 {
 	struct trace_array *tr = top_trace_array();
 	struct ftrace_event_call **iter, *call;
-	char *buf = bootup_event_buf;
-	char *token;
 	int ret;
 
 	if (!tr)
@@ -2456,18 +2483,7 @@ static __init int event_trace_enable(void)
 	 */
 	__trace_early_add_events(tr);
 
-	while (true) {
-		token = strsep(&buf, ",");
-
-		if (!token)
-			break;
-		if (!*token)
-			continue;
-
-		ret = ftrace_set_clr_event(tr, token, 1);
-		if (ret)
-			pr_warn("Failed to enable trace event: %s\n", token);
-	}
+	early_enable_events(tr, false);
 
 	trace_printk_start_comm();
 
@@ -2478,6 +2494,31 @@ static __init int event_trace_enable(void)
 	return 0;
 }
 
+/*
+ * event_trace_enable() is called from trace_event_init() first to
+ * initialize events and perhaps start any events that are on the
+ * command line. Unfortunately, there are some events that will not
+ * start this early, like the system call tracepoints that need
+ * to set the TIF_SYSCALL_TRACEPOINT flag of pid 1. But event_trace_enable()
+ * is called before pid 1 starts, and this flag is never set, making
+ * the syscall tracepoint never get reached, but the event is enabled
+ * regardless (and not doing anything).
+ */
+static __init int event_trace_enable_again(void)
+{
+	struct trace_array *tr;
+
+	tr = top_trace_array();
+	if (!tr)
+		return -ENODEV;
+
+	early_enable_events(tr, true);
+
+	return 0;
+}
+
+early_initcall(event_trace_enable_again);
+
 static __init int event_trace_init(void)
 {
 	struct trace_array *tr;
-- 
2.1.4



