* [PATCH 0/3] ftrace/kprobes/x86:
@ 2015-01-14 15:39 Steven Rostedt
  2015-01-14 15:39 ` [PATCH 1/3] ftrace: Fix updating of filters for shared global_ops filters Steven Rostedt
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Steven Rostedt @ 2015-01-14 15:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, vvs,
	H. Peter Anvin, Thomas Gleixner, Borislav Petkov, x86-ml


Vasily Averin reported that combining jprobes with function tracing
resulted in a kernel crash.

I first thought this was related to the fentry code and wrote up a few
nice (albeit complex) patches that could handle it. But then during my
testing I discovered that it also affected non-fentry code and i386.
I had to revert all my changes and just disable function graph tracing
before the jprobe is called, and re-enable it afterward. This seems
to be the easiest fix, although if fentry is enabled, we lose the
tracing of the function that was being probed. But since it would have
crashed otherwise, this solution is a major improvement.
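The pause/unpause approach described above can be sketched in plain userspace C. This is only an illustration of the idea (all names here are invented for the sketch, not the kernel's actual implementation): while tracing is paused, the graph tracer records nothing, so a handler that never returns normally cannot leave a stale entry behind.

```c
#include <assert.h>

/* Illustrative model: a pause flag the (modeled) graph tracer checks
 * before recording a function entry on its shadow stack. */
static int graph_paused;
static int shadow_stack_depth;

static void pause_graph(void)   { graph_paused++; }
static void unpause_graph(void) { graph_paused--; }

/* Called on function entry by the modeled graph tracer. */
static void graph_trace_entry(void)
{
	if (graph_paused)
		return;		/* nothing is recorded while paused */
	shadow_stack_depth++;
}

/* A jprobe handler never returns normally, so tracing it would leak a
 * shadow-stack entry; pausing around the handler avoids that. */
static int run_jprobe_handler(void)
{
	pause_graph();
	graph_trace_entry();	/* would have pushed an entry if not paused */
	/* handler body ... ends in jprobe_return(), no normal return */
	unpause_graph();
	return shadow_stack_depth;
}
```

With the pause in place the modeled shadow stack stays balanced even though the handler's "return" never runs.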

While debugging this I also uncovered some other bugs in the function
graph trampoline logic. Nothing too major, but worthy enough to be fixed
now and sent to stable.

I can still use the patches I have done for fentry to let the probed
function be traced (as well as the jprobe handlers) if fentry is being
used. But that is something for the next merge window and not needed now.

Steven Rostedt (Red Hat) (3):
      ftrace: Fix updating of filters for shared global_ops filters
      ftrace: Check both notrace and filter for old hash
      ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing

----
 arch/x86/kernel/kprobes/core.c | 20 ++++++++++++----
 kernel/trace/ftrace.c          | 53 +++++++++++++++++++++++++++++++++++-------
 2 files changed, 60 insertions(+), 13 deletions(-)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/3] ftrace: Fix updating of filters for shared global_ops filters
  2015-01-14 15:39 [PATCH 0/3] ftrace/kprobes/x86: Steven Rostedt
@ 2015-01-14 15:39 ` Steven Rostedt
  2015-01-15  8:05   ` Masami Hiramatsu
  2015-01-14 15:40 ` [PATCH 2/3] ftrace: Check both notrace and filter for old hash Steven Rostedt
  2015-01-14 15:40 ` [PATCH 3/3] ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing Steven Rostedt
  2 siblings, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2015-01-14 15:39 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, vvs,
	H. Peter Anvin, Thomas Gleixner, Borislav Petkov, x86-ml, stable

[-- Attachment #1: 0001-ftrace-Fix-updating-of-filters-for-shared-global_ops.patch --]
[-- Type: text/plain, Size: 2236 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

As the set_ftrace_filter file affects both the function tracer and the
function graph tracer, the ops that represent each have a shared
ftrace_ops_hash structure. This allows both to be updated when the filter
files are updated.

But if function graph is enabled and the global_ops (function tracing) ops
is not, then it is possible for the filter to be changed without the
update happening for the function graph ops. The changes will then not
take place, and an ftrace_bug may even occur, as the trampoline accounting
can be thrown off.

The solution is to check whether the ops uses the shared global_ops
filter and, if the ops itself is not enabled, whether there is another
enabled ops that also shares the global_ops filter. In that case, the
modification still needs to be executed.
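As a rough userspace model of that check (simplified stand-in structures, not the kernel's ftrace_ops), the logic reduces to: update the ops itself if enabled; otherwise, if it shares the global hash, update any one enabled ops that shares it too.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for ftrace_ops and its hash; names are illustrative. */
struct hash { int id; };
struct ops {
	struct hash *func_hash;
	int enabled;
	struct ops *next;
};

static struct hash global_hash = { 1 };

/* Returns the ops the modification should run against, or NULL. */
static struct ops *ops_to_update(struct ops *ops, struct ops *ops_list)
{
	struct ops *op;

	if (ops->enabled)
		return ops;		/* normal case: update this ops */

	if (ops->func_hash != &global_hash)
		return NULL;		/* private hash, nothing else shares it */

	/* Shared global hash: an enabled ops sharing it still needs
	 * the update; one is enough. */
	for (op = ops_list; op; op = op->next)
		if (op->func_hash == &global_hash && op->enabled)
			return op;
	return NULL;
}
```

In the bug scenario, the function-tracing ops is disabled while the graph ops (sharing the same hash) is enabled, so the model returns the graph ops instead of skipping the update.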

Cc: stable@vger.kernel.org # 3.17+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 929a733d302e..2b35d0ba578d 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -4008,8 +4008,32 @@ ftrace_match_addr(struct ftrace_hash *hash, unsigned long ip, int remove)
 static void ftrace_ops_update_code(struct ftrace_ops *ops,
 				   struct ftrace_hash *old_hash)
 {
-	if (ops->flags & FTRACE_OPS_FL_ENABLED && ftrace_enabled)
+	struct ftrace_ops *op;
+
+	if (!ftrace_enabled)
+		return;
+
+	if (ops->flags & FTRACE_OPS_FL_ENABLED) {
 		ftrace_run_modify_code(ops, FTRACE_UPDATE_CALLS, old_hash);
+		return;
+	}
+
+	/*
+	 * If this is the shared global_ops filter, then we need to
+	 * check if there is another ops that shares it and is enabled.
+	 * If so, we still need to run the modify code.
+	 */
+	if (ops->func_hash != &global_ops.local_hash)
+		return;
+
+	do_for_each_ftrace_op(op, ftrace_ops_list) {
+		if (op->func_hash == &global_ops.local_hash &&
+		    op->flags & FTRACE_OPS_FL_ENABLED) {
+			ftrace_run_modify_code(op, FTRACE_UPDATE_CALLS, old_hash);
+			/* Only need to do this once */
+			return;
+		}
+	} while_for_each_ftrace_op(op);
 }
 
 static int
-- 
2.1.4




* [PATCH 2/3] ftrace: Check both notrace and filter for old hash
  2015-01-14 15:39 [PATCH 0/3] ftrace/kprobes/x86: Steven Rostedt
  2015-01-14 15:39 ` [PATCH 1/3] ftrace: Fix updating of filters for shared global_ops filters Steven Rostedt
@ 2015-01-14 15:40 ` Steven Rostedt
  2015-01-15 10:59   ` Masami Hiramatsu
  2015-01-14 15:40 ` [PATCH 3/3] ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing Steven Rostedt
  2 siblings, 1 reply; 11+ messages in thread
From: Steven Rostedt @ 2015-01-14 15:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, vvs,
	H. Peter Anvin, Thomas Gleixner, Borislav Petkov, x86-ml, stable

[-- Attachment #1: 0002-ftrace-Check-both-notrace-and-filter-for-old-hash.patch --]
[-- Type: text/plain, Size: 4816 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

Using just the filter to check for trampolines or regs is not enough
when updating the code against the records that represent all functions.
Both the filter hash and the notrace hash need to be checked.

To trigger this bug (using trace-cmd and perf):

 # perf probe -a do_fork
 # trace-cmd start -B foo -e probe
 # trace-cmd record -p function_graph -n do_fork sleep 1

The trace-cmd record at the end clears the filter before it disables
function_graph tracing, and that causes the accounting of the ftrace
function records to become incorrect, triggering an ftrace bug.
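The two-hash semantics behind this fix can be sketched with a toy model (illustrative types, not the kernel's hash structures): a function is traced when it matches the filter hash (an empty filter means "all functions") and does not match the notrace hash, so checking only one of the two can misjudge which records are affected.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model of ftrace's two-hash filtering. */
struct toy_hash { const char **funcs; int count; };

static bool hash_contains(const struct toy_hash *h, const char *fn)
{
	for (int i = 0; i < h->count; i++)
		if (strcmp(h->funcs[i], fn) == 0)
			return true;
	return false;
}

static bool function_traced(const struct toy_hash *filter,
			    const struct toy_hash *notrace,
			    const char *fn)
{
	if (filter->count && !hash_contains(filter, fn))
		return false;			/* not in a non-empty filter */
	return !hash_contains(notrace, fn);	/* notrace always wins */
}
```

An update that only consults the filter hash would treat a notrace'd function as traced, which is the kind of accounting mismatch this patch closes.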

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 2b35d0ba578d..224e768bdc73 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2497,12 +2497,14 @@ static void ftrace_run_update_code(int command)
 }
 
 static void ftrace_run_modify_code(struct ftrace_ops *ops, int command,
-				   struct ftrace_hash *old_hash)
+				   struct ftrace_ops_hash *old_hash)
 {
 	ops->flags |= FTRACE_OPS_FL_MODIFYING;
-	ops->old_hash.filter_hash = old_hash;
+	ops->old_hash.filter_hash = old_hash->filter_hash;
+	ops->old_hash.notrace_hash = old_hash->notrace_hash;
 	ftrace_run_update_code(command);
 	ops->old_hash.filter_hash = NULL;
+	ops->old_hash.notrace_hash = NULL;
 	ops->flags &= ~FTRACE_OPS_FL_MODIFYING;
 }
 
@@ -3579,7 +3581,7 @@ static struct ftrace_ops trace_probe_ops __read_mostly =
 
 static int ftrace_probe_registered;
 
-static void __enable_ftrace_function_probe(struct ftrace_hash *old_hash)
+static void __enable_ftrace_function_probe(struct ftrace_ops_hash *old_hash)
 {
 	int ret;
 	int i;
@@ -3637,6 +3639,7 @@ int
 register_ftrace_function_probe(char *glob, struct ftrace_probe_ops *ops,
 			      void *data)
 {
+	struct ftrace_ops_hash old_hash_ops;
 	struct ftrace_func_probe *entry;
 	struct ftrace_hash **orig_hash = &trace_probe_ops.func_hash->filter_hash;
 	struct ftrace_hash *old_hash = *orig_hash;
@@ -3658,6 +3661,10 @@ register_ftrace_function_probe(char *glob, struct ftrace_probe_ops *ops,
 
 	mutex_lock(&trace_probe_ops.func_hash->regex_lock);
 
+	old_hash_ops.filter_hash = old_hash;
+	/* Probes only have filters */
+	old_hash_ops.notrace_hash = NULL;
+
 	hash = alloc_and_copy_ftrace_hash(FTRACE_HASH_DEFAULT_BITS, old_hash);
 	if (!hash) {
 		count = -ENOMEM;
@@ -3718,7 +3725,7 @@ register_ftrace_function_probe(char *glob, struct ftrace_probe_ops *ops,
 
 	ret = ftrace_hash_move(&trace_probe_ops, 1, orig_hash, hash);
 
-	__enable_ftrace_function_probe(old_hash);
+	__enable_ftrace_function_probe(&old_hash_ops);
 
 	if (!ret)
 		free_ftrace_hash_rcu(old_hash);
@@ -4006,7 +4013,7 @@ ftrace_match_addr(struct ftrace_hash *hash, unsigned long ip, int remove)
 }
 
 static void ftrace_ops_update_code(struct ftrace_ops *ops,
-				   struct ftrace_hash *old_hash)
+				   struct ftrace_ops_hash *old_hash)
 {
 	struct ftrace_ops *op;
 
@@ -4041,6 +4048,7 @@ ftrace_set_hash(struct ftrace_ops *ops, unsigned char *buf, int len,
 		unsigned long ip, int remove, int reset, int enable)
 {
 	struct ftrace_hash **orig_hash;
+	struct ftrace_ops_hash old_hash_ops;
 	struct ftrace_hash *old_hash;
 	struct ftrace_hash *hash;
 	int ret;
@@ -4077,9 +4085,11 @@ ftrace_set_hash(struct ftrace_ops *ops, unsigned char *buf, int len,
 
 	mutex_lock(&ftrace_lock);
 	old_hash = *orig_hash;
+	old_hash_ops.filter_hash = ops->func_hash->filter_hash;
+	old_hash_ops.notrace_hash = ops->func_hash->notrace_hash;
 	ret = ftrace_hash_move(ops, enable, orig_hash, hash);
 	if (!ret) {
-		ftrace_ops_update_code(ops, old_hash);
+		ftrace_ops_update_code(ops, &old_hash_ops);
 		free_ftrace_hash_rcu(old_hash);
 	}
 	mutex_unlock(&ftrace_lock);
@@ -4291,6 +4301,7 @@ static void __init set_ftrace_early_filters(void)
 int ftrace_regex_release(struct inode *inode, struct file *file)
 {
 	struct seq_file *m = (struct seq_file *)file->private_data;
+	struct ftrace_ops_hash old_hash_ops;
 	struct ftrace_iterator *iter;
 	struct ftrace_hash **orig_hash;
 	struct ftrace_hash *old_hash;
@@ -4324,10 +4335,12 @@ int ftrace_regex_release(struct inode *inode, struct file *file)
 
 		mutex_lock(&ftrace_lock);
 		old_hash = *orig_hash;
+		old_hash_ops.filter_hash = iter->ops->func_hash->filter_hash;
+		old_hash_ops.notrace_hash = iter->ops->func_hash->notrace_hash;
 		ret = ftrace_hash_move(iter->ops, filter_hash,
 				       orig_hash, iter->hash);
 		if (!ret) {
-			ftrace_ops_update_code(iter->ops, old_hash);
+			ftrace_ops_update_code(iter->ops, &old_hash_ops);
 			free_ftrace_hash_rcu(old_hash);
 		}
 		mutex_unlock(&ftrace_lock);
-- 
2.1.4




* [PATCH 3/3] ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing
  2015-01-14 15:39 [PATCH 0/3] ftrace/kprobes/x86: Steven Rostedt
  2015-01-14 15:39 ` [PATCH 1/3] ftrace: Fix updating of filters for shared global_ops filters Steven Rostedt
  2015-01-14 15:40 ` [PATCH 2/3] ftrace: Check both notrace and filter for old hash Steven Rostedt
@ 2015-01-14 15:40 ` Steven Rostedt
  2015-01-14 16:55   ` Borislav Petkov
  2015-01-15 11:57   ` Masami Hiramatsu
  2 siblings, 2 replies; 11+ messages in thread
From: Steven Rostedt @ 2015-01-14 15:40 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, vvs,
	H. Peter Anvin, Thomas Gleixner, Borislav Petkov, x86-ml, stable

[-- Attachment #1: 0003-ftrace-jprobes-x86-Fix-conflict-between-jprobes-and-.patch --]
[-- Type: text/plain, Size: 4346 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

If the function graph tracer traces a jprobe callback, the system will
crash. This can easily be demonstrated by compiling the jprobe
sample module that is in the kernel tree, loading it and running the
function graph tracer.

 # modprobe jprobe_example.ko
 # echo function_graph > /sys/kernel/debug/tracing/current_tracer
 # ls

The first two commands end up in a nice crash after the first fork.
(do_fork has a jprobe attached to it, so "ls" just triggers that fork)

The problem is caused by the jprobe_return() that all jprobe callbacks
must end with. The way jprobes work is that the function a jprobe
is attached to has a breakpoint placed at the start of it (or it uses
ftrace if fentry is supported). The breakpoint handler (or ftrace callback)
will copy the stack frame and change the ip address to return to the
jprobe handler instead of the function. The jprobe handler must end
with jprobe_return() which swaps the stack and does an int3 (breakpoint).
This breakpoint handler will then put back the saved stack frame,
simulate the instruction at the beginning of the function it added
a breakpoint to, and then continue on.

For function tracing to work, it hijacks the return address from the
stack frame, and replaces it with a hook function that will trace
the end of the call. This hook function will restore the return
address of the function call.

If the function tracer traces the jprobe handler, the hook function
for that handler will not be called, and its saved return address
will be used for the next function. This will result in a kernel crash.

To solve this, pause function tracing before the jprobe handler is called
and unpause it before it returns back to the function it probed.
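The corruption described above can be replayed with a tiny userspace model of the tracer's return-address shadow stack (addresses are just integers here; this is a sketch, not kernel code): the handler's entry is pushed but never popped, so a later return pops the wrong saved address.

```c
#include <assert.h>

/* Model of the graph tracer's shadow stack of real return addresses:
 * an entry is pushed when a traced function is entered, and popped by
 * the hook when that function returns normally. */
#define DEPTH 8
static unsigned long ret_stack[DEPTH];
static int ret_top;

static void hook_entry(unsigned long real_ret)	/* replaces on-stack ret */
{
	ret_stack[ret_top++] = real_ret;
}

static unsigned long hook_return(void)		/* restores it on return */
{
	return ret_stack[--ret_top];
}
```

In the bug scenario: the probed function pushes its real return address, the traced jprobe handler pushes another entry, but the handler exits via jprobe_return()/int3 so its hook never runs. The probed function's eventual return then pops the handler's stale entry instead of its own, which is the kernel crash.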

Some other updates:

Used a variable "saved_sp" to hold kcb->jprobe_saved_sp. This makes the
code look a bit cleaner and easier to understand (various tries to fix
this bug required this change).

Note, if fentry is being used, jprobes will change the ip address before
the function graph tracer runs and it will not be able to trace the
function that the jprobe is probing.

Cc: stable@vger.kernel.org # 2.6.30+
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/kprobes/core.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index f7e3cd50ece0..98f654d466e5 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -1020,6 +1020,15 @@ int setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
 	regs->flags &= ~X86_EFLAGS_IF;
 	trace_hardirqs_off();
 	regs->ip = (unsigned long)(jp->entry);
+
+	/*
+	 * jprobes use jprobe_return(), which skips the normal return
+	 * path of the function and messes up the accounting of the
+	 * function graph tracer.
+	 *
+	 * Pause function graph tracing while performing the jprobe function.
+	 */
+	pause_graph_tracing();
 	return 1;
 }
 NOKPROBE_SYMBOL(setjmp_pre_handler);
@@ -1048,24 +1057,25 @@ int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
 	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
 	u8 *addr = (u8 *) (regs->ip - 1);
 	struct jprobe *jp = container_of(p, struct jprobe, kp);
+	void *saved_sp = kcb->jprobe_saved_sp;
 
 	if ((addr > (u8 *) jprobe_return) &&
 	    (addr < (u8 *) jprobe_return_end)) {
-		if (stack_addr(regs) != kcb->jprobe_saved_sp) {
+		if (stack_addr(regs) != saved_sp) {
 			struct pt_regs *saved_regs = &kcb->jprobe_saved_regs;
 			printk(KERN_ERR
 			       "current sp %p does not match saved sp %p\n",
-			       stack_addr(regs), kcb->jprobe_saved_sp);
+			       stack_addr(regs), saved_sp);
 			printk(KERN_ERR "Saved registers for jprobe %p\n", jp);
 			show_regs(saved_regs);
 			printk(KERN_ERR "Current registers\n");
 			show_regs(regs);
 			BUG();
 		}
+		/* It's OK to start function graph tracing again */
+		unpause_graph_tracing();
 		*regs = kcb->jprobe_saved_regs;
-		memcpy((kprobe_opcode_t *)(kcb->jprobe_saved_sp),
-		       kcb->jprobes_stack,
-		       MIN_STACK_SIZE(kcb->jprobe_saved_sp));
+		memcpy(saved_sp, kcb->jprobes_stack, MIN_STACK_SIZE(saved_sp));
 		preempt_enable_no_resched();
 		return 1;
 	}
-- 
2.1.4




* Re: [PATCH 3/3] ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing
  2015-01-14 15:40 ` [PATCH 3/3] ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing Steven Rostedt
@ 2015-01-14 16:55   ` Borislav Petkov
  2015-01-14 17:05     ` Steven Rostedt
  2015-01-15 11:57   ` Masami Hiramatsu
  1 sibling, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2015-01-14 16:55 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu, vvs,
	H. Peter Anvin, Thomas Gleixner, x86-ml, stable

On Wed, Jan 14, 2015 at 10:40:01AM -0500, Steven Rostedt wrote:
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> If the function graph tracer traces a jprobe callback, the system will
> crash. This can easily be demonstrated by compiling the jprobe
> sample module that is in the kernel tree, loading it and running the
> function graph tracer.
> 
>  # modprobe jprobe_example.ko
>  # echo function_graph > /sys/kernel/debug/tracing/current_tracer
>  # ls
> 
> The first two commands end up in a nice crash after the first fork.
> (do_fork has a jprobe attached to it, so "ls" just triggers that fork)
> 
> The problem is caused by the jprobe_return() that all jprobe callbacks
> must end with. The way jprobes work is that the function a jprobe
> is attached to has a breakpoint placed at the start of it (or it uses
> ftrace if fentry is supported). The breakpoint handler (or ftrace callback)
> will copy the stack frame and change the ip address to return to the
> jprobe handler instead of the function. The jprobe handler must end
> with jprobe_return() which swaps the stack and does an int3 (breakpoint).
> This breakpoint handler will then put back the saved stack frame,
> simulate the instruction at the beginning of the function it added
> a breakpoint to, and then continue on.
> 
> For function tracing to work, it hijacks the return address from the
> stack frame, and replaces it with a hook function that will trace
> the end of the call. This hook function will restore the return
> address of the function call.
> 
> If the function tracer traces the jprobe handler, the hook function
> for that handler will not be called, and its saved return address
> will be used for the next function. This will result in a kernel crash.
> 
> To solve this, pause function tracing before the jprobe handler is called
> and unpause it before it returns back to the function it probed.

Err, stupid question: marking the jprobe handler "notrace" doesn't help?

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: [PATCH 3/3] ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing
  2015-01-14 16:55   ` Borislav Petkov
@ 2015-01-14 17:05     ` Steven Rostedt
  0 siblings, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2015-01-14 17:05 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu, vvs,
	H. Peter Anvin, Thomas Gleixner, x86-ml, stable

On Wed, 14 Jan 2015 17:55:37 +0100
Borislav Petkov <bp@alien8.de> wrote:

 
> Err, stupid question: marking the jprobe handler "notrace" doesn't help?
> 

A few reasons.

One, that would require all users to mark their handlers as "notrace".
That's not very reliable. Not to mention, I still work at Red Hat and
we have this KABI thingy, where the jprobe modules don't need to change
and we still need to fix it.

This change is much more robust than expecting jprobe callers to add
notrace.

Two, I HATE when a notrace is added for function graph tracing that
is not needed for function tracing. As I told Masami, every "notrace"
added to the kernel makes function tracing that much more useless.

Function tracing should be allowed to debug jprobes.

Three, I have a patch that lets this all work if the kprobes/jprobes
use the ftrace fentry infrastructure (the work I originally did). Why
break everything for something that requires jprobe users to do things
correctly?

-- Steve


* Re: [PATCH 1/3] ftrace: Fix updating of filters for shared global_ops filters
  2015-01-14 15:39 ` [PATCH 1/3] ftrace: Fix updating of filters for shared global_ops filters Steven Rostedt
@ 2015-01-15  8:05   ` Masami Hiramatsu
  0 siblings, 0 replies; 11+ messages in thread
From: Masami Hiramatsu @ 2015-01-15  8:05 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, vvs, H. Peter Anvin,
	Thomas Gleixner, Borislav Petkov, x86-ml, stable

(2015/01/15 0:39), Steven Rostedt wrote:
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> As the set_ftrace_filter file affects both the function tracer and the
> function graph tracer, the ops that represent each have a shared
> ftrace_ops_hash structure. This allows both to be updated when the filter
> files are updated.
> 
> But if function graph is enabled and the global_ops (function tracing) ops
> is not, then it is possible for the filter to be changed without the
> update happening for the function graph ops. The changes will then not
> take place, and an ftrace_bug may even occur, as the trampoline accounting
> can be thrown off.
> 
> The solution is to check whether the ops uses the shared global_ops
> filter and, if the ops itself is not enabled, whether there is another
> enabled ops that also shares the global_ops filter. In that case, the
> modification still needs to be executed.
> 

Looks good to me.

Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

Thank you,

> Cc: stable@vger.kernel.org # 3.17+
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  kernel/trace/ftrace.c | 26 +++++++++++++++++++++++++-
>  1 file changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 929a733d302e..2b35d0ba578d 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -4008,8 +4008,32 @@ ftrace_match_addr(struct ftrace_hash *hash, unsigned long ip, int remove)
>  static void ftrace_ops_update_code(struct ftrace_ops *ops,
>  				   struct ftrace_hash *old_hash)
>  {
> -	if (ops->flags & FTRACE_OPS_FL_ENABLED && ftrace_enabled)
> +	struct ftrace_ops *op;
> +
> +	if (!ftrace_enabled)
> +		return;
> +
> +	if (ops->flags & FTRACE_OPS_FL_ENABLED) {
>  		ftrace_run_modify_code(ops, FTRACE_UPDATE_CALLS, old_hash);
> +		return;
> +	}
> +
> +	/*
> +	 * If this is the shared global_ops filter, then we need to
> +	 * check if there is another ops that shares it and is enabled.
> +	 * If so, we still need to run the modify code.
> +	 */
> +	if (ops->func_hash != &global_ops.local_hash)
> +		return;
> +
> +	do_for_each_ftrace_op(op, ftrace_ops_list) {
> +		if (op->func_hash == &global_ops.local_hash &&
> +		    op->flags & FTRACE_OPS_FL_ENABLED) {
> +			ftrace_run_modify_code(op, FTRACE_UPDATE_CALLS, old_hash);
> +			/* Only need to do this once */
> +			return;
> +		}
> +	} while_for_each_ftrace_op(op);
>  }
>  
>  static int

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com




* Re: [PATCH 2/3] ftrace: Check both notrace and filter for old hash
  2015-01-14 15:40 ` [PATCH 2/3] ftrace: Check both notrace and filter for old hash Steven Rostedt
@ 2015-01-15 10:59   ` Masami Hiramatsu
  2015-01-15 14:04     ` Steven Rostedt
  0 siblings, 1 reply; 11+ messages in thread
From: Masami Hiramatsu @ 2015-01-15 10:59 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, vvs, H. Peter Anvin,
	Thomas Gleixner, Borislav Petkov, x86-ml, stable

(2015/01/15 0:40), Steven Rostedt wrote:
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> Using just the filter for checking for trampolines or regs is not enough
> when updating the code against the records that represent all functions.
> Both the filter hash and the notrace hash need to be checked.
> 
> To trigger this bug (using trace-cmd and perf):
> 
>  # perf probe -a do_fork
>  # trace-cmd start -B foo -e probe
>  # trace-cmd record -p function_graph -n do_fork sleep 1
> 
> The trace-cmd record at the end clears the filter before it disables
> function_graph tracing, and that causes the accounting of the ftrace
> function records to become incorrect, triggering an ftrace bug.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  kernel/trace/ftrace.c | 27 ++++++++++++++++++++-------
>  1 file changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 2b35d0ba578d..224e768bdc73 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2497,12 +2497,14 @@ static void ftrace_run_update_code(int command)
>  }
>  
>  static void ftrace_run_modify_code(struct ftrace_ops *ops, int command,
> -				   struct ftrace_hash *old_hash)
> +				   struct ftrace_ops_hash *old_hash)
>  {
>  	ops->flags |= FTRACE_OPS_FL_MODIFYING;
> -	ops->old_hash.filter_hash = old_hash;
> +	ops->old_hash.filter_hash = old_hash->filter_hash;
> +	ops->old_hash.notrace_hash = old_hash->notrace_hash;
>  	ftrace_run_update_code(command);
>  	ops->old_hash.filter_hash = NULL;
> +	ops->old_hash.notrace_hash = NULL;
>  	ops->flags &= ~FTRACE_OPS_FL_MODIFYING;
>  }
>  
> @@ -3579,7 +3581,7 @@ static struct ftrace_ops trace_probe_ops __read_mostly =
>  
>  static int ftrace_probe_registered;
>  
> -static void __enable_ftrace_function_probe(struct ftrace_hash *old_hash)
> +static void __enable_ftrace_function_probe(struct ftrace_ops_hash *old_hash)
>  {
>  	int ret;
>  	int i;
> @@ -3637,6 +3639,7 @@ int
>  register_ftrace_function_probe(char *glob, struct ftrace_probe_ops *ops,
>  			      void *data)
>  {
> +	struct ftrace_ops_hash old_hash_ops;

Would it be better to be old_ops_hash? since it's not an ops.

Other parts are OK for me :)

Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

Thank you,

>  	struct ftrace_func_probe *entry;
>  	struct ftrace_hash **orig_hash = &trace_probe_ops.func_hash->filter_hash;
>  	struct ftrace_hash *old_hash = *orig_hash;
> @@ -3658,6 +3661,10 @@ register_ftrace_function_probe(char *glob, struct ftrace_probe_ops *ops,
>  
>  	mutex_lock(&trace_probe_ops.func_hash->regex_lock);
>  
> +	old_hash_ops.filter_hash = old_hash;
> +	/* Probes only have filters */
> +	old_hash_ops.notrace_hash = NULL;
> +
>  	hash = alloc_and_copy_ftrace_hash(FTRACE_HASH_DEFAULT_BITS, old_hash);
>  	if (!hash) {
>  		count = -ENOMEM;
> @@ -3718,7 +3725,7 @@ register_ftrace_function_probe(char *glob, struct ftrace_probe_ops *ops,
>  
>  	ret = ftrace_hash_move(&trace_probe_ops, 1, orig_hash, hash);
>  
> -	__enable_ftrace_function_probe(old_hash);
> +	__enable_ftrace_function_probe(&old_hash_ops);
>  
>  	if (!ret)
>  		free_ftrace_hash_rcu(old_hash);
> @@ -4006,7 +4013,7 @@ ftrace_match_addr(struct ftrace_hash *hash, unsigned long ip, int remove)
>  }
>  
>  static void ftrace_ops_update_code(struct ftrace_ops *ops,
> -				   struct ftrace_hash *old_hash)
> +				   struct ftrace_ops_hash *old_hash)
>  {
>  	struct ftrace_ops *op;
>  
> @@ -4041,6 +4048,7 @@ ftrace_set_hash(struct ftrace_ops *ops, unsigned char *buf, int len,
>  		unsigned long ip, int remove, int reset, int enable)
>  {
>  	struct ftrace_hash **orig_hash;
> +	struct ftrace_ops_hash old_hash_ops;
>  	struct ftrace_hash *old_hash;
>  	struct ftrace_hash *hash;
>  	int ret;
> @@ -4077,9 +4085,11 @@ ftrace_set_hash(struct ftrace_ops *ops, unsigned char *buf, int len,
>  
>  	mutex_lock(&ftrace_lock);
>  	old_hash = *orig_hash;
> +	old_hash_ops.filter_hash = ops->func_hash->filter_hash;
> +	old_hash_ops.notrace_hash = ops->func_hash->notrace_hash;
>  	ret = ftrace_hash_move(ops, enable, orig_hash, hash);
>  	if (!ret) {
> -		ftrace_ops_update_code(ops, old_hash);
> +		ftrace_ops_update_code(ops, &old_hash_ops);
>  		free_ftrace_hash_rcu(old_hash);
>  	}
>  	mutex_unlock(&ftrace_lock);
> @@ -4291,6 +4301,7 @@ static void __init set_ftrace_early_filters(void)
>  int ftrace_regex_release(struct inode *inode, struct file *file)
>  {
>  	struct seq_file *m = (struct seq_file *)file->private_data;
> +	struct ftrace_ops_hash old_hash_ops;
>  	struct ftrace_iterator *iter;
>  	struct ftrace_hash **orig_hash;
>  	struct ftrace_hash *old_hash;
> @@ -4324,10 +4335,12 @@ int ftrace_regex_release(struct inode *inode, struct file *file)
>  
>  		mutex_lock(&ftrace_lock);
>  		old_hash = *orig_hash;
> +		old_hash_ops.filter_hash = iter->ops->func_hash->filter_hash;
> +		old_hash_ops.notrace_hash = iter->ops->func_hash->notrace_hash;
>  		ret = ftrace_hash_move(iter->ops, filter_hash,
>  				       orig_hash, iter->hash);
>  		if (!ret) {
> -			ftrace_ops_update_code(iter->ops, old_hash);
> +			ftrace_ops_update_code(iter->ops, &old_hash_ops);
>  			free_ftrace_hash_rcu(old_hash);
>  		}
>  		mutex_unlock(&ftrace_lock);
> -- 
> 2.1.4
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com




* Re: [PATCH 3/3] ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing
  2015-01-14 15:40 ` [PATCH 3/3] ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing Steven Rostedt
  2015-01-14 16:55   ` Borislav Petkov
@ 2015-01-15 11:57   ` Masami Hiramatsu
  2015-01-15 14:17     ` Steven Rostedt
  1 sibling, 1 reply; 11+ messages in thread
From: Masami Hiramatsu @ 2015-01-15 11:57 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, vvs, H. Peter Anvin,
	Thomas Gleixner, Borislav Petkov, x86-ml, stable

Hi Steven,

Thank you for fixing this bug!

(2015/01/15 0:40), Steven Rostedt wrote:
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> If the function graph tracer traces a jprobe callback, the system will
> crash. This can easily be demonstrated by compiling the jprobe
> sample module that is in the kernel tree, loading it and running the
> function graph tracer.
> 
>  # modprobe jprobe_example.ko
>  # echo function_graph > /sys/kernel/debug/tracing/current_tracer
>  # ls
> 
> The first two commands end up in a nice crash after the first fork.
> (do_fork has a jprobe attached to it, so "ls" just triggers that fork)
> 
> The problem is caused by the jprobe_return() that all jprobe callbacks
> must end with. The way jprobes work is that the function a jprobe
> is attached to has a breakpoint placed at the start of it (or it uses
> ftrace if fentry is supported). The breakpoint handler (or ftrace callback)
> will copy the stack frame and change the ip address to return to the
> jprobe handler instead of the function. The jprobe handler must end
> with jprobe_return() which swaps the stack and does an int3 (breakpoint).
> This breakpoint handler will then put back the saved stack frame,
> simulate the instruction at the beginning of the function it added
> a breakpoint to, and then continue on.
> 
> For function tracing to work, it hijacks the return address from the
> stack frame, and replaces it with a hook function that will trace
> the end of the call. This hook function will restore the return
> address of the function call.

Yeah, the function-graph tracer keeps a hidden return-address stack for
each thread to store the return addresses.

> If the function tracer traces the jprobe handler, the hook function
> for that handler will not be called, and its saved return address
> will be used for the next function. This will result in a kernel crash.

Actually, neither the (user-defined) jprobe handler nor jprobe_return()
executes a "ret".
So, right after the jprobe handler runs with the function-graph tracer
active, there are at least 2(*) unused return addresses on top of its
hidden stack: one for the jprobe handler (this should be the same as
the probed function's return address) and the other is jprobe_return()'s
return address.
(*: this can be more than 2 if jprobe_return() is called from a
 function that is itself called from the jprobe handler)

So, the hidden stack may look like this:

[jprobe_return() return address]
[probed function return address]
[probed function return address]

After jumping back to the probed function, the function return is
trapped by the function-graph tracer, which uses jprobe_return()'s
return address. Since jprobe_return() is usually called at the end of
the function, the CPU will soon execute "ret" again (if someone puts a
BUG right after jprobe_return(), the kernel will show that BUG), and it
returns to the caller of the probed function.
However, the probed function's return address is still on the hidden
stack! This means that the next "ret" will go back to the same address,
but with modified registers and stack.

> To solve this, pause function tracing before the jprobe handler is called
> and unpause it before it returns back to the function it probed.

Agreed, though it could also drop some NMI events. That is the downside.

> Some other updates:
> 
> Used a variable "saved_sp" to hold kcb->jprobe_saved_sp. This makes the
> code look a bit cleaner and easier to understand (various tries to fix
> this bug required this change).

OK.

> Note, if fentry is being used, jprobes will change the ip address before
> the function graph tracer runs and it will not be able to trace the
> function that the jprobe is probing.

Yes, it should be fixed.

BTW, the last part of my IPMODIFY patches (which is not yet merged)
can solve this in a different way. It sets the IPMODIFY flag only for
jprobes. So, if the function-graph tracer also sets the IPMODIFY flag,
a user cannot enable the function-graph tracer while a jprobe is in use.

https://lkml.org/lkml/2014/10/9/210
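That exclusion rule could be sketched as below. This is a userspace
simulation only: the struct, function names, flag field, and the bare
-16 (EBUSY) value all stand in for the real ftrace API and are not the
actual kernel interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Made-up stand-in for struct ftrace_ops. */
struct fake_ops {
	bool ipmodify;		/* stands in for FTRACE_OPS_FL_IPMODIFY */
};

/* In the kernel this would be tracked per traced function; one
 * global counter is enough to show the rule. */
static int ipmodify_users;

static int fake_register(struct fake_ops *ops)
{
	if (ops->ipmodify) {
		if (ipmodify_users > 0)
			return -16;	/* -EBUSY: only one ip-modifier */
		ipmodify_users++;
	}
	return 0;		/* non-modifying users always coexist */
}
```

With this rule, a jprobe and a second ip-modifying user can never be
attached to the same function at once, which prevents the conflict by
construction rather than by pausing the tracer.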

Anyway, this patch is the better one to go to the stable trees.

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

Thank you,

> 
> Cc: stable@vger.kernel.org # 2.6.30+
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  arch/x86/kernel/kprobes/core.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
> index f7e3cd50ece0..98f654d466e5 100644
> --- a/arch/x86/kernel/kprobes/core.c
> +++ b/arch/x86/kernel/kprobes/core.c
> @@ -1020,6 +1020,15 @@ int setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
>  	regs->flags &= ~X86_EFLAGS_IF;
>  	trace_hardirqs_off();
>  	regs->ip = (unsigned long)(jp->entry);
> +
> +	/*
> +	 * jprobes use jprobe_return(), which skips the normal return
> +	 * path of the function, and this causes the accounting of the
> +	 * function graph tracer to get messed up.
> +	 *
> +	 * Pause function graph tracing while performing the jprobe function.
> +	 */
> +	pause_graph_tracing();
>  	return 1;
>  }
>  NOKPROBE_SYMBOL(setjmp_pre_handler);
> @@ -1048,24 +1057,25 @@ int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
>  	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
>  	u8 *addr = (u8 *) (regs->ip - 1);
>  	struct jprobe *jp = container_of(p, struct jprobe, kp);
> +	void *saved_sp = kcb->jprobe_saved_sp;
>  
>  	if ((addr > (u8 *) jprobe_return) &&
>  	    (addr < (u8 *) jprobe_return_end)) {
> -		if (stack_addr(regs) != kcb->jprobe_saved_sp) {
> +		if (stack_addr(regs) != saved_sp) {
>  			struct pt_regs *saved_regs = &kcb->jprobe_saved_regs;
>  			printk(KERN_ERR
>  			       "current sp %p does not match saved sp %p\n",
> -			       stack_addr(regs), kcb->jprobe_saved_sp);
> +			       stack_addr(regs), saved_sp);
>  			printk(KERN_ERR "Saved registers for jprobe %p\n", jp);
>  			show_regs(saved_regs);
>  			printk(KERN_ERR "Current registers\n");
>  			show_regs(regs);
>  			BUG();
>  		}
> +		/* It's OK to start function graph tracing again */
> +		unpause_graph_tracing();
>  		*regs = kcb->jprobe_saved_regs;
> -		memcpy((kprobe_opcode_t *)(kcb->jprobe_saved_sp),
> -		       kcb->jprobes_stack,
> -		       MIN_STACK_SIZE(kcb->jprobe_saved_sp));
> +		memcpy(saved_sp, kcb->jprobes_stack, MIN_STACK_SIZE(saved_sp));
>  		preempt_enable_no_resched();
>  		return 1;
>  	}
> -- 2.1.4
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com




* Re: [PATCH 2/3] ftrace: Check both notrace and filter for old hash
  2015-01-15 10:59   ` Masami Hiramatsu
@ 2015-01-15 14:04     ` Steven Rostedt
  0 siblings, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2015-01-15 14:04 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, vvs, H. Peter Anvin,
	Thomas Gleixner, Borislav Petkov, x86-ml, stable

On Thu, 15 Jan 2015 19:59:36 +0900
Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> wrote:

> > @@ -3637,6 +3639,7 @@ int
> >  register_ftrace_function_probe(char *glob, struct ftrace_probe_ops
> > *ops, void *data)
> >  {
> > +	struct ftrace_ops_hash old_hash_ops;
> 
> Would it be better to be old_ops_hash? since it's not an ops.

Probably. I'll fix it for 3.20. I ran this through 9 hours of testing
and I don't want to change anything unless I need to (if something else
needs to change, I'll make this fix). Even a comment fix needs to go
through my testing before I send to Linus (I was burnt by a comment
fix that wasn't tested a long time ago, you never know what you might
accidentally change).

> 
> Other parts are OK for me :)
> 
> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> 

Thanks!

-- Steve


* Re: [PATCH 3/3] ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing
  2015-01-15 11:57   ` Masami Hiramatsu
@ 2015-01-15 14:17     ` Steven Rostedt
  0 siblings, 0 replies; 11+ messages in thread
From: Steven Rostedt @ 2015-01-15 14:17 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, vvs, H. Peter Anvin,
	Thomas Gleixner, Borislav Petkov, x86-ml, stable

On Thu, 15 Jan 2015 20:57:29 +0900
Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> wrote:
> 
> > If the function tracer traces the jprobe handler, the hook function
> > for that handler will not be called, and its saved return address
> > will be used for the next function. This will result in a kernel
> > crash.
> 
> Actually, neither the (user-defined) jprobe handler nor
> jprobe_return() executes a "ret".

Yep, I knew that. But a notrace on jprobe_return() isn't that big of a
deal. It's not a function we really need to trace, as it doesn't do
anything but act as a 'hack' for jprobes to return properly.

> So, right after the jprobe handler runs with the function-graph
> tracer active, there are at least 2(*) unused return addresses on
> top of its hidden stack: one for the jprobe handler (this should be
> the same as the probed function's return address) and the other is
> jprobe_return()'s return address. (*: this can be more than 2 if
> jprobe_return() is called from a function that is itself called from
> the jprobe handler)
> 
> So, the hidden stack may look like this:
> 
> [jprobe_return() return address]
> [probed function return address]
> [probed function return address]
> 
> After jumping back to the probed function, the function return is
> trapped by the function-graph tracer, which uses jprobe_return()'s
> return address. Since jprobe_return() is usually called at the end
> of the function, the CPU will soon execute "ret" again (if someone
> puts a BUG right after jprobe_return(), the kernel will show that
> BUG), and it returns to the caller of the probed function.
> However, the probed function's return address is still on the hidden
> stack! This means that the next "ret" will go back to the same
> address, but with modified registers and stack.

Yep, I discovered all this with my patch that allows function tracing
with jprobes.

> 
> > To solve this, pause function tracing before the jprobe handler is
> > called and unpause it before it returns back to the function it
> > probed.
> 
> Agreed, though it could also drop some NMI events. That is the downside.
> 
> > Some other updates:
> > 
> > Used a variable "saved_sp" to hold kcb->jprobe_saved_sp. This makes
> > the code look a bit cleaner and easier to understand (various tries
> > to fix this bug required this change).
> 
> OK.
> 
> > Note, if fentry is being used, jprobes will change the ip address
> > before the function graph tracer runs and it will not be able to
> > trace the function that the jprobe is probing.
> 
> Yes, it should be fixed.
> 
> BTW, the last part of my IPMODIFY patches (which is not yet merged)
> can solve this in a different way. It sets the IPMODIFY flag only
> for jprobes. So, if the function-graph tracer also sets the IPMODIFY
> flag, a user cannot enable the function-graph tracer while a jprobe
> is in use.

Nah, I don't want to stop jprobes due to function graph tracing or vice
versa.

The function graph tracer doesn't change regs->ip, so it doesn't need
the flag. But I sent out a patch that fixes this case. Let me know
what you think of that one.


> 
> https://lkml.org/lkml/2014/10/9/210
> 
> Anyway, this patch is the better one to go to the stable trees.
> 
> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> 

Thanks!

-- Steve


end of thread, other threads:[~2015-01-15 14:18 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-14 15:39 [PATCH 0/3] ftrace/kprobes/x86: Steven Rostedt
2015-01-14 15:39 ` [PATCH 1/3] ftrace: Fix updating of filters for shared global_ops filters Steven Rostedt
2015-01-15  8:05   ` Masami Hiramatsu
2015-01-14 15:40 ` [PATCH 2/3] ftrace: Check both notrace and filter for old hash Steven Rostedt
2015-01-15 10:59   ` Masami Hiramatsu
2015-01-15 14:04     ` Steven Rostedt
2015-01-14 15:40 ` [PATCH 3/3] ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing Steven Rostedt
2015-01-14 16:55   ` Borislav Petkov
2015-01-14 17:05     ` Steven Rostedt
2015-01-15 11:57   ` Masami Hiramatsu
2015-01-15 14:17     ` Steven Rostedt
