linux-kernel.vger.kernel.org archive mirror
* [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support
@ 2014-10-27 18:27 Steven Rostedt
  2014-10-27 18:27 ` [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops Steven Rostedt
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Steven Rostedt @ 2014-10-27 18:27 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Jiri Kosina,
	Josh Poimboeuf, Vojtech Pavlik, Seth Jennings, Paul E. McKenney

This adds the allocation of dynamic trampolines. It still does not
allow for dynamic ftrace_ops to use them on CONFIG_PREEMPT systems.
That will come in 3.20, as I want to test out call_rcu_tasks() for
a bit first on my own boxes.

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
for-next

Head SHA1: 9fd7caf67913fbf35d834beaa8f9764d733a9236


Steven Rostedt (Red Hat) (4):
      ftrace/x86: Add dynamic allocated trampoline for ftrace_ops
      ftrace/x86: Show trampoline call function in enabled_functions
      ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines
      ftrace: Add more information to ftrace_bug() output

----
 arch/powerpc/kernel/ftrace.c |   2 +-
 arch/x86/kernel/ftrace.c     | 281 +++++++++++++++++++++++++++++++++++++++++--
 arch/x86/kernel/mcount_64.S  |  25 +++-
 include/linux/ftrace.h       |  12 +-
 kernel/trace/ftrace.c        | 118 +++++++++++++++---
 5 files changed, 409 insertions(+), 29 deletions(-)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops
  2014-10-27 18:27 [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support Steven Rostedt
@ 2014-10-27 18:27 ` Steven Rostedt
  2014-10-30 17:00   ` Steven Rostedt
                     ` (2 more replies)
  2014-10-27 18:27 ` [for-next][PATCH 2/4] ftrace/x86: Show trampoline call function in enabled_functions Steven Rostedt
                   ` (4 subsequent siblings)
  5 siblings, 3 replies; 23+ messages in thread
From: Steven Rostedt @ 2014-10-27 18:27 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Jiri Kosina,
	Josh Poimboeuf, Vojtech Pavlik, Seth Jennings, Paul E. McKenney

[-- Attachment #1: 0001-ftrace-x86-Add-dynamic-allocated-trampoline-for-ftra.patch --]
[-- Type: text/plain, Size: 14116 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

The current method of handling multiple function callbacks is to register
a list function callback that calls all the other callbacks based on
their hash tables, comparing each hash against the function that the
callback was called on. But this is very inefficient.

For example, if you are tracing all functions in the kernel and then
add a kprobe to a function such that the kprobe uses ftrace, the
mcount trampoline will switch from calling the function trace callback
to calling the list callback that will iterate over all registered
ftrace_ops (in this case, the function tracer and the kprobes callback).
That means that for every function being traced, the hashes of both the
function tracing and kprobes ftrace_ops are checked, even though the
kprobe is only set on a single function. The kprobes ftrace_ops is
checked for every function being traced!

Instead of calling the list function for functions that are only being
traced by a single callback, we can call a dynamically allocated
trampoline that calls the callback directly. The function graph tracer
already uses a direct call trampoline when it is being traced by itself,
but it is not dynamically allocated. Its trampoline is static in the
kernel core. The infrastructure that called the function graph trampoline
can also be used to call a dynamically allocated one.

For now, only ftrace_ops that are not dynamically allocated can have
a trampoline. That is, users such as the function tracer or stack tracer.
kprobes and perf allocate their ftrace_ops, and until there's a safe
way to free the trampoline, it cannot be used. Dynamically allocated
ftrace_ops may, however, use the trampoline if the kernel is not
compiled with CONFIG_PREEMPT. But that will come later.
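
To make the difference concrete, here is a simplified sketch (not the
kernel code; all demo_* names are made up for this illustration) of what
the shared list callback has to do on every traced function, compared
with what a per-ops trampoline effectively reduces the call to:

struct demo_ops {
	void (*func)(unsigned long ip, unsigned long parent_ip,
		     struct demo_ops *ops);
	struct demo_ops *next;
	void *hash;			/* set of functions this ops traces */
};

static struct demo_ops *demo_ops_list;

/* Stand-in for a real hash lookup: does @ops trace the function at @ip? */
static int demo_hash_hit(struct demo_ops *ops, unsigned long ip)
{
	return ops->hash && ip;
}

/*
 * The shared list callback: every traced function pays for a hash check
 * of every registered ops, even an ops (like a single kprobe) that only
 * traces one function.
 */
static void demo_list_func(unsigned long ip, unsigned long parent_ip)
{
	struct demo_ops *ops;

	for (ops = demo_ops_list; ops; ops = ops->next)
		if (demo_hash_hit(ops, ip))
			ops->func(ip, parent_ip, ops);
}

/*
 * What a dedicated trampoline effectively does for a function traced by
 * exactly one ops: the ops pointer is baked into the trampoline, so its
 * callback is reached with no list walk and no hash checks.
 */
static void demo_direct_call(unsigned long ip, unsigned long parent_ip,
			     struct demo_ops *only_ops)
{
	only_ops->func(ip, parent_ip, only_ops);
}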

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/ftrace.c    | 195 ++++++++++++++++++++++++++++++++++++++++++--
 arch/x86/kernel/mcount_64.S |  25 +++++-
 include/linux/ftrace.h      |   8 ++
 kernel/trace/ftrace.c       |  40 ++++++++-
 4 files changed, 254 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 3386dc9aa333..e4d48f6cad86 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -17,6 +17,7 @@
 #include <linux/ftrace.h>
 #include <linux/percpu.h>
 #include <linux/sched.h>
+#include <linux/slab.h>
 #include <linux/init.h>
 #include <linux/list.h>
 #include <linux/module.h>
@@ -644,13 +645,8 @@ int __init ftrace_dyn_arch_init(void)
 {
 	return 0;
 }
-#endif
-
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-
-#ifdef CONFIG_DYNAMIC_FTRACE
-extern void ftrace_graph_call(void);
 
+#if defined(CONFIG_X86_64) || defined(CONFIG_FUNCTION_GRAPH_TRACER)
 static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
 {
 	static union ftrace_code_union calc;
@@ -664,6 +660,193 @@ static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
 	 */
 	return calc.code;
 }
+#endif
+
+/* Currently only x86_64 supports dynamic trampolines */
+#ifdef CONFIG_X86_64
+
+#ifdef CONFIG_MODULES
+#include <linux/moduleloader.h>
+/* Module allocation simplifies allocating memory for code */
+static inline void *alloc_tramp(unsigned long size)
+{
+	return module_alloc(size);
+}
+static inline void tramp_free(void *tramp)
+{
+	module_free(NULL, tramp);
+}
+#else
+/* Trampolines can only be created if modules are supported */
+static inline void *alloc_tramp(unsigned long size)
+{
+	return NULL;
+}
+static inline void tramp_free(void *tramp) { }
+#endif
+
+/* Defined as markers to the end of the ftrace default trampolines */
+extern void ftrace_caller_end(void);
+extern void ftrace_regs_caller_end(void);
+extern void ftrace_return(void);
+extern void ftrace_caller_op_ptr(void);
+extern void ftrace_regs_caller_op_ptr(void);
+
+/* movq function_trace_op(%rip), %rdx */
+/* 0x48 0x8b 0x15 <offset-to-ftrace_trace_op (4 bytes)> */
+#define OP_REF_SIZE	7
+
+/*
+ * The ftrace_ops is passed to the function callback. Since the
+ * trampoline only services a single ftrace_ops, we can pass in
+ * that ops directly.
+ *
+ * The ftrace_op_code_union is used to create a pointer to the
+ * ftrace_ops that will be passed to the callback function.
+ */
+union ftrace_op_code_union {
+	char code[OP_REF_SIZE];
+	struct {
+		char op[3];
+		int offset;
+	} __attribute__((packed));
+};
+
+static unsigned long create_trampoline(struct ftrace_ops *ops)
+{
+	unsigned const char *jmp;
+	unsigned long start_offset;
+	unsigned long end_offset;
+	unsigned long op_offset;
+	unsigned long offset;
+	unsigned long size;
+	unsigned long ip;
+	unsigned long *ptr;
+	void *trampoline;
+	/* 48 8b 15 <offset> is movq <offset>(%rip), %rdx */
+	unsigned const char op_ref[] = { 0x48, 0x8b, 0x15 };
+	union ftrace_op_code_union op_ptr;
+	int ret;
+
+	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
+		start_offset = (unsigned long)ftrace_regs_caller;
+		end_offset = (unsigned long)ftrace_regs_caller_end;
+		op_offset = (unsigned long)ftrace_regs_caller_op_ptr;
+	} else {
+		start_offset = (unsigned long)ftrace_caller;
+		end_offset = (unsigned long)ftrace_caller_end;
+		op_offset = (unsigned long)ftrace_caller_op_ptr;
+	}
+
+	size = end_offset - start_offset;
+
+	/*
+	 * Allocate enough size to store the ftrace_caller code,
+	 * the jmp to ftrace_return, as well as the address of
+	 * the ftrace_ops this trampoline is used for.
+	 */
+	trampoline = alloc_tramp(size + MCOUNT_INSN_SIZE + sizeof(void *));
+	if (!trampoline)
+		return 0;
+
+	/* Copy ftrace_caller onto the trampoline memory */
+	ret = probe_kernel_read(trampoline, (void *)start_offset, size);
+	if (WARN_ON(ret < 0)) {
+		tramp_free(trampoline);
+		return 0;
+	}
+
+	ip = (unsigned long)trampoline + size;
+
+	/* The trampoline ends with a jmp to ftrace_return */
+	jmp = ftrace_jmp_replace(ip, (unsigned long)ftrace_return);
+	memcpy(trampoline + size, jmp, MCOUNT_INSN_SIZE);
+
+	/*
+	 * The address of the ftrace_ops that is used for this trampoline
+	 * is stored at the end of the trampoline. This will be used to
+	 * load the third parameter for the callback. Basically, that
+	 * location at the end of the trampoline takes the place of
+	 * the global function_trace_op variable.
+	 */
+
+	ptr = (unsigned long *)(trampoline + size + MCOUNT_INSN_SIZE);
+	*ptr = (unsigned long)ops;
+
+	op_offset -= start_offset;
+	memcpy(&op_ptr, trampoline + op_offset, OP_REF_SIZE);
+
+	/* Are we pointing to the reference? */
+	if (WARN_ON(memcmp(op_ptr.op, op_ref, 3) != 0)) {
+		tramp_free(trampoline);
+		return 0;
+	}
+
+	/* Load the contents of ptr into the callback parameter */
+	offset = (unsigned long)ptr;
+	offset -= (unsigned long)trampoline + op_offset + OP_REF_SIZE;
+
+	op_ptr.offset = offset;
+
+	/* put in the new offset to the ftrace_ops */
+	memcpy(trampoline + op_offset, &op_ptr, OP_REF_SIZE);
+
+	/* ALLOC_TRAMP flags lets us know we created it */
+	ops->flags |= FTRACE_OPS_FL_ALLOC_TRAMP;
+
+	return (unsigned long)trampoline;
+}
+
+void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
+{
+	ftrace_func_t func;
+	unsigned char *new;
+	unsigned long start_offset;
+	unsigned long call_offset;
+	unsigned long offset;
+	unsigned long ip;
+	int ret;
+
+	if (ops->trampoline) {
+		/*
+		 * The ftrace_ops caller may set up its own trampoline.
+		 * In such a case, this code must not modify it.
+		 */
+		if (!(ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP))
+			return;
+	} else {
+		ops->trampoline = create_trampoline(ops);
+		if (!ops->trampoline)
+			return;
+	}
+
+	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
+		start_offset = (unsigned long)ftrace_regs_caller;
+		call_offset = (unsigned long)ftrace_regs_call;
+	} else {
+		start_offset = (unsigned long)ftrace_caller;
+		call_offset = (unsigned long)ftrace_call;
+	}
+
+	offset = call_offset - start_offset;
+	ip = ops->trampoline + offset;
+
+	func = ftrace_ops_get_func(ops);
+
+	/* Do a safe modify in case the trampoline is executing */
+	new = ftrace_call_replace(ip, (unsigned long)func);
+	ret = update_ftrace_func(ip, new);
+
+	/* The update should never fail */
+	WARN_ON(ret);
+}
+#endif /* CONFIG_X86_64 */
+#endif /* CONFIG_DYNAMIC_FTRACE */
+
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+extern void ftrace_graph_call(void);
 
 static int ftrace_mod_jmp(unsigned long ip, void *func)
 {
diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index c73aecf10d34..42f0cdd20baf 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -28,9 +28,11 @@ ENTRY(function_hook)
 END(function_hook)
 
 /* skip is set if stack has been adjusted */
-.macro ftrace_caller_setup skip=0
+.macro ftrace_caller_setup trace_label skip=0
 	MCOUNT_SAVE_FRAME \skip
 
+	/* Save this location */
+GLOBAL(\trace_label)
 	/* Load the ftrace_ops into the 3rd parameter */
 	movq function_trace_op(%rip), %rdx
 
@@ -46,7 +48,7 @@ END(function_hook)
 .endm
 
 ENTRY(ftrace_caller)
-	ftrace_caller_setup
+	ftrace_caller_setup ftrace_caller_op_ptr
 	/* regs go into 4th parameter (but make it NULL) */
 	movq $0, %rcx
 
@@ -54,7 +56,14 @@ GLOBAL(ftrace_call)
 	call ftrace_stub
 
 	MCOUNT_RESTORE_FRAME
-ftrace_return:
+
+	/*
+	 * The copied trampoline must call ftrace_return as it
+	 * still may need to call the function graph tracer.
+	 */
+GLOBAL(ftrace_caller_end)
+
+GLOBAL(ftrace_return)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 GLOBAL(ftrace_graph_call)
@@ -70,7 +79,7 @@ ENTRY(ftrace_regs_caller)
 	pushfq
 
 	/* skip=8 to skip flags saved in SS */
-	ftrace_caller_setup 8
+	ftrace_caller_setup ftrace_regs_caller_op_ptr 8
 
 	/* Save the rest of pt_regs */
 	movq %r15, R15(%rsp)
@@ -122,6 +131,14 @@ GLOBAL(ftrace_regs_call)
 	/* Restore flags */
 	popfq
 
+	/*
+	 * As this jmp to ftrace_return can be a short jump
+	 * it must not be copied into the trampoline.
+	 * The trampoline will add the code to jump
+	 * to the return.
+	 */
+GLOBAL(ftrace_regs_caller_end)
+
 	jmp ftrace_return
 
 	popfq
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 662697babd48..06e3ca5a5083 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -94,6 +94,13 @@ ftrace_func_t ftrace_ops_get_func(struct ftrace_ops *ops);
  * ADDING  - The ops is in the process of being added.
  * REMOVING - The ops is in the process of being removed.
  * MODIFYING - The ops is in the process of changing its filter functions.
+ * ALLOC_TRAMP - A dynamic trampoline was allocated by the core code.
+ *            The arch specific code sets this flag when it allocated a
+ *            trampoline. This lets the arch know that it can update the
+ *            trampoline in case the callback function changes.
+ *            The ftrace_ops trampoline can be set by the ftrace users, and
+ *            in such cases the arch must not modify it. Only the arch ftrace
+ *            core code should set this flag.
  */
 enum {
 	FTRACE_OPS_FL_ENABLED			= 1 << 0,
@@ -108,6 +115,7 @@ enum {
 	FTRACE_OPS_FL_ADDING			= 1 << 9,
 	FTRACE_OPS_FL_REMOVING			= 1 << 10,
 	FTRACE_OPS_FL_MODIFYING			= 1 << 11,
+	FTRACE_OPS_FL_ALLOC_TRAMP		= 1 << 12,
 };
 
 #ifdef CONFIG_DYNAMIC_FTRACE
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 31c90fec4158..15f85eac7e95 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -387,6 +387,8 @@ static int remove_ftrace_list_ops(struct ftrace_ops **list,
 	return ret;
 }
 
+static void ftrace_update_trampoline(struct ftrace_ops *ops);
+
 static int __register_ftrace_function(struct ftrace_ops *ops)
 {
 	if (ops->flags & FTRACE_OPS_FL_DELETED)
@@ -419,6 +421,8 @@ static int __register_ftrace_function(struct ftrace_ops *ops)
 	} else
 		add_ftrace_ops(&ftrace_ops_list, ops);
 
+	ftrace_update_trampoline(ops);
+
 	if (ftrace_enabled)
 		update_ftrace_function();
 
@@ -3020,9 +3024,6 @@ ftrace_enabled_open(struct inode *inode, struct file *file)
 {
 	struct ftrace_iterator *iter;
 
-	if (unlikely(ftrace_disabled))
-		return -ENODEV;
-
 	iter = __seq_open_private(file, &show_ftrace_seq_ops, sizeof(*iter));
 	if (iter) {
 		iter->pg = ftrace_pages_start;
@@ -3975,6 +3976,9 @@ static char ftrace_graph_buf[FTRACE_FILTER_SIZE] __initdata;
 static char ftrace_graph_notrace_buf[FTRACE_FILTER_SIZE] __initdata;
 static int ftrace_set_func(unsigned long *array, int *idx, int size, char *buffer);
 
+static unsigned long save_global_trampoline;
+static unsigned long save_global_flags;
+
 static int __init set_graph_function(char *str)
 {
 	strlcpy(ftrace_graph_buf, str, FTRACE_FILTER_SIZE);
@@ -4696,6 +4700,20 @@ void __init ftrace_init(void)
 	ftrace_disabled = 1;
 }
 
+/* Do nothing if arch does not support this */
+void __weak arch_ftrace_update_trampoline(struct ftrace_ops *ops)
+{
+}
+
+static void ftrace_update_trampoline(struct ftrace_ops *ops)
+{
+	/* Currently, only non dynamic ops can have a trampoline */
+	if (ops->flags & FTRACE_OPS_FL_DYNAMIC)
+		return;
+
+	arch_ftrace_update_trampoline(ops);
+}
+
 #else
 
 static struct ftrace_ops global_ops = {
@@ -4738,6 +4756,10 @@ ftrace_ops_test(struct ftrace_ops *ops, unsigned long ip, void *regs)
 	return 1;
 }
 
+static void ftrace_update_trampoline(struct ftrace_ops *ops)
+{
+}
+
 #endif /* CONFIG_DYNAMIC_FTRACE */
 
 __init void ftrace_init_global_array_ops(struct trace_array *tr)
@@ -5522,7 +5544,6 @@ int register_ftrace_graph(trace_func_graph_ret_t retfunc,
 	update_function_graph_func();
 
 	ret = ftrace_startup(&graph_ops, FTRACE_START_FUNC_RET);
-
 out:
 	mutex_unlock(&ftrace_lock);
 	return ret;
@@ -5543,6 +5564,17 @@ void unregister_ftrace_graph(void)
 	unregister_pm_notifier(&ftrace_suspend_notifier);
 	unregister_trace_sched_switch(ftrace_graph_probe_sched_switch, NULL);
 
+#ifdef CONFIG_DYNAMIC_FTRACE
+	/*
+	 * Function graph does not allocate the trampoline, but
+	 * other global_ops do. We need to reset the ALLOC_TRAMP flag
+	 * if one was used.
+	 */
+	global_ops.trampoline = save_global_trampoline;
+	if (save_global_flags & FTRACE_OPS_FL_ALLOC_TRAMP)
+		global_ops.flags |= FTRACE_OPS_FL_ALLOC_TRAMP;
+#endif
+
  out:
 	mutex_unlock(&ftrace_lock);
 }
-- 
2.1.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [for-next][PATCH 2/4] ftrace/x86: Show trampoline call function in enabled_functions
  2014-10-27 18:27 [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support Steven Rostedt
  2014-10-27 18:27 ` [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops Steven Rostedt
@ 2014-10-27 18:27 ` Steven Rostedt
  2014-10-30 17:00   ` Steven Rostedt
  2014-11-05 10:42   ` Borislav Petkov
  2014-10-27 18:27 ` [for-next][PATCH 3/4] ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines Steven Rostedt
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 23+ messages in thread
From: Steven Rostedt @ 2014-10-27 18:27 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Jiri Kosina,
	Josh Poimboeuf, Vojtech Pavlik, Seth Jennings, Paul E. McKenney,
	H. Peter Anvin

[-- Attachment #1: 0002-ftrace-x86-Show-trampoline-call-function-in-enabled_.patch --]
[-- Type: text/plain, Size: 5536 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

The file /sys/kernel/debug/tracing/enabled_functions is used to debug
ftrace function hooks. Add to the output what function is being called
by the trampoline if the arch supports it.

Add support for this feature in x86_64.
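
As a rough illustration of the new output (every symbol, count and address
below is invented, and the layout is only pieced together from the
seq_printf() calls in this patch), a line for a function whose callback is
reached through an allocated trampoline might look something like:

  # cat /sys/kernel/debug/tracing/enabled_functions
  wake_up_process (1)    tramp: 0xffffffffa0024000 ->function_trace_call+0x0/0x170

For records without an allocated trampoline, the new " ->" suffix still
shows the function that the default ftrace caller will end up calling
(for example ftrace_ops_list_func).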

Cc: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/ftrace.c | 98 ++++++++++++++++++++++++++++++++++++++++++------
 kernel/trace/ftrace.c    | 22 ++++++++++-
 2 files changed, 106 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index e4d48f6cad86..ca17c20a1010 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -48,7 +48,7 @@ int ftrace_arch_code_modify_post_process(void)
 union ftrace_code_union {
 	char code[MCOUNT_INSN_SIZE];
 	struct {
-		char e8;
+		unsigned char e8;
 		int offset;
 	} __attribute__((packed));
 };
@@ -797,12 +797,26 @@ static unsigned long create_trampoline(struct ftrace_ops *ops)
 	return (unsigned long)trampoline;
 }
 
+static unsigned long calc_trampoline_call_offset(bool save_regs)
+{
+	unsigned long start_offset;
+	unsigned long call_offset;
+
+	if (save_regs) {
+		start_offset = (unsigned long)ftrace_regs_caller;
+		call_offset = (unsigned long)ftrace_regs_call;
+	} else {
+		start_offset = (unsigned long)ftrace_caller;
+		call_offset = (unsigned long)ftrace_call;
+	}
+
+	return call_offset - start_offset;
+}
+
 void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
 {
 	ftrace_func_t func;
 	unsigned char *new;
-	unsigned long start_offset;
-	unsigned long call_offset;
 	unsigned long offset;
 	unsigned long ip;
 	int ret;
@@ -820,15 +834,7 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
 			return;
 	}
 
-	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
-		start_offset = (unsigned long)ftrace_regs_caller;
-		call_offset = (unsigned long)ftrace_regs_call;
-	} else {
-		start_offset = (unsigned long)ftrace_caller;
-		call_offset = (unsigned long)ftrace_call;
-	}
-
-	offset = call_offset - start_offset;
+	offset = calc_trampoline_call_offset(ops->flags & FTRACE_OPS_FL_SAVE_REGS);
 	ip = ops->trampoline + offset;
 
 	func = ftrace_ops_get_func(ops);
@@ -840,6 +846,74 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
 	/* The update should never fail */
 	WARN_ON(ret);
 }
+
+/* Return the address of the function the trampoline calls */
+static void *addr_from_call(void *ptr)
+{
+	union ftrace_code_union calc;
+	int ret;
+
+	ret = probe_kernel_read(&calc, ptr, MCOUNT_INSN_SIZE);
+	if (WARN_ON_ONCE(ret < 0))
+		return NULL;
+
+	/* Make sure this is a call */
+	if (WARN_ON_ONCE(calc.e8 != 0xe8)) {
+		pr_warn("Expected e8, got %x\n", calc.e8);
+		return NULL;
+	}
+
+	return ptr + MCOUNT_INSN_SIZE + calc.offset;
+}
+
+void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
+			   unsigned long frame_pointer);
+
+/*
+ * If the ops->trampoline was not allocated, then it probably
+ * has a static trampoline func, or is the ftrace caller itself.
+ */
+static void *static_tramp_func(struct ftrace_ops *ops, struct dyn_ftrace *rec)
+{
+	unsigned long offset;
+	bool save_regs = rec->flags & FTRACE_FL_REGS_EN;
+	void *ptr;
+
+	if (ops && ops->trampoline) {
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+		/*
+		 * We only know about function graph tracer setting as static
+		 * trampoline.
+		 */
+		if (ops->trampoline == FTRACE_GRAPH_ADDR)
+			return (void *)prepare_ftrace_return;
+#endif
+		return NULL;
+	}
+
+	offset = calc_trampoline_call_offset(save_regs);
+
+	if (save_regs)
+		ptr = (void *)FTRACE_REGS_ADDR + offset;
+	else
+		ptr = (void *)FTRACE_ADDR + offset;
+
+	return addr_from_call(ptr);
+}
+
+void *arch_ftrace_trampoline_func(struct ftrace_ops *ops, struct dyn_ftrace *rec)
+{
+	unsigned long offset;
+
+	/* If we didn't allocate this trampoline, consider it static */
+	if (!ops || !(ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP))
+		return static_tramp_func(ops, rec);
+
+	offset = calc_trampoline_call_offset(ops->flags & FTRACE_OPS_FL_SAVE_REGS);
+	return addr_from_call((void *)ops->trampoline + offset);
+}
+
+
 #endif /* CONFIG_X86_64 */
 #endif /* CONFIG_DYNAMIC_FTRACE */
 
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 15f85eac7e95..422e1f8300b1 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2952,6 +2952,22 @@ static void t_stop(struct seq_file *m, void *p)
 	mutex_unlock(&ftrace_lock);
 }
 
+void * __weak
+arch_ftrace_trampoline_func(struct ftrace_ops *ops, struct dyn_ftrace *rec)
+{
+	return NULL;
+}
+
+static void add_trampoline_func(struct seq_file *m, struct ftrace_ops *ops,
+				struct dyn_ftrace *rec)
+{
+	void *ptr;
+
+	ptr = arch_ftrace_trampoline_func(ops, rec);
+	if (ptr)
+		seq_printf(m, " ->%pS", ptr);
+}
+
 static int t_show(struct seq_file *m, void *v)
 {
 	struct ftrace_iterator *iter = m->private;
@@ -2975,19 +2991,21 @@ static int t_show(struct seq_file *m, void *v)
 
 	seq_printf(m, "%ps", (void *)rec->ip);
 	if (iter->flags & FTRACE_ITER_ENABLED) {
+		struct ftrace_ops *ops = NULL;
+
 		seq_printf(m, " (%ld)%s",
 			   ftrace_rec_count(rec),
 			   rec->flags & FTRACE_FL_REGS ? " R" : "  ");
 		if (rec->flags & FTRACE_FL_TRAMP_EN) {
-			struct ftrace_ops *ops;
-
 			ops = ftrace_find_tramp_ops_any(rec);
 			if (ops)
 				seq_printf(m, "\ttramp: %pS",
 					   (void *)ops->trampoline);
 			else
 				seq_printf(m, "\ttramp: ERROR!");
+
 		}
+		add_trampoline_func(m, ops, rec);
 	}	
 
 	seq_printf(m, "\n");
-- 
2.1.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [for-next][PATCH 3/4] ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines
  2014-10-27 18:27 [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support Steven Rostedt
  2014-10-27 18:27 ` [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops Steven Rostedt
  2014-10-27 18:27 ` [for-next][PATCH 2/4] ftrace/x86: Show trampoline call function in enabled_functions Steven Rostedt
@ 2014-10-27 18:27 ` Steven Rostedt
  2014-10-30 17:01   ` Steven Rostedt
  2014-11-05 10:46   ` Borislav Petkov
  2014-10-27 18:27 ` [for-next][PATCH 4/4] ftrace: Add more information to ftrace_bug() output Steven Rostedt
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 23+ messages in thread
From: Steven Rostedt @ 2014-10-27 18:27 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Jiri Kosina,
	Josh Poimboeuf, Vojtech Pavlik, Seth Jennings, Paul E. McKenney,
	H. Peter Anvin

[-- Attachment #1: 0003-ftrace-x86-Allow-CONFIG_PREEMPT-dynamic-ops-to-use-a.patch --]
[-- Type: text/plain, Size: 3512 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

When a static ftrace_ops (like the function tracer) enables tracing, and it
is the only callback that is referencing a function, a trampoline that
calls the callback directly is dynamically allocated for that function,
instead of calling a loop function that iterates over all the registered
ftrace_ops (if more than one ops is registered).

But when it comes to dynamically allocated ftrace_ops, where they may be
freed, on a CONFIG_PREEMPT kernel there's no way to know when it is safe
to free the trampoline. If a task was preempted while executing on the
trampoline, there's currently no way to know when it will be off that
trampoline.

But this is not true when it comes to !CONFIG_PREEMPT. The current method
of calling schedule_on_each_cpu() will force tasks off the trampoline,
because they cannot schedule while on it (kernel preemption is not
configured). That means it is safe to free a dynamically allocated
ftrace_ops trampoline when CONFIG_PREEMPT is not configured.
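
A minimal sketch of that reasoning (the demo_* helpers are hypothetical;
the real change is just the CONFIG_PREEMPT guard in
ftrace_update_trampoline() and the arch_ftrace_trampoline_free() call in
the diff below):

#include <linux/workqueue.h>

static void demo_sync_work(struct work_struct *work)
{
	/* Empty on purpose: the fact that it ran means this CPU scheduled. */
}

static void demo_free_trampoline(void *tramp)
{
	/* Stand-in for tramp_free()/module_free() of the trampoline memory. */
}

static void demo_release_trampoline(void *tramp)
{
#ifdef CONFIG_PREEMPT
	/*
	 * A task may have been preempted while executing on the trampoline
	 * and there is currently no way to know when it has left it, so do
	 * not free (leak rather than run freed code later).
	 */
#else
	/*
	 * Without kernel preemption a task cannot be scheduled out while
	 * running trampoline code, so once every CPU has voluntarily
	 * scheduled, nothing can still be executing it.
	 */
	schedule_on_each_cpu(demo_sync_work);
	demo_free_trampoline(tramp);
#endif
}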

Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/ftrace.c |  8 ++++++++
 kernel/trace/ftrace.c    | 18 ++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index ca17c20a1010..4cfeca6ffe11 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -913,6 +913,14 @@ void *arch_ftrace_trampoline_func(struct ftrace_ops *ops, struct dyn_ftrace *rec
 	return addr_from_call((void *)ops->trampoline + offset);
 }
 
+void arch_ftrace_trampoline_free(struct ftrace_ops *ops)
+{
+	if (!ops || !(ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP))
+		return;
+
+	tramp_free((void *)ops->trampoline);
+	ops->trampoline = 0;
+}
 
 #endif /* CONFIG_X86_64 */
 #endif /* CONFIG_DYNAMIC_FTRACE */
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 422e1f8300b1..eab3123a1fbe 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2324,6 +2324,10 @@ static void ftrace_run_modify_code(struct ftrace_ops *ops, int command,
 static ftrace_func_t saved_ftrace_func;
 static int ftrace_start_up;
 
+void __weak arch_ftrace_trampoline_free(struct ftrace_ops *ops)
+{
+}
+
 static void control_ops_free(struct ftrace_ops *ops)
 {
 	free_percpu(ops->disabled);
@@ -2475,6 +2479,8 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)
 	if (ops->flags & (FTRACE_OPS_FL_DYNAMIC | FTRACE_OPS_FL_CONTROL)) {
 		schedule_on_each_cpu(ftrace_sync);
 
+		arch_ftrace_trampoline_free(ops);
+
 		if (ops->flags & FTRACE_OPS_FL_CONTROL)
 			control_ops_free(ops);
 	}
@@ -4725,9 +4731,21 @@ void __weak arch_ftrace_update_trampoline(struct ftrace_ops *ops)
 
 static void ftrace_update_trampoline(struct ftrace_ops *ops)
 {
+
+/*
+ * Currently there's no safe way to free a trampoline when the kernel
+ * is configured with PREEMPT. That is because a task could be preempted
+ * when it jumped to the trampoline, it may be preempted for a long time
+ * depending on the system load, and currently there's no way to know
+ * when it will be off the trampoline. If the trampoline is freed
+ * too early, when the task runs again, it will be executing on freed
+ * memory and crash.
+ */
+#ifdef CONFIG_PREEMPT
 	/* Currently, only non dynamic ops can have a trampoline */
 	if (ops->flags & FTRACE_OPS_FL_DYNAMIC)
 		return;
+#endif
 
 	arch_ftrace_update_trampoline(ops);
 }
-- 
2.1.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [for-next][PATCH 4/4] ftrace: Add more information to ftrace_bug() output
  2014-10-27 18:27 [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support Steven Rostedt
                   ` (2 preceding siblings ...)
  2014-10-27 18:27 ` [for-next][PATCH 3/4] ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines Steven Rostedt
@ 2014-10-27 18:27 ` Steven Rostedt
  2014-10-30 17:02   ` Steven Rostedt
  2014-11-05 10:49   ` Borislav Petkov
  2014-10-29 16:27 ` [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support Jiri Kosina
  2014-10-31 10:04 ` Masami Hiramatsu
  5 siblings, 2 replies; 23+ messages in thread
From: Steven Rostedt @ 2014-10-27 18:27 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Masami Hiramatsu, Jiri Kosina,
	Josh Poimboeuf, Vojtech Pavlik, Seth Jennings, Paul E. McKenney,
	Benjamin Herrenschmidt

[-- Attachment #1: 0004-ftrace-Add-more-information-to-ftrace_bug-output.patch --]
[-- Type: text/plain, Size: 5303 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

With the introduction of the dynamic trampolines, it is useful that when
things go wrong, ftrace_bug() produces more information about what the
current state is. This can help debug issues that may arise.

Ftrace has lots of checks to make sure that the state of the system it
touches is exactly what it expects it to be. When it detects an abnormality,
it calls ftrace_bug() and disables itself to prevent any further damage.
It is crucial that ftrace_bug() produces sufficient information that
can be used to debug the situation.
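
As a purely illustrative example (the address, instruction bytes and flag
values below are made up; the layout just follows the pr_info()/pr_cont()
calls added in this patch), a failure report might now look roughly like:

  ftrace failed to modify [<ffffffff810a3c60>] wake_up_process+0x0/0x60
   actual:   cc:1f:44:00:00
  ftrace record flags: 80000001
   (1)    expected tramp: ffffffff8157f1c0

The extra record flags, callback count, trampoline and expected address
make it much easier to see what state ftrace thought the call site was in.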

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/powerpc/kernel/ftrace.c |  2 +-
 arch/x86/kernel/ftrace.c     |  2 +-
 include/linux/ftrace.h       |  4 +++-
 kernel/trace/ftrace.c        | 38 +++++++++++++++++++++++++++++---------
 4 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c
index 390311c0f03d..e66af6d265e8 100644
--- a/arch/powerpc/kernel/ftrace.c
+++ b/arch/powerpc/kernel/ftrace.c
@@ -449,7 +449,7 @@ void ftrace_replace_code(int enable)
 		rec = ftrace_rec_iter_record(iter);
 		ret = __ftrace_replace_code(rec, enable);
 		if (ret) {
-			ftrace_bug(ret, rec->ip);
+			ftrace_bug(ret, rec);
 			return;
 		}
 	}
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 4cfeca6ffe11..1aea94d336c7 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -583,7 +583,7 @@ void ftrace_replace_code(int enable)
 
  remove_breakpoints:
 	pr_warn("Failed on %s (%d):\n", report, count);
-	ftrace_bug(ret, rec ? rec->ip : 0);
+	ftrace_bug(ret, rec);
 	for_ftrace_rec_iter(iter) {
 		rec = ftrace_rec_iter_record(iter);
 		/*
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 06e3ca5a5083..619e37cc17fd 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -263,7 +263,9 @@ struct ftrace_func_command {
 int ftrace_arch_code_modify_prepare(void);
 int ftrace_arch_code_modify_post_process(void);
 
-void ftrace_bug(int err, unsigned long ip);
+struct dyn_ftrace;
+
+void ftrace_bug(int err, struct dyn_ftrace *rec);
 
 struct seq_file;
 
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index eab3123a1fbe..4043332f6720 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1738,10 +1738,13 @@ static void print_ip_ins(const char *fmt, unsigned char *p)
 		printk(KERN_CONT "%s%02x", i ? ":" : "", p[i]);
 }
 
+static struct ftrace_ops *
+ftrace_find_tramp_ops_any(struct dyn_ftrace *rec);
+
 /**
  * ftrace_bug - report and shutdown function tracer
  * @failed: The failed type (EFAULT, EINVAL, EPERM)
- * @ip: The address that failed
+ * @rec: The record that failed
  *
  * The arch code that enables or disables the function tracing
  * can call ftrace_bug() when it has detected a problem in
@@ -1750,8 +1753,10 @@ static void print_ip_ins(const char *fmt, unsigned char *p)
  * EINVAL - if what is read at @ip is not what was expected
  * EPERM - if the problem happens on writting to the @ip address
  */
-void ftrace_bug(int failed, unsigned long ip)
+void ftrace_bug(int failed, struct dyn_ftrace *rec)
 {
+	unsigned long ip = rec ? rec->ip : 0;
+
 	switch (failed) {
 	case -EFAULT:
 		FTRACE_WARN_ON_ONCE(1);
@@ -1763,7 +1768,7 @@ void ftrace_bug(int failed, unsigned long ip)
 		pr_info("ftrace failed to modify ");
 		print_ip_sym(ip);
 		print_ip_ins(" actual: ", (unsigned char *)ip);
-		printk(KERN_CONT "\n");
+		pr_cont("\n");
 		break;
 	case -EPERM:
 		FTRACE_WARN_ON_ONCE(1);
@@ -1775,6 +1780,24 @@ void ftrace_bug(int failed, unsigned long ip)
 		pr_info("ftrace faulted on unknown error ");
 		print_ip_sym(ip);
 	}
+	if (rec) {
+		struct ftrace_ops *ops = NULL;
+
+		pr_info("ftrace record flags: %lx\n", rec->flags);
+		pr_cont(" (%ld)%s", ftrace_rec_count(rec),
+			rec->flags & FTRACE_FL_REGS ? " R" : "  ");
+		if (rec->flags & FTRACE_FL_TRAMP_EN) {
+			ops = ftrace_find_tramp_ops_any(rec);
+			if (ops)
+				pr_cont("\ttramp: %pS",
+					(void *)ops->trampoline);
+			else
+				pr_cont("\ttramp: ERROR!");
+
+		}
+		ip = ftrace_get_addr_curr(rec);
+		pr_cont(" expected tramp: %lx\n", ip);
+	}
 }
 
 static int ftrace_check_record(struct dyn_ftrace *rec, int enable, int update)
@@ -2097,7 +2120,7 @@ void __weak ftrace_replace_code(int enable)
 	do_for_each_ftrace_rec(pg, rec) {
 		failed = __ftrace_replace_code(rec, enable);
 		if (failed) {
-			ftrace_bug(failed, rec->ip);
+			ftrace_bug(failed, rec);
 			/* Stop processing */
 			return;
 		}
@@ -2179,17 +2202,14 @@ struct dyn_ftrace *ftrace_rec_iter_record(struct ftrace_rec_iter *iter)
 static int
 ftrace_code_disable(struct module *mod, struct dyn_ftrace *rec)
 {
-	unsigned long ip;
 	int ret;
 
-	ip = rec->ip;
-
 	if (unlikely(ftrace_disabled))
 		return 0;
 
 	ret = ftrace_make_nop(mod, rec, MCOUNT_ADDR);
 	if (ret) {
-		ftrace_bug(ret, ip);
+		ftrace_bug(ret, rec);
 		return 0;
 	}
 	return 1;
@@ -2633,7 +2653,7 @@ static int ftrace_update_code(struct module *mod, struct ftrace_page *new_pgs)
 			if (ftrace_start_up && cnt) {
 				int failed = __ftrace_replace_code(p, 1);
 				if (failed)
-					ftrace_bug(failed, p->ip);
+					ftrace_bug(failed, p);
 			}
 		}
 	}
-- 
2.1.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support
  2014-10-27 18:27 [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support Steven Rostedt
                   ` (3 preceding siblings ...)
  2014-10-27 18:27 ` [for-next][PATCH 4/4] ftrace: Add more information to ftrace_bug() output Steven Rostedt
@ 2014-10-29 16:27 ` Jiri Kosina
  2014-10-30 16:56   ` Steven Rostedt
  2014-10-31 10:04 ` Masami Hiramatsu
  5 siblings, 1 reply; 23+ messages in thread
From: Jiri Kosina @ 2014-10-29 16:27 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Josh Poimboeuf, Vojtech Pavlik, Seth Jennings, Paul E. McKenney

On Mon, 27 Oct 2014, Steven Rostedt wrote:

> This adds the allocation of dynamic trampolines. It still does not
> allow for dynamic ftrace_ops to use them on CONFIG_PREEMPT systems.
> That will come in 3.20, as I want to test out call_rcu_tasks() for
> a bit first on my own boxes.
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
> for-next
> 
> Head SHA1: 9fd7caf67913fbf35d834beaa8f9764d733a9236
> 
> 
> Steven Rostedt (Red Hat) (4):
>       ftrace/x86: Add dynamic allocated trampoline for ftrace_ops
>       ftrace/x86: Show trampoline call function in enabled_functions
>       ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines
>       ftrace: Add more information to ftrace_bug() output

FWIW I just did a very quick test, and kGraft seems to run flawlessly on 
top of this.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support
  2014-10-29 16:27 ` [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support Jiri Kosina
@ 2014-10-30 16:56   ` Steven Rostedt
  2014-10-30 17:08     ` Jiri Kosina
  0 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2014-10-30 16:56 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Josh Poimboeuf, Vojtech Pavlik, Seth Jennings, Paul E. McKenney

On Wed, 29 Oct 2014 17:27:30 +0100 (CET)
Jiri Kosina <jkosina@suse.cz> wrote:

> On Mon, 27 Oct 2014, Steven Rostedt wrote:
> 
> > This adds the allocation of dynamic trampolines. It still does not
> > allow for dynamic ftrace_ops to use them on CONFIG_PREEMPT systems.
> > That will come in 3.20, as I want to test out call_rcu_tasks() for
> > a bit first on my own boxes.
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
> > for-next
> > 
> > Head SHA1: 9fd7caf67913fbf35d834beaa8f9764d733a9236
> > 
> > 
> > Steven Rostedt (Red Hat) (4):
> >       ftrace/x86: Add dynamic allocated trampoline for ftrace_ops
> >       ftrace/x86: Show trampoline call function in enabled_functions
> >       ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines
> >       ftrace: Add more information to ftrace_bug() output
> 
> FWIW I just did a very quick test, and kGraft seems to run flawlessly on 
> top of this.
> 

Does that mean I can add your Tested-by: for all these patches?

-- Steve

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops
  2014-10-27 18:27 ` [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops Steven Rostedt
@ 2014-10-30 17:00   ` Steven Rostedt
  2014-10-31  5:19   ` Masami Hiramatsu
  2014-11-05 10:28   ` Borislav Petkov
  2 siblings, 0 replies; 23+ messages in thread
From: Steven Rostedt @ 2014-10-30 17:00 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Jiri Kosina, Josh Poimboeuf, Vojtech Pavlik, Seth Jennings,
	Paul E. McKenney


H. Peter,

Can you give me your acked-by for this?

Thanks!

-- Steve


On Mon, 27 Oct 2014 14:27:03 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> The current method of handling multiple function callbacks is to register
> a list function callback that calls all the other callbacks based on
> their hash tables, comparing each hash against the function that the
> callback was called on. But this is very inefficient.
> 
> For example, if you are tracing all functions in the kernel and then
> add a kprobe to a function such that the kprobe uses ftrace, the
> mcount trampoline will switch from calling the function trace callback
> to calling the list callback that will iterate over all registered
> ftrace_ops (in this case, the function tracer and the kprobes callback).
> That means that for every function being traced, the hashes of both the
> function tracing and kprobes ftrace_ops are checked, even though the
> kprobe is only set on a single function. The kprobes ftrace_ops is
> checked for every function being traced!
> 
> Instead of calling the list function for functions that are only being
> traced by a single callback, we can call a dynamically allocated
> trampoline that calls the callback directly. The function graph tracer
> already uses a direct call trampoline when it is being traced by itself,
> but it is not dynamically allocated. Its trampoline is static in the
> kernel core. The infrastructure that called the function graph trampoline
> can also be used to call a dynamically allocated one.
> 
> For now, only ftrace_ops that are not dynamically allocated can have
> a trampoline. That is, users such as the function tracer or stack tracer.
> kprobes and perf allocate their ftrace_ops, and until there's a safe
> way to free the trampoline, it cannot be used. Dynamically allocated
> ftrace_ops may, however, use the trampoline if the kernel is not
> compiled with CONFIG_PREEMPT. But that will come later.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  arch/x86/kernel/ftrace.c    | 195 ++++++++++++++++++++++++++++++++++++++++++--
>  arch/x86/kernel/mcount_64.S |  25 +++++-
>  include/linux/ftrace.h      |   8 ++
>  kernel/trace/ftrace.c       |  40 ++++++++-
>  4 files changed, 254 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index 3386dc9aa333..e4d48f6cad86 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -17,6 +17,7 @@
>  #include <linux/ftrace.h>
>  #include <linux/percpu.h>
>  #include <linux/sched.h>
> +#include <linux/slab.h>
>  #include <linux/init.h>
>  #include <linux/list.h>
>  #include <linux/module.h>
> @@ -644,13 +645,8 @@ int __init ftrace_dyn_arch_init(void)
>  {
>  	return 0;
>  }
> -#endif
> -
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -
> -#ifdef CONFIG_DYNAMIC_FTRACE
> -extern void ftrace_graph_call(void);
>  
> +#if defined(CONFIG_X86_64) || defined(CONFIG_FUNCTION_GRAPH_TRACER)
>  static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
>  {
>  	static union ftrace_code_union calc;
> @@ -664,6 +660,193 @@ static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
>  	 */
>  	return calc.code;
>  }
> +#endif
> +
> +/* Currently only x86_64 supports dynamic trampolines */
> +#ifdef CONFIG_X86_64
> +
> +#ifdef CONFIG_MODULES
> +#include <linux/moduleloader.h>
> +/* Module allocation simplifies allocating memory for code */
> +static inline void *alloc_tramp(unsigned long size)
> +{
> +	return module_alloc(size);
> +}
> +static inline void tramp_free(void *tramp)
> +{
> +	module_free(NULL, tramp);
> +}
> +#else
> +/* Trampolines can only be created if modules are supported */
> +static inline void *alloc_tramp(unsigned long size)
> +{
> +	return NULL;
> +}
> +static inline void tramp_free(void *tramp) { }
> +#endif
> +
> +/* Defined as markers to the end of the ftrace default trampolines */
> +extern void ftrace_caller_end(void);
> +extern void ftrace_regs_caller_end(void);
> +extern void ftrace_return(void);
> +extern void ftrace_caller_op_ptr(void);
> +extern void ftrace_regs_caller_op_ptr(void);
> +
> +/* movq function_trace_op(%rip), %rdx */
> +/* 0x48 0x8b 0x15 <offset-to-ftrace_trace_op (4 bytes)> */
> +#define OP_REF_SIZE	7
> +
> +/*
> + * The ftrace_ops is passed to the function callback. Since the
> + * trampoline only services a single ftrace_ops, we can pass in
> + * that ops directly.
> + *
> + * The ftrace_op_code_union is used to create a pointer to the
> + * ftrace_ops that will be passed to the callback function.
> + */
> +union ftrace_op_code_union {
> +	char code[OP_REF_SIZE];
> +	struct {
> +		char op[3];
> +		int offset;
> +	} __attribute__((packed));
> +};
> +
> +static unsigned long create_trampoline(struct ftrace_ops *ops)
> +{
> +	unsigned const char *jmp;
> +	unsigned long start_offset;
> +	unsigned long end_offset;
> +	unsigned long op_offset;
> +	unsigned long offset;
> +	unsigned long size;
> +	unsigned long ip;
> +	unsigned long *ptr;
> +	void *trampoline;
> +	/* 48 8b 15 <offset> is movq <offset>(%rip), %rdx */
> +	unsigned const char op_ref[] = { 0x48, 0x8b, 0x15 };
> +	union ftrace_op_code_union op_ptr;
> +	int ret;
> +
> +	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
> +		start_offset = (unsigned long)ftrace_regs_caller;
> +		end_offset = (unsigned long)ftrace_regs_caller_end;
> +		op_offset = (unsigned long)ftrace_regs_caller_op_ptr;
> +	} else {
> +		start_offset = (unsigned long)ftrace_caller;
> +		end_offset = (unsigned long)ftrace_caller_end;
> +		op_offset = (unsigned long)ftrace_caller_op_ptr;
> +	}
> +
> +	size = end_offset - start_offset;
> +
> +	/*
> +	 * Allocate enough size to store the ftrace_caller code,
> +	 * the jmp to ftrace_return, as well as the address of
> +	 * the ftrace_ops this trampoline is used for.
> +	 */
> +	trampoline = alloc_tramp(size + MCOUNT_INSN_SIZE + sizeof(void *));
> +	if (!trampoline)
> +		return 0;
> +
> +	/* Copy ftrace_caller onto the trampoline memory */
> +	ret = probe_kernel_read(trampoline, (void *)start_offset, size);
> +	if (WARN_ON(ret < 0)) {
> +		tramp_free(trampoline);
> +		return 0;
> +	}
> +
> +	ip = (unsigned long)trampoline + size;
> +
> +	/* The trampoline ends with a jmp to ftrace_return */
> +	jmp = ftrace_jmp_replace(ip, (unsigned long)ftrace_return);
> +	memcpy(trampoline + size, jmp, MCOUNT_INSN_SIZE);
> +
> +	/*
> +	 * The address of the ftrace_ops that is used for this trampoline
> +	 * is stored at the end of the trampoline. This will be used to
> +	 * load the third parameter for the callback. Basically, that
> +	 * location at the end of the trampoline takes the place of
> +	 * the global function_trace_op variable.
> +	 */
> +
> +	ptr = (unsigned long *)(trampoline + size + MCOUNT_INSN_SIZE);
> +	*ptr = (unsigned long)ops;
> +
> +	op_offset -= start_offset;
> +	memcpy(&op_ptr, trampoline + op_offset, OP_REF_SIZE);
> +
> +	/* Are we pointing to the reference? */
> +	if (WARN_ON(memcmp(op_ptr.op, op_ref, 3) != 0)) {
> +		tramp_free(trampoline);
> +		return 0;
> +	}
> +
> +	/* Load the contents of ptr into the callback parameter */
> +	offset = (unsigned long)ptr;
> +	offset -= (unsigned long)trampoline + op_offset + OP_REF_SIZE;
> +
> +	op_ptr.offset = offset;
> +
> +	/* put in the new offset to the ftrace_ops */
> +	memcpy(trampoline + op_offset, &op_ptr, OP_REF_SIZE);
> +
> +	/* ALLOC_TRAMP flags lets us know we created it */
> +	ops->flags |= FTRACE_OPS_FL_ALLOC_TRAMP;
> +
> +	return (unsigned long)trampoline;
> +}
> +
> +void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
> +{
> +	ftrace_func_t func;
> +	unsigned char *new;
> +	unsigned long start_offset;
> +	unsigned long call_offset;
> +	unsigned long offset;
> +	unsigned long ip;
> +	int ret;
> +
> +	if (ops->trampoline) {
> +		/*
> +		 * The ftrace_ops caller may set up its own trampoline.
> +		 * In such a case, this code must not modify it.
> +		 */
> +		if (!(ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP))
> +			return;
> +	} else {
> +		ops->trampoline = create_trampoline(ops);
> +		if (!ops->trampoline)
> +			return;
> +	}
> +
> +	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
> +		start_offset = (unsigned long)ftrace_regs_caller;
> +		call_offset = (unsigned long)ftrace_regs_call;
> +	} else {
> +		start_offset = (unsigned long)ftrace_caller;
> +		call_offset = (unsigned long)ftrace_call;
> +	}
> +
> +	offset = call_offset - start_offset;
> +	ip = ops->trampoline + offset;
> +
> +	func = ftrace_ops_get_func(ops);
> +
> +	/* Do a safe modify in case the trampoline is executing */
> +	new = ftrace_call_replace(ip, (unsigned long)func);
> +	ret = update_ftrace_func(ip, new);
> +
> +	/* The update should never fail */
> +	WARN_ON(ret);
> +}
> +#endif /* CONFIG_X86_64 */
> +#endif /* CONFIG_DYNAMIC_FTRACE */
> +
> +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> +
> +#ifdef CONFIG_DYNAMIC_FTRACE
> +extern void ftrace_graph_call(void);
>  
>  static int ftrace_mod_jmp(unsigned long ip, void *func)
>  {
> diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
> index c73aecf10d34..42f0cdd20baf 100644
> --- a/arch/x86/kernel/mcount_64.S
> +++ b/arch/x86/kernel/mcount_64.S
> @@ -28,9 +28,11 @@ ENTRY(function_hook)
>  END(function_hook)
>  
>  /* skip is set if stack has been adjusted */
> -.macro ftrace_caller_setup skip=0
> +.macro ftrace_caller_setup trace_label skip=0
>  	MCOUNT_SAVE_FRAME \skip
>  
> +	/* Save this location */
> +GLOBAL(\trace_label)
>  	/* Load the ftrace_ops into the 3rd parameter */
>  	movq function_trace_op(%rip), %rdx
>  
> @@ -46,7 +48,7 @@ END(function_hook)
>  .endm
>  
>  ENTRY(ftrace_caller)
> -	ftrace_caller_setup
> +	ftrace_caller_setup ftrace_caller_op_ptr
>  	/* regs go into 4th parameter (but make it NULL) */
>  	movq $0, %rcx
>  
> @@ -54,7 +56,14 @@ GLOBAL(ftrace_call)
>  	call ftrace_stub
>  
>  	MCOUNT_RESTORE_FRAME
> -ftrace_return:
> +
> +	/*
> +	 * The copied trampoline must call ftrace_return as it
> +	 * still may need to call the function graph tracer.
> +	 */
> +GLOBAL(ftrace_caller_end)
> +
> +GLOBAL(ftrace_return)
>  
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  GLOBAL(ftrace_graph_call)
> @@ -70,7 +79,7 @@ ENTRY(ftrace_regs_caller)
>  	pushfq
>  
>  	/* skip=8 to skip flags saved in SS */
> -	ftrace_caller_setup 8
> +	ftrace_caller_setup ftrace_regs_caller_op_ptr 8
>  
>  	/* Save the rest of pt_regs */
>  	movq %r15, R15(%rsp)
> @@ -122,6 +131,14 @@ GLOBAL(ftrace_regs_call)
>  	/* Restore flags */
>  	popfq
>  
> +	/*
> +	 * As this jmp to ftrace_return can be a short jump
> +	 * it must not be copied into the trampoline.
> +	 * The trampoline will add the code to jump
> +	 * to the return.
> +	 */
> +GLOBAL(ftrace_regs_caller_end)
> +
>  	jmp ftrace_return
>  
>  	popfq
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 662697babd48..06e3ca5a5083 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -94,6 +94,13 @@ ftrace_func_t ftrace_ops_get_func(struct ftrace_ops *ops);
>   * ADDING  - The ops is in the process of being added.
>   * REMOVING - The ops is in the process of being removed.
>   * MODIFYING - The ops is in the process of changing its filter functions.
> + * ALLOC_TRAMP - A dynamic trampoline was allocated by the core code.
> + *            The arch specific code sets this flag when it allocated a
> + *            trampoline. This lets the arch know that it can update the
> + *            trampoline in case the callback function changes.
> + *            The ftrace_ops trampoline can be set by the ftrace users, and
> + *            in such cases the arch must not modify it. Only the arch ftrace
> + *            core code should set this flag.
>   */
>  enum {
>  	FTRACE_OPS_FL_ENABLED			= 1 << 0,
> @@ -108,6 +115,7 @@ enum {
>  	FTRACE_OPS_FL_ADDING			= 1 << 9,
>  	FTRACE_OPS_FL_REMOVING			= 1 << 10,
>  	FTRACE_OPS_FL_MODIFYING			= 1 << 11,
> +	FTRACE_OPS_FL_ALLOC_TRAMP		= 1 << 12,
>  };
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 31c90fec4158..15f85eac7e95 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -387,6 +387,8 @@ static int remove_ftrace_list_ops(struct ftrace_ops **list,
>  	return ret;
>  }
>  
> +static void ftrace_update_trampoline(struct ftrace_ops *ops);
> +
>  static int __register_ftrace_function(struct ftrace_ops *ops)
>  {
>  	if (ops->flags & FTRACE_OPS_FL_DELETED)
> @@ -419,6 +421,8 @@ static int __register_ftrace_function(struct ftrace_ops *ops)
>  	} else
>  		add_ftrace_ops(&ftrace_ops_list, ops);
>  
> +	ftrace_update_trampoline(ops);
> +
>  	if (ftrace_enabled)
>  		update_ftrace_function();
>  
> @@ -3020,9 +3024,6 @@ ftrace_enabled_open(struct inode *inode, struct file *file)
>  {
>  	struct ftrace_iterator *iter;
>  
> -	if (unlikely(ftrace_disabled))
> -		return -ENODEV;
> -
>  	iter = __seq_open_private(file, &show_ftrace_seq_ops, sizeof(*iter));
>  	if (iter) {
>  		iter->pg = ftrace_pages_start;
> @@ -3975,6 +3976,9 @@ static char ftrace_graph_buf[FTRACE_FILTER_SIZE] __initdata;
>  static char ftrace_graph_notrace_buf[FTRACE_FILTER_SIZE] __initdata;
>  static int ftrace_set_func(unsigned long *array, int *idx, int size, char *buffer);
>  
> +static unsigned long save_global_trampoline;
> +static unsigned long save_global_flags;
> +
>  static int __init set_graph_function(char *str)
>  {
>  	strlcpy(ftrace_graph_buf, str, FTRACE_FILTER_SIZE);
> @@ -4696,6 +4700,20 @@ void __init ftrace_init(void)
>  	ftrace_disabled = 1;
>  }
>  
> +/* Do nothing if arch does not support this */
> +void __weak arch_ftrace_update_trampoline(struct ftrace_ops *ops)
> +{
> +}
> +
> +static void ftrace_update_trampoline(struct ftrace_ops *ops)
> +{
> +	/* Currently, only non dynamic ops can have a trampoline */
> +	if (ops->flags & FTRACE_OPS_FL_DYNAMIC)
> +		return;
> +
> +	arch_ftrace_update_trampoline(ops);
> +}
> +
>  #else
>  
>  static struct ftrace_ops global_ops = {
> @@ -4738,6 +4756,10 @@ ftrace_ops_test(struct ftrace_ops *ops, unsigned long ip, void *regs)
>  	return 1;
>  }
>  
> +static void ftrace_update_trampoline(struct ftrace_ops *ops)
> +{
> +}
> +
>  #endif /* CONFIG_DYNAMIC_FTRACE */
>  
>  __init void ftrace_init_global_array_ops(struct trace_array *tr)
> @@ -5522,7 +5544,6 @@ int register_ftrace_graph(trace_func_graph_ret_t retfunc,
>  	update_function_graph_func();
>  
>  	ret = ftrace_startup(&graph_ops, FTRACE_START_FUNC_RET);
> -
>  out:
>  	mutex_unlock(&ftrace_lock);
>  	return ret;
> @@ -5543,6 +5564,17 @@ void unregister_ftrace_graph(void)
>  	unregister_pm_notifier(&ftrace_suspend_notifier);
>  	unregister_trace_sched_switch(ftrace_graph_probe_sched_switch, NULL);
>  
> +#ifdef CONFIG_DYNAMIC_FTRACE
> +	/*
> +	 * Function graph does not allocate the trampoline, but
> +	 * other global_ops do. We need to reset the ALLOC_TRAMP flag
> +	 * if one was used.
> +	 */
> +	global_ops.trampoline = save_global_trampoline;
> +	if (save_global_flags & FTRACE_OPS_FL_ALLOC_TRAMP)
> +		global_ops.flags |= FTRACE_OPS_FL_ALLOC_TRAMP;
> +#endif
> +
>   out:
>  	mutex_unlock(&ftrace_lock);
>  }


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 2/4] ftrace/x86: Show trampoline call function in enabled_functions
  2014-10-27 18:27 ` [for-next][PATCH 2/4] ftrace/x86: Show trampoline call function in enabled_functions Steven Rostedt
@ 2014-10-30 17:00   ` Steven Rostedt
  2014-11-05 10:42   ` Borislav Petkov
  1 sibling, 0 replies; 23+ messages in thread
From: Steven Rostedt @ 2014-10-30 17:00 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Jiri Kosina, Josh Poimboeuf, Vojtech Pavlik, Seth Jennings,
	Paul E. McKenney

H. Peter,

Can you give me your acked-by for this?

Thanks,

-- Steve


On Mon, 27 Oct 2014 14:27:04 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> The file /sys/kernel/debug/tracing/enabled_functions is used to debug
> ftrace function hooks. Add to the output what function is being called
> by the trampoline if the arch supports it.
> 
> Add support for this feature in x86_64.
> 
> Cc: H. Peter Anvin <hpa@linux.intel.com>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  arch/x86/kernel/ftrace.c | 98 ++++++++++++++++++++++++++++++++++++++++++------
>  kernel/trace/ftrace.c    | 22 ++++++++++-
>  2 files changed, 106 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index e4d48f6cad86..ca17c20a1010 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -48,7 +48,7 @@ int ftrace_arch_code_modify_post_process(void)
>  union ftrace_code_union {
>  	char code[MCOUNT_INSN_SIZE];
>  	struct {
> -		char e8;
> +		unsigned char e8;
>  		int offset;
>  	} __attribute__((packed));
>  };
> @@ -797,12 +797,26 @@ static unsigned long create_trampoline(struct ftrace_ops *ops)
>  	return (unsigned long)trampoline;
>  }
>  
> +static unsigned long calc_trampoline_call_offset(bool save_regs)
> +{
> +	unsigned long start_offset;
> +	unsigned long call_offset;
> +
> +	if (save_regs) {
> +		start_offset = (unsigned long)ftrace_regs_caller;
> +		call_offset = (unsigned long)ftrace_regs_call;
> +	} else {
> +		start_offset = (unsigned long)ftrace_caller;
> +		call_offset = (unsigned long)ftrace_call;
> +	}
> +
> +	return call_offset - start_offset;
> +}
> +
>  void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
>  {
>  	ftrace_func_t func;
>  	unsigned char *new;
> -	unsigned long start_offset;
> -	unsigned long call_offset;
>  	unsigned long offset;
>  	unsigned long ip;
>  	int ret;
> @@ -820,15 +834,7 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
>  			return;
>  	}
>  
> -	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
> -		start_offset = (unsigned long)ftrace_regs_caller;
> -		call_offset = (unsigned long)ftrace_regs_call;
> -	} else {
> -		start_offset = (unsigned long)ftrace_caller;
> -		call_offset = (unsigned long)ftrace_call;
> -	}
> -
> -	offset = call_offset - start_offset;
> +	offset = calc_trampoline_call_offset(ops->flags & FTRACE_OPS_FL_SAVE_REGS);
>  	ip = ops->trampoline + offset;
>  
>  	func = ftrace_ops_get_func(ops);
> @@ -840,6 +846,74 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
>  	/* The update should never fail */
>  	WARN_ON(ret);
>  }
> +
> +/* Return the address of the function the trampoline calls */
> +static void *addr_from_call(void *ptr)
> +{
> +	union ftrace_code_union calc;
> +	int ret;
> +
> +	ret = probe_kernel_read(&calc, ptr, MCOUNT_INSN_SIZE);
> +	if (WARN_ON_ONCE(ret < 0))
> +		return NULL;
> +
> +	/* Make sure this is a call */
> +	if (WARN_ON_ONCE(calc.e8 != 0xe8)) {
> +		pr_warn("Expected e8, got %x\n", calc.e8);
> +		return NULL;
> +	}
> +
> +	return ptr + MCOUNT_INSN_SIZE + calc.offset;
> +}
> +
> +void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
> +			   unsigned long frame_pointer);
> +
> +/*
> + * If the ops->trampoline was not allocated, then it probably
> + * has a static trampoline func, or is the ftrace caller itself.
> + */
> +static void *static_tramp_func(struct ftrace_ops *ops, struct dyn_ftrace *rec)
> +{
> +	unsigned long offset;
> +	bool save_regs = rec->flags & FTRACE_FL_REGS_EN;
> +	void *ptr;
> +
> +	if (ops && ops->trampoline) {
> +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> +		/*
> +		 * We only know about function graph tracer setting as static
> +		 * trampoline.
> +		 */
> +		if (ops->trampoline == FTRACE_GRAPH_ADDR)
> +			return (void *)prepare_ftrace_return;
> +#endif
> +		return NULL;
> +	}
> +
> +	offset = calc_trampoline_call_offset(save_regs);
> +
> +	if (save_regs)
> +		ptr = (void *)FTRACE_REGS_ADDR + offset;
> +	else
> +		ptr = (void *)FTRACE_ADDR + offset;
> +
> +	return addr_from_call(ptr);
> +}
> +
> +void *arch_ftrace_trampoline_func(struct ftrace_ops *ops, struct dyn_ftrace *rec)
> +{
> +	unsigned long offset;
> +
> +	/* If we didn't allocate this trampoline, consider it static */
> +	if (!ops || !(ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP))
> +		return static_tramp_func(ops, rec);
> +
> +	offset = calc_trampoline_call_offset(ops->flags & FTRACE_OPS_FL_SAVE_REGS);
> +	return addr_from_call((void *)ops->trampoline + offset);
> +}
> +
> +
>  #endif /* CONFIG_X86_64 */
>  #endif /* CONFIG_DYNAMIC_FTRACE */
>  
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 15f85eac7e95..422e1f8300b1 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2952,6 +2952,22 @@ static void t_stop(struct seq_file *m, void *p)
>  	mutex_unlock(&ftrace_lock);
>  }
>  
> +void * __weak
> +arch_ftrace_trampoline_func(struct ftrace_ops *ops, struct dyn_ftrace *rec)
> +{
> +	return NULL;
> +}
> +
> +static void add_trampoline_func(struct seq_file *m, struct ftrace_ops *ops,
> +				struct dyn_ftrace *rec)
> +{
> +	void *ptr;
> +
> +	ptr = arch_ftrace_trampoline_func(ops, rec);
> +	if (ptr)
> +		seq_printf(m, " ->%pS", ptr);
> +}
> +
>  static int t_show(struct seq_file *m, void *v)
>  {
>  	struct ftrace_iterator *iter = m->private;
> @@ -2975,19 +2991,21 @@ static int t_show(struct seq_file *m, void *v)
>  
>  	seq_printf(m, "%ps", (void *)rec->ip);
>  	if (iter->flags & FTRACE_ITER_ENABLED) {
> +		struct ftrace_ops *ops = NULL;
> +
>  		seq_printf(m, " (%ld)%s",
>  			   ftrace_rec_count(rec),
>  			   rec->flags & FTRACE_FL_REGS ? " R" : "  ");
>  		if (rec->flags & FTRACE_FL_TRAMP_EN) {
> -			struct ftrace_ops *ops;
> -
>  			ops = ftrace_find_tramp_ops_any(rec);
>  			if (ops)
>  				seq_printf(m, "\ttramp: %pS",
>  					   (void *)ops->trampoline);
>  			else
>  				seq_printf(m, "\ttramp: ERROR!");
> +
>  		}
> +		add_trampoline_func(m, ops, rec);
>  	}	
>  
>  	seq_printf(m, "\n");


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 3/4] ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines
  2014-10-27 18:27 ` [for-next][PATCH 3/4] ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines Steven Rostedt
@ 2014-10-30 17:01   ` Steven Rostedt
  2014-11-05 10:46   ` Borislav Petkov
  1 sibling, 0 replies; 23+ messages in thread
From: Steven Rostedt @ 2014-10-30 17:01 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Jiri Kosina, Josh Poimboeuf, Vojtech Pavlik, Seth Jennings,
	Paul E. McKenney

H. Peter,

Can you give me your acked-by for this?

Thanks,

-- Steve


On Mon, 27 Oct 2014 14:27:05 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> When the static ftrace_ops (like function tracer) enables tracing, and it
> is the only callback that is referencing a function, a trampoline is
> dynamically allocated to the function that calls the callback directly
> instead of calling a loop function that iterates over all the registered
> ftrace ops (if more than one ops is registered).
> 
> But when it comes to dynamically allocated ftrace_ops, where they may be
> freed, on a CONFIG_PREEMPT kernel there's no way to know when it is safe
> to free the trampoline. If a task was preempted while executing on the
> trampoline, there's currently no way to know when it will be off that
> trampoline.
> 
> But this is not true when it comes to !CONFIG_PREEMPT. The current method
> of calling schedule_on_each_cpu() will force tasks off the trampoline,
> because they cannot schedule while on it (kernel preemption is not
> configured). That means it is safe to free a dynamically allocated
> ftrace ops trampoline when CONFIG_PREEMPT is not configured.
> 
> Cc: H. Peter Anvin <hpa@linux.intel.com>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  arch/x86/kernel/ftrace.c |  8 ++++++++
>  kernel/trace/ftrace.c    | 18 ++++++++++++++++++
>  2 files changed, 26 insertions(+)
> 
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index ca17c20a1010..4cfeca6ffe11 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -913,6 +913,14 @@ void *arch_ftrace_trampoline_func(struct ftrace_ops *ops, struct dyn_ftrace *rec
>  	return addr_from_call((void *)ops->trampoline + offset);
>  }
>  
> +void arch_ftrace_trampoline_free(struct ftrace_ops *ops)
> +{
> +	if (!ops || !(ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP))
> +		return;
> +
> +	tramp_free((void *)ops->trampoline);
> +	ops->trampoline = 0;
> +}
>  
>  #endif /* CONFIG_X86_64 */
>  #endif /* CONFIG_DYNAMIC_FTRACE */
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 422e1f8300b1..eab3123a1fbe 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2324,6 +2324,10 @@ static void ftrace_run_modify_code(struct ftrace_ops *ops, int command,
>  static ftrace_func_t saved_ftrace_func;
>  static int ftrace_start_up;
>  
> +void __weak arch_ftrace_trampoline_free(struct ftrace_ops *ops)
> +{
> +}
> +
>  static void control_ops_free(struct ftrace_ops *ops)
>  {
>  	free_percpu(ops->disabled);
> @@ -2475,6 +2479,8 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)
>  	if (ops->flags & (FTRACE_OPS_FL_DYNAMIC | FTRACE_OPS_FL_CONTROL)) {
>  		schedule_on_each_cpu(ftrace_sync);
>  
> +		arch_ftrace_trampoline_free(ops);
> +
>  		if (ops->flags & FTRACE_OPS_FL_CONTROL)
>  			control_ops_free(ops);
>  	}
> @@ -4725,9 +4731,21 @@ void __weak arch_ftrace_update_trampoline(struct ftrace_ops *ops)
>  
>  static void ftrace_update_trampoline(struct ftrace_ops *ops)
>  {
> +
> +/*
> + * Currently there's no safe way to free a trampoline when the kernel
> + * is configured with PREEMPT. That is because a task could be preempted
> + * when it jumped to the trampoline, it may be preempted for a long time
> + * depending on the system load, and currently there's no way to know
> + * when it will be off the trampoline. If the trampoline is freed
> + * too early, when the task runs again, it will be executing on freed
> + * memory and crash.
> + */
> +#ifdef CONFIG_PREEMPT
>  	/* Currently, only non dynamic ops can have a trampoline */
>  	if (ops->flags & FTRACE_OPS_FL_DYNAMIC)
>  		return;
> +#endif
>  
>  	arch_ftrace_update_trampoline(ops);
>  }


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 4/4] ftrace: Add more information to ftrace_bug() output
  2014-10-27 18:27 ` [for-next][PATCH 4/4] ftrace: Add more information to ftrace_bug() output Steven Rostedt
@ 2014-10-30 17:02   ` Steven Rostedt
  2014-11-05 10:49   ` Borislav Petkov
  1 sibling, 0 replies; 23+ messages in thread
From: Steven Rostedt @ 2014-10-30 17:02 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, H. Peter Anvin
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Jiri Kosina, Josh Poimboeuf, Vojtech Pavlik, Seth Jennings,
	Paul E. McKenney

Ben and H. Peter,

Can you give me your acked-bys for this?

Thanks!

-- Steve


On Mon, 27 Oct 2014 14:27:06 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> With the introduction of the dynamic trampolines, it is useful for
> ftrace_bug() to produce more information about the current state when
> things go wrong. This can help debug issues that may arise.
> 
> Ftrace has lots of checks to make sure that the state of the system it
> touches is exactly what it expects it to be. When it detects an abnormality
> it calls ftrace_bug() and disables itself to prevent any further damage.
> It is crucial that ftrace_bug() produces sufficient information that
> can be used to debug the situation.
> 
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  arch/powerpc/kernel/ftrace.c |  2 +-
>  arch/x86/kernel/ftrace.c     |  2 +-
>  include/linux/ftrace.h       |  4 +++-
>  kernel/trace/ftrace.c        | 38 +++++++++++++++++++++++++++++---------
>  4 files changed, 34 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c
> index 390311c0f03d..e66af6d265e8 100644
> --- a/arch/powerpc/kernel/ftrace.c
> +++ b/arch/powerpc/kernel/ftrace.c
> @@ -449,7 +449,7 @@ void ftrace_replace_code(int enable)
>  		rec = ftrace_rec_iter_record(iter);
>  		ret = __ftrace_replace_code(rec, enable);
>  		if (ret) {
> -			ftrace_bug(ret, rec->ip);
> +			ftrace_bug(ret, rec);
>  			return;
>  		}
>  	}
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index 4cfeca6ffe11..1aea94d336c7 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -583,7 +583,7 @@ void ftrace_replace_code(int enable)
>  
>   remove_breakpoints:
>  	pr_warn("Failed on %s (%d):\n", report, count);
> -	ftrace_bug(ret, rec ? rec->ip : 0);
> +	ftrace_bug(ret, rec);
>  	for_ftrace_rec_iter(iter) {
>  		rec = ftrace_rec_iter_record(iter);
>  		/*
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 06e3ca5a5083..619e37cc17fd 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -263,7 +263,9 @@ struct ftrace_func_command {
>  int ftrace_arch_code_modify_prepare(void);
>  int ftrace_arch_code_modify_post_process(void);
>  
> -void ftrace_bug(int err, unsigned long ip);
> +struct dyn_ftrace;
> +
> +void ftrace_bug(int err, struct dyn_ftrace *rec);
>  
>  struct seq_file;
>  
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index eab3123a1fbe..4043332f6720 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -1738,10 +1738,13 @@ static void print_ip_ins(const char *fmt, unsigned char *p)
>  		printk(KERN_CONT "%s%02x", i ? ":" : "", p[i]);
>  }
>  
> +static struct ftrace_ops *
> +ftrace_find_tramp_ops_any(struct dyn_ftrace *rec);
> +
>  /**
>   * ftrace_bug - report and shutdown function tracer
>   * @failed: The failed type (EFAULT, EINVAL, EPERM)
> - * @ip: The address that failed
> + * @rec: The record that failed
>   *
>   * The arch code that enables or disables the function tracing
>   * can call ftrace_bug() when it has detected a problem in
> @@ -1750,8 +1753,10 @@ static void print_ip_ins(const char *fmt, unsigned char *p)
>   * EINVAL - if what is read at @ip is not what was expected
>   * EPERM - if the problem happens on writting to the @ip address
>   */
> -void ftrace_bug(int failed, unsigned long ip)
> +void ftrace_bug(int failed, struct dyn_ftrace *rec)
>  {
> +	unsigned long ip = rec ? rec->ip : 0;
> +
>  	switch (failed) {
>  	case -EFAULT:
>  		FTRACE_WARN_ON_ONCE(1);
> @@ -1763,7 +1768,7 @@ void ftrace_bug(int failed, unsigned long ip)
>  		pr_info("ftrace failed to modify ");
>  		print_ip_sym(ip);
>  		print_ip_ins(" actual: ", (unsigned char *)ip);
> -		printk(KERN_CONT "\n");
> +		pr_cont("\n");
>  		break;
>  	case -EPERM:
>  		FTRACE_WARN_ON_ONCE(1);
> @@ -1775,6 +1780,24 @@ void ftrace_bug(int failed, unsigned long ip)
>  		pr_info("ftrace faulted on unknown error ");
>  		print_ip_sym(ip);
>  	}
> +	if (rec) {
> +		struct ftrace_ops *ops = NULL;
> +
> +		pr_info("ftrace record flags: %lx\n", rec->flags);
> +		pr_cont(" (%ld)%s", ftrace_rec_count(rec),
> +			rec->flags & FTRACE_FL_REGS ? " R" : "  ");
> +		if (rec->flags & FTRACE_FL_TRAMP_EN) {
> +			ops = ftrace_find_tramp_ops_any(rec);
> +			if (ops)
> +				pr_cont("\ttramp: %pS",
> +					(void *)ops->trampoline);
> +			else
> +				pr_cont("\ttramp: ERROR!");
> +
> +		}
> +		ip = ftrace_get_addr_curr(rec);
> +		pr_cont(" expected tramp: %lx\n", ip);
> +	}
>  }
>  
>  static int ftrace_check_record(struct dyn_ftrace *rec, int enable, int update)
> @@ -2097,7 +2120,7 @@ void __weak ftrace_replace_code(int enable)
>  	do_for_each_ftrace_rec(pg, rec) {
>  		failed = __ftrace_replace_code(rec, enable);
>  		if (failed) {
> -			ftrace_bug(failed, rec->ip);
> +			ftrace_bug(failed, rec);
>  			/* Stop processing */
>  			return;
>  		}
> @@ -2179,17 +2202,14 @@ struct dyn_ftrace *ftrace_rec_iter_record(struct ftrace_rec_iter *iter)
>  static int
>  ftrace_code_disable(struct module *mod, struct dyn_ftrace *rec)
>  {
> -	unsigned long ip;
>  	int ret;
>  
> -	ip = rec->ip;
> -
>  	if (unlikely(ftrace_disabled))
>  		return 0;
>  
>  	ret = ftrace_make_nop(mod, rec, MCOUNT_ADDR);
>  	if (ret) {
> -		ftrace_bug(ret, ip);
> +		ftrace_bug(ret, rec);
>  		return 0;
>  	}
>  	return 1;
> @@ -2633,7 +2653,7 @@ static int ftrace_update_code(struct module *mod, struct ftrace_page *new_pgs)
>  			if (ftrace_start_up && cnt) {
>  				int failed = __ftrace_replace_code(p, 1);
>  				if (failed)
> -					ftrace_bug(failed, p->ip);
> +					ftrace_bug(failed, p);
>  			}
>  		}
>  	}


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support
  2014-10-30 16:56   ` Steven Rostedt
@ 2014-10-30 17:08     ` Jiri Kosina
  0 siblings, 0 replies; 23+ messages in thread
From: Jiri Kosina @ 2014-10-30 17:08 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Josh Poimboeuf, Vojtech Pavlik, Seth Jennings, Paul E. McKenney

On Thu, 30 Oct 2014, Steven Rostedt wrote:

> > > This adds the allocation of dynamic trampolines. It still does not
> > > allow for dynamic ftrace_ops to use them on CONFIG_PREEMPT systems.
> > > That will come in 3.20, as I want to test out call_rcu_tasks() for
> > > a bit first on my own boxes.
> > > 
> > >   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
> > > for-next
> > > 
> > > Head SHA1: 9fd7caf67913fbf35d834beaa8f9764d733a9236
> > > 
> > > 
> > > Steven Rostedt (Red Hat) (4):
> > >       ftrace/x86: Add dynamic allocated trampoline for ftrace_ops
> > >       ftrace/x86: Show trampoline call function in enabled_functions
> > >       ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines
> > >       ftrace: Add more information to ftrace_bug() output
> > 
> > FWIW I just did a very quick test, and kGraft seems to run flawlessly on 
> > top of this.
> > 
> 
> Does that mean I can add your Tested-by: for all these patches?

Yup, Tested-by: is fine by me, thanks.

I haven't yet read the code though, so anything other than this tag isn't 
applicable.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops
  2014-10-27 18:27 ` [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops Steven Rostedt
  2014-10-30 17:00   ` Steven Rostedt
@ 2014-10-31  5:19   ` Masami Hiramatsu
  2014-10-31 16:01     ` Steven Rostedt
  2014-11-05 10:28   ` Borislav Petkov
  2 siblings, 1 reply; 23+ messages in thread
From: Masami Hiramatsu @ 2014-10-31  5:19 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Jiri Kosina,
	Josh Poimboeuf, Vojtech Pavlik, Seth Jennings, Paul E. McKenney

Hi Steven,

(2014/10/28 3:27), Steven Rostedt wrote:
> +static unsigned long create_trampoline(struct ftrace_ops *ops)
> +{
> +	unsigned const char *jmp;
> +	unsigned long start_offset;
> +	unsigned long end_offset;
> +	unsigned long op_offset;
> +	unsigned long offset;
> +	unsigned long size;
> +	unsigned long ip;
> +	unsigned long *ptr;
> +	void *trampoline;
> +	/* 48 8b 15 <offset> is movq <offset>(%rip), %rdx */
> +	unsigned const char op_ref[] = { 0x48, 0x8b, 0x15 };
> +	union ftrace_op_code_union op_ptr;
> +	int ret;
> +
> +	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
> +		start_offset = (unsigned long)ftrace_regs_caller;
> +		end_offset = (unsigned long)ftrace_regs_caller_end;
> +		op_offset = (unsigned long)ftrace_regs_caller_op_ptr;
> +	} else {
> +		start_offset = (unsigned long)ftrace_caller;
> +		end_offset = (unsigned long)ftrace_caller_end;
> +		op_offset = (unsigned long)ftrace_caller_op_ptr;
> +	}
> +
> +	size = end_offset - start_offset;
> +
> +	/*
> +	 * Allocate enough size to store the ftrace_caller code,
> +	 * the jmp to ftrace_return, as well as the address of
> +	 * the ftrace_ops this trampoline is used for.
> +	 */
> +	trampoline = alloc_tramp(size + MCOUNT_INSN_SIZE + sizeof(void *));
> +	if (!trampoline)
> +		return 0;
> +
> +	/* Copy ftrace_caller onto the trampoline memory */
> +	ret = probe_kernel_read(trampoline, (void *)start_offset, size);
> +	if (WARN_ON(ret < 0)) {
> +		tramp_free(trampoline);
> +		return 0;
> +	}
> +
> +	ip = (unsigned long)trampoline + size;
> +
> +	/* The trampoline ends with a jmp to ftrace_return */
> +	jmp = ftrace_jmp_replace(ip, (unsigned long)ftrace_return);
> +	memcpy(trampoline + size, jmp, MCOUNT_INSN_SIZE);
> +
> +	/*
> +	 * The address of the ftrace_ops that is used for this trampoline
> +	 * is stored at the end of the trampoline. This will be used to
> +	 * load the third parameter for the callback. Basically, that
> +	 * location at the end of the trampoline takes the place of
> +	 * the global function_trace_op variable.
> +	 */
> +
> +	ptr = (unsigned long *)(trampoline + size + MCOUNT_INSN_SIZE);
> +	*ptr = (unsigned long)ops;
> +
> +	op_offset -= start_offset;
> +	memcpy(&op_ptr, trampoline + op_offset, OP_REF_SIZE);
> +
> +	/* Are we pointing to the reference? */
> +	if (WARN_ON(memcmp(op_ptr.op, op_ref, 3) != 0)) {
> +		tramp_free(trampoline);
> +		return 0;
> +	}
> +
> +	/* Load the contents of ptr into the callback parameter */
> +	offset = (unsigned long)ptr;
> +	offset -= (unsigned long)trampoline + op_offset + OP_REF_SIZE;
> +
> +	op_ptr.offset = offset;
> +
> +	/* put in the new offset to the ftrace_ops */
> +	memcpy(trampoline + op_offset, &op_ptr, OP_REF_SIZE);

Would it be better to call flush_icache_range() here?
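E.g. right after that memcpy (just a sketch; the exact range to flush is
my guess, covering the whole allocation):

	flush_icache_range((unsigned long)trampoline,
			   (unsigned long)trampoline + size +
			   MCOUNT_INSN_SIZE + sizeof(void *));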

Thank you,

> +
> +	/* ALLOC_TRAMP flags lets us know we created it */
> +	ops->flags |= FTRACE_OPS_FL_ALLOC_TRAMP;
> +
> +	return (unsigned long)trampoline;
> +}


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support
  2014-10-27 18:27 [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support Steven Rostedt
                   ` (4 preceding siblings ...)
  2014-10-29 16:27 ` [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support Jiri Kosina
@ 2014-10-31 10:04 ` Masami Hiramatsu
  5 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2014-10-31 10:04 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Jiri Kosina,
	Josh Poimboeuf, Vojtech Pavlik, Seth Jennings, Paul E. McKenney

(2014/10/28 3:27), Steven Rostedt wrote:
> This adds the allocation of dynamic trampolines. It still does not
> allow for dynamic ftrace_ops to use them on CONFIG_PREEMPT systems.
> That will come in 3.20, as I want to test out call_rcu_tasks() for
> a bit first on my own boxes.
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
> for-next
> 
> Head SHA1: 9fd7caf67913fbf35d834beaa8f9764d733a9236

Anyway, usually flush_icache_range() is not so critical. I've tested
this with ftracetest and the following 2 testcases :)

http://marc.info/?l=linux-kernel&m=141150834012382
http://marc.info/?l=linux-kernel&m=141353397907456

Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
for this series.

Thank you,


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops
  2014-10-31  5:19   ` Masami Hiramatsu
@ 2014-10-31 16:01     ` Steven Rostedt
  2014-11-03  7:50       ` Masami Hiramatsu
  0 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2014-10-31 16:01 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Jiri Kosina,
	Josh Poimboeuf, Vojtech Pavlik, Seth Jennings, Paul E. McKenney

On Fri, 31 Oct 2014 14:19:03 +0900
Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> wrote:


> > +	/* Load the contents of ptr into the callback parameter */
> > +	offset = (unsigned long)ptr;
> > +	offset -= (unsigned long)trampoline + op_offset + OP_REF_SIZE;
> > +
> > +	op_ptr.offset = offset;
> > +
> > +	/* put in the new offset to the ftrace_ops */
> > +	memcpy(trampoline + op_offset, &op_ptr, OP_REF_SIZE);
> 
> Would it be better to call flush_icache_range() here?

Do we need to? We just allocated memory to execute. It shouldn't be in
any CPU cache on the machine.

-- Steve


> 
> Thank you,
> 
> > +
> > +	/* ALLOC_TRAMP flags lets us know we created it */
> > +	ops->flags |= FTRACE_OPS_FL_ALLOC_TRAMP;
> > +
> > +	return (unsigned long)trampoline;
> > +}
> 
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Re: [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops
  2014-10-31 16:01     ` Steven Rostedt
@ 2014-11-03  7:50       ` Masami Hiramatsu
  0 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2014-11-03  7:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Jiri Kosina,
	Josh Poimboeuf, Vojtech Pavlik, Seth Jennings, Paul E. McKenney

(2014/11/01 1:01), Steven Rostedt wrote:
> On Fri, 31 Oct 2014 14:19:03 +0900
> Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> wrote:
> 
> 
>>> +	/* Load the contents of ptr into the callback parameter */
>>> +	offset = (unsigned long)ptr;
>>> +	offset -= (unsigned long)trampoline + op_offset + OP_REF_SIZE;
>>> +
>>> +	op_ptr.offset = offset;
>>> +
>>> +	/* put in the new offset to the ftrace_ops */
>>> +	memcpy(trampoline + op_offset, &op_ptr, OP_REF_SIZE);
>>
>> Would it be better to call flush_icache_range() here?
> 
> Do we need to? We just allocated memory to execute. It shouldn't be in
> any CPU cache on the machine.

Indeed, if the trampoline buffer is not re-used (and it seems it isn't),
we don't need to care about it.

OK, I've reviewed that :).

Thank you,

> 
> -- Steve
> 
> 
>>
>> Thank you,
>>
>>> +
>>> +	/* ALLOC_TRAMP flags lets us know we created it */
>>> +	ops->flags |= FTRACE_OPS_FL_ALLOC_TRAMP;
>>> +
>>> +	return (unsigned long)trampoline;
>>> +}
>>
>>
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops
  2014-10-27 18:27 ` [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops Steven Rostedt
  2014-10-30 17:00   ` Steven Rostedt
  2014-10-31  5:19   ` Masami Hiramatsu
@ 2014-11-05 10:28   ` Borislav Petkov
  2014-11-06 13:57     ` Steven Rostedt
  2 siblings, 1 reply; 23+ messages in thread
From: Borislav Petkov @ 2014-11-05 10:28 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Jiri Kosina, Josh Poimboeuf, Vojtech Pavlik, Seth Jennings,
	Paul E. McKenney

On Mon, Oct 27, 2014 at 02:27:03PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> The current method of handling multiple function callbacks is to register
> a list function callback that calls all the other callbacks based on
> their hash tables and compare it to the function that the callback was
> called on. But this is very inefficient.
> 
> For example, if you are tracing all functions in the kernel and then
> add a kprobe to a function such that the kprobe uses ftrace, the
> mcount trampoline will switch from calling the function trace callback
> to calling the list callback that will iterate over all registered
> ftrace_ops (in this case, the function tracer and the kprobes callback).
> That means for every function being traced it checks the hash of the
> ftrace_ops for function tracing and kprobes, even though the kprobes
> is only set at a single function. The kprobes ftrace_ops is checked
> for every function being traced!
> 
> Instead of calling the list function for functions that are only being
> traced by a single callback, we can call a dynamically allocated
> trampoline that calls the callback directly. The function graph tracer
> already uses a direct call trampoline when it is being traced by itself
> but it is not dynamically allocated. Its trampoline is static in the
> kernel core. The infrastructure that called the function graph trampoline
> can also be used to call a dynamically allocated one.
> 
> For now, only ftrace_ops that are not dynamically allocated can have
> a trampoline. That is, users such as function tracer or stack tracer.
> kprobes and perf allocate their ftrace_ops, and until there's a safe
> way to free the trampoline, it cannot be used. The dynamically allocated
> ftrace_ops may, however, use the trampoline if the kernel is not
> compiled with CONFIG_PREEMPT. But that will come later.
> 
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  arch/x86/kernel/ftrace.c    | 195 ++++++++++++++++++++++++++++++++++++++++++--
>  arch/x86/kernel/mcount_64.S |  25 +++++-
>  include/linux/ftrace.h      |   8 ++
>  kernel/trace/ftrace.c       |  40 ++++++++-
>  4 files changed, 254 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index 3386dc9aa333..e4d48f6cad86 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -17,6 +17,7 @@
>  #include <linux/ftrace.h>
>  #include <linux/percpu.h>
>  #include <linux/sched.h>
> +#include <linux/slab.h>
>  #include <linux/init.h>
>  #include <linux/list.h>
>  #include <linux/module.h>
> @@ -644,13 +645,8 @@ int __init ftrace_dyn_arch_init(void)
>  {
>  	return 0;
>  }
> -#endif
> -
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -
> -#ifdef CONFIG_DYNAMIC_FTRACE
> -extern void ftrace_graph_call(void);
>  
> +#if defined(CONFIG_X86_64) || defined(CONFIG_FUNCTION_GRAPH_TRACER)
>  static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
>  {
>  	static union ftrace_code_union calc;
> @@ -664,6 +660,193 @@ static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
>  	 */
>  	return calc.code;
>  }
> +#endif
> +
> +/* Currently only x86_64 supports dynamic trampolines */
> +#ifdef CONFIG_X86_64
> +
> +#ifdef CONFIG_MODULES
> +#include <linux/moduleloader.h>
> +/* Module allocation simplifies allocating memory for code */

... because it makes it EXEC too?

> +static inline void *alloc_tramp(unsigned long size)

Yeah, naming is, hmm, rated R. But we talked about it :-)

> +{
> +	return module_alloc(size);
> +}
> +static inline void tramp_free(void *tramp)
> +{
> +	module_free(NULL, tramp);
> +}
> +#else
> +/* Trampolines can only be created if modules are supported */

Just for my own understanding: why? You're not loading additional .ko's
right?

> +static inline void *alloc_tramp(unsigned long size)
> +{
> +	return NULL;
> +}
> +static inline void tramp_free(void *tramp) { }
> +#endif
> +
> +/* Defined as markers to the end of the ftrace default trampolines */
> +extern void ftrace_caller_end(void);
> +extern void ftrace_regs_caller_end(void);
> +extern void ftrace_return(void);
> +extern void ftrace_caller_op_ptr(void);
> +extern void ftrace_regs_caller_op_ptr(void);
> +
> +/* movq function_trace_op(%rip), %rdx */
> +/* 0x48 0x8b 0x15 <offset-to-ftrace_trace_op (4 bytes)> */
> +#define OP_REF_SIZE	7
> +
> +/*
> + * The ftrace_ops is passed to the function callback. Since the
> + * trampoline only services a single ftrace_ops, we can pass in
> + * that ops directly.
> + *
> + * The ftrace_op_code_union is used to create a pointer to the
> + * ftrace_ops that will be passed to the callback function.
> + */
> +union ftrace_op_code_union {
> +	char code[OP_REF_SIZE];
> +	struct {
> +		char op[3];
> +		int offset;
> +	} __attribute__((packed));
> +};
> +
> +static unsigned long create_trampoline(struct ftrace_ops *ops)
> +{
> +	unsigned const char *jmp;
> +	unsigned long start_offset;
> +	unsigned long end_offset;
> +	unsigned long op_offset;
> +	unsigned long offset;
> +	unsigned long size;
> +	unsigned long ip;
> +	unsigned long *ptr;
> +	void *trampoline;
> +	/* 48 8b 15 <offset> is movq <offset>(%rip), %rdx */
> +	unsigned const char op_ref[] = { 0x48, 0x8b, 0x15 };

Ok, this 0x15 is the ModRM which says that the destination register is
hammered to be %rdx. I guess this has something to do with the magic
third parameter thing where ftrace_ops go into...

> +	union ftrace_op_code_union op_ptr;
> +	int ret;
> +
> +	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
> +		start_offset = (unsigned long)ftrace_regs_caller;
> +		end_offset = (unsigned long)ftrace_regs_caller_end;
> +		op_offset = (unsigned long)ftrace_regs_caller_op_ptr;
> +	} else {
> +		start_offset = (unsigned long)ftrace_caller;
> +		end_offset = (unsigned long)ftrace_caller_end;
> +		op_offset = (unsigned long)ftrace_caller_op_ptr;
> +	}
> +
> +	size = end_offset - start_offset;
> +
> +	/*
> +	 * Allocate enough size to store the ftrace_caller code,
> +	 * the jmp to ftrace_return, as well as the address of
> +	 * the ftrace_ops this trampoline is used for.
> +	 */
> +	trampoline = alloc_tramp(size + MCOUNT_INSN_SIZE + sizeof(void *));
> +	if (!trampoline)
> +		return 0;
> +
> +	/* Copy ftrace_caller onto the trampoline memory */
> +	ret = probe_kernel_read(trampoline, (void *)start_offset, size);
> +	if (WARN_ON(ret < 0)) {
> +		tramp_free(trampoline);

A tramp-free trampoline? :-P

Btw, "trmp" could be a good, short and politically correct prefix to
this stuff:

trmp_alloc()
trmp_free()
...

:-)

> +		return 0;
> +	}
> +
> +	ip = (unsigned long)trampoline + size;
> +
> +	/* The trampoline ends with a jmp to ftrace_return */
> +	jmp = ftrace_jmp_replace(ip, (unsigned long)ftrace_return);
> +	memcpy(trampoline + size, jmp, MCOUNT_INSN_SIZE);
> +
> +	/*
> +	 * The address of the ftrace_ops that is used for this trampoline
> +	 * is stored at the end of the trampoline. This will be used to
> +	 * load the third parameter for the callback. Basically, that
> +	 * location at the end of the trampoline takes the place of
> +	 * the global function_trace_op variable.
> +	 */
> +

Spurious newline.

> +	ptr = (unsigned long *)(trampoline + size + MCOUNT_INSN_SIZE);

Maybe we should call this variable ftrace_ops_ptr to be more clear?

> +	*ptr = (unsigned long)ops;
> +
> +	op_offset -= start_offset;
> +	memcpy(&op_ptr, trampoline + op_offset, OP_REF_SIZE);

I guess we can copy only 3 bytes here and not OP_REF_SIZE as we
overwrite op_ptr.offset later anyway.

> +
> +	/* Are we pointing to the reference? */

"Are we pointing to the movq" is what we ask here, no?

> +	if (WARN_ON(memcmp(op_ptr.op, op_ref, 3) != 0)) {
> +		tramp_free(trampoline);
> +		return 0;
> +	}
> +
> +	/* Load the contents of ptr into the callback parameter */
> +	offset = (unsigned long)ptr;
> +	offset -= (unsigned long)trampoline + op_offset + OP_REF_SIZE;
> +
> +	op_ptr.offset = offset;
> +
> +	/* put in the new offset to the ftrace_ops */
> +	memcpy(trampoline + op_offset, &op_ptr, OP_REF_SIZE);
> +
> +	/* ALLOC_TRAMP flags lets us know we created it */
> +	ops->flags |= FTRACE_OPS_FL_ALLOC_TRAMP;
> +
> +	return (unsigned long)trampoline;
> +}
> +

Yeah, it looks ok as far as I can tell by looking around and trying to
piece it together from staring at ftrace-design.txt and the rest of the
fun.

But don't take my word for it :-)

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 2/4] ftrace/x86: Show trampoline call function in enabled_functions
  2014-10-27 18:27 ` [for-next][PATCH 2/4] ftrace/x86: Show trampoline call function in enabled_functions Steven Rostedt
  2014-10-30 17:00   ` Steven Rostedt
@ 2014-11-05 10:42   ` Borislav Petkov
  2014-11-06 14:01     ` Steven Rostedt
  1 sibling, 1 reply; 23+ messages in thread
From: Borislav Petkov @ 2014-11-05 10:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Jiri Kosina, Josh Poimboeuf, Vojtech Pavlik, Seth Jennings,
	Paul E. McKenney, H. Peter Anvin

On Mon, Oct 27, 2014 at 02:27:04PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> The file /sys/kernel/debug/tracing/enabled_functions is used to debug
> ftrace function hooks. Add to the output what function is being called
> by the trampoline if the arch supports it.
> 
> Add support for this feature in x86_64.
> 
> Cc: H. Peter Anvin <hpa@linux.intel.com>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  arch/x86/kernel/ftrace.c | 98 ++++++++++++++++++++++++++++++++++++++++++------
>  kernel/trace/ftrace.c    | 22 ++++++++++-
>  2 files changed, 106 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index e4d48f6cad86..ca17c20a1010 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -48,7 +48,7 @@ int ftrace_arch_code_modify_post_process(void)
>  union ftrace_code_union {
>  	char code[MCOUNT_INSN_SIZE];
>  	struct {
> -		char e8;
> +		unsigned char e8;

u8?

"u8 e8;" looks supa dupa though :-)

>  		int offset;
>  	} __attribute__((packed));
>  };
> @@ -797,12 +797,26 @@ static unsigned long create_trampoline(struct ftrace_ops *ops)
>  	return (unsigned long)trampoline;
>  }
>  
> +static unsigned long calc_trampoline_call_offset(bool save_regs)
> +{
> +	unsigned long start_offset;
> +	unsigned long call_offset;
> +
> +	if (save_regs) {
> +		start_offset = (unsigned long)ftrace_regs_caller;
> +		call_offset = (unsigned long)ftrace_regs_call;
> +	} else {
> +		start_offset = (unsigned long)ftrace_caller;
> +		call_offset = (unsigned long)ftrace_call;
> +	}
> +
> +	return call_offset - start_offset;
> +}
> +
>  void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
>  {
>  	ftrace_func_t func;
>  	unsigned char *new;
> -	unsigned long start_offset;
> -	unsigned long call_offset;
>  	unsigned long offset;
>  	unsigned long ip;
>  	int ret;
> @@ -820,15 +834,7 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
>  			return;
>  	}
>  
> -	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
> -		start_offset = (unsigned long)ftrace_regs_caller;
> -		call_offset = (unsigned long)ftrace_regs_call;
> -	} else {
> -		start_offset = (unsigned long)ftrace_caller;
> -		call_offset = (unsigned long)ftrace_call;
> -	}
> -
> -	offset = call_offset - start_offset;
> +	offset = calc_trampoline_call_offset(ops->flags & FTRACE_OPS_FL_SAVE_REGS);
>  	ip = ops->trampoline + offset;
>  
>  	func = ftrace_ops_get_func(ops);
> @@ -840,6 +846,74 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
>  	/* The update should never fail */
>  	WARN_ON(ret);
>  }
> +
> +/* Return the address of the function the trampoline calls */
> +static void *addr_from_call(void *ptr)
> +{
> +	union ftrace_code_union calc;
> +	int ret;
> +
> +	ret = probe_kernel_read(&calc, ptr, MCOUNT_INSN_SIZE);
> +	if (WARN_ON_ONCE(ret < 0))
> +		return NULL;
> +
> +	/* Make sure this is a call */
> +	if (WARN_ON_ONCE(calc.e8 != 0xe8)) {
> +		pr_warn("Expected e8, got %x\n", calc.e8);

Simply WARN_ONCE gives you both.
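I.e. something along these lines (untested):

	if (WARN_ONCE(calc.e8 != 0xe8, "Expected e8, got %x\n", calc.e8))
		return NULL;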

> +		return NULL;
> +	}
> +
> +	return ptr + MCOUNT_INSN_SIZE + calc.offset;
> +}
> +
> +void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
> +			   unsigned long frame_pointer);
> +
> +/*
> + * If the ops->trampoline was not allocated, then it probably
> + * has a static trampoline func, or is the ftrace caller itself.
> + */
> +static void *static_tramp_func(struct ftrace_ops *ops, struct dyn_ftrace *rec)

I guess you can merge this one into its only caller below - it is not
that huge... yet! :-)

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 3/4] ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines
  2014-10-27 18:27 ` [for-next][PATCH 3/4] ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines Steven Rostedt
  2014-10-30 17:01   ` Steven Rostedt
@ 2014-11-05 10:46   ` Borislav Petkov
  1 sibling, 0 replies; 23+ messages in thread
From: Borislav Petkov @ 2014-11-05 10:46 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Jiri Kosina, Josh Poimboeuf, Vojtech Pavlik, Seth Jennings,
	Paul E. McKenney, H. Peter Anvin

On Mon, Oct 27, 2014 at 02:27:05PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> When the static ftrace_ops (like function tracer) enables tracing, and it
> is the only callback that is referencing a function, a trampoline is
> dynamically allocated to the function that calls the callback directly
> instead of calling a loop function that iterates over all the registered
> ftrace ops (if more than one ops is registered).
> 
> But when it comes to dynamically allocated ftrace_ops, where they may be
> freed, on a CONFIG_PREEMPT kernel there's no way to know when it is safe
> to free the trampoline. If a task was preempted while executing on the
> trampoline, there's currently no way to know when it will be off that
> trampoline.
> 
> But this is not true when it comes to !CONFIG_PREEMPT. The current method
> of calling schedule_on_each_cpu() will force tasks off the trampoline,
> because they cannot schedule while on it (kernel preemption is not
> configured). That means it is safe to free a dynamically allocated
> ftrace ops trampoline when CONFIG_PREEMPT is not configured.
> 
> Cc: H. Peter Anvin <hpa@linux.intel.com>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Acked-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 4/4] ftrace: Add more information to ftrace_bug() output
  2014-10-27 18:27 ` [for-next][PATCH 4/4] ftrace: Add more information to ftrace_bug() output Steven Rostedt
  2014-10-30 17:02   ` Steven Rostedt
@ 2014-11-05 10:49   ` Borislav Petkov
  1 sibling, 0 replies; 23+ messages in thread
From: Borislav Petkov @ 2014-11-05 10:49 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Jiri Kosina, Josh Poimboeuf, Vojtech Pavlik, Seth Jennings,
	Paul E. McKenney, Benjamin Herrenschmidt

On Mon, Oct 27, 2014 at 02:27:06PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
> 
> With the introduction of the dynamic trampolines, it is useful for
> ftrace_bug() to produce more information about the current state when
> things go wrong. This can help debug issues that may arise.
> 
> Ftrace has lots of checks to make sure that the state of the system it
> touches is exactly what it expects it to be. When it detects an abnormality
> it calls ftrace_bug() and disables itself to prevent any further damage.
> It is crucial that ftrace_bug() produces sufficient information that
> can be used to debug the situation.
> 
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

Acked-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops
  2014-11-05 10:28   ` Borislav Petkov
@ 2014-11-06 13:57     ` Steven Rostedt
  2014-11-06 15:01       ` Borislav Petkov
  0 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2014-11-06 13:57 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Jiri Kosina, Josh Poimboeuf, Vojtech Pavlik, Seth Jennings,
	Paul E. McKenney

On Wed, 5 Nov 2014 11:28:01 +0100
Borislav Petkov <bp@alien8.de> wrote:

> > --- a/arch/x86/kernel/ftrace.c
> > +++ b/arch/x86/kernel/ftrace.c
> > @@ -17,6 +17,7 @@
> >  #include <linux/ftrace.h>
> >  #include <linux/percpu.h>
> >  #include <linux/sched.h>
> > +#include <linux/slab.h>
> >  #include <linux/init.h>
> >  #include <linux/list.h>
> >  #include <linux/module.h>
> > @@ -644,13 +645,8 @@ int __init ftrace_dyn_arch_init(void)
> >  {
> >  	return 0;
> >  }
> > -#endif
> > -
> > -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > -
> > -#ifdef CONFIG_DYNAMIC_FTRACE
> > -extern void ftrace_graph_call(void);
> >  
> > +#if defined(CONFIG_X86_64) || defined(CONFIG_FUNCTION_GRAPH_TRACER)
> >  static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
> >  {
> >  	static union ftrace_code_union calc;
> > @@ -664,6 +660,193 @@ static unsigned char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
> >  	 */
> >  	return calc.code;
> >  }
> > +#endif
> > +
> > +/* Currently only x86_64 supports dynamic trampolines */
> > +#ifdef CONFIG_X86_64
> > +
> > +#ifdef CONFIG_MODULES
> > +#include <linux/moduleloader.h>
> > +/* Module allocation simplifies allocating memory for code */
> 
> ... because it makes it EXEC too?

Yeah, I should have specified that.

> 
> > +static inline void *alloc_tramp(unsigned long size)
> 
> Yeah, naming is, hmm, rated R. But we talked about it :-)

It's used in Disney movies, I doubt it's rated R.

> 
> > +{
> > +	return module_alloc(size);
> > +}
> > +static inline void tramp_free(void *tramp)
> > +{
> > +	module_free(NULL, tramp);
> > +}
> > +#else
> > +/* Trampolines can only be created if modules are supported */
> 
> Just for my own understanding: why? You're not loading additional .ko's
> right?

It goes back to the EXEC issue. The module allocation code goes to
great lengths to get memory allocated for EXEC correct. I wasn't
about to duplicate that.

Ideally, there should be a generic way to allocate memory that will be
used for executing code, but right now we depend on module code.

> 
> > +static inline void *alloc_tramp(unsigned long size)
> > +{
> > +	return NULL;
> > +}
> > +static inline void tramp_free(void *tramp) { }
> > +#endif
> > +
> > +/* Defined as markers to the end of the ftrace default trampolines */
> > +extern void ftrace_caller_end(void);
> > +extern void ftrace_regs_caller_end(void);
> > +extern void ftrace_return(void);
> > +extern void ftrace_caller_op_ptr(void);
> > +extern void ftrace_regs_caller_op_ptr(void);
> > +
> > +/* movq function_trace_op(%rip), %rdx */
> > +/* 0x48 0x8b 0x15 <offset-to-ftrace_trace_op (4 bytes)> */
> > +#define OP_REF_SIZE	7
> > +
> > +/*
> > + * The ftrace_ops is passed to the function callback. Since the
> > + * trampoline only services a single ftrace_ops, we can pass in
> > + * that ops directly.
> > + *
> > + * The ftrace_op_code_union is used to create a pointer to the
> > + * ftrace_ops that will be passed to the callback function.
> > + */
> > +union ftrace_op_code_union {
> > +	char code[OP_REF_SIZE];
> > +	struct {
> > +		char op[3];
> > +		int offset;
> > +	} __attribute__((packed));
> > +};
> > +
> > +static unsigned long create_trampoline(struct ftrace_ops *ops)
> > +{
> > +	unsigned const char *jmp;
> > +	unsigned long start_offset;
> > +	unsigned long end_offset;
> > +	unsigned long op_offset;
> > +	unsigned long offset;
> > +	unsigned long size;
> > +	unsigned long ip;
> > +	unsigned long *ptr;
> > +	void *trampoline;
> > +	/* 48 8b 15 <offset> is movq <offset>(%rip), %rdx */
> > +	unsigned const char op_ref[] = { 0x48, 0x8b, 0x15 };
> 
> Ok, this 0x15 is the ModRM which says that the destination register is
> hammered to be %rdx. I guess this has something to do with the magic
> third parameter thing where ftrace_ops go into...

Yes, the ftrace_ops is passed in as the third parameter.
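That is, the callbacks are of the form

	void callback(unsigned long ip, unsigned long parent_ip,
		      struct ftrace_ops *op, struct pt_regs *regs);

and the movq fills %rdx (the third argument register) with the ops
pointer that is stored at the end of the trampoline.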

> 
> > +	union ftrace_op_code_union op_ptr;
> > +	int ret;
> > +
> > +	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
> > +		start_offset = (unsigned long)ftrace_regs_caller;
> > +		end_offset = (unsigned long)ftrace_regs_caller_end;
> > +		op_offset = (unsigned long)ftrace_regs_caller_op_ptr;
> > +	} else {
> > +		start_offset = (unsigned long)ftrace_caller;
> > +		end_offset = (unsigned long)ftrace_caller_end;
> > +		op_offset = (unsigned long)ftrace_caller_op_ptr;
> > +	}
> > +
> > +	size = end_offset - start_offset;
> > +
> > +	/*
> > +	 * Allocate enough size to store the ftrace_caller code,
> > +	 * the jmp to ftrace_return, as well as the address of
> > +	 * the ftrace_ops this trampoline is used for.
> > +	 */
> > +	trampoline = alloc_tramp(size + MCOUNT_INSN_SIZE + sizeof(void *));
> > +	if (!trampoline)
> > +		return 0;
> > +
> > +	/* Copy ftrace_caller onto the trampoline memory */
> > +	ret = probe_kernel_read(trampoline, (void *)start_offset, size);
> > +	if (WARN_ON(ret < 0)) {
> > +		tramp_free(trampoline);
> 
> A tramp-free trampoline? :-P
> 
> Btw, "trmp" could be a good, short and politically correct prefix to
> this stuff:
> 
> trmp_alloc()
> trmp_free()

I just see "Trump" which to me is less politically correct than
Tramp ;-)

> ...
> 
> :-)
> 
> > +		return 0;
> > +	}
> > +
> > +	ip = (unsigned long)trampoline + size;
> > +
> > +	/* The trampoline ends with a jmp to ftrace_return */
> > +	jmp = ftrace_jmp_replace(ip, (unsigned long)ftrace_return);
> > +	memcpy(trampoline + size, jmp, MCOUNT_INSN_SIZE);
> > +
> > +	/*
> > +	 * The address of the ftrace_ops that is used for this trampoline
> > +	 * is stored at the end of the trampoline. This will be used to
> > +	 * load the third parameter for the callback. Basically, that
> > +	 * location at the end of the trampoline takes the place of
> > +	 * the global function_trace_op variable.
> > +	 */
> > +
> 
> Spurious newline.
> 
> > +	ptr = (unsigned long *)(trampoline + size + MCOUNT_INSN_SIZE);
> 
> Maybe we should call this variable ftrace_ops_ptr to be more clear?
> 
> > +	*ptr = (unsigned long)ops;
> > +
> > +	op_offset -= start_offset;
> > +	memcpy(&op_ptr, trampoline + op_offset, OP_REF_SIZE);
> 
> I guess we can copy only 3 bytes here and not OP_REF_SIZE as we
> overwrite op_ptr.offset later anyway.

Yeah I guess we could.
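Something like (untested):

	memcpy(&op_ptr, trampoline + op_offset, sizeof(op_ptr.op));

since only the three opcode bytes get compared before op_ptr.offset is
rewritten anyway.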

> 
> > +
> > +	/* Are we pointing to the reference? */
> 
> "Are we pointing to the movq" is what we ask here, no?

Yep. I guess I should have said, "Are we pointing to the correct
label?".

> 
> > +	if (WARN_ON(memcmp(op_ptr.op, op_ref, 3) != 0)) {
> > +		tramp_free(trampoline);
> > +		return 0;
> > +	}
> > +
> > +	/* Load the contents of ptr into the callback parameter */
> > +	offset = (unsigned long)ptr;
> > +	offset -= (unsigned long)trampoline + op_offset + OP_REF_SIZE;
> > +
> > +	op_ptr.offset = offset;
> > +
> > +	/* put in the new offset to the ftrace_ops */
> > +	memcpy(trampoline + op_offset, &op_ptr, OP_REF_SIZE);
> > +
> > +	/* ALLOC_TRAMP flags lets us know we created it */
> > +	ops->flags |= FTRACE_OPS_FL_ALLOC_TRAMP;
> > +
> > +	return (unsigned long)trampoline;
> > +}
> > +
> 
> Yeah, it looks ok as far as I can tell by looking around and trying to
> piece it together from staring at ftrace-design.txt and the rest of the
> fun.
> 
> But don't take my word for it :-)
> 

Thanks,

I may update some of the comments and such. It will have to be a
separate patch. I already tested the hell out of these and have a bunch
of commits on top that have also been fully tested. Any rebase would
throw all that away.

-- Steve

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 2/4] ftrace/x86: Show trampoline call function in enabled_functions
  2014-11-05 10:42   ` Borislav Petkov
@ 2014-11-06 14:01     ` Steven Rostedt
  0 siblings, 0 replies; 23+ messages in thread
From: Steven Rostedt @ 2014-11-06 14:01 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Jiri Kosina, Josh Poimboeuf, Vojtech Pavlik, Seth Jennings,
	Paul E. McKenney, H. Peter Anvin

On Wed, 5 Nov 2014 11:42:07 +0100
Borislav Petkov <bp@alien8.de> wrote:


> > diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> > index e4d48f6cad86..ca17c20a1010 100644
> > --- a/arch/x86/kernel/ftrace.c
> > +++ b/arch/x86/kernel/ftrace.c
> > @@ -48,7 +48,7 @@ int ftrace_arch_code_modify_post_process(void)
> >  union ftrace_code_union {
> >  	char code[MCOUNT_INSN_SIZE];
> >  	struct {
> > -		char e8;
> > +		unsigned char e8;
> 
> u8?
> 
> "u8 e8;" looks supa dupa though :-)

And compete with PowerPC's eieio command.

> 
> >  		int offset;
> >  	} __attribute__((packed));
> >  };
> > @@ -797,12 +797,26 @@ static unsigned long create_trampoline(struct ftrace_ops *ops)
> >  	return (unsigned long)trampoline;
> >  }
> >  
> > +static unsigned long calc_trampoline_call_offset(bool save_regs)
> > +{
> > +	unsigned long start_offset;
> > +	unsigned long call_offset;
> > +
> > +	if (save_regs) {
> > +		start_offset = (unsigned long)ftrace_regs_caller;
> > +		call_offset = (unsigned long)ftrace_regs_call;
> > +	} else {
> > +		start_offset = (unsigned long)ftrace_caller;
> > +		call_offset = (unsigned long)ftrace_call;
> > +	}
> > +
> > +	return call_offset - start_offset;
> > +}
> > +
> >  void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
> >  {
> >  	ftrace_func_t func;
> >  	unsigned char *new;
> > -	unsigned long start_offset;
> > -	unsigned long call_offset;
> >  	unsigned long offset;
> >  	unsigned long ip;
> >  	int ret;
> > @@ -820,15 +834,7 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
> >  			return;
> >  	}
> >  
> > -	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
> > -		start_offset = (unsigned long)ftrace_regs_caller;
> > -		call_offset = (unsigned long)ftrace_regs_call;
> > -	} else {
> > -		start_offset = (unsigned long)ftrace_caller;
> > -		call_offset = (unsigned long)ftrace_call;
> > -	}
> > -
> > -	offset = call_offset - start_offset;
> > +	offset = calc_trampoline_call_offset(ops->flags & FTRACE_OPS_FL_SAVE_REGS);
> >  	ip = ops->trampoline + offset;
> >  
> >  	func = ftrace_ops_get_func(ops);
> > @@ -840,6 +846,74 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
> >  	/* The update should never fail */
> >  	WARN_ON(ret);
> >  }
> > +
> > +/* Return the address of the function the trampoline calls */
> > +static void *addr_from_call(void *ptr)
> > +{
> > +	union ftrace_code_union calc;
> > +	int ret;
> > +
> > +	ret = probe_kernel_read(&calc, ptr, MCOUNT_INSN_SIZE);
> > +	if (WARN_ON_ONCE(ret < 0))
> > +		return NULL;
> > +
> > +	/* Make sure this is a call */
> > +	if (WARN_ON_ONCE(calc.e8 != 0xe8)) {
> > +		pr_warn("Expected e8, got %x\n", calc.e8);
> 
> Simply WARN_ONCE gives you both.

Ah, could do that.

> 
> > +		return NULL;
> > +	}
> > +
> > +	return ptr + MCOUNT_INSN_SIZE + calc.offset;
> > +}
> > +
> > +void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
> > +			   unsigned long frame_pointer);
> > +
> > +/*
> > + * If the ops->trampoline was not allocated, then it probably
> > + * has a static trampoline func, or is the ftrace caller itself.
> > + */
> > +static void *static_tramp_func(struct ftrace_ops *ops, struct dyn_ftrace *rec)
> 
> I guess you can merge this one into its only caller below - it is not
> that huge... yet! :-)
> 

Hmm, it's big enough not to be in that function as an if statement that
will return anyway.

gcc will just inline it anyway. I think it's better kept separate for
readability.

-- Steve


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops
  2014-11-06 13:57     ` Steven Rostedt
@ 2014-11-06 15:01       ` Borislav Petkov
  0 siblings, 0 replies; 23+ messages in thread
From: Borislav Petkov @ 2014-11-06 15:01 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Masami Hiramatsu,
	Jiri Kosina, Josh Poimboeuf, Vojtech Pavlik, Seth Jennings,
	Paul E. McKenney

On Thu, Nov 06, 2014 at 08:57:15AM -0500, Steven Rostedt wrote:
> > > +/* Trampolines can only be created if modules are supported */
> > 
> > Just for my own understanding: why? You're not loading additional .ko's
> > right?
> 
> It goes back to the EXEC issue. The module allocation goes through
> leaps and bounds to get memory allocated for EXEC correct. I wasn't
> about to duplicate that.
> 
> Ideally, there should be a generic way to allocate memory that will be
> used for executing code, but right now we depend on module code.

Yah, that could be a nice addition (in a separate patch, on top, as you
say) to explain the dependency on CONFIG_MODULES, which is not trivially
apparent.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2014-11-06 15:01 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-27 18:27 [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support Steven Rostedt
2014-10-27 18:27 ` [for-next][PATCH 1/4] ftrace/x86: Add dynamic allocated trampoline for ftrace_ops Steven Rostedt
2014-10-30 17:00   ` Steven Rostedt
2014-10-31  5:19   ` Masami Hiramatsu
2014-10-31 16:01     ` Steven Rostedt
2014-11-03  7:50       ` Masami Hiramatsu
2014-11-05 10:28   ` Borislav Petkov
2014-11-06 13:57     ` Steven Rostedt
2014-11-06 15:01       ` Borislav Petkov
2014-10-27 18:27 ` [for-next][PATCH 2/4] ftrace/x86: Show trampoline call function in enabled_functions Steven Rostedt
2014-10-30 17:00   ` Steven Rostedt
2014-11-05 10:42   ` Borislav Petkov
2014-11-06 14:01     ` Steven Rostedt
2014-10-27 18:27 ` [for-next][PATCH 3/4] ftrace/x86: Allow !CONFIG_PREEMPT dynamic ops to use allocated trampolines Steven Rostedt
2014-10-30 17:01   ` Steven Rostedt
2014-11-05 10:46   ` Borislav Petkov
2014-10-27 18:27 ` [for-next][PATCH 4/4] ftrace: Add more information to ftrace_bug() output Steven Rostedt
2014-10-30 17:02   ` Steven Rostedt
2014-11-05 10:49   ` Borislav Petkov
2014-10-29 16:27 ` [for-next][PATCH 0/4] ftrace: Add dynamic trampoline support Jiri Kosina
2014-10-30 16:56   ` Steven Rostedt
2014-10-30 17:08     ` Jiri Kosina
2014-10-31 10:04 ` Masami Hiramatsu
