linux-kernel.vger.kernel.org archive mirror
* [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code
@ 2014-11-25  0:42 Steven Rostedt
  2014-11-25  0:42 ` [RFC][PATCH 1/7] ftrace/x86: Have static tracing also use ftrace_caller_setup Steven Rostedt
                   ` (7 more replies)
  0 siblings, 8 replies; 12+ messages in thread
From: Steven Rostedt @ 2014-11-25  0:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu


Linus,

As you did not like the added macro that just inserted the frame setup
into the ftrace trampolines, I did a bit of cleanup to hopefully make
the mcount code more suitable to your taste. I implemented several of
the recommendations you made.

1) You noticed that the static tracer hard-coded the saving of the rip.
   I had that function also use ftrace_caller_setup. You mentioned
   the function graph tracer too, but that one does things differently
   and needs to stay hard-coded.

2) I moved the MCOUNT_SAVE_FRAME out of the header file.

3) I renamed MCOUNT_SAVE_FRAME to save_mcount_regs, which is a more
   meaningful name for what the macro does. The word "frame" gets confused
   with a function frame, and what is saved is the mcount ABI registers,
   not a full frame.

4) When saving the RIP into the RIP pt_regs position of the stack, I
   had save_mcount_regs use %rdi, as that register is what holds the
   first parameter for the function hooks that the ftrace_caller and
   ftrace_regs_caller will call. This allows me to remove the copying
   of that register in ftrace_caller_setup.

5) Instead of having callers of save_mcount_regs say what they added to the
   pt_regs structure (which is silly, because they don't even add
   the correct data), just have the callers pass in how much stack
   they used before calling it. This still allows save_mcount_regs
   to find RIP.

6) Instead of all callers of save_mcount_regs using a hard-coded offset
   to reach the data on the stack above what save_mcount_regs used,
   create a macro MCOUNT_REG_SIZE that keeps all users in sync.

7) Move the saving of the frame pointers into save_mcount_regs, and remove
   the extra macro that you hated. It still does something different
   depending on whether frame pointers are compiled in, and whether mcount
   or fentry is used. But I tested it (basic tests) on all variations, and
   it works.

I haven't run this through my main test suite, which takes 8 to 12 hours.
But I did more than compile-test these patches. If my tests uncover
something, they will change.

As this code is based on other code I have for 3.19, I pushed up my rfc
branch in case you want to look at it directly. Also at the end of this
email I have the full diff of these patches in one patch.

This patch series still expects my original patch that added the create_frame
and restore_frame macros to go in. That's a "quick fix" that should work
well for stable. I think these changes are a bit too much for a late rc
or stable. But if you prefer this to go in now, I can rearrange things
to do it.

Let me know if these changes make mcount.S give you fewer heebie-jeebies.

-- Steve

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
rfc/mcount-update

Head SHA1: f9c9200f43396b13d2c77286412cb276ac0ba0a6


Steven Rostedt (Red Hat) (7):
      ftrace/x86: Have static tracing also use ftrace_caller_setup
      ftrace/x86: Move MCOUNT_SAVE_FRAME out of header file
      ftrace/x86: Rename MCOUNT_SAVE_FRAME and add more detailed comments
      ftrace/x86: Have save_mcount_regs store RIP in %rdi for first parameter
      ftrace/x86: Simplify save_mcount_regs on getting RIP
      ftrace/x86: Add macro MCOUNT_REG_SIZE for amount of stack used to save mcount regs
      ftrace/x86: Have save_mcount_regs macro also save stack frames if needed

----
 arch/x86/include/asm/ftrace.h |  33 -------
 arch/x86/kernel/mcount_64.S   | 209 ++++++++++++++++++++++++++----------------
 2 files changed, 130 insertions(+), 112 deletions(-)
----

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index e1f7fecaa7d6..f45acad3c4b6 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -1,39 +1,6 @@
 #ifndef _ASM_X86_FTRACE_H
 #define _ASM_X86_FTRACE_H
 
-#ifdef __ASSEMBLY__
-
-	/* skip is set if the stack was already partially adjusted */
-	.macro MCOUNT_SAVE_FRAME skip=0
-	 /*
-	  * We add enough stack to save all regs.
-	  */
-	subq $(SS+8-\skip), %rsp
-	movq %rax, RAX(%rsp)
-	movq %rcx, RCX(%rsp)
-	movq %rdx, RDX(%rsp)
-	movq %rsi, RSI(%rsp)
-	movq %rdi, RDI(%rsp)
-	movq %r8, R8(%rsp)
-	movq %r9, R9(%rsp)
-	 /* Move RIP to its proper location */
-	movq SS+8(%rsp), %rdx
-	movq %rdx, RIP(%rsp)
-	.endm
-
-	.macro MCOUNT_RESTORE_FRAME skip=0
-	movq R9(%rsp), %r9
-	movq R8(%rsp), %r8
-	movq RDI(%rsp), %rdi
-	movq RSI(%rsp), %rsi
-	movq RDX(%rsp), %rdx
-	movq RCX(%rsp), %rcx
-	movq RAX(%rsp), %rax
-	addq $(SS+8-\skip), %rsp
-	.endm
-
-#endif
-
 #ifdef CONFIG_FUNCTION_TRACER
 #ifdef CC_USING_FENTRY
 # define MCOUNT_ADDR		((long)(__fentry__))
diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index 6dc134b8dc70..003b22df1d87 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -21,78 +21,144 @@
 # define function_hook	mcount
 #endif
 
-#ifdef CONFIG_DYNAMIC_FTRACE
+/* All cases save the original rbp (8 bytes) */
+#ifdef CONFIG_FRAME_POINTER
+# ifdef CC_USING_FENTRY
+/* Save parent and function stack frames (rip and rbp) */
+#  define MCOUNT_FRAME_SIZE	(8+16*2)
+# else
+/* Save just function stack frame (rip and rbp) */
+#  define MCOUNT_FRAME_SIZE	(8+16)
+# endif
+#else
+/* No need to save a stack frame */
+# define MCOUNT_FRAME_SIZE	8
+#endif /* CONFIG_FRAME_POINTER */
 
-ENTRY(function_hook)
-	retq
-END(function_hook)
+/* Size of stack used to save mcount regs in save_mcount_regs */
+#define MCOUNT_REG_SIZE		(SS+8 + MCOUNT_FRAME_SIZE)
+
+/*
+ * gcc -pg option adds a call to 'mcount' in most functions.
+ * When -mfentry is used, the call is to 'fentry' and not 'mcount'
+ * and is done before the function's stack frame is set up.
+ * They both require a set of regs to be saved before calling
+ * any C code and restored before returning back to the function.
+ *
+ * On boot up, all these calls are converted into nops. When tracing
+ * is enabled, the call can jump to either ftrace_caller or
+ * ftrace_regs_caller. Callbacks (tracing functions) that require
+ * ftrace_regs_caller (like kprobes) need to have pt_regs passed to
+ * it. For this reason, the size of the pt_regs structure will be
+ * allocated on the stack and the required mcount registers will
+ * be saved in the locations that pt_regs has them in.
+ */
+
+/* @added: the amount of stack added before calling this */
+.macro save_mcount_regs added=0
+
+	/* Always save the original rbp */
+	pushq %rbp
+
+#ifdef CONFIG_FRAME_POINTER
+	/*
+	 * Stack traces will stop at the ftrace trampoline if the frame pointer
+	 * is not set up properly. If fentry is used, we need to save a frame
+	 * pointer for the parent as well as the function traced, because the
+	 * fentry is called before the stack frame is set up, where as mcount
+	 * is called afterward.
+	 */
+#ifdef CC_USING_FENTRY
+	/* Save the parent pointer (skip orig rbp and our return address) */
+	pushq \added+8*2(%rsp)
+	pushq %rbp
+	movq %rsp, %rbp
+	/* Save the return address (now skip orig rbp, rbp and parent) */
+	pushq \added+8*3(%rsp)
+#else
+	/* Can't assume that rip is before this (unless added was zero) */
+	pushq \added+8(%rsp)
+#endif
+	pushq %rbp
+	movq %rsp, %rbp
+#endif /* CONFIG_FRAME_POINTER */
+
+	/*
+	 * We add enough stack to save all regs.
+	 */
+	subq $(MCOUNT_REG_SIZE - MCOUNT_FRAME_SIZE), %rsp
+	movq %rax, RAX(%rsp)
+	movq %rcx, RCX(%rsp)
+	movq %rdx, RDX(%rsp)
+	movq %rsi, RSI(%rsp)
+	movq %rdi, RDI(%rsp)
+	movq %r8, R8(%rsp)
+	movq %r9, R9(%rsp)
+	/*
+	 * Save the original RBP. Even though the mcount ABI does not
+	 * require this, it helps out callers.
+	 */
+	movq MCOUNT_REG_SIZE-8(%rsp), %rdx
+	movq %rdx, RBP(%rsp)
+
+	 /* Move RIP to its proper location */
+	movq MCOUNT_REG_SIZE+\added(%rsp), %rdi
+	movq %rdi, RIP(%rsp)
+	.endm
+
+.macro restore_mcount_regs
+	movq R9(%rsp), %r9
+	movq R8(%rsp), %r8
+	movq RDI(%rsp), %rdi
+	movq RSI(%rsp), %rsi
+	movq RDX(%rsp), %rdx
+	movq RCX(%rsp), %rcx
+	movq RAX(%rsp), %rax
+
+	/* ftrace_regs_caller can modify %rbp */
+	movq RBP(%rsp), %rbp
+
+	addq $MCOUNT_REG_SIZE, %rsp
+
+	.endm
 
 /* skip is set if stack has been adjusted */
-.macro ftrace_caller_setup trace_label skip=0
-	MCOUNT_SAVE_FRAME \skip
+.macro ftrace_caller_setup trace_label added=0
+	save_mcount_regs \added
 
 	/* Save this location */
 GLOBAL(\trace_label)
 	/* Load the ftrace_ops into the 3rd parameter */
 	movq function_trace_op(%rip), %rdx
 
-	/* Load ip into the first parameter */
-	movq RIP(%rsp), %rdi
+	/* %rdi already has %rip from the save_mcount_regs macro */
 	subq $MCOUNT_INSN_SIZE, %rdi
 	/* Load the parent_ip into the second parameter */
 #ifdef CC_USING_FENTRY
-	movq SS+16(%rsp), %rsi
+	movq MCOUNT_REG_SIZE+8+\added(%rsp), %rsi
 #else
-	movq 8(%rbp), %rsi
+	/* Need to grab the original %rbp */
+	movq RBP(%rsp), %rsi
+	/* Now parent address is 8 above original %rbp */
+	movq 8(%rsi), %rsi
 #endif
 .endm
 
-#ifdef CONFIG_FRAME_POINTER
-/*
- * Stack traces will stop at the ftrace trampoline if the frame pointer
- * is not set up properly. If fentry is used, we need to save a frame
- * pointer for the parent as well as the function traced, because the
- * fentry is called before the stack frame is set up, where as mcount
- * is called afterward.
- */
-.macro create_frame parent rip
-#ifdef CC_USING_FENTRY
-	pushq \parent
-	pushq %rbp
-	movq %rsp, %rbp
-#endif
-	pushq \rip
-	pushq %rbp
-	movq %rsp, %rbp
-.endm
+#ifdef CONFIG_DYNAMIC_FTRACE
 
-.macro restore_frame
-#ifdef CC_USING_FENTRY
-	addq $16, %rsp
-#endif
-	popq %rbp
-	addq $8, %rsp
-.endm
-#else
-.macro create_frame parent rip
-.endm
-.macro restore_frame
-.endm
-#endif /* CONFIG_FRAME_POINTER */
+ENTRY(function_hook)
+	retq
+END(function_hook)
 
 ENTRY(ftrace_caller)
 	ftrace_caller_setup ftrace_caller_op_ptr
 	/* regs go into 4th parameter (but make it NULL) */
 	movq $0, %rcx
 
-	create_frame %rsi, %rdi
-
 GLOBAL(ftrace_call)
 	call ftrace_stub
 
-	restore_frame
-
-	MCOUNT_RESTORE_FRAME
+	restore_mcount_regs
 
 	/*
 	 * The copied trampoline must call ftrace_return as it
@@ -112,10 +178,10 @@ GLOBAL(ftrace_stub)
 END(ftrace_caller)
 
 ENTRY(ftrace_regs_caller)
-	/* Save the current flags before compare (in SS location)*/
+	/* Save the current flags before any operations that can change them */
 	pushfq
 
-	/* skip=8 to skip flags saved in SS */
+	/* added 8 bytes to save flags */
 	ftrace_caller_setup ftrace_regs_caller_op_ptr 8
 
 	/* Save the rest of pt_regs */
@@ -125,37 +191,32 @@ ENTRY(ftrace_regs_caller)
 	movq %r12, R12(%rsp)
 	movq %r11, R11(%rsp)
 	movq %r10, R10(%rsp)
-	movq %rbp, RBP(%rsp)
 	movq %rbx, RBX(%rsp)
 	/* Copy saved flags */
-	movq SS(%rsp), %rcx
+	movq MCOUNT_REG_SIZE(%rsp), %rcx
 	movq %rcx, EFLAGS(%rsp)
 	/* Kernel segments */
 	movq $__KERNEL_DS, %rcx
 	movq %rcx, SS(%rsp)
 	movq $__KERNEL_CS, %rcx
 	movq %rcx, CS(%rsp)
-	/* Stack - skipping return address */
-	leaq SS+16(%rsp), %rcx
+	/* Stack - skipping return address and flags */
+	leaq MCOUNT_REG_SIZE+8*2(%rsp), %rcx
 	movq %rcx, RSP(%rsp)
 
 	/* regs go into 4th parameter */
 	leaq (%rsp), %rcx
 
-	create_frame %rsi, %rdi
-
 GLOBAL(ftrace_regs_call)
 	call ftrace_stub
 
-	restore_frame
-
 	/* Copy flags back to SS, to restore them */
 	movq EFLAGS(%rsp), %rax
-	movq %rax, SS(%rsp)
+	movq %rax, MCOUNT_REG_SIZE(%rsp)
 
 	/* Handlers can change the RIP */
 	movq RIP(%rsp), %rax
-	movq %rax, SS+8(%rsp)
+	movq %rax, MCOUNT_REG_SIZE+8(%rsp)
 
 	/* restore the rest of pt_regs */
 	movq R15(%rsp), %r15
@@ -163,11 +224,9 @@ GLOBAL(ftrace_regs_call)
 	movq R13(%rsp), %r13
 	movq R12(%rsp), %r12
 	movq R10(%rsp), %r10
-	movq RBP(%rsp), %rbp
 	movq RBX(%rsp), %rbx
 
-	/* skip=8 to skip flags saved in SS */
-	MCOUNT_RESTORE_FRAME 8
+	restore_mcount_regs
 
 	/* Restore flags */
 	popfq
@@ -182,9 +241,6 @@ GLOBAL(ftrace_regs_caller_end)
 
 	jmp ftrace_return
 
-	popfq
-	jmp  ftrace_stub
-
 END(ftrace_regs_caller)
 
 
@@ -207,19 +263,11 @@ GLOBAL(ftrace_stub)
 	retq
 
 trace:
-	MCOUNT_SAVE_FRAME
-
-	movq RIP(%rsp), %rdi
-#ifdef CC_USING_FENTRY
-	movq SS+16(%rsp), %rsi
-#else
-	movq 8(%rbp), %rsi
-#endif
-	subq $MCOUNT_INSN_SIZE, %rdi
+	ftrace_caller_setup ftrace_caller_op_ptr
 
 	call   *ftrace_trace_function
 
-	MCOUNT_RESTORE_FRAME
+	restore_mcount_regs
 
 	jmp fgraph_trace
 END(function_hook)
@@ -228,21 +276,24 @@ END(function_hook)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 ENTRY(ftrace_graph_caller)
-	MCOUNT_SAVE_FRAME
+	save_mcount_regs
 
 #ifdef CC_USING_FENTRY
-	leaq SS+16(%rsp), %rdi
+	leaq MCOUNT_REG_SIZE+8(%rsp), %rdi
 	movq $0, %rdx	/* No framepointers needed */
 #else
-	leaq 8(%rbp), %rdi
-	movq (%rbp), %rdx
+	/* Need to grab the original %rbp */
+	movq RBP(%rsp), %rdx
+	/* Now parent address is 8 above original %rbp */
+	leaq 8(%rdx), %rdi
+	movq (%rdx), %rdx
 #endif
 	movq RIP(%rsp), %rsi
 	subq $MCOUNT_INSN_SIZE, %rsi
 
 	call	prepare_ftrace_return
 
-	MCOUNT_RESTORE_FRAME
+	restore_mcount_regs
 
 	retq
 END(ftrace_graph_caller)


* [RFC][PATCH 1/7] ftrace/x86: Have static tracing also use ftrace_caller_setup
  2014-11-25  0:42 [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Steven Rostedt
@ 2014-11-25  0:42 ` Steven Rostedt
  2014-11-25  0:42 ` [RFC][PATCH 2/7] ftrace/x86: Move MCOUNT_SAVE_FRAME out of header file Steven Rostedt
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2014-11-25  0:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu


From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

Linus pointed out that there were locations that hard-coded the update
of the parent and rip parameters. One of them was the static tracer,
which could also use ftrace_caller_setup to do that work. In fact,
because it did not use it, it was prone to bugs, and since the static
tracer is hardly ever used (who wants function tracing code always being
called?) it doesn't get tested very often. I only run a few "does it
still work" tests on it, but I do not run stress tests on that code.
Although, since it is never turned off, just having it on should be
stressful enough (especially for the performance folks).

There's no reason that the static tracer can't also use ftrace_caller_setup.
Have it do so.

Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/mcount_64.S | 22 +++++++---------------
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index 6dc134b8dc70..24842c701660 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -21,12 +21,6 @@
 # define function_hook	mcount
 #endif
 
-#ifdef CONFIG_DYNAMIC_FTRACE
-
-ENTRY(function_hook)
-	retq
-END(function_hook)
-
 /* skip is set if stack has been adjusted */
 .macro ftrace_caller_setup trace_label skip=0
 	MCOUNT_SAVE_FRAME \skip
@@ -47,6 +41,12 @@ GLOBAL(\trace_label)
 #endif
 .endm
 
+#ifdef CONFIG_DYNAMIC_FTRACE
+
+ENTRY(function_hook)
+	retq
+END(function_hook)
+
 #ifdef CONFIG_FRAME_POINTER
 /*
  * Stack traces will stop at the ftrace trampoline if the frame pointer
@@ -207,15 +207,7 @@ GLOBAL(ftrace_stub)
 	retq
 
 trace:
-	MCOUNT_SAVE_FRAME
-
-	movq RIP(%rsp), %rdi
-#ifdef CC_USING_FENTRY
-	movq SS+16(%rsp), %rsi
-#else
-	movq 8(%rbp), %rsi
-#endif
-	subq $MCOUNT_INSN_SIZE, %rdi
+	ftrace_caller_setup ftrace_caller_op_ptr
 
 	call   *ftrace_trace_function
 
-- 
2.1.1




* [RFC][PATCH 2/7] ftrace/x86: Move MCOUNT_SAVE_FRAME out of header file
  2014-11-25  0:42 [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Steven Rostedt
  2014-11-25  0:42 ` [RFC][PATCH 1/7] ftrace/x86: Have static tracing also use ftrace_caller_setup Steven Rostedt
@ 2014-11-25  0:42 ` Steven Rostedt
  2014-11-25  0:42 ` [RFC][PATCH 3/7] ftrace/x86: Rename MCOUNT_SAVE_FRAME and add more detailed comments Steven Rostedt
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2014-11-25  0:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu


From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

Linus pointed out that MCOUNT_SAVE_FRAME is used in only a single file
and that there's no reason that it should be in a header file.
Move the macro to the code that uses it.

Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/include/asm/ftrace.h | 33 ---------------------------------
 arch/x86/kernel/mcount_64.S   | 29 +++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 33 deletions(-)

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index e1f7fecaa7d6..f45acad3c4b6 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -1,39 +1,6 @@
 #ifndef _ASM_X86_FTRACE_H
 #define _ASM_X86_FTRACE_H
 
-#ifdef __ASSEMBLY__
-
-	/* skip is set if the stack was already partially adjusted */
-	.macro MCOUNT_SAVE_FRAME skip=0
-	 /*
-	  * We add enough stack to save all regs.
-	  */
-	subq $(SS+8-\skip), %rsp
-	movq %rax, RAX(%rsp)
-	movq %rcx, RCX(%rsp)
-	movq %rdx, RDX(%rsp)
-	movq %rsi, RSI(%rsp)
-	movq %rdi, RDI(%rsp)
-	movq %r8, R8(%rsp)
-	movq %r9, R9(%rsp)
-	 /* Move RIP to its proper location */
-	movq SS+8(%rsp), %rdx
-	movq %rdx, RIP(%rsp)
-	.endm
-
-	.macro MCOUNT_RESTORE_FRAME skip=0
-	movq R9(%rsp), %r9
-	movq R8(%rsp), %r8
-	movq RDI(%rsp), %rdi
-	movq RSI(%rsp), %rsi
-	movq RDX(%rsp), %rdx
-	movq RCX(%rsp), %rcx
-	movq RAX(%rsp), %rax
-	addq $(SS+8-\skip), %rsp
-	.endm
-
-#endif
-
 #ifdef CONFIG_FUNCTION_TRACER
 #ifdef CC_USING_FENTRY
 # define MCOUNT_ADDR		((long)(__fentry__))
diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index 24842c701660..94fe46725fe0 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -21,6 +21,35 @@
 # define function_hook	mcount
 #endif
 
+/* skip is set if the stack was already partially adjusted */
+.macro MCOUNT_SAVE_FRAME skip=0
+	 /*
+	  * We add enough stack to save all regs.
+	  */
+	subq $(SS+8-\skip), %rsp
+	movq %rax, RAX(%rsp)
+	movq %rcx, RCX(%rsp)
+	movq %rdx, RDX(%rsp)
+	movq %rsi, RSI(%rsp)
+	movq %rdi, RDI(%rsp)
+	movq %r8, R8(%rsp)
+	movq %r9, R9(%rsp)
+	 /* Move RIP to its proper location */
+	movq SS+8(%rsp), %rdx
+	movq %rdx, RIP(%rsp)
+	.endm
+
+.macro MCOUNT_RESTORE_FRAME skip=0
+	movq R9(%rsp), %r9
+	movq R8(%rsp), %r8
+	movq RDI(%rsp), %rdi
+	movq RSI(%rsp), %rsi
+	movq RDX(%rsp), %rdx
+	movq RCX(%rsp), %rcx
+	movq RAX(%rsp), %rax
+	addq $(SS+8-\skip), %rsp
+	.endm
+
 /* skip is set if stack has been adjusted */
 .macro ftrace_caller_setup trace_label skip=0
 	MCOUNT_SAVE_FRAME \skip
-- 
2.1.1




* [RFC][PATCH 3/7] ftrace/x86: Rename MCOUNT_SAVE_FRAME and add more detailed comments
  2014-11-25  0:42 [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Steven Rostedt
  2014-11-25  0:42 ` [RFC][PATCH 1/7] ftrace/x86: Have static tracing also use ftrace_caller_setup Steven Rostedt
  2014-11-25  0:42 ` [RFC][PATCH 2/7] ftrace/x86: Move MCOUNT_SAVE_FRAME out of header file Steven Rostedt
@ 2014-11-25  0:42 ` Steven Rostedt
  2014-11-25  0:42 ` [RFC][PATCH 4/7] ftrace/x86: Have save_mcount_regs store RIP in %rdi for first parameter Steven Rostedt
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2014-11-25  0:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu


From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

The name MCOUNT_SAVE_FRAME is rather confusing, as it isn't really a
function frame that is saved, but just the registers the mcount ABI
requires to be saved before C code may be called. The word "frame"
suggests a function frame, which it is not.

Rename MCOUNT_SAVE_FRAME and MCOUNT_RESTORE_FRAME to save_mcount_regs
and restore_mcount_regs respectively. Note the lower case, which keeps
the macros from screaming at the reviewers.

Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/mcount_64.S | 32 ++++++++++++++++++++++++--------
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index 94fe46725fe0..0a693d011980 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -21,8 +21,24 @@
 # define function_hook	mcount
 #endif
 
+/*
+ * gcc -pg option adds a call to 'mcount' in most functions.
+ * When -mfentry is used, the call is to 'fentry' and not 'mcount'
+ * and is done before the function's stack frame is set up.
+ * They both require a set of regs to be saved before calling
+ * any C code and restored before returning back to the function.
+ *
+ * On boot up, all these calls are converted into nops. When tracing
+ * is enabled, the call can jump to either ftrace_caller or
+ * ftrace_regs_caller. Callbacks (tracing functions) that require
+ * ftrace_regs_caller (like kprobes) need to have pt_regs passed to
+ * it. For this reason, the size of the pt_regs structure will be
+ * allocated on the stack and the required mcount registers will
+ * be saved in the locations that pt_regs has them in.
+ */
+
 /* skip is set if the stack was already partially adjusted */
-.macro MCOUNT_SAVE_FRAME skip=0
+.macro save_mcount_regs skip=0
 	 /*
 	  * We add enough stack to save all regs.
 	  */
@@ -39,7 +55,7 @@
 	movq %rdx, RIP(%rsp)
 	.endm
 
-.macro MCOUNT_RESTORE_FRAME skip=0
+.macro restore_mcount_regs skip=0
 	movq R9(%rsp), %r9
 	movq R8(%rsp), %r8
 	movq RDI(%rsp), %rdi
@@ -52,7 +68,7 @@
 
 /* skip is set if stack has been adjusted */
 .macro ftrace_caller_setup trace_label skip=0
-	MCOUNT_SAVE_FRAME \skip
+	save_mcount_regs \skip
 
 	/* Save this location */
 GLOBAL(\trace_label)
@@ -121,7 +137,7 @@ GLOBAL(ftrace_call)
 
 	restore_frame
 
-	MCOUNT_RESTORE_FRAME
+	restore_mcount_regs
 
 	/*
 	 * The copied trampoline must call ftrace_return as it
@@ -196,7 +212,7 @@ GLOBAL(ftrace_regs_call)
 	movq RBX(%rsp), %rbx
 
 	/* skip=8 to skip flags saved in SS */
-	MCOUNT_RESTORE_FRAME 8
+	restore_mcount_regs 8
 
 	/* Restore flags */
 	popfq
@@ -240,7 +256,7 @@ trace:
 
 	call   *ftrace_trace_function
 
-	MCOUNT_RESTORE_FRAME
+	restore_mcount_regs
 
 	jmp fgraph_trace
 END(function_hook)
@@ -249,7 +265,7 @@ END(function_hook)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 ENTRY(ftrace_graph_caller)
-	MCOUNT_SAVE_FRAME
+	save_mcount_regs
 
 #ifdef CC_USING_FENTRY
 	leaq SS+16(%rsp), %rdi
@@ -263,7 +279,7 @@ ENTRY(ftrace_graph_caller)
 
 	call	prepare_ftrace_return
 
-	MCOUNT_RESTORE_FRAME
+	restore_mcount_regs
 
 	retq
 END(ftrace_graph_caller)
-- 
2.1.1




* [RFC][PATCH 4/7] ftrace/x86: Have save_mcount_regs store RIP in %rdi for first parameter
  2014-11-25  0:42 [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Steven Rostedt
                   ` (2 preceding siblings ...)
  2014-11-25  0:42 ` [RFC][PATCH 3/7] ftrace/x86: Rename MCOUNT_SAVE_FRAME and add more detailed comments Steven Rostedt
@ 2014-11-25  0:42 ` Steven Rostedt
  2014-11-25  0:42 ` [RFC][PATCH 5/7] ftrace/x86: Simplify save_mcount_regs on getting RIP Steven Rostedt
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2014-11-25  0:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu


From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

Instead of having save_mcount_regs use %rdx as a temp register to place
the RIP in its proper location in the pt_regs on the stack, use %rdi as
the temp register. Since %rdi holds the first parameter for the function
hooks, this lets us remove the extra load of RIP in the
ftrace_caller_setup macro.

Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/mcount_64.S | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index 0a693d011980..4f1b27642495 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -51,8 +51,8 @@
 	movq %r8, R8(%rsp)
 	movq %r9, R9(%rsp)
 	 /* Move RIP to its proper location */
-	movq SS+8(%rsp), %rdx
-	movq %rdx, RIP(%rsp)
+	movq SS+8(%rsp), %rdi
+	movq %rdi, RIP(%rsp)
 	.endm
 
 .macro restore_mcount_regs skip=0
@@ -75,8 +75,7 @@ GLOBAL(\trace_label)
 	/* Load the ftrace_ops into the 3rd parameter */
 	movq function_trace_op(%rip), %rdx
 
-	/* Load ip into the first parameter */
-	movq RIP(%rsp), %rdi
+	/* %rdi already has %rip from the save_mcount_regs macro */
 	subq $MCOUNT_INSN_SIZE, %rdi
 	/* Load the parent_ip into the second parameter */
 #ifdef CC_USING_FENTRY
-- 
2.1.1




* [RFC][PATCH 5/7] ftrace/x86: Simplify save_mcount_regs on getting RIP
  2014-11-25  0:42 [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Steven Rostedt
                   ` (3 preceding siblings ...)
  2014-11-25  0:42 ` [RFC][PATCH 4/7] ftrace/x86: Have save_mcount_regs store RIP in %rdi for first parameter Steven Rostedt
@ 2014-11-25  0:42 ` Steven Rostedt
  2014-11-25  0:42 ` [RFC][PATCH 6/7] ftrace/x86: Add macro MCOUNT_REG_SIZE for amount of stack used to save mcount regs Steven Rostedt
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2014-11-25  0:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu


From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

Currently save_mcount_regs is passed a "skip" parameter telling it how
much stack the caller has already adjusted, as it tries to keep the saved
pt_regs in the same location for all users. This is rather stupid,
especially since what gets stored in that extra space has nothing to do
with what is supposed to be in that pt_regs location.

Instead of doing that, just pass in an "added" parameter that lets the
macro know how much stack was added before it was called, so that it
can still get to the RIP. The difference is that the pt_regs area is now
offset by that "added" count, and the callers must take care of the
offset themselves.

This will make it easier to simplify the code later.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/mcount_64.S | 37 ++++++++++++++++++-------------------
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index 4f1b27642495..596ac330c1db 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -37,12 +37,12 @@
  * be saved in the locations that pt_regs has them in.
  */
 
-/* skip is set if the stack was already partially adjusted */
-.macro save_mcount_regs skip=0
+/* @added: the amount of stack added before calling this */
+.macro save_mcount_regs added=0
 	 /*
 	  * We add enough stack to save all regs.
 	  */
-	subq $(SS+8-\skip), %rsp
+	subq $(SS+8), %rsp
 	movq %rax, RAX(%rsp)
 	movq %rcx, RCX(%rsp)
 	movq %rdx, RDX(%rsp)
@@ -51,11 +51,11 @@
 	movq %r8, R8(%rsp)
 	movq %r9, R9(%rsp)
 	 /* Move RIP to its proper location */
-	movq SS+8(%rsp), %rdi
+	movq SS+8+\added(%rsp), %rdi
 	movq %rdi, RIP(%rsp)
 	.endm
 
-.macro restore_mcount_regs skip=0
+.macro restore_mcount_regs
 	movq R9(%rsp), %r9
 	movq R8(%rsp), %r8
 	movq RDI(%rsp), %rdi
@@ -63,12 +63,12 @@
 	movq RDX(%rsp), %rdx
 	movq RCX(%rsp), %rcx
 	movq RAX(%rsp), %rax
-	addq $(SS+8-\skip), %rsp
+	addq $(SS+8), %rsp
 	.endm
 
 /* skip is set if stack has been adjusted */
-.macro ftrace_caller_setup trace_label skip=0
-	save_mcount_regs \skip
+.macro ftrace_caller_setup trace_label added=0
+	save_mcount_regs \added
 
 	/* Save this location */
 GLOBAL(\trace_label)
@@ -79,9 +79,9 @@ GLOBAL(\trace_label)
 	subq $MCOUNT_INSN_SIZE, %rdi
 	/* Load the parent_ip into the second parameter */
 #ifdef CC_USING_FENTRY
-	movq SS+16(%rsp), %rsi
+	movq SS+16+\added(%rsp), %rsi
 #else
-	movq 8(%rbp), %rsi
+	movq 8+\added(%rbp), %rsi
 #endif
 .endm
 
@@ -156,10 +156,10 @@ GLOBAL(ftrace_stub)
 END(ftrace_caller)
 
 ENTRY(ftrace_regs_caller)
-	/* Save the current flags before compare (in SS location)*/
+	/* Save the current flags before any operations that can change them */
 	pushfq
 
-	/* skip=8 to skip flags saved in SS */
+	/* added 8 bytes to save flags */
 	ftrace_caller_setup ftrace_regs_caller_op_ptr 8
 
 	/* Save the rest of pt_regs */
@@ -172,15 +172,15 @@ ENTRY(ftrace_regs_caller)
 	movq %rbp, RBP(%rsp)
 	movq %rbx, RBX(%rsp)
 	/* Copy saved flags */
-	movq SS(%rsp), %rcx
+	movq SS+8(%rsp), %rcx
 	movq %rcx, EFLAGS(%rsp)
 	/* Kernel segments */
 	movq $__KERNEL_DS, %rcx
 	movq %rcx, SS(%rsp)
 	movq $__KERNEL_CS, %rcx
 	movq %rcx, CS(%rsp)
-	/* Stack - skipping return address */
-	leaq SS+16(%rsp), %rcx
+	/* Stack - skipping return address and flags */
+	leaq SS+8*3(%rsp), %rcx
 	movq %rcx, RSP(%rsp)
 
 	/* regs go into 4th parameter */
@@ -195,11 +195,11 @@ GLOBAL(ftrace_regs_call)
 
 	/* Copy flags back to SS, to restore them */
 	movq EFLAGS(%rsp), %rax
-	movq %rax, SS(%rsp)
+	movq %rax, SS+8(%rsp)
 
 	/* Handlers can change the RIP */
 	movq RIP(%rsp), %rax
-	movq %rax, SS+8(%rsp)
+	movq %rax, SS+8*2(%rsp)
 
 	/* restore the rest of pt_regs */
 	movq R15(%rsp), %r15
@@ -210,8 +210,7 @@ GLOBAL(ftrace_regs_call)
 	movq RBP(%rsp), %rbp
 	movq RBX(%rsp), %rbx
 
-	/* skip=8 to skip flags saved in SS */
-	restore_mcount_regs 8
+	restore_mcount_regs
 
 	/* Restore flags */
 	popfq
-- 
2.1.1




* [RFC][PATCH 6/7] ftrace/x86: Add macro MCOUNT_REG_SIZE for amount of stack used to save mcount regs
  2014-11-25  0:42 [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Steven Rostedt
                   ` (4 preceding siblings ...)
  2014-11-25  0:42 ` [RFC][PATCH 5/7] ftrace/x86: Simplify save_mcount_regs on getting RIP Steven Rostedt
@ 2014-11-25  0:42 ` Steven Rostedt
  2014-11-25  0:42 ` [RFC][PATCH 7/7] ftrace/x86: Have save_mcount_regs macro also save stack frames if needed Steven Rostedt
  2014-11-25  1:34 ` [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Linus Torvalds
  7 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2014-11-25  0:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu

[-- Attachment #1: 0006-ftrace-x86-Add-macro-MCOUNT_REG_SIZE-for-amount-of-s.patch --]
[-- Type: text/plain, Size: 3273 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

The save_mcount_regs macro saves regs onto the stack, but its users
currently hard code how much stack it consumes. To decouple the two, add
a define that tells all users how much stack the macro uses. This way
the amount of stack the macro uses can change without breaking its
users.

Also remove some dead code that was left over from commit fdc841b58cf5
"ftrace: x86: Remove check of obsolete variable function_trace_stop".

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/mcount_64.S | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index 596ac330c1db..a0f6f942183a 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -21,6 +21,9 @@
 # define function_hook	mcount
 #endif
 
+/* Size of stack used to save mcount regs in save_mcount_regs */
+#define MCOUNT_REG_SIZE		(SS+8)
+
 /*
  * gcc -pg option adds a call to 'mcount' in most functions.
  * When -mfentry is used, the call is to 'fentry' and not 'mcount'
@@ -42,7 +45,7 @@
 	 /*
 	  * We add enough stack to save all regs.
 	  */
-	subq $(SS+8), %rsp
+	subq $MCOUNT_REG_SIZE, %rsp
 	movq %rax, RAX(%rsp)
 	movq %rcx, RCX(%rsp)
 	movq %rdx, RDX(%rsp)
@@ -51,7 +54,7 @@
 	movq %r8, R8(%rsp)
 	movq %r9, R9(%rsp)
 	 /* Move RIP to its proper location */
-	movq SS+8+\added(%rsp), %rdi
+	movq MCOUNT_REG_SIZE+\added(%rsp), %rdi
 	movq %rdi, RIP(%rsp)
 	.endm
 
@@ -63,7 +66,7 @@
 	movq RDX(%rsp), %rdx
 	movq RCX(%rsp), %rcx
 	movq RAX(%rsp), %rax
-	addq $(SS+8), %rsp
+	addq $MCOUNT_REG_SIZE, %rsp
 	.endm
 
 /* skip is set if stack has been adjusted */
@@ -79,7 +82,7 @@ GLOBAL(\trace_label)
 	subq $MCOUNT_INSN_SIZE, %rdi
 	/* Load the parent_ip into the second parameter */
 #ifdef CC_USING_FENTRY
-	movq SS+16+\added(%rsp), %rsi
+	movq MCOUNT_REG_SIZE+8+\added(%rsp), %rsi
 #else
 	movq 8+\added(%rbp), %rsi
 #endif
@@ -172,7 +175,7 @@ ENTRY(ftrace_regs_caller)
 	movq %rbp, RBP(%rsp)
 	movq %rbx, RBX(%rsp)
 	/* Copy saved flags */
-	movq SS+8(%rsp), %rcx
+	movq MCOUNT_REG_SIZE(%rsp), %rcx
 	movq %rcx, EFLAGS(%rsp)
 	/* Kernel segments */
 	movq $__KERNEL_DS, %rcx
@@ -180,7 +183,7 @@ ENTRY(ftrace_regs_caller)
 	movq $__KERNEL_CS, %rcx
 	movq %rcx, CS(%rsp)
 	/* Stack - skipping return address and flags */
-	leaq SS+8*3(%rsp), %rcx
+	leaq MCOUNT_REG_SIZE+8*2(%rsp), %rcx
 	movq %rcx, RSP(%rsp)
 
 	/* regs go into 4th parameter */
@@ -195,11 +198,11 @@ GLOBAL(ftrace_regs_call)
 
 	/* Copy flags back to SS, to restore them */
 	movq EFLAGS(%rsp), %rax
-	movq %rax, SS+8(%rsp)
+	movq %rax, MCOUNT_REG_SIZE(%rsp)
 
 	/* Handlers can change the RIP */
 	movq RIP(%rsp), %rax
-	movq %rax, SS+8*2(%rsp)
+	movq %rax, MCOUNT_REG_SIZE+8(%rsp)
 
 	/* restore the rest of pt_regs */
 	movq R15(%rsp), %r15
@@ -225,9 +228,6 @@ GLOBAL(ftrace_regs_caller_end)
 
 	jmp ftrace_return
 
-	popfq
-	jmp  ftrace_stub
-
 END(ftrace_regs_caller)
 
 
@@ -266,7 +266,7 @@ ENTRY(ftrace_graph_caller)
 	save_mcount_regs
 
 #ifdef CC_USING_FENTRY
-	leaq SS+16(%rsp), %rdi
+	leaq MCOUNT_REG_SIZE+8(%rsp), %rdi
 	movq $0, %rdx	/* No framepointers needed */
 #else
 	leaq 8(%rbp), %rdi
-- 
2.1.1




* [RFC][PATCH 7/7] ftrace/x86: Have save_mcount_regs macro also save stack frames if needed
  2014-11-25  0:42 [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Steven Rostedt
                   ` (5 preceding siblings ...)
  2014-11-25  0:42 ` [RFC][PATCH 6/7] ftrace/x86: Add macro MCOUNT_REG_SIZE for amount of stack used to save mcount regs Steven Rostedt
@ 2014-11-25  0:42 ` Steven Rostedt
  2014-11-25  1:34 ` [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Linus Torvalds
  7 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2014-11-25  0:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, H. Peter Anvin,
	Thomas Gleixner, Masami Hiramatsu

[-- Attachment #1: 0007-ftrace-x86-Have-save_mcount_regs-macro-also-save-sta.patch --]
[-- Type: text/plain, Size: 6328 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

The save_mcount_regs macro saves and restores the required mcount regs that
need to be saved before calling C code. It is done for all the function hook
utilities (static tracing, dynamic tracing, regs, function graph).

When frame pointers are enabled, the ftrace trampolines need to set up
frames and pointers such that a backtrace (dump stack) can continue past
them. Currently, a separate macro (create_frame) is used to do this, but
it's only done for the ftrace_caller and ftrace_regs_caller functions. It
is not done for the static tracer or function graph tracing.

Instead of having a separate macro record the frames, have
save_mcount_regs perform this task. This also makes all tracers save
the frame pointers when needed.

Link: http://lkml.kernel.org/r/CA+55aFwF+qCGSKdGaEgW4p6N65GZ5_XTV=1NbtWDvxnd5yYLiw@mail.gmail.com

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/mcount_64.S | 118 +++++++++++++++++++++++++-------------------
 1 file changed, 67 insertions(+), 51 deletions(-)

diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index a0f6f942183a..003b22df1d87 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -21,8 +21,22 @@
 # define function_hook	mcount
 #endif
 
+/* All cases save the original rbp (8 bytes) */
+#ifdef CONFIG_FRAME_POINTER
+# ifdef CC_USING_FENTRY
+/* Save parent and function stack frames (rip and rbp) */
+#  define MCOUNT_FRAME_SIZE	(8+16*2)
+# else
+/* Save just function stack frame (rip and rbp) */
+#  define MCOUNT_FRAME_SIZE	(8+16)
+# endif
+#else
+/* No need to save a stack frame */
+# define MCOUNT_FRAME_SIZE	8
+#endif /* CONFIG_FRAME_POINTER */
+
 /* Size of stack used to save mcount regs in save_mcount_regs */
-#define MCOUNT_REG_SIZE		(SS+8)
+#define MCOUNT_REG_SIZE		(SS+8 + MCOUNT_FRAME_SIZE)
 
 /*
  * gcc -pg option adds a call to 'mcount' in most functions.
@@ -42,10 +56,37 @@
 
 /* @added: the amount of stack added before calling this */
 .macro save_mcount_regs added=0
-	 /*
-	  * We add enough stack to save all regs.
-	  */
-	subq $MCOUNT_REG_SIZE, %rsp
+
+	/* Always save the original rbp */
+	pushq %rbp
+
+#ifdef CONFIG_FRAME_POINTER
+	/*
+	 * Stack traces will stop at the ftrace trampoline if the frame pointer
+	 * is not set up properly. If fentry is used, we need to save a frame
+	 * pointer for the parent as well as the function traced, because the
+	 * fentry is called before the stack frame is set up, whereas mcount
+	 * is called afterward.
+	 */
+#ifdef CC_USING_FENTRY
+	/* Save the parent pointer (skip orig rbp and our return address) */
+	pushq \added+8*2(%rsp)
+	pushq %rbp
+	movq %rsp, %rbp
+	/* Save the return address (now skip orig rbp, rbp and parent) */
+	pushq \added+8*3(%rsp)
+#else
+	/* Can't assume that rip is before this (unless added was zero) */
+	pushq \added+8(%rsp)
+#endif
+	pushq %rbp
+	movq %rsp, %rbp
+#endif /* CONFIG_FRAME_POINTER */
+
+	/*
+	 * We add enough stack to save all regs.
+	 */
+	subq $(MCOUNT_REG_SIZE - MCOUNT_FRAME_SIZE), %rsp
 	movq %rax, RAX(%rsp)
 	movq %rcx, RCX(%rsp)
 	movq %rdx, RDX(%rsp)
@@ -53,6 +94,13 @@
 	movq %rdi, RDI(%rsp)
 	movq %r8, R8(%rsp)
 	movq %r9, R9(%rsp)
+	/*
+	 * Save the original RBP. Even though the mcount ABI does not
+	 * require this, it helps out callers.
+	 */
+	movq MCOUNT_REG_SIZE-8(%rsp), %rdx
+	movq %rdx, RBP(%rsp)
+
 	 /* Move RIP to its proper location */
 	movq MCOUNT_REG_SIZE+\added(%rsp), %rdi
 	movq %rdi, RIP(%rsp)
@@ -66,7 +114,12 @@
 	movq RDX(%rsp), %rdx
 	movq RCX(%rsp), %rcx
 	movq RAX(%rsp), %rax
+
+	/* ftrace_regs_caller can modify %rbp */
+	movq RBP(%rsp), %rbp
+
 	addq $MCOUNT_REG_SIZE, %rsp
+
 	.endm
 
 /* skip is set if stack has been adjusted */
@@ -84,7 +137,10 @@ GLOBAL(\trace_label)
 #ifdef CC_USING_FENTRY
 	movq MCOUNT_REG_SIZE+8+\added(%rsp), %rsi
 #else
-	movq 8+\added(%rbp), %rsi
+	/* Need to grab the original %rbp */
+	movq RBP(%rsp), %rsi
+	/* Now parent address is 8 above original %rbp */
+	movq 8(%rsi), %rsi
 #endif
 .endm
 
@@ -94,51 +150,14 @@ ENTRY(function_hook)
 	retq
 END(function_hook)
 
-#ifdef CONFIG_FRAME_POINTER
-/*
- * Stack traces will stop at the ftrace trampoline if the frame pointer
- * is not set up properly. If fentry is used, we need to save a frame
- * pointer for the parent as well as the function traced, because the
- * fentry is called before the stack frame is set up, where as mcount
- * is called afterward.
- */
-.macro create_frame parent rip
-#ifdef CC_USING_FENTRY
-	pushq \parent
-	pushq %rbp
-	movq %rsp, %rbp
-#endif
-	pushq \rip
-	pushq %rbp
-	movq %rsp, %rbp
-.endm
-
-.macro restore_frame
-#ifdef CC_USING_FENTRY
-	addq $16, %rsp
-#endif
-	popq %rbp
-	addq $8, %rsp
-.endm
-#else
-.macro create_frame parent rip
-.endm
-.macro restore_frame
-.endm
-#endif /* CONFIG_FRAME_POINTER */
-
 ENTRY(ftrace_caller)
 	ftrace_caller_setup ftrace_caller_op_ptr
 	/* regs go into 4th parameter (but make it NULL) */
 	movq $0, %rcx
 
-	create_frame %rsi, %rdi
-
 GLOBAL(ftrace_call)
 	call ftrace_stub
 
-	restore_frame
-
 	restore_mcount_regs
 
 	/*
@@ -172,7 +191,6 @@ ENTRY(ftrace_regs_caller)
 	movq %r12, R12(%rsp)
 	movq %r11, R11(%rsp)
 	movq %r10, R10(%rsp)
-	movq %rbp, RBP(%rsp)
 	movq %rbx, RBX(%rsp)
 	/* Copy saved flags */
 	movq MCOUNT_REG_SIZE(%rsp), %rcx
@@ -189,13 +207,9 @@ ENTRY(ftrace_regs_caller)
 	/* regs go into 4th parameter */
 	leaq (%rsp), %rcx
 
-	create_frame %rsi, %rdi
-
 GLOBAL(ftrace_regs_call)
 	call ftrace_stub
 
-	restore_frame
-
 	/* Copy flags back to SS, to restore them */
 	movq EFLAGS(%rsp), %rax
 	movq %rax, MCOUNT_REG_SIZE(%rsp)
@@ -210,7 +224,6 @@ GLOBAL(ftrace_regs_call)
 	movq R13(%rsp), %r13
 	movq R12(%rsp), %r12
 	movq R10(%rsp), %r10
-	movq RBP(%rsp), %rbp
 	movq RBX(%rsp), %rbx
 
 	restore_mcount_regs
@@ -269,8 +282,11 @@ ENTRY(ftrace_graph_caller)
 	leaq MCOUNT_REG_SIZE+8(%rsp), %rdi
 	movq $0, %rdx	/* No framepointers needed */
 #else
-	leaq 8(%rbp), %rdi
-	movq (%rbp), %rdx
+	/* Need to grab the original %rbp */
+	movq RBP(%rsp), %rdx
+	/* Now parent address is 8 above original %rbp */
+	leaq 8(%rdx), %rdi
+	movq (%rdx), %rdx
 #endif
 	movq RIP(%rsp), %rsi
 	subq $MCOUNT_INSN_SIZE, %rsi
-- 
2.1.1




* Re: [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code
  2014-11-25  0:42 [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Steven Rostedt
                   ` (6 preceding siblings ...)
  2014-11-25  0:42 ` [RFC][PATCH 7/7] ftrace/x86: Have save_mcount_regs macro also save stack frames if needed Steven Rostedt
@ 2014-11-25  1:34 ` Linus Torvalds
  2014-11-25  1:56   ` Steven Rostedt
                     ` (2 more replies)
  7 siblings, 3 replies; 12+ messages in thread
From: Linus Torvalds @ 2014-11-25  1:34 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linux Kernel Mailing List, Ingo Molnar, Andrew Morton,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu

On Mon, Nov 24, 2014 at 4:42 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Let me know if these changes have mcount.S give you less heebie-jeebies.

So I haven't looked at the individual patches, I just looked at the
rolled-up final patch in this email.

And yes, from that final patch, I certainly like this much more. At
least it now creates the frame in the obvious place, and the comments
explain the layout.

However, explain this (in the ftrace_caller_setup macro):

 #ifdef CC_USING_FENTRY
-       movq SS+16(%rsp), %rsi
+       movq MCOUNT_REG_SIZE+8+\added(%rsp), %rsi
 #else
-       movq 8(%rbp), %rsi
+       /* Need to grab the original %rbp */
+       movq RBP(%rsp), %rsi
+       /* Now parent address is 8 above original %rbp */
+       movq 8(%rsi), %rsi
 #endif

Why isn't that "follow rbp" approach now *always* the right thing to
do, regardless of fentry-vs-not? And in particular, couldn't you have
made '%rsi' already contain that old rbp address in save_mcount_regs,
the same way %rdi contains the RIP value?

(Yes, you can only do that after you've saved the old %rsi, but that's
easy enough to do by just delaying the second

    mov %rsp,%rbp

until after the save area, and replacing it with

    mov %rbp,%rsi
    lea MCOUNT_REG_SIZE - MCOUNT_FRAME_SIZE(%rsp),%rbp

after the saving of the frame. And now you have that RBP(%rsp) in %rsi
already, so regardless of whether you have CC_USING_FENTRY or not, the
above code becomes just

    /* Now parent address is 8 above original %rbp */
    movq 8(%rsi), %rsi

No?

Ok, so I didn't write it all out, and maybe I made some mistake while
writing this email, but it *looks* like your ftrace_caller_setup macro
is just unnecessarily complicated, and again, it's because you have
two different macros and they aren't taking advantage of each other
very well.

Hmm?

                 Linus


* Re: [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code
  2014-11-25  1:34 ` [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Linus Torvalds
@ 2014-11-25  1:56   ` Steven Rostedt
  2014-11-25  2:36   ` Steven Rostedt
  2014-11-25  2:54   ` Steven Rostedt
  2 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2014-11-25  1:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel Mailing List, Ingo Molnar, Andrew Morton,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu

On Mon, 24 Nov 2014 17:34:04 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Mon, Nov 24, 2014 at 4:42 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > Let me know if these changes have mcount.S give you less heebie-jeebies.
> 
> So I haven't looked at the individual patches, I just looked at the
> rolled-up final patch in this email.
> 
> And yes, from that final patch, I certainly like this much more. At
> least it now creates the frame in the obvious place, and the comments
> explain the layout.
> 
> However, explain this (in the ftrace_caller_setup macro):
> 
>  #ifdef CC_USING_FENTRY
> -       movq SS+16(%rsp), %rsi
> +       movq MCOUNT_REG_SIZE+8+\added(%rsp), %rsi
>  #else
> -       movq 8(%rbp), %rsi
> +       /* Need to grab the original %rbp */
> +       movq RBP(%rsp), %rsi
> +       /* Now parent address is 8 above original %rbp */
> +       movq 8(%rsi), %rsi
>  #endif
> 
> Why isn't that "follow rbp" approach now *always* the right thing to
> do, regardless of fentry-vs-not? And in particular, couldn't you have
> made '%rsi' already contain that old rbp address in save_mcount_regs,
> the same way %rdi contains the RIP value?
> 
> (Yes, you can only do that after you've saved the old %rsi, but that's
> easy enough to do by just delaying the second
> 
>     mov %rsp,%rbp
> 
> until after the save area, and replacing it with
> 
>     mov %rbp,%rsi
>     lea MCOUNT_REG_SIZE - MCOUNT_FRAME_SIZE(%rsp),%rbp
> 
> after the saving of the frame. And now you have that RBP(%rsp) in %rsi
> already, so regardless of whether you have CC_USING_FENTRY or not, the
> above code becomes just
> 
>     /* Now parent address is 8 above original %rbp */
>     movq 8(%rsi), %rsi
> 
> No?
> 
> Ok, so I didn't write it all out, and maybe I made some mistake while
> writing this email, but it *looks* like your ftrace_caller_setup macro
> is just unnecessarily complicated, and again, it's because you have
> two different macros and they aren't taking advantage of each other
> very well.
> 
> Hmm?
> 

Actually, I just wrote a patch to move the MCOUNT_INSN_SIZE update into
save_mcount_regs, and after reading this, I think something like this
may work.

Note, I didn't even compile test it. But this removes that
ftrace_caller_setup completely. (Written on top of this series).

I have to look at how this affects function_graph tracing. But I'm
sure we can make that work too.

-- Steve

diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index 003b22df1d87..438747ca994f 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -101,9 +101,19 @@
 	movq MCOUNT_REG_SIZE-8(%rsp), %rdx
 	movq %rdx, RBP(%rsp)
 
+	/* Copy the parent address into %rsi (second parameter) */
+	movq 8(%rdx), %rsi
+
 	 /* Move RIP to its proper location */
 	movq MCOUNT_REG_SIZE+\added(%rsp), %rdi
 	movq %rdi, RIP(%rsp)
+
+	/*
+	 * Now %rdi (the first parameter) has the return address of
+	 * where ftrace_call returns. But the callbacks expect
+	 * the address of the call itself.
+	 */
+	subq $MCOUNT_INSN_SIZE, %rdi
 	.endm
 
 .macro restore_mcount_regs
@@ -122,28 +132,6 @@
 
 	.endm
 
-/* skip is set if stack has been adjusted */
-.macro ftrace_caller_setup trace_label added=0
-	save_mcount_regs \added
-
-	/* Save this location */
-GLOBAL(\trace_label)
-	/* Load the ftrace_ops into the 3rd parameter */
-	movq function_trace_op(%rip), %rdx
-
-	/* %rdi already has %rip from the save_mcount_regs macro */
-	subq $MCOUNT_INSN_SIZE, %rdi
-	/* Load the parent_ip into the second parameter */
-#ifdef CC_USING_FENTRY
-	movq MCOUNT_REG_SIZE+8+\added(%rsp), %rsi
-#else
-	/* Need to grab the original %rbp */
-	movq RBP(%rsp), %rsi
-	/* Now parent address is 8 above original %rbp */
-	movq 8(%rsi), %rsi
-#endif
-.endm
-
 #ifdef CONFIG_DYNAMIC_FTRACE
 
 ENTRY(function_hook)
@@ -151,7 +139,13 @@ ENTRY(function_hook)
 END(function_hook)
 
 ENTRY(ftrace_caller)
-	ftrace_caller_setup ftrace_caller_op_ptr
+	/* save_mcount_regs fills in first two parameters */
+	save_mcount_regs
+
+GLOBAL(ftrace_caller_op_ptr)
+	/* Load the ftrace_ops into the 3rd parameter */
+	movq function_trace_op(%rip), %rdx
+
 	/* regs go into 4th parameter (but make it NULL) */
 	movq $0, %rcx
 
@@ -181,8 +175,12 @@ ENTRY(ftrace_regs_caller)
 	/* Save the current flags before any operations that can change them */
 	pushfq
 
-	/* added 8 bytes to save flags */
-	ftrace_caller_setup ftrace_regs_caller_op_ptr 8
+	/* save_mcount_regs fills in first two parameters */
+	save_mcount_regs 8
+
+GLOBAL(ftrace_regs_caller_op_ptr)
+	/* Load the ftrace_ops into the 3rd parameter */
+	movq function_trace_op(%rip), %rdx
 
 	/* Save the rest of pt_regs */
 	movq %r15, R15(%rsp)
@@ -263,7 +261,8 @@ GLOBAL(ftrace_stub)
 	retq
 
 trace:
-	ftrace_caller_setup ftrace_caller_op_ptr
+	/* save_mcount_regs fills in first two parameters */
+	save_mcount_regs
 
 	call   *ftrace_trace_function
 



* Re: [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code
  2014-11-25  1:34 ` [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Linus Torvalds
  2014-11-25  1:56   ` Steven Rostedt
@ 2014-11-25  2:36   ` Steven Rostedt
  2014-11-25  2:54   ` Steven Rostedt
  2 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2014-11-25  2:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel Mailing List, Ingo Molnar, Andrew Morton,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu

On Mon, 24 Nov 2014 17:34:04 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Mon, Nov 24, 2014 at 4:42 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > Let me know if these changes have mcount.S give you less heebie-jeebies.
> 
> So I haven't looked at the individual patches, I just looked at the
> rolled-up final patch in this email.
> 
> And yes, from that final patch, I certainly like this much more. At
> least it now creates the frame in the obvious place, and the comments
> explain the layout.
> 
> However, explain this (in the ftrace_caller_setup macro):
> 
>  #ifdef CC_USING_FENTRY
> -       movq SS+16(%rsp), %rsi
> +       movq MCOUNT_REG_SIZE+8+\added(%rsp), %rsi
>  #else
> -       movq 8(%rbp), %rsi
> +       /* Need to grab the original %rbp */
> +       movq RBP(%rsp), %rsi
> +       /* Now parent address is 8 above original %rbp */
> +       movq 8(%rsi), %rsi
>  #endif
> 
> Why isn't that "follow rbp" approach now *always* the right thing to
> do, regardless of fentry-vs-not? And in particular, couldn't you have
> made '%rsi' already contain that old rbp address in save_mcount_regs,
> the same way %rdi contains the RIP value?

Testing my code and watching it crash and burn, I remember why I did it
this way. When using "mcount", frame pointers are always compiled in;
when "fentry" is used, they may or may not be. That is, %rbp may be
meaningless.

We have three scenarios:

1) mcount (and frame pointers)
2) fentry (and frame pointers)
3) fentry (and no frame pointers)

The functions that are traced look like this:

1)
	func:
		push %rbp
		mov  %rsp, %rbp
		[ set up the rest of the frame ]
		call mcount

	That is, the call to the mcount trampoline is done where the
	only way to get to the parent is by "8(%rbp)".

2)
	func:
		call fentry
		push %rbp
		mov  %rsp, %rbp

	Here, when the fentry trampoline is called, the stack contains the
	return address of fentry (func:) and before that, the return
	address of func itself (parent). %rbp holds the parent's frame
	pointer. The only way to get to the parent is 8(%rsp), where
	0(%rsp) is the return back to func.

3)
	func:
		call fentry
		[ do whatever ]

	This is the same as 2), but this time %rbp is meaningless.
	And again, we need to get the parent with 8(%rsp).

But we can still have save_mcount_regs do all the work. Something like
this:


diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index 003b22df1d87..ddc766efa1f1 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -54,7 +54,15 @@
  * be saved in the locations that pt_regs has them in.
  */
 
-/* @added: the amount of stack added before calling this */
+/*
+ * @added: the amount of stack added before calling this
+ *
+ * After this is called, the following registers contain:
+ *
+ *  %rdi - holds the address that called the trampoline
+ *  %rsi - holds the parent function (traced function's return address)
+ *  %rdx - holds the original %rbp
+ */
 .macro save_mcount_regs added=0
 
 	/* Always save the original rbp */
@@ -101,9 +109,24 @@
 	movq MCOUNT_REG_SIZE-8(%rsp), %rdx
 	movq %rdx, RBP(%rsp)
 
+	/* Copy the parent address into %rsi (second parameter) */
+#ifdef CC_USING_FENTRY
+	movq MCOUNT_REG_SIZE+8+\added(%rsp), %rsi
+#else
+	/* %rdx contains original %rbp */
+	movq 8(%rdx), %rsi
+#endif
+
 	 /* Move RIP to its proper location */
 	movq MCOUNT_REG_SIZE+\added(%rsp), %rdi
 	movq %rdi, RIP(%rsp)
+
+	/*
+	 * Now %rdi (the first parameter) has the return address of
+	 * where ftrace_call returns. But the callbacks expect
+	 * the address of the call itself.
+	 */
+	subq $MCOUNT_INSN_SIZE, %rdi
 	.endm
 
 .macro restore_mcount_regs
@@ -122,28 +145,6 @@
 
 	.endm
 
-/* skip is set if stack has been adjusted */
-.macro ftrace_caller_setup trace_label added=0
-	save_mcount_regs \added
-
-	/* Save this location */
-GLOBAL(\trace_label)
-	/* Load the ftrace_ops into the 3rd parameter */
-	movq function_trace_op(%rip), %rdx
-
-	/* %rdi already has %rip from the save_mcount_regs macro */
-	subq $MCOUNT_INSN_SIZE, %rdi
-	/* Load the parent_ip into the second parameter */
-#ifdef CC_USING_FENTRY
-	movq MCOUNT_REG_SIZE+8+\added(%rsp), %rsi
-#else
-	/* Need to grab the original %rbp */
-	movq RBP(%rsp), %rsi
-	/* Now parent address is 8 above original %rbp */
-	movq 8(%rsi), %rsi
-#endif
-.endm
-
 #ifdef CONFIG_DYNAMIC_FTRACE
 
 ENTRY(function_hook)
@@ -151,7 +152,13 @@ ENTRY(function_hook)
 END(function_hook)
 
 ENTRY(ftrace_caller)
-	ftrace_caller_setup ftrace_caller_op_ptr
+	/* save_mcount_regs fills in first two parameters */
+	save_mcount_regs
+
+GLOBAL(ftrace_caller_op_ptr)
+	/* Load the ftrace_ops into the 3rd parameter */
+	movq function_trace_op(%rip), %rdx
+
 	/* regs go into 4th parameter (but make it NULL) */
 	movq $0, %rcx
 
@@ -182,7 +189,12 @@ ENTRY(ftrace_regs_caller)
 	pushfq
 
 	/* added 8 bytes to save flags */
-	ftrace_caller_setup ftrace_regs_caller_op_ptr 8
+	save_mcount_regs 8
+	/* save_mcount_regs fills in first two parameters */
+
+GLOBAL(ftrace_regs_caller_op_ptr)
+	/* Load the ftrace_ops into the 3rd parameter */
+	movq function_trace_op(%rip), %rdx
 
 	/* Save the rest of pt_regs */
 	movq %r15, R15(%rsp)
@@ -263,7 +275,8 @@ GLOBAL(ftrace_stub)
 	retq
 
 trace:
-	ftrace_caller_setup ftrace_caller_op_ptr
+	/* save_mcount_regs fills in first two parameters */
+	save_mcount_regs
 
 	call   *ftrace_trace_function
 
@@ -276,16 +289,16 @@ END(function_hook)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 ENTRY(ftrace_graph_caller)
+	/* Saves rbp into %rdx */
 	save_mcount_regs
 
 #ifdef CC_USING_FENTRY
 	leaq MCOUNT_REG_SIZE+8(%rsp), %rdi
 	movq $0, %rdx	/* No framepointers needed */
 #else
-	/* Need to grab the original %rbp */
-	movq RBP(%rsp), %rdx
-	/* Now parent address is 8 above original %rbp */
+	/* Save address of the return address of traced function */
 	leaq 8(%rdx), %rdi
+	/* ftrace does sanity checks against frame pointers */
 	movq (%rdx), %rdx
 #endif
 	movq RIP(%rsp), %rsi


-- Steve


* Re: [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code
  2014-11-25  1:34 ` [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Linus Torvalds
  2014-11-25  1:56   ` Steven Rostedt
  2014-11-25  2:36   ` Steven Rostedt
@ 2014-11-25  2:54   ` Steven Rostedt
  2 siblings, 0 replies; 12+ messages in thread
From: Steven Rostedt @ 2014-11-25  2:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel Mailing List, Ingo Molnar, Andrew Morton,
	H. Peter Anvin, Thomas Gleixner, Masami Hiramatsu

Here's a new update. I'll spare you the patch series and only show you
the rolled-up patch. By swapping the parameters of
prepare_ftrace_return(), which is used by ftrace_graph_caller, I was able
to have it take advantage of %rdi being filled in as well.

I pushed this up to my repo as well.

-- Steve

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index e1f7fecaa7d6..f45acad3c4b6 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -1,39 +1,6 @@
 #ifndef _ASM_X86_FTRACE_H
 #define _ASM_X86_FTRACE_H
 
-#ifdef __ASSEMBLY__
-
-	/* skip is set if the stack was already partially adjusted */
-	.macro MCOUNT_SAVE_FRAME skip=0
-	 /*
-	  * We add enough stack to save all regs.
-	  */
-	subq $(SS+8-\skip), %rsp
-	movq %rax, RAX(%rsp)
-	movq %rcx, RCX(%rsp)
-	movq %rdx, RDX(%rsp)
-	movq %rsi, RSI(%rsp)
-	movq %rdi, RDI(%rsp)
-	movq %r8, R8(%rsp)
-	movq %r9, R9(%rsp)
-	 /* Move RIP to its proper location */
-	movq SS+8(%rsp), %rdx
-	movq %rdx, RIP(%rsp)
-	.endm
-
-	.macro MCOUNT_RESTORE_FRAME skip=0
-	movq R9(%rsp), %r9
-	movq R8(%rsp), %r8
-	movq RDI(%rsp), %rdi
-	movq RSI(%rsp), %rsi
-	movq RDX(%rsp), %rdx
-	movq RCX(%rsp), %rcx
-	movq RAX(%rsp), %rax
-	addq $(SS+8-\skip), %rsp
-	.endm
-
-#endif
-
 #ifdef CONFIG_FUNCTION_TRACER
 #ifdef CC_USING_FENTRY
 # define MCOUNT_ADDR		((long)(__fentry__))
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 60881d919432..2142376dc8c6 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -871,7 +871,7 @@ static void *addr_from_call(void *ptr)
 	return ptr + MCOUNT_INSN_SIZE + calc.offset;
 }
 
-void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
+void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
 			   unsigned long frame_pointer);
 
 /*
@@ -964,7 +964,7 @@ int ftrace_disable_ftrace_graph_caller(void)
  * Hook the return address and push it in the stack of return addrs
  * in current thread info.
  */
-void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
+void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
 			   unsigned long frame_pointer)
 {
 	unsigned long old;
diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index 6dc134b8dc70..1a5cf2a23ff3 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -21,78 +21,151 @@
 # define function_hook	mcount
 #endif
 
-#ifdef CONFIG_DYNAMIC_FTRACE
+/* All cases save the original rbp (8 bytes) */
+#ifdef CONFIG_FRAME_POINTER
+# ifdef CC_USING_FENTRY
+/* Save parent and function stack frames (rip and rbp) */
+#  define MCOUNT_FRAME_SIZE	(8+16*2)
+# else
+/* Save just function stack frame (rip and rbp) */
+#  define MCOUNT_FRAME_SIZE	(8+16)
+# endif
+#else
+/* No need to save a stack frame */
+# define MCOUNT_FRAME_SIZE	8
+#endif /* CONFIG_FRAME_POINTER */
 
-ENTRY(function_hook)
-	retq
-END(function_hook)
+/* Size of stack used to save mcount regs in save_mcount_regs */
+#define MCOUNT_REG_SIZE		(SS+8 + MCOUNT_FRAME_SIZE)
 
-/* skip is set if stack has been adjusted */
-.macro ftrace_caller_setup trace_label skip=0
-	MCOUNT_SAVE_FRAME \skip
+/*
+ * gcc -pg option adds a call to 'mcount' in most functions.
+ * When -mfentry is used, the call is to 'fentry' and not 'mcount'
+ * and is done before the function's stack frame is set up.
+ * They both require a set of regs to be saved before calling
+ * any C code and restored before returning back to the function.
+ *
+ * On boot up, all these calls are converted into nops. When tracing
+ * is enabled, the call can jump to either ftrace_caller or
+ * ftrace_regs_caller. Callbacks (tracing functions) that require
+ * ftrace_regs_caller (like kprobes) need to have pt_regs passed to
+ * it. For this reason, the size of the pt_regs structure will be
+ * allocated on the stack and the required mcount registers will
+ * be saved in the locations that pt_regs has them in.
+ */
 
-	/* Save this location */
-GLOBAL(\trace_label)
-	/* Load the ftrace_ops into the 3rd parameter */
-	movq function_trace_op(%rip), %rdx
+/*
+ * @added: the amount of stack added before calling this
+ *
+ * After this is called, the following registers contain:
+ *
+ *  %rdi - holds the address that called the trampoline
+ *  %rsi - holds the parent function (traced function's return address)
+ *  %rdx - holds the original %rbp
+ */
+.macro save_mcount_regs added=0
 
-	/* Load ip into the first parameter */
-	movq RIP(%rsp), %rdi
-	subq $MCOUNT_INSN_SIZE, %rdi
-	/* Load the parent_ip into the second parameter */
-#ifdef CC_USING_FENTRY
-	movq SS+16(%rsp), %rsi
-#else
-	movq 8(%rbp), %rsi
-#endif
-.endm
+	/* Always save the original rbp */
+	pushq %rbp
 
 #ifdef CONFIG_FRAME_POINTER
-/*
- * Stack traces will stop at the ftrace trampoline if the frame pointer
- * is not set up properly. If fentry is used, we need to save a frame
- * pointer for the parent as well as the function traced, because the
- * fentry is called before the stack frame is set up, where as mcount
- * is called afterward.
- */
-.macro create_frame parent rip
+	/*
+	 * Stack traces will stop at the ftrace trampoline if the frame pointer
+	 * is not set up properly. If fentry is used, we need to save a frame
+	 * pointer for the parent as well as the function traced, because the
+	 * fentry is called before the stack frame is set up, whereas mcount
+	 * is called afterward.
+	 */
 #ifdef CC_USING_FENTRY
-	pushq \parent
+	/* Save the parent pointer (skip orig rbp and our return address) */
+	pushq \added+8*2(%rsp)
 	pushq %rbp
 	movq %rsp, %rbp
+	/* Save the return address (now skip orig rbp, rbp and parent) */
+	pushq \added+8*3(%rsp)
+#else
+	/* Can't assume that rip is before this (unless added was zero) */
+	pushq \added+8(%rsp)
 #endif
-	pushq \rip
 	pushq %rbp
 	movq %rsp, %rbp
-.endm
+#endif /* CONFIG_FRAME_POINTER */
 
-.macro restore_frame
+	/*
+	 * We add enough stack to save all regs.
+	 */
+	subq $(MCOUNT_REG_SIZE - MCOUNT_FRAME_SIZE), %rsp
+	movq %rax, RAX(%rsp)
+	movq %rcx, RCX(%rsp)
+	movq %rdx, RDX(%rsp)
+	movq %rsi, RSI(%rsp)
+	movq %rdi, RDI(%rsp)
+	movq %r8, R8(%rsp)
+	movq %r9, R9(%rsp)
+	/*
+	 * Save the original RBP. Even though the mcount ABI does not
+	 * require this, it helps out callers.
+	 */
+	movq MCOUNT_REG_SIZE-8(%rsp), %rdx
+	movq %rdx, RBP(%rsp)
+
+	/* Copy the parent address into %rsi (second parameter) */
 #ifdef CC_USING_FENTRY
-	addq $16, %rsp
-#endif
-	popq %rbp
-	addq $8, %rsp
-.endm
+	movq MCOUNT_REG_SIZE+8+\added(%rsp), %rsi
 #else
-.macro create_frame parent rip
-.endm
-.macro restore_frame
-.endm
-#endif /* CONFIG_FRAME_POINTER */
+	/* %rdx contains original %rbp */
+	movq 8(%rdx), %rsi
+#endif
+
+	/* Move RIP to its proper location */
+	movq MCOUNT_REG_SIZE+\added(%rsp), %rdi
+	movq %rdi, RIP(%rsp)
+
+	/*
+	 * Now %rdi (the first parameter) has the return address of
+	 * where ftrace_call returns. But the callbacks expect the
+	 * address of the call itself.
+	 */
+	subq $MCOUNT_INSN_SIZE, %rdi
+	.endm
+
+.macro restore_mcount_regs
+	movq R9(%rsp), %r9
+	movq R8(%rsp), %r8
+	movq RDI(%rsp), %rdi
+	movq RSI(%rsp), %rsi
+	movq RDX(%rsp), %rdx
+	movq RCX(%rsp), %rcx
+	movq RAX(%rsp), %rax
+
+	/* ftrace_regs_caller can modify %rbp */
+	movq RBP(%rsp), %rbp
+
+	addq $MCOUNT_REG_SIZE, %rsp
+
+	.endm
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+
+ENTRY(function_hook)
+	retq
+END(function_hook)
 
 ENTRY(ftrace_caller)
-	ftrace_caller_setup ftrace_caller_op_ptr
+	/* save_mcount_regs fills in first two parameters */
+	save_mcount_regs
+
+GLOBAL(ftrace_caller_op_ptr)
+	/* Load the ftrace_ops into the 3rd parameter */
+	movq function_trace_op(%rip), %rdx
+
 	/* regs go into 4th parameter (but make it NULL) */
 	movq $0, %rcx
 
-	create_frame %rsi, %rdi
-
 GLOBAL(ftrace_call)
 	call ftrace_stub
 
-	restore_frame
-
-	MCOUNT_RESTORE_FRAME
+	restore_mcount_regs
 
 	/*
 	 * The copied trampoline must call ftrace_return as it
@@ -112,11 +185,16 @@ GLOBAL(ftrace_stub)
 END(ftrace_caller)
 
 ENTRY(ftrace_regs_caller)
-	/* Save the current flags before compare (in SS location)*/
+	/* Save the current flags before any operations that can change them */
 	pushfq
 
-	/* skip=8 to skip flags saved in SS */
-	ftrace_caller_setup ftrace_regs_caller_op_ptr 8
+	/* added 8 bytes to save flags */
+	save_mcount_regs 8
+	/* save_mcount_regs fills in first two parameters */
+
+GLOBAL(ftrace_regs_caller_op_ptr)
+	/* Load the ftrace_ops into the 3rd parameter */
+	movq function_trace_op(%rip), %rdx
 
 	/* Save the rest of pt_regs */
 	movq %r15, R15(%rsp)
@@ -125,37 +203,32 @@ ENTRY(ftrace_regs_caller)
 	movq %r12, R12(%rsp)
 	movq %r11, R11(%rsp)
 	movq %r10, R10(%rsp)
-	movq %rbp, RBP(%rsp)
 	movq %rbx, RBX(%rsp)
 	/* Copy saved flags */
-	movq SS(%rsp), %rcx
+	movq MCOUNT_REG_SIZE(%rsp), %rcx
 	movq %rcx, EFLAGS(%rsp)
 	/* Kernel segments */
 	movq $__KERNEL_DS, %rcx
 	movq %rcx, SS(%rsp)
 	movq $__KERNEL_CS, %rcx
 	movq %rcx, CS(%rsp)
-	/* Stack - skipping return address */
-	leaq SS+16(%rsp), %rcx
+	/* Stack - skipping return address and flags */
+	leaq MCOUNT_REG_SIZE+8*2(%rsp), %rcx
 	movq %rcx, RSP(%rsp)
 
 	/* regs go into 4th parameter */
 	leaq (%rsp), %rcx
 
-	create_frame %rsi, %rdi
-
 GLOBAL(ftrace_regs_call)
 	call ftrace_stub
 
-	restore_frame
-
 	/* Copy flags back to SS, to restore them */
 	movq EFLAGS(%rsp), %rax
-	movq %rax, SS(%rsp)
+	movq %rax, MCOUNT_REG_SIZE(%rsp)
 
 	/* Handlers can change the RIP */
 	movq RIP(%rsp), %rax
-	movq %rax, SS+8(%rsp)
+	movq %rax, MCOUNT_REG_SIZE+8(%rsp)
 
 	/* restore the rest of pt_regs */
 	movq R15(%rsp), %r15
@@ -163,11 +236,9 @@ GLOBAL(ftrace_regs_call)
 	movq R13(%rsp), %r13
 	movq R12(%rsp), %r12
 	movq R10(%rsp), %r10
-	movq RBP(%rsp), %rbp
 	movq RBX(%rsp), %rbx
 
-	/* skip=8 to skip flags saved in SS */
-	MCOUNT_RESTORE_FRAME 8
+	restore_mcount_regs
 
 	/* Restore flags */
 	popfq
@@ -182,9 +253,6 @@ GLOBAL(ftrace_regs_caller_end)
 
 	jmp ftrace_return
 
-	popfq
-	jmp  ftrace_stub
-
 END(ftrace_regs_caller)
 
 
@@ -207,19 +275,12 @@ GLOBAL(ftrace_stub)
 	retq
 
 trace:
-	MCOUNT_SAVE_FRAME
-
-	movq RIP(%rsp), %rdi
-#ifdef CC_USING_FENTRY
-	movq SS+16(%rsp), %rsi
-#else
-	movq 8(%rbp), %rsi
-#endif
-	subq $MCOUNT_INSN_SIZE, %rdi
+	/* save_mcount_regs fills in first two parameters */
+	save_mcount_regs
 
 	call   *ftrace_trace_function
 
-	MCOUNT_RESTORE_FRAME
+	restore_mcount_regs
 
 	jmp fgraph_trace
 END(function_hook)
@@ -228,21 +289,21 @@ END(function_hook)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 ENTRY(ftrace_graph_caller)
-	MCOUNT_SAVE_FRAME
+	/* Saves rbp into %rdx and fills first parameter */
+	save_mcount_regs
 
 #ifdef CC_USING_FENTRY
-	leaq SS+16(%rsp), %rdi
+	leaq MCOUNT_REG_SIZE+8(%rsp), %rsi
 	movq $0, %rdx	/* No framepointers needed */
 #else
-	leaq 8(%rbp), %rdi
-	movq (%rbp), %rdx
+	/* Save address of the return address of traced function */
+	leaq 8(%rdx), %rsi
+	/* ftrace does sanity checks against frame pointers */
+	movq (%rdx), %rdx
 #endif
-	movq RIP(%rsp), %rsi
-	subq $MCOUNT_INSN_SIZE, %rsi
-
 	call	prepare_ftrace_return
 
-	MCOUNT_RESTORE_FRAME
+	restore_mcount_regs
 
 	retq
 END(ftrace_graph_caller)
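The offset arithmetic in save_mcount_regs (`\added+8*2(%rsp)` for the parent
pointer, `\added+8*3(%rsp)` for the return address after two more pushes) can
be cross-checked with a small stack simulation. This sketch is not part of the
patch; it models only the CC_USING_FENTRY case with CONFIG_FRAME_POINTER set,
where `added` is the number of bytes the caller pushed before invoking the
macro (8 for ftrace_regs_caller's pushfq, 0 for ftrace_caller):

```python
# Simulate the trampoline stack for the fentry + frame-pointer case.
# Index i of the list models the qword at offset i*8 from the current %rsp
# (the stack grows down, so a push prepends to the list).

def layout(added):
    # At trampoline entry: [rsp] = return address into the traced function
    # (its RIP), 8(%rsp) = the traced function's own return address (parent).
    stack = ["RIP", "parent"]
    # Bytes the caller pushed before save_mcount_regs (e.g. saved flags).
    stack = ["caller_push"] * (added // 8) + stack
    # pushq %rbp -- the macro always saves the original rbp first.
    stack = ["orig_rbp"] + stack

    # 'pushq \added+8*2(%rsp)' must grab the parent pointer:
    parent_off = added + 8 * 2
    assert stack[parent_off // 8] == "parent"
    stack = ["parent_copy"] + stack   # pushq \added+8*2(%rsp)
    stack = ["rbp"] + stack           # pushq %rbp; movq %rsp, %rbp

    # After those two pushes, 'pushq \added+8*3(%rsp)' must grab RIP:
    rip_off = added + 8 * 3
    assert stack[rip_off // 8] == "RIP"
    return parent_off, rip_off

print(layout(0))   # ftrace_caller: parent at 16(%rsp), RIP at 24(%rsp)
print(layout(8))   # ftrace_regs_caller after pushfq: 24 and 32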


Thread overview: 12+ messages
2014-11-25  0:42 [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Steven Rostedt
2014-11-25  0:42 ` [RFC][PATCH 1/7] ftrace/x86: Have static tracing also use ftrace_caller_setup Steven Rostedt
2014-11-25  0:42 ` [RFC][PATCH 2/7] ftrace/x86: Move MCOUNT_SAVE_FRAME out of header file Steven Rostedt
2014-11-25  0:42 ` [RFC][PATCH 3/7] ftrace/x86: Rename MCOUNT_SAVE_FRAME and add more detailed comments Steven Rostedt
2014-11-25  0:42 ` [RFC][PATCH 4/7] ftrace/x86: Have save_mcount_regs store RIP in %rdi for first parameter Steven Rostedt
2014-11-25  0:42 ` [RFC][PATCH 5/7] ftrace/x86: Simplify save_mcount_regs on getting RIP Steven Rostedt
2014-11-25  0:42 ` [RFC][PATCH 6/7] ftrace/x86: Add macro MCOUNT_REG_SIZE for amount of stack used to save mcount regs Steven Rostedt
2014-11-25  0:42 ` [RFC][PATCH 7/7] ftrace/x86: Have save_mcount_regs macro also save stack frames if needed Steven Rostedt
2014-11-25  1:34 ` [RFC][PATCH 0/7] ftrace/x86: Clean up of mcount.S code Linus Torvalds
2014-11-25  1:56   ` Steven Rostedt
2014-11-25  2:36   ` Steven Rostedt
2014-11-25  2:54   ` Steven Rostedt
