* [RFC][PATCH 0/4 v4] ftrace/kprobes: Setting up ftrace for kprobes
@ 2012-07-11 19:50 Steven Rostedt
  2012-07-11 19:50 ` [RFC][PATCH 1/4 v4] ftrace/x86: Add separate function to save regs Steven Rostedt
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Steven Rostedt @ 2012-07-11 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin

I'm only posting patches that changed from v3. Those changes include:

Saving the flags after MCOUNT_RESTORE_FRAME (suggested by Masami).
I decided to restore the flags from pt_regs, so that a kprobe
can change the flags register.

I added the offset fix for function parent pointer (pointed out by
 Alexander van Heukelum).

I fixed the i386 version of "save regs" to restore flags correctly
(as pointed out by Masami Hiramatsu).

I also added two new patches.

1) I removed the double check of the function_trace_stop variable that
   the function graph tracer was doing (it tested it in the function
   tracer trampoline, and again in the function graph trampoline).

2) I added internal recursion protection, after finding that kprobes
   was triggering recursion. This was long overdue anyway.

Masami,

Could you give your Reviewed-by tags for the first two patches, at least?

Thanks,

-- Steve

Steven Rostedt (4):
      ftrace/x86: Add separate function to save regs
      ftrace/x86: Add save_regs for i386 function calls
      ftrace/x86: Remove function_trace_stop check from graph caller
      ftrace/x86_64: Add recursion protection inside mcount caller

----
 arch/x86/include/asm/ftrace.h |   47 +++++++++-------
 arch/x86/kernel/entry_32.S    |   93 ++++++++++++++++++++++++++++--
 arch/x86/kernel/entry_64.S    |  125 ++++++++++++++++++++++++++++++++++++-----
 arch/x86/kernel/ftrace.c      |   80 ++++++++++++++++++++++++--
 include/linux/ftrace.h        |  109 ++++++++++++++++++++++++++++++++---
 kernel/trace/ftrace.c         |   91 +++++++++++++++++++++++++++---
 6 files changed, 485 insertions(+), 60 deletions(-)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC][PATCH 1/4 v4] ftrace/x86: Add separate function to save regs
  2012-07-11 19:50 [RFC][PATCH 0/4 v4] ftrace/kprobes: Setting up ftrace for kprobes Steven Rostedt
@ 2012-07-11 19:50 ` Steven Rostedt
  2012-07-12 12:12   ` Masami Hiramatsu
  2012-07-11 19:50 ` [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls Steven Rostedt
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2012-07-11 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin, Alexander van Heukelum

[-- Attachment #1: 0001-ftrace-x86-Add-separate-function-to-save-regs.patch --]
[-- Type: text/plain, Size: 22504 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Add a way to have different functions call different trampolines.
If an ftrace_ops wants regs saved on return, then only the functions
with that ops registered will save regs. Functions registered by
other ops would not be affected, unless the functions overlap.

If one ftrace_ops registers functions A, B and C, and another ops
registers A and D to save regs, then only functions A and D will
save regs. Functions B and C work as normal. Although A is registered
by both ops (one normal, one saving regs), this is fine: saving the
regs satisfies the ops that requested it, while the other ops simply
ignores them.

x86_64 implements the full regs saving; i386 just passes NULL for
regs, since an arch must supply both the regs and ftrace_ops
parameters even if regs is NULL.

It is OK for an arch to pass NULL regs. All function trace users that
require regs must set the FTRACE_OPS_FL_SAVE_REGS flag when
registering the ftrace_ops. If the arch does not support saving regs,
the ftrace_ops will fail to register. The
FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED flag may be set instead to prevent
the ftrace_ops from failing to register. In that case, the handler may
either check whether regs is non-NULL or check ARCH_SUPPORTS_FTRACE_SAVE_REGS.
If the arch supports passing regs, it will define this macro and pass
regs for ops that request them. All other archs will just pass NULL.

Cc: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/include/asm/ftrace.h |   47 ++++++++++--------
 arch/x86/kernel/entry_32.S    |    4 +-
 arch/x86/kernel/entry_64.S    |   94 +++++++++++++++++++++++++++++++++---
 arch/x86/kernel/ftrace.c      |   77 +++++++++++++++++++++++++++--
 include/linux/ftrace.h        |  107 ++++++++++++++++++++++++++++++++++++++---
 kernel/trace/ftrace.c         |   91 ++++++++++++++++++++++++++++++++---
 6 files changed, 373 insertions(+), 47 deletions(-)

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index b3bb1f3..a847501 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -3,27 +3,33 @@
 
 #ifdef __ASSEMBLY__
 
-	.macro MCOUNT_SAVE_FRAME
-	/* taken from glibc */
-	subq $0x38, %rsp
-	movq %rax, (%rsp)
-	movq %rcx, 8(%rsp)
-	movq %rdx, 16(%rsp)
-	movq %rsi, 24(%rsp)
-	movq %rdi, 32(%rsp)
-	movq %r8, 40(%rsp)
-	movq %r9, 48(%rsp)
+	/* skip is set if the stack was already partially adjusted */
+	.macro MCOUNT_SAVE_FRAME skip=0
+	 /*
+	  * We add enough stack to save all regs.
+	  */
+	subq $(SS+8-\skip), %rsp
+	movq %rax, RAX(%rsp)
+	movq %rcx, RCX(%rsp)
+	movq %rdx, RDX(%rsp)
+	movq %rsi, RSI(%rsp)
+	movq %rdi, RDI(%rsp)
+	movq %r8, R8(%rsp)
+	movq %r9, R9(%rsp)
+	 /* Move RIP to its proper location */
+	movq SS+8(%rsp), %rdx
+	movq %rdx, RIP(%rsp)
 	.endm
 
-	.macro MCOUNT_RESTORE_FRAME
-	movq 48(%rsp), %r9
-	movq 40(%rsp), %r8
-	movq 32(%rsp), %rdi
-	movq 24(%rsp), %rsi
-	movq 16(%rsp), %rdx
-	movq 8(%rsp), %rcx
-	movq (%rsp), %rax
-	addq $0x38, %rsp
+	.macro MCOUNT_RESTORE_FRAME skip=0
+	movq R9(%rsp), %r9
+	movq R8(%rsp), %r8
+	movq RDI(%rsp), %rdi
+	movq RSI(%rsp), %rsi
+	movq RDX(%rsp), %rdx
+	movq RCX(%rsp), %rcx
+	movq RAX(%rsp), %rax
+	addq $(SS+8-\skip), %rsp
 	.endm
 
 #endif
@@ -34,6 +40,9 @@
 
 #ifdef CONFIG_DYNAMIC_FTRACE
 #define ARCH_SUPPORTS_FTRACE_OPS 1
+#ifdef CONFIG_X86_64
+#define ARCH_SUPPORTS_FTRACE_SAVE_REGS
+#endif
 #endif
 
 #ifndef __ASSEMBLY__
diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index e3e17a0..5da11d1 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -1109,7 +1109,8 @@ ENTRY(ftrace_caller)
 	pushl %eax
 	pushl %ecx
 	pushl %edx
-	movl 0xc(%esp), %eax
+	pushl $0	/* Pass NULL as regs pointer */
+	movl 4*4(%esp), %eax
 	movl 0x4(%ebp), %edx
 	leal function_trace_op, %ecx
 	subl $MCOUNT_INSN_SIZE, %eax
@@ -1118,6 +1119,7 @@ ENTRY(ftrace_caller)
 ftrace_call:
 	call ftrace_stub
 
+	addl $4,%esp	/* skip NULL pointer */
 	popl %edx
 	popl %ecx
 	popl %eax
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 2b4f94c..52bda2e 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -73,21 +73,34 @@ ENTRY(mcount)
 	retq
 END(mcount)
 
+/* skip is set if stack has been adjusted */
+.macro ftrace_caller_setup skip=0
+	MCOUNT_SAVE_FRAME \skip
+
+	/* Load the ftrace_ops into the 3rd parameter */
+	leaq function_trace_op, %rdx
+
+	/* Load ip into the first parameter */
+	movq RIP(%rsp), %rdi
+	subq $MCOUNT_INSN_SIZE, %rdi
+	/* Load the parent_ip into the second parameter */
+	movq 8(%rbp), %rsi
+.endm
+
 ENTRY(ftrace_caller)
+	/* Check if tracing was disabled (quick check) */
 	cmpl $0, function_trace_stop
 	jne  ftrace_stub
 
-	MCOUNT_SAVE_FRAME
-
-	leaq function_trace_op, %rdx
-	movq 0x38(%rsp), %rdi
-	movq 8(%rbp), %rsi
-	subq $MCOUNT_INSN_SIZE, %rdi
+	ftrace_caller_setup
+	/* regs go into 4th parameter (but make it NULL) */
+	movq $0, %rcx
 
 GLOBAL(ftrace_call)
 	call ftrace_stub
 
 	MCOUNT_RESTORE_FRAME
+ftrace_return:
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 GLOBAL(ftrace_graph_call)
@@ -98,6 +111,71 @@ GLOBAL(ftrace_stub)
 	retq
 END(ftrace_caller)
 
+ENTRY(ftrace_regs_caller)
+	/* Save the current flags before compare (in SS location)*/
+	pushfq
+
+	/* Check if tracing was disabled (quick check) */
+	cmpl $0, function_trace_stop
+	jne  ftrace_restore_flags
+
+	/* skip=8 to skip flags saved in SS */
+	ftrace_caller_setup 8
+
+	/* Save the rest of pt_regs */
+	movq %r15, R15(%rsp)
+	movq %r14, R14(%rsp)
+	movq %r13, R13(%rsp)
+	movq %r12, R12(%rsp)
+	movq %r11, R11(%rsp)
+	movq %r10, R10(%rsp)
+	movq %rbp, RBP(%rsp)
+	movq %rbx, RBX(%rsp)
+	/* Copy saved flags */
+	movq SS(%rsp), %rcx
+	movq %rcx, EFLAGS(%rsp)
+	/* Kernel segments */
+	movq $__KERNEL_DS, %rcx
+	movq %rcx, SS(%rsp)
+	movq $__KERNEL_CS, %rcx
+	movq %rcx, CS(%rsp)
+	/* Stack - skipping return address */
+	leaq SS+16(%rsp), %rcx
+	movq %rcx, RSP(%rsp)
+
+	/* regs go into 4th parameter */
+	leaq (%rsp), %rcx
+
+GLOBAL(ftrace_regs_call)
+	call ftrace_stub
+
+	/* Copy flags back to SS, to restore them */
+	movq EFLAGS(%rsp), %rax
+	movq %rax, SS(%rsp)
+
+	/* restore the rest of pt_regs */
+	movq R15(%rsp), %r15
+	movq R14(%rsp), %r14
+	movq R13(%rsp), %r13
+	movq R12(%rsp), %r12
+	movq R10(%rsp), %r10
+	movq RBP(%rsp), %rbp
+	movq RBX(%rsp), %rbx
+
+	/* skip=8 to skip flags saved in SS */
+	MCOUNT_RESTORE_FRAME 8
+
+	/* Restore flags */
+	popfq
+
+	jmp ftrace_return
+ftrace_restore_flags:
+	popfq
+	jmp  ftrace_stub
+
+END(ftrace_regs_caller)
+
+
 #else /* ! CONFIG_DYNAMIC_FTRACE */
 ENTRY(mcount)
 	cmpl $0, function_trace_stop
@@ -120,7 +198,7 @@ GLOBAL(ftrace_stub)
 trace:
 	MCOUNT_SAVE_FRAME
 
-	movq 0x38(%rsp), %rdi
+	movq RIP(%rsp), %rdi
 	movq 8(%rbp), %rsi
 	subq $MCOUNT_INSN_SIZE, %rdi
 
@@ -141,7 +219,7 @@ ENTRY(ftrace_graph_caller)
 	MCOUNT_SAVE_FRAME
 
 	leaq 8(%rbp), %rdi
-	movq 0x38(%rsp), %rsi
+	movq RIP(%rsp), %rsi
 	movq (%rbp), %rdx
 	subq $MCOUNT_INSN_SIZE, %rsi
 
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index c3a7cb4..b90eb1a 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -206,6 +206,23 @@ static int
 ftrace_modify_code(unsigned long ip, unsigned const char *old_code,
 		   unsigned const char *new_code);
 
+#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
+/*
+ * Should never be called:
+ *  As it is only called by __ftrace_replace_code() which is called by
+ *  ftrace_replace_code() that x86 overrides, and by ftrace_update_code()
+ *  which is called to turn mcount into nops or nops into function calls
+ *  but not to convert a function from not using regs to one that uses
+ *  regs, which ftrace_modify_call() is for.
+ */
+int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
+				 unsigned long addr)
+{
+	WARN_ON(1);
+	return -EINVAL;
+}
+#endif
+
 int ftrace_update_ftrace_func(ftrace_func_t func)
 {
 	unsigned long ip = (unsigned long)(&ftrace_call);
@@ -220,6 +237,16 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 
 	ret = ftrace_modify_code(ip, old, new);
 
+#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
+	/* Also update the regs callback function */
+	if (!ret) {
+		ip = (unsigned long)(&ftrace_regs_call);
+		memcpy(old, &ftrace_regs_call, MCOUNT_INSN_SIZE);
+		new = ftrace_call_replace(ip, (unsigned long)func);
+		ret = ftrace_modify_code(ip, old, new);
+	}
+#endif
+
 	atomic_dec(&modifying_ftrace_code);
 
 	return ret;
@@ -299,6 +326,32 @@ static int add_brk_on_nop(struct dyn_ftrace *rec)
 	return add_break(rec->ip, old);
 }
 
+/*
+ * If the record has the FTRACE_FL_REGS set, that means that it
+ * wants to convert to a callback that saves all regs. If FTRACE_FL_REGS
+ * is not set, then it wants to convert to the normal callback.
+ */
+static unsigned long get_ftrace_addr(struct dyn_ftrace *rec)
+{
+	if (rec->flags & FTRACE_FL_REGS)
+		return (unsigned long)FTRACE_REGS_ADDR;
+	else
+		return (unsigned long)FTRACE_ADDR;
+}
+
+/*
+ * The FTRACE_FL_REGS_EN is set when the record already points to
+ * a function that saves all the regs. Basically the '_EN' version
+ * represents the current state of the function.
+ */
+static unsigned long get_ftrace_old_addr(struct dyn_ftrace *rec)
+{
+	if (rec->flags & FTRACE_FL_REGS_EN)
+		return (unsigned long)FTRACE_REGS_ADDR;
+	else
+		return (unsigned long)FTRACE_ADDR;
+}
+
 static int add_breakpoints(struct dyn_ftrace *rec, int enable)
 {
 	unsigned long ftrace_addr;
@@ -306,7 +359,7 @@ static int add_breakpoints(struct dyn_ftrace *rec, int enable)
 
 	ret = ftrace_test_record(rec, enable);
 
-	ftrace_addr = (unsigned long)FTRACE_ADDR;
+	ftrace_addr = get_ftrace_addr(rec);
 
 	switch (ret) {
 	case FTRACE_UPDATE_IGNORE:
@@ -316,6 +369,10 @@ static int add_breakpoints(struct dyn_ftrace *rec, int enable)
 		/* converting nop to call */
 		return add_brk_on_nop(rec);
 
+	case FTRACE_UPDATE_MODIFY_CALL_REGS:
+	case FTRACE_UPDATE_MODIFY_CALL:
+		ftrace_addr = get_ftrace_old_addr(rec);
+		/* fall through */
 	case FTRACE_UPDATE_MAKE_NOP:
 		/* converting a call to a nop */
 		return add_brk_on_call(rec, ftrace_addr);
@@ -360,13 +417,21 @@ static int remove_breakpoint(struct dyn_ftrace *rec)
 		 * If not, don't touch the breakpoint, we make just create
 		 * a disaster.
 		 */
-		ftrace_addr = (unsigned long)FTRACE_ADDR;
+		ftrace_addr = get_ftrace_addr(rec);
+		nop = ftrace_call_replace(ip, ftrace_addr);
+
+		if (memcmp(&ins[1], &nop[1], MCOUNT_INSN_SIZE - 1) == 0)
+			goto update;
+
+		/* Check both ftrace_addr and ftrace_old_addr */
+		ftrace_addr = get_ftrace_old_addr(rec);
 		nop = ftrace_call_replace(ip, ftrace_addr);
 
 		if (memcmp(&ins[1], &nop[1], MCOUNT_INSN_SIZE - 1) != 0)
 			return -EINVAL;
 	}
 
+ update:
 	return probe_kernel_write((void *)ip, &nop[0], 1);
 }
 
@@ -405,12 +470,14 @@ static int add_update(struct dyn_ftrace *rec, int enable)
 
 	ret = ftrace_test_record(rec, enable);
 
-	ftrace_addr = (unsigned long)FTRACE_ADDR;
+	ftrace_addr  = get_ftrace_addr(rec);
 
 	switch (ret) {
 	case FTRACE_UPDATE_IGNORE:
 		return 0;
 
+	case FTRACE_UPDATE_MODIFY_CALL_REGS:
+	case FTRACE_UPDATE_MODIFY_CALL:
 	case FTRACE_UPDATE_MAKE_CALL:
 		/* converting nop to call */
 		return add_update_call(rec, ftrace_addr);
@@ -455,12 +522,14 @@ static int finish_update(struct dyn_ftrace *rec, int enable)
 
 	ret = ftrace_update_record(rec, enable);
 
-	ftrace_addr = (unsigned long)FTRACE_ADDR;
+	ftrace_addr = get_ftrace_addr(rec);
 
 	switch (ret) {
 	case FTRACE_UPDATE_IGNORE:
 		return 0;
 
+	case FTRACE_UPDATE_MODIFY_CALL_REGS:
+	case FTRACE_UPDATE_MODIFY_CALL:
 	case FTRACE_UPDATE_MAKE_CALL:
 		/* converting nop to call */
 		return finish_update_call(rec, ftrace_addr);
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index e420288..ab39990 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -71,12 +71,28 @@ typedef void (*ftrace_func_t)(unsigned long ip, unsigned long parent_ip,
  *           could be controled by following calls:
  *             ftrace_function_local_enable
  *             ftrace_function_local_disable
+ * SAVE_REGS - The ftrace_ops wants regs saved at each function called
+ *            and passed to the callback. If this flag is set, but the
+ *            architecture does not support passing regs
+ *            (ARCH_SUPPORTS_FTRACE_SAVE_REGS is not defined), then the
+ *            ftrace_ops will fail to register, unless the next flag
+ *            is set.
+ * SAVE_REGS_IF_SUPPORTED - This is the same as SAVE_REGS, but if the
+ *            handler can handle an arch that does not save regs
+ *            (the handler tests if regs == NULL), then it can set
+ *            this flag instead. It will not fail registering the ftrace_ops
+ *            but, the regs field will be NULL if the arch does not support
+ *            passing regs to the handler.
+ *            Note, if this flag is set, the SAVE_REGS flag will automatically
+ *            get set upon registering the ftrace_ops, if the arch supports it.
  */
 enum {
-	FTRACE_OPS_FL_ENABLED		= 1 << 0,
-	FTRACE_OPS_FL_GLOBAL		= 1 << 1,
-	FTRACE_OPS_FL_DYNAMIC		= 1 << 2,
-	FTRACE_OPS_FL_CONTROL		= 1 << 3,
+	FTRACE_OPS_FL_ENABLED			= 1 << 0,
+	FTRACE_OPS_FL_GLOBAL			= 1 << 1,
+	FTRACE_OPS_FL_DYNAMIC			= 1 << 2,
+	FTRACE_OPS_FL_CONTROL			= 1 << 3,
+	FTRACE_OPS_FL_SAVE_REGS			= 1 << 4,
+	FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED	= 1 << 5,
 };
 
 struct ftrace_ops {
@@ -254,12 +270,31 @@ extern void unregister_ftrace_function_probe_all(char *glob);
 
 extern int ftrace_text_reserved(void *start, void *end);
 
+/*
+ * The dyn_ftrace record's flags field is split into two parts.
+ * the first part which is '0-FTRACE_REF_MAX' is a counter of
+ * the number of callbacks that have registered the function that
+ * the dyn_ftrace descriptor represents.
+ *
+ * The second part is a mask:
+ *  ENABLED - the function is being traced
+ *  REGS    - the record wants the function to save regs
+ *  REGS_EN - the function is set up to save regs.
+ *
+ * When a new ftrace_ops is registered and wants a function to save
+ * pt_regs, the rec->flag REGS is set. When the function has been
+ * set up to save regs, the REGS_EN flag is set. Once a function
+ * starts saving regs it will do so until all ftrace_ops are removed
+ * from tracing that function.
+ */
 enum {
-	FTRACE_FL_ENABLED	= (1 << 30),
+	FTRACE_FL_ENABLED	= (1UL << 29),
+	FTRACE_FL_REGS		= (1UL << 30),
+	FTRACE_FL_REGS_EN	= (1UL << 31)
 };
 
-#define FTRACE_FL_MASK		(0x3UL << 30)
-#define FTRACE_REF_MAX		((1 << 30) - 1)
+#define FTRACE_FL_MASK		(0x7UL << 29)
+#define FTRACE_REF_MAX		((1UL << 29) - 1)
 
 struct dyn_ftrace {
 	union {
@@ -290,9 +325,23 @@ enum {
 	FTRACE_STOP_FUNC_RET		= (1 << 4),
 };
 
+/*
+ * The FTRACE_UPDATE_* enum is used to pass information back
+ * from the ftrace_update_record() and ftrace_test_record()
+ * functions. These are called by the code update routines
+ * to find out what is to be done for a given function.
+ *
+ *  IGNORE           - The function is already what we want it to be
+ *  MAKE_CALL        - Start tracing the function
+ *  MODIFY_CALL      - Stop saving regs for the function
+ *  MODIFY_CALL_REGS - Start saving regs for the function
+ *  MAKE_NOP         - Stop tracing the function
+ */
 enum {
 	FTRACE_UPDATE_IGNORE,
 	FTRACE_UPDATE_MAKE_CALL,
+	FTRACE_UPDATE_MODIFY_CALL,
+	FTRACE_UPDATE_MODIFY_CALL_REGS,
 	FTRACE_UPDATE_MAKE_NOP,
 };
 
@@ -344,7 +393,9 @@ extern int ftrace_dyn_arch_init(void *data);
 extern void ftrace_replace_code(int enable);
 extern int ftrace_update_ftrace_func(ftrace_func_t func);
 extern void ftrace_caller(void);
+extern void ftrace_regs_caller(void);
 extern void ftrace_call(void);
+extern void ftrace_regs_call(void);
 extern void mcount_call(void);
 
 void ftrace_modify_all_code(int command);
@@ -352,6 +403,15 @@ void ftrace_modify_all_code(int command);
 #ifndef FTRACE_ADDR
 #define FTRACE_ADDR ((unsigned long)ftrace_caller)
 #endif
+
+#ifndef FTRACE_REGS_ADDR
+#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
+# define FTRACE_REGS_ADDR ((unsigned long)ftrace_regs_caller)
+#else
+# define FTRACE_REGS_ADDR FTRACE_ADDR
+#endif
+#endif
+
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 extern void ftrace_graph_caller(void);
 extern int ftrace_enable_ftrace_graph_caller(void);
@@ -407,6 +467,39 @@ extern int ftrace_make_nop(struct module *mod,
  */
 extern int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr);
 
+#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
+/**
+ * ftrace_modify_call - convert from one addr to another (no nop)
+ * @rec: the mcount call site record
+ * @old_addr: the address expected to be currently called to
+ * @addr: the address to change to
+ *
+ * This is a very sensitive operation and great care needs
+ * to be taken by the arch.  The operation should carefully
+ * read the location, check to see if what is read is indeed
+ * what we expect it to be, and then on success of the compare,
+ * it should write to the location.
+ *
+ * The code segment at @rec->ip should be a caller to @old_addr
+ *
+ * Return must be:
+ *  0 on success
+ *  -EFAULT on error reading the location
+ *  -EINVAL on a failed compare of the contents
+ *  -EPERM  on error writing to the location
+ * Any other value will be considered a failure.
+ */
+extern int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
+			      unsigned long addr);
+#else
+/* Should never be called */
+static inline int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
+				     unsigned long addr)
+{
+	return -EINVAL;
+}
+#endif
+
 /* May be defined in arch */
 extern int ftrace_arch_read_dyn_info(char *buf, int size);
 
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 6ff07ad..c55f7e2 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -314,6 +314,20 @@ static int __register_ftrace_function(struct ftrace_ops *ops)
 	if ((ops->flags & FL_GLOBAL_CONTROL_MASK) == FL_GLOBAL_CONTROL_MASK)
 		return -EINVAL;
 
+#ifndef ARCH_SUPPORTS_FTRACE_SAVE_REGS
+	/*
+	 * If the ftrace_ops specifies SAVE_REGS, then it only can be used
+	 * if the arch supports it, or SAVE_REGS_IF_SUPPORTED is also set.
+	 * Setting SAVE_REGS_IF_SUPPORTED makes SAVE_REGS irrelevant.
+	 */
+	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS &&
+	    !(ops->flags & FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED))
+		return -EINVAL;
+
+	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED)
+		ops->flags |= FTRACE_OPS_FL_SAVE_REGS;
+#endif
+
 	if (!core_kernel_data((unsigned long)ops))
 		ops->flags |= FTRACE_OPS_FL_DYNAMIC;
 
@@ -1515,6 +1529,12 @@ static void __ftrace_hash_rec_update(struct ftrace_ops *ops,
 			rec->flags++;
 			if (FTRACE_WARN_ON((rec->flags & ~FTRACE_FL_MASK) == FTRACE_REF_MAX))
 				return;
+			/*
+			 * If any ops wants regs saved for this function
+			 * then all ops will get saved regs.
+			 */
+			if (ops->flags & FTRACE_OPS_FL_SAVE_REGS)
+				rec->flags |= FTRACE_FL_REGS;
 		} else {
 			if (FTRACE_WARN_ON((rec->flags & ~FTRACE_FL_MASK) == 0))
 				return;
@@ -1606,18 +1626,59 @@ static int ftrace_check_record(struct dyn_ftrace *rec, int enable, int update)
 	if (enable && (rec->flags & ~FTRACE_FL_MASK))
 		flag = FTRACE_FL_ENABLED;
 
+	/*
+	 * If enabling and the REGS flag does not match the REGS_EN, then
+	 * do not ignore this record. Set flags to fail the compare against
+	 * ENABLED.
+	 */
+	if (flag &&
+	    (!(rec->flags & FTRACE_FL_REGS) != !(rec->flags & FTRACE_FL_REGS_EN)))
+		flag |= FTRACE_FL_REGS;
+
 	/* If the state of this record hasn't changed, then do nothing */
 	if ((rec->flags & FTRACE_FL_ENABLED) == flag)
 		return FTRACE_UPDATE_IGNORE;
 
 	if (flag) {
-		if (update)
+		/* Save off if rec is being enabled (for return value) */
+		flag ^= rec->flags & FTRACE_FL_ENABLED;
+
+		if (update) {
 			rec->flags |= FTRACE_FL_ENABLED;
-		return FTRACE_UPDATE_MAKE_CALL;
+			if (flag & FTRACE_FL_REGS) {
+				if (rec->flags & FTRACE_FL_REGS)
+					rec->flags |= FTRACE_FL_REGS_EN;
+				else
+					rec->flags &= ~FTRACE_FL_REGS_EN;
+			}
+		}
+
+		/*
+		 * If this record is being updated from a nop, then
+		 *   return UPDATE_MAKE_CALL.
+		 * Otherwise, if the EN flag is set, then return
+		 *   UPDATE_MODIFY_CALL_REGS to tell the caller to convert
+		 *   from the non-save regs, to a save regs function.
+		 * Otherwise,
+		 *   return UPDATE_MODIFY_CALL to tell the caller to convert
+		 *   from the save regs, to a non-save regs function.
+		 */
+		if (flag & FTRACE_FL_ENABLED)
+			return FTRACE_UPDATE_MAKE_CALL;
+		else if (rec->flags & FTRACE_FL_REGS_EN)
+			return FTRACE_UPDATE_MODIFY_CALL_REGS;
+		else
+			return FTRACE_UPDATE_MODIFY_CALL;
 	}
 
-	if (update)
-		rec->flags &= ~FTRACE_FL_ENABLED;
+	if (update) {
+		/* If there's no more users, clear all flags */
+		if (!(rec->flags & ~FTRACE_FL_MASK))
+			rec->flags = 0;
+		else
+			/* Just disable the record (keep REGS state) */
+			rec->flags &= ~FTRACE_FL_ENABLED;
+	}
 
 	return FTRACE_UPDATE_MAKE_NOP;
 }
@@ -1652,13 +1713,17 @@ int ftrace_test_record(struct dyn_ftrace *rec, int enable)
 static int
 __ftrace_replace_code(struct dyn_ftrace *rec, int enable)
 {
+	unsigned long ftrace_old_addr;
 	unsigned long ftrace_addr;
 	int ret;
 
-	ftrace_addr = (unsigned long)FTRACE_ADDR;
-
 	ret = ftrace_update_record(rec, enable);
 
+	if (rec->flags & FTRACE_FL_REGS)
+		ftrace_addr = (unsigned long)FTRACE_REGS_ADDR;
+	else
+		ftrace_addr = (unsigned long)FTRACE_ADDR;
+
 	switch (ret) {
 	case FTRACE_UPDATE_IGNORE:
 		return 0;
@@ -1668,6 +1733,15 @@ __ftrace_replace_code(struct dyn_ftrace *rec, int enable)
 
 	case FTRACE_UPDATE_MAKE_NOP:
 		return ftrace_make_nop(NULL, rec, ftrace_addr);
+
+	case FTRACE_UPDATE_MODIFY_CALL_REGS:
+	case FTRACE_UPDATE_MODIFY_CALL:
+		if (rec->flags & FTRACE_FL_REGS)
+			ftrace_old_addr = (unsigned long)FTRACE_ADDR;
+		else
+			ftrace_old_addr = (unsigned long)FTRACE_REGS_ADDR;
+
+		return ftrace_modify_call(rec, ftrace_old_addr, ftrace_addr);
 	}
 
 	return -1; /* unknow ftrace bug */
@@ -2421,8 +2495,9 @@ static int t_show(struct seq_file *m, void *v)
 
 	seq_printf(m, "%ps", (void *)rec->ip);
 	if (iter->flags & FTRACE_ITER_ENABLED)
-		seq_printf(m, " (%ld)",
-			   rec->flags & ~FTRACE_FL_MASK);
+		seq_printf(m, " (%ld)%s",
+			   rec->flags & ~FTRACE_FL_MASK,
+			   rec->flags & FTRACE_FL_REGS ? " R" : "");
 	seq_printf(m, "\n");
 
 	return 0;
-- 
1.7.10




* [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-11 19:50 [RFC][PATCH 0/4 v4] ftrace/kprobes: Setting up ftrace for kprobes Steven Rostedt
  2012-07-11 19:50 ` [RFC][PATCH 1/4 v4] ftrace/x86: Add separate function to save regs Steven Rostedt
@ 2012-07-11 19:50 ` Steven Rostedt
  2012-07-12 12:39   ` Masami Hiramatsu
  2012-07-11 19:50 ` [RFC][PATCH 3/4 v4] ftrace/x86: Remove function_trace_stop check from graph caller Steven Rostedt
  2012-07-11 19:50 ` [RFC][PATCH 4/4 v4] ftrace/x86_64: Add recursion protection inside mcount caller Steven Rostedt
  3 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2012-07-11 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin

[-- Attachment #1: 0002-ftrace-x86-Add-save_regs-for-i386-function-calls.patch --]
[-- Type: text/plain, Size: 3880 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Add saving of the full regs for function tracing on i386.
The regs-saving code was influenced by patches sent out by
Masami Hiramatsu.

Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/include/asm/ftrace.h |    2 --
 arch/x86/kernel/entry_32.S    |   58 +++++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/ftrace.c      |    4 ---
 3 files changed, 58 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index a847501..a6cae0c 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -40,10 +40,8 @@
 
 #ifdef CONFIG_DYNAMIC_FTRACE
 #define ARCH_SUPPORTS_FTRACE_OPS 1
-#ifdef CONFIG_X86_64
 #define ARCH_SUPPORTS_FTRACE_SAVE_REGS
 #endif
-#endif
 
 #ifndef __ASSEMBLY__
 extern void mcount(void);
diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 5da11d1..8ef138f 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -1123,6 +1123,7 @@ ftrace_call:
 	popl %edx
 	popl %ecx
 	popl %eax
+ftrace_ret:
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 .globl ftrace_graph_call
 ftrace_graph_call:
@@ -1134,6 +1135,63 @@ ftrace_stub:
 	ret
 END(ftrace_caller)
 
+ENTRY(ftrace_regs_caller)
+	pushf	/* push flags before compare (in ss location) */
+	cmpl $0, function_trace_stop
+	jne ftrace_restore_flags
+
+	pushl %esp	/* Save stack in sp location */
+	subl $4, (%esp) /* Adjust saved stack to skip saved flags */
+	pushl 4(%esp)	/* Save flags in correct position */
+	movl $__KERNEL_DS, 8(%esp)	/* Save ss */
+	pushl $__KERNEL_CS
+	pushl 4*4(%esp)	/* Save the ip */
+	subl $MCOUNT_INSN_SIZE, (%esp)	/* Adjust ip */
+	pushl $0	/* Load 0 into orig_ax */
+
+	pushl %gs
+	pushl %fs
+	pushl %es
+	pushl %ds
+	pushl %eax
+	pushl %ebp
+	pushl %edi
+	pushl %esi
+	pushl %edx
+	pushl %ecx
+	pushl %ebx
+
+	movl 12*4(%esp), %eax	/* Load ip (1st parameter) */
+	movl 0x4(%ebp), %edx	/* Load parent ip (2nd parameter) */
+	lea  (%esp), %ecx
+	pushl %ecx		/* Save pt_regs as 4th parameter */
+	leal function_trace_op, %ecx /* Save ftrace_ops in 3rd parameter */
+
+GLOBAL(ftrace_regs_call)
+	call ftrace_stub
+
+	addl $4,%esp		/* Skip pt_regs */
+	movl 14*4(%esp), %eax	/* Move flags back into ss */
+	movl %eax, 16*4(%esp)	/* Needed to keep addl from modifying flags */
+
+	popl %ebx
+	popl %ecx
+	popl %edx
+	popl %esi
+	popl %edi
+	popl %ebp
+	popl %eax
+	popl %ds
+	popl %es
+	popl %fs
+	popl %gs
+	addl $(5*4), %esp	/* Skip orig_ax, ip, cs, flags and sp */
+	popf			/* Pop flags at end (no addl to corrupt flags) */
+	jmp ftrace_ret
+
+ftrace_restore_flags:
+	popf
+	jmp  ftrace_stub
 #else /* ! CONFIG_DYNAMIC_FTRACE */
 
 ENTRY(mcount)
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index b90eb1a..1d41402 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -206,7 +206,6 @@ static int
 ftrace_modify_code(unsigned long ip, unsigned const char *old_code,
 		   unsigned const char *new_code);
 
-#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
 /*
  * Should never be called:
  *  As it is only called by __ftrace_replace_code() which is called by
@@ -221,7 +220,6 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
 	WARN_ON(1);
 	return -EINVAL;
 }
-#endif
 
 int ftrace_update_ftrace_func(ftrace_func_t func)
 {
@@ -237,7 +235,6 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 
 	ret = ftrace_modify_code(ip, old, new);
 
-#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
 	/* Also update the regs callback function */
 	if (!ret) {
 		ip = (unsigned long)(&ftrace_regs_call);
@@ -245,7 +242,6 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 		new = ftrace_call_replace(ip, (unsigned long)func);
 		ret = ftrace_modify_code(ip, old, new);
 	}
-#endif
 
 	atomic_dec(&modifying_ftrace_code);
 
-- 
1.7.10




* [RFC][PATCH 3/4 v4] ftrace/x86: Remove function_trace_stop check from graph caller
  2012-07-11 19:50 [RFC][PATCH 0/4 v4] ftrace/kprobes: Setting up ftrace for kprobes Steven Rostedt
  2012-07-11 19:50 ` [RFC][PATCH 1/4 v4] ftrace/x86: Add separate function to save regs Steven Rostedt
  2012-07-11 19:50 ` [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls Steven Rostedt
@ 2012-07-11 19:50 ` Steven Rostedt
  2012-08-21 15:04   ` [tip:perf/core] " tip-bot for Steven Rostedt
  2012-07-11 19:50 ` [RFC][PATCH 4/4 v4] ftrace/x86_64: Add recursion protection inside mcount caller Steven Rostedt
  3 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2012-07-11 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin

[-- Attachment #1: 0003-ftrace-x86-Remove-function_trace_stop-check-from-gra.patch --]
[-- Type: text/plain, Size: 1090 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

The graph caller is called by the mcount callers, which already do
the check against the function_trace_stop variable. There is no
reason to check it again.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/entry_32.S |    3 ---
 arch/x86/kernel/entry_64.S |    3 ---
 2 files changed, 6 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 8ef138f..9019806 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -1232,9 +1232,6 @@ END(mcount)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 ENTRY(ftrace_graph_caller)
-	cmpl $0, function_trace_stop
-	jne ftrace_stub
-
 	pushl %eax
 	pushl %ecx
 	pushl %edx
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 52bda2e..38308fa 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -213,9 +213,6 @@ END(mcount)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 ENTRY(ftrace_graph_caller)
-	cmpl $0, function_trace_stop
-	jne ftrace_stub
-
 	MCOUNT_SAVE_FRAME
 
 	leaq 8(%rbp), %rdi
-- 
1.7.10




* [RFC][PATCH 4/4 v4] ftrace/x86_64: Add recursion protection inside mcount caller
  2012-07-11 19:50 [RFC][PATCH 0/4 v4] ftrace/kprobes: Setting up ftrace for kprobes Steven Rostedt
                   ` (2 preceding siblings ...)
  2012-07-11 19:50 ` [RFC][PATCH 3/4 v4] ftrace/x86: Remove function_trace_stop check from graph caller Steven Rostedt
@ 2012-07-11 19:50 ` Steven Rostedt
  3 siblings, 0 replies; 23+ messages in thread
From: Steven Rostedt @ 2012-07-11 19:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Frederic Weisbecker,
	Masami Hiramatsu, H. Peter Anvin

[-- Attachment #1: 0004-ftrace-x86_64-Add-recursion-protection-inside-mcount.patch --]
[-- Type: text/plain, Size: 7589 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

As more users of function tracing appear (perf, kprobes, etc.), there is
a greater chance that a user does not sufficiently protect against
recursion. As it is simple to protect against recursion either at the
mcount call site or in the "list function", an arch can now add protection
inside the mcount trampoline and define ARCH_SUPPORTS_FTRACE_RECURSION.

If ARCH_SUPPORTS_FTRACE_RECURSION is not defined by the arch (inside
asm/ftrace.h), FTRACE_FORCE_LIST_FUNC is set to 1, which always enables
the list function (the function used to call multiple function trace
callbacks when more than one is registered), even if only one callback
is registered. This is because the list function provides all the
required features (like recursion protection) for function tracing. If
the arch does not support all features, the list function is used to
make sure they are supported.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/include/asm/ftrace.h |    2 ++
 arch/x86/kernel/entry_32.S    |   28 ++++++++++++++++++++++++++--
 arch/x86/kernel/entry_64.S    |   30 +++++++++++++++++++++++++++---
 arch/x86/kernel/ftrace.c      |    7 +++++--
 include/linux/ftrace.h        |    2 +-
 5 files changed, 61 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index a6cae0c..fdaee7c 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -43,6 +43,8 @@
 #define ARCH_SUPPORTS_FTRACE_SAVE_REGS
 #endif
 
+#define ARCH_SUPPORTS_FTRACE_RECURSION
+
 #ifndef __ASSEMBLY__
 extern void mcount(void);
 extern atomic_t modifying_ftrace_code;
diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 9019806..f61a15c 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -1106,6 +1106,12 @@ ENTRY(ftrace_caller)
 	cmpl $0, function_trace_stop
 	jne  ftrace_stub
 
+	/* Check for recursion */
+	cmpl $0, PER_CPU_VAR(ftrace_recursion)
+	jne  ftrace_stub
+
+	movl $1, PER_CPU_VAR(ftrace_recursion)
+
 	pushl %eax
 	pushl %ecx
 	pushl %edx
@@ -1127,8 +1133,11 @@ ftrace_ret:
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 .globl ftrace_graph_call
 ftrace_graph_call:
-	jmp ftrace_stub
+	jmp ftrace_graph_finish
+.globl ftrace_graph_finish
+ftrace_graph_finish:
 #endif
+	movl $0, PER_CPU_VAR(ftrace_recursion)
 
 .globl ftrace_stub
 ftrace_stub:
@@ -1140,6 +1149,12 @@ ENTRY(ftrace_regs_caller)
 	cmpl $0, function_trace_stop
 	jne ftrace_restore_flags
 
+	/* Check for recursion */
+	cmpl $0, PER_CPU_VAR(ftrace_recursion)
+	jne  ftrace_restore_flags
+
+	movl $1, PER_CPU_VAR(ftrace_recursion)
+
 	pushl %esp	/* Save stack in sp location */
 	subl $4, (%esp) /* Adjust saved stack to skip saved flags */
 	pushl 4(%esp)	/* Save flags in correct position */
@@ -1198,6 +1213,12 @@ ENTRY(mcount)
 	cmpl $0, function_trace_stop
 	jne  ftrace_stub
 
+	/* Check for recursion */
+	cmpl $0, PER_CPU_VAR(ftrace_recursion)
+	jne  ftrace_stub
+
+	movl $1, PER_CPU_VAR(ftrace_recursion)
+
 	cmpl $ftrace_stub, ftrace_trace_function
 	jnz trace
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
@@ -1206,7 +1227,10 @@ ENTRY(mcount)
 
 	cmpl $ftrace_graph_entry_stub, ftrace_graph_entry
 	jnz ftrace_graph_caller
+ftrace_graph_finish:
 #endif
+	movl $0, PER_CPU_VAR(ftrace_recursion)
+
 .globl ftrace_stub
 ftrace_stub:
 	ret
@@ -1243,7 +1267,7 @@ ENTRY(ftrace_graph_caller)
 	popl %edx
 	popl %ecx
 	popl %eax
-	ret
+	jmp ftrace_graph_finish
 END(ftrace_graph_caller)
 
 .globl return_to_handler
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 38308fa..7b6d14b 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -77,6 +77,8 @@ END(mcount)
 .macro ftrace_caller_setup skip=0
 	MCOUNT_SAVE_FRAME \skip
 
+	movl $1, PER_CPU_VAR(ftrace_recursion)
+
 	/* Load the ftrace_ops into the 3rd parameter */
 	leaq function_trace_op, %rdx
 
@@ -92,6 +94,10 @@ ENTRY(ftrace_caller)
 	cmpl $0, function_trace_stop
 	jne  ftrace_stub
 
+	/* Check for recursion */
+	cmpl $0, PER_CPU_VAR(ftrace_recursion)
+	jne  ftrace_stub
+
 	ftrace_caller_setup
 	/* regs go into 4th parameter (but make it NULL) */
 	movq $0, %rcx
@@ -104,9 +110,12 @@ ftrace_return:
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 GLOBAL(ftrace_graph_call)
-	jmp ftrace_stub
+	jmp ftrace_graph_finish
+GLOBAL(ftrace_graph_finish)
 #endif
 
+	movl $0, PER_CPU_VAR(ftrace_recursion)
+
 GLOBAL(ftrace_stub)
 	retq
 END(ftrace_caller)
@@ -119,6 +128,10 @@ ENTRY(ftrace_regs_caller)
 	cmpl $0, function_trace_stop
 	jne  ftrace_restore_flags
 
+	/* Check for recursion */
+	cmpl $0, PER_CPU_VAR(ftrace_recursion)
+	jne  ftrace_restore_flags
+
 	/* skip=8 to skip flags saved in SS */
 	ftrace_caller_setup 8
 
@@ -181,17 +194,28 @@ ENTRY(mcount)
 	cmpl $0, function_trace_stop
 	jne  ftrace_stub
 
+	/* Check for recursion */
+	cmpl $0, PER_CPU_VAR(ftrace_recursion)
+	jne  ftrace_stub
+
+	movl $1, PER_CPU_VAR(ftrace_recursion)
+
 	cmpq $ftrace_stub, ftrace_trace_function
 	jnz trace
 
+ftrace_return:
+
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	cmpq $ftrace_stub, ftrace_graph_return
 	jnz ftrace_graph_caller
 
 	cmpq $ftrace_graph_entry_stub, ftrace_graph_entry
 	jnz ftrace_graph_caller
+ftrace_graph_finish:
 #endif
 
+	movl $0, PER_CPU_VAR(ftrace_recursion)
+
 GLOBAL(ftrace_stub)
 	retq
 
@@ -206,7 +230,7 @@ trace:
 
 	MCOUNT_RESTORE_FRAME
 
-	jmp ftrace_stub
+	jmp ftrace_return
 END(mcount)
 #endif /* CONFIG_DYNAMIC_FTRACE */
 #endif /* CONFIG_FUNCTION_TRACER */
@@ -224,7 +248,7 @@ ENTRY(ftrace_graph_caller)
 
 	MCOUNT_RESTORE_FRAME
 
-	retq
+	jmp ftrace_graph_finish
 END(ftrace_graph_caller)
 
 GLOBAL(return_to_handler)
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 1d41402..28807ff 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -28,6 +28,8 @@
 #include <asm/ftrace.h>
 #include <asm/nops.h>
 
+DEFINE_PER_CPU(int, ftrace_recursion);
+
 #ifdef CONFIG_DYNAMIC_FTRACE
 
 int ftrace_arch_code_modify_prepare(void)
@@ -664,6 +666,7 @@ int __init ftrace_dyn_arch_init(void *data)
 
 #ifdef CONFIG_DYNAMIC_FTRACE
 extern void ftrace_graph_call(void);
+extern void ftrace_graph_finish(void);
 
 static int ftrace_mod_jmp(unsigned long ip,
 			  int old_offset, int new_offset)
@@ -689,7 +692,7 @@ int ftrace_enable_ftrace_graph_caller(void)
 	unsigned long ip = (unsigned long)(&ftrace_graph_call);
 	int old_offset, new_offset;
 
-	old_offset = (unsigned long)(&ftrace_stub) - (ip + MCOUNT_INSN_SIZE);
+	old_offset = (unsigned long)(&ftrace_graph_finish) - (ip + MCOUNT_INSN_SIZE);
 	new_offset = (unsigned long)(&ftrace_graph_caller) - (ip + MCOUNT_INSN_SIZE);
 
 	return ftrace_mod_jmp(ip, old_offset, new_offset);
@@ -701,7 +704,7 @@ int ftrace_disable_ftrace_graph_caller(void)
 	int old_offset, new_offset;
 
 	old_offset = (unsigned long)(&ftrace_graph_caller) - (ip + MCOUNT_INSN_SIZE);
-	new_offset = (unsigned long)(&ftrace_stub) - (ip + MCOUNT_INSN_SIZE);
+	new_offset = (unsigned long)(&ftrace_graph_finish) - (ip + MCOUNT_INSN_SIZE);
 
 	return ftrace_mod_jmp(ip, old_offset, new_offset);
 }
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index ab39990..087bdd5 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -34,7 +34,7 @@
  * does. Or at least does enough to prevent any unwelcomed side effects.
  */
 #if !defined(CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST) || \
-	!ARCH_SUPPORTS_FTRACE_OPS
+	!ARCH_SUPPORTS_FTRACE_OPS || !defined(ARCH_SUPPORTS_FTRACE_RECURSION)
 # define FTRACE_FORCE_LIST_FUNC 1
 #else
 # define FTRACE_FORCE_LIST_FUNC 0
-- 
1.7.10




* Re: [RFC][PATCH 1/4 v4] ftrace/x86: Add separate function to save regs
  2012-07-11 19:50 ` [RFC][PATCH 1/4 v4] ftrace/x86: Add separate function to save regs Steven Rostedt
@ 2012-07-12 12:12   ` Masami Hiramatsu
  0 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2012-07-12 12:12 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, Alexander van Heukelum,
	yrl.pp-manager.tt

(2012/07/12 4:50), Steven Rostedt wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> Add a way to have different functions calling different trampolines.
> If a ftrace_ops wants regs saved on the return, then have only the
> functions with ops registered to save regs. Functions registered by
> other ops would not be affected, unless the functions overlap.
> 
> If one ftrace_ops registered functions A, B and C and another ops
> registered functions to save regs on A and D, then only functions
> A and D would be saving regs. Function B and C would work as normal.
> Although A is registered by both ops: normal and saves regs; this is fine
> as saving the regs is needed to satisfy one of the ops that calls it
> but the regs are ignored by the other ops function.
> 
> x86_64 implements the full regs saving, and i386 just passes a NULL
> for regs to satisfy the ftrace_ops passing. Where an arch must supply
> both regs and ftrace_ops parameters, even if regs is just NULL.
> 
> It is OK for an arch to pass NULL regs. All function trace users that
> require regs passing must add the flag FTRACE_OPS_FL_SAVE_REGS when
> registering the ftrace_ops. If the arch does not support saving regs
> then the ftrace_ops will fail to register. The flag
> FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED may be set that will prevent the
> ftrace_ops from failing to register. In this case, the handler may
> either check if regs is not NULL or check if ARCH_SUPPORTS_FTRACE_SAVE_REGS.
> If the arch supports passing regs it will set this macro and pass regs
> for ops that request them. All other archs will just pass NULL.

OK, this is good for me:)

Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

Thank you!

> 
> Cc: Alexander van Heukelum <heukelum@fastmail.fm>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com




* Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-11 19:50 ` [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls Steven Rostedt
@ 2012-07-12 12:39   ` Masami Hiramatsu
  2012-07-12 15:53     ` Steven Rostedt
  2012-07-13 18:47     ` Steven Rostedt
  0 siblings, 2 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2012-07-12 12:39 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

(2012/07/12 4:50), Steven Rostedt wrote:
> From: Steven Rostedt <srostedt@redhat.com>
> 
> Add saving full regs for function tracing on i386.
> The saving of regs was influenced by patches sent out by
> Masami Hiramatsu.
> 
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  arch/x86/include/asm/ftrace.h |    2 --
>  arch/x86/kernel/entry_32.S    |   58 +++++++++++++++++++++++++++++++++++++++++
>  arch/x86/kernel/ftrace.c      |    4 ---
>  3 files changed, 58 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
> index a847501..a6cae0c 100644
> --- a/arch/x86/include/asm/ftrace.h
> +++ b/arch/x86/include/asm/ftrace.h
> @@ -40,10 +40,8 @@
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE
>  #define ARCH_SUPPORTS_FTRACE_OPS 1
> -#ifdef CONFIG_X86_64
>  #define ARCH_SUPPORTS_FTRACE_SAVE_REGS
>  #endif
> -#endif
>  
>  #ifndef __ASSEMBLY__
>  extern void mcount(void);
> diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
> index 5da11d1..8ef138f 100644
> --- a/arch/x86/kernel/entry_32.S
> +++ b/arch/x86/kernel/entry_32.S
> @@ -1123,6 +1123,7 @@ ftrace_call:
>  	popl %edx
>  	popl %ecx
>  	popl %eax
> +ftrace_ret:
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  .globl ftrace_graph_call
>  ftrace_graph_call:
> @@ -1134,6 +1135,63 @@ ftrace_stub:
>  	ret
>  END(ftrace_caller)
>  
> +ENTRY(ftrace_regs_caller)
> +	pushf	/* push flags before compare (in ss location) */
> +	cmpl $0, function_trace_stop
> +	jne ftrace_restore_flags
> +
> +	pushl %esp	/* Save stack in sp location */
> +	subl $4, (%esp) /* Adjust saved stack to skip saved flags */
> +	pushl 4(%esp)	/* Save flags in correct position */
> +	movl $__KERNEL_DS, 8(%esp)	/* Save ss */
> +	pushl $__KERNEL_CS
> +	pushl 4*4(%esp)	/* Save the ip */
> +	subl $MCOUNT_INSN_SIZE, (%esp)	/* Adjust ip */
> +	pushl $0	/* Load 0 into orig_ax */

Oops, you might have forgotten that i386's interrupt stack layout is a
bit different from x86-64's.

On x86-64, regs->sp directly points to the top of the stack. On i386, on
the other hand, &regs->sp IS the top of the stack. You can see this in
the code below from arch/x86/include/asm/ptrace.h:
---
/*
 * X86_32 CPUs don't save ss and esp if the CPU is already in kernel mode
 * when it traps.  The previous stack will be directly underneath the saved
 * registers, and 'sp/ss' won't even have been saved. Thus the '&regs->sp'.
 *
 * This is valid only for kernel mode traps.
 */
static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
{
#ifdef CONFIG_X86_32
        return (unsigned long)(&regs->sp);
#else
        return regs->sp;
#endif
}
---

This means that you need a trick here.

	 sp-> [retaddr]
	(*)-> [orig_stack]

Here is the stack layout when ftrace_regs_caller is called.
(*) marks the original stack pointer; this means that regs->sp is
placed at (*). After doing pushf, it changes as below.

	                    (what user expects)
	 sp-> [flags]      <- regs.cs
	      [retaddr]    <- regs.flags
	(*)-> [orig_stack] <- regs.sp

So we have to change this stack layout to what the user expects. That is
what I did in my previous series:

https://lkml.org/lkml/2012/6/5/119

In this patch, I clobber the return address on the stack and
store it on the local stack for that reason.

+	movl 14*4(%esp), %eax	/* Load return address */
+	pushl %eax		/* Save return address (+4) */
+	subl $MCOUNT_INSN_SIZE, %eax
+	movl %eax, 12*4+4(%esp)	/* Store IP */
+	movl 13*4+4(%esp), %edx	/* Load flags */
+	movl %edx, 14*4+4(%esp)	/* Store flags */
+	movl $__KERNEL_CS, %edx
+	movl %edx, 13*4+4(%esp)

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com




* Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-12 12:39   ` Masami Hiramatsu
@ 2012-07-12 15:53     ` Steven Rostedt
  2012-07-13 18:47     ` Steven Rostedt
  1 sibling, 0 replies; 23+ messages in thread
From: Steven Rostedt @ 2012-07-12 15:53 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

I'm slowly getting this patch set into working order ;-)


On Thu, 2012-07-12 at 21:39 +0900, Masami Hiramatsu wrote:
>  
> > +ENTRY(ftrace_regs_caller)
> > +	pushf	/* push flags before compare (in ss location) */
> > +	cmpl $0, function_trace_stop
> > +	jne ftrace_restore_flags
> > +
> > +	pushl %esp	/* Save stack in sp location */
> > +	subl $4, (%esp) /* Adjust saved stack to skip saved flags */
> > +	pushl 4(%esp)	/* Save flags in correct position */
> > +	movl $__KERNEL_DS, 8(%esp)	/* Save ss */
> > +	pushl $__KERNEL_CS
> > +	pushl 4*4(%esp)	/* Save the ip */
> > +	subl $MCOUNT_INSN_SIZE, (%esp)	/* Adjust ip */
> > +	pushl $0	/* Load 0 into orig_ax */
> 
> Oops, you might have forgotten that i386's interrupt stack layout is a
> bit different from x86-64's.
> 
> On x86-64, regs->sp directly points to the top of the stack. On i386, on
> the other hand, &regs->sp IS the top of the stack. You can see this in
> the code below from arch/x86/include/asm/ptrace.h:
> ---
> /*
>  * X86_32 CPUs don't save ss and esp if the CPU is already in kernel mode
>  * when it traps.  The previous stack will be directly underneath the saved
>  * registers, and 'sp/ss' won't even have been saved. Thus the '&regs->sp'.
>  *
>  * This is valid only for kernel mode traps.
>  */
> static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
> {
> #ifdef CONFIG_X86_32
>         return (unsigned long)(&regs->sp);
> #else
>         return regs->sp;
> #endif


Yuck yuck yuck!

> }
> ---
> 
> This means that you need a trick here.
> 
> 	 sp-> [retaddr]
> 	(*)-> [orig_stack]
> 
> Here is the stack layout when ftrace_regs_caller is called.
> (*) marks the original stack pointer; this means that regs->sp is
> placed at (*). After doing pushf, it changes as below.
> 
> 	                    (what user expects)
> 	 sp-> [flags]      <- regs.cs
> 	      [retaddr]    <- regs.flags
> 	(*)-> [orig_stack] <- regs.sp
> 
> So we have to change this stack layout to what the user expects. That is
> what I did in my previous series:

Yeah, I saw that you did this, but didn't fully understand why. I
completely forgot about that hack in x86_32 :-(

This is why I'm insisting on getting your Reviewed-by, as you seem to be
more up-to-date on the subtleties between 32 and 64 than I am.
 
> 
> https://lkml.org/lkml/2012/6/5/119
> 
> In this patch, I clobber the return address on the stack and
> store it on the local stack for that reason.
> 
> +	movl 14*4(%esp), %eax	/* Load return address */
> +	pushl %eax		/* Save return address (+4) */
> +	subl $MCOUNT_INSN_SIZE, %eax
> +	movl %eax, 12*4+4(%esp)	/* Store IP */
> +	movl 13*4+4(%esp), %edx	/* Load flags */
> +	movl %edx, 14*4+4(%esp)	/* Store flags */
> +	movl $__KERNEL_CS, %edx
> +	movl %edx, 13*4+4(%esp)

Well the change log does say that my patch set was influenced by your
code. I started to veer from that. I shouldn't have.
 
> 
> Thank you,
> 

No, thank you!

/me goes to work on v5.

-- Steve




* Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-12 12:39   ` Masami Hiramatsu
  2012-07-12 15:53     ` Steven Rostedt
@ 2012-07-13 18:47     ` Steven Rostedt
  2012-07-17  2:08       ` Masami Hiramatsu
  2012-07-18 15:59       ` Steven Rostedt
  1 sibling, 2 replies; 23+ messages in thread
From: Steven Rostedt @ 2012-07-13 18:47 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

On Thu, 2012-07-12 at 21:39 +0900, Masami Hiramatsu wrote:

> /*
>  * X86_32 CPUs don't save ss and esp if the CPU is already in kernel mode
>  * when it traps.  The previous stack will be directly underneath the saved
>  * registers, and 'sp/ss' won't even have been saved. Thus the '&regs->sp'.
>  *
>  * This is valid only for kernel mode traps.
>  */
> static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
> {
> #ifdef CONFIG_X86_32
>         return (unsigned long)(&regs->sp);
> #else
>         return regs->sp;
> #endif
> }

I found that regs_get_register() doesn't honor this either. Thus,
kprobes in tracing gets this:

 # echo 'p:ftrace sys_read+4 s=%sp' > /debug/tracing/kprobe_events
 # echo 1 > /debug/tracing/events/kprobes/enable
 # cat trace
            sshd-1345  [000] d...   489.117168: ftrace: (sys_read+0x4/0x70) s=b7e96768
            sshd-1345  [000] d...   489.117191: ftrace: (sys_read+0x4/0x70) s=b7e96768
             cat-1447  [000] d...   489.117392: ftrace: (sys_read+0x4/0x70) s=5a7
             cat-1447  [001] d...   489.118023: ftrace: (sys_read+0x4/0x70) s=b77ad05f
            less-1448  [000] d...   489.118079: ftrace: (sys_read+0x4/0x70) s=b7762e06
            less-1448  [000] d...   489.118117: ftrace: (sys_read+0x4/0x70) s=b7764970

Note, I added +4 (which keeps it a normal kprobe, not an ftrace one)
because at that point the function has updated the stack, so regs->sp
would definitely not be a kernel stack address.

This patch fixes it:

diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h
index dcfde52..b1e0f53 100644
--- a/arch/x86/include/asm/ptrace.h
+++ b/arch/x86/include/asm/ptrace.h
@@ -246,6 +246,11 @@ static inline unsigned long regs_get_register(struct pt_regs *regs,
 {
 	if (unlikely(offset > MAX_REG_OFFSET))
 		return 0;
+#ifdef CONFIG_X86_32
+	if (offset == offsetof(struct pt_regs, sp) &&
+	    regs->cs == __KERNEL_CS)
+		return kernel_stack_pointer(regs);
+#endif
 	return *(unsigned long *)((unsigned long)regs + offset);
 }
 
            sshd-1352  [000] d...   362.348016: ftrace: (sys_read+0x4/0x70) s=f3febfa8
            sshd-1352  [000] d...   362.348048: ftrace: (sys_read+0x4/0x70) s=f3febfa8
            bash-1355  [001] d...   362.348081: ftrace: (sys_read+0x4/0x70) s=f5075fa8
            sshd-1352  [000] d...   362.348082: ftrace: (sys_read+0x4/0x70) s=f3febfa8
            sshd-1352  [000] d...   362.690950: ftrace: (sys_read+0x4/0x70) s=f3febfa8
            bash-1355  [001] d...   362.691033: ftrace: (sys_read+0x4/0x70) s=f5075fa8

I'll post that patch separately.

> ---
> 
> This means that you need a trick here.
> 
> 	 sp-> [retaddr]
> 	(*)-> [orig_stack]
> 
> Here is the stack layout when the ftrace_regs_caller is called.
> (*) marks the original stack pointer; this means that regs->sp is
> placed at (*). After doing pushf, it changes as below.
> 
> 	                    (what user expects)
> 	 sp-> [flags]      <- regs.cs
> 	      [retaddr]    <- regs.flags
> 	(*)-> [orig_stack] <- regs.sp
> 
> So we have to change this stack layout to what the user expects. That is
> what I did in my previous series:
> 
> https://lkml.org/lkml/2012/6/5/119
> 
> In this patch, I clobber the return address on the stack and
> store it on the local stack for that reason.
> 
> +	movl 14*4(%esp), %eax	/* Load return address */
> +	pushl %eax		/* Save return address (+4) */
> +	subl $MCOUNT_INSN_SIZE, %eax
> +	movl %eax, 12*4+4(%esp)	/* Store IP */
> +	movl 13*4+4(%esp), %edx	/* Load flags */
> +	movl %edx, 14*4+4(%esp)	/* Store flags */
> +	movl $__KERNEL_CS, %edx
> +	movl %edx, 13*4+4(%esp)

I did something slightly different but basically the same. Here's the
new version:

-- Steve

From: Steven Rostedt <srostedt@redhat.com>
Date: Tue, 5 Jun 2012 20:00:11 -0400
Subject: [PATCH] ftrace/x86: Add save_regs for i386 function calls

Add saving full regs for function tracing on i386.
The saving of regs was influenced by patches sent out by
Masami Hiramatsu.

Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/include/asm/ftrace.h |    2 -
 arch/x86/kernel/entry_32.S    |   68 +++++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/ftrace.c      |    4 --
 3 files changed, 68 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index a847501..a6cae0c 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -40,10 +40,8 @@
 
 #ifdef CONFIG_DYNAMIC_FTRACE
 #define ARCH_SUPPORTS_FTRACE_OPS 1
-#ifdef CONFIG_X86_64
 #define ARCH_SUPPORTS_FTRACE_SAVE_REGS
 #endif
-#endif
 
 #ifndef __ASSEMBLY__
 extern void mcount(void);
diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 5da11d1..46caa56 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -1123,6 +1123,7 @@ ftrace_call:
 	popl %edx
 	popl %ecx
 	popl %eax
+ftrace_ret:
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 .globl ftrace_graph_call
 ftrace_graph_call:
@@ -1134,6 +1135,73 @@ ftrace_stub:
 	ret
 END(ftrace_caller)
 
+ENTRY(ftrace_regs_caller)
+	pushf	/* push flags before compare (in cs location) */
+	cmpl $0, function_trace_stop
+	jne ftrace_restore_flags
+
+	/*
+	 * i386 does not save SS and ESP when coming from kernel.
+	 * Instead, to get sp, &regs->sp is used (see ptrace.h).
+	 * Unfortunately, that means eflags must be at the same location
+	 * as the current return ip is. We move the return ip into the
+	 * ip location, and move flags into the return ip location.
+	 */
+	pushl 4(%esp)	/* save return ip into ip slot */
+	subl $MCOUNT_INSN_SIZE, (%esp)	/* Adjust ip */
+
+	pushl $0	/* Load 0 into orig_ax */
+	pushl %gs
+	pushl %fs
+	pushl %es
+	pushl %ds
+	pushl %eax
+	pushl %ebp
+	pushl %edi
+	pushl %esi
+	pushl %edx
+	pushl %ecx
+	pushl %ebx
+
+	movl 13*4(%esp), %eax	/* Get the saved flags */
+	movl %eax, 14*4(%esp)	/* Move saved flags into regs->flags location */
+				/* clobbering return ip */
+	movl $__KERNEL_CS,13*4(%esp)
+
+	movl 12*4(%esp), %eax	/* Load ip (1st parameter) */
+	movl 0x4(%ebp), %edx	/* Load parent ip (2nd parameter) */
+	lea  (%esp), %ecx
+	pushl %ecx		/* Save pt_regs as 4th parameter */
+	leal function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */
+
+GLOBAL(ftrace_regs_call)
+	call ftrace_stub
+
+	addl $4, %esp		/* Skip pt_regs */
+	movl 14*4(%esp), %eax	/* Move flags back into cs */
+	movl %eax, 13*4(%esp)	/* Needed to keep addl from modifying flags */
+	movl 12*4(%esp), %eax	/* Get return ip from regs->ip */
+	addl $MCOUNT_INSN_SIZE, %eax
+	movl %eax, 14*4(%esp)	/* Put return ip back for ret */
+
+	popl %ebx
+	popl %ecx
+	popl %edx
+	popl %esi
+	popl %edi
+	popl %ebp
+	popl %eax
+	popl %ds
+	popl %es
+	popl %fs
+	popl %gs
+	addl $8, %esp		/* Skip orig_ax and ip */
+	popf			/* Pop flags at end (no addl to corrupt flags) */
+	jmp ftrace_ret
+
+ftrace_restore_flags:
+	popf
+	jmp  ftrace_stub
 #else /* ! CONFIG_DYNAMIC_FTRACE */
 
 ENTRY(mcount)
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index b90eb1a..1d41402 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -206,7 +206,6 @@ static int
 ftrace_modify_code(unsigned long ip, unsigned const char *old_code,
 		   unsigned const char *new_code);
 
-#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
 /*
  * Should never be called:
  *  As it is only called by __ftrace_replace_code() which is called by
@@ -221,7 +220,6 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
 	WARN_ON(1);
 	return -EINVAL;
 }
-#endif
 
 int ftrace_update_ftrace_func(ftrace_func_t func)
 {
@@ -237,7 +235,6 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 
 	ret = ftrace_modify_code(ip, old, new);
 
-#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
 	/* Also update the regs callback function */
 	if (!ret) {
 		ip = (unsigned long)(&ftrace_regs_call);
@@ -245,7 +242,6 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 		new = ftrace_call_replace(ip, (unsigned long)func);
 		ret = ftrace_modify_code(ip, old, new);
 	}
-#endif
 
 	atomic_dec(&modifying_ftrace_code);
 
-- 
1.7.3.4





* Re: Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-13 18:47     ` Steven Rostedt
@ 2012-07-17  2:08       ` Masami Hiramatsu
  2012-07-17  3:05         ` Steven Rostedt
  2012-07-18 15:59       ` Steven Rostedt
  1 sibling, 1 reply; 23+ messages in thread
From: Masami Hiramatsu @ 2012-07-17  2:08 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

(2012/07/14 3:47), Steven Rostedt wrote:
> On Thu, 2012-07-12 at 21:39 +0900, Masami Hiramatsu wrote:
> 
>> /*
>>  * X86_32 CPUs don't save ss and esp if the CPU is already in kernel mode
>>  * when it traps.  The previous stack will be directly underneath the saved
>>  * registers, and 'sp/ss' won't even have been saved. Thus the '&regs->sp'.
>>  *
>>  * This is valid only for kernel mode traps.
>>  */
>> static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
>> {
>> #ifdef CONFIG_X86_32
>>         return (unsigned long)(&regs->sp);
>> #else
>>         return regs->sp;
>> #endif
>> }
> 
> I found that regs_get_register() doesn't honor this either. Thus,
> kprobes in tracing gets this:
> 
>  # echo 'p:ftrace sys_read+4 s=%sp' > /debug/tracing/kprobe_events
>  # echo 1 > /debug/tracing/events/kprobes/enable
>  # cat trace
>             sshd-1345  [000] d...   489.117168: ftrace: (sys_read+0x4/0x70) s=b7e96768
>             sshd-1345  [000] d...   489.117191: ftrace: (sys_read+0x4/0x70) s=b7e96768
>              cat-1447  [000] d...   489.117392: ftrace: (sys_read+0x4/0x70) s=5a7
>              cat-1447  [001] d...   489.118023: ftrace: (sys_read+0x4/0x70) s=b77ad05f
>             less-1448  [000] d...   489.118079: ftrace: (sys_read+0x4/0x70) s=b7762e06
>             less-1448  [000] d...   489.118117: ftrace: (sys_read+0x4/0x70) s=b7764970
> 

Yes, that is by design, since I made it so. :)
Instead of %sp, the kprobe tracer provides the $stack special argument
for the stack address, because "sp" does not always mean the stack
address on every arch.

Thanks,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com




* Re: Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-17  2:08       ` Masami Hiramatsu
@ 2012-07-17  3:05         ` Steven Rostedt
  2012-07-17  3:13           ` Masami Hiramatsu
  0 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2012-07-17  3:05 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

On Tue, 2012-07-17 at 11:08 +0900, Masami Hiramatsu wrote:

> > I found that regs_get_register() doesn't honor this either. Thus,
> > kprobes in tracing gets this:
> > 
> >  # echo 'p:ftrace sys_read+4 s=%sp' > /debug/tracing/kprobe_events
> >  # echo 1 > /debug/tracing/events/kprobes/enable
> >  # cat trace
> >             sshd-1345  [000] d...   489.117168: ftrace: (sys_read+0x4/0x70) s=b7e96768
> >             sshd-1345  [000] d...   489.117191: ftrace: (sys_read+0x4/0x70) s=b7e96768
> >              cat-1447  [000] d...   489.117392: ftrace: (sys_read+0x4/0x70) s=5a7
> >              cat-1447  [001] d...   489.118023: ftrace: (sys_read+0x4/0x70) s=b77ad05f
> >             less-1448  [000] d...   489.118079: ftrace: (sys_read+0x4/0x70) s=b7762e06
> >             less-1448  [000] d...   489.118117: ftrace: (sys_read+0x4/0x70) s=b7764970
> > 
> 
> Yes, that is by design, since I made it so. :)
> Instead of %sp, the kprobe tracer provides the $stack special argument
> for the stack address, because "sp" does not always mean the stack
> address on every arch.

But is that useful? Wouldn't the actual stack pointer be more
informative?

-- Steve




* Re: Re: Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-17  3:05         ` Steven Rostedt
@ 2012-07-17  3:13           ` Masami Hiramatsu
  0 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2012-07-17  3:13 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

(2012/07/17 12:05), Steven Rostedt wrote:
> On Tue, 2012-07-17 at 11:08 +0900, Masami Hiramatsu wrote:
> 
>>> I found that regs_get_register() doesn't honor this either. Thus,
>>> kprobes in tracing gets this:
>>>
>>>  # echo 'p:ftrace sys_read+4 s=%sp' > /debug/tracing/kprobe_events
>>>  # echo 1 > /debug/tracing/events/kprobes/enable
>>>  # cat trace
>>>             sshd-1345  [000] d...   489.117168: ftrace: (sys_read+0x4/0x70) s=b7e96768
>>>             sshd-1345  [000] d...   489.117191: ftrace: (sys_read+0x4/0x70) s=b7e96768
>>>              cat-1447  [000] d...   489.117392: ftrace: (sys_read+0x4/0x70) s=5a7
>>>              cat-1447  [001] d...   489.118023: ftrace: (sys_read+0x4/0x70) s=b77ad05f
>>>             less-1448  [000] d...   489.118079: ftrace: (sys_read+0x4/0x70) s=b7762e06
>>>             less-1448  [000] d...   489.118117: ftrace: (sys_read+0x4/0x70) s=b7764970
>>>
>>
>> Yes, that is by design, since I made it so. :)
>> Instead of %sp, the kprobe tracer provides the $stack special
>> argument for the stack address, because "sp" does not always mean
>> the stack address on every arch.
> 
> But is that useful? Wouldn't the actual stack pointer be more
> informative?

That was just FYI :). I prefer your "%sp" enhancement
to the current meaningless "%sp" on i386...

However, I think "$stack" is more general and informative
for users; thus, at least perf probe uses it for getting
variables from the stack.

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com




* Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-13 18:47     ` Steven Rostedt
  2012-07-17  2:08       ` Masami Hiramatsu
@ 2012-07-18 15:59       ` Steven Rostedt
  2012-07-19  2:20         ` Masami Hiramatsu
  1 sibling, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2012-07-18 15:59 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

On Fri, 2012-07-13 at 14:47 -0400, Steven Rostedt wrote:

Masami, can you give your Reviewed-by tag for this version? Or is there
something else needing to be fixed?

Thanks!

-- Steve

> From: Steven Rostedt <srostedt@redhat.com>
> Date: Tue, 5 Jun 2012 20:00:11 -0400
> Subject: [PATCH] ftrace/x86: Add save_regs for i386 function calls
> 
> Add saving full regs for function tracing on i386.
> The saving of regs was influenced by patches sent out by
> Masami Hiramatsu.
> 
> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  arch/x86/include/asm/ftrace.h |    2 -
>  arch/x86/kernel/entry_32.S    |   68 +++++++++++++++++++++++++++++++++++++++++
>  arch/x86/kernel/ftrace.c      |    4 --
>  3 files changed, 68 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
> index a847501..a6cae0c 100644
> --- a/arch/x86/include/asm/ftrace.h
> +++ b/arch/x86/include/asm/ftrace.h
> @@ -40,10 +40,8 @@
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE
>  #define ARCH_SUPPORTS_FTRACE_OPS 1
> -#ifdef CONFIG_X86_64
>  #define ARCH_SUPPORTS_FTRACE_SAVE_REGS
>  #endif
> -#endif
>  
>  #ifndef __ASSEMBLY__
>  extern void mcount(void);
> diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
> index 5da11d1..46caa56 100644
> --- a/arch/x86/kernel/entry_32.S
> +++ b/arch/x86/kernel/entry_32.S
> @@ -1123,6 +1123,7 @@ ftrace_call:
>  	popl %edx
>  	popl %ecx
>  	popl %eax
> +ftrace_ret:
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  .globl ftrace_graph_call
>  ftrace_graph_call:
> @@ -1134,6 +1135,73 @@ ftrace_stub:
>  	ret
>  END(ftrace_caller)
>  
> +ENTRY(ftrace_regs_caller)
> +	pushf	/* push flags before compare (in cs location) */
> +	cmpl $0, function_trace_stop
> +	jne ftrace_restore_flags
> +
> +	/*
> +	 * i386 does not save SS and ESP when coming from kernel.
> +	 * Instead, to get sp, &regs->sp is used (see ptrace.h).
> +	 * Unfortunately, that means eflags must be at the same location
> +	 * as the current return ip is. We move the return ip into the
> +	 * ip location, and move flags into the return ip location.
> +	 */
> +	pushl 4(%esp)	/* save return ip into ip slot */
> +	subl $MCOUNT_INSN_SIZE, (%esp)	/* Adjust ip */
> +
> +	pushl $0	/* Load 0 into orig_ax */
> +	pushl %gs
> +	pushl %fs
> +	pushl %es
> +	pushl %ds
> +	pushl %eax
> +	pushl %ebp
> +	pushl %edi
> +	pushl %esi
> +	pushl %edx
> +	pushl %ecx
> +	pushl %ebx
> +
> +	movl 13*4(%esp), %eax	/* Get the saved flags */
> +	movl %eax, 14*4(%esp)	/* Move saved flags into regs->flags location */
> +				/* clobbering return ip */
> +	movl $__KERNEL_CS,13*4(%esp)
> +
> +	movl 12*4(%esp), %eax	/* Load ip (1st parameter) */
> +	movl 0x4(%ebp), %edx	/* Load parent ip (2cd parameter) */
> +	lea  (%esp), %ecx
> +	pushl %ecx		/* Save pt_regs as 4th parameter */
> +	leal function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */
> +
> +GLOBAL(ftrace_regs_call)
> +	call ftrace_stub
> +
> +	addl $4, %esp		/* Skip pt_regs */
> +	movl 14*4(%esp), %eax	/* Move flags back into cs */
> +	movl %eax, 13*4(%esp)	/* Needed to keep addl from modifying flags */
> +	movl 12*4(%esp), %eax	/* Get return ip from regs->ip */
> +	addl $MCOUNT_INSN_SIZE, %eax
> +	movl %eax, 14*4(%esp)	/* Put return ip back for ret */
> +
> +	popl %ebx
> +	popl %ecx
> +	popl %edx
> +	popl %esi
> +	popl %edi
> +	popl %ebp
> +	popl %eax
> +	popl %ds
> +	popl %es
> +	popl %fs
> +	popl %gs
> +	addl $8, %esp		/* Skip orig_ax and ip */
> +	popf			/* Pop flags at end (no addl to corrupt flags) */
> +	jmp ftrace_ret
> +
> +ftrace_restore_flags:
> +	popf
> +	jmp  ftrace_stub
>  #else /* ! CONFIG_DYNAMIC_FTRACE */
>  
>  ENTRY(mcount)
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index b90eb1a..1d41402 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -206,7 +206,6 @@ static int
>  ftrace_modify_code(unsigned long ip, unsigned const char *old_code,
>  		   unsigned const char *new_code);
>  
> -#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
>  /*
>   * Should never be called:
>   *  As it is only called by __ftrace_replace_code() which is called by
> @@ -221,7 +220,6 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
>  	WARN_ON(1);
>  	return -EINVAL;
>  }
> -#endif
>  
>  int ftrace_update_ftrace_func(ftrace_func_t func)
>  {
> @@ -237,7 +235,6 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
>  
>  	ret = ftrace_modify_code(ip, old, new);
>  
> -#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
>  	/* Also update the regs callback function */
>  	if (!ret) {
>  		ip = (unsigned long)(&ftrace_regs_call);
> @@ -245,7 +242,6 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
>  		new = ftrace_call_replace(ip, (unsigned long)func);
>  		ret = ftrace_modify_code(ip, old, new);
>  	}
> -#endif
>  
>  	atomic_dec(&modifying_ftrace_code);
>  




* Re: Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-18 15:59       ` Steven Rostedt
@ 2012-07-19  2:20         ` Masami Hiramatsu
  2012-07-19 12:52           ` Steven Rostedt
  0 siblings, 1 reply; 23+ messages in thread
From: Masami Hiramatsu @ 2012-07-19  2:20 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

(2012/07/19 0:59), Steven Rostedt wrote:
> On Fri, 2012-07-13 at 14:47 -0400, Steven Rostedt wrote:
> 
> Masami, can you give your Reviewed-by tag for this version? Or is there
> something else needing to be fixed?

No, that is OK for me. I've just missed that...

Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>

Thank you!

> 
> Thanks!
> 
> -- Steve
> 
>> From: Steven Rostedt <srostedt@redhat.com>
>> Date: Tue, 5 Jun 2012 20:00:11 -0400
>> Subject: [PATCH] ftrace/x86: Add save_regs for i386 function calls
>>
>> Add saving full regs for function tracing on i386.
>> The saving of regs was influenced by patches sent out by
>> Masami Hiramatsu.
>>
>> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
>> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
>> ---
>>  arch/x86/include/asm/ftrace.h |    2 -
>>  arch/x86/kernel/entry_32.S    |   68 +++++++++++++++++++++++++++++++++++++++++
>>  arch/x86/kernel/ftrace.c      |    4 --
>>  3 files changed, 68 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
>> index a847501..a6cae0c 100644
>> --- a/arch/x86/include/asm/ftrace.h
>> +++ b/arch/x86/include/asm/ftrace.h
>> @@ -40,10 +40,8 @@
>>  
>>  #ifdef CONFIG_DYNAMIC_FTRACE
>>  #define ARCH_SUPPORTS_FTRACE_OPS 1
>> -#ifdef CONFIG_X86_64
>>  #define ARCH_SUPPORTS_FTRACE_SAVE_REGS
>>  #endif
>> -#endif
>>  
>>  #ifndef __ASSEMBLY__
>>  extern void mcount(void);
>> diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
>> index 5da11d1..46caa56 100644
>> --- a/arch/x86/kernel/entry_32.S
>> +++ b/arch/x86/kernel/entry_32.S
>> @@ -1123,6 +1123,7 @@ ftrace_call:
>>  	popl %edx
>>  	popl %ecx
>>  	popl %eax
>> +ftrace_ret:
>>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>>  .globl ftrace_graph_call
>>  ftrace_graph_call:
>> @@ -1134,6 +1135,73 @@ ftrace_stub:
>>  	ret
>>  END(ftrace_caller)
>>  
>> +ENTRY(ftrace_regs_caller)
>> +	pushf	/* push flags before compare (in cs location) */
>> +	cmpl $0, function_trace_stop
>> +	jne ftrace_restore_flags
>> +
>> +	/*
>> +	 * i386 does not save SS and ESP when coming from kernel.
>> +	 * Instead, to get sp, &regs->sp is used (see ptrace.h).
>> +	 * Unfortunately, that means eflags must be at the same location
>> +	 * as the current return ip is. We move the return ip into the
>> +	 * ip location, and move flags into the return ip location.
>> +	 */
>> +	pushl 4(%esp)	/* save return ip into ip slot */
>> +	subl $MCOUNT_INSN_SIZE, (%esp)	/* Adjust ip */
>> +
>> +	pushl $0	/* Load 0 into orig_ax */
>> +	pushl %gs
>> +	pushl %fs
>> +	pushl %es
>> +	pushl %ds
>> +	pushl %eax
>> +	pushl %ebp
>> +	pushl %edi
>> +	pushl %esi
>> +	pushl %edx
>> +	pushl %ecx
>> +	pushl %ebx
>> +
>> +	movl 13*4(%esp), %eax	/* Get the saved flags */
>> +	movl %eax, 14*4(%esp)	/* Move saved flags into regs->flags location */
>> +				/* clobbering return ip */
>> +	movl $__KERNEL_CS,13*4(%esp)
>> +
>> +	movl 12*4(%esp), %eax	/* Load ip (1st parameter) */
>> +	movl 0x4(%ebp), %edx	/* Load parent ip (2cd parameter) */
>> +	lea  (%esp), %ecx
>> +	pushl %ecx		/* Save pt_regs as 4th parameter */
>> +	leal function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */
>> +
>> +GLOBAL(ftrace_regs_call)
>> +	call ftrace_stub
>> +
>> +	addl $4, %esp		/* Skip pt_regs */
>> +	movl 14*4(%esp), %eax	/* Move flags back into cs */
>> +	movl %eax, 13*4(%esp)	/* Needed to keep addl from modifying flags */
>> +	movl 12*4(%esp), %eax	/* Get return ip from regs->ip */
>> +	addl $MCOUNT_INSN_SIZE, %eax
>> +	movl %eax, 14*4(%esp)	/* Put return ip back for ret */
>> +
>> +	popl %ebx
>> +	popl %ecx
>> +	popl %edx
>> +	popl %esi
>> +	popl %edi
>> +	popl %ebp
>> +	popl %eax
>> +	popl %ds
>> +	popl %es
>> +	popl %fs
>> +	popl %gs
>> +	addl $8, %esp		/* Skip orig_ax and ip */
>> +	popf			/* Pop flags at end (no addl to corrupt flags) */
>> +	jmp ftrace_ret
>> +
>> +ftrace_restore_flags:
>> +	popf
>> +	jmp  ftrace_stub
>>  #else /* ! CONFIG_DYNAMIC_FTRACE */
>>  
>>  ENTRY(mcount)
>> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
>> index b90eb1a..1d41402 100644
>> --- a/arch/x86/kernel/ftrace.c
>> +++ b/arch/x86/kernel/ftrace.c
>> @@ -206,7 +206,6 @@ static int
>>  ftrace_modify_code(unsigned long ip, unsigned const char *old_code,
>>  		   unsigned const char *new_code);
>>  
>> -#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
>>  /*
>>   * Should never be called:
>>   *  As it is only called by __ftrace_replace_code() which is called by
>> @@ -221,7 +220,6 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
>>  	WARN_ON(1);
>>  	return -EINVAL;
>>  }
>> -#endif
>>  
>>  int ftrace_update_ftrace_func(ftrace_func_t func)
>>  {
>> @@ -237,7 +235,6 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
>>  
>>  	ret = ftrace_modify_code(ip, old, new);
>>  
>> -#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
>>  	/* Also update the regs callback function */
>>  	if (!ret) {
>>  		ip = (unsigned long)(&ftrace_regs_call);
>> @@ -245,7 +242,6 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
>>  		new = ftrace_call_replace(ip, (unsigned long)func);
>>  		ret = ftrace_modify_code(ip, old, new);
>>  	}
>> -#endif
>>  
>>  	atomic_dec(&modifying_ftrace_code);
>>  
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 


-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com




* Re: Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-19  2:20         ` Masami Hiramatsu
@ 2012-07-19 12:52           ` Steven Rostedt
  2012-07-19 12:58             ` Steven Rostedt
                               ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Steven Rostedt @ 2012-07-19 12:52 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

On Thu, 2012-07-19 at 11:20 +0900, Masami Hiramatsu wrote:
> (2012/07/19 0:59), Steven Rostedt wrote:
> > On Fri, 2012-07-13 at 14:47 -0400, Steven Rostedt wrote:
> > 
> > Masami, can you give your Reviewed-by tag for this version? Or is there
> > something else needing to be fixed?
> 
> No, that is OK for me. I've just missed that...
> 
> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> 
> Thank you!
> 

Thanks! Except I have one more version. Someone offlist gave me some
ideas.

Here's the diff from the previous patch.

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 46caa56..ca5a146 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -1169,10 +1169,9 @@ ENTRY(ftrace_regs_caller)
 	movl $__KERNEL_CS,13*4(%esp)
 
 	movl 12*4(%esp), %eax	/* Load ip (1st parameter) */
-	movl 0x4(%ebp), %edx	/* Load parent ip (2cd parameter) */
-	lea  (%esp), %ecx
-	pushl %ecx		/* Save pt_regs as 4th parameter */
+	movl 0x4(%ebp), %edx	/* Load parent ip (2nd parameter) */
 	leal function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */
+	pushl %esp		/* Save pt_regs as 4th parameter */
 
 GLOBAL(ftrace_regs_call)
 	call ftrace_stub
@@ -1195,8 +1194,8 @@ GLOBAL(ftrace_regs_call)
 	popl %es
 	popl %fs
 	popl %gs
-	addl $8, %esp		/* Skip orig_ax and ip */
-	popf			/* Pop flags at end (no addl to corrupt flags) */
+	lea 8(%esp), %esp	/* Skip orig_ax and ip */
+	popf			/* Pop flags at end */
 	jmp ftrace_ret
 
 ftrace_restore_flags:


Because we no longer have that 4-byte offset on the stack when we need
to load the 4th parameter, we can just push the current stack pointer
directly (pushl %esp), without the intermediate save to %ecx.

Also, because lea is faster than add (and doesn't even modify flags), I
changed the last part to use lea instead of addl.

Can you give your reviewed-by tag for this too? I'd like to push this
out today so we can still make 3.6.

Thanks!

-- Steve

here's the full patch:

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index a847501..a6cae0c 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -40,10 +40,8 @@
 
 #ifdef CONFIG_DYNAMIC_FTRACE
 #define ARCH_SUPPORTS_FTRACE_OPS 1
-#ifdef CONFIG_X86_64
 #define ARCH_SUPPORTS_FTRACE_SAVE_REGS
 #endif
-#endif
 
 #ifndef __ASSEMBLY__
 extern void mcount(void);
diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 5da11d1..ca5a146 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -1123,6 +1123,7 @@ ftrace_call:
 	popl %edx
 	popl %ecx
 	popl %eax
+ftrace_ret:
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 .globl ftrace_graph_call
 ftrace_graph_call:
@@ -1134,6 +1135,72 @@ ftrace_stub:
 	ret
 END(ftrace_caller)
 
+ENTRY(ftrace_regs_caller)
+	pushf	/* push flags before compare (in cs location) */
+	cmpl $0, function_trace_stop
+	jne ftrace_restore_flags
+
+	/*
+	 * i386 does not save SS and ESP when coming from kernel.
+	 * Instead, to get sp, &regs->sp is used (see ptrace.h).
+	 * Unfortunately, that means eflags must be at the same location
+	 * as the current return ip is. We move the return ip into the
+	 * ip location, and move flags into the return ip location.
+	 */
+	pushl 4(%esp)	/* save return ip into ip slot */
+	subl $MCOUNT_INSN_SIZE, (%esp)	/* Adjust ip */
+
+	pushl $0	/* Load 0 into orig_ax */
+	pushl %gs
+	pushl %fs
+	pushl %es
+	pushl %ds
+	pushl %eax
+	pushl %ebp
+	pushl %edi
+	pushl %esi
+	pushl %edx
+	pushl %ecx
+	pushl %ebx
+
+	movl 13*4(%esp), %eax	/* Get the saved flags */
+	movl %eax, 14*4(%esp)	/* Move saved flags into regs->flags location */
+				/* clobbering return ip */
+	movl $__KERNEL_CS,13*4(%esp)
+
+	movl 12*4(%esp), %eax	/* Load ip (1st parameter) */
+	movl 0x4(%ebp), %edx	/* Load parent ip (2nd parameter) */
+	leal function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */
+	pushl %esp		/* Save pt_regs as 4th parameter */
+
+GLOBAL(ftrace_regs_call)
+	call ftrace_stub
+
+	addl $4, %esp		/* Skip pt_regs */
+	movl 14*4(%esp), %eax	/* Move flags back into cs */
+	movl %eax, 13*4(%esp)	/* Needed to keep addl from modifying flags */
+	movl 12*4(%esp), %eax	/* Get return ip from regs->ip */
+	addl $MCOUNT_INSN_SIZE, %eax
+	movl %eax, 14*4(%esp)	/* Put return ip back for ret */
+
+	popl %ebx
+	popl %ecx
+	popl %edx
+	popl %esi
+	popl %edi
+	popl %ebp
+	popl %eax
+	popl %ds
+	popl %es
+	popl %fs
+	popl %gs
+	lea 8(%esp), %esp	/* Skip orig_ax and ip */
+	popf			/* Pop flags at end */
+	jmp ftrace_ret
+
+ftrace_restore_flags:
+	popf
+	jmp  ftrace_stub
 #else /* ! CONFIG_DYNAMIC_FTRACE */
 
 ENTRY(mcount)
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index b90eb1a..1d41402 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -206,7 +206,6 @@ static int
 ftrace_modify_code(unsigned long ip, unsigned const char *old_code,
 		   unsigned const char *new_code);
 
-#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
 /*
  * Should never be called:
  *  As it is only called by __ftrace_replace_code() which is called by
@@ -221,7 +220,6 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
 	WARN_ON(1);
 	return -EINVAL;
 }
-#endif
 
 int ftrace_update_ftrace_func(ftrace_func_t func)
 {
@@ -237,7 +235,6 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 
 	ret = ftrace_modify_code(ip, old, new);
 
-#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
 	/* Also update the regs callback function */
 	if (!ret) {
 		ip = (unsigned long)(&ftrace_regs_call);
@@ -245,7 +242,6 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 		new = ftrace_call_replace(ip, (unsigned long)func);
 		ret = ftrace_modify_code(ip, old, new);
 	}
-#endif
 
 	atomic_dec(&modifying_ftrace_code);
 





* Re: Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-19 12:52           ` Steven Rostedt
@ 2012-07-19 12:58             ` Steven Rostedt
  2012-07-19 22:53               ` H. Peter Anvin
  2012-07-19 18:24             ` Steven Rostedt
  2012-08-21 15:03             ` [tip:perf/core] ftrace/x86_32: Simplify parameter setup for ftrace_regs_caller tip-bot for Uros Bizjak
  2 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2012-07-19 12:58 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

On Thu, 2012-07-19 at 08:52 -0400, Steven Rostedt wrote:
>  
>  GLOBAL(ftrace_regs_call)
>  	call ftrace_stub
> @@ -1195,8 +1194,8 @@ GLOBAL(ftrace_regs_call)
>  	popl %es
>  	popl %fs
>  	popl %gs
> -	addl $8, %esp		/* Skip orig_ax and ip */
> -	popf			/* Pop flags at end (no addl to corrupt flags) */
> +	lea 8(%esp), %esp	/* Skip orig_ax and ip */
> +	popf			/* Pop flags at end */
>  	jmp ftrace_ret
>  
>  ftrace_restore_flags:
> 
> 
> Because we no longer have that 4-byte offset on the stack when we need
> to load the 4th parameter, we can just push the current stack pointer
> directly (pushl %esp), without the intermediate save to %ecx.
> 
> Also, because lea is faster than add (and doesn't even modify flags), I
> changed the last part to use lea instead of addl.

Now I'm told that this is not always the case (at least not for Atom),
so I reverted this part and put back the addl. But can you still give
your Reviewed-by for the first part?

> 
> Can you give your reviewed-by tag for this too? I'd like to push this
> out today so we can still make 3.6.
> 
> Thanks!
> 
> -- Steve
> 
> here's the full patch:
> 
> diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
> index a847501..a6cae0c 100644
> --- a/arch/x86/include/asm/ftrace.h
> +++ b/arch/x86/include/asm/ftrace.h
> @@ -40,10 +40,8 @@
>  
>  #ifdef CONFIG_DYNAMIC_FTRACE
>  #define ARCH_SUPPORTS_FTRACE_OPS 1
> -#ifdef CONFIG_X86_64
>  #define ARCH_SUPPORTS_FTRACE_SAVE_REGS
>  #endif
> -#endif
>  
>  #ifndef __ASSEMBLY__
>  extern void mcount(void);
> diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
> index 5da11d1..ca5a146 100644
> --- a/arch/x86/kernel/entry_32.S
> +++ b/arch/x86/kernel/entry_32.S
> @@ -1123,6 +1123,7 @@ ftrace_call:
>  	popl %edx
>  	popl %ecx
>  	popl %eax
> +ftrace_ret:
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>  .globl ftrace_graph_call
>  ftrace_graph_call:
> @@ -1134,6 +1135,72 @@ ftrace_stub:
>  	ret
>  END(ftrace_caller)
>  
> +ENTRY(ftrace_regs_caller)
> +	pushf	/* push flags before compare (in cs location) */
> +	cmpl $0, function_trace_stop
> +	jne ftrace_restore_flags
> +
> +	/*
> +	 * i386 does not save SS and ESP when coming from kernel.
> +	 * Instead, to get sp, &regs->sp is used (see ptrace.h).
> +	 * Unfortunately, that means eflags must be at the same location
> +	 * as the current return ip is. We move the return ip into the
> +	 * ip location, and move flags into the return ip location.
> +	 */
> +	pushl 4(%esp)	/* save return ip into ip slot */
> +	subl $MCOUNT_INSN_SIZE, (%esp)	/* Adjust ip */
> +
> +	pushl $0	/* Load 0 into orig_ax */
> +	pushl %gs
> +	pushl %fs
> +	pushl %es
> +	pushl %ds
> +	pushl %eax
> +	pushl %ebp
> +	pushl %edi
> +	pushl %esi
> +	pushl %edx
> +	pushl %ecx
> +	pushl %ebx
> +
> +	movl 13*4(%esp), %eax	/* Get the saved flags */
> +	movl %eax, 14*4(%esp)	/* Move saved flags into regs->flags location */
> +				/* clobbering return ip */
> +	movl $__KERNEL_CS,13*4(%esp)
> +
> +	movl 12*4(%esp), %eax	/* Load ip (1st parameter) */
> +	movl 0x4(%ebp), %edx	/* Load parent ip (2nd parameter) */
> +	leal function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */
> +	pushl %esp		/* Save pt_regs as 4th parameter */
> +
> +GLOBAL(ftrace_regs_call)
> +	call ftrace_stub
> +
> +	addl $4, %esp		/* Skip pt_regs */
> +	movl 14*4(%esp), %eax	/* Move flags back into cs */
> +	movl %eax, 13*4(%esp)	/* Needed to keep addl from modifying flags */
> +	movl 12*4(%esp), %eax	/* Get return ip from regs->ip */
> +	addl $MCOUNT_INSN_SIZE, %eax
> +	movl %eax, 14*4(%esp)	/* Put return ip back for ret */
> +
> +	popl %ebx
> +	popl %ecx
> +	popl %edx
> +	popl %esi
> +	popl %edi
> +	popl %ebp
> +	popl %eax
> +	popl %ds
> +	popl %es
> +	popl %fs
> +	popl %gs
+	addl $8, %esp		/* Skip orig_ax and ip */
+	popf			/* Pop flags at end (no addl to corrupt flags) */

The above has been changed to this again.

-- Steve

> +	jmp ftrace_ret
> +
> +ftrace_restore_flags:
> +	popf
> +	jmp  ftrace_stub
>  #else /* ! CONFIG_DYNAMIC_FTRACE */
>  
>  ENTRY(mcount)
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index b90eb1a..1d41402 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -206,7 +206,6 @@ static int
>  ftrace_modify_code(unsigned long ip, unsigned const char *old_code,
>  		   unsigned const char *new_code);
>  
> -#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
>  /*
>   * Should never be called:
>   *  As it is only called by __ftrace_replace_code() which is called by
> @@ -221,7 +220,6 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
>  	WARN_ON(1);
>  	return -EINVAL;
>  }
> -#endif
>  
>  int ftrace_update_ftrace_func(ftrace_func_t func)
>  {
> @@ -237,7 +235,6 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
>  
>  	ret = ftrace_modify_code(ip, old, new);
>  
> -#ifdef ARCH_SUPPORTS_FTRACE_SAVE_REGS
>  	/* Also update the regs callback function */
>  	if (!ret) {
>  		ip = (unsigned long)(&ftrace_regs_call);
> @@ -245,7 +242,6 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
>  		new = ftrace_call_replace(ip, (unsigned long)func);
>  		ret = ftrace_modify_code(ip, old, new);
>  	}
> -#endif
>  
>  	atomic_dec(&modifying_ftrace_code);
>  
> 
> 




* Re: Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-19 12:52           ` Steven Rostedt
  2012-07-19 12:58             ` Steven Rostedt
@ 2012-07-19 18:24             ` Steven Rostedt
  2012-08-21 15:03             ` [tip:perf/core] ftrace/x86_32: Simplify parameter setup for ftrace_regs_caller tip-bot for Uros Bizjak
  2 siblings, 0 replies; 23+ messages in thread
From: Steven Rostedt @ 2012-07-19 18:24 UTC (permalink / raw)
  To: Masami Hiramatsu, Uros Bizjak
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Frederic Weisbecker, H. Peter Anvin, yrl.pp-manager.tt

On Thu, 2012-07-19 at 08:52 -0400, Steven Rostedt wrote:
> On Thu, 2012-07-19 at 11:20 +0900, Masami Hiramatsu wrote:
> > (2012/07/19 0:59), Steven Rostedt wrote:
> > > On Fri, 2012-07-13 at 14:47 -0400, Steven Rostedt wrote:
> > > 
> > > Masami, can you give your Reviewed-by tag for this version? Or is there
> > > something else needing to be fixed?
> > 
> > No, that is OK for me. I've just missed that...
> > 
> > Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
> > 
> > Thank you!
> > 
> 
> Thanks! Except I have one more version. Someone offlist gave me some
> ideas.
> 
> Here's the diff from the previous patch.
> 
> diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
> index 46caa56..ca5a146 100644
> --- a/arch/x86/kernel/entry_32.S
> +++ b/arch/x86/kernel/entry_32.S
> @@ -1169,10 +1169,9 @@ ENTRY(ftrace_regs_caller)
>  	movl $__KERNEL_CS,13*4(%esp)
>  
>  	movl 12*4(%esp), %eax	/* Load ip (1st parameter) */
> -	movl 0x4(%ebp), %edx	/* Load parent ip (2cd parameter) */
> -	lea  (%esp), %ecx
> -	pushl %ecx		/* Save pt_regs as 4th parameter */
> +	movl 0x4(%ebp), %edx	/* Load parent ip (2nd parameter) */
>  	leal function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */
> +	pushl %esp		/* Save pt_regs as 4th parameter */
>  
>  GLOBAL(ftrace_regs_call)

I'm adding this as a separate patch under Uros's name.

I'll be posting the full series for pulling either tonight or tomorrow,
depending on how it handles all my stress tests.

-- Steve




* Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-19 12:58             ` Steven Rostedt
@ 2012-07-19 22:53               ` H. Peter Anvin
  2012-07-19 23:04                 ` Steven Rostedt
  0 siblings, 1 reply; 23+ messages in thread
From: H. Peter Anvin @ 2012-07-19 22:53 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, linux-kernel, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Frederic Weisbecker, yrl.pp-manager.tt

On 07/19/2012 05:58 AM, Steven Rostedt wrote:
>>
>> also, because lea is faster than add (and doesn't even modify flags), I
>> changed the last part to use lea instead of addl.
> 
> Now I'm told that this is not always the case (at least not for Atom),
> so I reverted this part and put back the addl. But can you still give
> you reviewed by for the first part?
> 

lea is not typically faster than add, but in the case of Atom it is
done in an earlier pipeline stage (the AGU instead of the ALU), which
means lea is faster if its inputs are already available as address
expressions and its result is consumed by address expressions; the goal
is to avoid the ALU->AGU forwarding latency.

	-hpa




* Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-19 22:53               ` H. Peter Anvin
@ 2012-07-19 23:04                 ` Steven Rostedt
  2012-07-19 23:07                   ` H. Peter Anvin
  0 siblings, 1 reply; 23+ messages in thread
From: Steven Rostedt @ 2012-07-19 23:04 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Masami Hiramatsu, linux-kernel, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Frederic Weisbecker, yrl.pp-manager.tt

On Thu, 2012-07-19 at 15:53 -0700, H. Peter Anvin wrote:

> lea is not typically faster than add, but in the case of Atom, it is
> done in an earlier pipeline stage (AGU instead of ALU) which means lea
> is faster if its inputs are already available as address expressions and
> is consumed by address expressions; the goal is to avoid the ALU->AGU
> forwarding latency.

Well, the question is, which is faster:

	lea 8(%esp), %esp
	addl $8, %esp

Basically, all we want to do is add 8 to the stack pointer. And this is
for the x86_32 version of whatever hardware is in use.

-- Steve




* Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-19 23:04                 ` Steven Rostedt
@ 2012-07-19 23:07                   ` H. Peter Anvin
  2012-07-20  1:27                     ` Steven Rostedt
  0 siblings, 1 reply; 23+ messages in thread
From: H. Peter Anvin @ 2012-07-19 23:07 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, linux-kernel, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Frederic Weisbecker, yrl.pp-manager.tt

On 07/19/2012 04:04 PM, Steven Rostedt wrote:
> On Thu, 2012-07-19 at 15:53 -0700, H. Peter Anvin wrote:
> 
>> lea is not typically faster than add, but in the case of Atom, it is
>> done in an earlier pipeline stage (AGU instead of ALU) which means lea
>> is faster if its inputs are already available as address expressions and
>> is consumed by address expressions; the goal is to avoid the ALU->AGU
>> forwarding latency.
> 
> Well, the question is, which is faster:
> 
> 	lea 8(%esp), %esp
> 	addl $8, %esp
> 
> Basically, all we want to do is add 8 to the stack pointer. And this is
> for the x86_32 version of whatever hardware is in use.
> 

What I'm telling you is that it depends on the context.

An address expression needs to be ready in the AGU; a piece of data
comes from the ALU.  Whenever something moves from the ALU to the AGU,
there is a penalty.  There is no penalty to move from the AGU to the
ALU, since the ALU is in a later stage.

I *believe* the stack adjustments push/pop are done in the AGU, but I
have to double-check.
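
To make the forwarding rule concrete, here is a small GAS sketch (an
illustration of the point above, not code from this thread): the first
pair forwards an ALU result into address generation and pays the
ALU->AGU latency on Atom, while the second pair stays entirely within
the AGU.

	/* ALU -> AGU: the add executes in the ALU, so address generation
	 * for the following load must wait on ALU->AGU forwarding. */
	addl	$8, %esp
	movl	(%esp), %eax

	/* AGU -> AGU: lea computes its result in the AGU itself, and the
	 * consumer is again an address expression, so no forwarding penalty. */
	lea	8(%esp), %esp
	movl	(%esp), %eax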

	-hpa



* Re: [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls
  2012-07-19 23:07                   ` H. Peter Anvin
@ 2012-07-20  1:27                     ` Steven Rostedt
  0 siblings, 0 replies; 23+ messages in thread
From: Steven Rostedt @ 2012-07-20  1:27 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Masami Hiramatsu, linux-kernel, Ingo Molnar, Andrew Morton,
	Thomas Gleixner, Frederic Weisbecker, yrl.pp-manager.tt

On Thu, 2012-07-19 at 16:07 -0700, H. Peter Anvin wrote:
> On 07/19/2012 04:04 PM, Steven Rostedt wrote:
>  
> > Basically, all we want to do is add 8 to the stack pointer. And this is
> > for the x86_32 version of whatever hardware is in use.
> > 
> 
> What I'm telling you is that it depends on the context.

It doesn't really matter anyway. My questions are more academic than
practical, as I'm just sticking with addl. Quibbling over addl or lea is
pointless, as addl at least shows us what we want to actually do: add 8
to the stack pointer.

-- Steve




* [tip:perf/core] ftrace/x86_32: Simplify parameter setup for ftrace_regs_caller
  2012-07-19 12:52           ` Steven Rostedt
  2012-07-19 12:58             ` Steven Rostedt
  2012-07-19 18:24             ` Steven Rostedt
@ 2012-08-21 15:03             ` tip-bot for Uros Bizjak
  2 siblings, 0 replies; 23+ messages in thread
From: tip-bot for Uros Bizjak @ 2012-08-21 15:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, masami.hiramatsu.pt, rostedt, tglx, ubizjak

Commit-ID:  e4ea3f6b1bf3d489674a3660db652636e50186f9
Gitweb:     http://git.kernel.org/tip/e4ea3f6b1bf3d489674a3660db652636e50186f9
Author:     Uros Bizjak <ubizjak@gmail.com>
AuthorDate: Thu, 19 Jul 2012 13:04:47 -0400
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Tue, 31 Jul 2012 10:29:34 -0400

ftrace/x86_32: Simplify parameter setup for ftrace_regs_caller

The final position of the stack after saving regs and setting up
the parameters for ftrace_regs_call, is the position of the pt_regs
needed for the 4th parameter. Instead of saving it into a temporary
reg and pushing the reg, simply push the stack pointer.

Link: http://lkml.kernel.org/r/1342702344.12353.16.camel@gandalf.stny.rr.com

Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/entry_32.S |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 46caa56..4dc3017 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -1169,10 +1169,9 @@ ENTRY(ftrace_regs_caller)
 	movl $__KERNEL_CS,13*4(%esp)
 
 	movl 12*4(%esp), %eax	/* Load ip (1st parameter) */
-	movl 0x4(%ebp), %edx	/* Load parent ip (2cd parameter) */
-	lea  (%esp), %ecx
-	pushl %ecx		/* Save pt_regs as 4th parameter */
+	movl 0x4(%ebp), %edx	/* Load parent ip (2nd parameter) */
 	leal function_trace_op, %ecx /* Save ftrace_pos in 3rd parameter */
+	pushl %esp		/* Save pt_regs as 4th parameter */
 
 GLOBAL(ftrace_regs_call)
 	call ftrace_stub


* [tip:perf/core] ftrace/x86: Remove function_trace_stop check from graph caller
  2012-07-11 19:50 ` [RFC][PATCH 3/4 v4] ftrace/x86: Remove function_trace_stop check from graph caller Steven Rostedt
@ 2012-08-21 15:04   ` tip-bot for Steven Rostedt
  0 siblings, 0 replies; 23+ messages in thread
From: tip-bot for Steven Rostedt @ 2012-08-21 15:04 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, rostedt, srostedt, tglx

Commit-ID:  5767cfeaa9ec7b67c802143394f3ad9f8b174eb8
Gitweb:     http://git.kernel.org/tip/5767cfeaa9ec7b67c802143394f3ad9f8b174eb8
Author:     Steven Rostedt <srostedt@redhat.com>
AuthorDate: Tue, 3 Jul 2012 16:16:09 -0400
Committer:  Steven Rostedt <rostedt@goodmis.org>
CommitDate: Tue, 31 Jul 2012 10:29:51 -0400

ftrace/x86: Remove function_trace_stop check from graph caller

The graph caller is called by the mcount callers, which already do
the check against the function_trace_stop variable. No reason to
check it again.

Link: http://lkml.kernel.org/r/20120711195745.588538769@goodmis.org

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/kernel/entry_32.S |    3 ---
 arch/x86/kernel/entry_64.S |    3 ---
 2 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 4dc3017..061ac17 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -1241,9 +1241,6 @@ END(mcount)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 ENTRY(ftrace_graph_caller)
-	cmpl $0, function_trace_stop
-	jne ftrace_stub
-
 	pushl %eax
 	pushl %ecx
 	pushl %edx
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 52bda2e..38308fa 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -213,9 +213,6 @@ END(mcount)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 ENTRY(ftrace_graph_caller)
-	cmpl $0, function_trace_stop
-	jne ftrace_stub
-
 	MCOUNT_SAVE_FRAME
 
 	leaq 8(%rbp), %rdi


end of thread, other threads:[~2012-08-21 15:10 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-11 19:50 [RFC][PATCH 0/4 v4] ftrace/kprobes: Setting up ftrace for kprobes Steven Rostedt
2012-07-11 19:50 ` [RFC][PATCH 1/4 v4] ftrace/x86: Add separate function to save regs Steven Rostedt
2012-07-12 12:12   ` Masami Hiramatsu
2012-07-11 19:50 ` [RFC][PATCH 2/4 v4] ftrace/x86: Add save_regs for i386 function calls Steven Rostedt
2012-07-12 12:39   ` Masami Hiramatsu
2012-07-12 15:53     ` Steven Rostedt
2012-07-13 18:47     ` Steven Rostedt
2012-07-17  2:08       ` Masami Hiramatsu
2012-07-17  3:05         ` Steven Rostedt
2012-07-17  3:13           ` Masami Hiramatsu
2012-07-18 15:59       ` Steven Rostedt
2012-07-19  2:20         ` Masami Hiramatsu
2012-07-19 12:52           ` Steven Rostedt
2012-07-19 12:58             ` Steven Rostedt
2012-07-19 22:53               ` H. Peter Anvin
2012-07-19 23:04                 ` Steven Rostedt
2012-07-19 23:07                   ` H. Peter Anvin
2012-07-20  1:27                     ` Steven Rostedt
2012-07-19 18:24             ` Steven Rostedt
2012-08-21 15:03             ` [tip:perf/core] ftrace/x86_32: Simplify parameter setup for ftrace_regs_caller tip-bot for Uros Bizjak
2012-07-11 19:50 ` [RFC][PATCH 3/4 v4] ftrace/x86: Remove function_trace_stop check from graph caller Steven Rostedt
2012-08-21 15:04   ` [tip:perf/core] " tip-bot for Steven Rostedt
2012-07-11 19:50 ` [RFC][PATCH 4/4 v4] ftrace/x86_64: Add recursion protection inside mcount caller Steven Rostedt
