* [PATCH v7 00/17] ftrace for MIPS
@ 2009-11-09 15:29 Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 01/17] tracing: convert trace_clock_local() as weak function Wu Zhangjin
                   ` (16 more replies)
  0 siblings, 17 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:29 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

This patchset incorporates feedback from Steven, Frederic, David, Richard,
and Michal.

The main changes from the v6 version:

 o tracing: add IRQENTRY_EXIT section for MIPS: keep the return value of
 do_IRQ_* void, as in the original version.

 o tracing: define a new __time_notrace annotation flag: replace the old
 __arch_notrace with __time_notrace to keep things simpler.

 o tracing: make ftrace for MIPS work without -fno-omit-frame-pointer: prepare
 for a future -fomit-frame-pointer; it seems we do not need to enable
 -fno-omit-frame-pointer together with -pg.

 o tracing: make ftrace for MIPS more robust: add safe load/store
 operations for MIPS.
 
 o tracing: add function graph tracer support for MIPS: make the search for
 the location of the return address (ra) clearer.

 o tracing: reserve $12(t0) for mcount-ra-address of gcc 4.5
   tracing: make function graph tracer work with -mmcount-ra-address
 
All of these updates have been pushed to:

git://dev.lemote.com/rt4ls.git  linux-mips/dev/ftrace-upstream

All of this, in its various combinations, has been tested with gcc 4.4 & 4.5
on a YeeLoong netbook.

Welcome to play with it and give more feedback ;)

Thanks & Regards,
       Wu Zhangjin

Wu Zhangjin (17):
  tracing: convert trace_clock_local() as weak function
  tracing: add mips_timecounter_read() for MIPS
  tracing: add MIPS specific trace_clock_local()
  tracing: add static function tracer support for MIPS
  tracing: enable HAVE_FUNCTION_TRACE_MCOUNT_TEST for MIPS
  tracing: add an endian argument to scripts/recordmcount.pl
  tracing: add dynamic function tracer support for MIPS
  tracing: add IRQENTRY_EXIT section for MIPS
  tracing: define a new __time_notrace annotation flag
  tracing: not trace the timecounter_read* in kernel/time/clocksource.c
  tracing: not trace mips_timecounter_read() for MIPS
  tracing: add function graph tracer support for MIPS
  tracing: add dynamic function graph tracer for MIPS
  tracing: make ftrace for MIPS work without -fno-omit-frame-pointer
  tracing: make ftrace for MIPS more robust
  tracing: reserve $12(t0) for mcount-ra-address of gcc 4.5
  tracing: make function graph tracer work with -mmcount-ra-address

 arch/mips/Kconfig              |    5 +
 arch/mips/Makefile             |    7 +
 arch/mips/include/asm/ftrace.h |   97 ++++++++++++++++-
 arch/mips/include/asm/irq.h    |   29 +-----
 arch/mips/include/asm/time.h   |   20 ++++
 arch/mips/kernel/Makefile      |    9 ++
 arch/mips/kernel/csrc-r4k.c    |   42 +++++++
 arch/mips/kernel/ftrace.c      |  240 ++++++++++++++++++++++++++++++++++++++++
 arch/mips/kernel/irq.c         |   30 +++++
 arch/mips/kernel/mcount.S      |  191 ++++++++++++++++++++++++++++++++
 arch/mips/kernel/mips_ksyms.c  |    5 +
 arch/mips/kernel/smp.c         |    3 +-
 arch/mips/kernel/smtc.c        |   21 +++-
 arch/mips/kernel/time.c        |    2 +
 arch/mips/kernel/trace_clock.c |   33 ++++++
 arch/mips/kernel/vmlinux.lds.S |    1 +
 arch/mips/sgi-ip22/ip22-int.c  |    3 +-
 arch/mips/sgi-ip22/ip22-time.c |    3 +-
 include/linux/ftrace.h         |   12 ++
 kernel/time/clocksource.c      |    5 +-
 kernel/trace/trace_clock.c     |    2 +-
 scripts/Makefile.build         |    1 +
 scripts/recordmcount.pl        |   47 ++++++++-
 23 files changed, 764 insertions(+), 44 deletions(-)
 create mode 100644 arch/mips/kernel/ftrace.c
 create mode 100644 arch/mips/kernel/mcount.S
 create mode 100644 arch/mips/kernel/trace_clock.c


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v7 01/17] tracing: convert trace_clock_local() as weak function
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 02/17] tracing: add mips_timecounter_read() for MIPS Wu Zhangjin
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

trace_clock_local() is based on the arch-specific sched_clock(). On
x86 it is TSC-based (64-bit), which gives very high precision (about
1 ns at 1 GHz). On MIPS, however, sched_clock() is jiffies-based,
which gives only 10 ms precision at HZ=1000. That is not enough for
tracing, especially for real-time systems.

So we need a MIPS-specific sched_clock() to get higher precision.
MIPS has a TSC-like clock counter register whose frequency is half
the processor frequency, so with an 800 MHz CPU the time precision
reaches 2.5 ns, which is very good for tracing, even for real-time
systems.

But because that counter is only 32 bits wide, it rolls over quickly,
so a full sched_clock() built on it would put extra load on the whole
system. Therefore we only implement an arch-specific
trace_clock_local() for tracing. As a preparation, this patch
converts trace_clock_local() into a weak function.

The MIPS-specific trace_clock_local() comes in the next two patches.
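The weak-function mechanism this conversion relies on can be sketched
in plain C (a hedged illustration only: the value 42 is a stand-in,
the kernel's generic body actually reads a sched_clock()-derived
timestamp):

```c
#include <assert.h>

typedef unsigned long long u64;

/* Generic fallback, marked weak: if any other object file in the
 * final link defines a strong trace_clock_local(), the linker picks
 * that one and silently drops this body; otherwise this fallback is
 * used. This is how an arch overrides a generic kernel function
 * without any #ifdef in the callers. */
__attribute__((weak)) u64 trace_clock_local(void)
{
	return 42;	/* stand-in timestamp for this demo */
}
```

Since no strong definition is linked in this sketch, the weak body is
the one called; a file like arch/mips/kernel/trace_clock.c providing
a strong trace_clock_local() would transparently replace it.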

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
---
 kernel/trace/trace_clock.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/trace/trace_clock.c b/kernel/trace/trace_clock.c
index 20c5f92..06b8cd2 100644
--- a/kernel/trace/trace_clock.c
+++ b/kernel/trace/trace_clock.c
@@ -26,7 +26,7 @@
  * Useful for tracing that does not cross to other CPUs nor
  * does it go through idle events.
  */
-u64 notrace trace_clock_local(void)
+u64 __weak notrace trace_clock_local(void)
 {
 	unsigned long flags;
 	u64 clock;
-- 
1.6.2.1



* [PATCH v7 02/17] tracing: add mips_timecounter_read() for MIPS
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 01/17] tracing: convert trace_clock_local() as weak function Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 03/17] tracing: add MIPS specific trace_clock_local() Wu Zhangjin
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

This patch adds a new function, mips_timecounter_read(), to get a
high-precision timestamp without taking extra locks.

It is based on the clock counter register and the kernel's
timecounter/cyclecounter abstraction layer.
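The rollover-safe accumulation that the timecounter/cyclecounter
layer performs can be sketched as follows (a toy model, not the
kernel API; the mult/shift values used below are illustrative):

```c
#include <stdint.h>

/* Toy model of the timecounter idea: a 32-bit free-running cycle
 * counter is folded into a monotonic 64-bit nanosecond count. The
 * delta is computed in 32-bit arithmetic, so a counter rollover
 * between two sufficiently close reads is handled for free. */
struct toy_timecounter {
	uint64_t nsec;        /* nanoseconds accumulated so far */
	uint32_t cycle_last;  /* counter value at the previous read */
	uint32_t mult;        /* cycles -> ns multiplier, fixed point */
	uint32_t shift;       /* fixed-point shift for mult */
};

static uint64_t toy_tc_read(struct toy_timecounter *tc, uint32_t now)
{
	uint32_t delta = now - tc->cycle_last;  /* wraps modulo 2^32 */

	tc->cycle_last = now;
	tc->nsec += ((uint64_t)delta * tc->mult) >> tc->shift;
	return tc->nsec;
}
```

With a 400 MHz counter (half of an 800 MHz CPU), mult =
(10^9 << 30) / (400 * 10^6) = 2684354560 at shift = 30 gives exactly
2.5 ns per cycle, rollover included.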

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
---
 arch/mips/include/asm/time.h |   20 ++++++++++++++++++++
 arch/mips/kernel/csrc-r4k.c  |   41 +++++++++++++++++++++++++++++++++++++++++
 arch/mips/kernel/time.c      |    2 ++
 3 files changed, 63 insertions(+), 0 deletions(-)

diff --git a/arch/mips/include/asm/time.h b/arch/mips/include/asm/time.h
index df6a430..3fcc481 100644
--- a/arch/mips/include/asm/time.h
+++ b/arch/mips/include/asm/time.h
@@ -15,6 +15,7 @@
 #define _ASM_TIME_H
 
 #include <linux/rtc.h>
+#include <linux/sched.h>
 #include <linux/spinlock.h>
 #include <linux/clockchips.h>
 #include <linux/clocksource.h>
@@ -73,8 +74,18 @@ static inline int mips_clockevent_init(void)
  */
 #ifdef CONFIG_CSRC_R4K_LIB
 extern int init_r4k_clocksource(void);
+extern int init_r4k_timecounter(void);
+extern u64 r4k_timecounter_read(void);
 #endif
 
+static inline u64 mips_timecounter_read(void)
+{
+#ifdef CONFIG_CSRC_R4K
+	return r4k_timecounter_read();
+#else
+	return sched_clock();
+#endif
+}
 static inline int init_mips_clocksource(void)
 {
 #ifdef CONFIG_CSRC_R4K
@@ -84,6 +95,15 @@ static inline int init_mips_clocksource(void)
 #endif
 }
 
+static inline int init_mips_timecounter(void)
+{
+#ifdef CONFIG_CSRC_R4K
+	return init_r4k_timecounter();
+#else
+	return 0;
+#endif
+}
+
 extern void clocksource_set_clock(struct clocksource *cs, unsigned int clock);
 extern void clockevent_set_clock(struct clock_event_device *cd,
 		unsigned int clock);
diff --git a/arch/mips/kernel/csrc-r4k.c b/arch/mips/kernel/csrc-r4k.c
index e95a3cd..4e7705f 100644
--- a/arch/mips/kernel/csrc-r4k.c
+++ b/arch/mips/kernel/csrc-r4k.c
@@ -7,6 +7,7 @@
  */
 #include <linux/clocksource.h>
 #include <linux/init.h>
+#include <linux/sched.h>
 
 #include <asm/time.h>
 
@@ -36,3 +37,43 @@ int __init init_r4k_clocksource(void)
 
 	return 0;
 }
+
+static struct timecounter r4k_tc = {
+	.cc = NULL,
+};
+
+static cycle_t r4k_cc_read(const struct cyclecounter *cc)
+{
+	return read_c0_count();
+}
+
+static struct cyclecounter r4k_cc = {
+	.read = r4k_cc_read,
+	.mask = CLOCKSOURCE_MASK(32),
+	.shift = 32,
+};
+
+int __init init_r4k_timecounter(void)
+{
+	if (!cpu_has_counter || !mips_hpt_frequency)
+		return -ENXIO;
+
+	r4k_cc.mult = div_sc((unsigned long)mips_hpt_frequency, NSEC_PER_SEC,
+			32);
+
+	timecounter_init(&r4k_tc, &r4k_cc, sched_clock());
+
+	return 0;
+}
+
+u64 r4k_timecounter_read(void)
+{
+	u64 clock;
+
+	if (r4k_tc.cc != NULL)
+		clock = timecounter_read(&r4k_tc);
+	else
+		clock = sched_clock();
+
+	return clock;
+}
diff --git a/arch/mips/kernel/time.c b/arch/mips/kernel/time.c
index 1f467d5..e38cca1 100644
--- a/arch/mips/kernel/time.c
+++ b/arch/mips/kernel/time.c
@@ -156,4 +156,6 @@ void __init time_init(void)
 
 	if (!mips_clockevent_init() || !cpu_has_mfc0_count_bug())
 		init_mips_clocksource();
+	if (!cpu_has_mfc0_count_bug())
+		init_mips_timecounter();
 }
-- 
1.6.2.1



* [PATCH v7 03/17] tracing: add MIPS specific trace_clock_local()
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 01/17] tracing: convert trace_clock_local() as weak function Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 02/17] tracing: add mips_timecounter_read() for MIPS Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-10 17:48   ` David Daney
  2009-11-09 15:31 ` [PATCH v7 04/17] tracing: add static function tracer support for MIPS Wu Zhangjin
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

This patch adds a high-precision version of trace_clock_local() based
on mips_timecounter_read().

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
---
 arch/mips/kernel/Makefile      |    6 ++++++
 arch/mips/kernel/trace_clock.c |   33 +++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+), 0 deletions(-)
 create mode 100644 arch/mips/kernel/trace_clock.c

diff --git a/arch/mips/kernel/Makefile b/arch/mips/kernel/Makefile
index 8e26e9c..f228868 100644
--- a/arch/mips/kernel/Makefile
+++ b/arch/mips/kernel/Makefile
@@ -8,6 +8,10 @@ obj-y		+= cpu-probe.o branch.o entry.o genex.o irq.o process.o \
 		   ptrace.o reset.o setup.o signal.o syscall.o \
 		   time.o topology.o traps.o unaligned.o watch.o
 
+ifdef CONFIG_FUNCTION_TRACER
+CFLAGS_REMOVE_trace_clock.o = -pg
+endif
+
 obj-$(CONFIG_CEVT_BCM1480)	+= cevt-bcm1480.o
 obj-$(CONFIG_CEVT_R4K_LIB)	+= cevt-r4k.o
 obj-$(CONFIG_MIPS_MT_SMTC)	+= cevt-smtc.o
@@ -24,6 +28,8 @@ obj-$(CONFIG_SYNC_R4K)		+= sync-r4k.o
 obj-$(CONFIG_STACKTRACE)	+= stacktrace.o
 obj-$(CONFIG_MODULES)		+= mips_ksyms.o module.o
 
+obj-$(CONFIG_FTRACE)		+= trace_clock.o
+
 obj-$(CONFIG_CPU_LOONGSON2)	+= r4k_fpu.o r4k_switch.o
 obj-$(CONFIG_CPU_MIPS32)	+= r4k_fpu.o r4k_switch.o
 obj-$(CONFIG_CPU_MIPS64)	+= r4k_fpu.o r4k_switch.o
diff --git a/arch/mips/kernel/trace_clock.c b/arch/mips/kernel/trace_clock.c
new file mode 100644
index 0000000..0fe85ea
--- /dev/null
+++ b/arch/mips/kernel/trace_clock.c
@@ -0,0 +1,33 @@
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive for
+ * more details.
+ *
+ * Copyright (C) 2009 Lemote Inc. & DSLab, Lanzhou University, China
+ * Author: Wu Zhangjin <wuzj@lemote.com>
+ */
+
+#include <linux/clocksource.h>
+#include <linux/sched.h>
+
+#include <asm/time.h>
+
+/*
+ * trace_clock_local(): the simplest and least coherent tracing clock.
+ *
+ * Useful for tracing that does not cross to other CPUs nor
+ * does it go through idle events.
+ */
+u64 trace_clock_local(void)
+{
+	unsigned long flags;
+	u64 clock;
+
+	raw_local_irq_save(flags);
+
+	clock = mips_timecounter_read();
+
+	raw_local_irq_restore(flags);
+
+	return clock;
+}
-- 
1.6.2.1



* [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
                   ` (2 preceding siblings ...)
  2009-11-09 15:31 ` [PATCH v7 03/17] tracing: add MIPS specific trace_clock_local() Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-10  0:26   ` David Daney
  2009-11-12 11:06   ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 05/17] tracing: enable HAVE_FUNCTION_TRACE_MCOUNT_TEST " Wu Zhangjin
                   ` (12 subsequent siblings)
  16 siblings, 2 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

If gcc's -pg option is enabled together with CONFIG_FUNCTION_TRACER=y,
a call to _mcount is inserted into each kernel function, which makes
it possible to trace kernel functions from _mcount.

This patch adds the MIPS-specific _mcount support for static function
tracing. By default, ftrace_trace_function is initialized to
ftrace_stub (an empty function), so the default _mcount introduces
very little overhead. After tracing is enabled from user space, it
jumps to a real tracing function and does static function tracing for
us.

To support module tracing, we need to enable -mlong-calls for the
long call from module space to kernel space. With -mlong-calls, the
address of _mcount is loaded into a register and then jumped to, so
the address may be a full 32 bits; without -mlong-calls, the
instruction "jal _mcount" leaves only 26 bits for the address of
_mcount, which is not enough to jump from module space to kernel
space.
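The 26-bit limit can be sketched numerically (illustrative addresses;
jal stores target >> 2 in its 26-bit field and inherits the top 4
bits of the PC, giving a 256 MB reach):

```c
#include <stdint.h>

/* Whether "jal target" can encode a call from pc to target: the jump
 * keeps the upper 4 bits of the address of the delay slot and takes
 * the remaining 28 bits (26-bit field << 2) from the instruction,
 * so caller and callee must lie in the same 256 MB segment. */
static int jal_can_reach(uint32_t pc, uint32_t target)
{
	return ((pc + 4) & 0xF0000000u) == (target & 0xF0000000u);
}
```

Kernel text can jal within its own 256 MB segment, but a module
loaded in a different segment cannot jal back into it, hence
-mlong-calls.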

Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
---
 arch/mips/Kconfig              |    1 +
 arch/mips/Makefile             |    4 ++
 arch/mips/include/asm/ftrace.h |   25 +++++++++++-
 arch/mips/kernel/Makefile      |    2 +
 arch/mips/kernel/mcount.S      |   83 ++++++++++++++++++++++++++++++++++++++++
 arch/mips/kernel/mips_ksyms.c  |    5 ++
 6 files changed, 119 insertions(+), 1 deletions(-)
 create mode 100644 arch/mips/kernel/mcount.S

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 03bd56a..6b33e88 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -4,6 +4,7 @@ config MIPS
 	select HAVE_IDE
 	select HAVE_OPROFILE
 	select HAVE_ARCH_KGDB
+	select HAVE_FUNCTION_TRACER
 	# Horrible source of confusion.  Die, die, die ...
 	select EMBEDDED
 	select RTC_LIB if !LEMOTE_FULOONG2E
diff --git a/arch/mips/Makefile b/arch/mips/Makefile
index 77f5021..041deca 100644
--- a/arch/mips/Makefile
+++ b/arch/mips/Makefile
@@ -48,7 +48,11 @@ ifneq ($(SUBARCH),$(ARCH))
   endif
 endif
 
+ifndef CONFIG_FUNCTION_TRACER
 cflags-y := -ffunction-sections
+else
+cflags-y := -mlong-calls
+endif
 cflags-y += $(call cc-option, -mno-check-zero-division)
 
 ifdef CONFIG_32BIT
diff --git a/arch/mips/include/asm/ftrace.h b/arch/mips/include/asm/ftrace.h
index 40a8c17..5f8ebcf 100644
--- a/arch/mips/include/asm/ftrace.h
+++ b/arch/mips/include/asm/ftrace.h
@@ -1 +1,24 @@
-/* empty */
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive for
+ * more details.
+ *
+ * Copyright (C) 2009 DSLab, Lanzhou University, China
+ * Author: Wu Zhangjin <wuzj@lemote.com>
+ */
+
+#ifndef _ASM_MIPS_FTRACE_H
+#define _ASM_MIPS_FTRACE_H
+
+#ifdef CONFIG_FUNCTION_TRACER
+
+#define MCOUNT_ADDR ((unsigned long)(_mcount))
+#define MCOUNT_INSN_SIZE 4		/* sizeof mcount call */
+
+#ifndef __ASSEMBLY__
+extern void _mcount(void);
+#define mcount _mcount
+
+#endif /* __ASSEMBLY__ */
+#endif /* CONFIG_FUNCTION_TRACER */
+#endif /* _ASM_MIPS_FTRACE_H */
diff --git a/arch/mips/kernel/Makefile b/arch/mips/kernel/Makefile
index f228868..38e704a 100644
--- a/arch/mips/kernel/Makefile
+++ b/arch/mips/kernel/Makefile
@@ -10,6 +10,7 @@ obj-y		+= cpu-probe.o branch.o entry.o genex.o irq.o process.o \
 
 ifdef CONFIG_FUNCTION_TRACER
 CFLAGS_REMOVE_trace_clock.o = -pg
+CFLAGS_REMOVE_early_printk.o = -pg
 endif
 
 obj-$(CONFIG_CEVT_BCM1480)	+= cevt-bcm1480.o
@@ -29,6 +30,7 @@ obj-$(CONFIG_STACKTRACE)	+= stacktrace.o
 obj-$(CONFIG_MODULES)		+= mips_ksyms.o module.o
 
 obj-$(CONFIG_FTRACE)		+= trace_clock.o
+obj-$(CONFIG_FUNCTION_TRACER)	+= mcount.o
 
 obj-$(CONFIG_CPU_LOONGSON2)	+= r4k_fpu.o r4k_switch.o
 obj-$(CONFIG_CPU_MIPS32)	+= r4k_fpu.o r4k_switch.o
diff --git a/arch/mips/kernel/mcount.S b/arch/mips/kernel/mcount.S
new file mode 100644
index 0000000..cebcc3c
--- /dev/null
+++ b/arch/mips/kernel/mcount.S
@@ -0,0 +1,83 @@
+/*
+ * MIPS specific _mcount support
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive for
+ * more details.
+ *
+ * Copyright (C) 2009 Lemote Inc. & DSLab, Lanzhou University, China
+ * Author: Wu Zhangjin <wuzj@lemote.com>
+ */
+
+#include <asm/regdef.h>
+#include <asm/stackframe.h>
+#include <asm/ftrace.h>
+
+	.text
+	.set noreorder
+	.set noat
+
+	.macro MCOUNT_SAVE_REGS
+	PTR_SUBU	sp, PT_SIZE
+	PTR_S	ra, PT_R31(sp)
+	PTR_S	AT, PT_R1(sp)
+	PTR_S	a0, PT_R4(sp)
+	PTR_S	a1, PT_R5(sp)
+	PTR_S	a2, PT_R6(sp)
+	PTR_S	a3, PT_R7(sp)
+#ifdef CONFIG_64BIT
+	PTR_S	a4, PT_R8(sp)
+	PTR_S	a5, PT_R9(sp)
+	PTR_S	a6, PT_R10(sp)
+	PTR_S	a7, PT_R11(sp)
+#endif
+	.endm
+
+	.macro MCOUNT_RESTORE_REGS
+	PTR_L	ra, PT_R31(sp)
+	PTR_L	AT, PT_R1(sp)
+	PTR_L	a0, PT_R4(sp)
+	PTR_L	a1, PT_R5(sp)
+	PTR_L	a2, PT_R6(sp)
+	PTR_L	a3, PT_R7(sp)
+#ifdef CONFIG_64BIT
+	PTR_L	a4, PT_R8(sp)
+	PTR_L	a5, PT_R9(sp)
+	PTR_L	a6, PT_R10(sp)
+	PTR_L	a7, PT_R11(sp)
+#endif
+#ifdef CONFIG_64BIT
+	PTR_ADDIU	sp, PT_SIZE
+#else
+	PTR_ADDIU	sp, (PT_SIZE + 8)
+#endif
+.endm
+
+	.macro RETURN_BACK
+	jr ra
+	 move ra, AT
+	.endm
+
+NESTED(_mcount, PT_SIZE, ra)
+	PTR_LA	t0, ftrace_stub
+	PTR_L	t1, ftrace_trace_function /* Prepare t1 for (1) */
+	bne	t0, t1, static_trace
+	 nop
+	b	ftrace_stub
+	 nop
+
+static_trace:
+	MCOUNT_SAVE_REGS
+
+	move	a0, ra		/* arg1: next ip, selfaddr */
+	jalr	t1	/* (1) call *ftrace_trace_function */
+	 move	a1, AT		/* arg2: the caller's next ip, parent */
+
+	MCOUNT_RESTORE_REGS
+	.globl ftrace_stub
+ftrace_stub:
+	RETURN_BACK
+	END(_mcount)
+
+	.set at
+	.set reorder
diff --git a/arch/mips/kernel/mips_ksyms.c b/arch/mips/kernel/mips_ksyms.c
index 225755d..1d04807 100644
--- a/arch/mips/kernel/mips_ksyms.c
+++ b/arch/mips/kernel/mips_ksyms.c
@@ -13,6 +13,7 @@
 #include <asm/checksum.h>
 #include <asm/pgtable.h>
 #include <asm/uaccess.h>
+#include <asm/ftrace.h>
 
 extern void *__bzero(void *__s, size_t __count);
 extern long __strncpy_from_user_nocheck_asm(char *__to,
@@ -51,3 +52,7 @@ EXPORT_SYMBOL(csum_partial_copy_nocheck);
 EXPORT_SYMBOL(__csum_partial_copy_user);
 
 EXPORT_SYMBOL(invalid_pte_table);
+#ifdef CONFIG_FUNCTION_TRACER
+/* _mcount is defined in arch/mips/kernel/mcount.S */
+EXPORT_SYMBOL(_mcount);
+#endif
-- 
1.6.2.1



* [PATCH v7 05/17] tracing: enable HAVE_FUNCTION_TRACE_MCOUNT_TEST for MIPS
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
                   ` (3 preceding siblings ...)
  2009-11-09 15:31 ` [PATCH v7 04/17] tracing: add static function tracer support for MIPS Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 06/17] tracing: add an endian argument to scripts/recordmcount.pl Wu Zhangjin
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

There is an existing common ftrace_test_stop_func() in
kernel/trace/ftrace.c, which checks the global variable
function_trace_stop to determine whether to stop function tracing.

This patch implements the MIPS-specific check to speed up that test.

Thanks go to Zhang Le for cleaning it up.

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
---
 arch/mips/Kconfig         |    1 +
 arch/mips/kernel/mcount.S |    3 +++
 2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 6b33e88..a9bd992 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -5,6 +5,7 @@ config MIPS
 	select HAVE_OPROFILE
 	select HAVE_ARCH_KGDB
 	select HAVE_FUNCTION_TRACER
+	select HAVE_FUNCTION_TRACE_MCOUNT_TEST
 	# Horrible source of confusion.  Die, die, die ...
 	select EMBEDDED
 	select RTC_LIB if !LEMOTE_FULOONG2E
diff --git a/arch/mips/kernel/mcount.S b/arch/mips/kernel/mcount.S
index cebcc3c..cbb45ed 100644
--- a/arch/mips/kernel/mcount.S
+++ b/arch/mips/kernel/mcount.S
@@ -59,6 +59,9 @@
 	.endm
 
 NESTED(_mcount, PT_SIZE, ra)
+	lw	t0, function_trace_stop
+	bnez	t0, ftrace_stub
+	 nop
 	PTR_LA	t0, ftrace_stub
 	PTR_L	t1, ftrace_trace_function /* Prepare t1 for (1) */
 	bne	t0, t1, static_trace
-- 
1.6.2.1



* [PATCH v7 06/17] tracing: add an endian argument to scripts/recordmcount.pl
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
                   ` (4 preceding siblings ...)
  2009-11-09 15:31 ` [PATCH v7 05/17] tracing: enable HAVE_FUNCTION_TRACE_MCOUNT_TEST " Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 07/17] tracing: add dynamic function tracer support for MIPS Wu Zhangjin
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek,
	Wu Zhangjin

From: Wu Zhangjin <wuzhangjin@gmail.com>

MIPS and some other architectures need this argument to handle big-
and little-endian object files respectively.
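A hedged C sketch of why any tool reading raw instruction words from
an object file must know the target's byte order (illustrative only;
recordmcount.pl itself stays in Perl):

```c
#include <stdint.h>

/* The same 32-bit MIPS instruction (here "jal", opcode 0x0c in the
 * top bits) is stored with opposite byte order in big- and
 * little-endian object files, so raw bytes must be decoded according
 * to the target's endianness before any opcode matching can work. */
static uint32_t read_insn(const unsigned char *p, int big_endian)
{
	if (big_endian)
		return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
	return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```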

Signed-off-by: Wu Zhangjin <wuzj@lemote.com>
---
 scripts/Makefile.build  |    1 +
 scripts/recordmcount.pl |    6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 341b589..0b94d2f 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -207,6 +207,7 @@ endif
 
 ifdef CONFIG_FTRACE_MCOUNT_RECORD
 cmd_record_mcount = set -e ; perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
+	"$(if $(CONFIG_CPU_BIG_ENDIAN),big,little)" \
 	"$(if $(CONFIG_64BIT),64,32)" \
 	"$(OBJDUMP)" "$(OBJCOPY)" "$(CC)" "$(LD)" "$(NM)" "$(RM)" "$(MV)" \
 	"$(if $(part-of-module),1,0)" "$(@)";
diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
index a4e2435..62a0001 100755
--- a/scripts/recordmcount.pl
+++ b/scripts/recordmcount.pl
@@ -113,13 +113,13 @@ $P =~ s@.*/@@g;
 
 my $V = '0.1';
 
-if ($#ARGV != 10) {
-	print "usage: $P arch bits objdump objcopy cc ld nm rm mv is_module inputfile\n";
+if ($#ARGV != 11) {
+	print "usage: $P arch endian bits objdump objcopy cc ld nm rm mv is_module inputfile\n";
 	print "version: $V\n";
 	exit(1);
 }
 
-my ($arch, $bits, $objdump, $objcopy, $cc,
+my ($arch, $endian, $bits, $objdump, $objcopy, $cc,
     $ld, $nm, $rm, $mv, $is_module, $inputfile) = @ARGV;
 
 # This file refers to mcount and shouldn't be ftraced, so lets' ignore it
-- 
1.6.2.1



* [PATCH v7 07/17] tracing: add dynamic function tracer support for MIPS
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
                   ` (5 preceding siblings ...)
  2009-11-09 15:31 ` [PATCH v7 06/17] tracing: add an endian argument to scripts/recordmcount.pl Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 08/17] tracing: add IRQENTRY_EXIT section " Wu Zhangjin
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek,
	Wu Zhangjin

From: Wu Zhangjin <wuzhangjin@gmail.com>

With the dynamic function tracer, _mcount is by default defined as an
"empty" function: it just returns without any further action, like
the static function tracer's stub. When tracing is enabled from user
space, it jumps to a real tracing function (ftrace_caller) and does
the real work for us.

Unlike the static function tracer, the dynamic function tracer
provides two functions, ftrace_make_call()/ftrace_make_nop(), to
enable/disable tracing of selected kernel functions
(set_ftrace_filter).

In the v4 version, the implementation was basically the same as on
x86: _mcount was implemented as an empty function and ftrace_caller
as the real tracing function.

But in this version, to support module tracing with the help of
-mlong-calls, things become a little more complex. Let's look at the
new call to _mcount with -mlong-calls (output of "objdump -hdr
vmlinux"):

	c:	3c030000 	lui	v1,0x0
			c: R_MIPS_HI16	_mcount
			c: R_MIPS_NONE	*ABS*
			c: R_MIPS_NONE	*ABS*
	10:	64630000 	daddiu	v1,v1,0
			10: R_MIPS_LO16	_mcount
			10: R_MIPS_NONE	*ABS*
			10: R_MIPS_NONE	*ABS*
	14:	03e0082d 	move	at,ra
	18:	0060f809 	jalr	v1

The old version, without -mlong-calls, looks like this:

	108:   03e0082d        move    at,ra
	10c:   0c000000        jal     0 <fpcsr_pending>
                        10c: R_MIPS_26  _mcount
                        10c: R_MIPS_NONE        *ABS*
                        10c: R_MIPS_NONE        *ABS*
	110:   00020021        nop

In the old version there is only one "_mcount" relocation per kernel
function, so we just need to match it in mcount_regex of
scripts/recordmcount.pl; in the new version we must choose one of the
two to match. Here I choose the first one, "R_MIPS_HI16 _mcount".

And in the old version, without module tracing support, we just
replaced "jal _mcount" with "jal ftrace_caller" to do the real
tracing, and filtered out the tracing of selected kernel functions by
replacing the call with a nop instruction.

But as described above, the instruction "jal ftrace_caller" leaves
only 26 bits for the address of ftrace_caller, which fails when
calling from module space. So we must replace something else.

The basic idea is to load the address of ftrace_caller into v1 by
changing these two instructions:

	lui	v1, 0x0
	addiu	v1, v1, 0

To enable tracing, we would replace them with:

	lui	v1, HI_16BIT_ftrace_caller
	addiu	v1, v1, LOW_16BIT_ftrace_caller

To stop tracing of selected kernel functions, we would just replace
the "jalr v1" with a nop instruction. But enabling requires replacing
two instructions and encoding the two instructions above ourselves.
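A sketch of what encoding that pair by hand would involve
(illustrative 32-bit encodings; the 64-bit dump above uses daddiu
instead of addiu, and addiu sign-extends its 16-bit immediate, so the
high half needs a carry when bit 15 of the low half is set, which is
part of why the patch avoids this route):

```c
#include <stdint.h>

#define LUI_V1   0x3c030000u	/* lui   v1, imm16     */
#define ADDIU_V1 0x24630000u	/* addiu v1, v1, imm16 */

/* Encode "load 32-bit addr into v1" as a lui/addiu pair. Because
 * addiu sign-extends its immediate, the high half must absorb a
 * carry whenever bit 15 of the low half is set. */
static void encode_load_v1(uint32_t addr, uint32_t insn[2])
{
	uint32_t lo = addr & 0xffffu;
	uint32_t hi = ((addr >> 16) + (lo >> 15)) & 0xffffu;

	insn[0] = LUI_V1 | hi;
	insn[1] = ADDIU_V1 | lo;
}
```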

Is there a simpler solution? Yes! In this version, we put _mcount and
ftrace_caller together, so that the address of _mcount and the
address of ftrace_caller are the same:

_mcount:
ftrace_caller:
	j	ftrace_stub
	 nop

	...(do real tracing here)...

ftrace_stub:
	jr	ra
	 move	ra, at

By default, kernel functions call _mcount, jump to ftrace_stub, and
return. When we want to do real tracing, we just replace that "j
ftrace_stub"; execution then runs through the two "nop" instructions
and does the real tracing job.

What about the filtering job? We just need to do this:

	 lui v1, hi_16bit_of_mcount        <--> b 1f (0x10000004)
	 addiu v1, v1, low_16bit_of_mcount
	 move at, ra
	 jalr v1
	 nop
	 				     1f: (rec->ip + 12)

On linux-mips64 there are local symbols, whose names are prefixed by
$L, which need to be filtered out. Thanks go to Steven for writing
the mips64-specific function_regex.

In conclusion: with RISC, things become easier with such a "stupid"
trick; RISC really is K.I.S.S. There are lots of such "simple" tricks
in the whole ftrace support. Thanks go to Steven and the other folks
for providing such a wonderful tracing framework!

Signed-off-by: Wu Zhangjin <wuzj@lemote.com>
---
 arch/mips/Kconfig              |    2 +
 arch/mips/include/asm/ftrace.h |    9 ++++
 arch/mips/kernel/Makefile      |    3 +-
 arch/mips/kernel/ftrace.c      |   89 ++++++++++++++++++++++++++++++++++++++++
 arch/mips/kernel/mcount.S      |   29 +++++++++++++
 scripts/recordmcount.pl        |   41 ++++++++++++++++++
 6 files changed, 172 insertions(+), 1 deletions(-)
 create mode 100644 arch/mips/kernel/ftrace.c

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index a9bd992..147fbbc 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -6,6 +6,8 @@ config MIPS
 	select HAVE_ARCH_KGDB
 	select HAVE_FUNCTION_TRACER
 	select HAVE_FUNCTION_TRACE_MCOUNT_TEST
+	select HAVE_DYNAMIC_FTRACE
+	select HAVE_FTRACE_MCOUNT_RECORD
 	# Horrible source of confusion.  Die, die, die ...
 	select EMBEDDED
 	select RTC_LIB if !LEMOTE_FULOONG2E
diff --git a/arch/mips/include/asm/ftrace.h b/arch/mips/include/asm/ftrace.h
index 5f8ebcf..7094a40 100644
--- a/arch/mips/include/asm/ftrace.h
+++ b/arch/mips/include/asm/ftrace.h
@@ -19,6 +19,15 @@
 extern void _mcount(void);
 #define mcount _mcount
 
+#ifdef CONFIG_DYNAMIC_FTRACE
+static inline unsigned long ftrace_call_adjust(unsigned long addr)
+{
+	return addr;
+}
+
+struct dyn_arch_ftrace {
+};
+#endif /*  CONFIG_DYNAMIC_FTRACE */
 #endif /* __ASSEMBLY__ */
 #endif /* CONFIG_FUNCTION_TRACER */
 #endif /* _ASM_MIPS_FTRACE_H */
diff --git a/arch/mips/kernel/Makefile b/arch/mips/kernel/Makefile
index 38e704a..3cda3ab 100644
--- a/arch/mips/kernel/Makefile
+++ b/arch/mips/kernel/Makefile
@@ -10,6 +10,7 @@ obj-y		+= cpu-probe.o branch.o entry.o genex.o irq.o process.o \
 
 ifdef CONFIG_FUNCTION_TRACER
 CFLAGS_REMOVE_trace_clock.o = -pg
+CFLAGS_REMOVE_ftrace.o = -pg
 CFLAGS_REMOVE_early_printk.o = -pg
 endif
 
@@ -30,7 +31,7 @@ obj-$(CONFIG_STACKTRACE)	+= stacktrace.o
 obj-$(CONFIG_MODULES)		+= mips_ksyms.o module.o
 
 obj-$(CONFIG_FTRACE)		+= trace_clock.o
-obj-$(CONFIG_FUNCTION_TRACER)	+= mcount.o
+obj-$(CONFIG_FUNCTION_TRACER)	+= mcount.o ftrace.o
 
 obj-$(CONFIG_CPU_LOONGSON2)	+= r4k_fpu.o r4k_switch.o
 obj-$(CONFIG_CPU_MIPS32)	+= r4k_fpu.o r4k_switch.o
diff --git a/arch/mips/kernel/ftrace.c b/arch/mips/kernel/ftrace.c
new file mode 100644
index 0000000..ac6647e
--- /dev/null
+++ b/arch/mips/kernel/ftrace.c
@@ -0,0 +1,89 @@
+/*
+ * Code for replacing ftrace calls with jumps.
+ *
+ * Copyright (C) 2007-2008 Steven Rostedt <srostedt@redhat.com>
+ * Copyright (C) 2009 DSLab, Lanzhou University, China
+ * Author: Wu Zhangjin <wuzj@lemote.com>
+ *
+ * Thanks goes to Steven Rostedt for writing the original x86 version.
+ */
+
+#include <linux/uaccess.h>
+#include <linux/init.h>
+#include <linux/ftrace.h>
+
+#include <asm/cacheflush.h>
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+
+#define JAL 0x0c000000		/* jump & link: ip --> ra, jump to target */
+#define ADDR_MASK 0x03ffffff	/*  op_code|addr : 31...26|25 ....0 */
+#define jump_insn_encode(op_code, addr) \
+	((unsigned int)((op_code) | (((addr) >> 2) & ADDR_MASK)))
+
+#ifdef CONFIG_CPU_LOONGSON2F
+static unsigned int ftrace_nop = 0x00020021;
+#else
+static unsigned int ftrace_nop = 0x00000000;
+#endif
+
+static int ftrace_modify_code(unsigned long ip, unsigned int new_code)
+{
+	*(unsigned int *)ip = new_code;
+
+	flush_icache_range(ip, ip + 8);
+
+	return 0;
+}
+
+static int lui_v1;
+
+int ftrace_make_nop(struct module *mod,
+		    struct dyn_ftrace *rec, unsigned long addr)
+{
+	/* record it for ftrace_make_call */
+	if (lui_v1 == 0)
+		lui_v1 = *(unsigned int *)rec->ip;
+
+	/* lui v1, hi_16bit_of_mcount        --> b 1f (0x10000004)
+	 * addiu v1, v1, low_16bit_of_mcount
+	 * move at, ra
+	 * jalr v1
+	 * nop
+	 * 				     1f: (rec->ip + 12)
+	 */
+	return ftrace_modify_code(rec->ip, 0x10000004);
+}
+
+static int modified;	/* initialized as 0 by default */
+
+int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
+{
+	/* We just need to remove the "b ftrace_stub" the first time! */
+	if (modified == 0) {
+		modified = 1;
+		ftrace_modify_code(addr, ftrace_nop);
+	}
+	/* ftrace_make_nop has recorded lui_v1 for us */
+	return ftrace_modify_code(rec->ip, lui_v1);
+}
+
+#define FTRACE_CALL_IP ((unsigned long)(&ftrace_call))
+
+int ftrace_update_ftrace_func(ftrace_func_t func)
+{
+	unsigned int new;
+
+	new = jump_insn_encode(JAL, (unsigned long)func);
+
+	return ftrace_modify_code(FTRACE_CALL_IP, new);
+}
+
+int __init ftrace_dyn_arch_init(void *data)
+{
+	/* The return code is returned via data */
+	*(unsigned long *)data = 0;
+
+	return 0;
+}
+#endif				/* CONFIG_DYNAMIC_FTRACE */
diff --git a/arch/mips/kernel/mcount.S b/arch/mips/kernel/mcount.S
index cbb45ed..ffc4259 100644
--- a/arch/mips/kernel/mcount.S
+++ b/arch/mips/kernel/mcount.S
@@ -58,6 +58,33 @@
 	 move ra, AT
 	.endm
 
+#ifdef CONFIG_DYNAMIC_FTRACE
+
+NESTED(ftrace_caller, PT_SIZE, ra)
+	.globl _mcount
+_mcount:
+	b	ftrace_stub
+	 nop
+	lw	t0, function_trace_stop
+	bnez	t0, ftrace_stub
+	 nop
+
+	MCOUNT_SAVE_REGS
+
+	move	a0, ra		/* arg1: next ip, selfaddr */
+	.globl ftrace_call
+ftrace_call:
+	nop	/* a placeholder for the call to a real tracing function */
+	 move	a1, AT		/* arg2: the caller's next ip, parent */
+
+	MCOUNT_RESTORE_REGS
+	.globl ftrace_stub
+ftrace_stub:
+	RETURN_BACK
+	END(ftrace_caller)
+
+#else	/* ! CONFIG_DYNAMIC_FTRACE */
+
 NESTED(_mcount, PT_SIZE, ra)
 	lw	t0, function_trace_stop
 	bnez	t0, ftrace_stub
@@ -82,5 +109,7 @@ ftrace_stub:
 	RETURN_BACK
 	END(_mcount)
 
+#endif	/* ! CONFIG_DYNAMIC_FTRACE */
+
 	.set at
 	.set reorder
diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
index 62a0001..11ac245 100755
--- a/scripts/recordmcount.pl
+++ b/scripts/recordmcount.pl
@@ -287,6 +287,47 @@ if ($arch eq "x86_64") {
     $ld .= " -m elf64_sparc";
     $cc .= " -m64";
     $objcopy .= " -O elf64-sparc";
+
+} elsif ($arch eq "mips") {
+    # To enable module support, we need to enable the -mlong-calls option
+    # of gcc for MIPS, after using this option, we can not get the real
+    # offset of the calling to _mcount, but the offset of the lui
+    # instruction or the addiu one. herein, we record the address of the
+    # first one, and then we can replace this instruction by a branch
+    # instruction to jump over the profiling function to filter the
+    # indicated functions, or switch back to the lui instruction to trace
+    # them, which means dynamic tracing.
+    #
+    #       c:	3c030000 	lui	v1,0x0
+    #			c: R_MIPS_HI16	_mcount
+    #			c: R_MIPS_NONE	*ABS*
+    #			c: R_MIPS_NONE	*ABS*
+    #      10:	64630000 	daddiu	v1,v1,0
+    #			10: R_MIPS_LO16	_mcount
+    #			10: R_MIPS_NONE	*ABS*
+    #			10: R_MIPS_NONE	*ABS*
+    #      14:	03e0082d 	move	at,ra
+    #      18:	0060f809 	jalr	v1
+    $mcount_regex = "^\\s*([0-9a-fA-F]+): R_MIPS_HI16\\s+_mcount\$";
+    $objdump .= " -Melf-trad".$endian."mips ";
+
+    if ($endian eq "big") {
+	    $endian = " -EB ";
+	    $ld .= " -melf".$bits."btsmip";
+    } else {
+	    $endian = " -EL ";
+	    $ld .= " -melf".$bits."ltsmip";
+    }
+
+    $cc .= " -mno-abicalls -fno-pic -mabi=" . $bits . $endian;
+    $ld .= $endian;
+
+    if ($bits == 64) {
+	    $function_regex =
+		"^([0-9a-fA-F]+)\\s+<(.|[^\$]L.*?|\$[^L].*?|[^\$][^L].*?)>:";
+	    $type = ".dword";
+    }
+
 } else {
     die "Arch $arch is not supported with CONFIG_FTRACE_MCOUNT_RECORD";
 }
-- 
1.6.2.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v7 08/17] tracing: add IRQENTRY_EXIT section for MIPS
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
                   ` (6 preceding siblings ...)
  2009-11-09 15:31 ` [PATCH v7 07/17] tracing: add dynamic function tracer support for MIPS Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 09/17] tracing: define a new __time_notrace annotation flag Wu Zhangjin
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

This patch adds a new section for MIPS to record the hardirq-handling
blocks for the function graph tracer (print_graph_irq), by adding the
__irq_entry annotation to the entry points of the hardirqs (the blocks
bracketed by irq_enter()...irq_exit()).

Thanks go to Steven & Frederic Weisbecker for their feedback.

Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
---
 arch/mips/include/asm/irq.h    |   29 ++---------------------------
 arch/mips/kernel/irq.c         |   30 ++++++++++++++++++++++++++++++
 arch/mips/kernel/smp.c         |    3 ++-
 arch/mips/kernel/smtc.c        |   21 ++++++++++++++-------
 arch/mips/kernel/vmlinux.lds.S |    1 +
 arch/mips/sgi-ip22/ip22-int.c  |    3 ++-
 arch/mips/sgi-ip22/ip22-time.c |    3 ++-
 7 files changed, 53 insertions(+), 37 deletions(-)

diff --git a/arch/mips/include/asm/irq.h b/arch/mips/include/asm/irq.h
index 09b08d0..0696036 100644
--- a/arch/mips/include/asm/irq.h
+++ b/arch/mips/include/asm/irq.h
@@ -113,36 +113,11 @@ do {									\
 
 #endif
 
-/*
- * do_IRQ handles all normal device IRQ's (the special
- * SMP cross-CPU interrupts have their own specific
- * handlers).
- *
- * Ideally there should be away to get this into kernel/irq/handle.c to
- * avoid the overhead of a call for just a tiny function ...
- */
-#define do_IRQ(irq)							\
-do {									\
-	irq_enter();							\
-	__DO_IRQ_SMTC_HOOK(irq);					\
-	generic_handle_irq(irq);					\
-	irq_exit();							\
-} while (0)
+extern void do_IRQ(unsigned int irq);
 
 #ifdef CONFIG_MIPS_MT_SMTC_IRQAFF
-/*
- * To avoid inefficient and in some cases pathological re-checking of
- * IRQ affinity, we have this variant that skips the affinity check.
- */
-
 
-#define do_IRQ_no_affinity(irq)						\
-do {									\
-	irq_enter();							\
-	__NO_AFFINITY_IRQ_SMTC_HOOK(irq);				\
-	generic_handle_irq(irq);					\
-	irq_exit();							\
-} while (0)
+extern void do_IRQ_no_affinity(unsigned int irq);
 
 #endif /* CONFIG_MIPS_MT_SMTC_IRQAFF */
 
diff --git a/arch/mips/kernel/irq.c b/arch/mips/kernel/irq.c
index 7b845ba..dee8d5f 100644
--- a/arch/mips/kernel/irq.c
+++ b/arch/mips/kernel/irq.c
@@ -22,6 +22,7 @@
 #include <linux/seq_file.h>
 #include <linux/kallsyms.h>
 #include <linux/kgdb.h>
+#include <linux/ftrace.h>
 
 #include <asm/atomic.h>
 #include <asm/system.h>
@@ -150,3 +151,32 @@ void __init init_IRQ(void)
 		kgdb_early_setup = 1;
 #endif
 }
+
+/*
+ * do_IRQ handles all normal device IRQ's (the special
+ * SMP cross-CPU interrupts have their own specific
+ * handlers).
+ */
+void __irq_entry do_IRQ(unsigned int irq)
+{
+	irq_enter();
+	__DO_IRQ_SMTC_HOOK(irq);
+	generic_handle_irq(irq);
+	irq_exit();
+}
+
+#ifdef CONFIG_MIPS_MT_SMTC_IRQAFF
+/*
+ * To avoid inefficient and in some cases pathological re-checking of
+ * IRQ affinity, we have this variant that skips the affinity check.
+ */
+
+void __irq_entry do_IRQ_no_affinity(unsigned int irq)
+{
+	irq_enter();
+	__NO_AFFINITY_IRQ_SMTC_HOOK(irq);
+	generic_handle_irq(irq);
+	irq_exit();
+}
+
+#endif /* CONFIG_MIPS_MT_SMTC_IRQAFF */
diff --git a/arch/mips/kernel/smp.c b/arch/mips/kernel/smp.c
index e72e684..6cdca19 100644
--- a/arch/mips/kernel/smp.c
+++ b/arch/mips/kernel/smp.c
@@ -32,6 +32,7 @@
 #include <linux/cpumask.h>
 #include <linux/cpu.h>
 #include <linux/err.h>
+#include <linux/ftrace.h>
 
 #include <asm/atomic.h>
 #include <asm/cpu.h>
@@ -130,7 +131,7 @@ asmlinkage __cpuinit void start_secondary(void)
 /*
  * Call into both interrupt handlers, as we share the IPI for them
  */
-void smp_call_function_interrupt(void)
+void __irq_entry smp_call_function_interrupt(void)
 {
 	irq_enter();
 	generic_smp_call_function_single_interrupt();
diff --git a/arch/mips/kernel/smtc.c b/arch/mips/kernel/smtc.c
index 24630fd..75034a8 100644
--- a/arch/mips/kernel/smtc.c
+++ b/arch/mips/kernel/smtc.c
@@ -25,6 +25,7 @@
 #include <linux/interrupt.h>
 #include <linux/kernel_stat.h>
 #include <linux/module.h>
+#include <linux/ftrace.h>
 
 #include <asm/cpu.h>
 #include <asm/processor.h>
@@ -939,23 +940,29 @@ static void ipi_call_interrupt(void)
 
 DECLARE_PER_CPU(struct clock_event_device, mips_clockevent_device);
 
-void ipi_decode(struct smtc_ipi *pipi)
+static void __irq_entry smtc_clock_tick_interrupt(void)
 {
 	unsigned int cpu = smp_processor_id();
 	struct clock_event_device *cd;
+	int irq = MIPS_CPU_IRQ_BASE + 1;
+
+	irq_enter();
+	kstat_incr_irqs_this_cpu(irq, irq_to_desc(irq));
+	cd = &per_cpu(mips_clockevent_device, cpu);
+	cd->event_handler(cd);
+	irq_exit();
+}
+
+void ipi_decode(struct smtc_ipi *pipi)
+{
 	void *arg_copy = pipi->arg;
 	int type_copy = pipi->type;
-	int irq = MIPS_CPU_IRQ_BASE + 1;
 
 	smtc_ipi_nq(&freeIPIq, pipi);
 
 	switch (type_copy) {
 	case SMTC_CLOCK_TICK:
-		irq_enter();
-		kstat_incr_irqs_this_cpu(irq, irq_to_desc(irq));
-		cd = &per_cpu(mips_clockevent_device, cpu);
-		cd->event_handler(cd);
-		irq_exit();
+		smtc_clock_tick_interrupt();
 		break;
 
 	case LINUX_SMP_IPI:
diff --git a/arch/mips/kernel/vmlinux.lds.S b/arch/mips/kernel/vmlinux.lds.S
index 162b299..f25df73 100644
--- a/arch/mips/kernel/vmlinux.lds.S
+++ b/arch/mips/kernel/vmlinux.lds.S
@@ -46,6 +46,7 @@ SECTIONS
 		SCHED_TEXT
 		LOCK_TEXT
 		KPROBES_TEXT
+		IRQENTRY_TEXT
 		*(.text.*)
 		*(.fixup)
 		*(.gnu.warning)
diff --git a/arch/mips/sgi-ip22/ip22-int.c b/arch/mips/sgi-ip22/ip22-int.c
index 0ecd5fe..383f11d 100644
--- a/arch/mips/sgi-ip22/ip22-int.c
+++ b/arch/mips/sgi-ip22/ip22-int.c
@@ -13,6 +13,7 @@
 #include <linux/init.h>
 #include <linux/kernel_stat.h>
 #include <linux/interrupt.h>
+#include <linux/ftrace.h>
 
 #include <asm/irq_cpu.h>
 #include <asm/sgi/hpc3.h>
@@ -150,7 +151,7 @@ static void indy_local1_irqdispatch(void)
 
 extern void ip22_be_interrupt(int irq);
 
-static void indy_buserror_irq(void)
+static void __irq_entry indy_buserror_irq(void)
 {
 	int irq = SGI_BUSERR_IRQ;
 
diff --git a/arch/mips/sgi-ip22/ip22-time.c b/arch/mips/sgi-ip22/ip22-time.c
index c8f7d23..603fc91 100644
--- a/arch/mips/sgi-ip22/ip22-time.c
+++ b/arch/mips/sgi-ip22/ip22-time.c
@@ -16,6 +16,7 @@
 #include <linux/interrupt.h>
 #include <linux/kernel_stat.h>
 #include <linux/time.h>
+#include <linux/ftrace.h>
 
 #include <asm/cpu.h>
 #include <asm/mipsregs.h>
@@ -115,7 +116,7 @@ __init void plat_time_init(void)
 }
 
 /* Generic SGI handler for (spurious) 8254 interrupts */
-void indy_8254timer_irq(void)
+void __irq_entry indy_8254timer_irq(void)
 {
 	int irq = SGI_8254_0_IRQ;
 	ULONG cnt;
-- 
1.6.2.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v7 09/17] tracing: define a new __time_notrace annotation flag
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
                   ` (7 preceding siblings ...)
  2009-11-09 15:31 ` [PATCH v7 08/17] tracing: add IRQENTRY_EXIT section " Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 10/17] tracing: not trace the timecounter_read* in kernel/time/clocksource.c Wu Zhangjin
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

Ftrace gets its timestamps via time-related functions. When the
function graph tracer is enabled and these functions are themselves
traced, the system goes into recursion and hangs, so we must not trace
them. There are two existing methods for this.

The first is annotating them with notrace; the other is removing -pg
for the whole file that contains them:

CFLAGS_REMOVE_ftrace.o = -pg

But there is a particular situation: some archs (e.g. MIPS, Microblaze)
need to avoid tracing common time-related source code. We cannot simply
handle that with the methods above, but need to consider each arch
separately. This patch does that.

This patch introduces a new annotation flag: __time_notrace. If your
platform needs it, define it in your arch-specific ftrace.h:

	#define __time_notrace notrace

If only the function graph tracer needs it, wrap it:

	#ifdef CONFIG_FUNCTION_GRAPH_TRACER
	 #define __time_notrace notrace
	#endif

Otherwise, ignore it!

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>

---
 include/linux/ftrace.h |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 0b4f97d..c3f2f0f 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -511,4 +511,16 @@ static inline void trace_hw_branch_oops(void) {}
 
 #endif /* CONFIG_HW_BRANCH_TRACER */
 
+/* arch specific notrace for time relative functions, If your arch need it,
+ * define it in the arch specific ftrace.h
+ *
+ *	#define __time_notrace notrace
+ *
+ * otherwise, ignore it!
+ */
+
+#ifndef __time_notrace
+ #define __time_notrace
+#endif
+
 #endif /* _LINUX_FTRACE_H */
-- 
1.6.2.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v7 10/17] tracing: not trace the timecounter_read* in kernel/time/clocksource.c
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
                   ` (8 preceding siblings ...)
  2009-11-09 15:31 ` [PATCH v7 09/17] tracing: define a new __time_notrace annotation flag Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 11/17] tracing: not trace mips_timecounter_read() for MIPS Wu Zhangjin
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

Some platforms (e.g. MIPS) need these two functions to get a precise
timestamp, so we annotate them with __time_notrace (added in the
previous patch). By default, __time_notrace is empty, so this patch has
no effect on the original functions; but if you really do not want to
trace them, just add the following line to the arch-specific ftrace.h:

	#define __time_notrace notrace

If you only need it for the function graph tracer, wrap it:

	#ifdef CONFIG_FUNCTION_GRAPH_TRACER
	 #define __time_notrace notrace
	#endif

Please see include/linux/ftrace.h for more information.

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
---
 kernel/time/clocksource.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 5e18c6a..b00476a 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -30,6 +30,7 @@
 #include <linux/sched.h> /* for spin_unlock_irq() using preempt_count() m68k */
 #include <linux/tick.h>
 #include <linux/kthread.h>
+#include <linux/ftrace.h>
 
 void timecounter_init(struct timecounter *tc,
 		      const struct cyclecounter *cc,
@@ -52,7 +53,7 @@ EXPORT_SYMBOL(timecounter_init);
  * The first call to this function for a new time counter initializes
  * the time tracking and returns an undefined result.
  */
-static u64 timecounter_read_delta(struct timecounter *tc)
+static u64 __time_notrace timecounter_read_delta(struct timecounter *tc)
 {
 	cycle_t cycle_now, cycle_delta;
 	u64 ns_offset;
@@ -72,7 +73,7 @@ static u64 timecounter_read_delta(struct timecounter *tc)
 	return ns_offset;
 }
 
-u64 timecounter_read(struct timecounter *tc)
+u64 __time_notrace timecounter_read(struct timecounter *tc)
 {
 	u64 nsec;
 
-- 
1.6.2.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v7 11/17] tracing: not trace mips_timecounter_read() for MIPS
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
                   ` (9 preceding siblings ...)
  2009-11-09 15:31 ` [PATCH v7 10/17] tracing: not trace the timecounter_read* in kernel/time/clocksource.c Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 12/17] tracing: add function graph tracer support " Wu Zhangjin
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

We use mips_timecounter_read() to get the timestamp on MIPS, so it is
better not to trace it or its subroutines; otherwise the system goes
into recursion (and hangs) when using the function graph tracer. We use
the __notrace_funcgraph annotation to mark them as not to be traced.

There are also two common functions called by mips_timecounter_read();
we define the __time_notrace macro to ensure they are not traced
either.

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
---
 arch/mips/include/asm/ftrace.h |    6 ++++++
 arch/mips/kernel/csrc-r4k.c    |    5 +++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/mips/include/asm/ftrace.h b/arch/mips/include/asm/ftrace.h
index 7094a40..aa7c80b 100644
--- a/arch/mips/include/asm/ftrace.h
+++ b/arch/mips/include/asm/ftrace.h
@@ -28,6 +28,12 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
 struct dyn_arch_ftrace {
 };
 #endif /*  CONFIG_DYNAMIC_FTRACE */
+
+/* not trace the timecounter_read* in kernel/time/clocksource.c */
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+ #define __time_notrace notrace
+#endif
+
 #endif /* __ASSEMBLY__ */
 #endif /* CONFIG_FUNCTION_TRACER */
 #endif /* _ASM_MIPS_FTRACE_H */
diff --git a/arch/mips/kernel/csrc-r4k.c b/arch/mips/kernel/csrc-r4k.c
index 4e7705f..0c1bf80 100644
--- a/arch/mips/kernel/csrc-r4k.c
+++ b/arch/mips/kernel/csrc-r4k.c
@@ -8,6 +8,7 @@
 #include <linux/clocksource.h>
 #include <linux/init.h>
 #include <linux/sched.h>
+#include <linux/ftrace.h>
 
 #include <asm/time.h>
 
@@ -42,7 +43,7 @@ static struct timecounter r4k_tc = {
 	.cc = NULL,
 };
 
-static cycle_t r4k_cc_read(const struct cyclecounter *cc)
+static cycle_t __notrace_funcgraph r4k_cc_read(const struct cyclecounter *cc)
 {
 	return read_c0_count();
 }
@@ -66,7 +67,7 @@ int __init init_r4k_timecounter(void)
 	return 0;
 }
 
-u64 r4k_timecounter_read(void)
+u64 __notrace_funcgraph r4k_timecounter_read(void)
 {
 	u64 clock;
 
-- 
1.6.2.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v7 12/17] tracing: add function graph tracer support for MIPS
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
                   ` (10 preceding siblings ...)
  2009-11-09 15:31 ` [PATCH v7 11/17] tracing: not trace mips_timecounter_read() for MIPS Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 13/17] tracing: add dynamic function graph tracer " Wu Zhangjin
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

The implementation of the function graph tracer for MIPS is a little
different from x86's.

On MIPS, gcc (with -pg) only passes us the caller's return address (at)
and _mcount's return address (ra):

move at, ra
jal _mcount

If the function is a leaf, it will not save the return address (ra):

ffffffff80101298 <au1k_wait>:
ffffffff80101298:       67bdfff0        daddiu  sp,sp,-16
ffffffff8010129c:       ffbe0008        sd      s8,8(sp)
ffffffff801012a0:       03a0f02d        move    s8,sp
ffffffff801012a4:       03e0082d        move    at,ra
ffffffff801012a8:       0c042930        jal     ffffffff8010a4c0 <_mcount>
ffffffff801012ac:       00020021        nop

So we can hijack it directly in _mcount; but if the function is
non-leaf, the return address is saved on the stack:

ffffffff80133030 <copy_process>:
ffffffff80133030:       67bdff50        daddiu  sp,sp,-176
ffffffff80133034:       ffbe00a0        sd      s8,160(sp)
ffffffff80133038:       03a0f02d        move    s8,sp
ffffffff8013303c:       ffbf00a8        sd      ra,168(sp)
ffffffff80133040:       ffb70098        sd      s7,152(sp)
ffffffff80133044:       ffb60090        sd      s6,144(sp)
ffffffff80133048:       ffb50088        sd      s5,136(sp)
ffffffff8013304c:       ffb40080        sd      s4,128(sp)
ffffffff80133050:       ffb30078        sd      s3,120(sp)
ffffffff80133054:       ffb20070        sd      s2,112(sp)
ffffffff80133058:       ffb10068        sd      s1,104(sp)
ffffffff8013305c:       ffb00060        sd      s0,96(sp)
ffffffff80133060:       03e0082d        move    at,ra
ffffffff80133064:       0c042930        jal     ffffffff8010a4c0 <_mcount>
ffffffff80133068:       00020021        nop

But we cannot get the exact stack address (where ra was saved) directly
in _mcount; we need to either search the stack space for the content of
the at register, or search the text for the "s{d,w} ra, offset(sp)"
instruction. Because we cannot prove there is only one match in the
stack space, we search the text instead.

As we can see, if the first instruction above "move at, ra" is not a
store instruction, this should be a leaf function, so we hijack the at
register directly by putting &return_to_handler into it. Otherwise, we
search for the "s{d,w} ra, offset(sp)" instruction to get the stack
offset, and from that the stack address. Taking the copy_process()
above as an example: we eventually find "ffbf00a8", where 0xa8 is the
stack offset; adding it to s8 (fp) gives the stack address, and we
hijack the saved value by writing &return_to_handler in.

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
---
 arch/mips/Kconfig         |    1 +
 arch/mips/kernel/ftrace.c |  100 +++++++++++++++++++++++++++++++++++++++++++++
 arch/mips/kernel/mcount.S |   42 +++++++++++++++++++
 3 files changed, 143 insertions(+), 0 deletions(-)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 147fbbc..de690fd 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -8,6 +8,7 @@ config MIPS
 	select HAVE_FUNCTION_TRACE_MCOUNT_TEST
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_FTRACE_MCOUNT_RECORD
+	select HAVE_FUNCTION_GRAPH_TRACER
 	# Horrible source of confusion.  Die, die, die ...
 	select EMBEDDED
 	select RTC_LIB if !LEMOTE_FULOONG2E
diff --git a/arch/mips/kernel/ftrace.c b/arch/mips/kernel/ftrace.c
index ac6647e..50ba128 100644
--- a/arch/mips/kernel/ftrace.c
+++ b/arch/mips/kernel/ftrace.c
@@ -13,6 +13,8 @@
 #include <linux/ftrace.h>
 
 #include <asm/cacheflush.h>
+#include <asm/asm.h>
+#include <asm/asm-offsets.h>
 
 #ifdef CONFIG_DYNAMIC_FTRACE
 
@@ -87,3 +89,101 @@ int __init ftrace_dyn_arch_init(void *data)
 	return 0;
 }
 #endif				/* CONFIG_DYNAMIC_FTRACE */
+
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+
+#define S_RA_SP	(0xafbf << 16)	/* s{d,w} ra, offset(sp) */
+#define S_R_SP	(0xafb0 << 16)  /* s{d,w} R, offset(sp) */
+#define OFFSET_MASK	0xffff	/* stack offset range: 0 ~ PT_SIZE */
+
+unsigned long ftrace_get_parent_addr(unsigned long self_addr,
+				     unsigned long parent,
+				     unsigned long parent_addr,
+				     unsigned long fp)
+{
+	unsigned long sp, ip, ra;
+	unsigned int code;
+
+	/* move to the instruction "lui v1, HI_16BIT_OF_MCOUNT" */
+	ip = self_addr - 20;
+
+	/* search the text until finding the non-store instruction or "s{d,w}
+	 * ra, offset(sp)" instruction */
+	do {
+		ip -= 4;
+
+		/* get the code at "ip" */
+		code = *(unsigned int *)ip;
+
+		/* If we hit the non-store instruction before finding where the
+		 * ra is stored, then this is a leaf function and it does not
+		 * store the ra on the stack. */
+		if ((code & S_R_SP) != S_R_SP)
+			return parent_addr;
+
+	} while (((code & S_RA_SP) != S_RA_SP));
+
+	sp = fp + (code & OFFSET_MASK);
+	ra = *(unsigned long *)sp;
+
+	if (ra == parent)
+		return sp;
+
+	return 0;
+}
+
+/*
+ * Hook the return address and push it in the stack of return addrs
+ * in current thread info.
+ */
+void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
+			   unsigned long fp)
+{
+	unsigned long old;
+	struct ftrace_graph_ent trace;
+	unsigned long return_hooker = (unsigned long)
+	    &return_to_handler;
+
+	if (unlikely(atomic_read(&current->tracing_graph_pause)))
+		return;
+
+	/* "parent" is the stack address that holds the return address of
+	 * _mcount's caller. A leaf function does not save its return
+	 * address on the stack, so we "emulate" one in _mcount's stack
+	 * space and hijack it directly; a non-leaf function saves the
+	 * return address to its own stack space, so we cannot hijack
+	 * "parent" directly but need to find the real stack address,
+	 * which ftrace_get_parent_addr() does!
+	 */
+
+	old = *parent;
+
+	parent = (unsigned long *)ftrace_get_parent_addr(self_addr, old,
+							 (unsigned long)parent,
+							 fp);
+
+	/* If fails when getting the stack address of the non-leaf function's
+	 * ra, stop function graph tracer and return */
+	if (parent == 0) {
+		ftrace_graph_stop();
+		WARN_ON(1);
+		return;
+	}
+
+	*parent = return_hooker;
+
+	if (ftrace_push_return_trace(old, self_addr, &trace.depth, fp) ==
+	    -EBUSY) {
+		*parent = old;
+		return;
+	}
+
+	trace.func = self_addr;
+
+	/* Only trace if the calling function expects to */
+	if (!ftrace_graph_entry(&trace)) {
+		current->curr_ret_stack--;
+		*parent = old;
+	}
+}
+#endif				/* CONFIG_FUNCTION_GRAPH_TRACER */
diff --git a/arch/mips/kernel/mcount.S b/arch/mips/kernel/mcount.S
index ffc4259..b50e38d 100644
--- a/arch/mips/kernel/mcount.S
+++ b/arch/mips/kernel/mcount.S
@@ -93,6 +93,16 @@ NESTED(_mcount, PT_SIZE, ra)
 	PTR_L	t1, ftrace_trace_function /* Prepare t1 for (1) */
 	bne	t0, t1, static_trace
 	 nop
+
+#ifdef	CONFIG_FUNCTION_GRAPH_TRACER
+	PTR_L	t2, ftrace_graph_return
+	bne	t0, t2, ftrace_graph_caller
+	 nop
+	PTR_LA	t0, ftrace_graph_entry_stub
+	PTR_L	t2, ftrace_graph_entry
+	bne	t0, t2, ftrace_graph_caller
+	 nop
+#endif
 	b	ftrace_stub
 	 nop
 
@@ -111,5 +121,37 @@ ftrace_stub:
 
 #endif	/* ! CONFIG_DYNAMIC_FTRACE */
 
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+
+NESTED(ftrace_graph_caller, PT_SIZE, ra)
+	MCOUNT_SAVE_REGS
+
+	PTR_LA	a0, PT_R1(sp)	/* arg1: &AT -> a0 */
+	move	a1, ra		/* arg2: next ip, selfaddr */
+	jal	prepare_ftrace_return
+	 move	a2, fp		/* arg3: frame pointer */
+
+	MCOUNT_RESTORE_REGS
+	RETURN_BACK
+	END(ftrace_graph_caller)
+
+	.align	2
+	.globl	return_to_handler
+return_to_handler:
+	PTR_SUBU	sp, PT_SIZE
+	PTR_S	v0, PT_R2(sp)
+
+	jal	ftrace_return_to_handler
+	 PTR_S	v1, PT_R3(sp)
+
+	/* restore the real parent address: v0 -> ra */
+	move	ra, v0
+
+	PTR_L	v0, PT_R2(sp)
+	PTR_L	v1, PT_R3(sp)
+	jr	ra
+	 PTR_ADDIU	sp, PT_SIZE
+#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+
 	.set at
 	.set reorder
-- 
1.6.2.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v7 13/17] tracing: add dynamic function graph tracer for MIPS
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
                   ` (11 preceding siblings ...)
  2009-11-09 15:31 ` [PATCH v7 12/17] tracing: add function graph tracer support " Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 14/17] tracing: make ftrace for MIPS work without -fno-omit-frame-pointer Wu Zhangjin
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

This patch makes the function graph tracer work with the dynamic
function tracer.

To share the source code of the dynamic function tracer
(MCOUNT_SAVE_REGS), and to avoid restoring all of the saved registers,
we need to restore the ra register from the stack.

(NOTE: This does not work with 32-bit yet; we still need to find out
why!)

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
---
 arch/mips/kernel/ftrace.c |   21 +++++++++++++++++++++
 arch/mips/kernel/mcount.S |   14 ++++++++++++--
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/mips/kernel/ftrace.c b/arch/mips/kernel/ftrace.c
index 50ba128..af3ceed 100644
--- a/arch/mips/kernel/ftrace.c
+++ b/arch/mips/kernel/ftrace.c
@@ -92,6 +92,27 @@ int __init ftrace_dyn_arch_init(void *data)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 
+#ifdef CONFIG_DYNAMIC_FTRACE
+
+extern void ftrace_graph_call(void);
+#define JMP	0x08000000	/* jump to target directly */
+#define CALL_FTRACE_GRAPH_CALLER \
+	jump_insn_encode(JMP, (unsigned long)(&ftrace_graph_caller))
+#define FTRACE_GRAPH_CALL_IP	((unsigned long)(&ftrace_graph_call))
+
+int ftrace_enable_ftrace_graph_caller(void)
+{
+	return ftrace_modify_code(FTRACE_GRAPH_CALL_IP,
+				  CALL_FTRACE_GRAPH_CALLER);
+}
+
+int ftrace_disable_ftrace_graph_caller(void)
+{
+	return ftrace_modify_code(FTRACE_GRAPH_CALL_IP, ftrace_nop);
+}
+
+#endif				/* !CONFIG_DYNAMIC_FTRACE */
+
 #define S_RA_SP	(0xafbf << 16)	/* s{d,w} ra, offset(sp) */
 #define S_R_SP	(0xafb0 << 16)  /* s{d,w} R, offset(sp) */
 #define OFFSET_MASK	0xffff	/* stack offset range: 0 ~ PT_SIZE */
diff --git a/arch/mips/kernel/mcount.S b/arch/mips/kernel/mcount.S
index b50e38d..98d4690 100644
--- a/arch/mips/kernel/mcount.S
+++ b/arch/mips/kernel/mcount.S
@@ -77,6 +77,13 @@ ftrace_call:
 	nop	/* a placeholder for the call to a real tracing function */
 	 move	a1, AT		/* arg2: the caller's next ip, parent */
 
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+	.globl ftrace_graph_call
+ftrace_graph_call:
+	nop
+	 nop
+#endif
+
 	MCOUNT_RESTORE_REGS
 	.globl ftrace_stub
 ftrace_stub:
@@ -124,10 +131,13 @@ ftrace_stub:
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 
 NESTED(ftrace_graph_caller, PT_SIZE, ra)
+#ifdef CONFIG_DYNAMIC_FTRACE
+	PTR_L	a1, PT_R31(sp)	/* load the original ra from the stack */
+#else
 	MCOUNT_SAVE_REGS
-
-	PTR_LA	a0, PT_R1(sp)	/* arg1: &AT -> a0 */
 	move	a1, ra		/* arg2: next ip, selfaddr */
+#endif
+	PTR_LA	a0, PT_R1(sp)	/* arg1: &AT -> a0 */
 	jal	prepare_ftrace_return
 	 move	a2, fp		/* arg3: frame pointer */
 
-- 
1.6.2.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v7 14/17] tracing: make ftrace for MIPS work without -fno-omit-frame-pointer
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
                   ` (12 preceding siblings ...)
  2009-11-09 15:31 ` [PATCH v7 13/17] tracing: add dynamic function graph tracer " Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 15/17] tracing: make ftrace for MIPS more robust Wu Zhangjin
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

When -fno-omit-frame-pointer is removed, gcc will no longer save the
frame pointer for us, so we need to provide one ourselves.
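Without a frame pointer, the caller's stack pointer at the time of the
mcount call can be recovered arithmetically: it is our own sp with the
register save area (PT_SIZE) popped off, plus an extra 8-byte pad on
32-bit, as the diff below does. A sketch of that computation (the
PT_SIZE values here are illustrative stand-ins, not the real kernel
constants):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in sizes for struct pt_regs as pushed by MCOUNT_SAVE_REGS;
 * the real values come from asm-offsets and differ per ABI. */
#define PT_SIZE_64 336u
#define PT_SIZE_32 152u

/* Recover the caller's sp from our own: pop the save area, and on
 * 32-bit also skip the 8-byte pad above pt_regs. */
static uintptr_t caller_sp(uintptr_t sp, int is64bit)
{
	return sp + (is64bit ? PT_SIZE_64 : PT_SIZE_32 + 8);
}
```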

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
---
 arch/mips/kernel/mcount.S |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/mips/kernel/mcount.S b/arch/mips/kernel/mcount.S
index 98d4690..bdfef2c 100644
--- a/arch/mips/kernel/mcount.S
+++ b/arch/mips/kernel/mcount.S
@@ -139,7 +139,15 @@ NESTED(ftrace_graph_caller, PT_SIZE, ra)
 #endif
 	PTR_LA	a0, PT_R1(sp)	/* arg1: &AT -> a0 */
 	jal	prepare_ftrace_return
+#ifdef CONFIG_FRAME_POINTER
 	 move	a2, fp		/* arg3: frame pointer */
+#else
+#ifdef CONFIG_64BIT
+	 PTR_LA	a2, PT_SIZE(sp)
+#else
+	 PTR_LA	a2, (PT_SIZE+8)(sp)
+#endif
+#endif
 
 	MCOUNT_RESTORE_REGS
 	RETURN_BACK
-- 
1.6.2.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v7 15/17] tracing: make ftrace for MIPS more robust
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
                   ` (13 preceding siblings ...)
  2009-11-09 15:31 ` [PATCH v7 14/17] tracing: make ftrace for MIPS work without -fno-omit-frame-pointer Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 16/17] tracing: reserve $12(t0) for mcount-ra-address of gcc 4.5 Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 17/17] tracing: make function graph tracer work with -mmcount-ra-address Wu Zhangjin
  16 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

No failures have been observed so far when the C code loads from and
stores to the text and stack space to hijack the return address, but we
must make sure the code is robust anyway. This patch does so by adding
extra exception handling to those accesses.

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
---
 arch/mips/include/asm/ftrace.h |   57 ++++++++++++++++++++++++++++++++++++++++
 arch/mips/kernel/ftrace.c      |   46 +++++++++++++++++++++++---------
 2 files changed, 90 insertions(+), 13 deletions(-)

diff --git a/arch/mips/include/asm/ftrace.h b/arch/mips/include/asm/ftrace.h
index aa7c80b..f78d763 100644
--- a/arch/mips/include/asm/ftrace.h
+++ b/arch/mips/include/asm/ftrace.h
@@ -19,6 +19,62 @@
 extern void _mcount(void);
 #define mcount _mcount
 
+#define safe_load(load, src, dst, error)		\
+do {							\
+	asm volatile (					\
+		"1: " load " %[" STR(dst) "], 0(%[" STR(src) "])\n"\
+		"   li %[" STR(error) "], 0\n"		\
+		"2:\n"					\
+							\
+		".section .fixup, \"ax\"\n"		\
+		"3: li %[" STR(error) "], 1\n"		\
+		"   j 2b\n"				\
+		".previous\n"				\
+							\
+		".section\t__ex_table,\"a\"\n\t"	\
+		STR(PTR) "\t1b, 3b\n\t"			\
+		".previous\n"				\
+							\
+		: [dst] "=&r" (dst), [error] "=r" (error)\
+		: [src] "r" (src)			\
+		: "memory"				\
+	);						\
+} while (0)
+
+#define safe_store(store, src, dst, error)	\
+do {						\
+	asm volatile (				\
+		"1: " store " %[" STR(src) "], 0(%[" STR(dst) "])\n"\
+		"   li %[" STR(error) "], 0\n"	\
+		"2:\n"				\
+						\
+		".section .fixup, \"ax\"\n"	\
+		"3: li %[" STR(error) "], 1\n"	\
+		"   j 2b\n"			\
+		".previous\n"			\
+						\
+		".section\t__ex_table,\"a\"\n\t"\
+		STR(PTR) "\t1b, 3b\n\t"		\
+		".previous\n"			\
+						\
+		: [error] "=r" (error)		\
+		: [dst] "r" (dst), [src] "r" (src)\
+		: "memory"			\
+	);					\
+} while (0)
+
+#define safe_load_code(dst, src, error) \
+	safe_load(STR(lw), src, dst, error)
+#define safe_store_code(src, dst, error) \
+	safe_store(STR(sw), src, dst, error)
+
+#define safe_load_stack(dst, src, error) \
+	safe_load(STR(PTR_L), src, dst, error)
+
+#define safe_store_stack(src, dst, error) \
+	safe_store(STR(PTR_S), src, dst, error)
+
+
 #ifdef CONFIG_DYNAMIC_FTRACE
 static inline unsigned long ftrace_call_adjust(unsigned long addr)
 {
@@ -27,6 +83,7 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
 
 struct dyn_arch_ftrace {
 };
+
 #endif /*  CONFIG_DYNAMIC_FTRACE */
 
 /* not trace the timecounter_read* in kernel/time/clocksource.c */
diff --git a/arch/mips/kernel/ftrace.c b/arch/mips/kernel/ftrace.c
index af3ceed..a4e25b8 100644
--- a/arch/mips/kernel/ftrace.c
+++ b/arch/mips/kernel/ftrace.c
@@ -31,7 +31,13 @@ static unsigned int ftrace_nop = 0x00000000;
 
 static int ftrace_modify_code(unsigned long ip, unsigned int new_code)
 {
-	*(unsigned int *)ip = new_code;
+	int faulted;
+
+	/* *(unsigned int *)ip = new_code; */
+	safe_store_code(new_code, ip, faulted);
+
+	if (unlikely(faulted))
+		return -EFAULT;
 
 	flush_icache_range(ip, ip + 8);
 
@@ -124,6 +130,7 @@ unsigned long ftrace_get_parent_addr(unsigned long self_addr,
 {
 	unsigned long sp, ip, ra;
 	unsigned int code;
+	int faulted;
 
 	/* move to the instruction "lui v1, HI_16BIT_OF_MCOUNT" */
 	ip = self_addr - 20;
@@ -133,8 +140,11 @@ unsigned long ftrace_get_parent_addr(unsigned long self_addr,
 	do {
 		ip -= 4;
 
-		/* get the code at "ip" */
-		code = *(unsigned int *)ip;
+		/* get the code at "ip": code = *(unsigned int *)ip; */
+		safe_load_code(code, ip, faulted);
+
+		if (unlikely(faulted))
+			return 0;
 
 		/* If we hit the non-store instruction before finding where the
 		 * ra is stored, then this is a leaf function and it does not
@@ -145,11 +155,14 @@ unsigned long ftrace_get_parent_addr(unsigned long self_addr,
 	} while (((code & S_RA_SP) != S_RA_SP));
 
 	sp = fp + (code & OFFSET_MASK);
-	ra = *(unsigned long *)sp;
+
+	/* ra = *(unsigned long *)sp; */
+	safe_load_stack(ra, sp, faulted);
+	if (unlikely(faulted))
+		return 0;
 
 	if (ra == parent)
 		return sp;
-
 	return 0;
 }
 
@@ -164,6 +177,7 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
 	struct ftrace_graph_ent trace;
 	unsigned long return_hooker = (unsigned long)
 	    &return_to_handler;
+	int faulted;
 
 	if (unlikely(atomic_read(&current->tracing_graph_pause)))
 		return;
@@ -177,21 +191,23 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
 	 * ftrace_get_parent_addr() does it!
 	 */
 
-	old = *parent;
+	/* old = *parent; */
+	safe_load_stack(old, parent, faulted);
+	if (unlikely(faulted))
+		goto out;
 
 	parent = (unsigned long *)ftrace_get_parent_addr(self_addr, old,
 							 (unsigned long)parent,
 							 fp);
-
 	/* If fails when getting the stack address of the non-leaf function's
 	 * ra, stop function graph tracer and return */
-	if (parent == 0) {
-		ftrace_graph_stop();
-		WARN_ON(1);
-		return;
-	}
+	if (parent == 0)
+		goto out;
 
-	*parent = return_hooker;
+	/* *parent = return_hooker; */
+	safe_store_stack(return_hooker, parent, faulted);
+	if (unlikely(faulted))
+		goto out;
 
 	if (ftrace_push_return_trace(old, self_addr, &trace.depth, fp) ==
 	    -EBUSY) {
@@ -206,5 +222,9 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
 		current->curr_ret_stack--;
 		*parent = old;
 	}
+	return;
+out:
+	ftrace_graph_stop();
+	WARN_ON(1);
 }
 #endif				/* CONFIG_FUNCTION_GRAPH_TRACER */
-- 
1.6.2.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v7 16/17] tracing: reserve $12(t0) for mcount-ra-address of gcc 4.5
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
                   ` (14 preceding siblings ...)
  2009-11-09 15:31 ` [PATCH v7 15/17] tracing: make ftrace for MIPS more robust Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 15:31 ` [PATCH v7 17/17] tracing: make function graph tracer work with -mmcount-ra-address Wu Zhangjin
  16 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

A new gcc 4.5 option, -mmcount-ra-address, has been posted by David
Daney <ddaney@caviumnetworks.com> in the thread "MIPS: Add option to
pass return address location to _mcount". It records the location of
the return address (ra), which helps the function graph tracer on MIPS
hijack the return address more easily and safely. That option uses the
$12 (t0) register by default, so we reserve it for that purpose and use
t1, t2, t3 instead of t0, t1, t2.

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
---
 arch/mips/kernel/mcount.S |   26 +++++++++++++-------------
 1 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/mips/kernel/mcount.S b/arch/mips/kernel/mcount.S
index bdfef2c..522e91c 100644
--- a/arch/mips/kernel/mcount.S
+++ b/arch/mips/kernel/mcount.S
@@ -65,8 +65,8 @@ NESTED(ftrace_caller, PT_SIZE, ra)
 _mcount:
 	b	ftrace_stub
 	 nop
-	lw	t0, function_trace_stop
-	bnez	t0, ftrace_stub
+	lw	t1, function_trace_stop
+	bnez	t1, ftrace_stub
 	 nop
 
 	MCOUNT_SAVE_REGS
@@ -93,21 +93,21 @@ ftrace_stub:
 #else	/* ! CONFIG_DYNAMIC_FTRACE */
 
 NESTED(_mcount, PT_SIZE, ra)
-	lw	t0, function_trace_stop
-	bnez	t0, ftrace_stub
+	lw	t1, function_trace_stop
+	bnez	t1, ftrace_stub
 	 nop
-	PTR_LA	t0, ftrace_stub
-	PTR_L	t1, ftrace_trace_function /* Prepare t1 for (1) */
-	bne	t0, t1, static_trace
+	PTR_LA	t1, ftrace_stub
+	PTR_L	t2, ftrace_trace_function /* Prepare t2 for (1) */
+	bne	t1, t2, static_trace
 	 nop
 
 #ifdef	CONFIG_FUNCTION_GRAPH_TRACER
-	PTR_L	t2, ftrace_graph_return
-	bne	t0, t2, ftrace_graph_caller
+	PTR_L	t3, ftrace_graph_return
+	bne	t1, t3, ftrace_graph_caller
 	 nop
-	PTR_LA	t0, ftrace_graph_entry_stub
-	PTR_L	t2, ftrace_graph_entry
-	bne	t0, t2, ftrace_graph_caller
+	PTR_LA	t1, ftrace_graph_entry_stub
+	PTR_L	t3, ftrace_graph_entry
+	bne	t1, t3, ftrace_graph_caller
 	 nop
 #endif
 	b	ftrace_stub
@@ -117,7 +117,7 @@ static_trace:
 	MCOUNT_SAVE_REGS
 
 	move	a0, ra		/* arg1: next ip, selfaddr */
-	jalr	t1	/* (1) call *ftrace_trace_function */
+	jalr	t2		/* (1) call *ftrace_trace_function */
 	 move	a1, AT		/* arg2: the caller's next ip, parent */
 
 	MCOUNT_RESTORE_REGS
-- 
1.6.2.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v7 17/17] tracing: make function graph tracer work with -mmcount-ra-address
  2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
                   ` (15 preceding siblings ...)
  2009-11-09 15:31 ` [PATCH v7 16/17] tracing: reserve $12(t0) for mcount-ra-address of gcc 4.5 Wu Zhangjin
@ 2009-11-09 15:31 ` Wu Zhangjin
  2009-11-09 22:34   ` David Daney
  16 siblings, 1 reply; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-09 15:31 UTC (permalink / raw)
  To: linux-mips, linux-kernel
  Cc: zhangfx, zhouqg, Wu Zhangjin, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

From: Wu Zhangjin <wuzhangjin@gmail.com>

The thread "MIPS: Add option to pass return address location to
_mcount" from David Daney <ddaney@caviumnetworks.com> adds a new
option, -mmcount-ra-address, to gcc 4.5 for MIPS, which passes the
location of the return address to _mcount.

Thanks to this new feature, the function graph tracer on MIPS can
hijack the return address of a kernel function more easily and safely,
which saves some overhead and makes the whole thing more reliable.

This patch first tries to enable -mmcount-ra-address in
arch/mips/Makefile via cc-option: if gcc supports the option, it is
enabled; otherwise there is no side effect.

We then need to support both this new gcc 4.5 option and older gcc
versions.

With _mcount in older gcc versions, it is not easy to get the location
of the return address (refer to
1f5db2a6932fcc7d2f6600903724e33b7bd269f3: "tracing: add function graph
tracer support for MIPS"), so we do it in a C function,
ftrace_get_parent_addr() in ftrace.c. With -mmcount-ra-address, only a
few instructions are needed to get what we want, so I put them in asm
(mcount.S). Also, since $12 (t0) is used by -mmcount-ra-address to pass
the location of the return address to _mcount, we need to save it on
the stack and restore it when the dynamic function tracer is enabled,
because "ftrace_call" runs before "ftrace_graph_caller" and may clobber
$12 (t0).
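The convention is: for a non-leaf function $12 (t0) holds the location
of the saved return address, and for a leaf function it holds zero, in
which case the tracer falls back to the copy of at that _mcount saved
on its own stack. A hypothetical C rendering of the branch that
ftrace_graph_caller performs in mcount.S (the function name is mine,
for illustration only):

```c
#include <assert.h>
#include <stddef.h>

/* Pick the slot whose contents will be replaced by the return hooker:
 * the location gcc passed in t0 if non-NULL (non-leaf function), else
 * the &AT slot in _mcount's own pt_regs save area (leaf function). */
static unsigned long *ra_location(unsigned long *t0_loc,
				  unsigned long *own_at_slot)
{
	return t0_loc ? t0_loc : own_at_slot;
}
```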

Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
---
 arch/mips/Makefile        |    3 +++
 arch/mips/kernel/ftrace.c |   24 +++++++++++++++++-------
 arch/mips/kernel/mcount.S |   16 ++++++++++++++++
 3 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/arch/mips/Makefile b/arch/mips/Makefile
index 041deca..c710324 100644
--- a/arch/mips/Makefile
+++ b/arch/mips/Makefile
@@ -52,6 +52,9 @@ ifndef CONFIG_FUNCTION_TRACER
 cflags-y := -ffunction-sections
 else
 cflags-y := -mlong-calls
+ifdef CONFIG_FUNCTION_GRAPH_TRACER
+	cflags-y += $(call cc-option, -mmcount-ra-address)
+endif
 endif
 cflags-y += $(call cc-option, -mno-check-zero-division)
 
diff --git a/arch/mips/kernel/ftrace.c b/arch/mips/kernel/ftrace.c
index a4e25b8..1dcbf0f 100644
--- a/arch/mips/kernel/ftrace.c
+++ b/arch/mips/kernel/ftrace.c
@@ -119,6 +119,7 @@ int ftrace_disable_ftrace_graph_caller(void)
 
 #endif				/* !CONFIG_DYNAMIC_FTRACE */
 
+#if (__GNUC__ <= 4 && __GNUC_MINOR__ < 5)
 #define S_RA_SP	(0xafbf << 16)	/* s{d,w} ra, offset(sp) */
 #define S_R_SP	(0xafb0 << 16)  /* s{d,w} R, offset(sp) */
 #define OFFSET_MASK	0xffff	/* stack offset range: 0 ~ PT_SIZE */
@@ -166,6 +167,8 @@ unsigned long ftrace_get_parent_addr(unsigned long self_addr,
 	return 0;
 }
 
+#endif
+
 /*
  * Hook the return address and push it in the stack of return addrs
  * in current thread info.
@@ -183,19 +186,26 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
 		return;
 
 	/* "parent" is the stack address saved the return address of the caller
-	 * of _mcount, for a leaf function not save the return address in the
-	 * stack address, so, we "emulate" one in _mcount's stack space, and
-	 * hijack it directly, but for a non-leaf function, it will save the
-	 * return address to the its stack space, so, we can not hijack the
-	 * "parent" directly, but need to find the real stack address,
+	 * of _mcount.
+	 *
+	 * if the gcc < 4.5, a leaf function does not save the return address
+	 * in the stack address, so, we "emulate" one in _mcount's stack space,
+	 * and hijack it directly, but for a non-leaf function, it save the
+	 * return address to the its own stack space, we can not hijack it
+	 * directly, but need to find the real stack address,
 	 * ftrace_get_parent_addr() does it!
+	 *
+	 * if gcc>= 4.5, with the new -mmcount-ra-address option, for a
+	 * non-leaf function, the location of the return address will be saved
+	 * to $12 for us, and for a leaf function, only put a zero into $12. we
+	 * do it in ftrace_graph_caller of mcount.S.
 	 */
 
 	/* old = *parent; */
 	safe_load_stack(old, parent, faulted);
 	if (unlikely(faulted))
 		goto out;
-
+#if (__GNUC__ <= 4 && __GNUC_MINOR__ < 5)
 	parent = (unsigned long *)ftrace_get_parent_addr(self_addr, old,
 							 (unsigned long)parent,
 							 fp);
@@ -203,7 +213,7 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
 	 * ra, stop function graph tracer and return */
 	if (parent == 0)
 		goto out;
-
+#endif
 	/* *parent = return_hooker; */
 	safe_store_stack(return_hooker, parent, faulted);
 	if (unlikely(faulted))
diff --git a/arch/mips/kernel/mcount.S b/arch/mips/kernel/mcount.S
index 522e91c..addd68f 100644
--- a/arch/mips/kernel/mcount.S
+++ b/arch/mips/kernel/mcount.S
@@ -70,6 +70,11 @@ _mcount:
 	 nop
 
 	MCOUNT_SAVE_REGS
+#if (__GNUC__ >= 4 && __GNUC_MINOR__ >= 5)
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+	PTR_S	t0, PT_R12(sp)	/* t0 saved the location of the return address(at) by -mmcount-ra-address */
+#endif
+#endif
 
 	move	a0, ra		/* arg1: next ip, selfaddr */
 	.globl ftrace_call
@@ -133,11 +138,22 @@ ftrace_stub:
 NESTED(ftrace_graph_caller, PT_SIZE, ra)
 #ifdef CONFIG_DYNAMIC_FTRACE
 	PTR_L	a1, PT_R31(sp)	/* load the original ra from the stack */
+#if (__GNUC__ >= 4 && __GNUC_MINOR__ >= 5)
+	PTR_L	t0, PT_R12(sp)	/* load the original t0 from the stack */
+#endif
 #else
 	MCOUNT_SAVE_REGS
 	move	a1, ra		/* arg2: next ip, selfaddr */
 #endif
+
+#if (__GNUC__ >= 4 && __GNUC_MINOR__ >= 5)
+	bnez	t0, 1f		/* non-leaf func: t0 saved the location of the return address */
+	 nop
+	PTR_LA	t0, PT_R1(sp)	/* leaf func: get the location of at(old ra) from our own stack */
+1:	move	a0, t0		/* arg1: the location of the return address */
+#else
 	PTR_LA	a0, PT_R1(sp)	/* arg1: &AT -> a0 */
+#endif
 	jal	prepare_ftrace_return
 #ifdef CONFIG_FRAME_POINTER
 	 move	a2, fp		/* arg3: frame pointer */
-- 
1.6.2.1


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v7 17/17] tracing: make function graph tracer work with -mmcount-ra-address
  2009-11-09 15:31 ` [PATCH v7 17/17] tracing: make function graph tracer work with -mmcount-ra-address Wu Zhangjin
@ 2009-11-09 22:34   ` David Daney
  2009-11-10  2:42     ` Wu Zhangjin
  0 siblings, 1 reply; 36+ messages in thread
From: David Daney @ 2009-11-09 22:34 UTC (permalink / raw)
  To: Wu Zhangjin
  Cc: linux-mips, linux-kernel, zhangfx, zhouqg, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

Wu Zhangjin wrote:
[...]
> +	cflags-y += $(call cc-option, -mmcount-ra-address)
[...]
> +#if (__GNUC__ <= 4 && __GNUC_MINOR__ < 5)



Sprinkling the code with these #if clauses is ugly and prone to breakage 
I think.

The Makefile part is testing for the same feature.

We do a very similar thing with -msym32, and KBUILD_SYM32.  Perhaps you 
could rework this patch in a similar manner and test for 
KBUILD_MCOUNT_RA_ADDRESS instead of the '(__GNUC__ <= 4 && 
__GNUC_MINOR__ < 5)'

David Daney

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-09 15:31 ` [PATCH v7 04/17] tracing: add static function tracer support for MIPS Wu Zhangjin
@ 2009-11-10  0:26   ` David Daney
  2009-11-10  1:00     ` Wu Zhangjin
  2009-11-12 11:06   ` Wu Zhangjin
  1 sibling, 1 reply; 36+ messages in thread
From: David Daney @ 2009-11-10  0:26 UTC (permalink / raw)
  To: Wu Zhangjin
  Cc: linux-mips, linux-kernel, zhangfx, zhouqg, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

Wu Zhangjin wrote:

>  
> +ifndef CONFIG_FUNCTION_TRACER
>  cflags-y := -ffunction-sections
> +else
> +cflags-y := -mlong-calls
> +endif
>  cflags-y += $(call cc-option, -mno-check-zero-division)
>  

That doesn't make sense to me.  Modules are already compiled with 
-mlong-calls.  The only time you would need the entire kernel compiled 
with -mlong-calls is if the tracer were in a module.  The logic here 
doesn't seem to reflect that.

David Daney

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-10  0:26   ` David Daney
@ 2009-11-10  1:00     ` Wu Zhangjin
  2009-11-10 16:43       ` David Daney
  0 siblings, 1 reply; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-10  1:00 UTC (permalink / raw)
  To: David Daney
  Cc: linux-mips, linux-kernel, zhangfx, zhouqg, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

Hi,

On Mon, 2009-11-09 at 16:26 -0800, David Daney wrote: 
> Wu Zhangjin wrote:
> 
> >  
> > +ifndef CONFIG_FUNCTION_TRACER
> >  cflags-y := -ffunction-sections
> > +else
> > +cflags-y := -mlong-calls
> > +endif
> >  cflags-y += $(call cc-option, -mno-check-zero-division)
> >  
> 
> That doesn't make sense to me.  Modules are already compiled with 
> -mlong-calls.  The only time you would need the entire kernel compiled 
> with -mlong-calls is if the tracer were in a module.  The logic here 
> doesn't seem to reflect that.

Yes, I know modules are already compiled with -mlong-calls. Here we
just want to use -mlong-calls for the whole kernel so that we get the
same _mcount call sequence everywhere; then we can use the same
scripts/recordmcount.pl and the same ftrace_make_nop & ftrace_make_call
for the dynamic tracer.

And it seems we only need to enable it for the dynamic tracer, so I
will split it out into its own patch later.

Thanks & Regards,
	Wu Zhangjin


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v7 17/17] tracing: make function graph tracer work with -mmcount-ra-address
  2009-11-09 22:34   ` David Daney
@ 2009-11-10  2:42     ` Wu Zhangjin
  0 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-10  2:42 UTC (permalink / raw)
  To: David Daney
  Cc: linux-mips, linux-kernel, zhangfx, zhouqg, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

Hi,

On Mon, 2009-11-09 at 14:34 -0800, David Daney wrote:
> Wu Zhangjin wrote:
> [...]
> > +	cflags-y += $(call cc-option, -mmcount-ra-address)
> [...]
> > +#if (__GNUC__ <= 4 && __GNUC_MINOR__ < 5)
> 
> 
> 
> Sprinkling the code with these #if clauses is ugly and prone to breakage 
> I think.
> 
> The Makefile part is testing for the same feature.
> 
> We do a very similar thing with -msym32, and KBUILD_SYM32.  Perhaps you 
> could rework this patch in a similar manner and test for 
> KBUILD_MCOUNT_RA_ADDRESS instead of the '(__GNUC__ <= 4 && 
> __GNUC_MINOR__ < 5)'
> 

This is really ugly ;)

KBUILD_MCOUNT_RA_ADDRESS is a wonderful idea, thanks!

Regards,
	Wu Zhangjin


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-10  1:00     ` Wu Zhangjin
@ 2009-11-10 16:43       ` David Daney
  2009-11-11  2:42         ` Wu Zhangjin
  0 siblings, 1 reply; 36+ messages in thread
From: David Daney @ 2009-11-10 16:43 UTC (permalink / raw)
  To: wuzhangjin
  Cc: linux-mips, linux-kernel, zhangfx, zhouqg, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

Wu Zhangjin wrote:
> Hi,
> 
> On Mon, 2009-11-09 at 16:26 -0800, David Daney wrote: 
>> Wu Zhangjin wrote:
>>
>>>  
>>> +ifndef CONFIG_FUNCTION_TRACER
>>>  cflags-y := -ffunction-sections
>>> +else
>>> +cflags-y := -mlong-calls
>>> +endif
>>>  cflags-y += $(call cc-option, -mno-check-zero-division)
>>>  
>> That doesn't make sense to me.  Modules are already compiled with 
>> -mlong-calls.  The only time you would need the entire kernel compiled 
>> with -mlong-calls is if the tracer were in a module.  The logic here 
>> doesn't seem to reflect that.
> 
> Yes, I knew the module have gotten the -mlong-calls, Here we just want
> to use -mlong-calls for the whole kernel, and then we get the same
> _mcount stuff in the whole kernel, at last, we can use the same
> scripts/recordmcount.pl and ftrace_make_nop & ftrace_make_call for the
> dynamic ftracer.
> 

-mlong-calls really degrades performance.  I have seen things like 6% 
drop in network packet forwarding rates with -mlong-calls.

It would be better to fix all the tools so that they could handle both 
-mlong-calls and -mno-long-calls code.


David Daney

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v7 03/17] tracing: add MIPS specific trace_clock_local()
  2009-11-09 15:31 ` [PATCH v7 03/17] tracing: add MIPS specific trace_clock_local() Wu Zhangjin
@ 2009-11-10 17:48   ` David Daney
  2009-11-11  2:47     ` Wu Zhangjin
  0 siblings, 1 reply; 36+ messages in thread
From: David Daney @ 2009-11-10 17:48 UTC (permalink / raw)
  To: Wu Zhangjin
  Cc: linux-mips, linux-kernel, zhangfx, zhouqg, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

Wu Zhangjin wrote:
[...]
> + * trace_clock_local(): the simplest and least coherent tracing clock.
> + *
> + * Useful for tracing that does not cross to other CPUs nor
> + * does it go through idle events.
> + */
> +u64 trace_clock_local(void)
> +{
> +	unsigned long flags;
> +	u64 clock;
> +
> +	raw_local_irq_save(flags);
> +
> +	clock = mips_timecounter_read();
> +
> +	raw_local_irq_restore(flags);
> +
> +	return clock;
> +}

Why disable interrupts?

Also you call the new function mips_timecounter_read().  Since 
sched_clock() is a weak function, you can override the definition with a 
more accurate version when possible.  Then you could just directly call 
it here, instead of adding the new mips_timecounter_read() that the 
'[PATCH v7 02/17] tracing: add mips_timecounter_read() for MIPS' adds.

David Daney

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-10 16:43       ` David Daney
@ 2009-11-11  2:42         ` Wu Zhangjin
  2009-11-11  3:13           ` Ralf Baechle
  2009-11-11 13:15           ` Maciej W. Rozycki
  0 siblings, 2 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-11  2:42 UTC (permalink / raw)
  To: David Daney
  Cc: linux-mips, linux-kernel, zhangfx, zhouqg, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

Hi,

On Tue, 2009-11-10 at 08:43 -0800, David Daney wrote:
[...]
> >>> +ifndef CONFIG_FUNCTION_TRACER
> >>>  cflags-y := -ffunction-sections
> >>> +else
> >>> +cflags-y := -mlong-calls
> >>> +endif
> >>>  cflags-y += $(call cc-option, -mno-check-zero-division)
> >>>  
> >> That doesn't make sense to me.  Modules are already compiled with 
> >> -mlong-calls.  The only time you would need the entire kernel compiled 
> >> with -mlong-calls is if the tracer were in a module.  The logic here 
> >> doesn't seem to reflect that.
> > 
> > Yes, I knew the module have gotten the -mlong-calls, Here we just want
> > to use -mlong-calls for the whole kernel, and then we get the same
> > _mcount stuff in the whole kernel, at last, we can use the same
> > scripts/recordmcount.pl and ftrace_make_nop & ftrace_make_call for the
> > dynamic ftracer.
> > 
> 
> -mlong-calls really degrades performance.  I have seen things like 6% 
> drop in network packet forwarding rates with -mlong-calls.
> 

So much of a drop? It seems only two instructions are added for it:
lui, addiu. From this viewpoint, I think -fno-omit-frame-pointer (add,
sd, move, ...) will also bring a big drop.

Is it time to remove them both: -mlong-calls, -fno-omit-frame-pointer?

> It would be better to fix all the tools so that they could handle both 
> -mlong-calls and -mno-long-calls code.
> 

It's totally possible; I will try to make it work later. I just wanted
to keep things simple, but if it really costs us that much, it's time
to fix it.

Regards,
	Wu Zhangjin


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v7 03/17] tracing: add MIPS specific trace_clock_local()
  2009-11-10 17:48   ` David Daney
@ 2009-11-11  2:47     ` Wu Zhangjin
  0 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-11  2:47 UTC (permalink / raw)
  To: David Daney
  Cc: linux-mips, linux-kernel, zhangfx, zhouqg, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

Hi,

On Tue, 2009-11-10 at 09:48 -0800, David Daney wrote:
> Wu Zhangjin wrote:
> [...]
> > + * trace_clock_local(): the simplest and least coherent tracing clock.
> > + *
> > + * Useful for tracing that does not cross to other CPUs nor
> > + * does it go through idle events.
> > + */
> > +u64 trace_clock_local(void)
> > +{
> > +	unsigned long flags;
> > +	u64 clock;
> > +
> > +	raw_local_irq_save(flags);
> > +
> > +	clock = mips_timecounter_read();
> > +
> > +	raw_local_irq_restore(flags);
> > +
> > +	return clock;
> > +}
> 
> Why disable interrupts?
> 

I got it from kernel/trace/trace_clock.c:

/*
 * trace_clock_local(): the simplest and least coherent tracing clock.
 *
 * Useful for tracing that does not cross to other CPUs nor
 * does it go through idle events.
 */
u64 notrace trace_clock_local(void)
{
        unsigned long flags;
        u64 clock;

        /*
         * sched_clock() is an architecture implemented, fast, scalable,
         * lockless clock. It is not guaranteed to be coherent across
         * CPUs, nor across CPU idle events.
         */
        raw_local_irq_save(flags);
        clock = sched_clock();
        raw_local_irq_restore(flags);

        return clock;
}

> Also you call the new function mips_timecounter_read().  Since 
> sched_clock() is a weak function, you can override the definition with a 
> more accurate version when possible.  Then you could just directly call 
> it here, instead of adding the new mips_timecounter_read() that the 
> '[PATCH v7 02/17] tracing: add mips_timecounter_read() for MIPS' adds.
> 

Yes, I tried to override sched_clock(), but the kernel then failed to
boot (it just hung there). Also, as you know, this version of
mips_timecounter_read() brings some overhead, so I don't think it's a
good idea to enable it for the whole system; I only enable it for
ftrace.

Regards,
	Wu Zhangjin


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-11  2:42         ` Wu Zhangjin
@ 2009-11-11  3:13           ` Ralf Baechle
  2009-11-11 13:30             ` Maciej W. Rozycki
  2009-11-11 14:13             ` Maciej W. Rozycki
  2009-11-11 13:15           ` Maciej W. Rozycki
  1 sibling, 2 replies; 36+ messages in thread
From: Ralf Baechle @ 2009-11-11  3:13 UTC (permalink / raw)
  To: Wu Zhangjin
  Cc: David Daney, linux-mips, linux-kernel, zhangfx, zhouqg, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

On Wed, Nov 11, 2009 at 10:42:31AM +0800, Wu Zhangjin wrote:

> > -mlong-calls really degrades performance.  I have seen things like 6% 
> > drop in network packet forwarding rates with -mlong-calls.
> > 
> 
> Such a big drop? It seems only two instructions are added for it: lui,
> addi. From this viewpoint, I think -fno-omit-frame-pointer (add, sd,
> move, ...) would also cause a big drop.

The calling sequence is quite badly bloated.  Example:

Normal 32/64-bit subroutine call:

	jal	symbol

32-bit with -mlong-call:

	lui	$25, %hi(foo)
	addiu	$25, %lo(foo)
	jalr	$25

64-bit with -mlong-call:

	lui	$25, %highest(foo)
	lui	$2, %hi(foo)
	daddiu	$25, %higher(foo)
	daddiu	$2, %lo(foo)
	dsll	$25, 32
	daddu	$25, $2
	jalr	$25

So, not considering the possible cost of the delay slot, that's 1 vs. 3 vs. 7
instructions.  Last I checked, ages ago, gcc didn't appropriately consider
this cost when generating -mlong-calls code, and Linux these days is also
optimized under the assumption that subroutine calls are cheap.

It's time that we get a -G optimization that works for the kernel; it would
allow us to cut down the -mlong-calls calling sequence to just:

	lw/ld	$25, offset($gp)
	jalr	$25

> Is it time to remove them: -mlong-calls and -fno-omit-frame-pointer?
> 
> > It would be better to fix all the tools so that they could handle both 
> > -mlong-calls and -mno-long-calls code.
> > 
> 
> It's totally possible; I will try to make it work later. I just wanted to
> keep the stuff simple, but if it really causes such a big drop, it's time
> to fix it.

  Ralf


* Re: [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-11  2:42         ` Wu Zhangjin
  2009-11-11  3:13           ` Ralf Baechle
@ 2009-11-11 13:15           ` Maciej W. Rozycki
  2009-11-11 13:51             ` Wu Zhangjin
  1 sibling, 1 reply; 36+ messages in thread
From: Maciej W. Rozycki @ 2009-11-11 13:15 UTC (permalink / raw)
  To: Wu Zhangjin
  Cc: David Daney, linux-mips, linux-kernel, zhangfx, zhouqg,
	Ralf Baechle, rostedt, Frederic Weisbecker, Ingo Molnar,
	Nicholas Mc Guire, Richard Sandiford, Patrik Kluba,
	Thomas Gleixner, Michal Simek

On Wed, 11 Nov 2009, Wu Zhangjin wrote:

> > -mlong-calls really degrades performance.  I have seen things like 6% 
> > drop in network packet forwarding rates with -mlong-calls.
> > 
> 
> Such a big drop? It seems only two instructions are added for it: lui,
> addi. From this viewpoint, I think -fno-omit-frame-pointer (add, sd,
> move, ...) would also cause a big drop.

 No, register jumps cannot be predicted -- this is where the performance 
goes on any serious processor -- the two extra instructions are nothing 
compared to that.  OTOH frame pointer calculations are pure arithmetic, so 
you only lose time incurred by the instructions themselves.

  Maciej


* Re: [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-11  3:13           ` Ralf Baechle
@ 2009-11-11 13:30             ` Maciej W. Rozycki
  2009-11-12 21:16               ` Ralf Baechle
  2009-11-11 14:13             ` Maciej W. Rozycki
  1 sibling, 1 reply; 36+ messages in thread
From: Maciej W. Rozycki @ 2009-11-11 13:30 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: Wu Zhangjin, David Daney, linux-mips, linux-kernel, zhangfx,
	zhouqg, rostedt, Frederic Weisbecker, Ingo Molnar,
	Nicholas Mc Guire, Richard Sandiford, Patrik Kluba,
	Thomas Gleixner, Michal Simek

On Wed, 11 Nov 2009, Ralf Baechle wrote:

> 64-bit with -mlong-call:
> 
> 	lui	$25, %highest(foo)
> 	lui	$2, %hi(foo)
> 	daddiu	$25, %higher(foo)
> 	daddiu	$2, %lo(foo)
> 	dsll	$25, 32
> 	daddu	$25, $2
> 	jalr	$25

 Don't we use -msym32 these days for this very reason?

  Maciej


* Re: [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-11 13:15           ` Maciej W. Rozycki
@ 2009-11-11 13:51             ` Wu Zhangjin
  2009-11-11 14:18               ` Maciej W. Rozycki
  0 siblings, 1 reply; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-11 13:51 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: David Daney, linux-mips, linux-kernel, zhangfx, zhouqg,
	Ralf Baechle, rostedt, Frederic Weisbecker, Ingo Molnar,
	Nicholas Mc Guire, Richard Sandiford, Patrik Kluba,
	Thomas Gleixner, Michal Simek

Hi,

On Wed, 2009-11-11 at 13:15 +0000, Maciej W. Rozycki wrote:
> On Wed, 11 Nov 2009, Wu Zhangjin wrote:
> 
> > > -mlong-calls really degrades performance.  I have seen things like 6% 
> > > drop in network packet forwarding rates with -mlong-calls.
> > > 
> > 
> > Such a big drop? It seems only two instructions are added for it: lui,
> > addi. From this viewpoint, I think -fno-omit-frame-pointer (add, sd,
> > move, ...) would also cause a big drop.
> 
>  No, register jumps cannot be predicted -- this is where the performance 
> goes on any serious processor -- the two extra instructions are nothing 
> compared to that.  OTOH frame pointer calculations are pure arithmetic, so 
> you only lose time incurred by the instructions themselves.

Yes, I was only comparing -mlong-calls with the original -mno-long-calls,
both with -pg.

The original one looks like this:

move ra, at
jal _mcount

The new one with -mlong-calls looks like this:

lui v1, HI_16BIT_OF_MCOUNT
addiu v1, v1, LOW_16BIT_OF_MCOUNT
move ra, at
jalr v1

Both of them have a "jump" instruction, so only lui and addiu are added
for -mlong-calls ;)

What about the difference between "jal _mcount" and "jalr v1"?

Regards,
	Wu Zhangjin



* Re: [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-11  3:13           ` Ralf Baechle
  2009-11-11 13:30             ` Maciej W. Rozycki
@ 2009-11-11 14:13             ` Maciej W. Rozycki
  2009-11-13 16:50               ` Ralf Baechle
  1 sibling, 1 reply; 36+ messages in thread
From: Maciej W. Rozycki @ 2009-11-11 14:13 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: Wu Zhangjin, David Daney, linux-mips, linux-kernel, zhangfx,
	zhouqg, rostedt, Frederic Weisbecker, Ingo Molnar,
	Nicholas Mc Guire, Richard Sandiford, Patrik Kluba,
	Thomas Gleixner, Michal Simek

On Wed, 11 Nov 2009, Ralf Baechle wrote:

> 32-bit with -mlong-call:
> 
> 	lui	$25, %hi(foo)
> 	addiu	$25, %lo(foo)
> 	jalr	$25
[...]
> It's time that we get a -G optimization that works for the kernel; it would
> allow to cut down the -mlong-calls calling sequence to just:
> 
> 	lw/ld	$25, offset($gp)
> 	jalr	$25

 Actually this may be no faster than the above.  The load produces its 
result late and the jump needs its data early, so unless a bypass has been 
implemented in the pipeline, it may well stall for the extra cycle (that's 
the reason for the load-delay slot in the original MIPS I ISA after all).

 Of course there is still the benefit of a reduced cache footprint, but 
the extra load may have to evict a cache line and flush the benefit down 
the drain.  I don't mean it's not to be considered, but it's not at all 
immediately obvious it would be a win.

  Maciej


* Re: [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-11 13:51             ` Wu Zhangjin
@ 2009-11-11 14:18               ` Maciej W. Rozycki
  2009-11-11 14:27                 ` Wu Zhangjin
  0 siblings, 1 reply; 36+ messages in thread
From: Maciej W. Rozycki @ 2009-11-11 14:18 UTC (permalink / raw)
  To: Wu Zhangjin
  Cc: David Daney, linux-mips, linux-kernel, zhangfx, zhouqg,
	Ralf Baechle, rostedt, Frederic Weisbecker, Ingo Molnar,
	Nicholas Mc Guire, Richard Sandiford, Patrik Kluba,
	Thomas Gleixner, Michal Simek

On Wed, 11 Nov 2009, Wu Zhangjin wrote:

> >  No, register jumps cannot be predicted -- this is where the performance 
> > goes on any serious processor -- the two extra instructions are nothing 
> > compared to that.  OTOH frame pointer calculations are pure arithmetic, so 
> > you only lose time incurred by the instructions themselves.
> 
> Yes, I only mean the -mlong-calls and the original -mno-long-calls with
> -pg.
> 
> The original one looks like this:
> 
> move ra, at
> jal _mcount
> 
> The new one with -mlong-calls looks like this:
> 
> lui v1, HI_16BIT_OF_MCOUNT
> addiu v1, v1, LOW_16BIT_OF_MCOUNT
> move ra, at
> jalr v1
> 
> Both of them have a "jump" instruction, so only lui and addiu are added
> for -mlong-calls ;)
> 
> What about the difference between "jal _mcount" and "jalr v1"?

 As I say, the latter cannot be predicted and will incur a stall for any 
decent pipeline.  With the former the target address of the jump can be 
calculated early and the instruction fetch unit can start feeding 
instructions from there into the pipeline even before the jump has reached 
the execution stage.

  Maciej


* Re: [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-11 14:18               ` Maciej W. Rozycki
@ 2009-11-11 14:27                 ` Wu Zhangjin
  0 siblings, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-11 14:27 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: David Daney, linux-mips, linux-kernel, zhangfx, zhouqg,
	Ralf Baechle, rostedt, Frederic Weisbecker, Ingo Molnar,
	Nicholas Mc Guire, Richard Sandiford, Patrik Kluba,
	Thomas Gleixner, Michal Simek

On Wed, 2009-11-11 at 14:18 +0000, Maciej W. Rozycki wrote:
> On Wed, 11 Nov 2009, Wu Zhangjin wrote:
> 
> > >  No, register jumps cannot be predicted -- this is where the performance 
> > > goes on any serious processor -- the two extra instructions are nothing 
> > > compared to that.  OTOH frame pointer calculations are pure arithmetic, so 
> > > you only lose time incurred by the instructions themselves.
> > 
> > Yes, I was only comparing -mlong-calls with the original
> > -mno-long-calls, both with -pg.
> > 
> > The original one looks like this:
> > 
> > move ra, at
> > jal _mcount
> > 
> > The new one with -mlong-calls looks like this:
> > 
> > lui v1, HI_16BIT_OF_MCOUNT
> > addiu v1, v1, LOW_16BIT_OF_MCOUNT
> > move ra, at
> > jalr v1
> > 
> > Both of them have a "jump" instruction, so only lui and addiu are added
> > for -mlong-calls ;)
> > 
> > What about the difference between "jal _mcount" and "jalr v1"?
> 
>  As I say, the latter cannot be predicted and will incur a stall for any 
> decent pipeline.  With the former the target address of the jump can be 
> calculated early and the instruction fetch unit can start feeding 
> instructions from there into the pipeline even before the jump has reached 
> the execution stage.
> 

Get it, thanks!

Regards,
	Wu Zhangjin



* Re: [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-09 15:31 ` [PATCH v7 04/17] tracing: add static function tracer support for MIPS Wu Zhangjin
  2009-11-10  0:26   ` David Daney
@ 2009-11-12 11:06   ` Wu Zhangjin
  1 sibling, 0 replies; 36+ messages in thread
From: Wu Zhangjin @ 2009-11-12 11:06 UTC (permalink / raw)
  To: linux-mips
  Cc: linux-kernel, zhangfx, zhouqg, Ralf Baechle, rostedt,
	Frederic Weisbecker, Ingo Molnar, Nicholas Mc Guire, David Daney,
	Richard Sandiford, Patrik Kluba, Thomas Gleixner, Michal Simek

Hi, All

On Mon, 2009-11-09 at 23:31 +0800, Wu Zhangjin wrote:
[...]
> 
> And to support module tracing, we need to enable -mlong-calls for the
> long call from module space to kernel space. -mlong-calls loads the
> address of _mcount into a register and then jumps to it, so the address
> is allowed to be 32 bits long; without -mlong-calls, the instruction
> "jal _mcount" leaves only 26 bits for the address of _mcount, which is
> not enough for jumping from module space to kernel space.
[...]
> diff --git a/arch/mips/Makefile b/arch/mips/Makefile
[...]
> +else
> +cflags-y := -mlong-calls
> +endif

I just made the dynamic ftrace work without the above patch.

Will send it out as v8 later.

Any more feedback on this v7 patchset?

Thanks & Regards,
	Wu Zhangjin



* Re: [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-11 13:30             ` Maciej W. Rozycki
@ 2009-11-12 21:16               ` Ralf Baechle
  0 siblings, 0 replies; 36+ messages in thread
From: Ralf Baechle @ 2009-11-12 21:16 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Wu Zhangjin, David Daney, linux-mips, linux-kernel, zhangfx,
	zhouqg, rostedt, Frederic Weisbecker, Ingo Molnar,
	Nicholas Mc Guire, Richard Sandiford, Patrik Kluba,
	Thomas Gleixner, Michal Simek

On Wed, Nov 11, 2009 at 01:30:40PM +0000, Maciej W. Rozycki wrote:

> 
> On Wed, 11 Nov 2009, Ralf Baechle wrote:
> 
> > 64-bit with -mlong-call:
> > 
> > 	lui	$25, %highest(foo)
> > 	lui	$2, %hi(foo)
> > 	daddiu	$25, %higher(foo)
> > 	daddiu	$2, %lo(foo)
> > 	dsll	$25, 32
> > 	daddu	$25, $2
> > 	jalr	$25
> 
>  Don't we use -msym32 these days for this very reason?

We do - where we can.  Not all systems can use -msym32.

  Ralf


* Re: [PATCH v7 04/17] tracing: add static function tracer support for MIPS
  2009-11-11 14:13             ` Maciej W. Rozycki
@ 2009-11-13 16:50               ` Ralf Baechle
  0 siblings, 0 replies; 36+ messages in thread
From: Ralf Baechle @ 2009-11-13 16:50 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Wu Zhangjin, David Daney, linux-mips, linux-kernel, zhangfx,
	zhouqg, rostedt, Frederic Weisbecker, Ingo Molnar,
	Nicholas Mc Guire, Richard Sandiford, Patrik Kluba,
	Thomas Gleixner, Michal Simek

On Wed, Nov 11, 2009 at 02:13:13PM +0000, Maciej W. Rozycki wrote:

> > 32-bit with -mlong-call:
> > 
> > 	lui	$25, %hi(foo)
> > 	addiu	$25, %lo(foo)
> > 	jalr	$25
> [...]
> > It's time that we get a -G optimization that works for the kernel; it would
> > allow to cut down the -mlong-calls calling sequence to just:
> > 
> > 	lw/ld	$25, offset($gp)
> > 	jalr	$25
> 
>  Actually this may be no faster than the above.  The load produces its 
> result late and the jump needs its data early, so unless a bypass has been 
> implemented in the pipeline, it may well stall for the extra cycle (that's 
> the reason for the load-delay slot in the original MIPS I ISA after all).
> 
>  Of course there is still the benefit of a reduced cache footprint, but 
> the extra load may have to evict a cache line and flush the benefit down 
> the drain.  I don't mean it's not to be considered, but it's not at all 
> immediately obvious it would be a win.

Yes; I was placing my bets on the cost of cache misses and for modules also
TLB misses.  In the end this needs to be benchmarked.

David Daney has an alternative approach in the works; he uses a wired TLB
entry in CKSEG2 with -msym32.  That'll work for everybody but R8000 and
maybe a few other esoteric cases.

  Ralf


end of thread, other threads:[~2009-11-13 16:50 UTC | newest]

Thread overview: 36+ messages
2009-11-09 15:29 [PATCH v7 00/17] ftrace for MIPS Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 01/17] tracing: convert trace_clock_local() as weak function Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 02/17] tracing: add mips_timecounter_read() for MIPS Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 03/17] tracing: add MIPS specific trace_clock_local() Wu Zhangjin
2009-11-10 17:48   ` David Daney
2009-11-11  2:47     ` Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 04/17] tracing: add static function tracer support for MIPS Wu Zhangjin
2009-11-10  0:26   ` David Daney
2009-11-10  1:00     ` Wu Zhangjin
2009-11-10 16:43       ` David Daney
2009-11-11  2:42         ` Wu Zhangjin
2009-11-11  3:13           ` Ralf Baechle
2009-11-11 13:30             ` Maciej W. Rozycki
2009-11-12 21:16               ` Ralf Baechle
2009-11-11 14:13             ` Maciej W. Rozycki
2009-11-13 16:50               ` Ralf Baechle
2009-11-11 13:15           ` Maciej W. Rozycki
2009-11-11 13:51             ` Wu Zhangjin
2009-11-11 14:18               ` Maciej W. Rozycki
2009-11-11 14:27                 ` Wu Zhangjin
2009-11-12 11:06   ` Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 05/17] tracing: enable HAVE_FUNCTION_TRACE_MCOUNT_TEST " Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 06/17] tracing: add an endian argument to scripts/recordmcount.pl Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 07/17] tracing: add dynamic function tracer support for MIPS Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 08/17] tracing: add IRQENTRY_EXIT section " Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 09/17] tracing: define a new __time_notrace annotation flag Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 10/17] tracing: not trace the timecounter_read* in kernel/time/clocksource.c Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 11/17] tracing: not trace mips_timecounter_read() for MIPS Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 12/17] tracing: add function graph tracer support " Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 13/17] tracing: add dynamic function graph tracer " Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 14/17] tracing: make ftrace for MIPS work without -fno-omit-frame-pointer Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 15/17] tracing: make ftrace for MIPS more robust Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 16/17] tracing: reserve $12(t0) for mcount-ra-address of gcc 4.5 Wu Zhangjin
2009-11-09 15:31 ` [PATCH v7 17/17] tracing: make function graph tracer work with -mmcount-ra-address Wu Zhangjin
2009-11-09 22:34   ` David Daney
2009-11-10  2:42     ` Wu Zhangjin
