linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/7] Porting dynmaic ftrace to PowerPC
@ 2008-11-16 21:24 Steven Rostedt
  2008-11-16 21:24 ` [PATCH 1/7] ftrace, PPC: do not latency trace idle Steven Rostedt
                   ` (7 more replies)
  0 siblings, 8 replies; 29+ messages in thread
From: Steven Rostedt @ 2008-11-16 21:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	David Miller, Benjamin Herrenschmidt, Frederic Weisbecker,
	Pekka Paalanen, linuxppc-dev, Rusty Russell, Paul Mackerras,
	Paul Mundt


The following patches are for my work on porting the new dynamic ftrace
framework to PowerPC.  The issue I had with both PPC64 and PPC32 is
that the calls to mcount are 24 bit jumps. Since the modules are
loaded in vmalloc address space, the call to mcount is farther than
what a 24 bit jump can make. The way PPC solves this is with the use
of trampolines. The trampoline is a memory space allocated within the
24 bit region of the module. The code in the trampoline that the
jump is made to does a far jump to the core kernel code.

The way PPC64 implements this is slightly different than the way
PPC32 does. Since my PPC64 box has a serial port it makes developing
and debugging easier, so my first patches port to PPC64, and then
the later patches include the work to get PPC32 working.

I'm describing what both PPC archs do in a bit of detail so that the
PPC exports CC'd can tell me if I'm incorrect. I did not read any
PPC specs to find out what was happening, I only reviewed the existing
PPC code that was in Linux.

The PPC64 dynamic ftrace:

PPC64, although works with 64 bit registers, the op codes are still
32 bit in length.  PPC64 uses table of contents (TOC) fields
to make their calls to functions. A function name is really a pointer
into the TOC table that stores the actual address of the function
along with the TOC of that function. The r2 register plays as the
TOC pointer. The actual name of the function is the function name
with a dot '.' prefix.  The reference name "schedule" is really
to the TOC entry, which calls the actual code with the reference
name ".schedule". This also explains why the list of available filter
functions on PPC64 all have a dot prefix.

When a funtion is called, it uses the 'bl' command which is a 24
bit function jump (saving the return address in the link register).
The next operation after all 'bl' calls is a nop. What the module
load code does when one of these 'bl' calls is farther than 24 bits
can handle, it creates a entry in the TOC and has the 'bl' call to
that entry. The entry in the TOC will save the r2 register on the
stack "40(r1)" load the actually function into the ctrl register
and make the far jump using that register (I'm using the term
'far' to mean more than 24 bits, nothing to do with the x86 far jumps
that deal with segments).  The module linker also modifies the
nop after the 'bl' call in the module into an op code that will restore
the r2 register 'ld r2,40(r1)'.

Now for what we need to do with dynamic ftrace on PPC64:

Dynamic ftrace needs to convert these calls to mcount into nops. It
also needs to be able to convert them back into a call to the
ftrace_caller (mcount is just a stub, the actual function recording
is done by a different function called ftrace_caller).

Before the dynamic ftrace modifies any code, it first makes sure
what it is changing is indeed what it expects it to be. This means
the dynamic ftrace code for PPC64 must be fully aware of the module
trampolines. When a mcount call is farther than 24 bits, it
takes a look at where that mcount call is at. The call should be into
an entry in the TOC, and the dynamic ftrace code reads the entry
that the call points to. It makes sure that the entry will make a
call to mcount (otherwise it returns failure).

After verifying that the 'bl' call calls into the TOC that calls
mcount, it converts that 'bl' call into a nop. It also converts
the following op code (the load of r2) into a nop since the TOC
is no longer saved. It does test that the following op is a load
of r2 before making any of the above changes. It also stores the
module structure pointer into the dyn_ftrace record field, for later
use.

On enabling the call back to ftrace_caller, the dynamic ftrace code
first verifys that the two op codes are two nops. It then reads
the dyn_ftrace structure module pointer to find the TOC and the entry
for the ftrace_caller (the ftrace_caller is added to the module
TOC on module load).  It then changes the call to the ftrace_caller
and the op that reloads the r2 register.

Note, to disable the ftrace_caller, the same as the disabling of
mcount is done, but instead it verifies that the TOC entry calls
the ftrace_caller and not mcount.


The PPC32 dynamic ftrace:

The work for PPC32 is very much the same as the PPC64 code but the 32
version does not need to deal with TOCS. This makes the code much
simpler. Pretty much everything as PPC64 is done, except it does not
need to index a TOC.

To disable mcount (or ftrace_caller):

If the call is greater than 24 bits, it looks to see where the 'bl'
jumps to. It verifies that the trampoline that it jumps to makes the
call to 'mcount' (or ftrace_caller if that is what is expected).
It then simply converts the 'bl' to a nop.

To enable ftrace_caller:

The 'bl' is converted to jump to the ftrace_caller trampoline entry
that was created on module load.

I've tested the following patches on both PPC64 and PPC32. I will
admit that the PPC64 does not seem that stable, but neither does the
code when all this is not enabled ;-)  I'll debug it more to see if
I can find the cause of my crashes, which may or may not be related
to the dynamic ftrace code. But the use of TOCS in PPC64 make me
a bit nervious that I did not do this correctly. Any help in reviewing
my code for mistakes would be greatly appreciated.

The following patches are in:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git

    branch: tip/ppc


Steven Rostedt (7):
      ftrace, PPC: do not latency trace idle
      ftrace, ppc: convert to new dynamic ftrace arch API
      ftrace: powerpc mcount record port
      ftrace, PPC: use probe_kernel API to modify code
      ftrace, PPC64: handle module trampolines for dyn ftrace
      ftrace,ppc32: enabled dynamic ftrace
      ftrace,ppc32: dynamic ftrace to handle modules

----
 arch/powerpc/Kconfig              |    2 +
 arch/powerpc/include/asm/ftrace.h |   14 +-
 arch/powerpc/include/asm/module.h |   16 ++-
 arch/powerpc/kernel/ftrace.c      |  460 +++++++++++++++++++++++++++++++++---
 arch/powerpc/kernel/idle.c        |    5 +
 arch/powerpc/kernel/module_32.c   |   10 +
 arch/powerpc/kernel/module_64.c   |   13 +
 scripts/recordmcount.pl           |   18 ++-
 8 files changed, 495 insertions(+), 43 deletions(-)

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/7] ftrace, PPC: do not latency trace idle
  2008-11-16 21:24 [PATCH 0/7] Porting dynmaic ftrace to PowerPC Steven Rostedt
@ 2008-11-16 21:24 ` Steven Rostedt
  2008-11-17  3:43   ` Paul Mackerras
  2008-11-16 21:24 ` [PATCH 2/7] ftrace, ppc: convert to new dynamic ftrace arch API Steven Rostedt
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2008-11-16 21:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	David Miller, Benjamin Herrenschmidt, Frederic Weisbecker,
	Pekka Paalanen, linuxppc-dev, Rusty Russell, Paul Mackerras,
	Paul Mundt, Steven Rostedt

[-- Attachment #1: 0001-ftrace-PPC-do-not-latency-trace-idle.patch --]
[-- Type: text/plain, Size: 1117 bytes --]

Impact: give better timing to latency tracers

When idle is called, interrupts are disabled, but the idle function
will still wake up on an interrupt. The problem is that the interrupt
disabled latency tracer will take this call to idle as a latency.

This patch disables the latency tracing when going into idle.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
CC:  Paul Mackerras <paulus@samba.org>
CC:  Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 arch/powerpc/kernel/idle.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/idle.c b/arch/powerpc/kernel/idle.c
index 31982d0..88d9c1d 100644
--- a/arch/powerpc/kernel/idle.c
+++ b/arch/powerpc/kernel/idle.c
@@ -69,10 +69,15 @@ void cpu_idle(void)
 				smp_mb();
 				local_irq_disable();
 
+				/* Don't trace irqs off for idle */
+				stop_critical_timings();
+
 				/* check again after disabling irqs */
 				if (!need_resched() && !cpu_should_die())
 					ppc_md.power_save();
 
+				start_critical_timings();
+
 				local_irq_enable();
 				set_thread_flag(TIF_POLLING_NRFLAG);
 
-- 
1.5.6.5

-- 

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/7] ftrace, ppc: convert to new dynamic ftrace arch API
  2008-11-16 21:24 [PATCH 0/7] Porting dynmaic ftrace to PowerPC Steven Rostedt
  2008-11-16 21:24 ` [PATCH 1/7] ftrace, PPC: do not latency trace idle Steven Rostedt
@ 2008-11-16 21:24 ` Steven Rostedt
  2008-11-17  3:57   ` Paul Mackerras
  2008-11-16 21:24 ` [PATCH 3/7] ftrace: powerpc mcount record port Steven Rostedt
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2008-11-16 21:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	David Miller, Benjamin Herrenschmidt, Frederic Weisbecker,
	Pekka Paalanen, linuxppc-dev, Rusty Russell, Paul Mackerras,
	Paul Mundt, Steven Rostedt

[-- Attachment #1: 0002-ftrace-ppc-convert-to-new-dynamic-ftrace-arch-API.patch --]
[-- Type: text/plain, Size: 3508 bytes --]

Impact: change PPC to use new ftrace arch API

This patch converts PPC to use the new dynamic ftrace arch API.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 arch/powerpc/include/asm/ftrace.h |   14 ++++++++-
 arch/powerpc/kernel/ftrace.c      |   60 +++++++++++++++++++++++++++++++++---
 2 files changed, 68 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
index b298f7a..17efecc 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -7,7 +7,19 @@
 
 #ifndef __ASSEMBLY__
 extern void _mcount(void);
-#endif
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+static inline unsigned long ftrace_call_adjust(unsigned long addr)
+{
+       /* reloction of mcount call site is the same as the address */
+       return addr;
+}
+
+struct dyn_arch_ftrace {
+	/* nothing yet */
+};
+#endif /*  CONFIG_DYNAMIC_FTRACE */
+#endif /* __ASSEMBLY__ */
 
 #endif
 
diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c
index f4b006e..aa5d0b2 100644
--- a/arch/powerpc/kernel/ftrace.c
+++ b/arch/powerpc/kernel/ftrace.c
@@ -33,12 +33,12 @@ static unsigned int ftrace_calc_offset(long ip, long addr)
 	return (int)(addr - ip);
 }
 
-unsigned char *ftrace_nop_replace(void)
+static unsigned char *ftrace_nop_replace(void)
 {
 	return (char *)&ftrace_nop;
 }
 
-unsigned char *ftrace_call_replace(unsigned long ip, unsigned long addr)
+static unsigned char *ftrace_call_replace(unsigned long ip, unsigned long addr)
 {
 	static unsigned int op;
 
@@ -68,7 +68,7 @@ unsigned char *ftrace_call_replace(unsigned long ip, unsigned long addr)
 # define _ASM_PTR	" .long "
 #endif
 
-int
+static int
 ftrace_modify_code(unsigned long ip, unsigned char *old_code,
 		   unsigned char *new_code)
 {
@@ -113,6 +113,55 @@ ftrace_modify_code(unsigned long ip, unsigned char *old_code,
 	return faulted;
 }
 
+static int test_24bit_addr(unsigned long ip, unsigned long addr)
+{
+	unsigned long diff;
+
+	/* can we get to addr from ip in 24 bits? */
+	diff = ip > addr ? ip - addr : addr - ip;
+
+	return !(diff & ((unsigned long)-1 << 24));
+}
+
+int ftrace_make_nop(struct module *mod,
+		    struct dyn_ftrace *rec, unsigned long addr)
+{
+	unsigned char *old, *new;
+
+	/*
+	 * If the calling address is more that 24 bits away,
+	 * then we had to use a trampoline to make the call.
+	 * Otherwise just update the call site.
+	 */
+	if (test_24bit_addr(rec->ip, addr)) {
+		/* within range */
+		old = ftrace_call_replace(rec->ip, addr);
+		new = ftrace_nop_replace();
+		return ftrace_modify_code(rec->ip, old, new);
+	}
+
+	return 0;
+}
+
+int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
+{
+	unsigned char *old, *new;
+
+	/*
+	 * If the calling address is more that 24 bits away,
+	 * then we had to use a trampoline to make the call.
+	 * Otherwise just update the call site.
+	 */
+	if (test_24bit_addr(rec->ip, addr)) {
+		/* within range */
+		old = ftrace_nop_replace();
+		new = ftrace_call_replace(rec->ip, addr);
+		return ftrace_modify_code(rec->ip, old, new);
+	}
+
+	return 0;
+}
+
 int ftrace_update_ftrace_func(ftrace_func_t func)
 {
 	unsigned long ip = (unsigned long)(&ftrace_call);
@@ -128,9 +177,10 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 
 int __init ftrace_dyn_arch_init(void *data)
 {
-	/* This is running in kstop_machine */
+	/* caller expects data to be zero */
+	unsigned long *p = data;
 
-	ftrace_mcount_set(data);
+	*p = 0;
 
 	return 0;
 }
-- 
1.5.6.5

-- 

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 3/7] ftrace: powerpc mcount record port
  2008-11-16 21:24 [PATCH 0/7] Porting dynmaic ftrace to PowerPC Steven Rostedt
  2008-11-16 21:24 ` [PATCH 1/7] ftrace, PPC: do not latency trace idle Steven Rostedt
  2008-11-16 21:24 ` [PATCH 2/7] ftrace, ppc: convert to new dynamic ftrace arch API Steven Rostedt
@ 2008-11-16 21:24 ` Steven Rostedt
  2008-11-16 21:24 ` [PATCH 4/7] ftrace, PPC: use probe_kernel API to modify code Steven Rostedt
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 29+ messages in thread
From: Steven Rostedt @ 2008-11-16 21:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	David Miller, Benjamin Herrenschmidt, Frederic Weisbecker,
	Pekka Paalanen, linuxppc-dev, Rusty Russell, Paul Mackerras,
	Paul Mundt, Steven Rostedt

[-- Attachment #1: 0003-ftrace-powerpc-mcount-record-port.patch --]
[-- Type: text/plain, Size: 2996 bytes --]

Impact: enable PowerPC for dynamic ftrace

This patch converts PowerPC to use the mcount location section.

Currently, modules will be ignored by the converter.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 arch/powerpc/Kconfig    |    2 ++
 scripts/recordmcount.pl |   13 +++++++++++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index adb23ea..99d9500 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -108,6 +108,8 @@ config ARCH_NO_VIRT_TO_BUS
 config PPC
 	bool
 	default y
+	select HAVE_FTRACE_MCOUNT_RECORD if PPC64
+	select HAVE_DYNAMIC_FTRACE
 	select HAVE_FUNCTION_TRACER
 	select ARCH_WANT_OPTIONAL_GPIOLIB
 	select HAVE_IDE
diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
index eeac71c..7acbe17 100755
--- a/scripts/recordmcount.pl
+++ b/scripts/recordmcount.pl
@@ -130,6 +130,7 @@ my %weak;		# List of weak functions
 my %convert;		# List of local functions used that needs conversion
 
 my $type;
+my $nm_regex;		# Find the local functions (return function)
 my $section_regex;	# Find the start of a section
 my $function_regex;	# Find the name of a function
 			#    (return offset and func name)
@@ -145,6 +146,7 @@ if ($arch eq "x86") {
 }
 
 if ($arch eq "x86_64") {
+    $nm_regex = "^[0-9a-fA-F]+\\s+t\\s+(\\S+)";
     $section_regex = "Disassembly of section\\s+(\\S+):";
     $function_regex = "^([0-9a-fA-F]+)\\s+<(.*?)>:";
     $mcount_regex = "^\\s*([0-9a-fA-F]+):.*\\smcount([+-]0x[0-9a-zA-Z]+)?\$";
@@ -158,6 +160,7 @@ if ($arch eq "x86_64") {
     $cc .= " -m64";
 
 } elsif ($arch eq "i386") {
+    $nm_regex = "^[0-9a-fA-F]+\\s+t\\s+(\\S+)";
     $section_regex = "Disassembly of section\\s+(\\S+):";
     $function_regex = "^([0-9a-fA-F]+)\\s+<(.*?)>:";
     $mcount_regex = "^\\s*([0-9a-fA-F]+):.*\\smcount\$";
@@ -170,6 +173,12 @@ if ($arch eq "x86_64") {
     $objcopy .= " -O elf32-i386";
     $cc .= " -m32";
 
+} elsif ($arch eq "powerpc") {
+    $nm_regex = "^[0-9a-fA-F]+\\s+t\\s+(\\.?\\S+)";
+    $section_regex = "Disassembly of section\\s+(\\S+):";
+    $function_regex = "^([0-9a-fA-F]+)\\s+<(\\.?.*?)>:";
+    $mcount_regex = "^\\s*([0-9a-fA-F]+):.*\\s\\.?_mcount\$";
+    $type = ".quad";
 } else {
     die "Arch $arch is not supported with CONFIG_FTRACE_MCOUNT_RECORD";
 }
@@ -239,7 +248,7 @@ if (!$found_version) {
 #
 open (IN, "$nm $inputfile|") || die "error running $nm";
 while (<IN>) {
-    if (/^[0-9a-fA-F]+\s+t\s+(\S+)/) {
+    if (/$nm_regex/) {
 	$locals{$1} = 1;
     } elsif (/^[0-9a-fA-F]+\s+([wW])\s+(\S+)/) {
 	$weak{$2} = $1;
@@ -291,7 +300,7 @@ sub update_funcs
 	    open(FILE, ">$mcount_s") || die "can't create $mcount_s\n";
 	    $opened = 1;
 	    print FILE "\t.section $mcount_section,\"a\",\@progbits\n";
-	    print FILE "\t.align $alignment\n";
+	    print FILE "\t.align $alignment\n" if (defined($alignment));
 	}
 	printf FILE "\t%s %s + %d\n", $type, $ref_func, $offsets[$i] - $offset;
     }
-- 
1.5.6.5

-- 

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 4/7] ftrace, PPC: use probe_kernel API to modify code
  2008-11-16 21:24 [PATCH 0/7] Porting dynmaic ftrace to PowerPC Steven Rostedt
                   ` (2 preceding siblings ...)
  2008-11-16 21:24 ` [PATCH 3/7] ftrace: powerpc mcount record port Steven Rostedt
@ 2008-11-16 21:24 ` Steven Rostedt
  2008-11-17  4:44   ` Paul Mackerras
  2008-11-16 21:24 ` [PATCH 5/7] ftrace, PPC64: handle module trampolines for dyn ftrace Steven Rostedt
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2008-11-16 21:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	David Miller, Benjamin Herrenschmidt, Frederic Weisbecker,
	Pekka Paalanen, linuxppc-dev, Rusty Russell, Paul Mackerras,
	Paul Mundt, Steven Rostedt

[-- Attachment #1: 0004-ftrace-PPC-use-probe_kernel-API-to-modify-code.patch --]
[-- Type: text/plain, Size: 2397 bytes --]

Impact: use cleaner probe_kernel API over assembly

Using probe_kernel_read/write interface is a much cleaner approach
than the current assembly version.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 arch/powerpc/kernel/ftrace.c |   53 ++++++++++++++++-------------------------
 1 files changed, 21 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c
index aa5d0b2..3852919 100644
--- a/arch/powerpc/kernel/ftrace.c
+++ b/arch/powerpc/kernel/ftrace.c
@@ -9,6 +9,7 @@
 
 #include <linux/spinlock.h>
 #include <linux/hardirq.h>
+#include <linux/uaccess.h>
 #include <linux/ftrace.h>
 #include <linux/percpu.h>
 #include <linux/init.h>
@@ -72,45 +73,33 @@ static int
 ftrace_modify_code(unsigned long ip, unsigned char *old_code,
 		   unsigned char *new_code)
 {
-	unsigned replaced;
-	unsigned old = *(unsigned *)old_code;
-	unsigned new = *(unsigned *)new_code;
-	int faulted = 0;
+	unsigned char replaced[MCOUNT_INSN_SIZE];
 
 	/*
 	 * Note: Due to modules and __init, code can
 	 *  disappear and change, we need to protect against faulting
-	 *  as well as code changing.
+	 *  as well as code changing. We do this by using the
+	 *  probe_kernel_* functions.
 	 *
 	 * No real locking needed, this code is run through
-	 * kstop_machine.
+	 * kstop_machine, or before SMP starts.
 	 */
-	asm volatile (
-		"1: lwz		%1, 0(%2)\n"
-		"   cmpw	%1, %5\n"
-		"   bne		2f\n"
-		"   stwu	%3, 0(%2)\n"
-		"2:\n"
-		".section .fixup, \"ax\"\n"
-		"3:	li %0, 1\n"
-		"	b 2b\n"
-		".previous\n"
-		".section __ex_table,\"a\"\n"
-		_ASM_ALIGN "\n"
-		_ASM_PTR "1b, 3b\n"
-		".previous"
-		: "=r"(faulted), "=r"(replaced)
-		: "r"(ip), "r"(new),
-		  "0"(faulted), "r"(old)
-		: "memory");
-
-	if (replaced != old && replaced != new)
-		faulted = 2;
-
-	if (!faulted)
-		flush_icache_range(ip, ip + 8);
-
-	return faulted;
+
+	/* read the text we want to modify */
+	if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
+		return -EFAULT;
+
+	/* Make sure it is what we expect it to be */
+	if (memcmp(replaced, old_code, MCOUNT_INSN_SIZE) != 0)
+		return -EINVAL;
+
+	/* replace the text with the new text */
+	if (probe_kernel_write((void *)ip, new_code, MCOUNT_INSN_SIZE))
+		return -EPERM;
+
+	flush_icache_range(ip, ip + 8);
+
+	return 0;
 }
 
 static int test_24bit_addr(unsigned long ip, unsigned long addr)
-- 
1.5.6.5

-- 

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 5/7] ftrace, PPC64: handle module trampolines for dyn ftrace
  2008-11-16 21:24 [PATCH 0/7] Porting dynmaic ftrace to PowerPC Steven Rostedt
                   ` (3 preceding siblings ...)
  2008-11-16 21:24 ` [PATCH 4/7] ftrace, PPC: use probe_kernel API to modify code Steven Rostedt
@ 2008-11-16 21:24 ` Steven Rostedt
  2008-11-17  5:00   ` Paul Mackerras
  2008-11-16 21:24 ` [PATCH 6/7] ftrace,ppc32: enabled dynamic ftrace Steven Rostedt
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2008-11-16 21:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	David Miller, Benjamin Herrenschmidt, Frederic Weisbecker,
	Pekka Paalanen, linuxppc-dev, Rusty Russell, Paul Mackerras,
	Paul Mundt, Steven Rostedt

[-- Attachment #1: 0005-ftrace-PPC64-handle-module-trampolines-for-dyn-ftr.patch --]
[-- Type: text/plain, Size: 9829 bytes --]

Impact: Implement PPC64 module trampolines for dyn ftrace

This adds code to handle the PPC64 module trampolines, and allows for
PPC64 to use dynamic ftrace.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 arch/powerpc/include/asm/ftrace.h |    3 +-
 arch/powerpc/include/asm/module.h |   11 ++
 arch/powerpc/kernel/ftrace.c      |  202 +++++++++++++++++++++++++++++++++++--
 arch/powerpc/kernel/module_64.c   |   13 +++
 4 files changed, 220 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
index 17efecc..d57f5bc 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -16,7 +16,8 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
 }
 
 struct dyn_arch_ftrace {
-	/* nothing yet */
+	struct module *mod;
+	unsigned long tramp;
 };
 #endif /*  CONFIG_DYNAMIC_FTRACE */
 #endif /* __ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/module.h b/arch/powerpc/include/asm/module.h
index e5f14b1..340bc69 100644
--- a/arch/powerpc/include/asm/module.h
+++ b/arch/powerpc/include/asm/module.h
@@ -34,6 +34,11 @@ struct mod_arch_specific {
 #ifdef __powerpc64__
 	unsigned int stubs_section;	/* Index of stubs section in module */
 	unsigned int toc_section;	/* What section is the TOC? */
+#ifdef CONFIG_DYNAMIC_FTRACE
+	unsigned long toc;
+	unsigned long tramp;
+#endif
+
 #else
 	/* Indices of PLT sections within module. */
 	unsigned int core_plt_section;
@@ -68,6 +73,12 @@ struct mod_arch_specific {
 #    endif	/* MODULE */
 #endif
 
+#ifdef CONFIG_DYNAMIC_FTRACE
+#    ifdef MODULE
+	asm(".section .ftrace.tramp,\"ax\",@nobits; .align 3; .previous");
+#    endif	/* MODULE */
+#endif
+
 
 struct exception_table_entry;
 void sort_ex_table(struct exception_table_entry *start,
diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c
index 3852919..acbec66 100644
--- a/arch/powerpc/kernel/ftrace.c
+++ b/arch/powerpc/kernel/ftrace.c
@@ -10,22 +10,29 @@
 #include <linux/spinlock.h>
 #include <linux/hardirq.h>
 #include <linux/uaccess.h>
+#include <linux/module.h>
 #include <linux/ftrace.h>
 #include <linux/percpu.h>
 #include <linux/init.h>
 #include <linux/list.h>
 
 #include <asm/cacheflush.h>
+#include <asm/code-patching.h>
 #include <asm/ftrace.h>
 
+#if 0
+#define DEBUGP printk
+#else
+#define DEBUGP(fmt , ...)	do { } while (0)
+#endif
 
-static unsigned int ftrace_nop = 0x60000000;
+static unsigned int ftrace_nop = PPC_NOP_INSTR;
 
 #ifdef CONFIG_PPC32
 # define GET_ADDR(addr) addr
 #else
 /* PowerPC64's functions are data that points to the functions */
-# define GET_ADDR(addr) *(unsigned long *)addr
+# define GET_ADDR(addr) (*(unsigned long *)addr)
 #endif
 
 
@@ -115,40 +122,219 @@ static int test_24bit_addr(unsigned long ip, unsigned long addr)
 int ftrace_make_nop(struct module *mod,
 		    struct dyn_ftrace *rec, unsigned long addr)
 {
+	unsigned char replaced[MCOUNT_INSN_SIZE * 2];
+	unsigned int *op = (unsigned *)&replaced;
+	unsigned char jmp[8];
+	unsigned long *ptr = (unsigned long *)&jmp;
 	unsigned char *old, *new;
+	unsigned long ip = rec->ip;
+	unsigned long tramp;
+	int offset;
 
 	/*
 	 * If the calling address is more that 24 bits away,
 	 * then we had to use a trampoline to make the call.
 	 * Otherwise just update the call site.
 	 */
-	if (test_24bit_addr(rec->ip, addr)) {
+	if (test_24bit_addr(ip, addr)) {
 		/* within range */
-		old = ftrace_call_replace(rec->ip, addr);
+		old = ftrace_call_replace(ip, addr);
 		new = ftrace_nop_replace();
-		return ftrace_modify_code(rec->ip, old, new);
+		return ftrace_modify_code(ip, old, new);
 	}
 
+#ifndef CONFIG_PPC64
+	/* only supported for PPC64 for now */
 	return 0;
+#else
+
+	/*
+	 * Out of range jumps are called from modules.
+	 * We should either already have a pointer to the module
+	 * or it has been passed in.
+	 */
+	if (!rec->arch.mod) {
+		if (!mod) {
+			printk(KERN_ERR "No module loaded addr=%lx\n",
+			       addr);
+			return -EFAULT;
+		}
+		rec->arch.mod = mod;
+	} else if (mod) {
+		printk(KERN_ERR
+		       "Record mod %p not equal to passed in mod %p\n",
+		       rec->arch.mod, mod);
+		return -EINVAL;
+	} else
+		mod = rec->arch.mod;
+
+	/* read where this goes */
+	if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
+		return -EFAULT;
+
+	/* Make sure that that this is still a 24bit jump */
+	if ((*op & 0xff000000) != 0x48000000) {
+		printk(KERN_ERR "Not expected bl: opcode is %x\n", *op);
+		return -EINVAL;
+	}
+
+	/* lets find where the pointer goes */
+	offset = (*op & 0x03fffffc);
+	/* make it signed */
+	if (offset & 0x02000000)
+		offset |= 0xfe000000;
+
+	tramp = ip + (long)offset;
+
+	/*
+	 * On PPC64 the trampoline looks like:
+	 * 0x3d, 0x82, 0x00, 0x00,    addis   r12,r2, <high>
+	 * 0x39, 0x8c, 0x00, 0x00,    addi    r12,r12, <low>
+	 *   Where the bytes 2,3,6 and 7 make up the 32bit offset
+	 *   to the TOC that holds the pointer.
+	 *   to jump to.
+	 * 0xf8, 0x41, 0x00, 0x28,    std     r2,40(r1)
+	 * 0xe9, 0x6c, 0x00, 0x20,    ld      r11,32(r12)
+	 *   The actually address is 32 bytes from the offset
+	 *   into the TOC.
+	 * 0xe8, 0x4c, 0x00, 0x28,    ld      r2,40(r12)
+	 */
+
+	DEBUGP("ip:%lx jumps to %lx r2: %lx", ip, tramp, mod->arch.toc);
+
+	/* Find where the trampoline jumps to */
+	if (probe_kernel_read(jmp, (void *)tramp, 8)) {
+		printk(KERN_ERR "Failed to read %lx\n", tramp);
+		return -EFAULT;
+	}
+
+	DEBUGP(" %08x %08x",
+	       (unsigned)(*ptr >> 32),
+	       (unsigned)*ptr);
+
+	offset = (unsigned)jmp[2] << 24 |
+		(unsigned)jmp[3] << 16 |
+		(unsigned)jmp[6] << 8 |
+		(unsigned)jmp[7];
+
+	DEBUGP(" %x ", offset);
+
+	/* get the address this jumps too */
+	tramp = mod->arch.toc + offset + 32;
+	DEBUGP("toc: %lx", tramp);
+
+	if (probe_kernel_read(jmp, (void *)tramp, 8)) {
+		printk(KERN_ERR "Failed to read %lx\n", tramp);
+		return -EFAULT;
+	}
+
+	DEBUGP(" %08x %08x\n",
+	       (unsigned)(*ptr >> 32),
+	       (unsigned)*ptr);
+
+	/* This should match what was called */
+	if (*ptr != GET_ADDR(addr)) {
+		printk(KERN_ERR "addr does not match %lx\n", *ptr);
+		return -EINVAL;
+	}
+
+	/*
+	 * We want to nop the line, but the next line is
+	 *  0xe8, 0x41, 0x00, 0x28   ld r2,40(r1)
+	 * This needs to be turned to a nop too.
+	 */
+	if (probe_kernel_read(replaced, (void *)(ip+4), MCOUNT_INSN_SIZE))
+		return -EFAULT;
+
+	if (*op != 0xe8410028) {
+		printk(KERN_ERR "Next line is not ld! (%08x)\n", *op);
+		return -EINVAL;
+	}
+
+	op[0] = PPC_NOP_INSTR;
+	op[1] = PPC_NOP_INSTR;
+
+	if (probe_kernel_write((void *)ip, replaced, MCOUNT_INSN_SIZE * 2))
+		return -EPERM;
+
+	return 0;
+#endif /* CONFIG_PPC64 */
 }
 
 int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 {
+	unsigned char replaced[MCOUNT_INSN_SIZE * 2];
+	unsigned int *op = (unsigned *)&replaced;
 	unsigned char *old, *new;
+	unsigned long ip = rec->ip;
+	unsigned long offset;
 
 	/*
 	 * If the calling address is more that 24 bits away,
 	 * then we had to use a trampoline to make the call.
 	 * Otherwise just update the call site.
 	 */
-	if (test_24bit_addr(rec->ip, addr)) {
+	if (test_24bit_addr(ip, addr)) {
 		/* within range */
 		old = ftrace_nop_replace();
-		new = ftrace_call_replace(rec->ip, addr);
-		return ftrace_modify_code(rec->ip, old, new);
+		new = ftrace_call_replace(ip, addr);
+		return ftrace_modify_code(ip, old, new);
 	}
 
+#ifndef CONFIG_PPC64
+	/* only supported for PPC64 for now */
+	return 0;
+#else
+
+	/*
+	 * Out of range jumps are called from modules.
+	 * Being that we are converting from nop, it had better
+	 * already have a module defined.
+	 */
+	if (!rec->arch.mod) {
+		printk(KERN_ERR "No module loaded\n");
+		return -EINVAL;
+	}
+
+	/* read where this goes */
+	if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE * 2))
+		return -EFAULT;
+
+	/* It should be pointing to two nops */
+	if ((op[0] != PPC_NOP_INSTR) ||
+	    (op[1] != PPC_NOP_INSTR)) {
+		printk(KERN_ERR "Expected NOPs but have %x %x\n", op[0], op[1]);
+		return -EINVAL;
+	}
+
+	/* If we never set up a trampoline to ftrace_caller, then bail */
+	if (!rec->arch.mod->arch.tramp) {
+		printk(KERN_ERR "No ftrace trampoline\n");
+		return -EINVAL;
+	}
+
+	/* now calculate a jump to the ftrace caller trampoline */
+	offset = rec->arch.mod->arch.tramp - ip;
+
+	if (offset + 0x2000000 > 0x3ffffff || (offset & 3) != 0) {
+		printk(KERN_ERR "REL24 %li out of range!\n",
+		       (long int)offset);
+		return -EINVAL;
+	}
+
+
+	/* Set to "bl addr" */
+	op[0] = 0x48000001 | (offset & 0x03fffffc);
+	/* ld r2,40(r1) */
+	op[1] = 0xe8410028;
+
+	DEBUGP("write to %lx\n", rec->ip);
+
+	if (probe_kernel_write((void *)ip, replaced, MCOUNT_INSN_SIZE * 2))
+		return -EPERM;
+
 	return 0;
+#endif /* CONFIG_PPC64 */
 }
 
 int ftrace_update_ftrace_func(ftrace_func_t func)
diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
index 1af2377..8992b03 100644
--- a/arch/powerpc/kernel/module_64.c
+++ b/arch/powerpc/kernel/module_64.c
@@ -20,6 +20,7 @@
 #include <linux/moduleloader.h>
 #include <linux/err.h>
 #include <linux/vmalloc.h>
+#include <linux/ftrace.h>
 #include <linux/bug.h>
 #include <asm/module.h>
 #include <asm/firmware.h>
@@ -163,6 +164,11 @@ static unsigned long get_stubs_size(const Elf64_Ehdr *hdr,
 		}
 	}
 
+#ifdef CONFIG_DYNAMIC_FTRACE
+	/* make the trampoline to the ftrace_caller */
+	relocs++;
+#endif
+
 	DEBUGP("Looks like a total of %lu stubs, max\n", relocs);
 	return relocs * sizeof(struct ppc64_stub_entry);
 }
@@ -441,5 +447,12 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 		}
 	}
 
+#ifdef CONFIG_DYNAMIC_FTRACE
+	me->arch.toc = my_r2(sechdrs, me);
+	me->arch.tramp = stub_for_addr(sechdrs,
+				       (unsigned long)ftrace_caller,
+				       me);
+#endif
+
 	return 0;
 }
-- 
1.5.6.5

-- 

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 6/7] ftrace,ppc32: enabled dynamic ftrace
  2008-11-16 21:24 [PATCH 0/7] Porting dynmaic ftrace to PowerPC Steven Rostedt
                   ` (4 preceding siblings ...)
  2008-11-16 21:24 ` [PATCH 5/7] ftrace, PPC64: handle module trampolines for dyn ftrace Steven Rostedt
@ 2008-11-16 21:24 ` Steven Rostedt
  2008-11-16 21:24 ` [PATCH 7/7] ftrace,ppc32: dynamic ftrace to handle modules Steven Rostedt
  2008-11-16 22:55 ` [PATCH 0/7] Porting dynmaic ftrace to PowerPC Paul Mackerras
  7 siblings, 0 replies; 29+ messages in thread
From: Steven Rostedt @ 2008-11-16 21:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	David Miller, Benjamin Herrenschmidt, Frederic Weisbecker,
	Pekka Paalanen, linuxppc-dev, Rusty Russell, Paul Mackerras,
	Paul Mundt, Steven Rostedt

[-- Attachment #1: 0006-ftrace-ppc32-enabled-dynamic-ftrace.patch --]
[-- Type: text/plain, Size: 1339 bytes --]

Impact: enable dynamic ftrace for PPC32

This patch adds the necessary hooks to get PPC32 dynamic ftrace working.
It does not handle modules. They are ignored by this patch.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 arch/powerpc/Kconfig    |    2 +-
 scripts/recordmcount.pl |    7 ++++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 99d9500..efcc2c4 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -108,7 +108,7 @@ config ARCH_NO_VIRT_TO_BUS
 config PPC
 	bool
 	default y
-	select HAVE_FTRACE_MCOUNT_RECORD if PPC64
+	select HAVE_FTRACE_MCOUNT_RECORD
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_FUNCTION_TRACER
 	select ARCH_WANT_OPTIONAL_GPIOLIB
diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
index 7acbe17..48609e9 100755
--- a/scripts/recordmcount.pl
+++ b/scripts/recordmcount.pl
@@ -178,7 +178,12 @@ if ($arch eq "x86_64") {
     $section_regex = "Disassembly of section\\s+(\\S+):";
     $function_regex = "^([0-9a-fA-F]+)\\s+<(\\.?.*?)>:";
     $mcount_regex = "^\\s*([0-9a-fA-F]+):.*\\s\\.?_mcount\$";
-    $type = ".quad";
+    if ($bits == 64) {
+	$type = ".quad";
+    } else {
+	$type = ".long";
+    }
+
 } else {
     die "Arch $arch is not supported with CONFIG_FTRACE_MCOUNT_RECORD";
 }
-- 
1.5.6.5

-- 

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 7/7] ftrace,ppc32: dynamic ftrace to handle modules
  2008-11-16 21:24 [PATCH 0/7] Porting dynmaic ftrace to PowerPC Steven Rostedt
                   ` (5 preceding siblings ...)
  2008-11-16 21:24 ` [PATCH 6/7] ftrace,ppc32: enabled dynamic ftrace Steven Rostedt
@ 2008-11-16 21:24 ` Steven Rostedt
  2008-11-16 22:55 ` [PATCH 0/7] Porting dynmaic ftrace to PowerPC Paul Mackerras
  7 siblings, 0 replies; 29+ messages in thread
From: Steven Rostedt @ 2008-11-16 21:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	David Miller, Benjamin Herrenschmidt, Frederic Weisbecker,
	Pekka Paalanen, linuxppc-dev, Rusty Russell, Paul Mackerras,
	Paul Mundt, Steven Rostedt

[-- Attachment #1: 0007-ftrace-ppc32-dynamic-ftrace-to-handle-modules.patch --]
[-- Type: text/plain, Size: 9267 bytes --]

Impact: enable modules for dynamic ftrace on PPC32

This patch performs the necessary trampoline calls to handle
modules with dynamic ftrace.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 arch/powerpc/include/asm/ftrace.h |    1 -
 arch/powerpc/include/asm/module.h |    5 +-
 arch/powerpc/kernel/ftrace.c      |  219 +++++++++++++++++++++++++++++++-----
 arch/powerpc/kernel/module_32.c   |   10 ++
 4 files changed, 202 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
index d57f5bc..e5f2ae8 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -17,7 +17,6 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
 
 struct dyn_arch_ftrace {
 	struct module *mod;
-	unsigned long tramp;
 };
 #endif /*  CONFIG_DYNAMIC_FTRACE */
 #endif /* __ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/module.h b/arch/powerpc/include/asm/module.h
index 340bc69..0845488 100644
--- a/arch/powerpc/include/asm/module.h
+++ b/arch/powerpc/include/asm/module.h
@@ -39,11 +39,14 @@ struct mod_arch_specific {
 	unsigned long tramp;
 #endif
 
-#else
+#else /* powerpc64 */
 	/* Indices of PLT sections within module. */
 	unsigned int core_plt_section;
 	unsigned int init_plt_section;
+#ifdef CONFIG_DYNAMIC_FTRACE
+	unsigned long tramp;
 #endif
+#endif /* powerpc64 */
 
 	/* List of BUG addresses, source line numbers and filenames */
 	struct list_head bug_list;
diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c
index acbec66..57fdda8 100644
--- a/arch/powerpc/kernel/ftrace.c
+++ b/arch/powerpc/kernel/ftrace.c
@@ -119,36 +119,20 @@ static int test_24bit_addr(unsigned long ip, unsigned long addr)
 	return !(diff & ((unsigned long)-1 << 24));
 }
 
-int ftrace_make_nop(struct module *mod,
-		    struct dyn_ftrace *rec, unsigned long addr)
+#ifdef CONFIG_PPC64
+static int
+__ftrace_make_nop(struct module *mod,
+		  struct dyn_ftrace *rec, unsigned long addr)
 {
 	unsigned char replaced[MCOUNT_INSN_SIZE * 2];
 	unsigned int *op = (unsigned *)&replaced;
 	unsigned char jmp[8];
 	unsigned long *ptr = (unsigned long *)&jmp;
-	unsigned char *old, *new;
 	unsigned long ip = rec->ip;
 	unsigned long tramp;
 	int offset;
 
 	/*
-	 * If the calling address is more that 24 bits away,
-	 * then we had to use a trampoline to make the call.
-	 * Otherwise just update the call site.
-	 */
-	if (test_24bit_addr(ip, addr)) {
-		/* within range */
-		old = ftrace_call_replace(ip, addr);
-		new = ftrace_nop_replace();
-		return ftrace_modify_code(ip, old, new);
-	}
-
-#ifndef CONFIG_PPC64
-	/* only supported for PPC64 for now */
-	return 0;
-#else
-
-	/*
 	 * Out of range jumps are called from modules.
 	 * We should either already have a pointer to the module
 	 * or it has been passed in.
@@ -258,16 +242,105 @@ int ftrace_make_nop(struct module *mod,
 		return -EPERM;
 
 	return 0;
-#endif /* CONFIG_PPC64 */
 }
 
-int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
+#else /* !PPC64 */
+static int
+__ftrace_make_nop(struct module *mod,
+		  struct dyn_ftrace *rec, unsigned long addr)
 {
-	unsigned char replaced[MCOUNT_INSN_SIZE * 2];
+	unsigned char replaced[MCOUNT_INSN_SIZE];
 	unsigned int *op = (unsigned *)&replaced;
+	unsigned char jmp[8];
+	unsigned int *ptr = (unsigned int *)&jmp;
+	unsigned long ip = rec->ip;
+	unsigned long tramp;
+	int offset;
+
+	/*
+	 * Out of range jumps are called from modules.
+	 * We should either already have a pointer to the module
+	 * or it has been passed in.
+	 */
+	if (!rec->arch.mod) {
+		if (!mod) {
+			printk(KERN_ERR "No module loaded addr=%lx\n",
+			       addr);
+			return -EFAULT;
+		}
+		rec->arch.mod = mod;
+	} else if (mod) {
+		printk(KERN_ERR
+		       "Record mod %p not equal to passed in mod %p\n",
+		       rec->arch.mod, mod);
+		return -EINVAL;
+	} else
+		mod = rec->arch.mod;
+
+	/* read where this goes */
+	if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
+		return -EFAULT;
+
+	/* Make sure that that this is still a 24bit jump */
+	if ((*op & 0xff000000) != 0x48000000) {
+		printk(KERN_ERR "Not expected bl: opcode is %x\n", *op);
+		return -EINVAL;
+	}
+
+	/* lets find where the pointer goes */
+	offset = (*op & 0x03fffffc);
+	/* make it signed */
+	if (offset & 0x02000000)
+		offset |= 0xfe000000;
+
+	tramp = ip + (long)offset;
+
+	/*
+	 * On PPC32 the trampoline looks like:
+	 * lis r11,sym@ha
+	 * addi r11,r11,sym@l
+	 * mtctr r11
+	 * bctr
+	 */
+
+	DEBUGP("ip:%lx jumps to %lx", ip, tramp);
+
+	/* Find where the trampoline jumps to */
+	if (probe_kernel_read(jmp, (void *)tramp, 8)) {
+		printk(KERN_ERR "Failed to read %lx\n", tramp);
+		return -EFAULT;
+	}
+
+	DEBUGP(" %08x %08x ", ptr[0], ptr[1]);
+
+	tramp = (ptr[1] & 0xffff) |
+		((ptr[0] & 0xffff) << 16);
+	if (tramp & 0x8000)
+		tramp -= 0x10000;
+
+	DEBUGP(" %x ", tramp);
+
+	if (tramp != addr) {
+		printk(KERN_ERR
+		       "Trampoline location %08lx does not match addr\n",
+		       tramp);
+		return -EINVAL;
+	}
+
+	op[0] = PPC_NOP_INSTR;
+
+	if (probe_kernel_write((void *)ip, replaced, MCOUNT_INSN_SIZE))
+		return -EPERM;
+
+	return 0;
+}
+#endif /* PPC64 */
+
+int ftrace_make_nop(struct module *mod,
+		    struct dyn_ftrace *rec, unsigned long addr)
+{
 	unsigned char *old, *new;
 	unsigned long ip = rec->ip;
-	unsigned long offset;
 
 	/*
 	 * If the calling address is more that 24 bits away,
@@ -276,15 +349,23 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 	 */
 	if (test_24bit_addr(ip, addr)) {
 		/* within range */
-		old = ftrace_nop_replace();
-		new = ftrace_call_replace(ip, addr);
+		old = ftrace_call_replace(ip, addr);
+		new = ftrace_nop_replace();
 		return ftrace_modify_code(ip, old, new);
 	}
 
-#ifndef CONFIG_PPC64
-	/* only supported for PPC64 for now */
-	return 0;
-#else
+	return __ftrace_make_nop(mod, rec, addr);
+
+}
+
+#ifdef CONFIG_PPC64
+static int
+__ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
+{
+	unsigned char replaced[MCOUNT_INSN_SIZE * 2];
+	unsigned int *op = (unsigned *)&replaced;
+	unsigned long ip = rec->ip;
+	unsigned long offset;
 
 	/*
 	 * Out of range jumps are called from modules.
@@ -334,7 +415,83 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 		return -EPERM;
 
 	return 0;
-#endif /* CONFIG_PPC64 */
+}
+#else
+static int
+__ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
+{
+	unsigned char replaced[MCOUNT_INSN_SIZE];
+	unsigned int *op = (unsigned *)&replaced;
+	unsigned long ip = rec->ip;
+	unsigned long offset;
+
+	/*
+	 * Out of range jumps are called from modules.
+	 * Being that we are converting from nop, it had better
+	 * already have a module defined.
+	 */
+	if (!rec->arch.mod) {
+		printk(KERN_ERR "No module loaded\n");
+		return -EINVAL;
+	}
+
+	/* read where this goes */
+	if (probe_kernel_read(replaced, (void *)ip, MCOUNT_INSN_SIZE))
+		return -EFAULT;
+
+	/* It should be pointing to a nop */
+	if (op[0] != PPC_NOP_INSTR) {
+		printk(KERN_ERR "Expected NOP but have %x\n", op[0]);
+		return -EINVAL;
+	}
+
+	/* If we never set up a trampoline to ftrace_caller, then bail */
+	if (!rec->arch.mod->arch.tramp) {
+		printk(KERN_ERR "No ftrace trampoline\n");
+		return -EINVAL;
+	}
+
+	/* now calculate a jump to the ftrace caller trampoline */
+	offset = rec->arch.mod->arch.tramp - ip;
+
+	if (offset + 0x2000000 > 0x3ffffff || (offset & 3) != 0) {
+		printk(KERN_ERR "REL24 %li out of range!\n",
+		       (long int)offset);
+		return -EINVAL;
+	}
+
+
+	/* Set to "bl addr" */
+	op[0] = 0x48000001 | (offset & 0x03fffffc);
+
+	DEBUGP("write to %lx\n", rec->ip);
+
+	if (probe_kernel_write((void *)ip, replaced, MCOUNT_INSN_SIZE))
+		return -EPERM;
+
+	return 0;
+}
+#endif
+
+int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
+{
+	unsigned char *old, *new;
+	unsigned long ip = rec->ip;
+
+	/*
+	 * If the calling address is more that 24 bits away,
+	 * then we had to use a trampoline to make the call.
+	 * Otherwise just update the call site.
+	 */
+	if (test_24bit_addr(ip, addr)) {
+		/* within range */
+		old = ftrace_nop_replace();
+		new = ftrace_call_replace(ip, addr);
+		return ftrace_modify_code(ip, old, new);
+	}
+
+	return __ftrace_make_call(rec, addr);
+
 }
 
 int ftrace_update_ftrace_func(ftrace_func_t func)
diff --git a/arch/powerpc/kernel/module_32.c b/arch/powerpc/kernel/module_32.c
index 2df91a0..f832773 100644
--- a/arch/powerpc/kernel/module_32.c
+++ b/arch/powerpc/kernel/module_32.c
@@ -22,6 +22,7 @@
 #include <linux/fs.h>
 #include <linux/string.h>
 #include <linux/kernel.h>
+#include <linux/ftrace.h>
 #include <linux/cache.h>
 #include <linux/bug.h>
 #include <linux/sort.h>
@@ -53,6 +54,9 @@ static unsigned int count_relocs(const Elf32_Rela *rela, unsigned int num)
 			r_addend = rela[i].r_addend;
 		}
 
+#ifdef CONFIG_DYNAMIC_FTRACE
+	_count_relocs++;	/* add one for ftrace_caller */
+#endif
 	return _count_relocs;
 }
 
@@ -306,5 +310,11 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,
 			return -ENOEXEC;
 		}
 	}
+#ifdef CONFIG_DYNAMIC_FTRACE
+	module->arch.tramp =
+		do_plt_call(module->module_core,
+			    (unsigned long)ftrace_caller,
+			    sechdrs, module);
+#endif
 	return 0;
 }
-- 
1.5.6.5

-- 

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC
  2008-11-16 21:24 [PATCH 0/7] Porting dynmaic ftrace to PowerPC Steven Rostedt
                   ` (6 preceding siblings ...)
  2008-11-16 21:24 ` [PATCH 7/7] ftrace,ppc32: dynamic ftrace to handle modules Steven Rostedt
@ 2008-11-16 22:55 ` Paul Mackerras
  2008-11-17 15:42   ` Steven Rostedt
  7 siblings, 1 reply; 29+ messages in thread
From: Paul Mackerras @ 2008-11-16 22:55 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Miller, Benjamin Herrenschmidt,
	Frederic Weisbecker, Pekka Paalanen, linuxppc-dev, Rusty Russell,
	Paul Mundt

Steven Rostedt writes:

> The following patches are for my work on porting the new dynamic ftrace
> framework to PowerPC.  The issue I had with both PPC64 and PPC32 is
> that the calls to mcount are 24 bit jumps. Since the modules are
> loaded in vmalloc address space, the call to mcount is farther than
> what a 24 bit jump can make. The way PPC solves this is with the use
> of trampolines. The trampoline is a memory space allocated within the
> 24 bit region of the module. The code in the trampoline that the
> jump is made to does a far jump to the core kernel code.

Thanks for doing this work.  I'll go through the patches in detail
today, but first I'd like to clear up a couple of things for you.  The
first is that unconditional branches on PowerPC effectively have a
26-bit sign-extended offset, not 24-bit.  The offset field in the
instruction is 24 bits long, but because all instructions are 4 bytes
long, two extra 0 bits get appended to the offset field, giving a
26-bit offset and a range of +/- 32MB from the branch instruction.

> PPC64, although works with 64 bit registers, the op codes are still
> 32 bit in length.  PPC64 uses table of contents (TOC) fields
> to make their calls to functions. A function name is really a pointer
> into the TOC table that stores the actual address of the function
> along with the TOC of that function. The r2 register plays as the
> TOC pointer. The actual name of the function is the function name
> with a dot '.' prefix.  The reference name "schedule" is really
> to the TOC entry, which calls the actual code with the reference
> name ".schedule". This also explains why the list of available filter
> functions on PPC64 all have a dot prefix.

A little more detail: the TOC mainly stores addresses and other
constants.  Functions have a descriptor that is stored in a .opd
section (not the TOC, though the TOC may contain pointers to procedure
descriptors).  Each descriptor has the address of the code, the
address of the TOC for the function, and a static chain pointer (not
used for C, but can used for other languages).  As you note, the
symbol for a function will be the address of the descriptor rather
than the address of the function code.

> When a funtion is called, it uses the 'bl' command which is a 24
> bit function jump (saving the return address in the link register).
> The next operation after all 'bl' calls is a nop. What the module
> load code does when one of these 'bl' calls is farther than 24 bits
> can handle, it creates a entry in the TOC and has the 'bl' call to

The module loader allocates some memory for these trampolines, but
that's a distinct area from the TOC and the OPD section.

> that entry. The entry in the TOC will save the r2 register on the
> stack "40(r1)" load the actually function into the ctrl register

"counter" register, actually, not "ctrl".

> The work for PPC32 is very much the same as the PPC64 code but the 32
> version does not need to deal with TOCS. This makes the code much
> simpler. Pretty much everything as PPC64 is done, except it does not
> need to index a TOC.

Right.

> I've tested the following patches on both PPC64 and PPC32. I will
> admit that the PPC64 does not seem that stable, but neither does the
> code when all this is not enabled ;-)  I'll debug it more to see if
> I can find the cause of my crashes, which may or may not be related
> to the dynamic ftrace code. But the use of TOCS in PPC64 make me
> a bit nervious that I did not do this correctly. Any help in reviewing
> my code for mistakes would be greatly appreciated.

Hmmm.  What sort of crashes are you seeing?

Regards,
Paul.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/7] ftrace, PPC: do not latency trace idle
  2008-11-16 21:24 ` [PATCH 1/7] ftrace, PPC: do not latency trace idle Steven Rostedt
@ 2008-11-17  3:43   ` Paul Mackerras
  2008-11-17 15:43     ` Steven Rostedt
  0 siblings, 1 reply; 29+ messages in thread
From: Paul Mackerras @ 2008-11-17  3:43 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Miller, Benjamin Herrenschmidt,
	Frederic Weisbecker, Pekka Paalanen, linuxppc-dev, Rusty Russell,
	Paul Mundt, Steven Rostedt

Steven Rostedt writes:

> When idle is called, interrupts are disabled, but the idle function
> will still wake up on an interrupt. The problem is that the interrupt
> disabled latency tracer will take this call to idle as a latency.
> 
> This patch disables the latency tracing when going into idle.

Patch looks OK, but what is the connection with ftrace?

Paul.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/7] ftrace, ppc: convert to new dynamic ftrace arch API
  2008-11-16 21:24 ` [PATCH 2/7] ftrace, ppc: convert to new dynamic ftrace arch API Steven Rostedt
@ 2008-11-17  3:57   ` Paul Mackerras
  2008-11-17 15:47     ` Steven Rostedt
  0 siblings, 1 reply; 29+ messages in thread
From: Paul Mackerras @ 2008-11-17  3:57 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Miller, Benjamin Herrenschmidt,
	Frederic Weisbecker, Pekka Paalanen, linuxppc-dev, Rusty Russell,
	Paul Mundt, Steven Rostedt

Steven Rostedt writes:

> +static int test_24bit_addr(unsigned long ip, unsigned long addr)
> +{
> +	unsigned long diff;
> +
> +	/* can we get to addr from ip in 24 bits? */
> +	diff = ip > addr ? ip - addr : addr - ip;
> +
> +	return !(diff & ((unsigned long)-1 << 24));

Since a branch instruction can reach +/- 32MB, the 24 needs to be 25.
It's 25 not 26 since you have effectively accounted for one bit by
taking the absolute value of the offset.  Also, this will say that an
offset of -0x2000000 is out of range whereas it is just in range.
That probably doesn't matter unless you're relying on this to give
exactly the same answer in all cases as tests done elsewhere (e.g. in
module_64.c).

Note that the address difference being too large isn't the only reason
for a procedure call to go via a trampoline on ppc64.  A trampoline is
also needed when the called function uses a different TOC from the
caller.  The most obvious example of this is when the caller is in a
module and the callee is in the main kernel or another module.

Currently all functions in the main kernel use the same TOC, but the
linker is capable of dividing up a large executable into groups of
functions and assigning a different TOC to each group, in order to
avoid any of the TOCs getting too large (a TOC is limited to 64kB).
If the linker does that, it automatically inserts trampolines to
handle the TOC change.

Paul.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/7] ftrace, PPC: use probe_kernel API to modify code
  2008-11-16 21:24 ` [PATCH 4/7] ftrace, PPC: use probe_kernel API to modify code Steven Rostedt
@ 2008-11-17  4:44   ` Paul Mackerras
  2008-11-17 15:51     ` Steven Rostedt
  0 siblings, 1 reply; 29+ messages in thread
From: Paul Mackerras @ 2008-11-17  4:44 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Miller, Benjamin Herrenschmidt,
	Frederic Weisbecker, Pekka Paalanen, linuxppc-dev, Rusty Russell,
	Paul Mundt, Steven Rostedt

Steven Rostedt writes:

> Impact: use cleaner probe_kernel API over assembly
> 
> Using probe_kernel_read/write interface is a much cleaner approach
> than the current assembly version.

Possibly naive question: how is it possible for the accesses to the
instructions to fault, given that we are called through kstop_machine
(according to the comment) and nothing else should be happening?

Paul.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 5/7] ftrace, PPC64: handle module trampolines for dyn ftrace
  2008-11-16 21:24 ` [PATCH 5/7] ftrace, PPC64: handle module trampolines for dyn ftrace Steven Rostedt
@ 2008-11-17  5:00   ` Paul Mackerras
  2008-11-17 16:02     ` Steven Rostedt
  2008-11-17 16:18     ` Steven Rostedt
  0 siblings, 2 replies; 29+ messages in thread
From: Paul Mackerras @ 2008-11-17  5:00 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Miller, Benjamin Herrenschmidt,
	Frederic Weisbecker, Pekka Paalanen, linuxppc-dev, Rusty Russell,
	Paul Mundt, Steven Rostedt

Steven Rostedt writes:

> +	if (!rec->arch.mod) {
> +		if (!mod) {
> +			printk(KERN_ERR "No module loaded addr=%lx\n",
> +			       addr);
> +			return -EFAULT;
> +		}
> +		rec->arch.mod = mod;
> +	} else if (mod) {
> +		printk(KERN_ERR
> +		       "Record mod %p not equal to passed in mod %p\n",
> +		       rec->arch.mod, mod);
> +		return -EINVAL;

That looks wrong; surely you need an if (mod != rec->arch.mod)
somewhere here?

> +	/* Make sure that that this is still a 24bit jump */
> +	if ((*op & 0xff000000) != 0x48000000) {
> +		printk(KERN_ERR "Not expected bl: opcode is %x\n", *op);
> +		return -EINVAL;
> +	}

Needs to be

	if ((*op & 0xfc000003) != 0x48000001)

since the major opcode is the top 6 bits, and the bottom 2 bits are
the link bit and the absolute address bit.

> +	/* lets find where the pointer goes */
> +	offset = (*op & 0x03fffffc);
> +	/* make it signed */
> +	if (offset & 0x02000000)
> +		offset |= 0xfe000000;

but you got that right... :)

> +	/* get the address this jumps too */
> +	tramp = mod->arch.toc + offset + 32;

Why + 32?

Paul.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC
  2008-11-16 22:55 ` [PATCH 0/7] Porting dynmaic ftrace to PowerPC Paul Mackerras
@ 2008-11-17 15:42   ` Steven Rostedt
  2008-11-17 20:03     ` Steven Rostedt
  2008-11-18 13:56     ` Steven Rostedt
  0 siblings, 2 replies; 29+ messages in thread
From: Steven Rostedt @ 2008-11-17 15:42 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: LKML, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Miller, Benjamin Herrenschmidt,
	Frederic Weisbecker, Pekka Paalanen, linuxppc-dev, Rusty Russell,
	Paul Mundt


On Mon, 17 Nov 2008, Paul Mackerras wrote:

> Steven Rostedt writes:
> 
> > The following patches are for my work on porting the new dynamic ftrace
> > framework to PowerPC.  The issue I had with both PPC64 and PPC32 is
> > that the calls to mcount are 24 bit jumps. Since the modules are
> > loaded in vmalloc address space, the call to mcount is farther than
> > what a 24 bit jump can make. The way PPC solves this is with the use
> > of trampolines. The trampoline is a memory space allocated within the
> > 24 bit region of the module. The code in the trampoline that the
> > jump is made to does a far jump to the core kernel code.
> 
> Thanks for doing this work.  I'll go through the patches in detail
> today, but first I'd like to clear up a couple of things for you.  The
> first is that unconditional branches on PowerPC effectively have a
> 26-bit sign-extended offset, not 24-bit.  The offset field in the
> instruction is 24 bits long, but because all instructions are 4 bytes
> long, two extra 0 bits get appended to the offset field, giving a
> 26-bit offset and a range of +/- 32MB from the branch instruction.

Ah yes, thanks for the clarification.

> 
> > PPC64, although works with 64 bit registers, the op codes are still
> > 32 bit in length.  PPC64 uses table of contents (TOC) fields
> > to make their calls to functions. A function name is really a pointer
> > into the TOC table that stores the actual address of the function
> > along with the TOC of that function. The r2 register plays as the
> > TOC pointer. The actual name of the function is the function name
> > with a dot '.' prefix.  The reference name "schedule" is really
> > to the TOC entry, which calls the actual code with the reference
> > name ".schedule". This also explains why the list of available filter
> > functions on PPC64 all have a dot prefix.
> 
> A little more detail: the TOC mainly stores addresses and other
> constants.  Functions have a descriptor that is stored in a .opd
> section (not the TOC, though the TOC may contain pointers to procedure
> descriptors).  Each descriptor has the address of the code, the
> address of the TOC for the function, and a static chain pointer (not
> used for C, but can used for other languages).  As you note, the
> symbol for a function will be the address of the descriptor rather
> than the address of the function code.
> 
> > When a funtion is called, it uses the 'bl' command which is a 24
> > bit function jump (saving the return address in the link register).
> > The next operation after all 'bl' calls is a nop. What the module
> > load code does when one of these 'bl' calls is farther than 24 bits
> > can handle, it creates a entry in the TOC and has the 'bl' call to
> 
> The module loader allocates some memory for these trampolines, but
> that's a distinct area from the TOC and the OPD section.

Ah, yes, my mistake. It is a trampoline entry, not part of the TOC.

> 
> > that entry. The entry in the TOC will save the r2 register on the
> > stack "40(r1)" load the actually function into the ctrl register
> 
> "counter" register, actually, not "ctrl".

Oops, I still make that mistake :-/  I use to do a lot of PPC work several 
years ago, and I would always call that the control register, and my 
colleagues would always correct me and say its the counter register. I 
guess some things just don't change ;-)


> 
> > The work for PPC32 is very much the same as the PPC64 code but the 32
> > version does not need to deal with TOCS. This makes the code much
> > simpler. Pretty much everything as PPC64 is done, except it does not
> > need to index a TOC.
> 
> Right.
> 
> > I've tested the following patches on both PPC64 and PPC32. I will
> > admit that the PPC64 does not seem that stable, but neither does the
> > code when all this is not enabled ;-)  I'll debug it more to see if
> > I can find the cause of my crashes, which may or may not be related
> > to the dynamic ftrace code. But the use of TOCS in PPC64 make me
> > a bit nervious that I did not do this correctly. Any help in reviewing
> > my code for mistakes would be greatly appreciated.
> 
> Hmmm.  What sort of crashes are you seeing?

This code is in tip, which is mainly used to develop for x86. I've hit a 
few crashes, and I think I hit a couple without this code. But here's an 
example:

huh, entered softirq 4 c000000000846ad8 preempt_count 10000103, exited 
with fffefffe?
------------[ cut here ]------------
Badness at kernel/sched_fair.c:875
NIP: c00000000004bfb8 LR: c00000000004bf7c CTR: c0000000000b5830
REGS: c00000003929cce0 TRAP: 0700   Not tainted  (2.6.28-rc4-tip)
MSR: 9000000000021032 <ME,IR,DR>  CR: 28822842  XER: 20000000
TASK = c00000003d93cd10[2061] 'remove-trailing' THREAD: c00000003929c000 
CPU: 1
GPR00: 0000000000000001 c00000003929cf60 c000000000887070 c000000000ae2d00 
GPR04: c00000000004c2c0 0000000000003320 c00000003929cb70 000000000000080d 
GPR08: c00000000079333c 000000000002ffff c000000000903380 c000000000903380 
GPR12: 0000000048822848 c000000000903580 c000000000794000 0000000000000000 
GPR16: c000000000903380 0000000000000001 c000000000909f7c 7fffffffffffffff 
GPR20: c00000003929d8e0 c000000000ae2f20 00000086b6e84cc0 0000000000000001 
GPR24: 0000000000000001 c000000000794000 c00000003d93cd10 c00000003d934f20 
GPR28: c000000000ae4000 c00000003d93cd48 c000000000803550 c00000003929cf60 
cpu 0x1: Vector: 400 (Instruction Access) at [c00000003929be1f]
    pc: 01c0000000000ae8
    lr: 01c0000000000aeb
    sp: c00000003929c09f
   msr: 9000000040001032
  current = 0xc00000003d93cd10
  paca    = 0xc000000000903580
    pid   = 2061, comm = remove-trailing


Then it went into the monitor that is loaded. When I fix the rest of my 
patches, I'll see if it is not my code that is crashing this, and then 
I'll see if I can figure out what is causing some of these crashes.

Thanks Paul for all the feedback!

-- Steve


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/7] ftrace, PPC: do not latency trace idle
  2008-11-17  3:43   ` Paul Mackerras
@ 2008-11-17 15:43     ` Steven Rostedt
  0 siblings, 0 replies; 29+ messages in thread
From: Steven Rostedt @ 2008-11-17 15:43 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Miller, Benjamin Herrenschmidt,
	Frederic Weisbecker, Pekka Paalanen, linuxppc-dev, Rusty Russell,
	Paul Mundt, Steven Rostedt


On Mon, 17 Nov 2008, Paul Mackerras wrote:

> Steven Rostedt writes:
> 
> > When idle is called, interrupts are disabled, but the idle function
> > will still wake up on an interrupt. The problem is that the interrupt
> > disabled latency tracer will take this call to idle as a latency.
> > 
> > This patch disables the latency tracing when going into idle.
> 
> Patch looks OK, but what is the connection with ftrace?

Ah, that patch has been sitting on my box for some time. It has to deal 
with the latency tracer and not the dynamic tracing part of ftrace. I 
added it because it is PPC ftrace related and to get it off my box. But 
this is not have anything to do with the rest of the patches.

-- Steve


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/7] ftrace, ppc: convert to new dynamic ftrace arch API
  2008-11-17  3:57   ` Paul Mackerras
@ 2008-11-17 15:47     ` Steven Rostedt
  0 siblings, 0 replies; 29+ messages in thread
From: Steven Rostedt @ 2008-11-17 15:47 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Miller, Benjamin Herrenschmidt,
	Frederic Weisbecker, Pekka Paalanen, linuxppc-dev, Rusty Russell,
	Paul Mundt, Steven Rostedt


On Mon, 17 Nov 2008, Paul Mackerras wrote:

> Steven Rostedt writes:
> 
> > +static int test_24bit_addr(unsigned long ip, unsigned long addr)
> > +{
> > +	unsigned long diff;
> > +
> > +	/* can we get to addr from ip in 24 bits? */
> > +	diff = ip > addr ? ip - addr : addr - ip;
> > +
> > +	return !(diff & ((unsigned long)-1 << 24));
> 
> Since a branch instruction can reach +/- 32MB, the 24 needs to be 25.
> It's 25 not 26 since you have effectively accounted for one bit by
> taking the absolute value of the offset.  Also, this will say that an
> offset of -0x2000000 is out of range whereas it is just in range.
> That probably doesn't matter unless you're relying on this to give
> exactly the same answer in all cases as tests done elsewhere (e.g. in
> module_64.c).

I'll update this to be correct.

> 
> Note that the address difference being too large isn't the only reason
> for a procedure call to go via a trampoline on ppc64.  A trampoline is
> also needed when the called function uses a different TOC from the
> caller.  The most obvious example of this is when the caller is in a
> module and the callee is in the main kernel or another module.
> 
> Currently all functions in the main kernel use the same TOC, but the
> linker is capable of dividing up a large executable into groups of
> functions and assigning a different TOC to each group, in order to
> avoid any of the TOCs getting too large (a TOC is limited to 64kB).
> If the linker does that, it automatically inserts trampolines to
> handle the TOC change.

OK, I'll deal with that at a another time. If this does happen in core 
kernel code, it will produce a nice warning and nicely shutdown ftrace, 
without causing any harm. Then I can deal with the bug report.

-- Steve



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 4/7] ftrace, PPC: use probe_kernel API to modify code
  2008-11-17  4:44   ` Paul Mackerras
@ 2008-11-17 15:51     ` Steven Rostedt
  0 siblings, 0 replies; 29+ messages in thread
From: Steven Rostedt @ 2008-11-17 15:51 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Miller, Benjamin Herrenschmidt,
	Frederic Weisbecker, Pekka Paalanen, linuxppc-dev, Rusty Russell,
	Paul Mundt, Steven Rostedt


On Mon, 17 Nov 2008, Paul Mackerras wrote:

> Steven Rostedt writes:
> 
> > Impact: use cleaner probe_kernel API over assembly
> > 
> > Using probe_kernel_read/write interface is a much cleaner approach
> > than the current assembly version.
> 
> Possibly naive question: how is it possible for the accesses to the
> instructions to fault, given that we are called through kstop_machine
> (according to the comment) and nothing else should be happening?

Actually, only start up and shutdown of the tracer goes through 
kstop_machine. With the new code, it can happen before SMP starts (in 
init/main.c) or on the module itself, before the module loads. But that's 
not why we do the probe_kernel_* calls.

The reason for probe_kernel_ is because ftrace is very intrusive. Ingo and 
I have been making ftrace very paranoid about anything it does. It does 
not trust itself to be correct. If anything were to fail, it will warn and 
shut itself down peacefully.  So far, I have not seen these warnings, but 
this is part of the robustness features that have been added since 2.6.27.

-- Steve


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 5/7] ftrace, PPC64: handle module trampolines for dyn ftrace
  2008-11-17  5:00   ` Paul Mackerras
@ 2008-11-17 16:02     ` Steven Rostedt
  2008-11-17 16:18     ` Steven Rostedt
  1 sibling, 0 replies; 29+ messages in thread
From: Steven Rostedt @ 2008-11-17 16:02 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Miller, Benjamin Herrenschmidt,
	Frederic Weisbecker, Pekka Paalanen, linuxppc-dev, Rusty Russell,
	Paul Mundt, Steven Rostedt

On Mon, 17 Nov 2008, Paul Mackerras wrote:

> Steven Rostedt writes:
> 
> > +	if (!rec->arch.mod) {
> > +		if (!mod) {
> > +			printk(KERN_ERR "No module loaded addr=%lx\n",
> > +			       addr);
> > +			return -EFAULT;
> > +		}
> > +		rec->arch.mod = mod;
> > +	} else if (mod) {
> > +		printk(KERN_ERR
> > +		       "Record mod %p not equal to passed in mod %p\n",
> > +		       rec->arch.mod, mod);
> > +		return -EINVAL;
> 
> That looks wrong; surely you need an if (mod != rec->arch.mod)
> somewhere here?

??? How did that happen :-/  Maybe it was because I was working between
two boxes and I did not refresh quilt before submitting it into my git
tree? I'm positive I made that fix already.

Yes, I will fix that.

> 
> > +	/* Make sure that that this is still a 24bit jump */
> > +	if ((*op & 0xff000000) != 0x48000000) {
> > +		printk(KERN_ERR "Not expected bl: opcode is %x\n", *op);
> > +		return -EINVAL;
> > +	}
> 
> Needs to be
> 
> 	if ((*op & 0xfc000003) != 0x48000001)
> 
> since the major opcode is the top 6 bits, and the bottom 2 bits are
> the link bit and the absolute address bit.

Thanks! will fix.

> 
> > +	/* lets find where the pointer goes */
> > +	offset = (*op & 0x03fffffc);
> > +	/* make it signed */
> > +	if (offset & 0x02000000)
> > +		offset |= 0xfe000000;
> 
> but you got that right... :)

Heh, I'm not hopeless.

> 
> > +	/* get the address this jumps too */
> > +	tramp = mod->arch.toc + offset + 32;
> 
> Why + 32?

Good question. This is one of the things that was weird to me too. I got 
this from the module_64.c code.

static struct ppc64_stub_entry ppc64_stub =
{ .jump = {
	0x3d, 0x82, 0x00, 0x00, /* addis   r12,r2, <high> */
	0x39, 0x8c, 0x00, 0x00, /* addi    r12,r12, <low> */
	/* Save current r2 value in magic place on the stack. */
	0xf8, 0x41, 0x00, 0x28, /* std     r2,40(r1) */
	0xe9, 0x6c, 0x00, 0x20, /* ld      r11,32(r12) */

 ld r11,32(r12) is the call.

	0xe8, 0x4c, 0x00, 0x28, /* ld      r2,40(r12) */
	0x7d, 0x69, 0x03, 0xa6, /* mtctr   r11 */
	0x4e, 0x80, 0x04, 0x20  /* bctr */
} };


I'm making sure that I will actually jump to the correct spot before 
changing any code, but to find that correct spot, I need to calculate what 
is being done.

I see the above code from module_64.c adds r2 (the TOC?) to r12, and then 
adds the location of the 'bl' offset. Then it adds the contents of 32(r12) 
to r11. And then the jump is to r11.

I'll comment that better to state that I pulled this code from 
module_64.c.

Thanks,

-- Steve



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 5/7] ftrace, PPC64: handle module trampolines for dyn ftrace
  2008-11-17  5:00   ` Paul Mackerras
  2008-11-17 16:02     ` Steven Rostedt
@ 2008-11-17 16:18     ` Steven Rostedt
  1 sibling, 0 replies; 29+ messages in thread
From: Steven Rostedt @ 2008-11-17 16:18 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Miller, Benjamin Herrenschmidt,
	Frederic Weisbecker, Pekka Paalanen, linuxppc-dev, Rusty Russell,
	Paul Mundt, Steven Rostedt


On Mon, 17 Nov 2008, Paul Mackerras wrote:

> Steven Rostedt writes:
> 
> > +	if (!rec->arch.mod) {
> > +		if (!mod) {
> > +			printk(KERN_ERR "No module loaded addr=%lx\n",
> > +			       addr);
> > +			return -EFAULT;
> > +		}
> > +		rec->arch.mod = mod;
> > +	} else if (mod) {
> > +		printk(KERN_ERR
> > +		       "Record mod %p not equal to passed in mod %p\n",
> > +		       rec->arch.mod, mod);
> > +		return -EINVAL;
> 
> That looks wrong; surely you need an if (mod != rec->arch.mod)
> somewhere here?

BTW, the reason this still worked was because only the first instance 
passes in mod (when rec->arch.mod does not exist). After that, mod is 
NULL, so we never enter this case. But still, I will need to fix this 
(even though I thought I did).

-- Steve


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC
  2008-11-17 15:42   ` Steven Rostedt
@ 2008-11-17 20:03     ` Steven Rostedt
  2008-11-18 13:56     ` Steven Rostedt
  1 sibling, 0 replies; 29+ messages in thread
From: Steven Rostedt @ 2008-11-17 20:03 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: LKML, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, David Miller, Benjamin Herrenschmidt,
	Frederic Weisbecker, Pekka Paalanen, linuxppc-dev, Rusty Russell,
	Paul Mundt


On Mon, 17 Nov 2008, Steven Rostedt wrote:
> On Mon, 17 Nov 2008, Paul Mackerras wrote:
> > 
> > > I've tested the following patches on both PPC64 and PPC32. I will
> > > admit that the PPC64 does not seem that stable, but neither does the
> > > code when all this is not enabled ;-)  I'll debug it more to see if
> > > I can find the cause of my crashes, which may or may not be related
> > > to the dynamic ftrace code. But the use of TOCS in PPC64 make me
> > > a bit nervious that I did not do this correctly. Any help in reviewing
> > > my code for mistakes would be greatly appreciated.
> > 
> > Hmmm.  What sort of crashes are you seeing?
> 
> This code is in tip, which is mainly used to develop for x86. I've hit a 
> few crashes, and I think I hit a couple without this code. But here's an 
> example:
> 
> huh, entered softirq 4 c000000000846ad8 preempt_count 10000103, exited 
> with fffefffe?
> ------------[ cut here ]------------
> Badness at kernel/sched_fair.c:875
> NIP: c00000000004bfb8 LR: c00000000004bf7c CTR: c0000000000b5830
> REGS: c00000003929cce0 TRAP: 0700   Not tainted  (2.6.28-rc4-tip)
> MSR: 9000000000021032 <ME,IR,DR>  CR: 28822842  XER: 20000000
> TASK = c00000003d93cd10[2061] 'remove-trailing' THREAD: c00000003929c000 
> CPU: 1
> GPR00: 0000000000000001 c00000003929cf60 c000000000887070 c000000000ae2d00 
> GPR04: c00000000004c2c0 0000000000003320 c00000003929cb70 000000000000080d 
> GPR08: c00000000079333c 000000000002ffff c000000000903380 c000000000903380 
> GPR12: 0000000048822848 c000000000903580 c000000000794000 0000000000000000 
> GPR16: c000000000903380 0000000000000001 c000000000909f7c 7fffffffffffffff 
> GPR20: c00000003929d8e0 c000000000ae2f20 00000086b6e84cc0 0000000000000001 
> GPR24: 0000000000000001 c000000000794000 c00000003d93cd10 c00000003d934f20 
> GPR28: c000000000ae4000 c00000003d93cd48 c000000000803550 c00000003929cf60 
> cpu 0x1: Vector: 400 (Instruction Access) at [c00000003929be1f]
>     pc: 01c0000000000ae8
>     lr: 01c0000000000aeb
>     sp: c00000003929c09f
>    msr: 9000000040001032
>   current = 0xc00000003d93cd10
>   paca    = 0xc000000000903580
>     pid   = 2061, comm = remove-trailing
> 
> 
> Then it went into the monitor that is loaded. When I fix the rest of my 
> patches, I'll see if it is not my code that is crashing this, and then 
> I'll see if I can figure out what is causing some of these crashes.
> 


Paul,

I'm thinking that I'm hitting stack overflows. I just got this crash:

Badness at net/core/skbuff.c:393
NIP: c0000000004c805c LR: c0000000004c800c CTR: c0000000000b57d0
------------[ cut here ]------------
kernel BUG at kernel/sched.c:1155!
cpu 0x1: Vector: 700 (Program Check) at [c00000002ad18500]
    pc: c000000000049884: .resched_task+0x54/0xe0
    lr: c000000000049858: .resched_task+0x28/0xe0
    sp: c00000002ad18780
   msr: 9000000000021032
  current = 0xc00000002a516c30
  paca    = 0xc000000000903580
    pid   = 17578, comm = fixdep
kernel BUG at kernel/sched.c:1155!
enter ? for help
[c00000002ad18810] c000000000054494 .task_tick_fair+0xc4/0x120
[c00000002ad188a0] c000000000056188 .scheduler_tick+0x108/0x2d0
[c00000002ad18940] c00000000006b1d4 .update_process_times+0x74/0xb0
[c00000002ad189e0] c00000000008bb4c .tick_sched_timer+0x8c/0x120
[c00000002ad18a90] c000000000080f88 .__run_hrtimer+0xd8/0x130
[c00000002ad18b30] c000000000081fac .hrtimer_interrupt+0x16c/0x220
[c00000002ad18c20] c000000000023a0c .timer_interrupt+0xcc/0x110
[c00000002ad18cc0] c0000000000034e0 decrementer_common+0xe0/0x100
--- Exception: 901 (Decrementer) at c00000000000b978 
.raw_local_irq_restore+0x58/0x60
[link register   ] c00000000005b8d8 .vprintk+0x318/0x4b0
[c00000002ad18fb0] c00000000005b7e0 .vprintk+0x220/0x4b0 (unreliable)
[c00000002ad19110] c00000000005bac4 .printk+0x54/0x70
[c00000002ad191b0] c0000000000119d0 .show_regs+0x50/0x380
[c00000002ad19260] c000000000244150 .report_bug+0xb0/0x130
[c00000002ad19300] c0000000000253c0 .program_check_exception+0x1e0/0x610
[c00000002ad19390] c0000000000043e0 program_check_common+0xe0/0x100
--- Exception: 700 (Program Check) at c0000000004c805c 
.skb_release_head_state+0x7c/0xe0
[c00000002ad19710] c0000000004c8f0c .skb_release_all+0x2c/0x50
[c00000002ad197a0] c0000000004c80f4 .__kfree_skb+0x34/0x120
[c00000002ad19830] c0000000004c8224 .kfree_skb+0x44/0x90
[c00000002ad198c0] c0000000004d31f4 .dev_hard_start_xmit+0x224/0x390
[c00000002ad19980] c0000000004ed1d0 .__qdisc_run+0x240/0x340
[c00000002ad19a50] c0000000004d7778 .dev_queue_xmit+0x328/0x630
[c00000002ad19b00] c000000000501940 .ip_finish_output+0x160/0x3d0
[c00000002ad19bc0] c0000000005022c4 .ip_output+0xc4/0xf0
[c00000002ad19c60] c000000000501c88 .ip_local_out+0x58/0x80
[c00000002ad19cf0] c000000000502824 .ip_queue_xmit+0x254/0x480
[c00000002ad19e00] c00000000051ace8 .tcp_transmit_skb+0x498/0x970
[c00000002ad19ef0] c00000000051cf98 .__tcp_push_pending_frames+0x248/0x9a0
[c00000002ad19fe0] c000000000518ec0 .tcp_rcv_established+0x180/0x710
[c00000002ad1a0a0] c0000000005219cc .tcp_v4_do_rcv+0x11c/0x2b0
[c00000002ad1a160] c000000000524084 .tcp_v4_rcv+0x6d4/0x870
[c00000002ad1a230] c0000000004fc0e8 .ip_local_deliver+0xf8/0x2b0
[c00000002ad1a2d0] c0000000004fc614 .ip_rcv+0x374/0x670
[c00000002ad1a3a0] c0000000004d28a8 .netif_receive_skb+0x298/0x380
[c00000002ad1a470] c00000000054a528 .lro_receive_skb+0x68/0xa0
[c00000002ad1a510] c00000000031a8d4 .pasemi_mac_clean_rx+0x2e4/0x500
[c00000002ad1a610] c00000000031b534 .pasemi_mac_poll+0x54/0x150
[c00000002ad1a6b0] c0000000004d5f90 .net_rx_action+0x150/0x290
[c00000002ad1a790] c000000000062438 .__do_softirq+0xe8/0x1e0
[c00000002ad1a850] c00000000000d2c4 .do_softirq+0xa4/0xd0
[c00000002ad1a8e0] c000000000062104 .irq_exit+0xb4/0xf0
[c00000002ad1a970] c00000000000d6a4 .do_IRQ+0x114/0x150
[c00000002ad1aa10] c000000000004160 hardware_interrupt_entry+0x1c/0x3c
--- Exception: 501 (Hardware Interrupt) at c000000000193d84 
.journal_dirty_metadata+0x254/0x260
[c00000002ad1ad00] c00000000018e05c 
.__ext3_journal_dirty_metadata+0x4c/0xa0 (unreliable)
[c00000002ad1ada0] c00000000017bcf4 .ext3_mark_iloc_dirty+0x2a4/0x590
[c00000002ad1ae70] c00000000017c5ec .ext3_mark_inode_dirty+0x5c/0x80
[c00000002ad1af30] c000000000180dec .ext3_dirty_inode+0xac/0xf0
[c00000002ad1afd0] c00000000012454c .__mark_inode_dirty+0x7c/0x210
[c00000002ad1b080] c000000000178174 .ext3_new_blocks+0xd4/0x770
[c00000002ad1b1b0] c00000000017d194 .ext3_get_blocks_handle+0x2a4/0xcd0
[c00000002ad1b380] c00000000017dee0 .ext3_get_block+0xa0/0x170
[c00000002ad1b440] c00000000012ca6c .__block_prepare_write+0x32c/0x510
[c00000002ad1b580] c00000000012cd80 .block_write_begin+0x90/0x160
[c00000002ad1b640] c0000000001800e0 .ext3_write_begin+0x130/0x2c0
[c00000002ad1b760] c0000000000b9d30 
.generic_file_buffered_write+0x180/0x3c0
[c00000002ad1b8b0] c0000000000ba510 
.__generic_file_aio_write_nolock+0x2c0/0x450
[c00000002ad1b9c0] c0000000000ba730 .generic_file_aio_write+0x90/0x130
[c00000002ad1ba80] c000000000179b00 .ext3_file_write+0x60/0x130
[c00000002ad1bb30] c0000000000f8fd4 .do_sync_write+0xf4/0x160
[c00000002ad1bcd0] c0000000000f99dc .vfs_write+0xdc/0x1e0
[c00000002ad1bd80] c0000000000fa48c .sys_write+0x6c/0xd0
[c00000002ad1be30] c0000000000074b0 syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 000000000ff470c4
SP (ffb8be40) is in userspace
1:mon> X

And running my stack_trace in ftrace, I've been seeing large hits on the 
stack:

root@electra ~> cat /debug/tracing/stack_trace 
        Depth   Size      Location    (56 entries)
        -----   ----      --------
  0)    14032     112   ftrace_call+0x4/0x14
  1)    13920     128   .sched_clock+0x20/0x60
  2)    13792     128   .sched_clock_cpu+0x34/0x50
  3)    13664     144   .cpu_clock+0x3c/0xa0
  4)    13520     144   .get_timestamp+0x2c/0x50
  5)    13376     192   .softlockup_tick+0x100/0x220
  6)    13184     128   .run_local_timers+0x34/0x50
  7)    13056     160   .update_process_times+0x44/0xb0
  8)    12896     176   .tick_sched_timer+0x8c/0x120
  9)    12720     160   .__run_hrtimer+0xd8/0x130
 10)    12560     240   .hrtimer_interrupt+0x16c/0x220
 11)    12320     160   .timer_interrupt+0xcc/0x110
 12)    12160     240   decrementer_common+0xe0/0x100
 13)    11920     576   0x80
 14)    11344     160   .usb_hcd_irq+0x94/0x150
 15)    11184     160   .handle_IRQ_event+0x80/0x120
 16)    11024     160   .handle_fasteoi_irq+0xd8/0x1e0
 17)    10864     160   .do_IRQ+0xbc/0x150
 18)    10704     144   hardware_interrupt_entry+0x1c/0x3c
 19)    10560     672   0x0
 20)     9888     144   ._spin_unlock_irqrestore+0x84/0xd0
 21)     9744     160   .scsi_dispatch_cmd+0x170/0x360
 22)     9584     208   .scsi_request_fn+0x324/0x5e0
 23)     9376     144   .blk_invoke_request_fn+0xc8/0x1b0
 24)     9232     144   .__blk_run_queue+0x48/0x60
 25)     9088     144   .blk_run_queue+0x40/0x70
 26)     8944     192   .scsi_run_queue+0x3a8/0x3e0
 27)     8752     160   .scsi_next_command+0x58/0x90
 28)     8592     176   .scsi_end_request+0xd4/0x130
 29)     8416     208   .scsi_io_completion+0x15c/0x500
 30)     8208     160   .scsi_finish_command+0x15c/0x190
 31)     8048     160   .scsi_softirq_done+0x138/0x1e0
 32)     7888     160   .blk_done_softirq+0xd0/0x100
 33)     7728     192   .__do_softirq+0xe8/0x1e0
 34)     7536     144   .do_softirq+0xa4/0xd0
 35)     7392     144   .irq_exit+0xb4/0xf0
 36)     7248     160   .do_IRQ+0x114/0x150
 37)     7088     752   hardware_interrupt_entry+0x1c/0x3c
 38)     6336     144   .blk_rq_init+0x28/0xc0
 39)     6192     208   .get_request+0x13c/0x3d0
 40)     5984     240   .get_request_wait+0x60/0x170
 41)     5744     192   .__make_request+0xd4/0x560
 42)     5552     192   .generic_make_request+0x210/0x300
 43)     5360     208   .submit_bio+0x168/0x1a0
 44)     5152     160   .submit_bh+0x188/0x1e0
 45)     4992    1280   .block_read_full_page+0x23c/0x430
 46)     3712    1280   .do_mpage_readpage+0x43c/0x740
 47)     2432     352   .mpage_readpages+0x130/0x1c0
 48)     2080     160   .ext3_readpages+0x50/0x80
 49)     1920     256   .__do_page_cache_readahead+0x1e4/0x340
 50)     1664     160   .do_page_cache_readahead+0x94/0xe0
 51)     1504     240   .filemap_fault+0x360/0x530
 52)     1264     256   .__do_fault+0xb8/0x600
 53)     1008     240   .handle_mm_fault+0x190/0x920
 54)      768     320   .do_page_fault+0x3d4/0x5f0
 55)      448     448   handle_page_fault+0x20/0x5c

This is one of the ftrace features, it tests the stack at every function 
call and if it is a new max, then it records the stack. This dump was what 
was recorded right after boot up. That is about 14K of stack use, and I 
hear that the stack is 16K. That looks like it is getting pretty close.

I'm also told that most PPC64 configs should have CONFIG_IRQSTACKS set, 
but I see that I do not. I am in the process of building my box without it 
and see if I still have these crashes.

-- Steve


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC
  2008-11-17 15:42   ` Steven Rostedt
  2008-11-17 20:03     ` Steven Rostedt
@ 2008-11-18 13:56     ` Steven Rostedt
  2008-11-19  2:38       ` Paul Mackerras
  1 sibling, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2008-11-18 13:56 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: LKML, Ingo Molnar, Andrew Morton, Benjamin Herrenschmidt, linuxppc-dev


Paul and Benjamin,

Can I add your Acked-by: to all these patches that I submitted? I'm going 
to recommit them with a consistent subject (all lower case ppc), but I'm 
not going to change the patches themselves.

Would you two be fine with that? Or at least one of you?

-- Steve



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC
  2008-11-18 13:56     ` Steven Rostedt
@ 2008-11-19  2:38       ` Paul Mackerras
  2008-11-19  3:04         ` Steven Rostedt
  0 siblings, 1 reply; 29+ messages in thread
From: Paul Mackerras @ 2008-11-19  2:38 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Ingo Molnar, Andrew Morton, Benjamin Herrenschmidt, linuxppc-dev

Steven Rostedt writes:

> Can I add your Acked-by: to all these patches that I submitted? I'm going 
> to recommit them with a consistent subject (all lower case ppc), but I'm 
> not going to change the patches themselves.
> 
> Would you two be fine with that? Or at least one of you?

My preference would be for the patches to go through the powerpc tree
unless there is a good reason for them to go via another tree.

The style we use for the headline is "powerpc: Add xyz feature" or
"powerpc/subsystem: Fix foo bug".

As for the acked-by, I feel I first need to go through the whole
series again with the changes you have made recently.  Have you
reworked the earlier patches to avoid introducing any bugs, rather
than just fixing the bugs with later patches?  If you haven't, are you
sure that the bugs won't cause any problems in bisecting?

Also, what's the best place to find the latest patch set?

Thanks again for doing all this work.

Paul.
  

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC
  2008-11-19  2:38       ` Paul Mackerras
@ 2008-11-19  3:04         ` Steven Rostedt
  2008-11-19  9:27           ` Ingo Molnar
  0 siblings, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2008-11-19  3:04 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: LKML, Ingo Molnar, Andrew Morton, Benjamin Herrenschmidt, linuxppc-dev


On Wed, 19 Nov 2008, Paul Mackerras wrote:

> Steven Rostedt writes:
> 
> > Can I add your Acked-by: to all these patches that I submitted? I'm going 
> > to recommit them with a consistent subject (all lower case ppc), but I'm 
> > not going to change the patches themselves.
> > 
> > Would you two be fine with that? Or at least one of you?
> 
> My preference would be for the patches to go through the powerpc tree
> unless there is a good reason for them to go via another tree.

I have no problem with that. The only thing is that we have a lot of
pending work still in the linux-tip tree, which you may need to pull
in to get these patches working. Well, there's two or three commits in the 
generic code that I know the PPC code is dependent on.

I could give you a list of commits in tip that need to go mainline first
before we can pull in the PPC changes. Then you could wait till those 
changes make it into 29 and then you could push the PPC modifications in 
from your tree.

> 
> The style we use for the headline is "powerpc: Add xyz feature" or
> "powerpc/subsystem: Fix foo bug".

We've been using the "ftrace: subject" format for most of our ftrace 
commits, and have been using "ftrace, ppc: subject" or ppc32 or ppc64 for 
those. But since this is a powerpc port, I will conform to your style.

> 
> As for the acked-by, I feel I first need to go through the whole
> series again with the changes you have made recently.  Have you
> reworked the earlier patches to avoid introducing any bugs, rather
> than just fixing the bugs with later patches?  If you haven't, are you
> sure that the bugs won't cause any problems in bisecting?

Fair enough. I'll rework the patches again to fold back in the changes 
based on your comments, as well as the comments of others. When I'm done, 
I'll repost to the list. This way, if you pull them in, you can add your 
Signed-off-by yourself.

> 
> Also, what's the best place to find the latest patch set?

I'll be keeping the changes in my repo:

 git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git

  branch tip/ppc

Note, this branch is based on top of Ingo's linux-tip tree. I can make a 
a branch "ppc" that will be based on top of mainline, and I can add the 
patches needed to get PPC working. The generic commits will have the 
"ftrace: " format.

The generic ftrace patches will still need to go in through linux-tip. But 
when they do, feel free to push the PowerPC port in.  Anyway, you are the 
ones that can test it better than we can. I have two PPC boxes that I test 
on, but I'm sure you have a lot more.

> 
> Thanks again for doing all this work.

Thank you for the reviews.

-- Steve


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC
  2008-11-19  3:04         ` Steven Rostedt
@ 2008-11-19  9:27           ` Ingo Molnar
  2008-11-19 10:38             ` Paul Mackerras
  0 siblings, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2008-11-19  9:27 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul Mackerras, LKML, Andrew Morton, Benjamin Herrenschmidt,
	linuxppc-dev


* Steven Rostedt <rostedt@goodmis.org> wrote:

> On Wed, 19 Nov 2008, Paul Mackerras wrote:
> 
> > Steven Rostedt writes:
> > 
> > > Can I add your Acked-by: to all these patches that I submitted? I'm going 
> > > to recommit them with a consistent subject (all lower case ppc), but I'm 
> > > not going to change the patches themselves.
> > > 
> > > Would you two be fine with that? Or at least one of you?
> > 
> > My preference would be for the patches to go through the powerpc tree
> > unless there is a good reason for them to go via another tree.
> 
> I have no problem with that. The only thing is that we have a lot of 
> pending work still in the linux-tip tree, which you may need to pull 
> in to get these patches working. Well, there's two or three commits 
> in the generic code that I know the PPC code is dependent on.
> 
> I could give you a list of commits in tip that need to go mainline 
> first before we can pull in the PPC changes. Then you could wait 
> till those changes make it into 29 and then you could push the PPC 
> modifications in from your tree.

note that this inserts a lot of (unnecessary) serialization and a 
window of non-testing - by all likelyhood this will delay ppc ftrace 
to v2.6.30 or later kernels.

	Ingo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC
  2008-11-19  9:27           ` Ingo Molnar
@ 2008-11-19 10:38             ` Paul Mackerras
  2008-11-19 10:57               ` Ingo Molnar
  0 siblings, 1 reply; 29+ messages in thread
From: Paul Mackerras @ 2008-11-19 10:38 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, LKML, Andrew Morton, Benjamin Herrenschmidt,
	linuxppc-dev

Ingo Molnar writes:

> * Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Wed, 19 Nov 2008, Paul Mackerras wrote:
> > 
> > > Steven Rostedt writes:
> > > 
> > > > Can I add your Acked-by: to all these patches that I submitted? I'm going 
> > > > to recommit them with a consistent subject (all lower case ppc), but I'm 
> > > > not going to change the patches themselves.
> > > > 
> > > > Would you two be fine with that? Or at least one of you?
> > > 
> > > My preference would be for the patches to go through the powerpc tree
> > > unless there is a good reason for them to go via another tree.
> > 
> > I have no problem with that. The only thing is that we have a lot of 
> > pending work still in the linux-tip tree, which you may need to pull 
> > in to get these patches working. Well, there's two or three commits 
> > in the generic code that I know the PPC code is dependent on.
> > 
> > I could give you a list of commits in tip that need to go mainline 
> > first before we can pull in the PPC changes. Then you could wait 
> > till those changes make it into 29 and then you could push the PPC 
> > modifications in from your tree.
> 
> note that this inserts a lot of (unnecessary) serialization and a 
> window of non-testing - by all likelyhood this will delay ppc ftrace 
> to v2.6.30 or later kernels.

Well, note that I said "unless there is a good reason".  If it does
need to go via your tree, it can, though I don't see that it will get
much testing on powerpc there, and having it there will make it harder
to manage any conflicts with the other stuff I have queued up.

How much generic stuff that's not upstream do the powerpc ftrace
patches depend on?

Paul.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC
  2008-11-19 10:38             ` Paul Mackerras
@ 2008-11-19 10:57               ` Ingo Molnar
  2008-11-19 11:35                 ` Paul Mackerras
  2008-11-19 12:10                 ` Steven Rostedt
  0 siblings, 2 replies; 29+ messages in thread
From: Ingo Molnar @ 2008-11-19 10:57 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Steven Rostedt, LKML, Andrew Morton, Benjamin Herrenschmidt,
	linuxppc-dev


* Paul Mackerras <paulus@samba.org> wrote:

> Ingo Molnar writes:
> 
> > * Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > On Wed, 19 Nov 2008, Paul Mackerras wrote:
> > > 
> > > > Steven Rostedt writes:
> > > > 
> > > > > Can I add your Acked-by: to all these patches that I submitted? I'm going 
> > > > > to recommit them with a consistent subject (all lower case ppc), but I'm 
> > > > > not going to change the patches themselves.
> > > > > 
> > > > > Would you two be fine with that? Or at least one of you?
> > > > 
> > > > My preference would be for the patches to go through the powerpc tree
> > > > unless there is a good reason for them to go via another tree.
> > > 
> > > I have no problem with that. The only thing is that we have a lot of 
> > > pending work still in the linux-tip tree, which you may need to pull 
> > > in to get these patches working. Well, there's two or three commits 
> > > in the generic code that I know the PPC code is dependent on.
> > > 
> > > I could give you a list of commits in tip that need to go mainline 
> > > first before we can pull in the PPC changes. Then you could wait 
> > > till those changes make it into 29 and then you could push the PPC 
> > > modifications in from your tree.
> > 
> > note that this inserts a lot of (unnecessary) serialization and a 
> > window of non-testing - by all likelyhood this will delay ppc ftrace 
> > to v2.6.30 or later kernels.
> 
> Well, note that I said "unless there is a good reason".  If it does 
> need to go via your tree, it can, though I don't see that it will 
> get much testing on powerpc there, and having it there will make it 
> harder to manage any conflicts with the other stuff I have queued 
> up.

this is the diffstat:

 arch/powerpc/Kconfig              |    2 +
 arch/powerpc/include/asm/ftrace.h |   14 +-
 arch/powerpc/include/asm/module.h |   16 ++-
 arch/powerpc/kernel/ftrace.c      |  460 +++++++++++++++++++++++++++++++++---
 arch/powerpc/kernel/idle.c        |    5 +
 arch/powerpc/kernel/module_32.c   |   10 +
 arch/powerpc/kernel/module_64.c   |   13 +
 scripts/recordmcount.pl           |   18 ++-
 8 files changed, 495 insertions(+), 43 deletions(-)

90% of it creates new code that shouldnt be collision-happy.

it does not "need" to go via tip/tracing, i just pointed out the 
effects of the proposed workflow and i'm just trying to find a more 
efficient workflow for it - while not impacting yours either. I think 
it can be done - git gives us tons of tools to do this intelligently.

> How much generic stuff that's not upstream do the powerpc ftrace 
> patches depend on?

a lot:

   66 files changed, 3266 insertions(+), 985 deletions(-)

and it's all in flux (we are in the middle of the development cycle), 
so i dont think it would be a good idea for you to pull those bits 
into the powerpc tree.

Maybe Steve could do the following trick: create a Linus -git based 
branch that uses the new APIs but marks ppc's ftrace as "depends 0" in 
the powerpc Kconfig. (the new ftrace.c wont build)

You could pull that tree into the powerpc tree and everything should 
still work fine in PPC - sans ftrace.

In tip/tracing we'd merge it too (these commits will never be 
rebased), and we'd also remove the depends 0 from the powerpc Kconfig. 
That sole change wont conflict with anything powerpc.

It would all play out just fine in linux-next: we'd have both the 
latest powerpc bits and the latest ftrace bits on powerpc. You could 
test the non-ftrace impact of Steve's changes in the powerpc tree as 
well and have it all part of your usual workflow.

The 2.6.29 merge window would be trouble-free as well: since the 
sha1's are the same, any of the trees can be merged upstream without 
having to wait for the other one and without creating conflicts or 
other trouble for the other one.

Hm?

	Ingo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC
  2008-11-19 10:57               ` Ingo Molnar
@ 2008-11-19 11:35                 ` Paul Mackerras
  2008-11-19 12:10                 ` Steven Rostedt
  1 sibling, 0 replies; 29+ messages in thread
From: Paul Mackerras @ 2008-11-19 11:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, LKML, Andrew Morton, Benjamin Herrenschmidt,
	linuxppc-dev

Ingo Molnar writes:

> and it's all in flux (we are in the middle of the development cycle), 
> so i dont think it would be a good idea for you to pull those bits 
> into the powerpc tree.

Quite.  OK, it does sound like this stuff needs to live in your tree
for now, and from the diffstat it doesn't look like there is any
conflict with stuff in my tree yet.

> Maybe Steve could do the following trick: create a Linus -git based 
> branch that uses the new APIs but marks ppc's ftrace as "depends 0" in 
> the powerpc Kconfig. (the new ftrace.c wont build)
> 
> You could pull that tree into the powerpc tree and everything should 
> still work fine in PPC - sans ftrace.

Sounds like a reasonable idea, except that I think I'll delay pulling
that branch into my tree until I need to in order to resolve a
conflict - at least as far as my exported branches are concerned.

> In tip/tracing we'd merge it too (these commits will never be 
> rebased),

I do want to see the patches in their final form and have the
opportunity to give an acked-by before you declare the branch frozen.
Apart from that, sounds good.

Paul.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC
  2008-11-19 10:57               ` Ingo Molnar
  2008-11-19 11:35                 ` Paul Mackerras
@ 2008-11-19 12:10                 ` Steven Rostedt
  2008-11-19 12:15                   ` Steven Rostedt
  1 sibling, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2008-11-19 12:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul Mackerras, LKML, Andrew Morton, Benjamin Herrenschmidt,
	linuxppc-dev


On Wed, 19 Nov 2008, Ingo Molnar wrote:
> 
> Maybe Steve could do the following trick: create a Linus -git based 
> branch that uses the new APIs but marks ppc's ftrace as "depends 0" in 
> the powerpc Kconfig. (the new ftrace.c wont build)

There's only two generic commits that need to be added for the PowerPC 
code to work.

  ftrace: pass module struct to arch dynamic ftrace functions
  ftrace: allow NULL pointers in mcount_loc

I've already ported them to mainline to test PowerPC there.
Paul could use these two versions and keep ftrace in a separate branch in 
his tree. This way all the PowerPC code will be there, and actually can be 
tested. They may still hit the same bugs that we have fixed in tip, but 
those should all be minor, since any major bug is already in mainline or 
on its way.

> 
> You could pull that tree into the powerpc tree and everything should 
> still work fine in PPC - sans ftrace.
> 
> In tip/tracing we'd merge it too (these commits will never be 
> rebased), and we'd also remove the depends 0 from the powerpc Kconfig. 
> That sole change wont conflict with anything powerpc.
> 
> It would all play out just fine in linux-next: we'd have both the 
> latest powerpc bits and the latest ftrace bits on powerpc. You could 
> test the non-ftrace impact of Steve's changes in the powerpc tree as 
> well and have it all part of your usual workflow.
> 
> The 2.6.29 merge window would be trouble-free as well: since the 
> sha1's are the same, any of the trees can be merged upstream without 
> having to wait for the other one and without creating conflicts or 
> other trouble for the other one.
> 
> Hm?

If Paul uses the ported two generic commits, then he would need to keep 
that in a separate branch that will never go upstream. He could pull in 
the other PowerPC patches and do as you said, keep the depend off.

I can make two branches that Paul could pull from. One would have this 
code disabled in the config, and just be the PowerPC port. Although, we 
would need to figure out the best way to keep it disabled. Disable it 
after the patches? The patches themselve enable it.

And then I could make another branch with the back port of the two commits 
that Paul would never push upstream. This would allow for the PowerPC guys 
to test the code in their tree without waiting for it. We just need to 
trust that Paul will not push those commits ;-)

Actually, I can change the subject of those commits to have at the 
beginning:

   NOT FOR UPSTREAM ftrace: ....

What do you guys think?

-- Steve


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC
  2008-11-19 12:10                 ` Steven Rostedt
@ 2008-11-19 12:15                   ` Steven Rostedt
  0 siblings, 0 replies; 29+ messages in thread
From: Steven Rostedt @ 2008-11-19 12:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Paul Mackerras, LKML, Andrew Morton, Benjamin Herrenschmidt,
	linuxppc-dev

On Wed, 19 Nov 2008, Steven Rostedt wrote:

> 
> On Wed, 19 Nov 2008, Ingo Molnar wrote:
> > 
> > Maybe Steve could do the following trick: create a Linus -git based 
> > branch that uses the new APIs but marks ppc's ftrace as "depends 0" in 
> > the powerpc Kconfig. (the new ftrace.c wont build)
> 
> There's only two generic commits that need to be added for the PowerPC 
> code to work.
> 
>   ftrace: pass module struct to arch dynamic ftrace functions
>   ftrace: allow NULL pointers in mcount_loc
> 
> I've already ported them to mainline to test PowerPC there.
> Paul could use these two versions and keep ftrace in a separate branch in 
> his tree. This way all the PowerPC code will be there, and actually can be 
> tested. They may still hit the same bugs that we have fixed in tip, but 
> those should all be minor, since any major bug is already in mainline or 
> on its way.

I just pushed all the PowerPC patches on top of this port it works.
I still need to rework the patches for Paul.

-- Steve


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2008-11-19 12:15 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-11-16 21:24 [PATCH 0/7] Porting dynmaic ftrace to PowerPC Steven Rostedt
2008-11-16 21:24 ` [PATCH 1/7] ftrace, PPC: do not latency trace idle Steven Rostedt
2008-11-17  3:43   ` Paul Mackerras
2008-11-17 15:43     ` Steven Rostedt
2008-11-16 21:24 ` [PATCH 2/7] ftrace, ppc: convert to new dynamic ftrace arch API Steven Rostedt
2008-11-17  3:57   ` Paul Mackerras
2008-11-17 15:47     ` Steven Rostedt
2008-11-16 21:24 ` [PATCH 3/7] ftrace: powerpc mcount record port Steven Rostedt
2008-11-16 21:24 ` [PATCH 4/7] ftrace, PPC: use probe_kernel API to modify code Steven Rostedt
2008-11-17  4:44   ` Paul Mackerras
2008-11-17 15:51     ` Steven Rostedt
2008-11-16 21:24 ` [PATCH 5/7] ftrace, PPC64: handle module trampolines for dyn ftrace Steven Rostedt
2008-11-17  5:00   ` Paul Mackerras
2008-11-17 16:02     ` Steven Rostedt
2008-11-17 16:18     ` Steven Rostedt
2008-11-16 21:24 ` [PATCH 6/7] ftrace,ppc32: enabled dynamic ftrace Steven Rostedt
2008-11-16 21:24 ` [PATCH 7/7] ftrace,ppc32: dynamic ftrace to handle modules Steven Rostedt
2008-11-16 22:55 ` [PATCH 0/7] Porting dynmaic ftrace to PowerPC Paul Mackerras
2008-11-17 15:42   ` Steven Rostedt
2008-11-17 20:03     ` Steven Rostedt
2008-11-18 13:56     ` Steven Rostedt
2008-11-19  2:38       ` Paul Mackerras
2008-11-19  3:04         ` Steven Rostedt
2008-11-19  9:27           ` Ingo Molnar
2008-11-19 10:38             ` Paul Mackerras
2008-11-19 10:57               ` Ingo Molnar
2008-11-19 11:35                 ` Paul Mackerras
2008-11-19 12:10                 ` Steven Rostedt
2008-11-19 12:15                   ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).