linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC 0/2] Correct stack trace info
@ 2011-03-18 11:21 Namhyung Kim
  2011-03-18 11:21 ` [RFC 1/2] vsprintf: introduce %pT format specifier Namhyung Kim
  2011-03-18 11:21 ` [RFC 2/2] x86, dumpstack: use %pT format specifier for stack trace Namhyung Kim
  0 siblings, 2 replies; 5+ messages in thread
From: Namhyung Kim @ 2011-03-18 11:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Frederic Weisbecker, Jason Wessel

Hello,

Recently I've spent some time with kgdb and found a interesting thing.
The code I've used is Jason Wessel's sample module which can be downloaded
at following URL.

  http://kernel.org/pub/linux/kernel/people/jwessel/dbg_webinar/crash_mod.tar.bz2

It build a kernel module "test_panic" and after the module is loaded it
makes some files in /proc/test_panic directory. Most of them are supposed
to kernel panic when they're written. A simple usage is like below:

# echo 1 > /proc/test_panic/panic

[   18.345571] Starting panic
[   18.345730] Kernel panic - not syncing: test_panic running!
[   18.345752]
[   18.345856] Pid: 36, comm: sh Not tainted 2.6.38-rc8+ #83
[   18.345923] Call Trace:
[   18.346001]  [<ffffffff812a8502>] panic+0x8c/0x18d
[   18.346257]  [<ffffffffa000012a>] deep01+0x0/0x38 [test_panic]  <--- ???
[   18.346347]  [<ffffffff81104666>] proc_file_write+0x73/0x8d
[   18.346432]  [<ffffffff811000b3>] proc_reg_write+0x8d/0xac
[   18.346516]  [<ffffffff810c7d32>] vfs_write+0xa1/0xc5
[   18.346603]  [<ffffffff810c7e0f>] sys_write+0x45/0x6c
[   18.346801]  [<ffffffff8f02943b>] system_call_fastpath+0x16/0x1b

In this case, offending function is "panic_write()" but the trace showed
me a different function "deep01()". I was a bit confused at first. The
panic_write() function is a very simple code which calls panic() at the
end of the function.

static int panic_write(struct file *file, const char *buffer,
                       unsigned long count, void *data)
{
	trace_printk("Starting panic\n");
	printk(KERN_INFO "Starting panic\n");
	panic("test_panic running!\n");
	return count;
}

But where did the "deep01()" come from?

This was due to a gcc's optimization. Because the panic() was known not to
return to the caller (it has "noreturn" attribute), the gcc optimized out
remaining code after calling to the panic(). But PC was incremented after
the call as usual, so return address saved in stack pointed a different
(very next) function.

I ran objdump on test_panic.ko and got following result.

 ... (snipped) ...

000000000000010a <panic_write>:
 10a:	55                   	push   %rbp
 10b:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
 112:	31 c0                	xor    %eax,%eax
 114:	48 89 e5             	mov    %rsp,%rbp
 117:	e8 00 00 00 00       	callq  11c <panic_write+0x12>
 11c:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
 123:	31 c0                	xor    %eax,%eax
 125:	e8 00 00 00 00       	callq  12a <deep01>   <--- call panic() !!!

000000000000012a <deep01>:
 12a:	55                   	push   %rbp
 12b:	31 c0                	xor    %eax,%eax
 12d:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
 ... (snipped) ...


To this end, I added sprint_trace() and %pT specifier for printk(). They'll
do the same symbol lookup but with the offset subtracted by 1 to ensure it
would be in the same function - and increment it again before printing.

It will not affect normal trace case and the only problematic case will be
the case where return address is really the start of a function (offset 0).
But I think this cannot happen during stack trace because calling a function
always increases PC and then saves. Thus even if there's a call at the
beginning in a function, its return address will not have the 0-offset. But
other symbol lookups could still use 0-offset for some purpose, I added new
helper to separate them.

I've only checked this on x86_64 on qemu, but other architectures may want
this too :) It seems work well for me.

[   22.224483] Call Trace:
[   22.224569]  [<ffffffff812bce69>] panic+0x8c/0x18d
[   22.224848]  [<ffffffffa000012a>] panic_write+0x20/0x20 [test_panic]  <--- !!!
[   22.224979]  [<ffffffff81115fab>] proc_file_write+0x73/0x8d
[   22.225089]  [<ffffffff81111a5f>] proc_reg_write+0x8d/0xac
[   22.225199]  [<ffffffff810d90ee>] vfs_write+0xa1/0xc5
[   22.225304]  [<ffffffff810d91cb>] sys_write+0x45/0x6c
[   22.225408]  [<ffffffff812c07fb>] system_call_fastpath+0x16/0x1b

Any comments are welcome.


Thanks.

---

Namhyung Kim (2):
  vsprintf: introduce %pT format specifier
  x86, dumpstack: use %pT format specifier for stack trace

 arch/x86/kernel/dumpstack.c |    2 +-
 include/linux/kallsyms.h    |    7 +++++++
 kernel/kallsyms.c           |   26 ++++++++++++++++++++++++++
 lib/vsprintf.c              |    7 ++++++-
 4 files changed, 40 insertions(+), 2 deletions(-)

--
1.7.4


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [RFC 1/2] vsprintf: introduce %pT format specifier
  2011-03-18 11:21 [RFC 0/2] Correct stack trace info Namhyung Kim
@ 2011-03-18 11:21 ` Namhyung Kim
  2011-03-18 15:27   ` Steven Rostedt
  2011-03-18 11:21 ` [RFC 2/2] x86, dumpstack: use %pT format specifier for stack trace Namhyung Kim
  1 sibling, 1 reply; 5+ messages in thread
From: Namhyung Kim @ 2011-03-18 11:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Frederic Weisbecker, Jason Wessel

The %pT format specifier is for stack backtrace. Its handler
sprint_trace() does symbol lookup using (offset-1) to ensure
the offset will not point outside of the function.

If there is a tail-call to the function marked "noreturn",
gcc optimized out the code after the call then causes saved
return address points outside of the function (i.e. the start
of the next function), so messes up a call trace entry.
This patch will fix it.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
---
 include/linux/kallsyms.h |    7 +++++++
 kernel/kallsyms.c        |   26 ++++++++++++++++++++++++++
 lib/vsprintf.c           |    7 ++++++-
 3 files changed, 39 insertions(+), 1 deletions(-)

diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index d8e9b3d1c23c..4964ed2021cc 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -36,6 +36,7 @@ const char *kallsyms_lookup(unsigned long addr,

 /* Look up a kernel symbol and return it in a text buffer. */
 extern int sprint_symbol(char *buffer, unsigned long address);
+extern int sprint_trace(char *buffer, unsigned long address);

 /* Look up a kernel symbol and print it to the kernel messages. */
 extern void __print_symbol(const char *fmt, unsigned long address);
@@ -79,6 +80,12 @@ static inline int sprint_symbol(char *buffer, unsigned long addr)
 	return 0;
 }

+static inline int sprint_trace(char *buffer, unsigned long addr)
+{
+	*buffer = '\0';
+	return 0;
+}
+
 static inline int lookup_symbol_name(unsigned long addr, char *symname)
 {
 	return -ERANGE;
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 6f6d091b5757..2e43200cdc79 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -368,6 +368,32 @@ int sprint_symbol(char *buffer, unsigned long address)
 }
 EXPORT_SYMBOL_GPL(sprint_symbol);

+int sprint_trace(char *buffer, unsigned long address)
+{
+	char *modname;
+	const char *name;
+	unsigned long offset, size;
+	int len;
+
+	name = kallsyms_lookup(address-1, &size, &offset, &modname, buffer);
+	if (!name)
+		return sprintf(buffer, "0x%lx", address);
+
+	if (name != buffer)
+		strcpy(buffer, name);
+	len = strlen(buffer);
+	buffer += len;
+	offset++;
+
+	if (modname)
+		len += sprintf(buffer, "+%#lx/%#lx [%s]",
+						offset, size, modname);
+	else
+		len += sprintf(buffer, "+%#lx/%#lx", offset, size);
+
+	return len;
+}
+
 /* Look up a kernel symbol and print it to the kernel messages. */
 void __print_symbol(const char *fmt, unsigned long address)
 {
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index d3023df8477f..c9c56a54cc42 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -574,7 +574,9 @@ char *symbol_string(char *buf, char *end, void *ptr,
 	unsigned long value = (unsigned long) ptr;
 #ifdef CONFIG_KALLSYMS
 	char sym[KSYM_SYMBOL_LEN];
-	if (ext != 'f' && ext != 's')
+	if (ext == 'T')
+		sprint_trace(sym, value);
+	else if (ext != 'f' && ext != 's')
 		sprint_symbol(sym, value);
 	else
 		kallsyms_lookup(value, NULL, NULL, NULL, sym);
@@ -949,6 +951,7 @@ int kptr_restrict = 1;
  * - 'f' For simple symbolic function names without offset
  * - 'S' For symbolic direct pointers with offset
  * - 's' For symbolic direct pointers without offset
+ * - 'T' For backtraced symbolic direct pointers with offset
  * - 'R' For decoded struct resource, e.g., [mem 0x0-0x1f 64bit pref]
  * - 'r' For raw struct resource, e.g., [mem 0x0-0x1f flags 0x201]
  * - 'M' For a 6-byte MAC address, it prints the address in the
@@ -1008,6 +1011,7 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr,
 		/* Fallthrough */
 	case 'S':
 	case 's':
+	case 'T':
 		return symbol_string(buf, end, ptr, spec, *fmt);
 	case 'R':
 	case 'r':
@@ -1279,6 +1283,7 @@ qualifier:
  * %ps output the name of a text symbol without offset
  * %pF output the name of a function pointer with its offset
  * %pf output the name of a function pointer without its offset
+ * %pT output the name of a trace symbol with its offset
  * %pR output the address range in a struct resource with decoded flags
  * %pr output the address range in a struct resource with raw flags
  * %pM output a 6-byte MAC address with colons
--
1.7.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [RFC 2/2] x86, dumpstack: use %pT format specifier for stack trace
  2011-03-18 11:21 [RFC 0/2] Correct stack trace info Namhyung Kim
  2011-03-18 11:21 ` [RFC 1/2] vsprintf: introduce %pT format specifier Namhyung Kim
@ 2011-03-18 11:21 ` Namhyung Kim
  2011-03-18 15:29   ` Steven Rostedt
  1 sibling, 1 reply; 5+ messages in thread
From: Namhyung Kim @ 2011-03-18 11:21 UTC (permalink / raw)
  To: linux-kernel; +Cc: Frederic Weisbecker, Jason Wessel

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
---
 arch/x86/kernel/dumpstack.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 914065620f2c..809d027de28f 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -27,7 +27,7 @@ static int die_counter;
 
 void printk_address(unsigned long address, int reliable)
 {
-	printk(" [<%p>] %s%pS\n", (void *) address,
+	printk(" [<%p>] %s%pT\n", (void *) address,
 			reliable ? "" : "? ", (void *) address);
 }
 
-- 
1.7.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC 1/2] vsprintf: introduce %pT format specifier
  2011-03-18 11:21 ` [RFC 1/2] vsprintf: introduce %pT format specifier Namhyung Kim
@ 2011-03-18 15:27   ` Steven Rostedt
  0 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2011-03-18 15:27 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: linux-kernel, Frederic Weisbecker, Jason Wessel, Andrew Morton

On Fri, Mar 18, 2011 at 08:21:17PM +0900, Namhyung Kim wrote:
> The %pT format specifier is for stack backtrace. Its handler
> sprint_trace() does symbol lookup using (offset-1) to ensure
> the offset will not point outside of the function.
> 
> If there is a tail-call to the function marked "noreturn",
> gcc optimized out the code after the call then causes saved
> return address points outside of the function (i.e. the start
> of the next function), so messes up a call trace entry.
> This patch will fix it.

This looks uselful.

>  kernel/kallsyms.c        |   26 ++++++++++++++++++++++++++
>  lib/vsprintf.c           |    7 ++++++-
>  3 files changed, 39 insertions(+), 1 deletions(-)
> 
> diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
> index d8e9b3d1c23c..4964ed2021cc 100644
> --- a/include/linux/kallsyms.h
> +++ b/include/linux/kallsyms.h
> @@ -36,6 +36,7 @@ const char *kallsyms_lookup(unsigned long addr,
> 
>  /* Look up a kernel symbol and return it in a text buffer. */
>  extern int sprint_symbol(char *buffer, unsigned long address);
> +extern int sprint_trace(char *buffer, unsigned long address);
> 
>  /* Look up a kernel symbol and print it to the kernel messages. */
>  extern void __print_symbol(const char *fmt, unsigned long address);
> @@ -79,6 +80,12 @@ static inline int sprint_symbol(char *buffer, unsigned long addr)
>  	return 0;
>  }
> 
> +static inline int sprint_trace(char *buffer, unsigned long addr)
> +{
> +	*buffer = '\0';
> +	return 0;
> +}
> +
>  static inline int lookup_symbol_name(unsigned long addr, char *symname)
>  {
>  	return -ERANGE;
> diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> index 6f6d091b5757..2e43200cdc79 100644
> --- a/kernel/kallsyms.c
> +++ b/kernel/kallsyms.c
> @@ -368,6 +368,32 @@ int sprint_symbol(char *buffer, unsigned long address)
>  }
>  EXPORT_SYMBOL_GPL(sprint_symbol);
>

This desperately needs a comment. Please add a kernel doc style header
to this. (ie.:)

/**
 * sprint_trace - one line description
 * @buffer: what is the buffer
 * @address: what is the address
 *
 * describe why this function exists and how to use it.
 *
 * Returns: what are the return values.
 */

-- Steve

> +int sprint_trace(char *buffer, unsigned long address)
> +{
> +	char *modname;
> +	const char *name;
> +	unsigned long offset, size;
> +	int len;
> +
> +	name = kallsyms_lookup(address-1, &size, &offset, &modname, buffer);
> +	if (!name)
> +		return sprintf(buffer, "0x%lx", address);
> +
> +	if (name != buffer)
> +		strcpy(buffer, name);
> +	len = strlen(buffer);
> +	buffer += len;
> +	offset++;
> +
> +	if (modname)
> +		len += sprintf(buffer, "+%#lx/%#lx [%s]",
> +						offset, size, modname);
> +	else
> +		len += sprintf(buffer, "+%#lx/%#lx", offset, size);
> +
> +	return len;
> +}
> +
>  /* Look up a kernel symbol and print it to the kernel messages. */
>  void __print_symbol(const char *fmt, unsigned long address)
>  {
> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> index d3023df8477f..c9c56a54cc42 100644
> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c
> @@ -574,7 +574,9 @@ char *symbol_string(char *buf, char *end, void *ptr,
>  	unsigned long value = (unsigned long) ptr;
>  #ifdef CONFIG_KALLSYMS
>  	char sym[KSYM_SYMBOL_LEN];
> -	if (ext != 'f' && ext != 's')
> +	if (ext == 'T')
> +		sprint_trace(sym, value);
> +	else if (ext != 'f' && ext != 's')
>  		sprint_symbol(sym, value);
>  	else
>  		kallsyms_lookup(value, NULL, NULL, NULL, sym);
> @@ -949,6 +951,7 @@ int kptr_restrict = 1;
>   * - 'f' For simple symbolic function names without offset
>   * - 'S' For symbolic direct pointers with offset
>   * - 's' For symbolic direct pointers without offset
> + * - 'T' For backtraced symbolic direct pointers with offset
>   * - 'R' For decoded struct resource, e.g., [mem 0x0-0x1f 64bit pref]
>   * - 'r' For raw struct resource, e.g., [mem 0x0-0x1f flags 0x201]
>   * - 'M' For a 6-byte MAC address, it prints the address in the
> @@ -1008,6 +1011,7 @@ char *pointer(const char *fmt, char *buf, char *end, void *ptr,
>  		/* Fallthrough */
>  	case 'S':
>  	case 's':
> +	case 'T':
>  		return symbol_string(buf, end, ptr, spec, *fmt);
>  	case 'R':
>  	case 'r':
> @@ -1279,6 +1283,7 @@ qualifier:
>   * %ps output the name of a text symbol without offset
>   * %pF output the name of a function pointer with its offset
>   * %pf output the name of a function pointer without its offset
> + * %pT output the name of a trace symbol with its offset
>   * %pR output the address range in a struct resource with decoded flags
>   * %pr output the address range in a struct resource with raw flags
>   * %pM output a 6-byte MAC address with colons
> --
> 1.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC 2/2] x86, dumpstack: use %pT format specifier for stack trace
  2011-03-18 11:21 ` [RFC 2/2] x86, dumpstack: use %pT format specifier for stack trace Namhyung Kim
@ 2011-03-18 15:29   ` Steven Rostedt
  0 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2011-03-18 15:29 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: linux-kernel, Frederic Weisbecker, Jason Wessel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin

You needed to add the x86 maintainers to your Cc list.

-- Steve

On Fri, Mar 18, 2011 at 08:21:18PM +0900, Namhyung Kim wrote:
> Signed-off-by: Namhyung Kim <namhyung@gmail.com>
> ---
>  arch/x86/kernel/dumpstack.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
> index 914065620f2c..809d027de28f 100644
> --- a/arch/x86/kernel/dumpstack.c
> +++ b/arch/x86/kernel/dumpstack.c
> @@ -27,7 +27,7 @@ static int die_counter;
>  
>  void printk_address(unsigned long address, int reliable)
>  {
> -	printk(" [<%p>] %s%pS\n", (void *) address,
> +	printk(" [<%p>] %s%pT\n", (void *) address,
>  			reliable ? "" : "? ", (void *) address);
>  }
>  
> -- 
> 1.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2011-03-18 15:30 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-03-18 11:21 [RFC 0/2] Correct stack trace info Namhyung Kim
2011-03-18 11:21 ` [RFC 1/2] vsprintf: introduce %pT format specifier Namhyung Kim
2011-03-18 15:27   ` Steven Rostedt
2011-03-18 11:21 ` [RFC 2/2] x86, dumpstack: use %pT format specifier for stack trace Namhyung Kim
2011-03-18 15:29   ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).