[PATCH] x86/dumpstack: Walk frames when built with frame pointers

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] x86/dumpstack: Walk frames when built with frame pointers
@ 2014-04-26 18:10 Richard Yao
  2014-04-27 12:08 ` Ingo Molnar
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Yao @ 2014-04-26 18:10 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, H. Peter Anvin, x86, Andrew Morton, Tejun Heo,
	Vineet Gupta, Jesper Nilsson, Jiri Slaby, linux-kernel, kernel,
	Brian Behlendorf, Richard Yao

Stack traces are generated by scanning the stack and interpeting
anything that looks like it could be a pointer to something. We do not
need to do this when we have frame pointers, but we do it anyway, with
the distinction that we use the return pointers to mark actual frames by
the absence of a question mark.

The additional verbosity of stack scanning tends to bombard us with
walls of text for no gain in practice, so lets switch to printing only
stack frames when frame pointers are available. That we can spend less
time reading stack traces and more time looking at code.

Signed-off-by: Richard Yao <ryao@gentoo.org>
---
 arch/x86/kernel/dumpstack.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index d9c12d3..94ffe06 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -162,7 +162,11 @@ static void print_trace_address(void *data, unsigned long addr, int reliable)
 static const struct stacktrace_ops print_trace_ops = {
 	.stack			= print_trace_stack,
 	.address		= print_trace_address,
+#ifdef CONFIG_FRAME_POINTER
+	.walk_stack		= print_context_stack_bp,
+#else
 	.walk_stack		= print_context_stack,
+#endif
 };

 void
-- 
1.8.3.2

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
  2014-04-26 18:10 [PATCH] x86/dumpstack: Walk frames when built with frame pointers Richard Yao
@ 2014-04-27 12:08 ` Ingo Molnar
  2014-04-27 19:42   ` Peter Zijlstra
                     ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Ingo Molnar @ 2014-04-27 12:08 UTC (permalink / raw)
  To: Richard Yao
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Andrew Morton,
	Tejun Heo, Vineet Gupta, Jesper Nilsson, Jiri Slaby,
	linux-kernel, kernel, Brian Behlendorf, Linus Torvalds,
	Peter Zijlstra, Frédéric Weisbecker,
	Arnaldo Carvalho de Melo, Jiri Olsa


* Richard Yao <ryao@gentoo.org> wrote:

> Stack traces are generated by scanning the stack and interpeting 
> anything that looks like it could be a pointer to something. We do 
> not need to do this when we have frame pointers, but we do it 
> anyway, with the distinction that we use the return pointers to mark 
> actual frames by the absence of a question mark.
> 
> The additional verbosity of stack scanning tends to bombard us with 
> walls of text for no gain in practice, so lets switch to printing 
> only stack frames when frame pointers are available. That we can 
> spend less time reading stack traces and more time looking at code.
> 
> Signed-off-by: Richard Yao <ryao@gentoo.org>
> ---
>  arch/x86/kernel/dumpstack.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
> index d9c12d3..94ffe06 100644
> --- a/arch/x86/kernel/dumpstack.c
> +++ b/arch/x86/kernel/dumpstack.c
> @@ -162,7 +162,11 @@ static void print_trace_address(void *data, unsigned long addr, int reliable)
>  static const struct stacktrace_ops print_trace_ops = {
>  	.stack			= print_trace_stack,
>  	.address		= print_trace_address,
> +#ifdef CONFIG_FRAME_POINTER
> +	.walk_stack		= print_context_stack_bp,
> +#else
>  	.walk_stack		= print_context_stack,
> +#endif
>  };

I don't really like this patch, and this question comes up often, so 
let me try to explain it more verbosely.

Lets take a real-life example of a BP-only trace about a lockup:

 Call Trace:
  [<79018d24>] arch_trigger_all_cpu_backtrace+0x3c/0x64
  [<7916a34c>] do_raw_spin_lock+0xb7/0xe8
  [<792b9412>] _raw_spin_lock_irqsave+0x35/0x3e
  [<790391e8>] prepare_to_wait+0x18/0x57
  [<792b7158>] __wait_on_bit+0x20/0x5f
  [<792b71eb>] out_of_line_wait_on_bit+0x54/0x5c
  [<790e4eff>] __wait_on_buffer+0x1b/0x1e
  [<7910f615>] __ext3_get_inode_loc+0x1ef/0x254
  [<7911015f>] ext3_iget+0x45/0x346
  [<79114e03>] ext3_lookup+0x97/0xa6
  [<790c9c0e>] lookup_real+0x20/0x35
  [<790ca1d2>] __lookup_hash+0x2a/0x31
  [<790caa0e>] lookup_slow+0x36/0x8c
  [<790cbdda>] path_lookupat+0xf9/0x5c5
  [<790cc2c5>] filename_lookup+0x1f/0x84
  [<790cde98>] user_path_at_empty+0x3f/0x65
  [<790cdecb>] user_path_at+0xd/0xf
  [<790c6585>] vfs_fstatat+0x40/0x88
  [<790c65f8>] vfs_stat+0x13/0x15
  [<790c67e2>] sys_stat64+0x11/0x22
  [<792b9d04>] syscall_call+0x7/0xb

What does this call trace tell us? That we locked up somewhere in the 
ext3 code. That's all that we know.

Now lets take a look at a 'verbose' entry of the same lockup, that 
also outputs other entries that 'look like' kernel text addresses:

 Call Trace:
  [<79018d24>] arch_trigger_all_cpu_backtrace+0x3c/0x64
  [<7916a34c>] do_raw_spin_lock+0xb7/0xe8
  [<792b9412>] _raw_spin_lock_irqsave+0x35/0x3e
  [<790391e8>] ? prepare_to_wait+0x18/0x57
  [<790391e8>] prepare_to_wait+0x18/0x57
  [<7914a320>] ? generic_make_request+0x80/0xb5
  [<790e4f30>] ? unmap_underlying_metadata+0x2e/0x2e
  [<792b7158>] __wait_on_bit+0x20/0x5f
  [<7914a427>] ? submit_bio+0xd2/0xfb
  [<792b71eb>] out_of_line_wait_on_bit+0x54/0x5c
  [<790e4f30>] ? unmap_underlying_metadata+0x2e/0x2e
  [<79039086>] ? autoremove_wake_function+0x31/0x31
  [<790e4eff>] __wait_on_buffer+0x1b/0x1e
  [<7910f615>] __ext3_get_inode_loc+0x1ef/0x254
  [<7911015f>] ext3_iget+0x45/0x346
  [<79114e03>] ext3_lookup+0x97/0xa6
  [<790c9c0e>] lookup_real+0x20/0x35
  [<790ca1d2>] __lookup_hash+0x2a/0x31
  [<790caa0e>] lookup_slow+0x36/0x8c
  [<790cbdda>] path_lookupat+0xf9/0x5c5
  [<790cc2c5>] filename_lookup+0x1f/0x84
  [<790cde98>] user_path_at_empty+0x3f/0x65
  [<790cdecb>] user_path_at+0xd/0xf
  [<790c6585>] vfs_fstatat+0x40/0x88
  [<7903f844>] ? lg_local_unlock+0x31/0x47
  [<790c65f8>] vfs_stat+0x13/0x15
  [<790c67e2>] sys_stat64+0x11/0x22
  [<790c3c47>] ? __fput+0x187/0x1a0
  [<792b9d37>] ? restore_all+0xf/0xf
  [<79165bb4>] ? trace_hardirqs_on_thunk+0xc/0x10
  [<792b9d04>] syscall_call+0x7/0xb

Firstly, it's still easy to tell the true backtrace: ignore all the 
'?' entries.

But the '?' entries also tell us something more: that recently this 
CPU submitted IO. That might or might not be suspected from the 
original trace, but from the 'extended' trace it's really obvious that 
IO activity happened before the lockup as well.

There were many other examples in the past where '?' entries gave us 
clues about the nature of a bug: they represent a poor man's trace of 
what happened recently on that CPU.

So it's useful information for hairy bugs and it would be sad to 
remove them.

Having said that, your complaint that '?' entries can make reading of 
back traces more difficult is valid as well - so maybe we can do 
something about that.

For example if we used more horizontal separation in the output 
format:

 Call Trace:
  [<79018d24>] arch_trigger_all_cpu_backtrace+0x3c/0x64
  [<7916a34c>] do_raw_spin_lock+0xb7/0xe8
  [<792b9412>] _raw_spin_lock_irqsave+0x35/0x3e
  [<790391e8>]                                   # ? prepare_to_wait+0x18/0x57
  [<790391e8>] prepare_to_wait+0x18/0x57
  [<7914a320>]                                   # ? generic_make_request+0x80/0xb5
  [<790e4f30>]                                   # ? unmap_underlying_metadata+0x2e/0x2e
  [<792b7158>] __wait_on_bit+0x20/0x5f
  [<7914a427>]                                   # ? submit_bio+0xd2/0xfb
  [<792b71eb>] out_of_line_wait_on_bit+0x54/0x5c
  [<790e4f30>]                                   # ? unmap_underlying_metadata+0x2e/0x2e
  [<79039086>]                                   # ? autoremove_wake_function+0x31/0x31
  [<790e4eff>] __wait_on_buffer+0x1b/0x1e
  [<7910f615>] __ext3_get_inode_loc+0x1ef/0x254
  [<7911015f>] ext3_iget+0x45/0x346
  [<79114e03>] ext3_lookup+0x97/0xa6
  [<790c9c0e>] lookup_real+0x20/0x35
  [<790ca1d2>] __lookup_hash+0x2a/0x31
  [<790caa0e>] lookup_slow+0x36/0x8c
  [<790cbdda>] path_lookupat+0xf9/0x5c5
  [<790cc2c5>] filename_lookup+0x1f/0x84
  [<790cde98>] user_path_at_empty+0x3f/0x65
  [<790cdecb>] user_path_at+0xd/0xf
  [<790c6585>] vfs_fstatat+0x40/0x88
  [<7903f844>]                                   # ? lg_local_unlock+0x31/0x47
  [<790c65f8>] vfs_stat+0x13/0x15
  [<790c67e2>] sys_stat64+0x11/0x22
  [<790c3c47>]                                   # ? __fput+0x187/0x1a0
  [<792b9d37>]                                   # ? restore_all+0xf/0xf
  [<79165bb4>]                                   # ? trace_hardirqs_on_thunk+0xc/0x10
  [<792b9d04>] syscall_call+0x7/0xb

then the information would still be there, it would still be in the 
right place, but it would also be much easier to ignore the right side 
column it visually. (The '# ?' prefix would also make it clear that 
this information is not so important as the left hand side column.)

Or something like that. Assuming this doesn't confuse tools and such.

Thanks,

	Ingo


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
  2014-04-27 12:08 ` Ingo Molnar
@ 2014-04-27 19:42   ` Peter Zijlstra
  2014-04-27 19:51   ` Richard Yao
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Peter Zijlstra @ 2014-04-27 19:42 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Richard Yao, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Andrew Morton, Tejun Heo, Vineet Gupta, Jesper Nilsson,
	Jiri Slaby, linux-kernel, kernel, Brian Behlendorf,
	Linus Torvalds, Frédéric Weisbecker,
	Arnaldo Carvalho de Melo, Jiri Olsa

On Sun, Apr 27, 2014 at 02:08:20PM +0200, Ingo Molnar wrote:
> For example if we used more horizontal separation in the output 
> format:
> 
>  Call Trace:
>   [<79018d24>] arch_trigger_all_cpu_backtrace+0x3c/0x64
>   [<7916a34c>] do_raw_spin_lock+0xb7/0xe8
>   [<792b9412>] _raw_spin_lock_irqsave+0x35/0x3e
>   [<790391e8>]                                   # ? prepare_to_wait+0x18/0x57
>   [<790391e8>] prepare_to_wait+0x18/0x57
>   [<7914a320>]                                   # ? generic_make_request+0x80/0xb5
>   [<790e4f30>]                                   # ? unmap_underlying_metadata+0x2e/0x2e
>   [<792b7158>] __wait_on_bit+0x20/0x5f
>   [<7914a427>]                                   # ? submit_bio+0xd2/0xfb
>   [<792b71eb>] out_of_line_wait_on_bit+0x54/0x5c
>   [<790e4f30>]                                   # ? unmap_underlying_metadata+0x2e/0x2e
>   [<79039086>]                                   # ? autoremove_wake_function+0x31/0x31
>   [<790e4eff>] __wait_on_buffer+0x1b/0x1e
>   [<7910f615>] __ext3_get_inode_loc+0x1ef/0x254
>   [<7911015f>] ext3_iget+0x45/0x346
>   [<79114e03>] ext3_lookup+0x97/0xa6
>   [<790c9c0e>] lookup_real+0x20/0x35
>   [<790ca1d2>] __lookup_hash+0x2a/0x31
>   [<790caa0e>] lookup_slow+0x36/0x8c
>   [<790cbdda>] path_lookupat+0xf9/0x5c5
>   [<790cc2c5>] filename_lookup+0x1f/0x84
>   [<790cde98>] user_path_at_empty+0x3f/0x65
>   [<790cdecb>] user_path_at+0xd/0xf
>   [<790c6585>] vfs_fstatat+0x40/0x88
>   [<7903f844>]                                   # ? lg_local_unlock+0x31/0x47
>   [<790c65f8>] vfs_stat+0x13/0x15
>   [<790c67e2>] sys_stat64+0x11/0x22
>   [<790c3c47>]                                   # ? __fput+0x187/0x1a0
>   [<792b9d37>]                                   # ? restore_all+0xf/0xf
>   [<79165bb4>]                                   # ? trace_hardirqs_on_thunk+0xc/0x10
>   [<792b9d04>] syscall_call+0x7/0xb
> 
> then the information would still be there, it would still be in the 
> right place, but it would also be much easier to ignore the right side 
> column it visually. (The '# ?' prefix would also make it clear that 
> this information is not so important as the left hand side column.)

Might be my brain being weird, but I find the above actually harder to
parse.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
  2014-04-27 12:08 ` Ingo Molnar
  2014-04-27 19:42   ` Peter Zijlstra
@ 2014-04-27 19:51   ` Richard Yao
  2014-04-27 20:08   ` Linus Torvalds
  2014-04-30 21:56   ` Frederic Weisbecker
  3 siblings, 0 replies; 15+ messages in thread
From: Richard Yao @ 2014-04-27 19:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Andrew Morton,
	Tejun Heo, Vineet Gupta, Jesper Nilsson, Jiri Slaby,
	linux-kernel, kernel, Brian Behlendorf, Linus Torvalds,
	Peter Zijlstra, Frédéric Weisbecker,
	Arnaldo Carvalho de Melo, Jiri Olsa

I happened to write this with an email client that used Rich Text, which the LKML rejected, so I am resending it in plaintext format. My future replies will be made in plaintext, which is what my normal development system’s email client does.

On Apr 27, 2014, at 8:08 AM, Ingo Molnar <mingo@kernel.org> wrote:

> 
> * Richard Yao <ryao@gentoo.org> wrote:
> 
>> Stack traces are generated by scanning the stack and interpeting 
>> anything that looks like it could be a pointer to something. We do 
>> not need to do this when we have frame pointers, but we do it 
>> anyway, with the distinction that we use the return pointers to mark 
>> actual frames by the absence of a question mark.
>> 
>> The additional verbosity of stack scanning tends to bombard us with 
>> walls of text for no gain in practice, so lets switch to printing 
>> only stack frames when frame pointers are available. That we can 
>> spend less time reading stack traces and more time looking at code.
>> 
>> Signed-off-by: Richard Yao <ryao@gentoo.org>
>> ---
>> arch/x86/kernel/dumpstack.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>> 
>> diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
>> index d9c12d3..94ffe06 100644
>> --- a/arch/x86/kernel/dumpstack.c
>> +++ b/arch/x86/kernel/dumpstack.c
>> @@ -162,7 +162,11 @@ static void print_trace_address(void *data, unsigned long addr, int reliable)
>> static const struct stacktrace_ops print_trace_ops = {
>> 	.stack			= print_trace_stack,
>> 	.address		= print_trace_address,
>> +#ifdef CONFIG_FRAME_POINTER
>> +	.walk_stack		= print_context_stack_bp,
>> +#else
>> 	.walk_stack		= print_context_stack,
>> +#endif
>> };
> 
> I don't really like this patch, and this question comes up often, so 
> let me try to explain it more verbosely.
> 
> Lets take a real-life example of a BP-only trace about a lockup:
> 
> Call Trace:
>  [<79018d24>] arch_trigger_all_cpu_backtrace+0x3c/0x64
>  [<7916a34c>] do_raw_spin_lock+0xb7/0xe8
>  [<792b9412>] _raw_spin_lock_irqsave+0x35/0x3e
>  [<790391e8>] prepare_to_wait+0x18/0x57
>  [<792b7158>] __wait_on_bit+0x20/0x5f
>  [<792b71eb>] out_of_line_wait_on_bit+0x54/0x5c
>  [<790e4eff>] __wait_on_buffer+0x1b/0x1e
>  [<7910f615>] __ext3_get_inode_loc+0x1ef/0x254
>  [<7911015f>] ext3_iget+0x45/0x346
>  [<79114e03>] ext3_lookup+0x97/0xa6
>  [<790c9c0e>] lookup_real+0x20/0x35
>  [<790ca1d2>] __lookup_hash+0x2a/0x31
>  [<790caa0e>] lookup_slow+0x36/0x8c
>  [<790cbdda>] path_lookupat+0xf9/0x5c5
>  [<790cc2c5>] filename_lookup+0x1f/0x84
>  [<790cde98>] user_path_at_empty+0x3f/0x65
>  [<790cdecb>] user_path_at+0xd/0xf
>  [<790c6585>] vfs_fstatat+0x40/0x88
>  [<790c65f8>] vfs_stat+0x13/0x15
>  [<790c67e2>] sys_stat64+0x11/0x22
>  [<792b9d04>] syscall_call+0x7/0xb
> 
> What does this call trace tell us? That we locked up somewhere in the 
> ext3 code. That's all that we know.

Actually, we know some other things:

1. Something should have set the bit and started IO.
2. The callback meant to clear that bit has been executed yet.
3. We want to find out who is responsible for that callback. i.e. We want stack traces of all other threads.

It is possible that the thread handling the callback deadlocked. In that case, backtraces of the other threads should suggest the possibility. Reviewing the backtraces is *much* easier without the added noise of the historical stack frames and in all likelihood, the person doing that review would catch the problem faster when they do reveal the issue. There is also no possibility of it being present in those traces, but missed because of the noise from these frames.

> 
> Now lets take a look at a 'verbose' entry of the same lockup, that 
> also outputs other entries that 'look like' kernel text addresses:
> 
> Call Trace:
>  [<79018d24>] arch_trigger_all_cpu_backtrace+0x3c/0x64
>  [<7916a34c>] do_raw_spin_lock+0xb7/0xe8
>  [<792b9412>] _raw_spin_lock_irqsave+0x35/0x3e
>  [<790391e8>] ? prepare_to_wait+0x18/0x57
>  [<790391e8>] prepare_to_wait+0x18/0x57
>  [<7914a320>] ? generic_make_request+0x80/0xb5
>  [<790e4f30>] ? unmap_underlying_metadata+0x2e/0x2e
>  [<792b7158>] __wait_on_bit+0x20/0x5f
>  [<7914a427>] ? submit_bio+0xd2/0xfb
>  [<792b71eb>] out_of_line_wait_on_bit+0x54/0x5c
>  [<790e4f30>] ? unmap_underlying_metadata+0x2e/0x2e
>  [<79039086>] ? autoremove_wake_function+0x31/0x31
>  [<790e4eff>] __wait_on_buffer+0x1b/0x1e
>  [<7910f615>] __ext3_get_inode_loc+0x1ef/0x254
>  [<7911015f>] ext3_iget+0x45/0x346
>  [<79114e03>] ext3_lookup+0x97/0xa6
>  [<790c9c0e>] lookup_real+0x20/0x35
>  [<790ca1d2>] __lookup_hash+0x2a/0x31
>  [<790caa0e>] lookup_slow+0x36/0x8c
>  [<790cbdda>] path_lookupat+0xf9/0x5c5
>  [<790cc2c5>] filename_lookup+0x1f/0x84
>  [<790cde98>] user_path_at_empty+0x3f/0x65
>  [<790cdecb>] user_path_at+0xd/0xf
>  [<790c6585>] vfs_fstatat+0x40/0x88
>  [<7903f844>] ? lg_local_unlock+0x31/0x47
>  [<790c65f8>] vfs_stat+0x13/0x15
>  [<790c67e2>] sys_stat64+0x11/0x22
>  [<790c3c47>] ? __fput+0x187/0x1a0
>  [<792b9d37>] ? restore_all+0xf/0xf
>  [<79165bb4>] ? trace_hardirqs_on_thunk+0xc/0x10
>  [<792b9d04>] syscall_call+0x7/0xb
> 
> Firstly, it's still easy to tell the true backtrace: ignore all the 
> '?' entries.
> 
> But the '?' entries also tell us something more: that recently this 
> CPU submitted IO. That might or might not be suspected from the 
> original trace, but from the 'extended' trace it's really obvious that 
> IO activity happened before the lockup as well.
> 
> There were many other examples in the past where '?' entries gave us 
> clues about the nature of a bug: they represent a poor man's trace of 
> what happened recently on that CPU.
> 
> So it's useful information for hairy bugs and it would be sad to 
> remove them.

It is true that this thread did block IO at some point in the past. It is also possible that block IO operation is related to this stack trace, but the many ways that those frames could get there makes it impossible to conclude that the operation is related. A few ways that occur to me are:

1. This could have gone onto the stack as a result of directory reclaim.
2. A previous syscall by user land could have triggered block IO on something unrelated.
3. If this had been on a kernel thread that has no user land stack like kswapd, then this information can be expected to be there.

Upon seeing the stack trace you provided, I would want to look at all threads on the system under the assumption that some thread started IO and some other thread (or maybe even an interrupt handler) should be responsible for finishing it. I suspect that is a more effective way to get a handle on the problem. If that does not work, then I likely want to attach kgdb or at the very least, start looking at code paths that could start the IO. In such a case, being told that the current thread might have started IO is not really that useful to me. I still need to consider all possible points where this IO could have started.

In this context, having the historical stack frames in the backtrace seems to be a hinderance, rather than a help. They serve to slow down manual inspection of the actual stacks and can obscure useful information that would have been caught had they never been printed. They also can waste valuable screen space and cause the cause of useful information by overflowing scroll back buffers.

Reviewing the historical stack frames under the assumption that they might be useful encourages one to think that the absence of such frames would be useful give that the presence might be useful, but that could not be further than the truth. Having them appear depends on the kernel .config, the toolchain used, the amount of memory pressure (direct reclaim), the call paths taken, etcetera. In short, it is mostly a matter of luck and I find their usefulness when they do appear in backtraces to be very questionable.

In addition, the lack of determines from non-useful information in backtraces makes it more difficult to catch issues with the debug information itself. This is often the case in binary distributions, which often strip kernel modules by passing --strip-uneeded to strip. That creates modules where `file` reports that they are unstripped, but in actuality, they are stripped of all static symbols. In such cases, the stack frames are often garbage, but someone trying to make sense of historical stack frames might not notice because they are used to seeing strange things. In such cases, trying to interpret these frames is definitely a waste of time, but there is nothing (e.g. a taint message?) to inform us of this.

While I believe that the the historical stack frames have been useful in the past for some difficult bugs (when the stars aligned properly) and can still be useful in such instances, I consider the consequences of printing them to outweigh the benefits. In the common case where these frames are not useful, their presence wastes time that could have gone to things that would actually be useful. In the rare cases where they are useful, people relying on them spend their time on one-off solutions. Spending time on many of these one-off solutions is a less efficient use of time than spending time on tools to catch such issues out of necessity.

In both cases, there seems to plenty of time that would be better spent by not printing these frames. Some possibilities that occur to me are better static analysis tools, more widespread adoption of crash dumps and more time spent on dynamic tracing tools. Any one of them should ultimately be beneficial to the overall bug churn rate and consequently, get more eyeballs on difficult to debug problems, such as the rare ones where these historical stack frames are useful.

> Having said that, your complaint that '?' entries can make reading of 
> back traces more difficult is valid as well - so maybe we can do 
> something about that.
> 
> For example if we used more horizontal separation in the output 
> format:
> 
> Call Trace:
>  [<79018d24>] arch_trigger_all_cpu_backtrace+0x3c/0x64
>  [<7916a34c>] do_raw_spin_lock+0xb7/0xe8
>  [<792b9412>] _raw_spin_lock_irqsave+0x35/0x3e
>  [<790391e8>]                                   # ? prepare_to_wait+0x18/0x57
>  [<790391e8>] prepare_to_wait+0x18/0x57
>  [<7914a320>]                                   # ? generic_make_request+0x80/0xb5
>  [<790e4f30>]                                   # ? unmap_underlying_metadata+0x2e/0x2e
>  [<792b7158>] __wait_on_bit+0x20/0x5f
>  [<7914a427>]                                   # ? submit_bio+0xd2/0xfb
>  [<792b71eb>] out_of_line_wait_on_bit+0x54/0x5c
>  [<790e4f30>]                                   # ? unmap_underlying_metadata+0x2e/0x2e
>  [<79039086>]                                   # ? autoremove_wake_function+0x31/0x31
>  [<790e4eff>] __wait_on_buffer+0x1b/0x1e
>  [<7910f615>] __ext3_get_inode_loc+0x1ef/0x254
>  [<7911015f>] ext3_iget+0x45/0x346
>  [<79114e03>] ext3_lookup+0x97/0xa6
>  [<790c9c0e>] lookup_real+0x20/0x35
>  [<790ca1d2>] __lookup_hash+0x2a/0x31
>  [<790caa0e>] lookup_slow+0x36/0x8c
>  [<790cbdda>] path_lookupat+0xf9/0x5c5
>  [<790cc2c5>] filename_lookup+0x1f/0x84
>  [<790cde98>] user_path_at_empty+0x3f/0x65
>  [<790cdecb>] user_path_at+0xd/0xf
>  [<790c6585>] vfs_fstatat+0x40/0x88
>  [<7903f844>]                                   # ? lg_local_unlock+0x31/0x47
>  [<790c65f8>] vfs_stat+0x13/0x15
>  [<790c67e2>] sys_stat64+0x11/0x22
>  [<790c3c47>]                                   # ? __fput+0x187/0x1a0
>  [<792b9d37>]                                   # ? restore_all+0xf/0xf
>  [<79165bb4>]                                   # ? trace_hardirqs_on_thunk+0xc/0x10
>  [<792b9d04>] syscall_call+0x7/0xb
> 
> then the information would still be there, it would still be in the 
> right place, but it would also be much easier to ignore the right side 
> column it visually. (The '# ?' prefix would also make it clear that 
> this information is not so important as the left hand side column.)
> 
> Or something like that. Assuming this doesn't confuse tools and such.

If we do something like this, it would be best to make it configurable via a sysctl. We could have settings for something like the behavior you proposed, the original behavior we have now and the behavior I would like to see.

That being said, I would like to know if you still feel this information is useful after hearing my thoughts on it.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
  2014-04-27 12:08 ` Ingo Molnar
  2014-04-27 19:42   ` Peter Zijlstra
  2014-04-27 19:51   ` Richard Yao
@ 2014-04-27 20:08   ` Linus Torvalds
  2014-04-27 20:36     ` Richard Yao
  2014-05-07 17:18     ` Ingo Molnar
  2014-04-30 21:56   ` Frederic Weisbecker
  3 siblings, 2 replies; 15+ messages in thread
From: Linus Torvalds @ 2014-04-27 20:08 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Richard Yao, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Andrew Morton, Tejun Heo, Vineet Gupta,
	Jesper Nilsson, Jiri Slaby, Linux Kernel Mailing List, kernel,
	Brian Behlendorf, Peter Zijlstra, Frédéric Weisbecker,
	Arnaldo Carvalho de Melo, Jiri Olsa

On Sun, Apr 27, 2014 at 5:08 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> So it's useful information for hairy bugs and it would be sad to
> remove them.

I tend to agree. I've often found the left-overs to be good clues
about what just got called. Although equally often it's another kind
of clue entirely: that the stack frame of some of the functions
involved in the real call frame is much too big, leaving room for that
stale information to lay around.

> Having said that, your complaint that '?' entries can make reading of
> back traces more difficult is valid as well - so maybe we can do
> something about that.

Quite frankly, I'd much rather just remove the annoying hex numbers
that are imnsho *much* more distracting. Possibly even the "/0xsize"
part (although that is at least somewhat useful to judge where in the
function it is).

And while it would be horrible for readability, it might also be a
good idea to replace the newlines with something like " -> " instead,
because we are quite often vertically challenged. But that could
really make things pretty unreadable.

So to take your example, it might be something like this

arch_trigger_all_cpu_backtrace+0x3c -> do_raw_spin_lock+0xb7
 -> _raw_spin_lock_irqsave+0x35 -> ? prepare_to_wait+0x18
 -> prepare_to_wait+0x18 -> ? generic_make_request+0x80
 -> ? unmap_underlying_metadata+0x2e -> __wait_on_bit+0x20
 -> ? submit_bio+0xd2 -> out_of_line_wait_on_bit+0x54
 -> ? unmap_underlying_metadata+0x2e -> ? autoremove_wake_function+0x31
 -> __wait_on_buffer+0x1b -> __ext3_get_inode_loc+0x1ef -> ext3_iget+0x45
 -> ext3_lookup+0x97 -> lookup_real+0x20 -> __lookup_hash+0x2a
 -> lookup_slow+0x36 -> path_lookupat+0xf9 -> filename_lookup+0x1f
 -> user_path_at_empty+0x3f -> user_path_at+0xd -> vfs_fstatat+0x40
 -> ? lg_local_unlock+0x31 -> vfs_stat+0x13 -> sys_stat64+0x11
 -> ? __fput+0x187 -> ? restore_all+0xf -> ? trace_hardirqs_on_thunk+0xc
 -> syscall_call+0x7

which is admittedly complete line noise, but is just 13 lines rather
than 31. That can sometimes be a really big deal.

Also, we might want to cap the number of lines regardless. It is true
that sometimes the really deep call chains can be interesting, but
equally often they make other important stuff scroll off the screen
(oopses that don't get caught in /sys/log/messages because they kill
the machine are the worst to debug, and we still end up having people
send pictures taken with digital cameras of them), so it's a "win
some, lose some" kind of thing.

Of course, the questionable stale entries on the stack can (and do)
make the whole "scroll off the screen" thing worse. So I dunno.

              Linus

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
  2014-04-27 20:08   ` Linus Torvalds
@ 2014-04-27 20:36     ` Richard Yao
  2014-05-07 17:18     ` Ingo Molnar
  1 sibling, 0 replies; 15+ messages in thread
From: Richard Yao @ 2014-04-27 20:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Andrew Morton, Tejun Heo, Vineet Gupta,
	Jesper Nilsson, Jiri Slaby, Linux Kernel Mailing List, kernel,
	Brian Behlendorf, Peter Zijlstra, Frédéric Weisbecker,
	Arnaldo Carvalho de Melo, Jiri Olsa

On Apr 27, 2014, at 4:08 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Sun, Apr 27, 2014 at 5:08 AM, Ingo Molnar <mingo@kernel.org> wrote:
>> 
>> So it's useful information for hairy bugs and it would be sad to
>> remove them.
> 
> I tend to agree. I've often found the left-overs to be good clues
> about what just got called. Although equally often it's another kind
> of clue entirely: that the stack frame of some of the functions
> involved in the real call frame is much too big, leaving room for that
> stale information to lay around.

Getting information about the frames being too big is a useful side effect. I don’t think the omission of historical frames from the stack traces needs to eliminate that. In particular, we could print the stack depth, frame size and number of possible pointers to kernel code spotted in the frame alongside the stack frames that we obtain from unwinding the stack.

>> Having said that, your complaint that '?' entries can make reading of
>> back traces more difficult is valid as well - so maybe we can do
>> something about that.
> 
> Quite frankly, I'd much rather just remove the annoying hex numbers
> that are imnsho *much* more distracting. Possibly even the "/0xsize"
> part (although that is at least somewhat useful to judge where in the
> function it is).
> 
> And while it would be horrible for readability, it might also be a
> good idea to replace the newlines with something like " -> " instead,
> because we are quite often vertically challenged. But that could
> really make things pretty unreadable.
> 
> So to take your example, it might be something like this
> 
> arch_trigger_all_cpu_backtrace+0x3c -> do_raw_spin_lock+0xb7
> -> _raw_spin_lock_irqsave+0x35 -> ? prepare_to_wait+0x18
> -> prepare_to_wait+0x18 -> ? generic_make_request+0x80
> -> ? unmap_underlying_metadata+0x2e -> __wait_on_bit+0x20
> -> ? submit_bio+0xd2 -> out_of_line_wait_on_bit+0x54
> -> ? unmap_underlying_metadata+0x2e -> ? autoremove_wake_function+0x31
> -> __wait_on_buffer+0x1b -> __ext3_get_inode_loc+0x1ef -> ext3_iget+0x45
> -> ext3_lookup+0x97 -> lookup_real+0x20 -> __lookup_hash+0x2a
> -> lookup_slow+0x36 -> path_lookupat+0xf9 -> filename_lookup+0x1f
> -> user_path_at_empty+0x3f -> user_path_at+0xd -> vfs_fstatat+0x40
> -> ? lg_local_unlock+0x31 -> vfs_stat+0x13 -> sys_stat64+0x11
> -> ? __fput+0x187 -> ? restore_all+0xf -> ? trace_hardirqs_on_thunk+0xc
> -> syscall_call+0x7
> 
> which is admittedly complete line noise, but is just 13 lines rather
> than 31. That can sometimes be a really big deal.
> 
> Also, we might want to cap the number of lines regardless. It is true
> that sometimes the really deep call chains can be interesting, but
> equally often they make other important stuff scroll off the screen
> (oopses that don't get caught in /sys/log/messages because they kill
> the machine are the worst to debug, and we still end up having people
> send pictures taken with digital cameras of them), so it's a "win
> some, lose some" kind of thing.
> 
> Of course, the questionable stale entries on the stack can (and do)
> make the whole "scroll off the screen" thing worse. So I dunno.
> 
>              Linus

I suppose one option would be to write a patch to make this configurable via sysctl, throw it into staging so only kernel developers have access to it (I hope) and then try out each of these variations on our own machines to see which one we like best. The patch I submitted to the list is one that I am using myself and I have come to really like having only stack frames described by frame pointers printed. I think others who try more condensed stacks will come to like them too.

If you are interested in doing that experiment, I could put a patch together to do it.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
  2014-04-27 12:08 ` Ingo Molnar
                     ` (2 preceding siblings ...)
  2014-04-27 20:08   ` Linus Torvalds
@ 2014-04-30 21:56   ` Frederic Weisbecker
  2014-05-07 16:40     ` Ingo Molnar
  3 siblings, 1 reply; 15+ messages in thread
From: Frederic Weisbecker @ 2014-04-30 21:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Richard Yao, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Andrew Morton, Tejun Heo, Vineet Gupta, Jesper Nilsson,
	Jiri Slaby, linux-kernel, kernel, Brian Behlendorf,
	Linus Torvalds, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Jiri Olsa

On Sun, Apr 27, 2014 at 02:08:20PM +0200, Ingo Molnar wrote:
> 
> * Richard Yao <ryao@gentoo.org> wrote:
> 
> > Stack traces are generated by scanning the stack and interpeting 
> > anything that looks like it could be a pointer to something. We do 
> > not need to do this when we have frame pointers, but we do it 
> > anyway, with the distinction that we use the return pointers to mark 
> > actual frames by the absence of a question mark.
> > 
> > The additional verbosity of stack scanning tends to bombard us with 
> > walls of text for no gain in practice, so lets switch to printing 
> > only stack frames when frame pointers are available. That we can 
> > spend less time reading stack traces and more time looking at code.
> > 
> > Signed-off-by: Richard Yao <ryao@gentoo.org>
> > ---
> >  arch/x86/kernel/dumpstack.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
> > index d9c12d3..94ffe06 100644
> > --- a/arch/x86/kernel/dumpstack.c
> > +++ b/arch/x86/kernel/dumpstack.c
> > @@ -162,7 +162,11 @@ static void print_trace_address(void *data, unsigned long addr, int reliable)
> >  static const struct stacktrace_ops print_trace_ops = {
> >  	.stack			= print_trace_stack,
> >  	.address		= print_trace_address,
> > +#ifdef CONFIG_FRAME_POINTER
> > +	.walk_stack		= print_context_stack_bp,
> > +#else
> >  	.walk_stack		= print_context_stack,
> > +#endif
> >  };

Besides the complementary informations brought by the full stack walk,
another big argument toward keeping full stack walk is that if your frame
pointer is screwed for whatever reason, you still have a useful stack trace.

I have seen and fixed several broken frame links in x86-64 by the past. Those
are very subtle and often hardly visible issues because, if they are easily spotted
on common frame scenarios like : task > irq, they are much harder to find on trickier,
rarer frame scenarios such as: task -> softirq -> irq -> nmi -> debug exception ->....

For example before a2bbe75089d5eb9a3a46d50dd5c215e213790288
("x86: Don't use frame pointer to save old stack on irq entry"), we were missing
entire stack frames on nesting irqs (hardirq on softirqs) while using pure frame
pointer based unwinding.

Who knows if we have other remaining issues like this? Especially given the high
possible number of frame combinations between task, irq, softirq, nmi and exceptions.
Multiply the contexts possibility by the number of possible archs out there and their
stack switch implementations.

Also further frame links breakages, we have many other possibilities to end up
with misleading frame pointers. Relying on that source alone definetly reduce the
reliability of our stacktraces.

So this goes way beyond just missing complementary informations. Debugging robustness
itself is actually very concerned here if we remove the full stack walk.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
  2014-04-30 21:56   ` Frederic Weisbecker
@ 2014-05-07 16:40     ` Ingo Molnar
  2014-06-06  8:17       ` Peter Zijlstra
  0 siblings, 1 reply; 15+ messages in thread
From: Ingo Molnar @ 2014-05-07 16:40 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Richard Yao, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Andrew Morton, Tejun Heo, Vineet Gupta, Jesper Nilsson,
	Jiri Slaby, linux-kernel, kernel, Brian Behlendorf,
	Linus Torvalds, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Jiri Olsa


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Sun, Apr 27, 2014 at 02:08:20PM +0200, Ingo Molnar wrote:
> > 
> > * Richard Yao <ryao@gentoo.org> wrote:
> > 
> > > Stack traces are generated by scanning the stack and interpeting 
> > > anything that looks like it could be a pointer to something. We do 
> > > not need to do this when we have frame pointers, but we do it 
> > > anyway, with the distinction that we use the return pointers to mark 
> > > actual frames by the absence of a question mark.
> > > 
> > > The additional verbosity of stack scanning tends to bombard us with 
> > > walls of text for no gain in practice, so lets switch to printing 
> > > only stack frames when frame pointers are available. That we can 
> > > spend less time reading stack traces and more time looking at code.
> > > 
> > > Signed-off-by: Richard Yao <ryao@gentoo.org>
> > > ---
> > >  arch/x86/kernel/dumpstack.c | 4 ++++
> > >  1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
> > > index d9c12d3..94ffe06 100644
> > > --- a/arch/x86/kernel/dumpstack.c
> > > +++ b/arch/x86/kernel/dumpstack.c
> > > @@ -162,7 +162,11 @@ static void print_trace_address(void *data, unsigned long addr, int reliable)
> > >  static const struct stacktrace_ops print_trace_ops = {
> > >  	.stack			= print_trace_stack,
> > >  	.address		= print_trace_address,
> > > +#ifdef CONFIG_FRAME_POINTER
> > > +	.walk_stack		= print_context_stack_bp,
> > > +#else
> > >  	.walk_stack		= print_context_stack,
> > > +#endif
> > >  };
> 
> Besides the complementary informations brought by the full stack 
> walk, another big argument toward keeping full stack walk is that if 
> your frame pointer is screwed for whatever reason, you still have a 
> useful stack trace.
> 
> I have seen and fixed several broken frame links in x86-64 by the 
> past. Those are very subtle and often hardly visible issues because, 
> if they are easily spotted on common frame scenarios like : task > 
> irq, they are much harder to find on trickier, rarer frame scenarios 
> such as: task -> softirq -> irq -> nmi -> debug exception ->....
> 
> For example before a2bbe75089d5eb9a3a46d50dd5c215e213790288 ("x86: 
> Don't use frame pointer to save old stack on irq entry"), we were 
> missing entire stack frames on nesting irqs (hardirq on softirqs) 
> while using pure frame pointer based unwinding.
> 
> Who knows if we have other remaining issues like this? Especially 
> given the high possible number of frame combinations between task, 
> irq, softirq, nmi and exceptions. Multiply the contexts possibility 
> by the number of possible archs out there and their stack switch 
> implementations.
> 
> Also further frame links breakages, we have many other possibilities 
> to end up with misleading frame pointers. Relying on that source 
> alone definetly reduce the reliability of our stacktraces.
> 
> So this goes way beyond just missing complementary informations. 
> Debugging robustness itself is actually very concerned here if we 
> remove the full stack walk.

Agreed, that's a very good point.

Also, consider the following holistic argument, what is easier to 
achieve, when looking at an oops and not seeing the bug:

  - if only I had more information
  - if only I had less information

we cannot put in information that we cut out, but it's not 
particularly hard to skip overly verbose information in most cases.
 
Yes, there's a line to be drawn with verbosity: scroll-off is a 
concern when the oops does not make it to a log file.

So I don't really know.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
  2014-04-27 20:08   ` Linus Torvalds
  2014-04-27 20:36     ` Richard Yao
@ 2014-05-07 17:18     ` Ingo Molnar
  2014-05-07 17:37       ` Peter Zijlstra
  1 sibling, 1 reply; 15+ messages in thread
From: Ingo Molnar @ 2014-05-07 17:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Richard Yao, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Andrew Morton, Tejun Heo, Vineet Gupta,
	Jesper Nilsson, Jiri Slaby, Linux Kernel Mailing List, kernel,
	Brian Behlendorf, Peter Zijlstra, Frédéric Weisbecker,
	Arnaldo Carvalho de Melo, Jiri Olsa


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> So to take your example, it might be something like this
> 
> arch_trigger_all_cpu_backtrace+0x3c -> do_raw_spin_lock+0xb7
>  -> _raw_spin_lock_irqsave+0x35 -> ? prepare_to_wait+0x18
>  -> prepare_to_wait+0x18 -> ? generic_make_request+0x80
>  -> ? unmap_underlying_metadata+0x2e -> __wait_on_bit+0x20
>  -> ? submit_bio+0xd2 -> out_of_line_wait_on_bit+0x54
>  -> ? unmap_underlying_metadata+0x2e -> ? autoremove_wake_function+0x31
>  -> __wait_on_buffer+0x1b -> __ext3_get_inode_loc+0x1ef -> ext3_iget+0x45
>  -> ext3_lookup+0x97 -> lookup_real+0x20 -> __lookup_hash+0x2a
>  -> lookup_slow+0x36 -> path_lookupat+0xf9 -> filename_lookup+0x1f
>  -> user_path_at_empty+0x3f -> user_path_at+0xd -> vfs_fstatat+0x40
>  -> ? lg_local_unlock+0x31 -> vfs_stat+0x13 -> sys_stat64+0x11
>  -> ? __fput+0x187 -> ? restore_all+0xf -> ? trace_hardirqs_on_thunk+0xc
>  -> syscall_call+0x7
> 
> which is admittedly complete line noise, but is just 13 lines rather 
> than 31. That can sometimes be a really big deal.

Let me try to offer a few more mockups with different typographical 
variants:

1) 'intelligently skip offsets':

 arch_trigger_all_cpu_backtrace -> do_raw_spin_lock
  -> _raw_spin_lock_irqsave -> ? prepare_to_wait
  -> prepare_to_wait -> ? generic_make_request
  -> ? unmap_underlying_metadata -> __wait_on_bit
  -> ? submit_bio+0xd2 -> out_of_line_wait_on_bit
  -> ? unmap_underlying_metadata -> ? autoremove_wake_function
  -> __wait_on_buffer -> __ext3_get_inode_loc -> ext3_iget
  -> ext3_lookup -> lookup_real -> __lookup_hash
  -> lookup_slow -> path_lookupat -> filename_lookup
  -> user_path_at_empty -> user_path_at -> vfs_fstatat
  -> ? lg_local_unlock -> vfs_stat -> sys_stat64
  -> ? __fput -> ? restore_all -> ? trace_hardirqs_on_thunk
  -> syscall_call+0x7

Note how the offset is skipped intelligently, by adding an attribute 
to kallsyms entries: the number of callouts in that function. For 
functions that only have a single call, no offset information is 
needed.

For functions that have multiple call instructions, the offset is 
printed - such as the 'submit_bio+0xd2' case shows it in the mockup 
above.

2) +'vertically aligned, screen split in two'

       arch_trigger_all_cpu_backtrace    -> do_raw_spin_lock
    -> _raw_spin_lock_irqsave            -> ? prepare_to_wait
    -> prepare_to_wait                   -> ? generic_make_request
    -> ? unmap_underlying_metadata       -> __wait_on_bit
    -> ? submit_bio+0xd2                 -> out_of_line_wait_on_bit
    -> ? unmap_underlying_metadata       -> ? autoremove_wake_function
    -> __wait_on_buffer                  -> __ext3_get_inode_loc
    -> ext3_iget                         -> ext3_lookup
    -> lookup_real                       -> __lookup_hash
    -> lookup_slow                       -> path_lookupat
    -> filename_lookup                   -> user_path_at_empty
    -> user_path_at                      -> vfs_fstatat
    -> ? lg_local_unlock                 -> vfs_stat
    -> sys_stat64                        -> ? __fput
    -> ? restore_all                     -> ? trace_hardirqs_on_thunk
    -> syscall_call

This is pretty readable, and if you only look at the left side column, 
it still gives a shortened 'pattern' that gives an impression as to 
what rough call context this is.

The first and last entry would always be printed on the left hand 
side, to make the 'left column' unconditionally readable.

This mockup fits on 80col screens.

3) +'function parentheses'

       arch_trigger_all_cpu_backtrace()    -> do_raw_spin_lock()
    -> _raw_spin_lock_irqsave()            -> ? prepare_to_wait
    -> prepare_to_wait()                   -> ? generic_make_request
    -> ? unmap_underlying_metadata         -> __wait_on_bit()
    -> ? submit_bio+0xd2                   -> out_of_line_wait_on_bit()
    -> ? unmap_underlying_metadata         -> ? autoremove_wake_function
    -> __wait_on_buffer()                  -> __ext3_get_inode_loc()
    -> ext3_iget()                         -> ext3_lookup()
    -> lookup_real()                       -> __lookup_hash()
    -> lookup_slow()                       -> path_lookupat()
    -> filename_lookup()                   -> user_path_at_empty()
    -> user_path_at()                      -> vfs_fstatat()
    -> ? lg_local_unlock                   -> vfs_stat()
    -> sys_stat64()                        -> ? __fput
    -> ? restore_all                       -> ? trace_hardirqs_on_thunk
    -> syscall_call()+0x7

This is more readable to me personally, as it's more C-alike. Note how 
only 'real' FP entries without question marks get the parentheses. 
This gives more visual separation between 'real' and 'noise' entries.

4) +'square brackets for unknown entries'

       arch_trigger_all_cpu_backtrace()    -> do_raw_spin_lock()
    -> _raw_spin_lock_irqsave()            -> ? [prepare_to_wait]
    -> prepare_to_wait()                   -> ? [generic_make_request]
    -> ? [unmap_underlying_metadata]       -> __wait_on_bit()
    -> ? [submit_bio+0xd2]                 -> out_of_line_wait_on_bit()
    -> ? [unmap_underlying_metadata]       -> ? [autoremove_wake_function]
    -> __wait_on_buffer()                  -> __ext3_get_inode_loc()
    -> ext3_iget()                         -> ext3_lookup()
    -> lookup_real()                       -> __lookup_hash()
    -> lookup_slow()                       -> path_lookupat()
    -> filename_lookup()                   -> user_path_at_empty()
    -> user_path_at()                      -> vfs_fstatat()
    -> ? [lg_local_unlock]                 -> vfs_stat()
    -> sys_stat64()                        -> ? [__fput]
    -> ? [restore_all]                     -> ? [trace_hardirqs_on_thunk]
    -> syscall_call()+0x7

To my eyes this further de-emphasises 'unknown' entries that are 
uninteresting noise in most cases. Makes it a bit more easy for me to 
ignore them.

YMMV: the brackets might increase emphasis for you ...

5) #3+'vertically aligned, screen split in three'

       arch_trigger_all_cpu_backtrace()    -> do_raw_spin_lock()                  -> _raw_spin_lock_irqsave()
    -> ? prepare_to_wait()                 -> prepare_to_wait()                   -> ? generic_make_request()
    -> ? unmap_underlying_metadata()       -> __wait_on_bit()                     -> ? submit_bio()+0xd2
    -> out_of_line_wait_on_bit()           -> ? unmap_underlying_metadata()       -> ? autoremove_wake_function()
    -> __wait_on_buffer()                  -> __ext3_get_inode_loc()              -> ext3_iget()
    -> ext3_lookup()                       -> lookup_real()                       -> __lookup_hash()
    -> lookup_slow()                       -> path_lookupat()                     -> filename_lookup()
    -> user_path_at_empty()                -> user_path_at()                      -> vfs_fstatat()
    -> ? lg_local_unlock()                 -> vfs_stat()                          -> sys_stat64()
    -> ? __fput()                          -> ? restore_all()                     -> ? trace_hardirqs_on_thunk()
    -> syscall_call()+0x7

Still mostly readable, but more compressed, 11 lines only. Width is 
110+ chars though, so does not fit on col80 screens easily. This might 
be overdoing the horizontal compression, for little gain.

6) color tricks to de-emphasise unknown entries.

On color consoles I'd also print unknown entries in grey, to make it 
easier to interpret screenshots and 'dmesg' output. (The color codes 
might not make it into the syslog, but that's OK.)

I'd also print the symbolic resolution of the crash RIP in red.

Here's a mockup, I'm using ASCII color codes, so this might not render 
in color on all mail readers:

RIP: 0010:[<ffffffff81127cc2>]  [<ffffffff81127cc2>]^[[31mgeneric_exec_single^[[m+0x52/0x70

=====================

Conclusion:

So to me #4 looks best, and it's 16 lines instead of 31. Not as 
compact as your mockup that has 13 lines, but pretty close.

(I'd also do the color tricks on #6, but that's more technically 
challenging and I also don't look forward fighting the syslog guys 
over it...)

I'd guess that some people would prefer #2 or #3, depending on how 
ergonomic the parentheses and brackets are for them.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
  2014-05-07 17:18     ` Ingo Molnar
@ 2014-05-07 17:37       ` Peter Zijlstra
  0 siblings, 0 replies; 15+ messages in thread
From: Peter Zijlstra @ 2014-05-07 17:37 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Richard Yao, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, Andrew Morton,
	Tejun Heo, Vineet Gupta, Jesper Nilsson, Jiri Slaby,
	Linux Kernel Mailing List, kernel, Brian Behlendorf,
	Frédéric Weisbecker, Arnaldo Carvalho de Melo,
	Jiri Olsa

[-- Attachment #1: Type: text/plain, Size: 687 bytes --]

On Wed, May 07, 2014 at 07:18:22PM +0200, Ingo Molnar wrote:
> 
> Conclusion:
> 
> So to me #4 looks best, and it's 16 lines instead of 31. Not as 
> compact as your mockup that has 13 lines, but pretty close.
> 
> (I'd also do the color tricks on #6, but that's more technically 
> challenging and I also don't look forward fighting the syslog guys 
> over it...)
> 
> I'd guess that some people would prefer #2 or #3, depending on how 
> ergonomic the parentheses and brackets are for them.
> 

Can we get a CONFIG option to retain the old (current) format? My brain
is all used to those and since I use a logging serial console there's no
space issues what so ever.

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
  2014-05-07 16:40     ` Ingo Molnar
@ 2014-06-06  8:17       ` Peter Zijlstra
  2014-06-06  8:24         ` Borislav Petkov
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2014-06-06  8:17 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Frederic Weisbecker, Richard Yao, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Andrew Morton, Tejun Heo, Vineet Gupta,
	Jesper Nilsson, Jiri Slaby, linux-kernel, kernel,
	Brian Behlendorf, Linus Torvalds, Arnaldo Carvalho de Melo,
	Jiri Olsa

[-- Attachment #1: Type: text/plain, Size: 1008 bytes --]

On Wed, May 07, 2014 at 06:40:14PM +0200, Ingo Molnar wrote:
> Agreed, that's a very good point.
> 
> Also, consider the following holistic argument, what is easier to 
> achieve, when looking at an oops and not seeing the bug:
> 
>   - if only I had more information
>   - if only I had less information
> 
> we cannot put in information that we cut out, but it's not 
> particularly hard to skip overly verbose information in most cases.
>  
> Yes, there's a line to be drawn with verbosity: scroll-off is a 
> concern when the oops does not make it to a log file.
> 
> So I don't really know.

So it really stinks its so hard to debug laptops, and it doesn't help
the ehci debug dongle costs an arm and a leg.

Hopefully the EFI pstore thing will work out, but I've already heard
that can wear out very quickly and render your machine a brick :-(

Me, I simply refuse to work on machines if they don't have a serial
line, but in general that's not a tenable position I suppose :/

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
  2014-06-06  8:17       ` Peter Zijlstra
@ 2014-06-06  8:24         ` Borislav Petkov
  2014-06-06  9:35           ` Peter Zijlstra
  2014-06-07  3:09           ` H. Peter Anvin
  0 siblings, 2 replies; 15+ messages in thread
From: Borislav Petkov @ 2014-06-06  8:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Frederic Weisbecker, Richard Yao, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, Andrew Morton, Tejun Heo,
	Vineet Gupta, Jesper Nilsson, Jiri Slaby, linux-kernel, kernel,
	Brian Behlendorf, Linus Torvalds, Arnaldo Carvalho de Melo,
	Jiri Olsa

On Fri, Jun 06, 2014 at 10:17:26AM +0200, Peter Zijlstra wrote:
> Hopefully the EFI pstore thing will work out, but I've already heard
> that can wear out very quickly and render your machine a brick :-(

And not only that - write speed is awfully low. I hear you might be ok
to record an oops or so. Continuous debug output, not so much.

> Me, I simply refuse to work on machines if they don't have a serial
> line, but in general that's not a tenable position I suppose :/

Well, with all the other crap they put in modern machines, how hard it
is to put a goddam stupid serial line out everywhere...? WTF do I need a
goddam adaptive keyboard and other idiotic bling-bling if I can't have a
simple serial line for debugging... Blergh.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
  2014-06-06  8:24         ` Borislav Petkov
@ 2014-06-06  9:35           ` Peter Zijlstra
  2014-06-07  3:08             ` H. Peter Anvin
  2014-06-07  3:09           ` H. Peter Anvin
  1 sibling, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2014-06-06  9:35 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Frederic Weisbecker, Richard Yao, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, Andrew Morton, Tejun Heo,
	Vineet Gupta, Jesper Nilsson, Jiri Slaby, linux-kernel, kernel,
	Brian Behlendorf, Linus Torvalds, Arnaldo Carvalho de Melo,
	Jiri Olsa

[-- Attachment #1: Type: text/plain, Size: 457 bytes --]

On Fri, Jun 06, 2014 at 10:24:20AM +0200, Borislav Petkov wrote:
> Well, with all the other crap they put in modern machines, how hard it
> is to put a goddam stupid serial line out everywhere...? WTF do I need a
> goddam adaptive keyboard and other idiotic bling-bling if I can't have a
> simple serial line for debugging... Blergh.

This DE-9 is not esthetically pleasing :-) They could of course use a
different connector, its only 5 wires or so iirc.



[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
  2014-06-06  9:35           ` Peter Zijlstra
@ 2014-06-07  3:08             ` H. Peter Anvin
  0 siblings, 0 replies; 15+ messages in thread
From: H. Peter Anvin @ 2014-06-07  3:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Borislav Petkov, Ingo Molnar, Frederic Weisbecker, Richard Yao,
	Thomas Gleixner, Ingo Molnar, x86, Andrew Morton, Tejun Heo,
	Vineet Gupta, Jesper Nilsson, Jiri Slaby, linux-kernel, kernel,
	Brian Behlendorf, Linus Torvalds, Arnaldo Carvalho de Melo,
	Jiri Olsa

Yes, but now you need a custom adapter.  Some have used a phono plug for the absolute minimum 3 wires.  However, no standards...

Sent from my tablet, pardon any formatting problems.

> On Jun 6, 2014, at 2:35, Peter Zijlstra <peterz@infradead.org> wrote:
> 
>> On Fri, Jun 06, 2014 at 10:24:20AM +0200, Borislav Petkov wrote:
>> Well, with all the other crap they put in modern machines, how hard it
>> is to put a goddam stupid serial line out everywhere...? WTF do I need a
>> goddam adaptive keyboard and other idiotic bling-bling if I can't have a
>> simple serial line for debugging... Blergh.
> 
> This DE-9 is not esthetically pleasing :-) They could of course use a
> different connector, its only 5 wires or so iirc.
> 
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] x86/dumpstack: Walk frames when built with frame pointers
  2014-06-06  8:24         ` Borislav Petkov
  2014-06-06  9:35           ` Peter Zijlstra
@ 2014-06-07  3:09           ` H. Peter Anvin
  1 sibling, 0 replies; 15+ messages in thread
From: H. Peter Anvin @ 2014-06-07  3:09 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Peter Zijlstra, Ingo Molnar, Frederic Weisbecker, Richard Yao,
	Thomas Gleixner, Ingo Molnar, x86, Andrew Morton, Tejun Heo,
	Vineet Gupta, Jesper Nilsson, Jiri Slaby, linux-kernel, kernel,
	Brian Behlendorf, Linus Torvalds, Arnaldo Carvalho de Melo,
	Jiri Olsa

The USB debug port is nice once you get it to work.  Assumes they haven't done anything clever like hide the special port behind a hub...

Sent from my tablet, pardon any formatting problems.

> On Jun 6, 2014, at 1:24, Borislav Petkov <bp@alien8.de> wrote:
> 
>> On Fri, Jun 06, 2014 at 10:17:26AM +0200, Peter Zijlstra wrote:
>> Hopefully the EFI pstore thing will work out, but I've already heard
>> that can wear out very quickly and render your machine a brick :-(
> 
> And not only that - write speed is awfully low. I hear you might be ok
> to record an oops or so. Continuous debug output, not so much.
> 
>> Me, I simply refuse to work on machines if they don't have a serial
>> line, but in general that's not a tenable position I suppose :/
> 
> Well, with all the other crap they put in modern machines, how hard it
> is to put a goddam stupid serial line out everywhere...? WTF do I need a
> goddam adaptive keyboard and other idiotic bling-bling if I can't have a
> simple serial line for debugging... Blergh.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2014-06-07  3:10 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-26 18:10 [PATCH] x86/dumpstack: Walk frames when built with frame pointers Richard Yao
2014-04-27 12:08 ` Ingo Molnar
2014-04-27 19:42   ` Peter Zijlstra
2014-04-27 19:51   ` Richard Yao
2014-04-27 20:08   ` Linus Torvalds
2014-04-27 20:36     ` Richard Yao
2014-05-07 17:18     ` Ingo Molnar
2014-05-07 17:37       ` Peter Zijlstra
2014-04-30 21:56   ` Frederic Weisbecker
2014-05-07 16:40     ` Ingo Molnar
2014-06-06  8:17       ` Peter Zijlstra
2014-06-06  8:24         ` Borislav Petkov
2014-06-06  9:35           ` Peter Zijlstra
2014-06-07  3:08             ` H. Peter Anvin
2014-06-07  3:09           ` H. Peter Anvin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.