linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Revert "x86/unwind/orc: Don't skip the first frame for inactive tasks"
@ 2022-07-27  3:15 Chen Zhongjin
  2022-08-04  4:55 ` Josh Poimboeuf
  0 siblings, 1 reply; 6+ messages in thread
From: Chen Zhongjin @ 2022-07-27  3:15 UTC (permalink / raw)
  To: linux-kernel, x86, linux-arch
  Cc: jpoimboe, peterz, tglx, mingo, bp, dave.hansen, hpa, mbenes,
	chenzhongjin

This reverts commit f1d9a2abff66aa8156fbc1493abed468db63ea48.

When CONFIG_GCOV_PROFILE_ALL is enabled, show_stack() and related
functions (e.g. dump_stack) will break for x86 ORC unwinder.

Call Trace:
 <TASK>
 ? dump_stack_lvl+0x83/0xb7
 ? schedule+0x1/0x190
 ? dump_stack+0x13/0x1f
 ? handler_pre0+0x3f/0x53 [kp_unwind]
 ...

show_trace_log_lvl() searches text address on stack to validate
whether unwind results are reliable. The code:

    for (; stack < stack_info.end; stack++) {
        ...
        if (stack == ret_addr_p)
            reliable = 1;
        ...
        if (!reliable)
            continue;
        ...
    }

This requires:

    *stack* <= ret_addr_p

So that the first ret_addr_p can be found when stack++.

In normal cases the frame of show_stack() should be optimized out.
However if it is not optimized such as CONFIG_GCOV_PROFILE_ALL=y,
unwind_start() will stop at show_stack(), where:

    state->sp == first_frame == *stack*

And this will causes:

    ret_addr_p = unwind_get_return_address_ptr = state->sp - 1
    => *stack* > ret_addr_p

Then reliable check will ignore all unwind because first ret_addr_p
can't be found.

'f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks")'

This patch removed the equal condition when state->sp == first_frame
which makes frame of show_stack() not be skipped. But the reason to
do that is not established now:

'f2ac57a4c49d ("x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels")'

    state->sp = first_frame + sizeof(*frame),

state->sp and first_frame can't be equal for inactive stack any more.

Regard this equal condition doesn't involve other cases now,
revert it to fix above problem.

After revert, stack can be printed right:

Call Trace:
 <TASK>
 dump_stack_lvl+0x83/0xb7
 ? schedule+0x1/0x190
 dump_stack+0x13/0x1f
 handler_pre0+0x3f/0x53 [kp_unwind]
 ...

Fixes: f1d9a2abff66 ("x86/unwind/orc: Don't skip the first frame for inactive tasks")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
---
 arch/x86/kernel/unwind_orc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 38185aedf7d1..514dc9ef99fe 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -708,7 +708,7 @@ void __unwind_start(struct unwind_state *state, struct task_struct *task,
 	/* Otherwise, skip ahead to the user-specified starting frame: */
 	while (!unwind_done(state) &&
 	       (!on_stack(&state->stack_info, first_frame, sizeof(long)) ||
-			state->sp < (unsigned long)first_frame))
+			state->sp <= (unsigned long)first_frame))
 		unwind_next_frame(state);
 
 	return;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "x86/unwind/orc: Don't skip the first frame for inactive tasks"
  2022-07-27  3:15 [PATCH] Revert "x86/unwind/orc: Don't skip the first frame for inactive tasks" Chen Zhongjin
@ 2022-08-04  4:55 ` Josh Poimboeuf
  2022-08-04  7:27   ` Chen Zhongjin
  0 siblings, 1 reply; 6+ messages in thread
From: Josh Poimboeuf @ 2022-08-04  4:55 UTC (permalink / raw)
  To: Chen Zhongjin
  Cc: linux-kernel, x86, linux-arch, peterz, tglx, mingo, bp,
	dave.hansen, hpa, mbenes

On Wed, Jul 27, 2022 at 11:15:06AM +0800, Chen Zhongjin wrote:
> This reverts commit f1d9a2abff66aa8156fbc1493abed468db63ea48.
> 
> When CONFIG_GCOV_PROFILE_ALL is enabled, show_stack() and related
> functions (e.g. dump_stack) will break for x86 ORC unwinder.
> 
> Call Trace:
>  <TASK>
>  ? dump_stack_lvl+0x83/0xb7
>  ? schedule+0x1/0x190
>  ? dump_stack+0x13/0x1f
>  ? handler_pre0+0x3f/0x53 [kp_unwind]
>  ...
> 
> show_trace_log_lvl() searches text address on stack to validate
> whether unwind results are reliable. The code:

Hi,

Thanks for the patch.  The change itself makes sense, though I'm having
trouble recreating the bug described in the patch description.

I enabled CONFIG_GCOV_PROFILE_ALL and did 'echo l >
/proc/sysrq-trigger', but I got a valid stack trace:

  # echo l > /proc/sysrq-trigger
  [  343.916728] sysrq: Show backtrace of all active CPUs
  [  343.917459] NMI backtrace for cpu 2
  [  343.917884] CPU: 2 PID: 1007 Comm: bash Not tainted 5.19.0-rc8+ #68
  [  343.918562] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014
  [  343.919534] Call Trace:
  [  343.919904]  <TASK>
  [  343.920226]  dump_stack_lvl+0xcc/0x11b
  [  343.920742]  dump_stack+0x17/0x24
  [  343.921199]  nmi_cpu_backtrace.cold+0xb5/0x10d
  [  343.921795]  ? lapic_can_unplug_cpu+0xa0/0xa0
  [  343.922375]  nmi_trigger_cpumask_backtrace+0x171/0x200
  [  343.923053]  arch_trigger_cpumask_backtrace+0x21/0x30
  [  343.923599]  sysrq_handle_showallcpus+0x1f/0x30
  [  343.924104]  __handle_sysrq.cold+0x81/0x234
  [  343.924576]  write_sysrq_trigger+0x6a/0x90
  [  343.925098]  proc_reg_write+0x99/0x120
  [  343.925587]  vfs_write+0x16c/0x350
  [  343.926093]  ksys_write+0x8c/0x180
  [  343.926597]  __x64_sys_write+0x21/0x30
  [  343.927144]  do_syscall_64+0x76/0x100
  [  343.927676]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
  [  343.928343] RIP: 0033:0x7f555e8f7b50
  [  343.928822] Code: 73 01 c3 48 8b 0d 38 83 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 79 db 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 1e e3 01 00 48 89 04 24
  [  343.930801] RSP: 002b:00007ffdd20d0978 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
  [  343.931761] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f555e8f7b50
  [  343.932551] RDX: 0000000000000002 RSI: 000055ff471bfe20 RDI: 0000000000000001
  [  343.933347] RBP: 000055ff471bfe20 R08: 00007f555ebc2740 R09: 00007f555f80f700
  [  343.934155] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000002
  [  343.934962] R13: 0000000000000001 R14: 00007f555ebc15e0 R15: 00007f555ebbd3c0
  [  343.935798]  </TASK>
  [  343.936341] Sending NMI from CPU 2 to CPUs 0-1,3:
  [  343.937163] NMI backtrace for cpu 3
  [  343.937167] CPU: 3 PID: 600 Comm: systemd-journal Not tainted 5.19.0-rc8+ #68

Was this with an upstream kernel?  Can you share the config and
toolchain versions?

-- 
Josh

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "x86/unwind/orc: Don't skip the first frame for inactive tasks"
  2022-08-04  4:55 ` Josh Poimboeuf
@ 2022-08-04  7:27   ` Chen Zhongjin
  2022-08-04 22:06     ` Josh Poimboeuf
  0 siblings, 1 reply; 6+ messages in thread
From: Chen Zhongjin @ 2022-08-04  7:27 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: linux-kernel, x86, linux-arch, peterz, tglx, mingo, bp,
	dave.hansen, hpa, mbenes


On 2022/8/4 12:55, Josh Poimboeuf wrote:
> On Wed, Jul 27, 2022 at 11:15:06AM +0800, Chen Zhongjin wrote:
>> This reverts commit f1d9a2abff66aa8156fbc1493abed468db63ea48.
>>
>> When CONFIG_GCOV_PROFILE_ALL is enabled, show_stack() and related
>> functions (e.g. dump_stack) will break for x86 ORC unwinder.
>>
>> Call Trace:
>>   <TASK>
>>   ? dump_stack_lvl+0x83/0xb7
>>   ? schedule+0x1/0x190
>>   ? dump_stack+0x13/0x1f
>>   ? handler_pre0+0x3f/0x53 [kp_unwind]
>>   ...
>>
>> show_trace_log_lvl() searches text address on stack to validate
>> whether unwind results are reliable. The code:
> Hi,
>
> Thanks for the patch.  The change itself makes sense, though I'm having
> trouble recreating the bug described in the patch description.
>
> I enabled CONFIG_GCOV_PROFILE_ALL and did 'echo l >
> /proc/sysrq-trigger', but I got a valid stack trace:
>
>    # echo l > /proc/sysrq-trigger
>    [  343.916728] sysrq: Show backtrace of all active CPUs
>    [  343.917459] NMI backtrace for cpu 2
>    [  343.917884] CPU: 2 PID: 1007 Comm: bash Not tainted 5.19.0-rc8+ #68
>    [  343.918562] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-1.fc36 04/01/2014
>    [  343.919534] Call Trace:
>    [  343.919904]  <TASK>
>    [  343.920226]  dump_stack_lvl+0xcc/0x11b
>    [  343.920742]  dump_stack+0x17/0x24
>    [  343.921199]  nmi_cpu_backtrace.cold+0xb5/0x10d
>    [  343.921795]  ? lapic_can_unplug_cpu+0xa0/0xa0
>    [  343.922375]  nmi_trigger_cpumask_backtrace+0x171/0x200
>    [  343.923053]  arch_trigger_cpumask_backtrace+0x21/0x30
>    [  343.923599]  sysrq_handle_showallcpus+0x1f/0x30
>    [  343.924104]  __handle_sysrq.cold+0x81/0x234
>    [  343.924576]  write_sysrq_trigger+0x6a/0x90
>    [  343.925098]  proc_reg_write+0x99/0x120
>    [  343.925587]  vfs_write+0x16c/0x350
>    [  343.926093]  ksys_write+0x8c/0x180
>    [  343.926597]  __x64_sys_write+0x21/0x30
>    [  343.927144]  do_syscall_64+0x76/0x100
>    [  343.927676]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>    [  343.928343] RIP: 0033:0x7f555e8f7b50
>    [  343.928822] Code: 73 01 c3 48 8b 0d 38 83 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 79 db 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 1e e3 01 00 48 89 04 24
>    [  343.930801] RSP: 002b:00007ffdd20d0978 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
>    [  343.931761] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f555e8f7b50
>    [  343.932551] RDX: 0000000000000002 RSI: 000055ff471bfe20 RDI: 0000000000000001
>    [  343.933347] RBP: 000055ff471bfe20 R08: 00007f555ebc2740 R09: 00007f555f80f700
>    [  343.934155] R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000002
>    [  343.934962] R13: 0000000000000001 R14: 00007f555ebc15e0 R15: 00007f555ebbd3c0
>    [  343.935798]  </TASK>
>    [  343.936341] Sending NMI from CPU 2 to CPUs 0-1,3:
>    [  343.937163] NMI backtrace for cpu 3
>    [  343.937167] CPU: 3 PID: 600 Comm: systemd-journal Not tainted 5.19.0-rc8+ #68
>
> Was this with an upstream kernel?  Can you share the config and
> toolchain versions?
Hi,

My kernel is upstream and config is defconfig with only 
CONFIG_GCOV_PROFILE_ALL=y then make menuconfig.

CONFIG_CC_VERSION_TEXT="gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"

And I got:

[   29.745728] NMI backtrace for cpu 1
[   29.745728] CPU: 1 PID: 135 Comm: sh Not tainted 
5.19.0-02858-ge2b542100719 #58
[   29.745728] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
BIOS rel-1.15.0-0-g2dd4b9b3f8404
[   29.745728] Call Trace:
[   29.745728]  <TASK>
[   29.745728]  ? dump_stack_lvl+0x72/0xa1
[   29.745728]  ? dump_stack+0x13/0x1f
[   29.745728]  ? nmi_cpu_backtrace+0x19f/0x1d0
[   29.745728]  ? lapic_can_unplug_cpu+0xf0/0xf0
[   29.745728]  ? nmi_trigger_cpumask_backtrace+0x15e/0x210
[   29.745728]  ? arch_trigger_cpumask_backtrace+0x18/0x30
[   29.745728]  ? sysrq_handle_showallcpus+0x16/0x30
[   29.745728]  ? __handle_sysrq+0xc4/0x250
[   29.745728]  ? write_sysrq_trigger+0x46/0x60
[   29.745728]  ? proc_reg_write+0xbc/0x120
[   29.745728]  ? preempt_count_add+0x8e/0x130
[   29.745728]  ? vfs_write+0xfe/0x460
[   29.745728]  ? ksys_write+0xdc/0x170
[   29.745728]  ? fpregs_assert_state_consistent+0x2d/0x90
[   29.745728]  ? __x64_sys_write+0x19/0x30
[   29.745728]  ? do_syscall_64+0x3a/0x90
[   29.745728]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   29.745728]  </TASK>


It also happened with CONFIG_CC_VERSION_TEXT="gcc-8 (Ubuntu 
8.4.0-1ubuntu1~18.04) 8.4.0".

I believe disassemble show_stack in vmlinux and if we have:

     push   %rbp

     mov    %rsp,%rbp

     ... (no regs pushed to stack)

     callq  <show_trace_log_lvl>

This can be reproduced.

Your stacktrace looks just as normal as the one in my commit message 
(show_stack is optimized out).


Best,

Chen




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "x86/unwind/orc: Don't skip the first frame for inactive tasks"
  2022-08-04  7:27   ` Chen Zhongjin
@ 2022-08-04 22:06     ` Josh Poimboeuf
  2022-10-09  3:14       ` Chen Zhongjin
  0 siblings, 1 reply; 6+ messages in thread
From: Josh Poimboeuf @ 2022-08-04 22:06 UTC (permalink / raw)
  To: Chen Zhongjin
  Cc: linux-kernel, x86, linux-arch, peterz, tglx, mingo, bp,
	dave.hansen, hpa, mbenes

On Thu, Aug 04, 2022 at 03:27:39PM +0800, Chen Zhongjin wrote:
> I believe disassemble show_stack in vmlinux and if we have:
> 
>     push   %rbp
> 
>     mov    %rsp,%rbp
> 
>     ... (no regs pushed to stack)
> 
>     callq  <show_trace_log_lvl>
> 
> This can be reproduced.

Weird, that's what I have.  This is GCC 12.1.

ffffffff81056de0 <show_stack>:
ffffffff81056de0:	e8 0b 43 05 00       	call   ffffffff810ab0f0 <__fentry__>
ffffffff81056de5:	55                   	push   %rbp
ffffffff81056de6:	48 83 05 b2 0f c9 02 01 	addq   $0x1,0x2c90fb2(%rip)        # ffffffff83ce7da0 <__gcov0.show_stack>
ffffffff81056dee:	48 89 e5             	mov    %rsp,%rbp
ffffffff81056df1:	48 85 ff             	test   %rdi,%rdi
ffffffff81056df4:	74 41                	je     ffffffff81056e37 <show_stack+0x57>
ffffffff81056df6:	48 85 f6             	test   %rsi,%rsi
ffffffff81056df9:	0f 85 c2 24 15 01    	jne    ffffffff821a92c1 <show_stack.cold+0xd>
ffffffff81056dff:	65 48 8b 04 25 c0 bd 01 00 	mov    %gs:0x1bdc0,%rax
ffffffff81056e08:	48 39 c7             	cmp    %rax,%rdi
ffffffff81056e0b:	0f 85 a3 24 15 01    	jne    ffffffff821a92b4 <show_stack.cold>
ffffffff81056e11:	48 83 05 af 0f c9 02 01 	addq   $0x1,0x2c90faf(%rip)        # ffffffff83ce7dc8 <__gcov0.show_stack+0x28>
ffffffff81056e19:	48 89 ee             	mov    %rbp,%rsi
ffffffff81056e1c:	48 89 d1             	mov    %rdx,%rcx
ffffffff81056e1f:	48 89 f2             	mov    %rsi,%rdx
ffffffff81056e22:	31 f6                	xor    %esi,%esi
ffffffff81056e24:	e8 8e 20 15 01       	call   ffffffff821a8eb7 <show_trace_log_lvl>
ffffffff81056e29:	48 83 05 9f 0f c9 02 01 	addq   $0x1,0x2c90f9f(%rip)        # ffffffff83ce7dd0 <__gcov0.show_stack+0x30>
ffffffff81056e31:	5d                   	pop    %rbp
ffffffff81056e32:	e9 49 b2 5a 01       	jmp    ffffffff82602080 <__x86_return_thunk>
ffffffff81056e37:	48 83 05 69 0f c9 02 01 	addq   $0x1,0x2c90f69(%rip)        # ffffffff83ce7da8 <__gcov0.show_stack+0x8>
ffffffff81056e3f:	65 48 8b 3c 25 c0 bd 01 00 	mov    %gs:0x1bdc0,%rdi
ffffffff81056e48:	eb ac                	jmp    ffffffff81056df6 <show_stack+0x16>
ffffffff81056e4a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)

-- 
Josh

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "x86/unwind/orc: Don't skip the first frame for inactive tasks"
  2022-08-04 22:06     ` Josh Poimboeuf
@ 2022-10-09  3:14       ` Chen Zhongjin
  2022-10-14 22:44         ` Josh Poimboeuf
  0 siblings, 1 reply; 6+ messages in thread
From: Chen Zhongjin @ 2022-10-09  3:14 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: linux-kernel, x86, linux-arch, peterz, tglx, mingo, bp,
	dave.hansen, hpa, mbenes

Hi Josh,

On 2022/8/5 6:06, Josh Poimboeuf wrote:
> On Thu, Aug 04, 2022 at 03:27:39PM +0800, Chen Zhongjin wrote:
>> I believe disassemble show_stack in vmlinux and if we have:
>>
>>      push   %rbp
>>
>>      mov    %rsp,%rbp
>>
>>      ... (no regs pushed to stack)
>>
>>      callq  <show_trace_log_lvl>
>>
>> This can be reproduced.
> Weird, that's what I have.  This is GCC 12.1.
>
> ffffffff81056de0 <show_stack>:
> ffffffff81056de0:	e8 0b 43 05 00       	call   ffffffff810ab0f0 <__fentry__>
> ffffffff81056de5:	55                   	push   %rbp
> ffffffff81056de6:	48 83 05 b2 0f c9 02 01 	addq   $0x1,0x2c90fb2(%rip)        # ffffffff83ce7da0 <__gcov0.show_stack>
> ffffffff81056dee:	48 89 e5             	mov    %rsp,%rbp
> ffffffff81056df1:	48 85 ff             	test   %rdi,%rdi
> ffffffff81056df4:	74 41                	je     ffffffff81056e37 <show_stack+0x57>
> ffffffff81056df6:	48 85 f6             	test   %rsi,%rsi
> ffffffff81056df9:	0f 85 c2 24 15 01    	jne    ffffffff821a92c1 <show_stack.cold+0xd>
> ffffffff81056dff:	65 48 8b 04 25 c0 bd 01 00 	mov    %gs:0x1bdc0,%rax
> ffffffff81056e08:	48 39 c7             	cmp    %rax,%rdi
> ffffffff81056e0b:	0f 85 a3 24 15 01    	jne    ffffffff821a92b4 <show_stack.cold>
> ffffffff81056e11:	48 83 05 af 0f c9 02 01 	addq   $0x1,0x2c90faf(%rip)        # ffffffff83ce7dc8 <__gcov0.show_stack+0x28>
> ffffffff81056e19:	48 89 ee             	mov    %rbp,%rsi
> ffffffff81056e1c:	48 89 d1             	mov    %rdx,%rcx
> ffffffff81056e1f:	48 89 f2             	mov    %rsi,%rdx
> ffffffff81056e22:	31 f6                	xor    %esi,%esi
> ffffffff81056e24:	e8 8e 20 15 01       	call   ffffffff821a8eb7 <show_trace_log_lvl>
> ffffffff81056e29:	48 83 05 9f 0f c9 02 01 	addq   $0x1,0x2c90f9f(%rip)        # ffffffff83ce7dd0 <__gcov0.show_stack+0x30>
> ffffffff81056e31:	5d                   	pop    %rbp
> ffffffff81056e32:	e9 49 b2 5a 01       	jmp    ffffffff82602080 <__x86_return_thunk>
> ffffffff81056e37:	48 83 05 69 0f c9 02 01 	addq   $0x1,0x2c90f69(%rip)        # ffffffff83ce7da8 <__gcov0.show_stack+0x8>
> ffffffff81056e3f:	65 48 8b 3c 25 c0 bd 01 00 	mov    %gs:0x1bdc0,%rdi
> ffffffff81056e48:	eb ac                	jmp    ffffffff81056df6 <show_stack+0x16>
> ffffffff81056e4a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)

This problem still exist on my machine:

CONFIG: defconfig + CONFIG_GCOV_PROFILE_ALL

GCC: gcc 7.5.0 & 12.1.0

TRIGGER:

# echo l > /proc/sysrq-trigger

[  173.156737]  <TASK>
[  173.156737]  ? dump_stack_lvl+0x79/0xaf
[  173.156737]  ? dump_stack+0x13/0x1f
[  173.156737]  ? nmi_cpu_backtrace.cold+0x97/0xef
[  173.156737]  ? lapic_can_unplug_cpu+0x90/0x90
[  173.156737]  ? nmi_trigger_cpumask_backtrace+0x15d/0x1e0
[  173.156737]  ? arch_trigger_cpumask_backtrace+0x18/0x30
[  173.156737]  ? sysrq_handle_showallcpus+0x16/0x30
[  173.156737]  ? __handle_sysrq.cold+0x51/0x1ae
[  173.156737]  ? write_sysrq_trigger+0x2f/0x60
[  173.156737]  ? proc_reg_write+0x90/0x110
[  173.156737]  ? vfs_write+0x12a/0x5b0
[  173.156737]  ? handle_mm_fault+0x158/0x340
[  173.156737]  ? ksys_write+0x76/0x170
[  173.156737]  ? __x64_sys_write+0x18/0x30
[  173.156737]  ? do_syscall_64+0x6e/0x100
[  173.156737]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  173.156737]  </TASK>

DEBUG CODE:

With your test code and unwind_debug = 1

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index a80ac029bab5..c9d7e6f69e22 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -750,17 +750,24 @@ void __unwind_start(struct unwind_state *state, 
struct task_struct *task,
          * to start unwinding at.  Skip ahead until we reach it.
          */

+orc_warn_current("skipping frames: ");
         /* When starting from regs, skip the regs frame: */
         if (regs) {
                 unwind_next_frame(state);
                 return;
         }
+pr_info("[%pS] %lx\nsp: %lx\nfirst_frame: %lx",
+               (void *)state->ip, state->ip, state->sp,
+               (unsigned long)first_frame);

         /* Otherwise, skip ahead to the user-specified starting frame: */
         while (!unwind_done(state) &&
                (!on_stack(&state->stack_info, first_frame, sizeof(long)) ||
                         state->sp < (unsigned long)first_frame))
                 unwind_next_frame(state);
+pr_info("[%pS] %lx\nsp: %lx\nfirst_frame: %lx",
+               (void *)state->ip, state->ip, state->sp,
+               (unsigned long)first_frame);

         return;

DEBUG LOG:

[  173.156737] unwind stack type:1 next_sp:0000000000000000 mask:0x2 
graph_idx:0
[  173.156737] ffffad5d419b3c50: ffffad5d419b3d78 (0xffffad5d419b3d78)
[  173.156737] ffffad5d419b3c58: ffffffffb9490905 
(__unwind_start+0x175/0x220)
[  173.156737] ffffad5d419b3c60: ffffad5d419b3d78 (0xffffad5d419b3d78)
[  173.156737] ffffad5d419b3c68: ffffad5d419b3d68 (0xffffad5d419b3d68)
[  173.156737] ffffad5d419b3c70: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3c78: ffff9c6a02d16580 (0xffff9c6a02d16580)
[  173.156737] ffffad5d419b3c80: ffffffffba69e5fb 
(show_trace_log_lvl+0xc9/0x3fb)
[  173.156737] ffffad5d419b3c88: ffff9c6a3ec9b520 (0xffff9c6a3ec9b520)
[  173.156737] ffffad5d419b3c90: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3c98: ffffffffb953aa17 
(__wake_up_klogd+0xa7/0x100)
[  173.156737] ffffad5d419b3ca0: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3ca8: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3cb0: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3cb8: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3cc0: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3cc8: 0000000000000001 (0x1)
[  173.156737] ffffad5d419b3cd0: ffffad5d419b0000 (0xffffad5d419b0000)
[  173.156737] ffffad5d419b3cd8: ffffad5d419b4000 (0xffffad5d419b4000)
[  173.156737] ffffad5d419b3ce0: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3ce8: 0000000000000002 (0x2)
[  173.156737] ffffad5d419b3cf0: ffff9c6a02d16580 (0xffff9c6a02d16580)
[  173.156737] ffffad5d419b3cf8: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3d00: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3d08: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3d10: ffffad5d419b3c60 (0xffffad5d419b3c60)
[  173.156737] ffffad5d419b3d18: ffffad5d419b3d78 (0xffffad5d419b3d78)
[  173.156737] ffffad5d419b3d20: ffffffffb949097d 
(__unwind_start+0x1ed/0x220)
[  173.156737] ffffad5d419b3d28: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3d30: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3d38: a4641208047dd200 (0xa4641208047dd200)
[  173.156737] ffffad5d419b3d40: ffffffffbb2829a3 
(msr_save_dmi_table+0x53e9a3/0x544c20)
[  173.156737] ffffad5d419b3d48: 0000000000000000 (0x0)
[  173.156737] ffffad5d419b3d50: 0000000000000001 (0x1)
[  173.156737] ffffad5d419b3d58: 0000000000000001 (0x1)
[  173.156737] ffffad5d419b3d60: ffff9c6a019e0800 (0xffff9c6a019e0800)
[  173.156737] ffffad5d419b3d68: ffffad5d419b3d78 (0xffffad5d419b3d78)
[  173.156737] ffffad5d419b3d70: ffffffffb943d9e4 (show_stack+0x44/0x70)
[  173.156737] ffffad5d419b3d78: 0000000000000002 (0x2)
[  173.156737] ffffad5d419b3d80: ffffffffba6c4628 (dump_stack_lvl+0x79/0xaf)
[  173.156737] ffffad5d419b3d88: 0000000000000202 (0x202)
[  173.156737] ffffad5d419b3d90: 0000000000000001 (0x1)
[  173.156737] ffffad5d419b3d98: ffffffffba6c4671 (dump_stack+0x13/0x1f)
... ...

[  173.156737] [__unwind_start+0x1ed/0x220] ffffffffb949097d
[  173.156737] sp: ffffad5d419b3c60
[  173.156737] first_frame: ffffad5d419b3d78
[  173.156737] [show_stack+0x44/0x70] ffffffffb943d9e4
[  173.156737] sp: ffffad5d419b3d78
[  173.156737] first_frame: ffffad5d419b3d78


As I said in commit log, problem is caused because we got *state->sp == 
first_frame* at the end of __unwind_start().

Could you please try this debug code on your machine and have a look if 
this happened ?

Thanks for your time!


Best,

Chen


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] Revert "x86/unwind/orc: Don't skip the first frame for inactive tasks"
  2022-10-09  3:14       ` Chen Zhongjin
@ 2022-10-14 22:44         ` Josh Poimboeuf
  0 siblings, 0 replies; 6+ messages in thread
From: Josh Poimboeuf @ 2022-10-14 22:44 UTC (permalink / raw)
  To: Chen Zhongjin
  Cc: linux-kernel, x86, linux-arch, peterz, tglx, mingo, bp,
	dave.hansen, hpa, mbenes

On Sun, Oct 09, 2022 at 11:14:05AM +0800, Chen Zhongjin wrote:
> Hi Josh,
> 
> On 2022/8/5 6:06, Josh Poimboeuf wrote:
> > On Thu, Aug 04, 2022 at 03:27:39PM +0800, Chen Zhongjin wrote:
> > > I believe disassemble show_stack in vmlinux and if we have:
> > > 
> > >      push   %rbp
> > > 
> > >      mov    %rsp,%rbp
> > > 
> > >      ... (no regs pushed to stack)
> > > 
> > >      callq  <show_trace_log_lvl>
> > > 
> > > This can be reproduced.
> > Weird, that's what I have.  This is GCC 12.1.
> > 
> > ffffffff81056de0 <show_stack>:
> > ffffffff81056de0:	e8 0b 43 05 00       	call   ffffffff810ab0f0 <__fentry__>
> > ffffffff81056de5:	55                   	push   %rbp
> > ffffffff81056de6:	48 83 05 b2 0f c9 02 01 	addq   $0x1,0x2c90fb2(%rip)        # ffffffff83ce7da0 <__gcov0.show_stack>
> > ffffffff81056dee:	48 89 e5             	mov    %rsp,%rbp
> > ffffffff81056df1:	48 85 ff             	test   %rdi,%rdi
> > ffffffff81056df4:	74 41                	je     ffffffff81056e37 <show_stack+0x57>
> > ffffffff81056df6:	48 85 f6             	test   %rsi,%rsi
> > ffffffff81056df9:	0f 85 c2 24 15 01    	jne    ffffffff821a92c1 <show_stack.cold+0xd>
> > ffffffff81056dff:	65 48 8b 04 25 c0 bd 01 00 	mov    %gs:0x1bdc0,%rax
> > ffffffff81056e08:	48 39 c7             	cmp    %rax,%rdi
> > ffffffff81056e0b:	0f 85 a3 24 15 01    	jne    ffffffff821a92b4 <show_stack.cold>
> > ffffffff81056e11:	48 83 05 af 0f c9 02 01 	addq   $0x1,0x2c90faf(%rip)        # ffffffff83ce7dc8 <__gcov0.show_stack+0x28>
> > ffffffff81056e19:	48 89 ee             	mov    %rbp,%rsi
> > ffffffff81056e1c:	48 89 d1             	mov    %rdx,%rcx
> > ffffffff81056e1f:	48 89 f2             	mov    %rsi,%rdx
> > ffffffff81056e22:	31 f6                	xor    %esi,%esi
> > ffffffff81056e24:	e8 8e 20 15 01       	call   ffffffff821a8eb7 <show_trace_log_lvl>
> > ffffffff81056e29:	48 83 05 9f 0f c9 02 01 	addq   $0x1,0x2c90f9f(%rip)        # ffffffff83ce7dd0 <__gcov0.show_stack+0x30>
> > ffffffff81056e31:	5d                   	pop    %rbp
> > ffffffff81056e32:	e9 49 b2 5a 01       	jmp    ffffffff82602080 <__x86_return_thunk>
> > ffffffff81056e37:	48 83 05 69 0f c9 02 01 	addq   $0x1,0x2c90f69(%rip)        # ffffffff83ce7da8 <__gcov0.show_stack+0x8>
> > ffffffff81056e3f:	65 48 8b 3c 25 c0 bd 01 00 	mov    %gs:0x1bdc0,%rdi
> > ffffffff81056e48:	eb ac                	jmp    ffffffff81056df6 <show_stack+0x16>
> > ffffffff81056e4a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
> 
> This problem still exist on my machine:
> 
> CONFIG: defconfig + CONFIG_GCOV_PROFILE_ALL
> 
> GCC: gcc 7.5.0 & 12.1.0
> 
> TRIGGER:
> 
> # echo l > /proc/sysrq-trigger

Hi Chen,

Thanks for your patience.  I'm able to recreate now.  I still have no
idea why I couldn't recreate before.

Anyway the patch looks good.  I'll run it through some more testing.

-- 
Josh

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-10-14 22:45 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-27  3:15 [PATCH] Revert "x86/unwind/orc: Don't skip the first frame for inactive tasks" Chen Zhongjin
2022-08-04  4:55 ` Josh Poimboeuf
2022-08-04  7:27   ` Chen Zhongjin
2022-08-04 22:06     ` Josh Poimboeuf
2022-10-09  3:14       ` Chen Zhongjin
2022-10-14 22:44         ` Josh Poimboeuf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).