From: Joe Lawrence <joe.lawrence@redhat.com> To: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, live-patching@vger.kernel.org Cc: Torsten Duwe <duwe@lst.de>, Michael Ellerman <mpe@ellerman.id.au>, Jiri Kosina <jkosina@suse.cz>, Balbir Singh <bsingharora@gmail.com> Subject: ppc64le reliable stack unwinder and scheduled tasks Date: Thu, 10 Jan 2019 16:14:00 -0500 [thread overview] Message-ID: <7f468285-b149-37e2-e782-c9e538b997a9@redhat.com> (raw) Hi all, tl;dr: On ppc64le, what is top-most stack frame for scheduled tasks about? I am looking at a bug in which ~10% of livepatch tests on RHEL-7 and RHEL-8 distro kernels, the ppc64le reliable stack unwinder consistently reports an unreliable stack for a given task. This condition can eventually resolve itself, but sometimes this state remains wedged for hours and I can live-debug the system with crash-utility. I have tried reproducing with upstream 4.20 kernel, but haven't seen this exact condition yet. More on upstream in a bit. That said, I have a question about the ppc64 stack frame layout with regard to scheduled tasks. In each sticky "unreliable" stack trace instance that I've looked at, the task's stack has an interesting frame at the top: Example 1 (RHEL-7) ================== crash> struct task_struct.thread c00000022fd015c0 | grep ksp ksp = 0xc0000000288af9c0 crash> rd 0xc0000000288af9c0 -e 0xc0000000288b0000 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[0]: c0000000288af9c0: c0000000288afb90 0000000000dd0000 ...(............ c0000000288af9d0: c000000000002a94 c000000001c60a00 .*.............. crash> sym c000000000002a94 c000000000002a94 (T) hardware_interrupt_common+0x114 c0000000288af9e0: c000000001c60a80 0000000000000000 ................ c0000000288af9f0: c0000000288afbc0 0000000000dd0000 ...(............ c0000000288afa00: c0000000014322e0 c000000001c60a00 ."C............. c0000000288afa10: c0000002303ae380 c0000002303ae380 ..:0......:0.... c0000000288afa20: 7265677368657265 0000000000002200 erehsger."...... Uh-oh... /* Mark stacktraces with exception frames as unreliable. */ stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER c0000000288afa30: c0000000001240ac c0000000288afcb0 .@.........(.... c0000000288afa40: c0000000013e4d00 0000000000000000 .M>............. c0000000288afa50: 0000000000000001 0000000000000001 ................ c0000000288afa60: c0000000014322e0 0000000000000804 ."C............. c0000000288afa70: c0000000288ac080 c00000022fd015c0 ...(......./.... c0000000288afa80: c0000000288afae0 00000000ffffffff ...(............ c0000000288afa90: c0000000288afae0 c000000007b80900 ...(............ c0000000288afaa0: c000000000e90a00 c000000000e90a00 ................ c0000000288afab0: c0000000001240ac c000000000e90a00 .@.............. c0000000288afac0: c000000000e90a00 c000000004790580 ..........y..... c0000000288afad0: 0000000000000001 c000000000e90a00 ................ c0000000288afae0: 0000000000000000 0000000000000004 ................ c0000000288afaf0: c00000022fd01ad0 c000000000e73be0 .../.....;...... c0000000288afb00: c00000023900f450 c000000001488190 P..9......H..... c0000000288afb10: 0000000000ad0000 c00000023900ef40 ........@..9.... c0000000288afb20: c0000000288ac000 c000000000e73be0 ...(.....;...... c0000000288afb30: c00000000001b514 c00000022fd01628 ........(../.... c0000000288afb40: c0000000288afbb0 c000000000008800 ...(............ c0000000288afb50: c000000000162880 0000000000000000 .(.............. c0000000288afb60: 00000000240f3022 0000000000000004 "0.$............ c0000000288afb70: c000000000e90a00 c00000022fd01a20 ........ ../.... c0000000288afb80: c0000000288afbf0 c0000000014322e0 ...(....."C..... - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[1]: c0000000288afb90: c0000000288afbf0 c000000001960a00 ...(............ c0000000288afba0: c00000000001b514 0000000000000004 ................ crash> sym c00000000001b514 c00000000001b514 (T) __switch_to+0x264 c0000000288afbb0: c000000000e90a00 0000000000000000 ................ c0000000288afbc0: c0000000288ac000 c0000000014322e0 ...(....."C..... c0000000288afbd0: c000000000e90a00 c000000001960a00 ................ c0000000288afbe0: c00000022fd015c0 c00000023900ef40 .../....@..9.... - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[2]: c0000000288afbf0: c0000000288afcd0 c000000000003300 ...(.....3...... c0000000288afc00: c000000000a83918 c0000000013e4d00 .9.......M>..... crash> sym c000000000a83918 c000000000a83918 (t) __schedule+0x428 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[3]: c0000000288afcd0: c0000000288afd80 0000000000003300 ...(.....3...... c0000000288afce0: c0000000001240ac 0000000000000000 .@.............. crash> sym c0000000001240ac c0000000001240ac (t) rescuer_thread+0x3ac [ ... and so on as we would normally expect ... ] Example 2 - (RHEL-7) ==================== crash> struct task_struct.thread c00000004a660a00 | grep ksp ksp = 0xc0000005e85439c0, crash> rd 0xc0000005e8543b90 -e 0xc0000005e8544000 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[0]: c0000005e85439c0: c0000005e8543b90 0000000000000000 .;T............. c0000005e85439d0: c000000000002a94 0000000000000000 .*.............. crash> sym c000000000002a94 c000000000002a94 (T) hardware_interrupt_common+0x114 c0000005e8543a20: 7265677368657265 c0000000013e4d00 erehsger.M>..... /* Mark stacktraces with exception frames as unreliable. */ stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[1]: c0000005e8543b90: c0000005e8543bf0 c000000001960a00 .;T............. c0000005e8543ba0: c00000000001b514 0000000000000004 ................ crash> sym c00000000001b514 c00000000001b514 (T) __switch_to+0x264 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[2]: c0000005e8543bf0: c0000005e8543cd0 c000000000003300 .<T......3...... c0000005e8543c00: c000000000a83918 c0000000013e4d00 .9.......M>..... crash> sym c000000000a83918 c000000000a83918 (t) __schedule+0x428 [ ... and so on as we would normally expect ... ] Example 3 (upstream 4.20) ========================= This is not a repro of the sticky "unreliable" stack trace, but just the stack of a workqueue that fired and called msleep. crash> struct task_struct.thread c0000002257f0e00 | grep ksp ksp = 0xc00000022715f9b0, crash> rd 0xc00000022715f9b0 100 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[0]: c00000022715f9b0: c00000022715fb90 c00000023f4e8280 ...'......N?.... c00000022715f9c0: c00000022715f9e0 0000000000000003 ...'............ crash> sym c00000022715f9e0 c00000022715f9e0 (B) _end+0x22594f9e0 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[1]: c00000022715fb90: c00000022715fbf0 c00000000158634c ...'....LcX..... c00000022715fba0: c00000000001ebfc c000000001318a60 ........`.1..... crash> sym c00000000001ebfc c00000000001ebfc (T) __switch_to+0x2dc - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[2]: c00000022715fbf0: c00000022715fcd0 c000000000008800 ...'............ c00000022715fc00: c000000000abcf40 c0000000013abf00 @.........:..... crash> sym c000000000abcf40 c000000000abcf40 (t) __schedule+0x3b0 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[3]: c00000022715fcd0: c00000022715fd00 2c50fff000000000 ...'..........P, c00000022715fce0: c000000000abd628 c00000023bff57e8 (........W.;.... crash> sym c000000000abd628 c000000000abd628 (T) schedule+0x48 [ ... etc ... ] save_stack_trace_tsk_reliable ============================= arch/powerpc/kernel/stacktrace.c :: save_stack_trace_tsk_reliable() does take into account the first stackframe, but only to verify that the link register is indeed pointing at kernel code address. Can someone explain what __switch_to is doing with the stack and whether in such circumstances, the reliable stack unwinder should be skipping the first frame when verifying the frame contents like STACK_FRAME_MARKER, etc. I may be on the wrong path in debugging this, but figuring out this sp[0] frame state would be helpful. Thanks, -- Joe
WARNING: multiple messages have this Message-ID (diff)
From: Joe Lawrence <joe.lawrence@redhat.com> To: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, live-patching@vger.kernel.org Cc: Torsten Duwe <duwe@lst.de>, Jiri Kosina <jkosina@suse.cz> Subject: ppc64le reliable stack unwinder and scheduled tasks Date: Thu, 10 Jan 2019 16:14:00 -0500 [thread overview] Message-ID: <7f468285-b149-37e2-e782-c9e538b997a9@redhat.com> (raw) Hi all, tl;dr: On ppc64le, what is top-most stack frame for scheduled tasks about? I am looking at a bug in which ~10% of livepatch tests on RHEL-7 and RHEL-8 distro kernels, the ppc64le reliable stack unwinder consistently reports an unreliable stack for a given task. This condition can eventually resolve itself, but sometimes this state remains wedged for hours and I can live-debug the system with crash-utility. I have tried reproducing with upstream 4.20 kernel, but haven't seen this exact condition yet. More on upstream in a bit. That said, I have a question about the ppc64 stack frame layout with regard to scheduled tasks. In each sticky "unreliable" stack trace instance that I've looked at, the task's stack has an interesting frame at the top: Example 1 (RHEL-7) ================== crash> struct task_struct.thread c00000022fd015c0 | grep ksp ksp = 0xc0000000288af9c0 crash> rd 0xc0000000288af9c0 -e 0xc0000000288b0000 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[0]: c0000000288af9c0: c0000000288afb90 0000000000dd0000 ...(............ c0000000288af9d0: c000000000002a94 c000000001c60a00 .*.............. crash> sym c000000000002a94 c000000000002a94 (T) hardware_interrupt_common+0x114 c0000000288af9e0: c000000001c60a80 0000000000000000 ................ c0000000288af9f0: c0000000288afbc0 0000000000dd0000 ...(............ c0000000288afa00: c0000000014322e0 c000000001c60a00 ."C............. c0000000288afa10: c0000002303ae380 c0000002303ae380 ..:0......:0.... c0000000288afa20: 7265677368657265 0000000000002200 erehsger."...... Uh-oh... /* Mark stacktraces with exception frames as unreliable. */ stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER c0000000288afa30: c0000000001240ac c0000000288afcb0 .@.........(.... c0000000288afa40: c0000000013e4d00 0000000000000000 .M>............. c0000000288afa50: 0000000000000001 0000000000000001 ................ c0000000288afa60: c0000000014322e0 0000000000000804 ."C............. c0000000288afa70: c0000000288ac080 c00000022fd015c0 ...(......./.... c0000000288afa80: c0000000288afae0 00000000ffffffff ...(............ c0000000288afa90: c0000000288afae0 c000000007b80900 ...(............ c0000000288afaa0: c000000000e90a00 c000000000e90a00 ................ c0000000288afab0: c0000000001240ac c000000000e90a00 .@.............. c0000000288afac0: c000000000e90a00 c000000004790580 ..........y..... c0000000288afad0: 0000000000000001 c000000000e90a00 ................ c0000000288afae0: 0000000000000000 0000000000000004 ................ c0000000288afaf0: c00000022fd01ad0 c000000000e73be0 .../.....;...... c0000000288afb00: c00000023900f450 c000000001488190 P..9......H..... c0000000288afb10: 0000000000ad0000 c00000023900ef40 ........@..9.... c0000000288afb20: c0000000288ac000 c000000000e73be0 ...(.....;...... c0000000288afb30: c00000000001b514 c00000022fd01628 ........(../.... c0000000288afb40: c0000000288afbb0 c000000000008800 ...(............ c0000000288afb50: c000000000162880 0000000000000000 .(.............. c0000000288afb60: 00000000240f3022 0000000000000004 "0.$............ c0000000288afb70: c000000000e90a00 c00000022fd01a20 ........ ../.... c0000000288afb80: c0000000288afbf0 c0000000014322e0 ...(....."C..... - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[1]: c0000000288afb90: c0000000288afbf0 c000000001960a00 ...(............ c0000000288afba0: c00000000001b514 0000000000000004 ................ crash> sym c00000000001b514 c00000000001b514 (T) __switch_to+0x264 c0000000288afbb0: c000000000e90a00 0000000000000000 ................ c0000000288afbc0: c0000000288ac000 c0000000014322e0 ...(....."C..... c0000000288afbd0: c000000000e90a00 c000000001960a00 ................ c0000000288afbe0: c00000022fd015c0 c00000023900ef40 .../....@..9.... - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[2]: c0000000288afbf0: c0000000288afcd0 c000000000003300 ...(.....3...... c0000000288afc00: c000000000a83918 c0000000013e4d00 .9.......M>..... crash> sym c000000000a83918 c000000000a83918 (t) __schedule+0x428 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[3]: c0000000288afcd0: c0000000288afd80 0000000000003300 ...(.....3...... c0000000288afce0: c0000000001240ac 0000000000000000 .@.............. crash> sym c0000000001240ac c0000000001240ac (t) rescuer_thread+0x3ac [ ... and so on as we would normally expect ... ] Example 2 - (RHEL-7) ==================== crash> struct task_struct.thread c00000004a660a00 | grep ksp ksp = 0xc0000005e85439c0, crash> rd 0xc0000005e8543b90 -e 0xc0000005e8544000 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[0]: c0000005e85439c0: c0000005e8543b90 0000000000000000 .;T............. c0000005e85439d0: c000000000002a94 0000000000000000 .*.............. crash> sym c000000000002a94 c000000000002a94 (T) hardware_interrupt_common+0x114 c0000005e8543a20: 7265677368657265 c0000000013e4d00 erehsger.M>..... /* Mark stacktraces with exception frames as unreliable. */ stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[1]: c0000005e8543b90: c0000005e8543bf0 c000000001960a00 .;T............. c0000005e8543ba0: c00000000001b514 0000000000000004 ................ crash> sym c00000000001b514 c00000000001b514 (T) __switch_to+0x264 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[2]: c0000005e8543bf0: c0000005e8543cd0 c000000000003300 .<T......3...... c0000005e8543c00: c000000000a83918 c0000000013e4d00 .9.......M>..... crash> sym c000000000a83918 c000000000a83918 (t) __schedule+0x428 [ ... and so on as we would normally expect ... ] Example 3 (upstream 4.20) ========================= This is not a repro of the sticky "unreliable" stack trace, but just the stack of a workqueue that fired and called msleep. crash> struct task_struct.thread c0000002257f0e00 | grep ksp ksp = 0xc00000022715f9b0, crash> rd 0xc00000022715f9b0 100 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[0]: c00000022715f9b0: c00000022715fb90 c00000023f4e8280 ...'......N?.... c00000022715f9c0: c00000022715f9e0 0000000000000003 ...'............ crash> sym c00000022715f9e0 c00000022715f9e0 (B) _end+0x22594f9e0 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[1]: c00000022715fb90: c00000022715fbf0 c00000000158634c ...'....LcX..... c00000022715fba0: c00000000001ebfc c000000001318a60 ........`.1..... crash> sym c00000000001ebfc c00000000001ebfc (T) __switch_to+0x2dc - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[2]: c00000022715fbf0: c00000022715fcd0 c000000000008800 ...'............ c00000022715fc00: c000000000abcf40 c0000000013abf00 @.........:..... crash> sym c000000000abcf40 c000000000abcf40 (t) __schedule+0x3b0 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - sp[3]: c00000022715fcd0: c00000022715fd00 2c50fff000000000 ...'..........P, c00000022715fce0: c000000000abd628 c00000023bff57e8 (........W.;.... crash> sym c000000000abd628 c000000000abd628 (T) schedule+0x48 [ ... etc ... ] save_stack_trace_tsk_reliable ============================= arch/powerpc/kernel/stacktrace.c :: save_stack_trace_tsk_reliable() does take into account the first stackframe, but only to verify that the link register is indeed pointing at kernel code address. Can someone explain what __switch_to is doing with the stack and whether in such circumstances, the reliable stack unwinder should be skipping the first frame when verifying the frame contents like STACK_FRAME_MARKER, etc. I may be on the wrong path in debugging this, but figuring out this sp[0] frame state would be helpful. Thanks, -- Joe
next reply other threads:[~2019-01-10 21:14 UTC|newest] Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-01-10 21:14 Joe Lawrence [this message] 2019-01-10 21:14 ` ppc64le reliable stack unwinder and scheduled tasks Joe Lawrence 2019-01-11 0:00 ` Nicolai Stange 2019-01-11 0:00 ` Nicolai Stange 2019-01-11 1:08 ` Joe Lawrence 2019-01-11 1:08 ` Joe Lawrence 2019-01-11 7:51 ` Nicolai Stange 2019-01-11 7:51 ` Nicolai Stange 2019-01-14 4:09 ` Joe Lawrence 2019-01-14 4:09 ` Joe Lawrence 2019-01-14 7:21 ` Nicolai Stange 2019-01-14 7:21 ` Nicolai Stange 2019-01-14 16:46 ` Joe Lawrence 2019-01-14 16:46 ` Joe Lawrence 2019-01-14 17:09 ` Josh Poimboeuf 2019-01-14 17:09 ` Josh Poimboeuf 2019-01-14 17:54 ` Joe Lawrence 2019-01-14 17:54 ` Joe Lawrence 2019-01-12 1:09 ` Balbir Singh 2019-01-12 1:09 ` Balbir Singh 2019-01-12 8:45 ` Segher Boessenkool 2019-01-13 12:33 ` Balbir Singh 2019-01-13 13:05 ` Torsten Duwe 2019-01-13 13:05 ` Torsten Duwe 2019-01-17 14:52 ` Segher Boessenkool
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=7f468285-b149-37e2-e782-c9e538b997a9@redhat.com \ --to=joe.lawrence@redhat.com \ --cc=bsingharora@gmail.com \ --cc=duwe@lst.de \ --cc=jkosina@suse.cz \ --cc=linux-kernel@vger.kernel.org \ --cc=linuxppc-dev@lists.ozlabs.org \ --cc=live-patching@vger.kernel.org \ --cc=mpe@ellerman.id.au \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.