* BUG : PowerPC RCU: torture test failed with __stack_chk_fail @ 2023-04-22 12:46 Zhouyi Zhou 2023-04-22 19:19 ` Joel Fernandes ` (2 more replies) 0 siblings, 3 replies; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-22 12:46 UTC (permalink / raw) To: linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney, Michael Ellerman Dear PowerPC and RCU developers: During the RCU torture test on mainline (on the VM of Opensource Lab of Oregon State University), SRCU-P failed with __stack_chk_fail: [ 264.381952][ T99] [c000000006c7bab0] [c0000000010c67c0] dump_stack_lvl+0x94/0xd8 (unreliable) [ 264.383786][ T99] [c000000006c7bae0] [c00000000014fc94] panic+0x19c/0x468 [ 264.385128][ T99] [c000000006c7bb80] [c0000000010fca24] __stack_chk_fail+0x24/0x30 [ 264.386610][ T99] [c000000006c7bbe0] [c0000000002293b4] srcu_gp_start_if_needed+0x5c4/0x5d0 [ 264.388188][ T99] [c000000006c7bc70] [c00000000022f7f4] srcu_torture_call+0x34/0x50 [ 264.389611][ T99] [c000000006c7bc90] [c00000000022b5e8] rcu_torture_fwd_prog+0x8c8/0xa60 [ 264.391439][ T99] [c000000006c7be00] [c00000000018e37c] kthread+0x15c/0x170 [ 264.392792][ T99] [c000000006c7be50] [c00000000000df94] ret_from_kernel_thread+0x5c/0x64 The kernel config file can be found in [1]. And I write a bash script to accelerate the bug reproducing [2]. After a week's debugging, I found the cause of the bug is because the register r10 used to judge for stack overflow is not constant between context switches. The assembly code for srcu_gp_start_if_needed is located at [3]: c000000000226eb4: 78 6b aa 7d mr r10,r13 c000000000226eb8: 14 42 29 7d add r9,r9,r8 c000000000226ebc: ac 04 00 7c hwsync c000000000226ec0: 10 00 7b 3b addi r27,r27,16 c000000000226ec4: 14 da 29 7d add r9,r9,r27 c000000000226ec8: a8 48 00 7d ldarx r8,0,r9 c000000000226ecc: 01 00 08 31 addic r8,r8,1 c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9 c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 <srcu_gp_start_if_needed+0x1c8> c000000000226ed8: 28 00 21 e9 ld r9,40(r1) c000000000226edc: 78 0c 4a e9 ld r10,3192(r10) c000000000226ee0: 79 52 29 7d xor. r9,r9,r10 c000000000226ee4: 00 00 40 39 li r10,0 c000000000226ee8: b8 03 82 40 bne c0000000002272a0 <srcu_gp_start_if_needed+0x5a0> by debugging, I see the r10 is assigned with r13 on c000000000226eb4, but if there is a context-switch before c000000000226edc, a false positive will be reported. [1] http://154.220.3.115/logs/0422/configformainline.txt [2] 154.220.3.115/logs/0422/whilebash.sh [3] http://154.220.3.115/logs/0422/srcu_gp_start_if_needed.txt My analysis and debugging may not be correct, but the bug is easily reproducible. Thanks Zhouyi ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-22 12:46 BUG : PowerPC RCU: torture test failed with __stack_chk_fail Zhouyi Zhou @ 2023-04-22 19:19 ` Joel Fernandes 2023-04-23 1:37 ` Zhouyi Zhou 2023-04-22 19:28 ` Joel Fernandes 2023-04-24 22:07 ` Michael Ellerman 2 siblings, 1 reply; 41+ messages in thread From: Joel Fernandes @ 2023-04-22 19:19 UTC (permalink / raw) To: Zhouyi Zhou Cc: linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney, Michael Ellerman Hi Zhouyi, On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > Dear PowerPC and RCU developers: > During the RCU torture test on mainline (on the VM of Opensource Lab > of Oregon State University), SRCU-P failed with __stack_chk_fail: > [ 264.381952][ T99] [c000000006c7bab0] [c0000000010c67c0] > dump_stack_lvl+0x94/0xd8 (unreliable) > [ 264.383786][ T99] [c000000006c7bae0] [c00000000014fc94] panic+0x19c/0x468 > [ 264.385128][ T99] [c000000006c7bb80] [c0000000010fca24] > __stack_chk_fail+0x24/0x30 > [ 264.386610][ T99] [c000000006c7bbe0] [c0000000002293b4] > srcu_gp_start_if_needed+0x5c4/0x5d0 > [ 264.388188][ T99] [c000000006c7bc70] [c00000000022f7f4] > srcu_torture_call+0x34/0x50 > [ 264.389611][ T99] [c000000006c7bc90] [c00000000022b5e8] > rcu_torture_fwd_prog+0x8c8/0xa60 > [ 264.391439][ T99] [c000000006c7be00] [c00000000018e37c] kthread+0x15c/0x170 > [ 264.392792][ T99] [c000000006c7be50] [c00000000000df94] > ret_from_kernel_thread+0x5c/0x64 > The kernel config file can be found in [1]. > And I write a bash script to accelerate the bug reproducing [2]. > After a week's debugging, I found the cause of the bug is because the > register r10 used to judge for stack overflow is not constant between > context switches. > The assembly code for srcu_gp_start_if_needed is located at [3]: > c000000000226eb4: 78 6b aa 7d mr r10,r13 > c000000000226eb8: 14 42 29 7d add r9,r9,r8 > c000000000226ebc: ac 04 00 7c hwsync > c000000000226ec0: 10 00 7b 3b addi r27,r27,16 > c000000000226ec4: 14 da 29 7d add r9,r9,r27 > c000000000226ec8: a8 48 00 7d ldarx r8,0,r9 > c000000000226ecc: 01 00 08 31 addic r8,r8,1 > c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9 > c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 > <srcu_gp_start_if_needed+0x1c8> > c000000000226ed8: 28 00 21 e9 ld r9,40(r1) > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10) > c000000000226ee0: 79 52 29 7d xor. r9,r9,r10 > c000000000226ee4: 00 00 40 39 li r10,0 > c000000000226ee8: b8 03 82 40 bne c0000000002272a0 > <srcu_gp_start_if_needed+0x5a0> > by debugging, I see the r10 is assigned with r13 on c000000000226eb4, > but if there is a context-switch before c000000000226edc, a false > positive will be reported. > > [1] http://154.220.3.115/logs/0422/configformainline.txt > [2] 154.220.3.115/logs/0422/whilebash.sh > [3] http://154.220.3.115/logs/0422/srcu_gp_start_if_needed.txt > > My analysis and debugging may not be correct, but the bug is easily > reproducible. Could you provide the full kernel log? It is not clear exactly from your attachments, but I think this is a stack overflow issue as implied by the mention of __stack_chk_fail. One trick might be to turn on any available stack debug kernel config options, or check the kernel logs for any messages related to shortage of remaining stack space. Additionally, you could also find out where the kernel crash happened in C code following the below notes [1] I wrote (see section "Figuring out where kernel crashes happen in C code"). The notes are x86-specific but should be generally applicable (In the off chance you'd like to improve the notes, feel free to share them ;-)). Lastly, is it a specific kernel release from which you start seeing this issue? You should try git bisect if it is easily reproducible in a newer release, but goes away in an older one. I will also join you in your debug efforts soon though I am currently in between conferences. [1] https://gist.github.com/joelagnel/ae15c404facee0eb3ebb8aff0e996a68 thanks, - Joel > > Thanks > Zhouyi ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-22 19:19 ` Joel Fernandes @ 2023-04-23 1:37 ` Zhouyi Zhou 2023-04-23 5:45 ` Zhouyi Zhou 0 siblings, 1 reply; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-23 1:37 UTC (permalink / raw) To: Joel Fernandes Cc: linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney, Michael Ellerman On Sun, Apr 23, 2023 at 3:19 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > Hi Zhouyi, Thank Joel for your quick response ;-) I will gradually provide all the necessary information to facilitate our chasing. Please do not hesitate email me if I have ignored any ;-) > > On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > > > Dear PowerPC and RCU developers: > > During the RCU torture test on mainline (on the VM of Opensource Lab > > of Oregon State University), SRCU-P failed with __stack_chk_fail: > > [ 264.381952][ T99] [c000000006c7bab0] [c0000000010c67c0] > > dump_stack_lvl+0x94/0xd8 (unreliable) > > [ 264.383786][ T99] [c000000006c7bae0] [c00000000014fc94] panic+0x19c/0x468 > > [ 264.385128][ T99] [c000000006c7bb80] [c0000000010fca24] > > __stack_chk_fail+0x24/0x30 > > [ 264.386610][ T99] [c000000006c7bbe0] [c0000000002293b4] > > srcu_gp_start_if_needed+0x5c4/0x5d0 > > [ 264.388188][ T99] [c000000006c7bc70] [c00000000022f7f4] > > srcu_torture_call+0x34/0x50 > > [ 264.389611][ T99] [c000000006c7bc90] [c00000000022b5e8] > > rcu_torture_fwd_prog+0x8c8/0xa60 > > [ 264.391439][ T99] [c000000006c7be00] [c00000000018e37c] kthread+0x15c/0x170 > > [ 264.392792][ T99] [c000000006c7be50] [c00000000000df94] > > ret_from_kernel_thread+0x5c/0x64 > > The kernel config file can be found in [1]. > > And I write a bash script to accelerate the bug reproducing [2]. > > After a week's debugging, I found the cause of the bug is because the > > register r10 used to judge for stack overflow is not constant between > > context switches. > > The assembly code for srcu_gp_start_if_needed is located at [3]: > > c000000000226eb4: 78 6b aa 7d mr r10,r13 > > c000000000226eb8: 14 42 29 7d add r9,r9,r8 > > c000000000226ebc: ac 04 00 7c hwsync > > c000000000226ec0: 10 00 7b 3b addi r27,r27,16 > > c000000000226ec4: 14 da 29 7d add r9,r9,r27 > > c000000000226ec8: a8 48 00 7d ldarx r8,0,r9 > > c000000000226ecc: 01 00 08 31 addic r8,r8,1 > > c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9 > > c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 > > <srcu_gp_start_if_needed+0x1c8> > > c000000000226ed8: 28 00 21 e9 ld r9,40(r1) > > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10) > > c000000000226ee0: 79 52 29 7d xor. r9,r9,r10 > > c000000000226ee4: 00 00 40 39 li r10,0 > > c000000000226ee8: b8 03 82 40 bne c0000000002272a0 > > <srcu_gp_start_if_needed+0x5a0> > > by debugging, I see the r10 is assigned with r13 on c000000000226eb4, > > but if there is a context-switch before c000000000226edc, a false > > positive will be reported. > > > > [1] http://154.220.3.115/logs/0422/configformainline.txt > > [2] 154.220.3.115/logs/0422/whilebash.sh > > [3] http://154.220.3.115/logs/0422/srcu_gp_start_if_needed.txt > > > > My analysis and debugging may not be correct, but the bug is easily > > reproducible. > > Could you provide the full kernel log? It is not clear exactly from > your attachments, but I think this is a stack overflow issue as > implied by the mention of __stack_chk_fail. One trick might be to turn > on any available stack debug kernel config options, or check the > kernel logs for any messages related to shortage of remaining stack > space. The full kernel log is [1] [1] http://154.220.3.115/logs/0422/console.log > > Additionally, you could also find out where the kernel crash happened > in C code following the below notes [1] I wrote (see section "Figuring > out where kernel crashes happen in C code"). The notes are > x86-specific but should be generally applicable (In the off chance > you'd like to improve the notes, feel free to share them ;-)). Fantastic article!!!, I benefit a lot from reading it. Because we can reproduce it so easily on powerpc VM, I can even use gdb to debug it, following is my debug process on 2e83b879fb91dafe995967b46a1d38a5b0889242(srcu: Create an srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe()). [2] http://154.220.3.115/logs/0422/gdb.txt > > Lastly, is it a specific kernel release from which you start seeing > this issue? You should try git bisect if it is easily reproducible in > a newer release, but goes away in an older one. I did bisect on powerpc VM, the problem begin to appear on 2e83b879fb91dafe995967b46a1d38a5b0889242(srcu: Create an srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe()). The kernel is good at 5d0f5953b60f5f7a278085b55ddc73e2932f4c33(srcu: Convert ->srcu_lock_count and ->srcu_unlock_count to atomic) But if I apply the following patch [3] to 5d0f5953b60f5f7a278085b55ddc73e2932f4c33, the bug appears again. [3] http://154.220.3.115/logs/0422/bug.patch Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler on my x86 laptop (gcc version 10.4.0) will reproduce the bug. > > I will also join you in your debug efforts soon though I am currently > in between conferences. Exciting!! Thank you very much! I can give you ssh access (based on rsa pub key) to PPC vm on Oregon State University if you like. Thanks again Zhouyi > > [1] https://gist.github.com/joelagnel/ae15c404facee0eb3ebb8aff0e996a68 > > thanks, > > - Joel > > > > > > > > Thanks > > Zhouyi ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-23 1:37 ` Zhouyi Zhou @ 2023-04-23 5:45 ` Zhouyi Zhou 0 siblings, 0 replies; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-23 5:45 UTC (permalink / raw) To: Joel Fernandes Cc: linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney, Michael Ellerman On Sun, Apr 23, 2023 at 9:37 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > On Sun, Apr 23, 2023 at 3:19 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > > > Hi Zhouyi, > Thank Joel for your quick response ;-) > I will gradually provide all the necessary information to facilitate > our chasing. Please do not hesitate email me > if I have ignored any ;-) > > > > On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > > > > > Dear PowerPC and RCU developers: > > > During the RCU torture test on mainline (on the VM of Opensource Lab > > > of Oregon State University), SRCU-P failed with __stack_chk_fail: > > > [ 264.381952][ T99] [c000000006c7bab0] [c0000000010c67c0] > > > dump_stack_lvl+0x94/0xd8 (unreliable) > > > [ 264.383786][ T99] [c000000006c7bae0] [c00000000014fc94] panic+0x19c/0x468 > > > [ 264.385128][ T99] [c000000006c7bb80] [c0000000010fca24] > > > __stack_chk_fail+0x24/0x30 > > > [ 264.386610][ T99] [c000000006c7bbe0] [c0000000002293b4] > > > srcu_gp_start_if_needed+0x5c4/0x5d0 > > > [ 264.388188][ T99] [c000000006c7bc70] [c00000000022f7f4] > > > srcu_torture_call+0x34/0x50 > > > [ 264.389611][ T99] [c000000006c7bc90] [c00000000022b5e8] > > > rcu_torture_fwd_prog+0x8c8/0xa60 > > > [ 264.391439][ T99] [c000000006c7be00] [c00000000018e37c] kthread+0x15c/0x170 > > > [ 264.392792][ T99] [c000000006c7be50] [c00000000000df94] > > > ret_from_kernel_thread+0x5c/0x64 > > > The kernel config file can be found in [1]. > > > And I write a bash script to accelerate the bug reproducing [2]. > > > After a week's debugging, I found the cause of the bug is because the > > > register r10 used to judge for stack overflow is not constant between > > > context switches. > > > The assembly code for srcu_gp_start_if_needed is located at [3]: > > > c000000000226eb4: 78 6b aa 7d mr r10,r13 > > > c000000000226eb8: 14 42 29 7d add r9,r9,r8 > > > c000000000226ebc: ac 04 00 7c hwsync > > > c000000000226ec0: 10 00 7b 3b addi r27,r27,16 > > > c000000000226ec4: 14 da 29 7d add r9,r9,r27 > > > c000000000226ec8: a8 48 00 7d ldarx r8,0,r9 > > > c000000000226ecc: 01 00 08 31 addic r8,r8,1 > > > c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9 > > > c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 > > > <srcu_gp_start_if_needed+0x1c8> > > > c000000000226ed8: 28 00 21 e9 ld r9,40(r1) > > > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10) > > > c000000000226ee0: 79 52 29 7d xor. r9,r9,r10 > > > c000000000226ee4: 00 00 40 39 li r10,0 > > > c000000000226ee8: b8 03 82 40 bne c0000000002272a0 > > > <srcu_gp_start_if_needed+0x5a0> > > > by debugging, I see the r10 is assigned with r13 on c000000000226eb4, > > > but if there is a context-switch before c000000000226edc, a false > > > positive will be reported. > > > > > > [1] http://154.220.3.115/logs/0422/configformainline.txt > > > [2] 154.220.3.115/logs/0422/whilebash.sh > > > [3] http://154.220.3.115/logs/0422/srcu_gp_start_if_needed.txt > > > > > > My analysis and debugging may not be correct, but the bug is easily > > > reproducible. > > > > Could you provide the full kernel log? It is not clear exactly from > > your attachments, but I think this is a stack overflow issue as > > implied by the mention of __stack_chk_fail. One trick might be to turn > > on any available stack debug kernel config options, or check the > > kernel logs for any messages related to shortage of remaining stack > > space. > The full kernel log is [1] > [1] http://154.220.3.115/logs/0422/console.log > > > > Additionally, you could also find out where the kernel crash happened > > in C code following the below notes [1] I wrote (see section "Figuring > > out where kernel crashes happen in C code"). The notes are > > x86-specific but should be generally applicable (In the off chance > > you'd like to improve the notes, feel free to share them ;-)). > Fantastic article!!!, I benefit a lot from reading it. Because we can > reproduce it so easily on powerpc VM, > I can even use gdb to debug it, following is my debug process on > 2e83b879fb91dafe995967b46a1d38a5b0889242(srcu: Create an > srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe()). > > [2] http://154.220.3.115/logs/0422/gdb.txt > > > > Lastly, is it a specific kernel release from which you start seeing > > this issue? You should try git bisect if it is easily reproducible in > > a newer release, but goes away in an older one. > I did bisect on powerpc VM, the problem begin to appear on > 2e83b879fb91dafe995967b46a1d38a5b0889242(srcu: Create an > srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe()). > > The kernel is good at 5d0f5953b60f5f7a278085b55ddc73e2932f4c33(srcu: > Convert ->srcu_lock_count and ->srcu_unlock_count to atomic) > > But if I apply the following patch [3] to > 5d0f5953b60f5f7a278085b55ddc73e2932f4c33, the bug appears again. > [3] http://154.220.3.115/logs/0422/bug.patch > > Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler > on my x86 laptop (gcc version 10.4.0) will reproduce the bug. update: stress tested on x86 platform for 6 hours, no bug reported (while we can reproduce it on X86 based cross platform powerpc gcc and X86 based cross platform powerpc qemu in less than 3 minute). > > > > I will also join you in your debug efforts soon though I am currently > > in between conferences. > Exciting!! Thank you very much! > I can give you ssh access (based on rsa pub key) to PPC vm on Oregon > State University if you like. > > Thanks again > Zhouyi > > > > [1] https://gist.github.com/joelagnel/ae15c404facee0eb3ebb8aff0e996a68 > > > > thanks, > > > > - Joel > > > > > > > > > > > > > > Thanks > > > Zhouyi ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-22 12:46 BUG : PowerPC RCU: torture test failed with __stack_chk_fail Zhouyi Zhou 2023-04-22 19:19 ` Joel Fernandes @ 2023-04-22 19:28 ` Joel Fernandes 2023-04-24 0:32 ` Boqun Feng 2023-04-24 22:07 ` Michael Ellerman 2 siblings, 1 reply; 41+ messages in thread From: Joel Fernandes @ 2023-04-22 19:28 UTC (permalink / raw) To: Zhouyi Zhou Cc: linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney, Michael Ellerman On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > Dear PowerPC and RCU developers: > During the RCU torture test on mainline (on the VM of Opensource Lab > of Oregon State University), SRCU-P failed with __stack_chk_fail: > [ 264.381952][ T99] [c000000006c7bab0] [c0000000010c67c0] > dump_stack_lvl+0x94/0xd8 (unreliable) > [ 264.383786][ T99] [c000000006c7bae0] [c00000000014fc94] panic+0x19c/0x468 > [ 264.385128][ T99] [c000000006c7bb80] [c0000000010fca24] > __stack_chk_fail+0x24/0x30 > [ 264.386610][ T99] [c000000006c7bbe0] [c0000000002293b4] > srcu_gp_start_if_needed+0x5c4/0x5d0 > [ 264.388188][ T99] [c000000006c7bc70] [c00000000022f7f4] > srcu_torture_call+0x34/0x50 > [ 264.389611][ T99] [c000000006c7bc90] [c00000000022b5e8] > rcu_torture_fwd_prog+0x8c8/0xa60 > [ 264.391439][ T99] [c000000006c7be00] [c00000000018e37c] kthread+0x15c/0x170 > [ 264.392792][ T99] [c000000006c7be50] [c00000000000df94] > ret_from_kernel_thread+0x5c/0x64 > The kernel config file can be found in [1]. > And I write a bash script to accelerate the bug reproducing [2]. > After a week's debugging, I found the cause of the bug is because the > register r10 used to judge for stack overflow is not constant between > context switches. > The assembly code for srcu_gp_start_if_needed is located at [3]: > c000000000226eb4: 78 6b aa 7d mr r10,r13 > c000000000226eb8: 14 42 29 7d add r9,r9,r8 > c000000000226ebc: ac 04 00 7c hwsync > c000000000226ec0: 10 00 7b 3b addi r27,r27,16 > c000000000226ec4: 14 da 29 7d add r9,r9,r27 > c000000000226ec8: a8 48 00 7d ldarx r8,0,r9 > c000000000226ecc: 01 00 08 31 addic r8,r8,1 > c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9 > c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 > <srcu_gp_start_if_needed+0x1c8> > c000000000226ed8: 28 00 21 e9 ld r9,40(r1) > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10) > c000000000226ee0: 79 52 29 7d xor. r9,r9,r10 > c000000000226ee4: 00 00 40 39 li r10,0 > c000000000226ee8: b8 03 82 40 bne c0000000002272a0 > <srcu_gp_start_if_needed+0x5a0> > by debugging, I see the r10 is assigned with r13 on c000000000226eb4, > but if there is a context-switch before c000000000226edc, a false > positive will be reported. > > [1] http://154.220.3.115/logs/0422/configformainline.txt > [2] 154.220.3.115/logs/0422/whilebash.sh > [3] http://154.220.3.115/logs/0422/srcu_gp_start_if_needed.txt > > My analysis and debugging may not be correct, but the bug is easily > reproducible. If this is a bug in the stack smashing protection as you seem to hint, I wonder if you see the issue with a specific gcc version and is a compiler-specific issue. It's hard to say, but considering this I think it's important for you to mention the compiler version in your report (along with kernel version, kernel logs etc.) thanks, - Joel - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-22 19:28 ` Joel Fernandes @ 2023-04-24 0:32 ` Boqun Feng 2023-04-24 4:00 ` Zhouyi Zhou 2023-04-24 13:14 ` Michael Ellerman 0 siblings, 2 replies; 41+ messages in thread From: Boqun Feng @ 2023-04-24 0:32 UTC (permalink / raw) To: Joel Fernandes Cc: Zhouyi Zhou, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney, Michael Ellerman On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote: > On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > > > Dear PowerPC and RCU developers: > > During the RCU torture test on mainline (on the VM of Opensource Lab > > of Oregon State University), SRCU-P failed with __stack_chk_fail: > > [ 264.381952][ T99] [c000000006c7bab0] [c0000000010c67c0] > > dump_stack_lvl+0x94/0xd8 (unreliable) > > [ 264.383786][ T99] [c000000006c7bae0] [c00000000014fc94] panic+0x19c/0x468 > > [ 264.385128][ T99] [c000000006c7bb80] [c0000000010fca24] > > __stack_chk_fail+0x24/0x30 > > [ 264.386610][ T99] [c000000006c7bbe0] [c0000000002293b4] > > srcu_gp_start_if_needed+0x5c4/0x5d0 > > [ 264.388188][ T99] [c000000006c7bc70] [c00000000022f7f4] > > srcu_torture_call+0x34/0x50 > > [ 264.389611][ T99] [c000000006c7bc90] [c00000000022b5e8] > > rcu_torture_fwd_prog+0x8c8/0xa60 > > [ 264.391439][ T99] [c000000006c7be00] [c00000000018e37c] kthread+0x15c/0x170 > > [ 264.392792][ T99] [c000000006c7be50] [c00000000000df94] > > ret_from_kernel_thread+0x5c/0x64 > > The kernel config file can be found in [1]. > > And I write a bash script to accelerate the bug reproducing [2]. > > After a week's debugging, I found the cause of the bug is because the > > register r10 used to judge for stack overflow is not constant between > > context switches. > > The assembly code for srcu_gp_start_if_needed is located at [3]: > > c000000000226eb4: 78 6b aa 7d mr r10,r13 > > c000000000226eb8: 14 42 29 7d add r9,r9,r8 > > c000000000226ebc: ac 04 00 7c hwsync > > c000000000226ec0: 10 00 7b 3b addi r27,r27,16 > > c000000000226ec4: 14 da 29 7d add r9,r9,r27 > > c000000000226ec8: a8 48 00 7d ldarx r8,0,r9 > > c000000000226ecc: 01 00 08 31 addic r8,r8,1 > > c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9 > > c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 > > <srcu_gp_start_if_needed+0x1c8> > > c000000000226ed8: 28 00 21 e9 ld r9,40(r1) > > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10) > > c000000000226ee0: 79 52 29 7d xor. r9,r9,r10 > > c000000000226ee4: 00 00 40 39 li r10,0 > > c000000000226ee8: b8 03 82 40 bne c0000000002272a0 > > <srcu_gp_start_if_needed+0x5a0> > > by debugging, I see the r10 is assigned with r13 on c000000000226eb4, > > but if there is a context-switch before c000000000226edc, a false > > positive will be reported. > > > > [1] http://154.220.3.115/logs/0422/configformainline.txt > > [2] 154.220.3.115/logs/0422/whilebash.sh > > [3] http://154.220.3.115/logs/0422/srcu_gp_start_if_needed.txt > > > > My analysis and debugging may not be correct, but the bug is easily > > reproducible. > > If this is a bug in the stack smashing protection as you seem to hint, > I wonder if you see the issue with a specific gcc version and is a > compiler-specific issue. It's hard to say, but considering this I Very likely, more asm code from Zhouyi's link: This is the __srcu_read_unlock_nmisafe(), since "hwsync" is smp_mb__{after,before}_atomic(), and the following code is first barrier then atomic, so it's the unlock. c000000000226eb4: 78 6b aa 7d mr r10,r13 ^ r13 is the pointer to percpu data on PPC64 kernel, and it's also the pointer to TLS data for userspace code. c000000000226eb8: 14 42 29 7d add r9,r9,r8 c000000000226ebc: ac 04 00 7c hwsync c000000000226ec0: 10 00 7b 3b addi r27,r27,16 c000000000226ec4: 14 da 29 7d add r9,r9,r27 c000000000226ec8: a8 48 00 7d ldarx r8,0,r9 c000000000226ecc: 01 00 08 31 addic r8,r8,1 c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9 c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 <srcu_gp_start_if_needed+0x1c8> c000000000226ed8: 28 00 21 e9 ld r9,40(r1) c000000000226edc: 78 0c 4a e9 ld r10,3192(r10) here I think that the compiler is using r10 as an alias to r13, since for userspace program, it's safe to assume the TLS pointer doesn't change. However this is not true for kernel percpu pointer. The real intention here is to compare 40(r1) vs 3192(r13) for stack guard checking, however since r13 is the percpu pointer in kernel, so the value of r13 can be changed if the thread gets scheduled to a different CPU after reading r13 for r10. __srcu_read_unlock_nmisafe() triggers this issue, because: * it contains a read from r13 * it locates at the very end of srcu_gp_start_if_needed(). This gives the compiler more opportunity to "optimize" a read from r13 away. c000000000226ee0: 79 52 29 7d xor. r9,r9,r10 c000000000226ee4: 00 00 40 39 li r10,0 c000000000226ee8: b8 03 82 40 bne c0000000002272a0 <srcu_gp_start_if_needed+0x5a0> As a result, here triggers __stack_chk_fail if mis-match. If I'm correct, the following should be a workaround: diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index ab4ee58af84b..f5ae3be3d04d 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -747,6 +747,7 @@ void __srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx) smp_mb__before_atomic(); /* C */ /* Avoid leaking the critical section. */ atomic_long_inc(&sdp->srcu_unlock_count[idx]); + asm volatile("" : : : "r13", "memory"); } EXPORT_SYMBOL_GPL(__srcu_read_unlock_nmisafe); Zhouyi, could you give a try? Note I think the "memory" clobber here is unnecesarry, but I just add it in case I'm wrong. Needless to say, the correct fix is to make ppc stack protector aware of r13 is volatile. Regards, Boqun > think it's important for you to mention the compiler version in your > report (along with kernel version, kernel logs etc.) > > thanks, > > - Joel > > > - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-24 0:32 ` Boqun Feng @ 2023-04-24 4:00 ` Zhouyi Zhou 2023-04-24 13:14 ` Michael Ellerman 1 sibling, 0 replies; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-24 4:00 UTC (permalink / raw) To: Boqun Feng Cc: Joel Fernandes, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney, Michael Ellerman Thank Boqun for your wonderful analysis! On Mon, Apr 24, 2023 at 8:33 AM Boqun Feng <boqun.feng@gmail.com> wrote: > > On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote: > > On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > > > > > Dear PowerPC and RCU developers: > > > During the RCU torture test on mainline (on the VM of Opensource Lab > > > of Oregon State University), SRCU-P failed with __stack_chk_fail: > > > [ 264.381952][ T99] [c000000006c7bab0] [c0000000010c67c0] > > > dump_stack_lvl+0x94/0xd8 (unreliable) > > > [ 264.383786][ T99] [c000000006c7bae0] [c00000000014fc94] panic+0x19c/0x468 > > > [ 264.385128][ T99] [c000000006c7bb80] [c0000000010fca24] > > > __stack_chk_fail+0x24/0x30 > > > [ 264.386610][ T99] [c000000006c7bbe0] [c0000000002293b4] > > > srcu_gp_start_if_needed+0x5c4/0x5d0 > > > [ 264.388188][ T99] [c000000006c7bc70] [c00000000022f7f4] > > > srcu_torture_call+0x34/0x50 > > > [ 264.389611][ T99] [c000000006c7bc90] [c00000000022b5e8] > > > rcu_torture_fwd_prog+0x8c8/0xa60 > > > [ 264.391439][ T99] [c000000006c7be00] [c00000000018e37c] kthread+0x15c/0x170 > > > [ 264.392792][ T99] [c000000006c7be50] [c00000000000df94] > > > ret_from_kernel_thread+0x5c/0x64 > > > The kernel config file can be found in [1]. > > > And I write a bash script to accelerate the bug reproducing [2]. > > > After a week's debugging, I found the cause of the bug is because the > > > register r10 used to judge for stack overflow is not constant between > > > context switches. > > > The assembly code for srcu_gp_start_if_needed is located at [3]: > > > c000000000226eb4: 78 6b aa 7d mr r10,r13 > > > c000000000226eb8: 14 42 29 7d add r9,r9,r8 > > > c000000000226ebc: ac 04 00 7c hwsync > > > c000000000226ec0: 10 00 7b 3b addi r27,r27,16 > > > c000000000226ec4: 14 da 29 7d add r9,r9,r27 > > > c000000000226ec8: a8 48 00 7d ldarx r8,0,r9 > > > c000000000226ecc: 01 00 08 31 addic r8,r8,1 > > > c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9 > > > c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 > > > <srcu_gp_start_if_needed+0x1c8> > > > c000000000226ed8: 28 00 21 e9 ld r9,40(r1) > > > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10) > > > c000000000226ee0: 79 52 29 7d xor. r9,r9,r10 > > > c000000000226ee4: 00 00 40 39 li r10,0 > > > c000000000226ee8: b8 03 82 40 bne c0000000002272a0 > > > <srcu_gp_start_if_needed+0x5a0> > > > by debugging, I see the r10 is assigned with r13 on c000000000226eb4, > > > but if there is a context-switch before c000000000226edc, a false > > > positive will be reported. > > > > > > [1] http://154.220.3.115/logs/0422/configformainline.txt > > > [2] 154.220.3.115/logs/0422/whilebash.sh > > > [3] http://154.220.3.115/logs/0422/srcu_gp_start_if_needed.txt > > > > > > My analysis and debugging may not be correct, but the bug is easily > > > reproducible. > > > > If this is a bug in the stack smashing protection as you seem to hint, > > I wonder if you see the issue with a specific gcc version and is a > > compiler-specific issue. It's hard to say, but considering this I > > Very likely, more asm code from Zhouyi's link: > > This is the __srcu_read_unlock_nmisafe(), since "hwsync" is > smp_mb__{after,before}_atomic(), and the following code is first > barrier then atomic, so it's the unlock. > > c000000000226eb4: 78 6b aa 7d mr r10,r13 > > ^ r13 is the pointer to percpu data on PPC64 kernel, and it's also > the pointer to TLS data for userspace code. > > c000000000226eb8: 14 42 29 7d add r9,r9,r8 > c000000000226ebc: ac 04 00 7c hwsync > c000000000226ec0: 10 00 7b 3b addi r27,r27,16 > c000000000226ec4: 14 da 29 7d add r9,r9,r27 > c000000000226ec8: a8 48 00 7d ldarx r8,0,r9 > c000000000226ecc: 01 00 08 31 addic r8,r8,1 > c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9 > c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 <srcu_gp_start_if_needed+0x1c8> > c000000000226ed8: 28 00 21 e9 ld r9,40(r1) > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10) > > here I think that the compiler is using r10 as an alias to r13, since > for userspace program, it's safe to assume the TLS pointer doesn't > change. However this is not true for kernel percpu pointer. I learned a lot from your analysis, this is a fruitful learning journey for me ;-) > > The real intention here is to compare 40(r1) vs 3192(r13) for stack > guard checking, however since r13 is the percpu pointer in kernel, so > the value of r13 can be changed if the thread gets scheduled to a > different CPU after reading r13 for r10. > > __srcu_read_unlock_nmisafe() triggers this issue, because: > > * it contains a read from r13 > * it locates at the very end of srcu_gp_start_if_needed(). > > This gives the compiler more opportunity to "optimize" a read from r13 > away. Ah, this why adding __srcu_read_unlock_nmisafe() triggers this issue. > > c000000000226ee0: 79 52 29 7d xor. r9,r9,r10 > c000000000226ee4: 00 00 40 39 li r10,0 > c000000000226ee8: b8 03 82 40 bne c0000000002272a0 <srcu_gp_start_if_needed+0x5a0> > > As a result, here triggers __stack_chk_fail if mis-match. > > If I'm correct, the following should be a workaround: > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > index ab4ee58af84b..f5ae3be3d04d 100644 > --- a/kernel/rcu/srcutree.c > +++ b/kernel/rcu/srcutree.c > @@ -747,6 +747,7 @@ void __srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx) > > smp_mb__before_atomic(); /* C */ /* Avoid leaking the critical section. */ > atomic_long_inc(&sdp->srcu_unlock_count[idx]); > + asm volatile("" : : : "r13", "memory"); > } > EXPORT_SYMBOL_GPL(__srcu_read_unlock_nmisafe); > > Zhouyi, could you give a try? Note I think the "memory" clobber here is > unnecesarry, but I just add it in case I'm wrong. After applying above, the srcu_gp_start_if_needed becomes http://140.211.169.189/0424/srcu_gp_start_if_needed.txt now. Yes, the modified kernel has survived > 2 hours' test, while the original kernel will certainly fail within 3 minutes. > > > Needless to say, the correct fix is to make ppc stack protector aware of > r13 is volatile. Yes, agree, thank you Thanks you all Regards Zhouyi > > Regards, > Boqun > > > think it's important for you to mention the compiler version in your > > report (along with kernel version, kernel logs etc.) > > > > thanks, > > > > - Joel > > > > > > - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-24 0:32 ` Boqun Feng 2023-04-24 4:00 ` Zhouyi Zhou @ 2023-04-24 13:14 ` Michael Ellerman 2023-04-24 15:13 ` Segher Boessenkool 1 sibling, 1 reply; 41+ messages in thread From: Michael Ellerman @ 2023-04-24 13:14 UTC (permalink / raw) To: Boqun Feng, Joel Fernandes Cc: Zhouyi Zhou, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney, Segher Boessenkool Hi Boqun, Thanks for debugging this ... Boqun Feng <boqun.feng@gmail.com> writes: > On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote: >> On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: >> > >> > Dear PowerPC and RCU developers: >> > During the RCU torture test on mainline (on the VM of Opensource Lab >> > of Oregon State University), SRCU-P failed with __stack_chk_fail: >> > [ 264.381952][ T99] [c000000006c7bab0] [c0000000010c67c0] >> > dump_stack_lvl+0x94/0xd8 (unreliable) >> > [ 264.383786][ T99] [c000000006c7bae0] [c00000000014fc94] panic+0x19c/0x468 >> > [ 264.385128][ T99] [c000000006c7bb80] [c0000000010fca24] >> > __stack_chk_fail+0x24/0x30 >> > [ 264.386610][ T99] [c000000006c7bbe0] [c0000000002293b4] >> > srcu_gp_start_if_needed+0x5c4/0x5d0 >> > [ 264.388188][ T99] [c000000006c7bc70] [c00000000022f7f4] >> > srcu_torture_call+0x34/0x50 >> > [ 264.389611][ T99] [c000000006c7bc90] [c00000000022b5e8] >> > rcu_torture_fwd_prog+0x8c8/0xa60 >> > [ 264.391439][ T99] [c000000006c7be00] [c00000000018e37c] kthread+0x15c/0x170 >> > [ 264.392792][ T99] [c000000006c7be50] [c00000000000df94] >> > ret_from_kernel_thread+0x5c/0x64 >> > The kernel config file can be found in [1]. >> > And I write a bash script to accelerate the bug reproducing [2]. >> > After a week's debugging, I found the cause of the bug is because the >> > register r10 used to judge for stack overflow is not constant between >> > context switches. >> > The assembly code for srcu_gp_start_if_needed is located at [3]: >> > c000000000226eb4: 78 6b aa 7d mr r10,r13 >> > c000000000226eb8: 14 42 29 7d add r9,r9,r8 >> > c000000000226ebc: ac 04 00 7c hwsync >> > c000000000226ec0: 10 00 7b 3b addi r27,r27,16 >> > c000000000226ec4: 14 da 29 7d add r9,r9,r27 >> > c000000000226ec8: a8 48 00 7d ldarx r8,0,r9 >> > c000000000226ecc: 01 00 08 31 addic r8,r8,1 >> > c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9 >> > c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 >> > <srcu_gp_start_if_needed+0x1c8> >> > c000000000226ed8: 28 00 21 e9 ld r9,40(r1) >> > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10) >> > c000000000226ee0: 79 52 29 7d xor. r9,r9,r10 >> > c000000000226ee4: 00 00 40 39 li r10,0 >> > c000000000226ee8: b8 03 82 40 bne c0000000002272a0 >> > <srcu_gp_start_if_needed+0x5a0> >> > by debugging, I see the r10 is assigned with r13 on c000000000226eb4, >> > but if there is a context-switch before c000000000226edc, a false >> > positive will be reported. >> > >> > [1] http://154.220.3.115/logs/0422/configformainline.txt >> > [2] 154.220.3.115/logs/0422/whilebash.sh >> > [3] http://154.220.3.115/logs/0422/srcu_gp_start_if_needed.txt >> > >> > My analysis and debugging may not be correct, but the bug is easily >> > reproducible. >> >> If this is a bug in the stack smashing protection as you seem to hint, >> I wonder if you see the issue with a specific gcc version and is a >> compiler-specific issue. It's hard to say, but considering this I > > Very likely, more asm code from Zhouyi's link: > > This is the __srcu_read_unlock_nmisafe(), since "hwsync" is > smp_mb__{after,before}_atomic(), and the following code is first > barrier then atomic, so it's the unlock. > > c000000000226eb4: 78 6b aa 7d mr r10,r13 > > ^ r13 is the pointer to percpu data on PPC64 kernel, and it's also > the pointer to TLS data for userspace code. I've never understood why the compiler wants to make a copy of a register variable into another register!? >:# > c000000000226eb8: 14 42 29 7d add r9,r9,r8 > c000000000226ebc: ac 04 00 7c hwsync > c000000000226ec0: 10 00 7b 3b addi r27,r27,16 > c000000000226ec4: 14 da 29 7d add r9,r9,r27 > c000000000226ec8: a8 48 00 7d ldarx r8,0,r9 > c000000000226ecc: 01 00 08 31 addic r8,r8,1 > c000000000226ed0: ad 49 00 7d stdcx. r8,0,r9 > c000000000226ed4: f4 ff c2 40 bne- c000000000226ec8 <srcu_gp_start_if_needed+0x1c8> > c000000000226ed8: 28 00 21 e9 ld r9,40(r1) > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10) > > here I think that the compiler is using r10 as an alias to r13, since > for userspace program, it's safe to assume the TLS pointer doesn't > change. However this is not true for kernel percpu pointer. > > The real intention here is to compare 40(r1) vs 3192(r13) for stack > guard checking, however since r13 is the percpu pointer in kernel, so > the value of r13 can be changed if the thread gets scheduled to a > different CPU after reading r13 for r10. Yeah that's not good. > If I'm correct, the following should be a workaround: > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > index ab4ee58af84b..f5ae3be3d04d 100644 > --- a/kernel/rcu/srcutree.c > +++ b/kernel/rcu/srcutree.c > @@ -747,6 +747,7 @@ void __srcu_read_unlock_nmisafe(struct srcu_struct *ssp, int idx) > > smp_mb__before_atomic(); /* C */ /* Avoid leaking the critical section. */ > atomic_long_inc(&sdp->srcu_unlock_count[idx]); > + asm volatile("" : : : "r13", "memory"); > } > EXPORT_SYMBOL_GPL(__srcu_read_unlock_nmisafe); > > Zhouyi, could you give a try? Note I think the "memory" clobber here is > unnecesarry, but I just add it in case I'm wrong. > > Needless to say, the correct fix is to make ppc stack protector aware of > r13 is volatile. I suspect the compiler developers will tell us to go jump :) The problem of the compiler caching r13 has come up in the past, but I only remember it being "a worry" rather than causing an actual bug. We've had the DEBUG_PREEMPT checks in get_paca(), which have given us at least some comfort that if the compiler is caching r13, it shouldn't be doing it in preemptable regions. But obviously that doesn't help at all with the stack protector check. I don't see an easy fix. Adding "volatile" to the definition of local_paca seems to reduce but not elimate the caching of r13, and the GCC docs explicitly say *not* to use volatile. It also triggers lots of warnings about volatile being discarded. Short term we can make stack protector depend on !PREEMPT. Longer term possibly we can move to having current in a register like 32-bit does, and then use that as the stack protector reg. Or something simple I haven't thought of? :) cheers ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-24 13:14 ` Michael Ellerman @ 2023-04-24 15:13 ` Segher Boessenkool 2023-04-24 15:28 ` Boqun Feng 0 siblings, 1 reply; 41+ messages in thread From: Segher Boessenkool @ 2023-04-24 15:13 UTC (permalink / raw) To: Michael Ellerman Cc: Boqun Feng, Joel Fernandes, Zhouyi Zhou, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney Hi! On Mon, Apr 24, 2023 at 11:14:00PM +1000, Michael Ellerman wrote: > Boqun Feng <boqun.feng@gmail.com> writes: > > On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote: > >> On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > >> > by debugging, I see the r10 is assigned with r13 on c000000000226eb4, > >> > but if there is a context-switch before c000000000226edc, a false > >> > positive will be reported. > I've never understood why the compiler wants to make a copy of a > register variable into another register!? >:# It is usually because a) you told it to (maybe via an earlyclobber), or b) it looked cheaper. I don't see either here :-( > > here I think that the compiler is using r10 as an alias to r13, since > > for userspace program, it's safe to assume the TLS pointer doesn't > > change. However this is not true for kernel percpu pointer. r13 is a "fixed" register, but that means it has a fixed purpose (so not available for allocation), it does not mean "unchanging". > > The real intention here is to compare 40(r1) vs 3192(r13) for stack > > guard checking, however since r13 is the percpu pointer in kernel, so > > the value of r13 can be changed if the thread gets scheduled to a > > different CPU after reading r13 for r10. > > Yeah that's not good. The GCC pattern here makes the four machine insns all stay together. That is to make sure to not leak any secret value, which is impossible to guarantee otherwise. What tells GCC r13 can randomly change behind its back? And, what then makes GCC ignore that fact? > > + asm volatile("" : : : "r13", "memory"); Any asm without output is always volatile. > > Needless to say, the correct fix is to make ppc stack protector aware of > > r13 is volatile. > > I suspect the compiler developers will tell us to go jump :) Why would r13 change over the course of *this* function / this macro, why can this not happen anywhere else? > The problem of the compiler caching r13 has come up in the past, but I > only remember it being "a worry" rather than causing an actual bug. In most cases the compiler is smart enough to use r13 directly, instead of copying it to another reg and then using that one. But not here for some strange reason. That of course is a very minor generated machine code quality bug and nothing more :-( > We've had the DEBUG_PREEMPT checks in get_paca(), which have given us at > least some comfort that if the compiler is caching r13, it shouldn't be > doing it in preemptable regions. > > But obviously that doesn't help at all with the stack protector check. > > I don't see an easy fix. > > Adding "volatile" to the definition of local_paca seems to reduce but > not elimate the caching of r13, and the GCC docs explicitly say *not* to > use volatile. It also triggers lots of warnings about volatile being > discarded. The point here is to say some code clobbers r13, not the asm volatile? > Or something simple I haven't thought of? :) At what points can r13 change? Only when some particular functions are called? Segher ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-24 15:13 ` Segher Boessenkool @ 2023-04-24 15:28 ` Boqun Feng 2023-04-24 17:29 ` Segher Boessenkool 2023-04-24 18:55 ` Joel Fernandes 0 siblings, 2 replies; 41+ messages in thread From: Boqun Feng @ 2023-04-24 15:28 UTC (permalink / raw) To: Segher Boessenkool Cc: Michael Ellerman, Joel Fernandes, Zhouyi Zhou, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Mon, Apr 24, 2023 at 10:13:51AM -0500, Segher Boessenkool wrote: > Hi! > > On Mon, Apr 24, 2023 at 11:14:00PM +1000, Michael Ellerman wrote: > > Boqun Feng <boqun.feng@gmail.com> writes: > > > On Sat, Apr 22, 2023 at 09:28:39PM +0200, Joel Fernandes wrote: > > >> On Sat, Apr 22, 2023 at 2:47 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > >> > by debugging, I see the r10 is assigned with r13 on c000000000226eb4, > > >> > but if there is a context-switch before c000000000226edc, a false > > >> > positive will be reported. > > > I've never understood why the compiler wants to make a copy of a > > register variable into another register!? >:# > > It is usually because a) you told it to (maybe via an earlyclobber), or > b) it looked cheaper. I don't see either here :-( > > > > here I think that the compiler is using r10 as an alias to r13, since > > > for userspace program, it's safe to assume the TLS pointer doesn't > > > change. However this is not true for kernel percpu pointer. > > r13 is a "fixed" register, but that means it has a fixed purpose (so not > available for allocation), it does not mean "unchanging". > > > > The real intention here is to compare 40(r1) vs 3192(r13) for stack > > > guard checking, however since r13 is the percpu pointer in kernel, so > > > the value of r13 can be changed if the thread gets scheduled to a > > > different CPU after reading r13 for r10. > > > > Yeah that's not good. > > The GCC pattern here makes the four machine insns all stay together. > That is to make sure to not leak any secret value, which is impossible > to guarantee otherwise. > > What tells GCC r13 can randomly change behind its back? And, what then > makes GCC ignore that fact? > > > > + asm volatile("" : : : "r13", "memory"); > > Any asm without output is always volatile. > > > > Needless to say, the correct fix is to make ppc stack protector aware of > > > r13 is volatile. > > > > I suspect the compiler developers will tell us to go jump :) > > Why would r13 change over the course of *this* function / this macro, > why can this not happen anywhere else? > > > The problem of the compiler caching r13 has come up in the past, but I > > only remember it being "a worry" rather than causing an actual bug. > > In most cases the compiler is smart enough to use r13 directly, instead > of copying it to another reg and then using that one. But not here for > some strange reason. That of course is a very minor generated machine > code quality bug and nothing more :-( > > > We've had the DEBUG_PREEMPT checks in get_paca(), which have given us at > > least some comfort that if the compiler is caching r13, it shouldn't be > > doing it in preemptable regions. > > > > But obviously that doesn't help at all with the stack protector check. > > > > I don't see an easy fix. > > > > Adding "volatile" to the definition of local_paca seems to reduce but > > not elimate the caching of r13, and the GCC docs explicitly say *not* to > > use volatile. It also triggers lots of warnings about volatile being > > discarded. > > The point here is to say some code clobbers r13, not the asm volatile? > > > Or something simple I haven't thought of? :) > > At what points can r13 change? Only when some particular functions are > called? > r13 is the local paca: register struct paca_struct *local_paca asm("r13"); , which is a pointer to percpu data. So if a task schedule from one CPU to anotehr CPU, the value gets changed. Regards, Boqun > > Segher ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-24 15:28 ` Boqun Feng @ 2023-04-24 17:29 ` Segher Boessenkool 2023-04-24 19:25 ` Boqun Feng 2023-04-24 18:55 ` Joel Fernandes 1 sibling, 1 reply; 41+ messages in thread From: Segher Boessenkool @ 2023-04-24 17:29 UTC (permalink / raw) To: Boqun Feng Cc: Michael Ellerman, Joel Fernandes, Zhouyi Zhou, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Mon, Apr 24, 2023 at 08:28:55AM -0700, Boqun Feng wrote: > On Mon, Apr 24, 2023 at 10:13:51AM -0500, Segher Boessenkool wrote: > > At what points can r13 change? Only when some particular functions are > > called? > > r13 is the local paca: > > register struct paca_struct *local_paca asm("r13"); > > , which is a pointer to percpu data. Yes, it is a global register variable. > So if a task schedule from one CPU to anotehr CPU, the value gets > changed. But the compiler does not see that something else changes local_paca (or r13 some other way, via assembler code perhaps)? Or is there a compiler bug? If the latter is true: Can you make a reproducer and open a GCC PR? <https://gcc.gnu.org/bugs/> for how to get started doing that. We need *exact* code that shows the problem, together with a compiler command line. So that we can reproduce the problem. That is step 0 in figuring out what is going on, and then maybe fixing the problem :-) Segher ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-24 17:29 ` Segher Boessenkool @ 2023-04-24 19:25 ` Boqun Feng 0 siblings, 0 replies; 41+ messages in thread From: Boqun Feng @ 2023-04-24 19:25 UTC (permalink / raw) To: Segher Boessenkool Cc: Michael Ellerman, Joel Fernandes, Zhouyi Zhou, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Mon, Apr 24, 2023 at 12:29:00PM -0500, Segher Boessenkool wrote: > On Mon, Apr 24, 2023 at 08:28:55AM -0700, Boqun Feng wrote: > > On Mon, Apr 24, 2023 at 10:13:51AM -0500, Segher Boessenkool wrote: > > > At what points can r13 change? Only when some particular functions are > > > called? > > > > r13 is the local paca: > > > > register struct paca_struct *local_paca asm("r13"); > > > > , which is a pointer to percpu data. > > Yes, it is a global register variable. > > > So if a task schedule from one CPU to anotehr CPU, the value gets > > changed. > > But the compiler does not see that something else changes local_paca (or It's more like this, however, in this case r13 is not changed: CPU 0 CPU 1 {r13 = 0x00} {r13 = 0x04} <thread 1> <in interrupt> _switch(): <switch to the stack of thread 2> <no need to change r13> <in thread 2> <thread 2> <thread 3> _switch(): <switch to the stack of thread 1> <no need to change r13> <in thread 1> <thread 1> as you can see thread 1 schedules from CPU 0 to CPU 1 and neither CPU changes its r13, but in the point of view for thread 1, its r13 changes. > r13 some other way, via assembler code perhaps)? Or is there a compiler > bug? > This looks to me a compiler bug, but I'm not 100% sure. Regards, Boqun > If the latter is true: > > Can you make a reproducer and open a GCC PR? <https://gcc.gnu.org/bugs/> > for how to get started doing that. We need *exact* code that shows the > problem, together with a compiler command line. So that we can > reproduce the problem. That is step 0 in figuring out what is going on, > and then maybe fixing the problem :-) > > > Segher ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-24 15:28 ` Boqun Feng 2023-04-24 17:29 ` Segher Boessenkool @ 2023-04-24 18:55 ` Joel Fernandes 2023-04-25 10:13 ` Peter Zijlstra 1 sibling, 1 reply; 41+ messages in thread From: Joel Fernandes @ 2023-04-24 18:55 UTC (permalink / raw) To: Boqun Feng, Peter Zijlstra Cc: Segher Boessenkool, Michael Ellerman, Zhouyi Zhou, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney This is amazing debugging Boqun, like a boss! One comment below: > > > Or something simple I haven't thought of? :) > > > > At what points can r13 change? Only when some particular functions are > > called? > > > > r13 is the local paca: > > register struct paca_struct *local_paca asm("r13"); > > , which is a pointer to percpu data. > > So if a task schedule from one CPU to anotehr CPU, the value gets > changed. It appears the whole issue, per your analysis, is that the stack checking code in gcc should not cache or alias r13, and must read its most up-to-date value during stack checking, as its value may have changed during a migration to a new CPU. Did I get that right? IMO, even without a reproducer, gcc on PPC should just not do that, that feels terribly broken for the kernel. I wonder what clang does, I'll go poke around with compilerexplorer after lunch. Adding +Peter Zijlstra as well to join the party as I have a feeling he'll be interested. ;-) thanks, - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-24 18:55 ` Joel Fernandes @ 2023-04-25 10:13 ` Peter Zijlstra 2023-04-25 10:58 ` Zhouyi Zhou 2023-04-25 10:59 ` Joel Fernandes 0 siblings, 2 replies; 41+ messages in thread From: Peter Zijlstra @ 2023-04-25 10:13 UTC (permalink / raw) To: Joel Fernandes Cc: Boqun Feng, Segher Boessenkool, Michael Ellerman, Zhouyi Zhou, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote: > This is amazing debugging Boqun, like a boss! One comment below: > > > > > Or something simple I haven't thought of? :) > > > > > > At what points can r13 change? Only when some particular functions are > > > called? > > > > > > > r13 is the local paca: > > > > register struct paca_struct *local_paca asm("r13"); > > > > , which is a pointer to percpu data. > > > > So if a task schedule from one CPU to anotehr CPU, the value gets > > changed. > > It appears the whole issue, per your analysis, is that the stack > checking code in gcc should not cache or alias r13, and must read its > most up-to-date value during stack checking, as its value may have > changed during a migration to a new CPU. > > Did I get that right? > > IMO, even without a reproducer, gcc on PPC should just not do that, > that feels terribly broken for the kernel. I wonder what clang does, > I'll go poke around with compilerexplorer after lunch. > > Adding +Peter Zijlstra as well to join the party as I have a feeling > he'll be interested. ;-) I'm a little confused; the way I understand the whole stack protector thing to work is that we push a canary on the stack at call and on return check it is still valid. Since in general tasks randomly migrate, the per-cpu validation canary should be the same on all CPUs. Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu, but no guarantees. Both cases use r13 (paca) in a racy manner, and in both cases it should be safe. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-25 10:13 ` Peter Zijlstra @ 2023-04-25 10:58 ` Zhouyi Zhou 2023-04-25 11:06 ` Joel Fernandes 2023-04-25 10:59 ` Joel Fernandes 1 sibling, 1 reply; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-25 10:58 UTC (permalink / raw) To: Peter Zijlstra Cc: Joel Fernandes, Boqun Feng, Segher Boessenkool, Michael Ellerman, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney hi On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote: > > On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote: > > This is amazing debugging Boqun, like a boss! One comment below: > > > > > > > Or something simple I haven't thought of? :) > > > > > > > > At what points can r13 change? Only when some particular functions are > > > > called? > > > > > > > > > > r13 is the local paca: > > > > > > register struct paca_struct *local_paca asm("r13"); > > > > > > , which is a pointer to percpu data. > > > > > > So if a task schedule from one CPU to anotehr CPU, the value gets > > > changed. > > > > It appears the whole issue, per your analysis, is that the stack > > checking code in gcc should not cache or alias r13, and must read its > > most up-to-date value during stack checking, as its value may have > > changed during a migration to a new CPU. > > > > Did I get that right? > > > > IMO, even without a reproducer, gcc on PPC should just not do that, > > that feels terribly broken for the kernel. I wonder what clang does, > > I'll go poke around with compilerexplorer after lunch. > > > > Adding +Peter Zijlstra as well to join the party as I have a feeling > > he'll be interested. ;-) > > I'm a little confused; the way I understand the whole stack protector > thing to work is that we push a canary on the stack at call and on > return check it is still valid. Since in general tasks randomly migrate, > the per-cpu validation canary should be the same on all CPUs. > > Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use > raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu, > but no guarantees. > > Both cases use r13 (paca) in a racy manner, and in both cases it should > be safe. New test results today: both gcc build from git (git clone git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0 are immune from the above issue. We can see the assembly code on http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt while Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler on my x86 laptop (gcc version 10.4.0) will reproduce the bug. ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-25 10:58 ` Zhouyi Zhou @ 2023-04-25 11:06 ` Joel Fernandes 2023-04-25 3:12 ` Zhouyi Zhou ` (2 more replies) 0 siblings, 3 replies; 41+ messages in thread From: Joel Fernandes @ 2023-04-25 11:06 UTC (permalink / raw) To: Zhouyi Zhou, Christophe Leroy Cc: Peter Zijlstra, Boqun Feng, Segher Boessenkool, Michael Ellerman, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > hi > > On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote: > > > This is amazing debugging Boqun, like a boss! One comment below: > > > > > > > > > Or something simple I haven't thought of? :) > > > > > > > > > > At what points can r13 change? Only when some particular functions are > > > > > called? > > > > > > > > > > > > > r13 is the local paca: > > > > > > > > register struct paca_struct *local_paca asm("r13"); > > > > > > > > , which is a pointer to percpu data. > > > > > > > > So if a task schedule from one CPU to anotehr CPU, the value gets > > > > changed. > > > > > > It appears the whole issue, per your analysis, is that the stack > > > checking code in gcc should not cache or alias r13, and must read its > > > most up-to-date value during stack checking, as its value may have > > > changed during a migration to a new CPU. > > > > > > Did I get that right? > > > > > > IMO, even without a reproducer, gcc on PPC should just not do that, > > > that feels terribly broken for the kernel. I wonder what clang does, > > > I'll go poke around with compilerexplorer after lunch. > > > > > > Adding +Peter Zijlstra as well to join the party as I have a feeling > > > he'll be interested. ;-) > > > > I'm a little confused; the way I understand the whole stack protector > > thing to work is that we push a canary on the stack at call and on > > return check it is still valid. Since in general tasks randomly migrate, > > the per-cpu validation canary should be the same on all CPUs. > > > > Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use > > raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu, > > but no guarantees. > > > > Both cases use r13 (paca) in a racy manner, and in both cases it should > > be safe. > New test results today: both gcc build from git (git clone > git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0 > are immune from the above issue. We can see the assembly code on > http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt > > while > Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler > on my x86 laptop (gcc version 10.4.0) will reproduce the bug. Do you know what fixes the issue? I would not declare victory yet. My feeling is something changes in timing, or compiler codegen which hides the issue. So the issue is still there but it is just a matter of time before someone else reports it. Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task canary? Michael, is this an optimization? Adding Christophe as well since it came in a few years ago via the following commit: commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7 Author: Christophe Leroy <christophe.leroy@c-s.fr> Date: Thu Sep 27 07:05:55 2018 +0000 powerpc/64: add stack protector support On PPC64, as register r13 points to the paca_struct at all time, this patch adds a copy of the canary there, which is copied at task_switch. That new canary is then used by using the following GCC options: -mstack-protector-guard=tls -mstack-protector-guard-reg=r13 -mstack-protector-guard-offset=offsetof(struct paca_struct, canary)) Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-25 11:06 ` Joel Fernandes @ 2023-04-25 3:12 ` Zhouyi Zhou 2023-04-25 13:40 ` Christophe Leroy 2023-04-26 12:29 ` Michael Ellerman 2 siblings, 0 replies; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-25 3:12 UTC (permalink / raw) To: Joel Fernandes Cc: Christophe Leroy, Peter Zijlstra, Boqun Feng, Segher Boessenkool, Michael Ellerman, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Tue, Apr 25, 2023 at 7:06 PM Joel Fernandes <joel@joelfernandes.org> wrote: > > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > > > hi > > > > On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote: > > > > This is amazing debugging Boqun, like a boss! One comment below: > > > > > > > > > > > Or something simple I haven't thought of? :) > > > > > > > > > > > > At what points can r13 change? Only when some particular functions are > > > > > > called? > > > > > > > > > > > > > > > > r13 is the local paca: > > > > > > > > > > register struct paca_struct *local_paca asm("r13"); > > > > > > > > > > , which is a pointer to percpu data. > > > > > > > > > > So if a task schedule from one CPU to anotehr CPU, the value gets > > > > > changed. > > > > > > > > It appears the whole issue, per your analysis, is that the stack > > > > checking code in gcc should not cache or alias r13, and must read its > > > > most up-to-date value during stack checking, as its value may have > > > > changed during a migration to a new CPU. > > > > > > > > Did I get that right? > > > > > > > > IMO, even without a reproducer, gcc on PPC should just not do that, > > > > that feels terribly broken for the kernel. I wonder what clang does, > > > > I'll go poke around with compilerexplorer after lunch. > > > > > > > > Adding +Peter Zijlstra as well to join the party as I have a feeling > > > > he'll be interested. ;-) > > > > > > I'm a little confused; the way I understand the whole stack protector > > > thing to work is that we push a canary on the stack at call and on > > > return check it is still valid. Since in general tasks randomly migrate, > > > the per-cpu validation canary should be the same on all CPUs. > > > > > > Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use > > > raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu, > > > but no guarantees. > > > > > > Both cases use r13 (paca) in a racy manner, and in both cases it should > > > be safe. > > New test results today: both gcc build from git (git clone > > git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0 > > are immune from the above issue. We can see the assembly code on > > http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt > > > > while > > Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler > > on my x86 laptop (gcc version 10.4.0) will reproduce the bug. > > Do you know what fixes the issue? I would not declare victory yet. My > feeling is something changes in timing, or compiler codegen which > hides the issue. So the issue is still there but it is just a matter > of time before someone else reports it. I am going to try bisect on GCC, hope we can find some clue. > > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task > canary? Michael, is this an optimization? Adding Christophe as well > since it came in a few years ago via the following commit: > > commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7 > Author: Christophe Leroy <christophe.leroy@c-s.fr> > Date: Thu Sep 27 07:05:55 2018 +0000 > > powerpc/64: add stack protector support > > On PPC64, as register r13 points to the paca_struct at all time, > this patch adds a copy of the canary there, which is copied at > task_switch. > That new canary is then used by using the following GCC options: > -mstack-protector-guard=tls > -mstack-protector-guard-reg=r13 > -mstack-protector-guard-offset=offsetof(struct paca_struct, canary)) > > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> > > - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-25 11:06 ` Joel Fernandes 2023-04-25 3:12 ` Zhouyi Zhou @ 2023-04-25 13:40 ` Christophe Leroy 2023-04-25 13:49 ` Zhouyi Zhou 2023-04-26 0:42 ` Joel Fernandes 2023-04-26 12:29 ` Michael Ellerman 2 siblings, 2 replies; 41+ messages in thread From: Christophe Leroy @ 2023-04-25 13:40 UTC (permalink / raw) To: Joel Fernandes, Zhouyi Zhou, Christophe Leroy Cc: Peter Zijlstra, Boqun Feng, Segher Boessenkool, Michael Ellerman, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney Le 25/04/2023 à 13:06, Joel Fernandes a écrit : > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: >> >> hi >> >> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote: >>> >>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote: >>>> This is amazing debugging Boqun, like a boss! One comment below: >>>> >>>>>>> Or something simple I haven't thought of? :) >>>>>> >>>>>> At what points can r13 change? Only when some particular functions are >>>>>> called? >>>>>> >>>>> >>>>> r13 is the local paca: >>>>> >>>>> register struct paca_struct *local_paca asm("r13"); >>>>> >>>>> , which is a pointer to percpu data. >>>>> >>>>> So if a task schedule from one CPU to anotehr CPU, the value gets >>>>> changed. >>>> >>>> It appears the whole issue, per your analysis, is that the stack >>>> checking code in gcc should not cache or alias r13, and must read its >>>> most up-to-date value during stack checking, as its value may have >>>> changed during a migration to a new CPU. >>>> >>>> Did I get that right? >>>> >>>> IMO, even without a reproducer, gcc on PPC should just not do that, >>>> that feels terribly broken for the kernel. I wonder what clang does, >>>> I'll go poke around with compilerexplorer after lunch. >>>> >>>> Adding +Peter Zijlstra as well to join the party as I have a feeling >>>> he'll be interested. ;-) >>> >>> I'm a little confused; the way I understand the whole stack protector >>> thing to work is that we push a canary on the stack at call and on >>> return check it is still valid. Since in general tasks randomly migrate, >>> the per-cpu validation canary should be the same on all CPUs. >>> >>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use >>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu, >>> but no guarantees. >>> >>> Both cases use r13 (paca) in a racy manner, and in both cases it should >>> be safe. >> New test results today: both gcc build from git (git clone >> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0 >> are immune from the above issue. We can see the assembly code on >> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt >> >> while >> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler >> on my x86 laptop (gcc version 10.4.0) will reproduce the bug. > > Do you know what fixes the issue? I would not declare victory yet. My > feeling is something changes in timing, or compiler codegen which > hides the issue. So the issue is still there but it is just a matter > of time before someone else reports it. > > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task > canary? Michael, is this an optimization? Adding Christophe as well > since it came in a few years ago via the following commit: It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed register pointing to 'current' at all time so the canary is copied into a per-cpu struct during _switch(). If GCC keeps an old value of the per-cpu struct pointer, it then gets the canary from the wrong CPU struct so from a different task. Christophe > > commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7 > Author: Christophe Leroy <christophe.leroy@c-s.fr> > Date: Thu Sep 27 07:05:55 2018 +0000 > > powerpc/64: add stack protector support > > On PPC64, as register r13 points to the paca_struct at all time, > this patch adds a copy of the canary there, which is copied at > task_switch. > That new canary is then used by using the following GCC options: > -mstack-protector-guard=tls > -mstack-protector-guard-reg=r13 > -mstack-protector-guard-offset=offsetof(struct paca_struct, canary)) > > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> > > - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-25 13:40 ` Christophe Leroy @ 2023-04-25 13:49 ` Zhouyi Zhou 2023-04-26 0:32 ` Joel Fernandes 2023-04-26 0:42 ` Joel Fernandes 1 sibling, 1 reply; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-25 13:49 UTC (permalink / raw) To: Christophe Leroy Cc: Joel Fernandes, Peter Zijlstra, Boqun Feng, Segher Boessenkool, Michael Ellerman, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney Hi On Tue, Apr 25, 2023 at 9:40 PM Christophe Leroy <christophe.leroy@csgroup.eu> wrote: > > > > Le 25/04/2023 à 13:06, Joel Fernandes a écrit : > > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > >> > >> hi > >> > >> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote: > >>> > >>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote: > >>>> This is amazing debugging Boqun, like a boss! One comment below: > >>>> > >>>>>>> Or something simple I haven't thought of? :) > >>>>>> > >>>>>> At what points can r13 change? Only when some particular functions are > >>>>>> called? > >>>>>> > >>>>> > >>>>> r13 is the local paca: > >>>>> > >>>>> register struct paca_struct *local_paca asm("r13"); > >>>>> > >>>>> , which is a pointer to percpu data. > >>>>> > >>>>> So if a task schedule from one CPU to anotehr CPU, the value gets > >>>>> changed. > >>>> > >>>> It appears the whole issue, per your analysis, is that the stack > >>>> checking code in gcc should not cache or alias r13, and must read its > >>>> most up-to-date value during stack checking, as its value may have > >>>> changed during a migration to a new CPU. > >>>> > >>>> Did I get that right? > >>>> > >>>> IMO, even without a reproducer, gcc on PPC should just not do that, > >>>> that feels terribly broken for the kernel. I wonder what clang does, > >>>> I'll go poke around with compilerexplorer after lunch. > >>>> > >>>> Adding +Peter Zijlstra as well to join the party as I have a feeling > >>>> he'll be interested. ;-) > >>> > >>> I'm a little confused; the way I understand the whole stack protector > >>> thing to work is that we push a canary on the stack at call and on > >>> return check it is still valid. Since in general tasks randomly migrate, > >>> the per-cpu validation canary should be the same on all CPUs. > >>> > >>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use > >>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu, > >>> but no guarantees. > >>> > >>> Both cases use r13 (paca) in a racy manner, and in both cases it should > >>> be safe. > >> New test results today: both gcc build from git (git clone > >> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0 > >> are immune from the above issue. We can see the assembly code on > >> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt > >> > >> while > >> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler > >> on my x86 laptop (gcc version 10.4.0) will reproduce the bug. > > > > Do you know what fixes the issue? I would not declare victory yet. My > > feeling is something changes in timing, or compiler codegen which > > hides the issue. So the issue is still there but it is just a matter > > of time before someone else reports it. > > > > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task > > canary? Michael, is this an optimization? Adding Christophe as well > > since it came in a few years ago via the following commit: > > It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed > register pointing to 'current' at all time so the canary is copied into > a per-cpu struct during _switch(). > > If GCC keeps an old value of the per-cpu struct pointer, it then gets > the canary from the wrong CPU struct so from a different task. This is a fruitful learning process for me! Christophe: Do you think there is still a need to bisect GCC ? If so, I am very glad to continue Cheers Zhouyi > > Christophe > > > > > commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7 > > Author: Christophe Leroy <christophe.leroy@c-s.fr> > > Date: Thu Sep 27 07:05:55 2018 +0000 > > > > powerpc/64: add stack protector support > > > > On PPC64, as register r13 points to the paca_struct at all time, > > this patch adds a copy of the canary there, which is copied at > > task_switch. > > That new canary is then used by using the following GCC options: > > -mstack-protector-guard=tls > > -mstack-protector-guard-reg=r13 > > -mstack-protector-guard-offset=offsetof(struct paca_struct, canary)) > > > > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> > > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> > > > > - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-25 13:49 ` Zhouyi Zhou @ 2023-04-26 0:32 ` Joel Fernandes 2023-04-26 1:31 ` Zhouyi Zhou 0 siblings, 1 reply; 41+ messages in thread From: Joel Fernandes @ 2023-04-26 0:32 UTC (permalink / raw) To: Zhouyi Zhou Cc: Christophe Leroy, Peter Zijlstra, Boqun Feng, Segher Boessenkool, Michael Ellerman, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Tue, Apr 25, 2023 at 9:50 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > Hi > > On Tue, Apr 25, 2023 at 9:40 PM Christophe Leroy > <christophe.leroy@csgroup.eu> wrote: > > > > > > > > Le 25/04/2023 à 13:06, Joel Fernandes a écrit : > > > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > >> > > >> hi > > >> > > >> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote: > > >>> > > >>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote: > > >>>> This is amazing debugging Boqun, like a boss! One comment below: > > >>>> > > >>>>>>> Or something simple I haven't thought of? :) > > >>>>>> > > >>>>>> At what points can r13 change? Only when some particular functions are > > >>>>>> called? > > >>>>>> > > >>>>> > > >>>>> r13 is the local paca: > > >>>>> > > >>>>> register struct paca_struct *local_paca asm("r13"); > > >>>>> > > >>>>> , which is a pointer to percpu data. > > >>>>> > > >>>>> So if a task schedule from one CPU to anotehr CPU, the value gets > > >>>>> changed. > > >>>> > > >>>> It appears the whole issue, per your analysis, is that the stack > > >>>> checking code in gcc should not cache or alias r13, and must read its > > >>>> most up-to-date value during stack checking, as its value may have > > >>>> changed during a migration to a new CPU. > > >>>> > > >>>> Did I get that right? > > >>>> > > >>>> IMO, even without a reproducer, gcc on PPC should just not do that, > > >>>> that feels terribly broken for the kernel. I wonder what clang does, > > >>>> I'll go poke around with compilerexplorer after lunch. > > >>>> > > >>>> Adding +Peter Zijlstra as well to join the party as I have a feeling > > >>>> he'll be interested. ;-) > > >>> > > >>> I'm a little confused; the way I understand the whole stack protector > > >>> thing to work is that we push a canary on the stack at call and on > > >>> return check it is still valid. Since in general tasks randomly migrate, > > >>> the per-cpu validation canary should be the same on all CPUs. > > >>> > > >>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use > > >>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu, > > >>> but no guarantees. > > >>> > > >>> Both cases use r13 (paca) in a racy manner, and in both cases it should > > >>> be safe. > > >> New test results today: both gcc build from git (git clone > > >> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0 > > >> are immune from the above issue. We can see the assembly code on > > >> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt > > >> > > >> while > > >> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler > > >> on my x86 laptop (gcc version 10.4.0) will reproduce the bug. > > > > > > Do you know what fixes the issue? I would not declare victory yet. My > > > feeling is something changes in timing, or compiler codegen which > > > hides the issue. So the issue is still there but it is just a matter > > > of time before someone else reports it. > > > > > > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task > > > canary? Michael, is this an optimization? Adding Christophe as well > > > since it came in a few years ago via the following commit: > > > > It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed > > register pointing to 'current' at all time so the canary is copied into > > a per-cpu struct during _switch(). > > > > If GCC keeps an old value of the per-cpu struct pointer, it then gets > > the canary from the wrong CPU struct so from a different task. > This is a fruitful learning process for me! Nice work Zhouyi.. > Christophe: > Do you think there is still a need to bisect GCC ? If so, I am very > glad to continue my 2 cents: It would be good to write a reproducer that Segher suggested (but that might be hard since you depend on the compiler to cache the r13 -- maybe some trial/error with CompilerExplorer will give you the magic recipe?). If I understand Christophe correctly, the issue requires the following ingredients: 1. Task A is running on CPU 1, and the task's canary is copied into the CPU1's per-cpu area pointed to by r13. 2. r13 is now cached into r10 in the offending function due to the compiler. 3. Task A running on CPU 1 now gets preempted right in the middle of the offending SRCU function and gets migrated to CPU 2. 4. CPU 2's per-cpu canary is updated to that of task A since task A is the current task now. 5. Task B now runs on CPU 1 and the per-cpu canary on CPU 1 is now that of B. 6. Task A exits the function, but stack checking code reads r10 which contains CPU 1's canary which is that of task B! 7. Boom. So the issue is precisely in #2. The issue is in the compiler that it does not treat r13 as volatile as Boqun had initially mentioned. - Joel > > Cheers > Zhouyi > > > > Christophe > > > > > > > > commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7 > > > Author: Christophe Leroy <christophe.leroy@c-s.fr> > > > Date: Thu Sep 27 07:05:55 2018 +0000 > > > > > > powerpc/64: add stack protector support > > > > > > On PPC64, as register r13 points to the paca_struct at all time, > > > this patch adds a copy of the canary there, which is copied at > > > task_switch. > > > That new canary is then used by using the following GCC options: > > > -mstack-protector-guard=tls > > > -mstack-protector-guard-reg=r13 > > > -mstack-protector-guard-offset=offsetof(struct paca_struct, canary)) > > > > > > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> > > > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> > > > > > > - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-26 0:32 ` Joel Fernandes @ 2023-04-26 1:31 ` Zhouyi Zhou 2023-04-26 2:15 ` Joel Fernandes 0 siblings, 1 reply; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-26 1:31 UTC (permalink / raw) To: Joel Fernandes Cc: Christophe Leroy, Peter Zijlstra, Boqun Feng, Segher Boessenkool, Michael Ellerman, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Wed, Apr 26, 2023 at 8:33 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > On Tue, Apr 25, 2023 at 9:50 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > > > Hi > > > > On Tue, Apr 25, 2023 at 9:40 PM Christophe Leroy > > <christophe.leroy@csgroup.eu> wrote: > > > > > > > > > > > > Le 25/04/2023 à 13:06, Joel Fernandes a écrit : > > > > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > > >> > > > >> hi > > > >> > > > >> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote: > > > >>> > > > >>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote: > > > >>>> This is amazing debugging Boqun, like a boss! One comment below: > > > >>>> > > > >>>>>>> Or something simple I haven't thought of? :) > > > >>>>>> > > > >>>>>> At what points can r13 change? Only when some particular functions are > > > >>>>>> called? > > > >>>>>> > > > >>>>> > > > >>>>> r13 is the local paca: > > > >>>>> > > > >>>>> register struct paca_struct *local_paca asm("r13"); > > > >>>>> > > > >>>>> , which is a pointer to percpu data. > > > >>>>> > > > >>>>> So if a task schedule from one CPU to anotehr CPU, the value gets > > > >>>>> changed. > > > >>>> > > > >>>> It appears the whole issue, per your analysis, is that the stack > > > >>>> checking code in gcc should not cache or alias r13, and must read its > > > >>>> most up-to-date value during stack checking, as its value may have > > > >>>> changed during a migration to a new CPU. > > > >>>> > > > >>>> Did I get that right? > > > >>>> > > > >>>> IMO, even without a reproducer, gcc on PPC should just not do that, > > > >>>> that feels terribly broken for the kernel. I wonder what clang does, > > > >>>> I'll go poke around with compilerexplorer after lunch. > > > >>>> > > > >>>> Adding +Peter Zijlstra as well to join the party as I have a feeling > > > >>>> he'll be interested. ;-) > > > >>> > > > >>> I'm a little confused; the way I understand the whole stack protector > > > >>> thing to work is that we push a canary on the stack at call and on > > > >>> return check it is still valid. Since in general tasks randomly migrate, > > > >>> the per-cpu validation canary should be the same on all CPUs. > > > >>> > > > >>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use > > > >>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu, > > > >>> but no guarantees. > > > >>> > > > >>> Both cases use r13 (paca) in a racy manner, and in both cases it should > > > >>> be safe. > > > >> New test results today: both gcc build from git (git clone > > > >> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0 > > > >> are immune from the above issue. We can see the assembly code on > > > >> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt > > > >> > > > >> while > > > >> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler > > > >> on my x86 laptop (gcc version 10.4.0) will reproduce the bug. > > > > > > > > Do you know what fixes the issue? I would not declare victory yet. My > > > > feeling is something changes in timing, or compiler codegen which > > > > hides the issue. So the issue is still there but it is just a matter > > > > of time before someone else reports it. > > > > > > > > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task > > > > canary? Michael, is this an optimization? Adding Christophe as well > > > > since it came in a few years ago via the following commit: > > > > > > It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed > > > register pointing to 'current' at all time so the canary is copied into > > > a per-cpu struct during _switch(). > > > > > > If GCC keeps an old value of the per-cpu struct pointer, it then gets > > > the canary from the wrong CPU struct so from a different task. > > This is a fruitful learning process for me! > > Nice work Zhouyi.. Thank Joel for your encouragement! Your encouragement is very important to me ;-) > > > Christophe: > > Do you think there is still a need to bisect GCC ? If so, I am very > > glad to continue > > my 2 cents: It would be good to write a reproducer that Segher > suggested (but that might be hard since you depend on the compiler to > cache the r13 -- maybe some trial/error with CompilerExplorer will > give you the magic recipe?). I have reported to GCC bugzilla once before ;-) [1] [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88348 I think we could provide a preprocessed .i file, and give the command line that invokes cc1, The problem is the newest GCC is immune to our issue ;-( > > If I understand Christophe correctly, the issue requires the following > ingredients: > 1. Task A is running on CPU 1, and the task's canary is copied into > the CPU1's per-cpu area pointed to by r13. > 2. r13 is now cached into r10 in the offending function due to the compiler. > 3. Task A running on CPU 1 now gets preempted right in the middle of > the offending SRCU function and gets migrated to CPU 2. > 4. CPU 2's per-cpu canary is updated to that of task A since task A > is the current task now. > 5. Task B now runs on CPU 1 and the per-cpu canary on CPU 1 is now that of B. > 6. Task A exits the function, but stack checking code reads r10 which > contains CPU 1's canary which is that of task B! > 7. Boom. Joel makes the learning process easier for me, indeed! One question I have tried very hard to understand, but still confused. for now, I know r13 is fixed, but r1 is not, why "r9,40(r1)"'s 40(r1) can be assumed to be equal to 3192(r10). Thanks in advance. > > So the issue is precisely in #2. The issue is in the compiler that it > does not treat r13 as volatile as Boqun had initially mentioned. Please do not hesitate to email me if there is anything I can do (for example bisecting ;-)). I am very glad to be of help ;-) Cheers Zhouyi > > - Joel > > > > > > > Cheers > > Zhouyi > > > > > > Christophe > > > > > > > > > > > commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7 > > > > Author: Christophe Leroy <christophe.leroy@c-s.fr> > > > > Date: Thu Sep 27 07:05:55 2018 +0000 > > > > > > > > powerpc/64: add stack protector support > > > > > > > > On PPC64, as register r13 points to the paca_struct at all time, > > > > this patch adds a copy of the canary there, which is copied at > > > > task_switch. > > > > That new canary is then used by using the following GCC options: > > > > -mstack-protector-guard=tls > > > > -mstack-protector-guard-reg=r13 > > > > -mstack-protector-guard-offset=offsetof(struct paca_struct, canary)) > > > > > > > > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> > > > > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> > > > > > > > > - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-26 1:31 ` Zhouyi Zhou @ 2023-04-26 2:15 ` Joel Fernandes 2023-04-26 2:37 ` Zhouyi Zhou 0 siblings, 1 reply; 41+ messages in thread From: Joel Fernandes @ 2023-04-26 2:15 UTC (permalink / raw) To: Zhouyi Zhou Cc: Christophe Leroy, Peter Zijlstra, Boqun Feng, Segher Boessenkool, Michael Ellerman, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney Hi Zhouyi, On Wed, Apr 26, 2023 at 09:31:17AM +0800, Zhouyi Zhou wrote: [..] > Joel makes the learning process easier for me, indeed! I know that feeling being a learner myself ;-) > One question I have tried very hard to understand, but still confused. > for now, I know > r13 is fixed, but r1 is not, why "r9,40(r1)"'s 40(r1) can be assumed > to be equal to 3192(r10). First you have to I guess read up a bit about stack canaries. Google for "gcc stack protector" and "gcc stack canaries", and the look for basics of "buffer overflow attacks". That'll explain the concept of stack guards etc (Sorry if this is too obvious but I did not know how much you knew about it already). 40(r1) is where the canary was stored. In the beginning of the function, you have this: c000000000226d58: 78 0c 2d e9 ld r9,3192(r13) c000000000226d5c: 28 00 21 f9 std r9,40(r1) r1 is your stack pointer. 3192(r13) is the canary value. 40(r1) is where the canary is stored for later comparison. r1 should not change through out the function I believe, because otherwise you don't know where the stack frame is, right? Later you have this stuff before the function returns which gcc presumably did due to optimization. That mr means move register and is where the caching of r13 to r10 happens that Boqun pointed. c000000000226eb4: 78 6b aa 7d mr r10,r13 [...] and then the canary comparison happens: c000000000226ed8: 28 00 21 e9 ld r9,40(r1) c000000000226edc: 78 0c 4a e9 ld r10,3192(r10) c000000000226ee0: 79 52 29 7d xor. r9,r9,r10 c000000000226ee4: 00 00 40 39 li r10,0 c000000000226ee8: b8 03 82 40 bne c0000000002272a0 <srcu_gp_start_if_needed+0x5a0> So looks like for this to blow up, the preemption/migration has to happen precisely between the mr doing the caching, and the xor doing the comparison, since that's when the r10 will be stale. thanks, - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-26 2:15 ` Joel Fernandes @ 2023-04-26 2:37 ` Zhouyi Zhou 0 siblings, 0 replies; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-26 2:37 UTC (permalink / raw) To: Joel Fernandes Cc: Christophe Leroy, Peter Zijlstra, Boqun Feng, Segher Boessenkool, Michael Ellerman, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Wed, Apr 26, 2023 at 10:15 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > Hi Zhouyi, > > On Wed, Apr 26, 2023 at 09:31:17AM +0800, Zhouyi Zhou wrote: > [..] > > Joel makes the learning process easier for me, indeed! > > I know that feeling being a learner myself ;-) > > > One question I have tried very hard to understand, but still confused. > > for now, I know > > r13 is fixed, but r1 is not, why "r9,40(r1)"'s 40(r1) can be assumed > > to be equal to 3192(r10). > > First you have to I guess read up a bit about stack canaries. Google for > "gcc stack protector" and "gcc stack canaries", and the look for basics of > "buffer overflow attacks". That'll explain the concept of stack guards etc > (Sorry if this is too obvious but I did not know how much you knew about it > already). > > 40(r1) is where the canary was stored. In the beginning of the function, you > have this: > > c000000000226d58: 78 0c 2d e9 ld r9,3192(r13) > c000000000226d5c: 28 00 21 f9 std r9,40(r1) > > r1 is your stack pointer. 3192(r13) is the canary value. > > 40(r1) is where the canary is stored for later comparison. > > r1 should not change through out the function I believe, because otherwise > you don't know where the stack frame is, right? Thanks Joel's awesome explanation. I can understand the mechanics behind our situation now!! 40(r1) is where the canary is stored for later comparison, this is located on the stack. while 3192(r13) is inside the cpu's PACA. I quote Christophe's note here "in order to be able to have the canary as an offset of a fixed register as expected by GCC, we copy the task canary into the cpu's PACA struct during _switch(): addi r6,r4,-THREAD /* Convert THREAD to 'current' */ std r6,PACACURRENT(r13) /* Set new 'current' */ #if defined(CONFIG_STACKPROTECTOR) ld r6, TASK_CANARY(r6) std r6, PACA_CANARY(r13) #endif " > > Later you have this stuff before the function returns which gcc presumably > did due to optimization. That mr means move register and is where the caching > of r13 to r10 happens that Boqun pointed. Thank Boqun and all others' wonderful debugging! Your work confirmed my bug report ;-) > > c000000000226eb4: 78 6b aa 7d mr r10,r13 > [...] > and then the canary comparison happens: > > c000000000226ed8: 28 00 21 e9 ld r9,40(r1) > c000000000226edc: 78 0c 4a e9 ld r10,3192(r10) 3192(r13) is correct because "we copy the task canary into the cpu's PACA struct during _switch():" but 3192(r10) is not correct, because r10 is the old value of r13. > c000000000226ee0: 79 52 29 7d xor. r9,r9,r10 > c000000000226ee4: 00 00 40 39 li r10,0 > c000000000226ee8: b8 03 82 40 bne c0000000002272a0 <srcu_gp_start_if_needed+0x5a0> > > So looks like for this to blow up, the preemption/migration has to happen > precisely between the mr doing the caching, and the xor doing the comparison, > since that's when the r10 will be stale. Thank Joel and all others for your time ;-) I benefit a lot, and am very glad to do more good work to the community in return ;-) Cheers Zhouyi > > thanks, > > - Joel > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-25 13:40 ` Christophe Leroy 2023-04-25 13:49 ` Zhouyi Zhou @ 2023-04-26 0:42 ` Joel Fernandes 1 sibling, 0 replies; 41+ messages in thread From: Joel Fernandes @ 2023-04-26 0:42 UTC (permalink / raw) To: Christophe Leroy Cc: Zhouyi Zhou, Peter Zijlstra, Boqun Feng, Segher Boessenkool, Michael Ellerman, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Tue, Apr 25, 2023 at 9:40 AM Christophe Leroy <christophe.leroy@csgroup.eu> wrote: > > > > Le 25/04/2023 à 13:06, Joel Fernandes a écrit : > > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > >> > >> hi > >> > >> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@infradead.org> wrote: > >>> > >>> On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote: > >>>> This is amazing debugging Boqun, like a boss! One comment below: > >>>> > >>>>>>> Or something simple I haven't thought of? :) > >>>>>> > >>>>>> At what points can r13 change? Only when some particular functions are > >>>>>> called? > >>>>>> > >>>>> > >>>>> r13 is the local paca: > >>>>> > >>>>> register struct paca_struct *local_paca asm("r13"); > >>>>> > >>>>> , which is a pointer to percpu data. > >>>>> > >>>>> So if a task schedule from one CPU to anotehr CPU, the value gets > >>>>> changed. > >>>> > >>>> It appears the whole issue, per your analysis, is that the stack > >>>> checking code in gcc should not cache or alias r13, and must read its > >>>> most up-to-date value during stack checking, as its value may have > >>>> changed during a migration to a new CPU. > >>>> > >>>> Did I get that right? > >>>> > >>>> IMO, even without a reproducer, gcc on PPC should just not do that, > >>>> that feels terribly broken for the kernel. I wonder what clang does, > >>>> I'll go poke around with compilerexplorer after lunch. > >>>> > >>>> Adding +Peter Zijlstra as well to join the party as I have a feeling > >>>> he'll be interested. ;-) > >>> > >>> I'm a little confused; the way I understand the whole stack protector > >>> thing to work is that we push a canary on the stack at call and on > >>> return check it is still valid. Since in general tasks randomly migrate, > >>> the per-cpu validation canary should be the same on all CPUs. > >>> > >>> Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use > >>> raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu, > >>> but no guarantees. > >>> > >>> Both cases use r13 (paca) in a racy manner, and in both cases it should > >>> be safe. > >> New test results today: both gcc build from git (git clone > >> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04 gcc-12.1.0 > >> are immune from the above issue. We can see the assembly code on > >> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt > >> > >> while > >> Both native gcc on PPC vm (gcc version 9.4.0), and gcc cross compiler > >> on my x86 laptop (gcc version 10.4.0) will reproduce the bug. > > > > Do you know what fixes the issue? I would not declare victory yet. My > > feeling is something changes in timing, or compiler codegen which > > hides the issue. So the issue is still there but it is just a matter > > of time before someone else reports it. > > > > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task > > canary? Michael, is this an optimization? Adding Christophe as well > > since it came in a few years ago via the following commit: > > It uses per-task canary. But unlike PPC32, PPC64 doesn't have a fixed > register pointing to 'current' at all time so the canary is copied into > a per-cpu struct during _switch(). > > If GCC keeps an old value of the per-cpu struct pointer, it then gets > the canary from the wrong CPU struct so from a different task. Thanks a lot Christophe, that makes sense. Segher, are you convinced that it is a compiler issue or is there still some doubt? Could you modify gcc's stack checker to not optimize away r13 reads or is that already the case in newer gcc? thanks, - Joel > > Christophe > > > > > commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7 > > Author: Christophe Leroy <christophe.leroy@c-s.fr> > > Date: Thu Sep 27 07:05:55 2018 +0000 > > > > powerpc/64: add stack protector support > > > > On PPC64, as register r13 points to the paca_struct at all time, > > this patch adds a copy of the canary there, which is copied at > > task_switch. > > That new canary is then used by using the following GCC options: > > -mstack-protector-guard=tls > > -mstack-protector-guard-reg=r13 > > -mstack-protector-guard-offset=offsetof(struct paca_struct, canary)) > > > > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> > > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> > > > > - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-25 11:06 ` Joel Fernandes 2023-04-25 3:12 ` Zhouyi Zhou 2023-04-25 13:40 ` Christophe Leroy @ 2023-04-26 12:29 ` Michael Ellerman 2023-04-26 13:44 ` Joel Fernandes 2023-04-28 10:35 ` Christophe Leroy 2 siblings, 2 replies; 41+ messages in thread From: Michael Ellerman @ 2023-04-26 12:29 UTC (permalink / raw) To: Joel Fernandes, Zhouyi Zhou, Christophe Leroy Cc: Peter Zijlstra, Boqun Feng, Segher Boessenkool, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney Joel Fernandes <joel@joelfernandes.org> writes: > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: ... > > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task > canary? Michael, is this an optimization? Adding Christophe as well > since it came in a few years ago via the following commit: I think Christophe also answered these in his reply. We do use a per-task canary, but because we don't have "current" in a register, we can't use the value in current for GCC. In one of my replies I said a possible solution would be to keep current in a register on 64-bit, but we'd need to do that in addition to the paca, so that would consume another GPR which we'd need to think hard about. There's another reason to have it in the paca, which is that the paca is always accessible, even when the MMU is off, whereas current isn't (in some situations). In general we don't want to use stack protector in code that runs with the MMU off, but if the canary wasn't in the paca then we'd have a hard requirement to not use stack protector in that code. cheers ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-26 12:29 ` Michael Ellerman @ 2023-04-26 13:44 ` Joel Fernandes 2023-04-26 14:20 ` Peter Zijlstra 2023-04-28 10:35 ` Christophe Leroy 1 sibling, 1 reply; 41+ messages in thread From: Joel Fernandes @ 2023-04-26 13:44 UTC (permalink / raw) To: Michael Ellerman Cc: Zhouyi Zhou, Christophe Leroy, Peter Zijlstra, Boqun Feng, Segher Boessenkool, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Wed, Apr 26, 2023 at 8:30 AM Michael Ellerman <mpe@ellerman.id.au> wrote: > > Joel Fernandes <joel@joelfernandes.org> writes: > > On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > ... > > > > Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task > > canary? Michael, is this an optimization? Adding Christophe as well > > since it came in a few years ago via the following commit: > > I think Christophe also answered these in his reply. > > We do use a per-task canary, but because we don't have "current" in a > register, we can't use the value in current for GCC. > > In one of my replies I said a possible solution would be to keep current > in a register on 64-bit, but we'd need to do that in addition to the > paca, so that would consume another GPR which we'd need to think hard > about. Makes sense. I'd think it is not worth allocating a separate GPR just for this, and may cause similar register optimization issues as well. > There's another reason to have it in the paca, which is that the paca is > always accessible, even when the MMU is off, whereas current isn't (in > some situations). > > In general we don't want to use stack protector in code that runs with > the MMU off, but if the canary wasn't in the paca then we'd have a hard > requirement to not use stack protector in that code. How could you control which code paths don't have the stack protector? Just curious. thanks, - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-26 13:44 ` Joel Fernandes @ 2023-04-26 14:20 ` Peter Zijlstra 2023-04-26 14:45 ` Michael Ellerman 0 siblings, 1 reply; 41+ messages in thread From: Peter Zijlstra @ 2023-04-26 14:20 UTC (permalink / raw) To: Joel Fernandes Cc: Michael Ellerman, Zhouyi Zhou, Christophe Leroy, Boqun Feng, Segher Boessenkool, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Wed, Apr 26, 2023 at 09:44:22AM -0400, Joel Fernandes wrote: > How could you control which code paths don't have the stack protector? > Just curious. https://lkml.kernel.org/r/20230412-no_stackp-v2-0-116f9fe4bbe7@google.com ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-26 14:20 ` Peter Zijlstra @ 2023-04-26 14:45 ` Michael Ellerman 0 siblings, 0 replies; 41+ messages in thread From: Michael Ellerman @ 2023-04-26 14:45 UTC (permalink / raw) To: Peter Zijlstra, Joel Fernandes Cc: Zhouyi Zhou, Christophe Leroy, Boqun Feng, Segher Boessenkool, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney Peter Zijlstra <peterz@infradead.org> writes: > On Wed, Apr 26, 2023 at 09:44:22AM -0400, Joel Fernandes wrote: > >> How could you control which code paths don't have the stack protector? >> Just curious. > > https://lkml.kernel.org/r/20230412-no_stackp-v2-0-116f9fe4bbe7@google.com We also have some entire files disabled, eg: $ git grep no-stack-protector arch/powerpc/ arch/powerpc/kernel/Makefile:CFLAGS_prom_init.o += -fno-stack-protector arch/powerpc/kernel/Makefile:CFLAGS_syscall.o += -fno-stack-protector arch/powerpc/kernel/Makefile:CFLAGS_setup_64.o += -fno-stack-protector arch/powerpc/kernel/Makefile:CFLAGS_paca.o += -fno-stack-protector cheers ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-26 12:29 ` Michael Ellerman 2023-04-26 13:44 ` Joel Fernandes @ 2023-04-28 10:35 ` Christophe Leroy 1 sibling, 0 replies; 41+ messages in thread From: Christophe Leroy @ 2023-04-28 10:35 UTC (permalink / raw) To: Michael Ellerman, Joel Fernandes, Zhouyi Zhou, Christophe Leroy Cc: Peter Zijlstra, Boqun Feng, Segher Boessenkool, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney Le 26/04/2023 à 14:29, Michael Ellerman a écrit : > Joel Fernandes <joel@joelfernandes.org> writes: >> On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > ... >> >> Out of curiosity for PPC folks, why cannot 64-bit PPC use per-task >> canary? Michael, is this an optimization? Adding Christophe as well >> since it came in a few years ago via the following commit: > > I think Christophe also answered these in his reply. > > We do use a per-task canary, but because we don't have "current" in a > register, we can't use the value in current for GCC. > > In one of my replies I said a possible solution would be to keep current > in a register on 64-bit, but we'd need to do that in addition to the > paca, so that would consume another GPR which we'd need to think hard > about. An analysis was done here https://github.com/linuxppc/issues/issues/45 showing that r14 is very little used. > > There's another reason to have it in the paca, which is that the paca is > always accessible, even when the MMU is off, whereas current isn't (in > some situations). Even now that powerpc is converted to CONFIG_THREAD_INFO_IN_TASK ? Christophe ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-25 10:13 ` Peter Zijlstra 2023-04-25 10:58 ` Zhouyi Zhou @ 2023-04-25 10:59 ` Joel Fernandes 2023-04-25 11:53 ` Peter Zijlstra 1 sibling, 1 reply; 41+ messages in thread From: Joel Fernandes @ 2023-04-25 10:59 UTC (permalink / raw) To: Peter Zijlstra Cc: Boqun Feng, Segher Boessenkool, Michael Ellerman, Zhouyi Zhou, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Tue, Apr 25, 2023 at 6:13 AM Peter Zijlstra <peterz@infradead.org> wrote: > > On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote: > > This is amazing debugging Boqun, like a boss! One comment below: > > > > > > > Or something simple I haven't thought of? :) > > > > > > > > At what points can r13 change? Only when some particular functions are > > > > called? > > > > > > > > > > r13 is the local paca: > > > > > > register struct paca_struct *local_paca asm("r13"); > > > > > > , which is a pointer to percpu data. > > > > > > So if a task schedule from one CPU to anotehr CPU, the value gets > > > changed. > > > > It appears the whole issue, per your analysis, is that the stack > > checking code in gcc should not cache or alias r13, and must read its > > most up-to-date value during stack checking, as its value may have > > changed during a migration to a new CPU. > > > > Did I get that right? > > > > IMO, even without a reproducer, gcc on PPC should just not do that, > > that feels terribly broken for the kernel. I wonder what clang does, > > I'll go poke around with compilerexplorer after lunch. > > > > Adding +Peter Zijlstra as well to join the party as I have a feeling > > he'll be interested. ;-) > > I'm a little confused; the way I understand the whole stack protector > thing to work is that we push a canary on the stack at call and on > return check it is still valid. Since in general tasks randomly migrate, > the per-cpu validation canary should be the same on all CPUs. > > Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use > raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu, > but no guarantees. > > Both cases use r13 (paca) in a racy manner, and in both cases it should > be safe. AFAICS, the canary is randomly chosen both in the kernel [1]. This also appears to be the case in glibc. That makes sense because you don't want the canary to be something that the attacker can easily predict and store on the stack to bypass buffer overflow attacks: [1] kernel : /* * Initialize the stackprotector canary value. * * NOTE: this must only be called from functions that never return, * and it must always be inlined. */ static __always_inline void boot_init_stack_canary(void) { unsigned long canary = get_random_canary(); current->stack_canary = canary; #ifdef CONFIG_PPC64 get_paca()->canary = canary; #endif } thanks, - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-25 10:59 ` Joel Fernandes @ 2023-04-25 11:53 ` Peter Zijlstra 2023-04-25 13:36 ` Christophe Leroy 0 siblings, 1 reply; 41+ messages in thread From: Peter Zijlstra @ 2023-04-25 11:53 UTC (permalink / raw) To: Joel Fernandes Cc: Boqun Feng, Segher Boessenkool, Michael Ellerman, Zhouyi Zhou, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Tue, Apr 25, 2023 at 06:59:29AM -0400, Joel Fernandes wrote: > > I'm a little confused; the way I understand the whole stack protector > > thing to work is that we push a canary on the stack at call and on > > return check it is still valid. Since in general tasks randomly migrate, > > the per-cpu validation canary should be the same on all CPUs. > AFAICS, the canary is randomly chosen both in the kernel [1]. This Yes, at boot, once. But thereafter it should be the same for all CPUs. > also appears to be the case in glibc. That makes sense because you > don't want the canary to be something that the attacker can easily > predict and store on the stack to bypass buffer overflow attacks: > > [1] kernel : > /* > * Initialize the stackprotector canary value. > * > * NOTE: this must only be called from functions that never return, > * and it must always be inlined. > */ > static __always_inline void boot_init_stack_canary(void) > { > unsigned long canary = get_random_canary(); > > current->stack_canary = canary; > #ifdef CONFIG_PPC64 > get_paca()->canary = canary; > #endif > } > > thanks, > > - Joel ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-25 11:53 ` Peter Zijlstra @ 2023-04-25 13:36 ` Christophe Leroy 0 siblings, 0 replies; 41+ messages in thread From: Christophe Leroy @ 2023-04-25 13:36 UTC (permalink / raw) To: Peter Zijlstra, Joel Fernandes Cc: Paul E. McKenney, Boqun Feng, linux-kernel, rcu, lance, Zhouyi Zhou, linuxppc-dev Le 25/04/2023 à 13:53, Peter Zijlstra a écrit : > On Tue, Apr 25, 2023 at 06:59:29AM -0400, Joel Fernandes wrote: >>> I'm a little confused; the way I understand the whole stack protector >>> thing to work is that we push a canary on the stack at call and on >>> return check it is still valid. Since in general tasks randomly migrate, >>> the per-cpu validation canary should be the same on all CPUs. > >> AFAICS, the canary is randomly chosen both in the kernel [1]. This > > Yes, at boot, once. But thereafter it should be the same for all CPUs. Each task has its own canary, stored in task struct : kernel/fork.c:1012: tsk->stack_canary = get_random_canary(); On PPC32 we have register 'r2' that points to task struct at all time, so GCC is instructed to find canary at an offset from r2. But on PPC64 we have no such register. Instead we have r13 that points to the PACA struct which is a per-cpu structure, and we have a pointer to 'current' task struct in the PACA struct. So in order to be able to have the canary as an offset of a fixed register as expected by GCC, we copy the task canary into the cpu's PACA struct during _switch(): addi r6,r4,-THREAD /* Convert THREAD to 'current' */ std r6,PACACURRENT(r13) /* Set new 'current' */ #if defined(CONFIG_STACKPROTECTOR) ld r6, TASK_CANARY(r6) std r6, PACA_CANARY(r13) #endif The problem is that r13 will change if a task is switched to another CPU. But if GCC is using a copy of an older value of r13, then it will take the canary from another CPU's PACA struct hence it'll get the canary of the task running on that CPU instead of getting the canary of the current task running on the current CPU. Christophe ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-22 12:46 BUG : PowerPC RCU: torture test failed with __stack_chk_fail Zhouyi Zhou 2023-04-22 19:19 ` Joel Fernandes 2023-04-22 19:28 ` Joel Fernandes @ 2023-04-24 22:07 ` Michael Ellerman 2023-04-24 22:13 ` Zhouyi Zhou 2023-04-25 6:01 ` Zhouyi Zhou 2 siblings, 2 replies; 41+ messages in thread From: Michael Ellerman @ 2023-04-24 22:07 UTC (permalink / raw) To: Zhouyi Zhou, linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney Zhouyi Zhou <zhouzhouyi@gmail.com> writes: > Dear PowerPC and RCU developers: > During the RCU torture test on mainline (on the VM of Opensource Lab > of Oregon State University), SRCU-P failed with __stack_chk_fail: ... > by debugging, I see the r10 is assigned with r13 on c000000000226eb4, > but if there is a context-switch before c000000000226edc, a false > positive will be reported. > > [1] http://154.220.3.115/logs/0422/configformainline.txt Says: CONFIG_CC_VERSION_TEXT="powerpc64le-linux-gnu-gcc-10 (Ubuntu 10.4.0-4ubuntu1~22.04) 10.4.0" Do you see the same issue with a newer GCC? There's 12.2.0 here: https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/12.2.0/ https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/12.2.0/ Or if you can build in a Fedora 38 system or container, it has GCC 13. cheers ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-24 22:07 ` Michael Ellerman @ 2023-04-24 22:13 ` Zhouyi Zhou 2023-04-25 6:01 ` Zhouyi Zhou 1 sibling, 0 replies; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-24 22:13 UTC (permalink / raw) To: Michael Ellerman; +Cc: linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Tue, Apr 25, 2023 at 6:07 AM Michael Ellerman <mpe@ellerman.id.au> wrote: > > Zhouyi Zhou <zhouzhouyi@gmail.com> writes: > > Dear PowerPC and RCU developers: > > During the RCU torture test on mainline (on the VM of Opensource Lab > > of Oregon State University), SRCU-P failed with __stack_chk_fail: > ... > > by debugging, I see the r10 is assigned with r13 on c000000000226eb4, > > but if there is a context-switch before c000000000226edc, a false > > positive will be reported. > > > > [1] http://154.220.3.115/logs/0422/configformainline.txt > > Says: > > CONFIG_CC_VERSION_TEXT="powerpc64le-linux-gnu-gcc-10 (Ubuntu 10.4.0-4ubuntu1~22.04) 10.4.0" > > Do you see the same issue with a newer GCC? I would like to try that in the newest GCC ;-), please give me about a day's time because I am going to compile the gcc ;-) > > There's 12.2.0 here: > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/12.2.0/ > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/12.2.0/ > > Or if you can build in a Fedora 38 system or container, it has GCC 13. OK, I will try it or similar This is a very learningful process for me, thank you all ;-) Cheers > > cheers ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-24 22:07 ` Michael Ellerman 2023-04-24 22:13 ` Zhouyi Zhou @ 2023-04-25 6:01 ` Zhouyi Zhou 2023-04-25 9:27 ` Zhouyi Zhou 1 sibling, 1 reply; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-25 6:01 UTC (permalink / raw) To: Michael Ellerman; +Cc: linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney hi On Tue, Apr 25, 2023 at 6:07 AM Michael Ellerman <mpe@ellerman.id.au> wrote: > > Zhouyi Zhou <zhouzhouyi@gmail.com> writes: > > Dear PowerPC and RCU developers: > > During the RCU torture test on mainline (on the VM of Opensource Lab > > of Oregon State University), SRCU-P failed with __stack_chk_fail: > ... > > by debugging, I see the r10 is assigned with r13 on c000000000226eb4, > > but if there is a context-switch before c000000000226edc, a false > > positive will be reported. > > > > [1] http://154.220.3.115/logs/0422/configformainline.txt > > Says: > > CONFIG_CC_VERSION_TEXT="powerpc64le-linux-gnu-gcc-10 (Ubuntu 10.4.0-4ubuntu1~22.04) 10.4.0" > > Do you see the same issue with a newer GCC? On PPC vm of Oregon State University (I can grant rsa-pub key ssh access if you are interested), I build and install the gcc from git, then use the newly built gcc to compile the kernel, the bug disappears, I don't know why. Following is what is do: 1) git clone git://gcc.gnu.org/git/gcc.git git rev-parse --short HEAD f0eabc52c9a 2) mkdir gcc/build 3) cd gcc/build 4) ../configure --disable-bootstrap --prefix=/home/ubuntu/gcc-install 5) make -j 4 //my VM has limited memory ;-) 6) make install 7) cd linux-dir git rev-parse --short HEAD 61d325dcbc05 8) export PATH=/home/ubuntu/gcc-install/bin:$PATH 9) make vmlinux -j 8 10) ./whilebash.sh [1] From the assembly code of srcu_gp_start_if_needed [2], I found stack protector is operated directly on r13: c000000000225098: 30 00 0d e9 ld r8,48(r13) c00000000022509c: 08 00 3c e9 ld r9,8(r28) c0000000002250a0: 14 42 29 7d add r9,r9,r8 c0000000002250a4: ac 04 00 7c hwsync c0000000002250a8: 10 00 7b 3b addi r27,r27,16 c0000000002250ac: 14 da 29 7d add r9,r9,r27 c0000000002250b0: a8 48 00 7d ldarx r8,0,r9 c0000000002250b4: 01 00 08 31 addic r8,r8,1 c0000000002250b8: ad 49 00 7d stdcx. r8,0,r9 c0000000002250bc: f4 ff c2 40 bne- c0000000002250b0 <srcu_gp_start_if_needed+0x220> c0000000002250c0: 28 00 01 e9 ld r8,40(r1) c0000000002250c4: 78 0c 2d e9 ld r9,3192(r13) c0000000002250c8: 79 4a 08 7d xor. r8,r8,r9 c0000000002250cc: 00 00 20 39 li r9,0 c0000000002250d0: 90 03 82 40 bne c000000000225460 <srcu_gp_start_if_needed+0x5d0> console.log is attached at [3]. [1] 140.211.169.189/0425/whilebash.sh [2] http://140.211.169.189/0425/srcu_gp_start_if_needed.txt [3] http://140.211.169.189/0425/console.log I am very glad to cooperate if there is anything else I can do ;-) Cheers Zhouyi > > There's 12.2.0 here: > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/12.2.0/ > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/12.2.0/ > > Or if you can build in a Fedora 38 system or container, it has GCC 13. > > cheers ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-25 6:01 ` Zhouyi Zhou @ 2023-04-25 9:27 ` Zhouyi Zhou 2023-04-27 3:09 ` Michael Ellerman 0 siblings, 1 reply; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-25 9:27 UTC (permalink / raw) To: Michael Ellerman; +Cc: linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Tue, Apr 25, 2023 at 2:01 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > > hi > > On Tue, Apr 25, 2023 at 6:07 AM Michael Ellerman <mpe@ellerman.id.au> wrote: > > > > Zhouyi Zhou <zhouzhouyi@gmail.com> writes: > > > Dear PowerPC and RCU developers: > > > During the RCU torture test on mainline (on the VM of Opensource Lab > > > of Oregon State University), SRCU-P failed with __stack_chk_fail: > > ... > > > by debugging, I see the r10 is assigned with r13 on c000000000226eb4, > > > but if there is a context-switch before c000000000226edc, a false > > > positive will be reported. > > > > > > [1] http://154.220.3.115/logs/0422/configformainline.txt > > > > Says: > > > > CONFIG_CC_VERSION_TEXT="powerpc64le-linux-gnu-gcc-10 (Ubuntu 10.4.0-4ubuntu1~22.04) 10.4.0" > > > > Do you see the same issue with a newer GCC? > On PPC vm of Oregon State University (I can grant rsa-pub key ssh > access if you are interested), I > build and install the gcc from git, then use the newly built gcc to > compile the kernel, the bug disappears, > I don't know why. Following is what is do: > > 1) git clone git://gcc.gnu.org/git/gcc.git > git rev-parse --short HEAD > f0eabc52c9a > 2) mkdir gcc/build > 3) cd gcc/build > 4) ../configure --disable-bootstrap --prefix=/home/ubuntu/gcc-install > 5) make -j 4 //my VM has limited memory ;-) > 6) make install > 7) cd linux-dir > git rev-parse --short HEAD > 61d325dcbc05 > 8) export PATH=/home/ubuntu/gcc-install/bin:$PATH > 9) make vmlinux -j 8 > 10) ./whilebash.sh [1] > > From the assembly code of srcu_gp_start_if_needed [2], I found stack protector > is operated directly on r13: > > c000000000225098: 30 00 0d e9 ld r8,48(r13) > c00000000022509c: 08 00 3c e9 ld r9,8(r28) > c0000000002250a0: 14 42 29 7d add r9,r9,r8 > c0000000002250a4: ac 04 00 7c hwsync > c0000000002250a8: 10 00 7b 3b addi r27,r27,16 > c0000000002250ac: 14 da 29 7d add r9,r9,r27 > c0000000002250b0: a8 48 00 7d ldarx r8,0,r9 > c0000000002250b4: 01 00 08 31 addic r8,r8,1 > c0000000002250b8: ad 49 00 7d stdcx. r8,0,r9 > c0000000002250bc: f4 ff c2 40 bne- c0000000002250b0 > <srcu_gp_start_if_needed+0x220> > c0000000002250c0: 28 00 01 e9 ld r8,40(r1) > c0000000002250c4: 78 0c 2d e9 ld r9,3192(r13) > c0000000002250c8: 79 4a 08 7d xor. r8,r8,r9 > c0000000002250cc: 00 00 20 39 li r9,0 > c0000000002250d0: 90 03 82 40 bne c000000000225460 > <srcu_gp_start_if_needed+0x5d0> > > console.log is attached at [3]. > > [1] 140.211.169.189/0425/whilebash.sh > [2] http://140.211.169.189/0425/srcu_gp_start_if_needed.txt > [3] http://140.211.169.189/0425/console.log > > I am very glad to cooperate if there is anything else I can do ;-) > > Cheers > Zhouyi > > > > There's 12.2.0 here: > > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/12.2.0/ > > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/12.2.0/ powerpc64le-linux-gnu-gcc-12 cross compiler on my Ubuntu 22.04 does not seem to have that issue as gcc-10 does [4] http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt > > > > Or if you can build in a Fedora 38 system or container, it has GCC 13. > > > > cheers ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-25 9:27 ` Zhouyi Zhou @ 2023-04-27 3:09 ` Michael Ellerman 2023-04-27 3:32 ` Zhouyi Zhou 2023-04-27 9:21 ` Zhouyi Zhou 0 siblings, 2 replies; 41+ messages in thread From: Michael Ellerman @ 2023-04-27 3:09 UTC (permalink / raw) To: Zhouyi Zhou; +Cc: linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney Zhouyi Zhou <zhouzhouyi@gmail.com> writes: > On Tue, Apr 25, 2023 at 2:01 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: >> On Tue, Apr 25, 2023 at 6:07 AM Michael Ellerman <mpe@ellerman.id.au> wrote: ... >> > >> > There's 12.2.0 here: >> > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/12.2.0/ >> > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/12.2.0/ > powerpc64le-linux-gnu-gcc-12 cross compiler on my Ubuntu 22.04 does > not seem to have that issue as gcc-10 does OK. So so far it's only that GCC 10 that shows the problem. If you have time, you could use some of the other versions to narrow down which versions show the bug: https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/ There's an 11.0, 11.1 and 11.3 there, as well as 9.5 and so on. There's ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-27 3:09 ` Michael Ellerman @ 2023-04-27 3:32 ` Zhouyi Zhou 2023-04-27 9:21 ` Zhouyi Zhou 1 sibling, 0 replies; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-27 3:32 UTC (permalink / raw) To: Michael Ellerman; +Cc: linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Thu, Apr 27, 2023 at 11:09 AM Michael Ellerman <mpe@ellerman.id.au> wrote: > > Zhouyi Zhou <zhouzhouyi@gmail.com> writes: > > On Tue, Apr 25, 2023 at 2:01 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > >> On Tue, Apr 25, 2023 at 6:07 AM Michael Ellerman <mpe@ellerman.id.au> wrote: > ... > >> > > >> > There's 12.2.0 here: > >> > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/12.2.0/ > >> > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/12.2.0/ > > > powerpc64le-linux-gnu-gcc-12 cross compiler on my Ubuntu 22.04 does > > not seem to have that issue as gcc-10 does > > OK. So so far it's only that GCC 10 that shows the problem. The default gcc version 9.4.0 in ubuntu 20.04 (in VM of Oregon State University) also shows the problem. > > If you have time, you could use some of the other versions to narrow > down which versions show the bug: > > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/ > > There's an 11.0, 11.1 and 11.3 there, as well as 9.5 and so on. I have time ;-), and am very glad to try the above versions to narrow down which version shows the bug ;-) Cheers Zhouyi > > There's ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-27 3:09 ` Michael Ellerman 2023-04-27 3:32 ` Zhouyi Zhou @ 2023-04-27 9:21 ` Zhouyi Zhou 2023-04-27 14:13 ` Michael Ellerman 1 sibling, 1 reply; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-27 9:21 UTC (permalink / raw) To: Michael Ellerman; +Cc: linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Thu, Apr 27, 2023 at 11:09 AM Michael Ellerman <mpe@ellerman.id.au> wrote: > > Zhouyi Zhou <zhouzhouyi@gmail.com> writes: > > On Tue, Apr 25, 2023 at 2:01 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > >> On Tue, Apr 25, 2023 at 6:07 AM Michael Ellerman <mpe@ellerman.id.au> wrote: > ... > >> > > >> > There's 12.2.0 here: > >> > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/12.2.0/ > >> > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/12.2.0/ > > > powerpc64le-linux-gnu-gcc-12 cross compiler on my Ubuntu 22.04 does > > not seem to have that issue as gcc-10 does > > OK. So so far it's only that GCC 10 that shows the problem. > > If you have time, you could use some of the other versions to narrow > down which versions show the bug: > > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/ > > There's an 11.0, 11.1 and 11.3 there, as well as 9.5 and so on. GCC test results (Tested on PPC VM of Open Source Lab of Oregon State University) gcc 9.4 (ubuntu native): positive, show bug gcc 9.5 (download form [1]): positive, show bug gcc 10.1 (download from [1]): positive, show bug gcc 10.3 (download from [1]): positive, show bug gcc 10.4 (download from [1]): positive, show bug gcc 11.0 (download from [1]): negative, no bug gcc 11.1 (download from [1]): negative, no bug gcc 11.3 (download from [1]): negative, no bug gcc 12.1 (download from [1]): negative, no bug gcc 12.2 (download from [1]): negative, no bug [1] https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/ I am very happy to cooperate if there is further need ;-) Cheers Zhouyi > > There's ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-27 9:21 ` Zhouyi Zhou @ 2023-04-27 14:13 ` Michael Ellerman 2023-04-27 14:29 ` Zhouyi Zhou 0 siblings, 1 reply; 41+ messages in thread From: Michael Ellerman @ 2023-04-27 14:13 UTC (permalink / raw) To: Zhouyi Zhou; +Cc: linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney Zhouyi Zhou <zhouzhouyi@gmail.com> writes: > On Thu, Apr 27, 2023 at 11:09 AM Michael Ellerman <mpe@ellerman.id.au> wrote: >> >> Zhouyi Zhou <zhouzhouyi@gmail.com> writes: >> > On Tue, Apr 25, 2023 at 2:01 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: >> >> On Tue, Apr 25, 2023 at 6:07 AM Michael Ellerman <mpe@ellerman.id.au> wrote: >> ... >> >> > >> >> > There's 12.2.0 here: >> >> > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/12.2.0/ >> >> > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/12.2.0/ >> >> > powerpc64le-linux-gnu-gcc-12 cross compiler on my Ubuntu 22.04 does >> > not seem to have that issue as gcc-10 does >> >> OK. So so far it's only that GCC 10 that shows the problem. >> >> If you have time, you could use some of the other versions to narrow >> down which versions show the bug: >> >> https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/ >> >> There's an 11.0, 11.1 and 11.3 there, as well as 9.5 and so on. > GCC test results (Tested on PPC VM of Open Source Lab of Oregon State > University) > gcc 9.4 (ubuntu native): positive, show bug > gcc 9.5 (download form [1]): positive, show bug > gcc 10.1 (download from [1]): positive, show bug > gcc 10.3 (download from [1]): positive, show bug > gcc 10.4 (download from [1]): positive, show bug > > gcc 11.0 (download from [1]): negative, no bug > gcc 11.1 (download from [1]): negative, no bug > gcc 11.3 (download from [1]): negative, no bug > gcc 12.1 (download from [1]): negative, no bug > gcc 12.2 (download from [1]): negative, no bug Awesome work. How are you testing for presence/absence of the bug? By running your test and seeing if it crashes, or by looking at the generated code? cheers ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail 2023-04-27 14:13 ` Michael Ellerman @ 2023-04-27 14:29 ` Zhouyi Zhou 0 siblings, 0 replies; 41+ messages in thread From: Zhouyi Zhou @ 2023-04-27 14:29 UTC (permalink / raw) To: Michael Ellerman; +Cc: linuxppc-dev, rcu, linux-kernel, lance, Paul E. McKenney On Thu, Apr 27, 2023 at 10:13 PM Michael Ellerman <mpe@ellerman.id.au> wrote: > > Zhouyi Zhou <zhouzhouyi@gmail.com> writes: > > On Thu, Apr 27, 2023 at 11:09 AM Michael Ellerman <mpe@ellerman.id.au> wrote: > >> > >> Zhouyi Zhou <zhouzhouyi@gmail.com> writes: > >> > On Tue, Apr 25, 2023 at 2:01 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote: > >> >> On Tue, Apr 25, 2023 at 6:07 AM Michael Ellerman <mpe@ellerman.id.au> wrote: > >> ... > >> >> > > >> >> > There's 12.2.0 here: > >> >> > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/12.2.0/ > >> >> > https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/12.2.0/ > >> > >> > powerpc64le-linux-gnu-gcc-12 cross compiler on my Ubuntu 22.04 does > >> > not seem to have that issue as gcc-10 does > >> > >> OK. So so far it's only that GCC 10 that shows the problem. > >> > >> If you have time, you could use some of the other versions to narrow > >> down which versions show the bug: > >> > >> https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/ppc64le/ > >> > >> There's an 11.0, 11.1 and 11.3 there, as well as 9.5 and so on. > > GCC test results (Tested on PPC VM of Open Source Lab of Oregon State > > University) > > gcc 9.4 (ubuntu native): positive, show bug > > gcc 9.5 (download form [1]): positive, show bug > > gcc 10.1 (download from [1]): positive, show bug > > gcc 10.3 (download from [1]): positive, show bug > > gcc 10.4 (download from [1]): positive, show bug > > > > gcc 11.0 (download from [1]): negative, no bug > > gcc 11.1 (download from [1]): negative, no bug > > gcc 11.3 (download from [1]): negative, no bug > > gcc 12.1 (download from [1]): negative, no bug > > gcc 12.2 (download from [1]): negative, no bug > > Awesome work. Thank you for your encouragement ;-) ;-) > > How are you testing for presence/absence of the bug? By running your > test and seeing if it crashes, or by looking at the generated code? Both I use gdb ./vmlinux; gdb)disassemble srcu_gp_start_if_needed to look up the generated assembly code and I use [1] to see if it crashes, if there is a bug, it always crashes very quickly (within 3 minutes) [1] http://140.211.169.189/0425/whilebash.sh Cheers > > cheers ^ permalink raw reply [flat|nested] 41+ messages in thread
end of thread, other threads:[~2023-04-28 10:35 UTC | newest] Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-04-22 12:46 BUG : PowerPC RCU: torture test failed with __stack_chk_fail Zhouyi Zhou 2023-04-22 19:19 ` Joel Fernandes 2023-04-23 1:37 ` Zhouyi Zhou 2023-04-23 5:45 ` Zhouyi Zhou 2023-04-22 19:28 ` Joel Fernandes 2023-04-24 0:32 ` Boqun Feng 2023-04-24 4:00 ` Zhouyi Zhou 2023-04-24 13:14 ` Michael Ellerman 2023-04-24 15:13 ` Segher Boessenkool 2023-04-24 15:28 ` Boqun Feng 2023-04-24 17:29 ` Segher Boessenkool 2023-04-24 19:25 ` Boqun Feng 2023-04-24 18:55 ` Joel Fernandes 2023-04-25 10:13 ` Peter Zijlstra 2023-04-25 10:58 ` Zhouyi Zhou 2023-04-25 11:06 ` Joel Fernandes 2023-04-25 3:12 ` Zhouyi Zhou 2023-04-25 13:40 ` Christophe Leroy 2023-04-25 13:49 ` Zhouyi Zhou 2023-04-26 0:32 ` Joel Fernandes 2023-04-26 1:31 ` Zhouyi Zhou 2023-04-26 2:15 ` Joel Fernandes 2023-04-26 2:37 ` Zhouyi Zhou 2023-04-26 0:42 ` Joel Fernandes 2023-04-26 12:29 ` Michael Ellerman 2023-04-26 13:44 ` Joel Fernandes 2023-04-26 14:20 ` Peter Zijlstra 2023-04-26 14:45 ` Michael Ellerman 2023-04-28 10:35 ` Christophe Leroy 2023-04-25 10:59 ` Joel Fernandes 2023-04-25 11:53 ` Peter Zijlstra 2023-04-25 13:36 ` Christophe Leroy 2023-04-24 22:07 ` Michael Ellerman 2023-04-24 22:13 ` Zhouyi Zhou 2023-04-25 6:01 ` Zhouyi Zhou 2023-04-25 9:27 ` Zhouyi Zhou 2023-04-27 3:09 ` Michael Ellerman 2023-04-27 3:32 ` Zhouyi Zhou 2023-04-27 9:21 ` Zhouyi Zhou 2023-04-27 14:13 ` Michael Ellerman 2023-04-27 14:29 ` Zhouyi Zhou
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).