* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
[not found] <1414727247-31838-1-git-send-email-anton__19440.5086375356$1414727300$gmane$org@samba.org>
@ 2014-12-17 1:16 ` Alexander Graf
2014-12-17 3:44 ` Anton Blanchard
2014-12-18 5:11 ` Michael Ellerman
0 siblings, 2 replies; 13+ messages in thread
From: Alexander Graf @ 2014-12-17 1:16 UTC (permalink / raw)
To: Anton Blanchard, benh, paulus, mpe, ulrich.weigand
Cc: Scott Wood, linuxppc-dev
On 31.10.14 04:47, Anton Blanchard wrote:
> LLVM doesn't support local named register variables and is unlikely
> to. current_thread_info is using one, fix it by moving it out and
> calling it __current_r1().
>
> I gave it a bit of an obscure name because we don't want anyone else
> using it - they should use current_stack_pointer(). This specific
> case is performance critical and we can't afford to call a function
> to get it. Furthermore it isn't important to know exactly where in
> the stack we are since we mask the lower bits.
>
> Signed-off-by: Anton Blanchard <anton@samba.org>
Git bisect managed to point me to this commit as the offender for OOPSes
on e5500 and e6500 (and maybe the G4 as well, not sure).
Doing a git revert of this commit on top of linus/master makes things
work fine for me again.
Alex
Oops: Kernel access of bad area, sig: 11 [#2]
SMP NR_CPUS=16 CoreNet Generic
Modules linked in:
CPU: 1 PID: 339 Comm: kworker/1:1 Tainted: G D
3.18.0-09423-g988adfd #1
Workqueue: rpciod .rpc_async_schedule
task: c0000001f6397500 ti: c0000001f6638000 task.ti: c0000001f6638000
NIP: c0000000004817a4 LR: c0000000004817a4 CTR: 0000000000000000
REGS: c0000001f663b0e0 TRAP: 0300 Tainted: G D
(3.18.0-09423-g988adfd)
MSR: 0000000080029000 <CE,EE,ME> CR: 24ad2e42 XER: 00000000
DEAR: 202031303438355f ESR: 0000000000000000 SOFTE: 1
GPR00: c0000000004817a4 c0000001f663b360 c000000000988028 000000007f24333d
GPR04: 5ff5738c1f2ebfb1 0000000000000000 0000000000000000 00000000000008f8
GPR08: c000000000480ae8 2020313034383537 36204b4220617320 6469726563740a31
GPR12: 3937302d30312d30 c00000000fff8780 c00000000007f988 c0000001f64c1600
GPR16: 0000000000000000 0000000000000000 0000000000000000 00000000000005dc
GPR20: c0000000009b8028 c00000007e034200 0000000000000548 c000000000000000
GPR24: c0000001f663b4b0 00000000b225831e 0000000000000000 0000000000000080
GPR28: 0000000000000548 00000000000008f8 0000000000000548 0000000000000094
NIP [c0000000004817a4] .__skb_checksum+0x194/0x378
LR [c0000000004817a4] .__skb_checksum+0x194/0x378
Call Trace:
[c0000001f663b360] [c0000000004817a4] .__skb_checksum+0x194/0x378
(unreliable)
[c0000001f663b440] [c0000000004819b4] .skb_checksum+0x2c/0x3c
[c0000001f663b4c0] [c0000000004fd0a8] .udp4_hwcsum+0xa8/0x16c
[c0000001f663b560] [c0000000004fd440] .udp_send_skb+0x2d4/0x370
[c0000001f663b600] [c0000000004fd51c] .udp_push_pending_frames+0x40/0x94
[c0000001f663b680] [c0000000004fec08] .udp_sendpage+0x150/0x1b4
[c0000001f663b770] [c00000000050ae54] .inet_sendpage+0xa0/0x120
[c0000001f663b810] [c00000000059c8cc] .xs_sendpages+0x2d0/0x30c
[c0000001f663b8d0] [c00000000059cae4] .xs_udp_send_request+0x58/0x120
[c0000001f663b970] [c000000000598f04] .xprt_transmit+0x80/0x36c
[c0000001f663ba20] [c0000000005942d8] .call_transmit+0x19c/0x254
[c0000001f663bab0] [c00000000059ff64] .__rpc_execute+0xbc/0x3c0
[c0000001f663bb90] [c0000000000797f8] .process_one_work+0x1c0/0x474
[c0000001f663bc40] [c00000000007a518] .worker_thread+0x17c/0x54c
[c0000001f663bd30] [c00000000007fa8c] .kthread+0x104/0x124
[c0000001f663be30] [c000000000000884] .ret_from_kernel_thread+0x58/0xd4
Instruction dump:
7d1f3a14 7c6a1850 e9580000 7fbd4050 786334e4 e90a0000 7c63ba14 f8490028
7c63ea14 7d0903a6 e84a0008 4e800421 <e8490028> 7c641b78 78270464 e9580008
---[ end trace 51b7414695b0cafe ]---
note: kworker/1:1[339] exited with preempt_count 1
Unable to handle kernel paging request for data at address
0xffffffffffffffd8
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
2014-12-17 1:16 ` [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info Alexander Graf
@ 2014-12-17 3:44 ` Anton Blanchard
2014-12-17 9:27 ` Alexander Graf
2014-12-18 5:11 ` Michael Ellerman
1 sibling, 1 reply; 13+ messages in thread
From: Anton Blanchard @ 2014-12-17 3:44 UTC (permalink / raw)
To: Alexander Graf; +Cc: ulrich.weigand, paulus, Scott Wood, linuxppc-dev
Hi Alex,
> Git bisect managed to point me to this commit as the offender for
> OOPSes on e5500 and e6500 (and maybe the G4 as well, not sure).
>
> Doing a git revert of this commit on top of linus/master makes things
> work fine for me again.
Ouch, sorry for that, I'll work to reproduce. What gcc version are you
using?
Anton
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
2014-12-17 3:44 ` Anton Blanchard
@ 2014-12-17 9:27 ` Alexander Graf
2014-12-22 6:49 ` Michael Ellerman
0 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2014-12-17 9:27 UTC (permalink / raw)
To: Anton Blanchard; +Cc: ulrich.weigand, paulus, Scott Wood, linuxppc-dev
On 17.12.14 04:44, Anton Blanchard wrote:
> Hi Alex,
>
>> Git bisect managed to point me to this commit as the offender for
>> OOPSes on e5500 and e6500 (and maybe the G4 as well, not sure).
>>
>> Doing a git revert of this commit on top of linus/master makes things
>> work fine for me again.
>
> Ouch, sorry for that, I'll work to reproduce. What gcc version are you
> using?
I'm running
gcc (SUSE Linux) 4.7.2 20130108 [gcc-4_7-branch revision 195012]
which is basically the one from openSUSE 12.3 for ppc64.
I've also uploaded 2 builds, one with the patch applied (broken) and one
without (works):
http://csgraf.de/agraf/current_thread_info/vmlinux.broken.xz
http://csgraf.de/agraf/current_thread_info/vmlinux.works.xz
Interestingly enough I did not see this on IBM POWER systems, but maybe
that's because most of them compile their own kernels with local,
different compilers in my test runs and the only one that does base on
the same compiler doesn't run on nfsroot.
My iBook G4 target is affected by this as well and reverting this patch
also makes it work again. Maybe we're just running over some stack?
Alex
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
2014-12-17 1:16 ` [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info Alexander Graf
2014-12-17 3:44 ` Anton Blanchard
@ 2014-12-18 5:11 ` Michael Ellerman
2014-12-18 6:25 ` Anton Blanchard
2014-12-18 14:56 ` Alexander Graf
1 sibling, 2 replies; 13+ messages in thread
From: Michael Ellerman @ 2014-12-18 5:11 UTC (permalink / raw)
To: Alexander Graf
Cc: ulrich.weigand, paulus, Anton Blanchard, Scott Wood, linuxppc-dev
On Wed, 2014-12-17 at 02:16 +0100, Alexander Graf wrote:
> On 31.10.14 04:47, Anton Blanchard wrote:
> > LLVM doesn't support local named register variables and is unlikely
> > to. current_thread_info is using one, fix it by moving it out and
> > calling it __current_r1().
> >
> > I gave it a bit of an obscure name because we don't want anyone else
> > using it - they should use current_stack_pointer(). This specific
> > case is performance critical and we can't afford to call a function
> > to get it. Furthermore it isn't important to know exactly where in
> > the stack we are since we mask the lower bits.
> >
> > Signed-off-by: Anton Blanchard <anton@samba.org>
>
> Git bisect managed to point me to this commit as the offender for OOPSes
> on e5500 and e6500 (and maybe the G4 as well, not sure).
>
> Doing a git revert of this commit on top of linus/master makes things
> work fine for me again.
>
>
> Alex
>
> Oops: Kernel access of bad area, sig: 11 [#2]
> SMP NR_CPUS=16 CoreNet Generic
> Modules linked in:
> CPU: 1 PID: 339 Comm: kworker/1:1 Tainted: G D
> 3.18.0-09423-g988adfd #1
> Workqueue: rpciod .rpc_async_schedule
> task: c0000001f6397500 ti: c0000001f6638000 task.ti: c0000001f6638000
> NIP: c0000000004817a4 LR: c0000000004817a4 CTR: 0000000000000000
> REGS: c0000001f663b0e0 TRAP: 0300 Tainted: G D
> (3.18.0-09423-g988adfd)
> MSR: 0000000080029000 <CE,EE,ME> CR: 24ad2e42 XER: 00000000
> DEAR: 202031303438355f ESR: 0000000000000000 SOFTE: 1
= r9 + 40
> GPR00: c0000000004817a4 c0000001f663b360 c000000000988028 000000007f24333d
> GPR04: 5ff5738c1f2ebfb1 0000000000000000 0000000000000000 00000000000008f8
> GPR08: c000000000480ae8 2020313034383537 36204b4220617320 6469726563740a31
> GPR12: 3937302d30312d30 c00000000fff8780 c00000000007f988 c0000001f64c1600
GPRs 9-12 say: " 1048576 KB as direct\n1970-01-0"
Which is rarely a good sign :)
Looks like it might be part of your dmesg from setup_page_sizes().
> GPR16: 0000000000000000 0000000000000000 0000000000000000 00000000000005dc
> GPR20: c0000000009b8028 c00000007e034200 0000000000000548 c000000000000000
> GPR24: c0000001f663b4b0 00000000b225831e 0000000000000000 0000000000000080
> GPR28: 0000000000000548 00000000000008f8 0000000000000548 0000000000000094
> NIP [c0000000004817a4] .__skb_checksum+0x194/0x378
> LR [c0000000004817a4] .__skb_checksum+0x194/0x378
> Call Trace:
> [c0000001f663b360] [c0000000004817a4] .__skb_checksum+0x194/0x378
> (unreliable)
> [c0000001f663b440] [c0000000004819b4] .skb_checksum+0x2c/0x3c
> [c0000001f663b4c0] [c0000000004fd0a8] .udp4_hwcsum+0xa8/0x16c
> [c0000001f663b560] [c0000000004fd440] .udp_send_skb+0x2d4/0x370
> [c0000001f663b600] [c0000000004fd51c] .udp_push_pending_frames+0x40/0x94
> [c0000001f663b680] [c0000000004fec08] .udp_sendpage+0x150/0x1b4
> [c0000001f663b770] [c00000000050ae54] .inet_sendpage+0xa0/0x120
> [c0000001f663b810] [c00000000059c8cc] .xs_sendpages+0x2d0/0x30c
> [c0000001f663b8d0] [c00000000059cae4] .xs_udp_send_request+0x58/0x120
> [c0000001f663b970] [c000000000598f04] .xprt_transmit+0x80/0x36c
> [c0000001f663ba20] [c0000000005942d8] .call_transmit+0x19c/0x254
> [c0000001f663bab0] [c00000000059ff64] .__rpc_execute+0xbc/0x3c0
> [c0000001f663bb90] [c0000000000797f8] .process_one_work+0x1c0/0x474
> [c0000001f663bc40] [c00000000007a518] .worker_thread+0x17c/0x54c
> [c0000001f663bd30] [c00000000007fa8c] .kthread+0x104/0x124
> [c0000001f663be30] [c000000000000884] .ret_from_kernel_thread+0x58/0xd4
> Instruction dump:
> 7d1f3a14 7c6a1850 e9580000 7fbd4050 786334e4 e90a0000 7c63ba14 f8490028
> 7c63ea14 7d0903a6 e84a0008 4e800421 <e8490028> 7c641b78 78270464 e9580008
Which is:
add r8, r31, r7
subf r3, r10, r3
ld r10, 0(r24)
subf r29, r29, r8
rldicr r3, r3, 6, 51
ld r8, 0(r10)
add r3, r3, r23
std r2, 40(r9)
add r3, r3, r29
mtctr r8
ld r2, 8(r10)
bctrl
ld r2, 40(r9) <---
mr r4, r3
rldicr r7, r1, 0, 49
ld r10, 8(r24)
Which looks a bit odd. I'd expect us to be saving/restoring r2 to the stack,
though maybe r9 was pointing at the stack?
Looking at your vmlinux.broken I don't see the same code gen.
Can you get an oops from a kernel and upload the exact binary? Or just post us
the full code dump of __skb_checksum() (or wherever it oopses).
cheers
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
2014-12-18 5:11 ` Michael Ellerman
@ 2014-12-18 6:25 ` Anton Blanchard
2014-12-18 15:02 ` Alexander Graf
2014-12-31 12:24 ` Alan Modra
2014-12-18 14:56 ` Alexander Graf
1 sibling, 2 replies; 13+ messages in thread
From: Anton Blanchard @ 2014-12-18 6:25 UTC (permalink / raw)
To: Michael Ellerman
Cc: Alan Modra, Alexander Graf, ulrich.weigand, paulus, Scott Wood,
linuxppc-dev
On Thu, 18 Dec 2014 16:11:54 +1100
Michael Ellerman <mpe@ellerman.id.au> wrote:
> On Wed, 2014-12-17 at 02:16 +0100, Alexander Graf wrote:
> > On 31.10.14 04:47, Anton Blanchard wrote:
> > > LLVM doesn't support local named register variables and is
> > > unlikely to. current_thread_info is using one, fix it by moving
> > > it out and calling it __current_r1().
> > >
> > > I gave it a bit of an obscure name because we don't want anyone
> > > else using it - they should use current_stack_pointer(). This
> > > specific case is performance critical and we can't afford to call
> > > a function to get it. Furthermore it isn't important to know
> > > exactly where in the stack we are since we mask the lower bits.
> > >
> > > Signed-off-by: Anton Blanchard <anton@samba.org>
> >
> > Git bisect managed to point me to this commit as the offender for
> > OOPSes on e5500 and e6500 (and maybe the G4 as well, not sure).
> >
> > Doing a git revert of this commit on top of linus/master makes
> > things work fine for me again.
> >
> >
> > Alex
> >
> > Oops: Kernel access of bad area, sig: 11 [#2]
> > SMP NR_CPUS=16 CoreNet Generic
> > Modules linked in:
> > CPU: 1 PID: 339 Comm: kworker/1:1 Tainted: G D
> > 3.18.0-09423-g988adfd #1
> > Workqueue: rpciod .rpc_async_schedule
> > task: c0000001f6397500 ti: c0000001f6638000 task.ti:
> > c0000001f6638000 NIP: c0000000004817a4 LR: c0000000004817a4 CTR:
> > 0000000000000000 REGS: c0000001f663b0e0 TRAP: 0300 Tainted:
> > G D (3.18.0-09423-g988adfd)
> > MSR: 0000000080029000 <CE,EE,ME> CR: 24ad2e42 XER: 00000000
> > DEAR: 202031303438355f ESR: 0000000000000000 SOFTE: 1
> = r9 + 40
>
> > GPR00: c0000000004817a4 c0000001f663b360 c000000000988028
> > 000000007f24333d GPR04: 5ff5738c1f2ebfb1 0000000000000000
> > 0000000000000000 00000000000008f8 GPR08: c000000000480ae8
> > 2020313034383537 36204b4220617320 6469726563740a31 GPR12:
> > 3937302d30312d30 c00000000fff8780 c00000000007f988 c0000001f64c1600
>
> GPRs 9-12 say: " 1048576 KB as direct\n1970-01-0"
>
> Which is rarely a good sign :)
>
> Looks like it might be part of your dmesg from setup_page_sizes().
>
> > GPR16: 0000000000000000 0000000000000000 0000000000000000
> > 00000000000005dc GPR20: c0000000009b8028 c00000007e034200
> > 0000000000000548 c000000000000000 GPR24: c0000001f663b4b0
> > 00000000b225831e 0000000000000000 0000000000000080 GPR28:
> > 0000000000000548 00000000000008f8 0000000000000548 0000000000000094
> > NIP [c0000000004817a4] .__skb_checksum+0x194/0x378 LR
> > [c0000000004817a4] .__skb_checksum+0x194/0x378 Call Trace:
> > [c0000001f663b360] [c0000000004817a4] .__skb_checksum+0x194/0x378
> > (unreliable)
> > [c0000001f663b440] [c0000000004819b4] .skb_checksum+0x2c/0x3c
> > [c0000001f663b4c0] [c0000000004fd0a8] .udp4_hwcsum+0xa8/0x16c
> > [c0000001f663b560] [c0000000004fd440] .udp_send_skb+0x2d4/0x370
> > [c0000001f663b600]
> > [c0000000004fd51c] .udp_push_pending_frames+0x40/0x94
> > [c0000001f663b680] [c0000000004fec08] .udp_sendpage+0x150/0x1b4
> > [c0000001f663b770] [c00000000050ae54] .inet_sendpage+0xa0/0x120
> > [c0000001f663b810] [c00000000059c8cc] .xs_sendpages+0x2d0/0x30c
> > [c0000001f663b8d0]
> > [c00000000059cae4] .xs_udp_send_request+0x58/0x120
> > [c0000001f663b970] [c000000000598f04] .xprt_transmit+0x80/0x36c
> > [c0000001f663ba20] [c0000000005942d8] .call_transmit+0x19c/0x254
> > [c0000001f663bab0] [c00000000059ff64] .__rpc_execute+0xbc/0x3c0
> > [c0000001f663bb90] [c0000000000797f8] .process_one_work+0x1c0/0x474
> > [c0000001f663bc40] [c00000000007a518] .worker_thread+0x17c/0x54c
> > [c0000001f663bd30] [c00000000007fa8c] .kthread+0x104/0x124
> > [c0000001f663be30]
> > [c000000000000884] .ret_from_kernel_thread+0x58/0xd4 Instruction
> > dump: 7d1f3a14 7c6a1850 e9580000 7fbd4050 786334e4 e90a0000
> > 7c63ba14 f8490028 7c63ea14 7d0903a6 e84a0008 4e800421 <e8490028>
> > 7c641b78 78270464 e9580008
>
> Which is:
>
> add r8, r31, r7
> subf r3, r10, r3
> ld r10, 0(r24)
> subf r29, r29, r8
> rldicr r3, r3, 6, 51
> ld r8, 0(r10)
> add r3, r3, r23
> std r2, 40(r9)
> add r3, r3, r29
> mtctr r8
> ld r2, 8(r10)
> bctrl
> ld r2, 40(r9) <---
> mr r4, r3
> rldicr r7, r1, 0, 49
> ld r10, 8(r24)
>
>
> Which looks a bit odd. I'd expect us to be saving/restoring r2 to the
> stack, though maybe r9 was pointing at the stack?
Nice catch! This looks like a compiler bug.
> Looking at your vmlinux.broken I don't see the same code gen.
For whatever reason we ended up with r10 this time:
7c 2a 0b 78 mr r10,r1
...
f8 4a 00 28 std r2,40(r10)
7c 63 ba 14 add r3,r3,r23
7c e9 03 a6 mtctr r7
7c 63 ea 14 add r3,r3,r29
38 a0 00 00 li r5,0
e8 48 00 08 ld r2,8(r8)
4e 80 04 21 bctrl
e8 4a 00 28 ld r2,40(r10)
The indirect function call is allowed to clobber r10, gcc is doing
something very wrong here.
Anton
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
2014-12-18 5:11 ` Michael Ellerman
2014-12-18 6:25 ` Anton Blanchard
@ 2014-12-18 14:56 ` Alexander Graf
1 sibling, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2014-12-18 14:56 UTC (permalink / raw)
To: Michael Ellerman
Cc: ulrich.weigand, paulus, Anton Blanchard, Scott Wood, linuxppc-dev
On 18.12.14 06:11, Michael Ellerman wrote:
> On Wed, 2014-12-17 at 02:16 +0100, Alexander Graf wrote:
>> On 31.10.14 04:47, Anton Blanchard wrote:
>>> LLVM doesn't support local named register variables and is unlikely
>>> to. current_thread_info is using one, fix it by moving it out and
>>> calling it __current_r1().
>>>
>>> I gave it a bit of an obscure name because we don't want anyone else
>>> using it - they should use current_stack_pointer(). This specific
>>> case is performance critical and we can't afford to call a function
>>> to get it. Furthermore it isn't important to know exactly where in
>>> the stack we are since we mask the lower bits.
>>>
>>> Signed-off-by: Anton Blanchard <anton@samba.org>
>>
>> Git bisect managed to point me to this commit as the offender for OOPSes
>> on e5500 and e6500 (and maybe the G4 as well, not sure).
>>
>> Doing a git revert of this commit on top of linus/master makes things
>> work fine for me again.
>>
>>
>> Alex
>>
>> Oops: Kernel access of bad area, sig: 11 [#2]
>> SMP NR_CPUS=16 CoreNet Generic
>> Modules linked in:
>> CPU: 1 PID: 339 Comm: kworker/1:1 Tainted: G D
>> 3.18.0-09423-g988adfd #1
>> Workqueue: rpciod .rpc_async_schedule
>> task: c0000001f6397500 ti: c0000001f6638000 task.ti: c0000001f6638000
>> NIP: c0000000004817a4 LR: c0000000004817a4 CTR: 0000000000000000
>> REGS: c0000001f663b0e0 TRAP: 0300 Tainted: G D
>> (3.18.0-09423-g988adfd)
>> MSR: 0000000080029000 <CE,EE,ME> CR: 24ad2e42 XER: 00000000
>> DEAR: 202031303438355f ESR: 0000000000000000 SOFTE: 1
> = r9 + 40
>
>> GPR00: c0000000004817a4 c0000001f663b360 c000000000988028 000000007f24333d
>> GPR04: 5ff5738c1f2ebfb1 0000000000000000 0000000000000000 00000000000008f8
>> GPR08: c000000000480ae8 2020313034383537 36204b4220617320 6469726563740a31
>> GPR12: 3937302d30312d30 c00000000fff8780 c00000000007f988 c0000001f64c1600
>
> GPRs 9-12 say: " 1048576 KB as direct\n1970-01-0"
>
> Which is rarely a good sign :)
>
> Looks like it might be part of your dmesg from setup_page_sizes().
>
>> GPR16: 0000000000000000 0000000000000000 0000000000000000 00000000000005dc
>> GPR20: c0000000009b8028 c00000007e034200 0000000000000548 c000000000000000
>> GPR24: c0000001f663b4b0 00000000b225831e 0000000000000000 0000000000000080
>> GPR28: 0000000000000548 00000000000008f8 0000000000000548 0000000000000094
>> NIP [c0000000004817a4] .__skb_checksum+0x194/0x378
>> LR [c0000000004817a4] .__skb_checksum+0x194/0x378
>> Call Trace:
>> [c0000001f663b360] [c0000000004817a4] .__skb_checksum+0x194/0x378
>> (unreliable)
>> [c0000001f663b440] [c0000000004819b4] .skb_checksum+0x2c/0x3c
>> [c0000001f663b4c0] [c0000000004fd0a8] .udp4_hwcsum+0xa8/0x16c
>> [c0000001f663b560] [c0000000004fd440] .udp_send_skb+0x2d4/0x370
>> [c0000001f663b600] [c0000000004fd51c] .udp_push_pending_frames+0x40/0x94
>> [c0000001f663b680] [c0000000004fec08] .udp_sendpage+0x150/0x1b4
>> [c0000001f663b770] [c00000000050ae54] .inet_sendpage+0xa0/0x120
>> [c0000001f663b810] [c00000000059c8cc] .xs_sendpages+0x2d0/0x30c
>> [c0000001f663b8d0] [c00000000059cae4] .xs_udp_send_request+0x58/0x120
>> [c0000001f663b970] [c000000000598f04] .xprt_transmit+0x80/0x36c
>> [c0000001f663ba20] [c0000000005942d8] .call_transmit+0x19c/0x254
>> [c0000001f663bab0] [c00000000059ff64] .__rpc_execute+0xbc/0x3c0
>> [c0000001f663bb90] [c0000000000797f8] .process_one_work+0x1c0/0x474
>> [c0000001f663bc40] [c00000000007a518] .worker_thread+0x17c/0x54c
>> [c0000001f663bd30] [c00000000007fa8c] .kthread+0x104/0x124
>> [c0000001f663be30] [c000000000000884] .ret_from_kernel_thread+0x58/0xd4
>> Instruction dump:
>> 7d1f3a14 7c6a1850 e9580000 7fbd4050 786334e4 e90a0000 7c63ba14 f8490028
>> 7c63ea14 7d0903a6 e84a0008 4e800421 <e8490028> 7c641b78 78270464 e9580008
>
> Which is:
>
> add r8, r31, r7
> subf r3, r10, r3
> ld r10, 0(r24)
> subf r29, r29, r8
> rldicr r3, r3, 6, 51
> ld r8, 0(r10)
> add r3, r3, r23
> std r2, 40(r9)
> add r3, r3, r29
> mtctr r8
> ld r2, 8(r10)
> bctrl
> ld r2, 40(r9) <---
> mr r4, r3
> rldicr r7, r1, 0, 49
> ld r10, 8(r24)
>
>
> Which looks a bit odd. I'd expect us to be saving/restoring r2 to the stack,
> though maybe r9 was pointing at the stack?
>
> Looking at your vmlinux.broken I don't see the same code gen.
>
> Can you get an oops from a kernel and upload the exact binary? Or just post us
> the full code dump of __skb_checksum() (or wherever it oopses).
Ugh, sorry - I must've copied the wrong one. The serial output below is
from the uImage that (hopefully) is belongs to the vmlinux.broken:
http://csgraf.de/agraf/current_thread_info/dmesg.txt
Alex
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
2014-12-18 6:25 ` Anton Blanchard
@ 2014-12-18 15:02 ` Alexander Graf
2014-12-31 12:24 ` Alan Modra
1 sibling, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2014-12-18 15:02 UTC (permalink / raw)
To: Anton Blanchard, Michael Ellerman
Cc: Alan Modra, ulrich.weigand, paulus, Scott Wood, linuxppc-dev
On 18.12.14 07:25, Anton Blanchard wrote:
> On Thu, 18 Dec 2014 16:11:54 +1100
> Michael Ellerman <mpe@ellerman.id.au> wrote:
>
>> On Wed, 2014-12-17 at 02:16 +0100, Alexander Graf wrote:
>>> On 31.10.14 04:47, Anton Blanchard wrote:
>>>> LLVM doesn't support local named register variables and is
>>>> unlikely to. current_thread_info is using one, fix it by moving
>>>> it out and calling it __current_r1().
>>>>
>>>> I gave it a bit of an obscure name because we don't want anyone
>>>> else using it - they should use current_stack_pointer(). This
>>>> specific case is performance critical and we can't afford to call
>>>> a function to get it. Furthermore it isn't important to know
>>>> exactly where in the stack we are since we mask the lower bits.
>>>>
>>>> Signed-off-by: Anton Blanchard <anton@samba.org>
>>>
>>> Git bisect managed to point me to this commit as the offender for
>>> OOPSes on e5500 and e6500 (and maybe the G4 as well, not sure).
>>>
>>> Doing a git revert of this commit on top of linus/master makes
>>> things work fine for me again.
>>>
>>>
>>> Alex
>>>
>>> Oops: Kernel access of bad area, sig: 11 [#2]
>>> SMP NR_CPUS=16 CoreNet Generic
>>> Modules linked in:
>>> CPU: 1 PID: 339 Comm: kworker/1:1 Tainted: G D
>>> 3.18.0-09423-g988adfd #1
>>> Workqueue: rpciod .rpc_async_schedule
>>> task: c0000001f6397500 ti: c0000001f6638000 task.ti:
>>> c0000001f6638000 NIP: c0000000004817a4 LR: c0000000004817a4 CTR:
>>> 0000000000000000 REGS: c0000001f663b0e0 TRAP: 0300 Tainted:
>>> G D (3.18.0-09423-g988adfd)
>>> MSR: 0000000080029000 <CE,EE,ME> CR: 24ad2e42 XER: 00000000
>>> DEAR: 202031303438355f ESR: 0000000000000000 SOFTE: 1
>> = r9 + 40
>>
>>> GPR00: c0000000004817a4 c0000001f663b360 c000000000988028
>>> 000000007f24333d GPR04: 5ff5738c1f2ebfb1 0000000000000000
>>> 0000000000000000 00000000000008f8 GPR08: c000000000480ae8
>>> 2020313034383537 36204b4220617320 6469726563740a31 GPR12:
>>> 3937302d30312d30 c00000000fff8780 c00000000007f988 c0000001f64c1600
>>
>> GPRs 9-12 say: " 1048576 KB as direct\n1970-01-0"
>>
>> Which is rarely a good sign :)
>>
>> Looks like it might be part of your dmesg from setup_page_sizes().
>>
>>> GPR16: 0000000000000000 0000000000000000 0000000000000000
>>> 00000000000005dc GPR20: c0000000009b8028 c00000007e034200
>>> 0000000000000548 c000000000000000 GPR24: c0000001f663b4b0
>>> 00000000b225831e 0000000000000000 0000000000000080 GPR28:
>>> 0000000000000548 00000000000008f8 0000000000000548 0000000000000094
>>> NIP [c0000000004817a4] .__skb_checksum+0x194/0x378 LR
>>> [c0000000004817a4] .__skb_checksum+0x194/0x378 Call Trace:
>>> [c0000001f663b360] [c0000000004817a4] .__skb_checksum+0x194/0x378
>>> (unreliable)
>>> [c0000001f663b440] [c0000000004819b4] .skb_checksum+0x2c/0x3c
>>> [c0000001f663b4c0] [c0000000004fd0a8] .udp4_hwcsum+0xa8/0x16c
>>> [c0000001f663b560] [c0000000004fd440] .udp_send_skb+0x2d4/0x370
>>> [c0000001f663b600]
>>> [c0000000004fd51c] .udp_push_pending_frames+0x40/0x94
>>> [c0000001f663b680] [c0000000004fec08] .udp_sendpage+0x150/0x1b4
>>> [c0000001f663b770] [c00000000050ae54] .inet_sendpage+0xa0/0x120
>>> [c0000001f663b810] [c00000000059c8cc] .xs_sendpages+0x2d0/0x30c
>>> [c0000001f663b8d0]
>>> [c00000000059cae4] .xs_udp_send_request+0x58/0x120
>>> [c0000001f663b970] [c000000000598f04] .xprt_transmit+0x80/0x36c
>>> [c0000001f663ba20] [c0000000005942d8] .call_transmit+0x19c/0x254
>>> [c0000001f663bab0] [c00000000059ff64] .__rpc_execute+0xbc/0x3c0
>>> [c0000001f663bb90] [c0000000000797f8] .process_one_work+0x1c0/0x474
>>> [c0000001f663bc40] [c00000000007a518] .worker_thread+0x17c/0x54c
>>> [c0000001f663bd30] [c00000000007fa8c] .kthread+0x104/0x124
>>> [c0000001f663be30]
>>> [c000000000000884] .ret_from_kernel_thread+0x58/0xd4 Instruction
>>> dump: 7d1f3a14 7c6a1850 e9580000 7fbd4050 786334e4 e90a0000
>>> 7c63ba14 f8490028 7c63ea14 7d0903a6 e84a0008 4e800421 <e8490028>
>>> 7c641b78 78270464 e9580008
>>
>> Which is:
>>
>> add r8, r31, r7
>> subf r3, r10, r3
>> ld r10, 0(r24)
>> subf r29, r29, r8
>> rldicr r3, r3, 6, 51
>> ld r8, 0(r10)
>> add r3, r3, r23
>> std r2, 40(r9)
>> add r3, r3, r29
>> mtctr r8
>> ld r2, 8(r10)
>> bctrl
>> ld r2, 40(r9) <---
>> mr r4, r3
>> rldicr r7, r1, 0, 49
>> ld r10, 8(r24)
>>
>>
>> Which looks a bit odd. I'd expect us to be saving/restoring r2 to the
>> stack, though maybe r9 was pointing at the stack?
>
> Nice catch! This looks like a compiler bug.
>
>> Looking at your vmlinux.broken I don't see the same code gen.
>
> For whatever reason we ended up with r10 this time:
>
> 7c 2a 0b 78 mr r10,r1
> ...
> f8 4a 00 28 std r2,40(r10)
> 7c 63 ba 14 add r3,r3,r23
> 7c e9 03 a6 mtctr r7
> 7c 63 ea 14 add r3,r3,r29
> 38 a0 00 00 li r5,0
> e8 48 00 08 ld r2,8(r8)
> 4e 80 04 21 bctrl
> e8 4a 00 28 ld r2,40(r10)
>
> The indirect function call is allowed to clobber r10, gcc is doing
> something very wrong here.
Yeah, I couldn't see why the patch really would break anything either,
but it does for me with this (not quite old) version of gcc.
I also don't see the breakage on LE machines that compile with a newer
version of gcc (4.8.3).
Alex
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
2014-12-17 9:27 ` Alexander Graf
@ 2014-12-22 6:49 ` Michael Ellerman
0 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2014-12-22 6:49 UTC (permalink / raw)
To: Alexander Graf
Cc: ulrich.weigand, paulus, Anton Blanchard, Scott Wood, linuxppc-dev
On Wed, 2014-12-17 at 10:27 +0100, Alexander Graf wrote:
>
> On 17.12.14 04:44, Anton Blanchard wrote:
> > Hi Alex,
> >
> >> Git bisect managed to point me to this commit as the offender for
> >> OOPSes on e5500 and e6500 (and maybe the G4 as well, not sure).
> >>
> >> Doing a git revert of this commit on top of linus/master makes things
> >> work fine for me again.
> >
> > Ouch, sorry for that, I'll work to reproduce. What gcc version are you
> > using?
>
> I'm running
>
> gcc (SUSE Linux) 4.7.2 20130108 [gcc-4_7-branch revision 195012]
>
> which is basically the one from openSUSE 12.3 for ppc64.
OK. I've also reproduced it with what I think is stock 4.7.3.
I only see it in __skb_checksum() though, which is a bit odd. It is a fairly
large routine, so maybe it's just register pressure?
I tried for a while to create a self contained test case, but couldn't get
anything to work.
I guess we'll have to revert the kernel patch, unless someone can identify the
gcc bug soon.
I also wonder if it's fixed in later versions, or we're just not seeing it due
to good luck.
Below is the broken vs working code. For some reason in the bad case it backs
up r1, as if it thinks it will be clobbered by the rldicr.
Bad:
c00000000076df84: 7c 29 0b 78 mr r9,r1 <-- save of r1 ?
c00000000076df88: 78 27 04 64 rldicr r7,r1,0,49 <-- current_thread_info()
c00000000076df8c: 7d 08 ea 14 add r8,r8,r29
c00000000076df90: 7c 9f 40 50 subf r4,r31,r8
c00000000076df94: 7d 12 07 b4 extsw r18,r8
c00000000076df98: 2f 04 00 00 cmpwi cr6,r4,0
c00000000076df9c: 7c 9c 23 78 mr r28,r4
c00000000076dfa0: 7f 84 f0 00 cmpw cr7,r4,r30
c00000000076dfa4: 40 99 00 cc ble- cr6,c00000000076e070 <.__skb_checksum+0x220>
c00000000076dfa8: 40 9d 00 08 ble- cr7,c00000000076dfb0 <.__skb_checksum+0x160>
c00000000076dfac: 7f dc f3 78 mr r28,r30
c00000000076dfb0: 81 07 00 14 lwz r8,20(r7)
c00000000076dfb4: e8 6a 00 00 ld r3,0(r10)
c00000000076dfb8: 7f 94 07 b4 extsw r20,r28
c00000000076dfbc: 39 08 00 01 addi r8,r8,1
c00000000076dfc0: 91 07 00 14 stw r8,20(r7)
c00000000076dfc4: 7c 63 b2 14 add r3,r3,r22
c00000000076dfc8: 7e 84 a3 78 mr r4,r20
c00000000076dfcc: 7c 63 1e 74 sradi r3,r3,3
c00000000076dfd0: 38 a0 00 00 li r5,0
c00000000076dfd4: 7c 63 c1 d2 mulld r3,r3,r24
c00000000076dfd8: 81 0a 00 08 lwz r8,8(r10)
c00000000076dfdc: 7d 1f 42 14 add r8,r31,r8
c00000000076dfe0: 78 63 83 e4 rldicr r3,r3,16,47
c00000000076dfe4: 7f bd 40 50 subf r29,r29,r8
c00000000076dfe8: 7c 63 bb 78 or r3,r3,r23
c00000000076dfec: 7c 63 ea 14 add r3,r3,r29
c00000000076dff0: e9 59 00 00 ld r10,0(r25)
c00000000076dff4: e9 0a 00 00 ld r8,0(r10)
c00000000076dff8: f8 49 00 28 std r2,40(r9) <-- use of r9 which is OK
c00000000076dffc: 7d 09 03 a6 mtctr r8
c00000000076e000: e8 4a 00 08 ld r2,8(r10)
c00000000076e004: 4e 80 04 21 bctrl
c00000000076e008: e8 49 00 28 ld r2,40(r9) <-- but not OK here
Good:
c00000000076e0b0: 78 39 04 64 rldicr r25,r1,0,49 <-- current_thread_info()
c00000000076e0b4: 7a 94 07 c6 rldicr r20,r20,32,31
c00000000076e0b8: 62 d6 6d b7 ori r22,r22,28087
c00000000076e0bc: 7a b5 00 44 rldicr r21,r21,0,1
c00000000076e0c0: 7d 5f f2 14 add r10,r31,r30
c00000000076e0c4: 39 1a 00 03 addi r8,r26,3
c00000000076e0c8: 79 08 26 e4 rldicr r8,r8,4,59
c00000000076e0cc: 7f 8a e8 00 cmpw cr7,r10,r29
c00000000076e0d0: 7d 40 00 26 mfcr r10
c00000000076e0d4: 55 4a ef fe rlwinm r10,r10,29,31,31
c00000000076e0d8: 7d 29 42 14 add r9,r9,r8
c00000000076e0dc: 0b 0a 00 00 tdnei r10,0
c00000000076e0e0: 81 49 00 0c lwz r10,12(r9)
c00000000076e0e4: 7d 4a ea 14 add r10,r10,r29
c00000000076e0e8: 7d 1f 50 50 subf r8,r31,r10
c00000000076e0ec: 7d 51 07 b4 extsw r17,r10
c00000000076e0f0: 2f 08 00 00 cmpwi cr6,r8,0
c00000000076e0f4: 7d 1c 43 78 mr r28,r8
c00000000076e0f8: 7f 88 f0 00 cmpw cr7,r8,r30
c00000000076e0fc: 40 99 00 c8 ble- cr6,c00000000076e1c4 <.__skb_checksum+0x214>
c00000000076e100: 40 9d 00 08 ble- cr7,c00000000076e108 <.__skb_checksum+0x158>
c00000000076e104: 7f dc f3 78 mr r28,r30
c00000000076e108: 81 59 00 14 lwz r10,20(r25)
c00000000076e10c: e8 69 00 00 ld r3,0(r9)
c00000000076e110: 7f 93 07 b4 extsw r19,r28
c00000000076e114: 39 4a 00 01 addi r10,r10,1
c00000000076e118: 91 59 00 14 stw r10,20(r25)
c00000000076e11c: 7c 63 a2 14 add r3,r3,r20
c00000000076e120: 7e 64 9b 78 mr r4,r19
c00000000076e124: 7c 63 1e 74 sradi r3,r3,3
c00000000076e128: 38 a0 00 00 li r5,0
c00000000076e12c: 7c 63 b1 d2 mulld r3,r3,r22
c00000000076e130: 81 49 00 08 lwz r10,8(r9)
c00000000076e134: 7d 5f 52 14 add r10,r31,r10
c00000000076e138: 78 63 83 e4 rldicr r3,r3,16,47
c00000000076e13c: 7f bd 50 50 subf r29,r29,r10
c00000000076e140: 7c 63 ab 78 or r3,r3,r21
c00000000076e144: 7c 63 ea 14 add r3,r3,r29
c00000000076e148: e9 37 00 00 ld r9,0(r23)
c00000000076e14c: e9 49 00 00 ld r10,0(r9)
c00000000076e150: f8 41 00 28 std r2,40(r1) <-- correct use of r1
c00000000076e154: 7d 49 03 a6 mtctr r10
c00000000076e158: e8 49 00 08 ld r2,8(r9)
c00000000076e15c: 4e 80 04 21 bctrl
c00000000076e160: e8 41 00 28 ld r2,40(r1) <-- correct use of r1
cheers
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
2014-12-18 6:25 ` Anton Blanchard
2014-12-18 15:02 ` Alexander Graf
@ 2014-12-31 12:24 ` Alan Modra
2015-01-07 5:12 ` Anton Blanchard
1 sibling, 1 reply; 13+ messages in thread
From: Alan Modra @ 2014-12-31 12:24 UTC (permalink / raw)
To: Anton Blanchard
Cc: Alexander Graf, ulrich.weigand, paulus, Scott Wood, linuxppc-dev
On Thu, Dec 18, 2014 at 05:25:46PM +1100, Anton Blanchard wrote:
> On Thu, 18 Dec 2014 16:11:54 +1100
> Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> > On Wed, 2014-12-17 at 02:16 +0100, Alexander Graf wrote:
> > > On 31.10.14 04:47, Anton Blanchard wrote:
> > > > LLVM doesn't support local named register variables and is
> > > > unlikely to. current_thread_info is using one, fix it by moving
> > > > it out and calling it __current_r1().
> > > >
> > > > I gave it a bit of an obscure name because we don't want anyone
> > > > else using it - they should use current_stack_pointer(). This
> > > > specific case is performance critical and we can't afford to call
> > > > a function to get it. Furthermore it isn't important to know
> > > > exactly where in the stack we are since we mask the lower bits.
> > > >
> > > > Signed-off-by: Anton Blanchard <anton@samba.org>
> > >
> > > Git bisect managed to point me to this commit as the offender for
> > > OOPSes on e5500 and e6500 (and maybe the G4 as well, not sure).
> > >
> > > Doing a git revert of this commit on top of linus/master makes
> > > things work fine for me again.
> > >
> > >
> > > Alex
> > >
> > > Oops: Kernel access of bad area, sig: 11 [#2]
> > > SMP NR_CPUS=16 CoreNet Generic
> > > Modules linked in:
> > > CPU: 1 PID: 339 Comm: kworker/1:1 Tainted: G D
> > > 3.18.0-09423-g988adfd #1
> > > Workqueue: rpciod .rpc_async_schedule
> > > task: c0000001f6397500 ti: c0000001f6638000 task.ti:
> > > c0000001f6638000 NIP: c0000000004817a4 LR: c0000000004817a4 CTR:
> > > 0000000000000000 REGS: c0000001f663b0e0 TRAP: 0300 Tainted:
> > > G D (3.18.0-09423-g988adfd)
> > > MSR: 0000000080029000 <CE,EE,ME> CR: 24ad2e42 XER: 00000000
> > > DEAR: 202031303438355f ESR: 0000000000000000 SOFTE: 1
> > = r9 + 40
> >
> > > GPR00: c0000000004817a4 c0000001f663b360 c000000000988028
> > > 000000007f24333d GPR04: 5ff5738c1f2ebfb1 0000000000000000
> > > 0000000000000000 00000000000008f8 GPR08: c000000000480ae8
> > > 2020313034383537 36204b4220617320 6469726563740a31 GPR12:
> > > 3937302d30312d30 c00000000fff8780 c00000000007f988 c0000001f64c1600
> >
> > GPRs 9-12 say: " 1048576 KB as direct\n1970-01-0"
> >
> > Which is rarely a good sign :)
> >
> > Looks like it might be part of your dmesg from setup_page_sizes().
> >
> > > GPR16: 0000000000000000 0000000000000000 0000000000000000
> > > 00000000000005dc GPR20: c0000000009b8028 c00000007e034200
> > > 0000000000000548 c000000000000000 GPR24: c0000001f663b4b0
> > > 00000000b225831e 0000000000000000 0000000000000080 GPR28:
> > > 0000000000000548 00000000000008f8 0000000000000548 0000000000000094
> > > NIP [c0000000004817a4] .__skb_checksum+0x194/0x378 LR
> > > [c0000000004817a4] .__skb_checksum+0x194/0x378 Call Trace:
> > > [c0000001f663b360] [c0000000004817a4] .__skb_checksum+0x194/0x378
> > > (unreliable)
> > > [c0000001f663b440] [c0000000004819b4] .skb_checksum+0x2c/0x3c
> > > [c0000001f663b4c0] [c0000000004fd0a8] .udp4_hwcsum+0xa8/0x16c
> > > [c0000001f663b560] [c0000000004fd440] .udp_send_skb+0x2d4/0x370
> > > [c0000001f663b600]
> > > [c0000000004fd51c] .udp_push_pending_frames+0x40/0x94
> > > [c0000001f663b680] [c0000000004fec08] .udp_sendpage+0x150/0x1b4
> > > [c0000001f663b770] [c00000000050ae54] .inet_sendpage+0xa0/0x120
> > > [c0000001f663b810] [c00000000059c8cc] .xs_sendpages+0x2d0/0x30c
> > > [c0000001f663b8d0]
> > > [c00000000059cae4] .xs_udp_send_request+0x58/0x120
> > > [c0000001f663b970] [c000000000598f04] .xprt_transmit+0x80/0x36c
> > > [c0000001f663ba20] [c0000000005942d8] .call_transmit+0x19c/0x254
> > > [c0000001f663bab0] [c00000000059ff64] .__rpc_execute+0xbc/0x3c0
> > > [c0000001f663bb90] [c0000000000797f8] .process_one_work+0x1c0/0x474
> > > [c0000001f663bc40] [c00000000007a518] .worker_thread+0x17c/0x54c
> > > [c0000001f663bd30] [c00000000007fa8c] .kthread+0x104/0x124
> > > [c0000001f663be30]
> > > [c000000000000884] .ret_from_kernel_thread+0x58/0xd4 Instruction
> > > dump: 7d1f3a14 7c6a1850 e9580000 7fbd4050 786334e4 e90a0000
> > > 7c63ba14 f8490028 7c63ea14 7d0903a6 e84a0008 4e800421 <e8490028>
> > > 7c641b78 78270464 e9580008
> >
> > Which is:
> >
> > add r8, r31, r7
> > subf r3, r10, r3
> > ld r10, 0(r24)
> > subf r29, r29, r8
> > rldicr r3, r3, 6, 51
> > ld r8, 0(r10)
> > add r3, r3, r23
> > std r2, 40(r9)
> > add r3, r3, r29
> > mtctr r8
> > ld r2, 8(r10)
> > bctrl
> > ld r2, 40(r9) <---
> > mr r4, r3
> > rldicr r7, r1, 0, 49
> > ld r10, 8(r24)
> >
> >
> > Which looks a bit odd. I'd expect us to be saving/restoring r2 to the
> > stack, though maybe r9 was pointing at the stack?
>
> Nice catch! This looks like a compiler bug.
Yes, it is.
> > Looking at your vmlinux.broken I don't see the same code gen.
>
> For whatever reason we ended up with r10 this time:
>
> 7c 2a 0b 78 mr r10,r1
> ...
> f8 4a 00 28 std r2,40(r10)
> 7c 63 ba 14 add r3,r3,r23
> 7c e9 03 a6 mtctr r7
> 7c 63 ea 14 add r3,r3,r29
> 38 a0 00 00 li r5,0
> e8 48 00 08 ld r2,8(r8)
> 4e 80 04 21 bctrl
> e8 4a 00 28 ld r2,40(r10)
>
> The indirect function call is allowed to clobber r10, gcc is doing
> something very wrong here.
Right. This is really an rs6000 backend bug. We describe one of the
indirect calls that go wrong here as
(call_insn 108 107 109 13 (parallel [
(set (reg:DI 3 3)
(call (mem:SI (reg:DI 288) [0 *_67 S4 A8])
(const_int 64 [0x40])))
(use (mem:DI (plus:DI (reg/f:DI 287 [ ops_44(D)->update ])
(const_int 8 [0x8])) [0 S8 A8]))
(set (reg:DI 2 2)
(mem/v/c:DI (plus:DI (reg/f:DI 1 1)
(const_int 40 [0x28])) [0 S8 A8]))
(clobber (reg:DI 65 lr))
]) net/core/skbuff.c:2085 680 {*call_value_indirect_aixdi}
<notes and arg uses omitted for clarity>
)
Notice that the RTL contains a "parallel". As you might guess, gcc
treats the vector of expressions inside the square brackets of the
parallel as happening "in parallel". Meaning that as far as gcc is
concerned the toc restore part (third element) happens at the same
time as the call (first element). So if gcc replaces (reg:DI 1) in
the toc restore with some other register known to have the same value
*before* the call, gcc's RTL analysis will conclude that such a
replacement is valid.
That's what happens in the cprop1 pass. The rtl dump shows
LOCAL COPY-PROP: Replacing reg 1 in insn 108 with reg 203
and then it's a matter of luck just what hard register is allocated to
pseudo-reg 203.
Of course, replacing r1 with some other register is a completely
useless thing to do, but trying to tell gcc that in our particular
case we want this generic optimisation disabled isn't so easy. (Well,
it's dead easy if you want to hack cprop.c:do_local_cprop, just rip
out
|| (GET_CODE (PATTERN (insn)) != USE
&& asm_noperands (PATTERN (insn)) < 0)))
but maybe not so easy to get such patches committed..) Instead, the
way I'd go about fixing this is removing the r1 reference in our toc
save/restore RTL, ie. don't use a mem, use an unspec.
--
Alan Modra
Australia Development Lab, IBM
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
2014-12-31 12:24 ` Alan Modra
@ 2015-01-07 5:12 ` Anton Blanchard
2015-01-07 17:59 ` Scott Wood
2015-01-08 6:36 ` Alan Modra
0 siblings, 2 replies; 13+ messages in thread
From: Anton Blanchard @ 2015-01-07 5:12 UTC (permalink / raw)
To: Alan Modra
Cc: Alexander Graf, ulrich.weigand, paulus, Scott Wood, linuxppc-dev
Hi Alan,
> Right. This is really an rs6000 backend bug. We describe one of the
> indirect calls that go wrong here as
>
> (call_insn 108 107 109 13 (parallel [
> (set (reg:DI 3 3)
> (call (mem:SI (reg:DI 288) [0 *_67 S4 A8])
> (const_int 64 [0x40])))
> (use (mem:DI (plus:DI (reg/f:DI 287 [ ops_44(D)->update ])
> (const_int 8 [0x8])) [0 S8 A8]))
> (set (reg:DI 2 2)
> (mem/v/c:DI (plus:DI (reg/f:DI 1 1)
> (const_int 40 [0x28])) [0 S8 A8]))
> (clobber (reg:DI 65 lr))
> ]) net/core/skbuff.c:2085 680 {*call_value_indirect_aixdi}
> <notes and arg uses omitted for clarity>
> )
>
> Notice that the RTL contains a "parallel". As you might guess, gcc
> treats the vector of expressions inside the square brackets of the
> parallel as happening "in parallel". Meaning that as far as gcc is
> concerned the toc restore part (third element) happens at the same
> time as the call (first element). So if gcc replaces (reg:DI 1) in
> the toc restore with some other register known to have the same value
> *before* the call, gcc's RTL analysis will conclude that such a
> replacement is valid.
Thanks for looking into this. Does that mean we were just getting lucky
with the previous version:
static inline struct thread_info *current_thread_info(void)
{
register unsigned long sp asm("r1");
return (struct thread_info *)(sp & ~(THREAD_SIZE-1));
}
ie a static register asm instead of a global one. If so the safest fix
for now might be to just eat the overead of a register move:
static inline struct thread_info *current_thread_info(void)
{
unsigned long sp;
asm("mr %0,1": "=r"(sp));
return (struct thread_info *)(sp & ~(THREAD_SIZE-1));
}
Anton
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
2015-01-07 5:12 ` Anton Blanchard
@ 2015-01-07 17:59 ` Scott Wood
2015-01-08 6:36 ` Alan Modra
1 sibling, 0 replies; 13+ messages in thread
From: Scott Wood @ 2015-01-07 17:59 UTC (permalink / raw)
To: Anton Blanchard
Cc: Alan Modra, Alexander Graf, ulrich.weigand, paulus, linuxppc-dev
On Wed, 2015-01-07 at 16:12 +1100, Anton Blanchard wrote:
> Thanks for looking into this. Does that mean we were just getting lucky
> with the previous version:
>
> static inline struct thread_info *current_thread_info(void)
> {
> register unsigned long sp asm("r1");
>
> return (struct thread_info *)(sp & ~(THREAD_SIZE-1));
> }
>
> ie a static register asm instead of a global one. If so the safest fix
> for now might be to just eat the overead of a register move:
>
> static inline struct thread_info *current_thread_info(void)
> {
> unsigned long sp;
>
> asm("mr %0,1": "=r"(sp));
> return (struct thread_info *)(sp & ~(THREAD_SIZE-1));
You could avoid the register move by doing a rlwinm/rldicr in inline
asm, if it matters enough.
-Scott
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
2015-01-07 5:12 ` Anton Blanchard
2015-01-07 17:59 ` Scott Wood
@ 2015-01-08 6:36 ` Alan Modra
1 sibling, 0 replies; 13+ messages in thread
From: Alan Modra @ 2015-01-08 6:36 UTC (permalink / raw)
To: Anton Blanchard
Cc: Alexander Graf, ulrich.weigand, paulus, Scott Wood, linuxppc-dev
On Wed, Jan 07, 2015 at 04:12:47PM +1100, Anton Blanchard wrote:
> Hi Alan,
>
> > Right. This is really an rs6000 backend bug. We describe one of the
> > indirect calls that go wrong here as
> >
> > (call_insn 108 107 109 13 (parallel [
> > (set (reg:DI 3 3)
> > (call (mem:SI (reg:DI 288) [0 *_67 S4 A8])
> > (const_int 64 [0x40])))
> > (use (mem:DI (plus:DI (reg/f:DI 287 [ ops_44(D)->update ])
> > (const_int 8 [0x8])) [0 S8 A8]))
> > (set (reg:DI 2 2)
> > (mem/v/c:DI (plus:DI (reg/f:DI 1 1)
> > (const_int 40 [0x28])) [0 S8 A8]))
> > (clobber (reg:DI 65 lr))
> > ]) net/core/skbuff.c:2085 680 {*call_value_indirect_aixdi}
> > <notes and arg uses omitted for clarity>
> > )
> >
> > Notice that the RTL contains a "parallel". As you might guess, gcc
> > treats the vector of expressions inside the square brackets of the
> > parallel as happening "in parallel". Meaning that as far as gcc is
> > concerned the toc restore part (third element) happens at the same
> > time as the call (first element). So if gcc replaces (reg:DI 1) in
> > the toc restore with some other register known to have the same value
> > *before* the call, gcc's RTL analysis will conclude that such a
> > replacement is valid.
>
> Thanks for looking into this. Does that mean we were just getting lucky
> with the previous version:
>
> static inline struct thread_info *current_thread_info(void)
> {
> register unsigned long sp asm("r1");
>
> return (struct thread_info *)(sp & ~(THREAD_SIZE-1));
> }
With both versions, the original rtl for current_thread_info consists
of two instructions, copy r1 to a pseudo reg, then the "and". With
the above version, fwprop1 manages to combine this to a single "and"
insn. When using a global reg var, fprop1 leaves the two
instructions, the copy causing the trouble in the following cprop1
pass. So it's not that we are getting lucky in cprop1, but that
fwprop1 behaves differently with global vs. local registers.
--
Alan Modra
Australia Development Lab, IBM
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
@ 2014-10-31 3:47 Anton Blanchard
0 siblings, 0 replies; 13+ messages in thread
From: Anton Blanchard @ 2014-10-31 3:47 UTC (permalink / raw)
To: benh, paulus, mpe, ulrich.weigand; +Cc: linuxppc-dev
LLVM doesn't support local named register variables and is unlikely
to. current_thread_info is using one, fix it by moving it out and
calling it __current_r1().
I gave it a bit of an obscure name because we don't want anyone else
using it - they should use current_stack_pointer(). This specific
case is performance critical and we can't afford to call a function
to get it. Furthermore it isn't important to know exactly where in
the stack we are since we mask the lower bits.
Signed-off-by: Anton Blanchard <anton@samba.org>
---
arch/powerpc/include/asm/thread_info.h | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index b034ecd..ebc4f16 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -71,13 +71,12 @@ struct thread_info {
#define THREAD_SIZE_ORDER (THREAD_SHIFT - PAGE_SHIFT)
/* how to get the thread information struct from C */
+register unsigned long __current_r1 asm("r1");
static inline struct thread_info *current_thread_info(void)
{
- register unsigned long sp asm("r1");
-
/* gcc4, at least, is smart enough to turn this into a single
* rlwinm for ppc32 and clrrdi for ppc64 */
- return (struct thread_info *)(sp & ~(THREAD_SIZE-1));
+ return (struct thread_info *)(__current_r1 & ~(THREAD_SIZE-1));
}
#endif /* __ASSEMBLY__ */
--
1.9.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
end of thread, other threads:[~2015-01-08 6:36 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <1414727247-31838-1-git-send-email-anton__19440.5086375356$1414727300$gmane$org@samba.org>
2014-12-17 1:16 ` [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info Alexander Graf
2014-12-17 3:44 ` Anton Blanchard
2014-12-17 9:27 ` Alexander Graf
2014-12-22 6:49 ` Michael Ellerman
2014-12-18 5:11 ` Michael Ellerman
2014-12-18 6:25 ` Anton Blanchard
2014-12-18 15:02 ` Alexander Graf
2014-12-31 12:24 ` Alan Modra
2015-01-07 5:12 ` Anton Blanchard
2015-01-07 17:59 ` Scott Wood
2015-01-08 6:36 ` Alan Modra
2014-12-18 14:56 ` Alexander Graf
2014-10-31 3:47 Anton Blanchard
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.