All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
       [not found] <1414727247-31838-1-git-send-email-anton__19440.5086375356$1414727300$gmane$org@samba.org>
@ 2014-12-17  1:16 ` Alexander Graf
  2014-12-17  3:44   ` Anton Blanchard
  2014-12-18  5:11   ` Michael Ellerman
  0 siblings, 2 replies; 13+ messages in thread
From: Alexander Graf @ 2014-12-17  1:16 UTC (permalink / raw)
  To: Anton Blanchard, benh, paulus, mpe, ulrich.weigand
  Cc: Scott Wood, linuxppc-dev

On 31.10.14 04:47, Anton Blanchard wrote:
> LLVM doesn't support local named register variables and is unlikely
> to. current_thread_info is using one, fix it by moving it out and
> calling it __current_r1().
> 
> I gave it a bit of an obscure name because we don't want anyone else
> using it - they should use current_stack_pointer(). This specific
> case is performance critical and we can't afford to call a function
> to get it. Furthermore it isn't important to know exactly where in
> the stack we are since we mask the lower bits.
> 
> Signed-off-by: Anton Blanchard <anton@samba.org>

Git bisect managed to point me to this commit as the offender for OOPSes
on e5500 and e6500 (and maybe the G4 as well, not sure).

Doing a git revert of this commit on top of linus/master makes things
work fine for me again.


Alex

Oops: Kernel access of bad area, sig: 11 [#2]
SMP NR_CPUS=16 CoreNet Generic
Modules linked in:
CPU: 1 PID: 339 Comm: kworker/1:1 Tainted: G      D
3.18.0-09423-g988adfd #1
Workqueue: rpciod .rpc_async_schedule
task: c0000001f6397500 ti: c0000001f6638000 task.ti: c0000001f6638000
NIP: c0000000004817a4 LR: c0000000004817a4 CTR: 0000000000000000
REGS: c0000001f663b0e0 TRAP: 0300   Tainted: G      D
(3.18.0-09423-g988adfd)
MSR: 0000000080029000 <CE,EE,ME>  CR: 24ad2e42  XER: 00000000
DEAR: 202031303438355f ESR: 0000000000000000 SOFTE: 1
GPR00: c0000000004817a4 c0000001f663b360 c000000000988028 000000007f24333d
GPR04: 5ff5738c1f2ebfb1 0000000000000000 0000000000000000 00000000000008f8
GPR08: c000000000480ae8 2020313034383537 36204b4220617320 6469726563740a31
GPR12: 3937302d30312d30 c00000000fff8780 c00000000007f988 c0000001f64c1600
GPR16: 0000000000000000 0000000000000000 0000000000000000 00000000000005dc
GPR20: c0000000009b8028 c00000007e034200 0000000000000548 c000000000000000
GPR24: c0000001f663b4b0 00000000b225831e 0000000000000000 0000000000000080
GPR28: 0000000000000548 00000000000008f8 0000000000000548 0000000000000094
NIP [c0000000004817a4] .__skb_checksum+0x194/0x378
LR [c0000000004817a4] .__skb_checksum+0x194/0x378
Call Trace:
[c0000001f663b360] [c0000000004817a4] .__skb_checksum+0x194/0x378
(unreliable)
[c0000001f663b440] [c0000000004819b4] .skb_checksum+0x2c/0x3c
[c0000001f663b4c0] [c0000000004fd0a8] .udp4_hwcsum+0xa8/0x16c
[c0000001f663b560] [c0000000004fd440] .udp_send_skb+0x2d4/0x370
[c0000001f663b600] [c0000000004fd51c] .udp_push_pending_frames+0x40/0x94
[c0000001f663b680] [c0000000004fec08] .udp_sendpage+0x150/0x1b4
[c0000001f663b770] [c00000000050ae54] .inet_sendpage+0xa0/0x120
[c0000001f663b810] [c00000000059c8cc] .xs_sendpages+0x2d0/0x30c
[c0000001f663b8d0] [c00000000059cae4] .xs_udp_send_request+0x58/0x120
[c0000001f663b970] [c000000000598f04] .xprt_transmit+0x80/0x36c
[c0000001f663ba20] [c0000000005942d8] .call_transmit+0x19c/0x254
[c0000001f663bab0] [c00000000059ff64] .__rpc_execute+0xbc/0x3c0
[c0000001f663bb90] [c0000000000797f8] .process_one_work+0x1c0/0x474
[c0000001f663bc40] [c00000000007a518] .worker_thread+0x17c/0x54c
[c0000001f663bd30] [c00000000007fa8c] .kthread+0x104/0x124
[c0000001f663be30] [c000000000000884] .ret_from_kernel_thread+0x58/0xd4
Instruction dump:
7d1f3a14 7c6a1850 e9580000 7fbd4050 786334e4 e90a0000 7c63ba14 f8490028
7c63ea14 7d0903a6 e84a0008 4e800421 <e8490028> 7c641b78 78270464 e9580008
---[ end trace 51b7414695b0cafe ]---

note: kworker/1:1[339] exited with preempt_count 1
Unable to handle kernel paging request for data at address
0xffffffffffffffd8

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
  2014-12-17  1:16 ` [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info Alexander Graf
@ 2014-12-17  3:44   ` Anton Blanchard
  2014-12-17  9:27     ` Alexander Graf
  2014-12-18  5:11   ` Michael Ellerman
  1 sibling, 1 reply; 13+ messages in thread
From: Anton Blanchard @ 2014-12-17  3:44 UTC (permalink / raw)
  To: Alexander Graf; +Cc: ulrich.weigand, paulus, Scott Wood, linuxppc-dev

Hi Alex,

> Git bisect managed to point me to this commit as the offender for
> OOPSes on e5500 and e6500 (and maybe the G4 as well, not sure).
> 
> Doing a git revert of this commit on top of linus/master makes things
> work fine for me again.

Ouch, sorry for that, I'll work to reproduce. What gcc version are you
using?

Anton

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
  2014-12-17  3:44   ` Anton Blanchard
@ 2014-12-17  9:27     ` Alexander Graf
  2014-12-22  6:49       ` Michael Ellerman
  0 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2014-12-17  9:27 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: ulrich.weigand, paulus, Scott Wood, linuxppc-dev



On 17.12.14 04:44, Anton Blanchard wrote:
> Hi Alex,
> 
>> Git bisect managed to point me to this commit as the offender for
>> OOPSes on e5500 and e6500 (and maybe the G4 as well, not sure).
>>
>> Doing a git revert of this commit on top of linus/master makes things
>> work fine for me again.
> 
> Ouch, sorry for that, I'll work to reproduce. What gcc version are you
> using?

I'm running

  gcc (SUSE Linux) 4.7.2 20130108 [gcc-4_7-branch revision 195012]

which is basically the one from openSUSE 12.3 for ppc64.

I've also uploaded 2 builds, one with the patch applied (broken) and one
without (works):

  http://csgraf.de/agraf/current_thread_info/vmlinux.broken.xz
  http://csgraf.de/agraf/current_thread_info/vmlinux.works.xz

Interestingly enough I did not see this on IBM POWER systems, but maybe
that's because most of them compile their own kernels with local,
different compilers in my test runs and the only one that does base on
the same compiler doesn't run on nfsroot.

My iBook G4 target is affected by this as well and reverting this patch
also makes it work again. Maybe we're just running over some stack?


Alex

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
  2014-12-17  1:16 ` [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info Alexander Graf
  2014-12-17  3:44   ` Anton Blanchard
@ 2014-12-18  5:11   ` Michael Ellerman
  2014-12-18  6:25     ` Anton Blanchard
  2014-12-18 14:56     ` Alexander Graf
  1 sibling, 2 replies; 13+ messages in thread
From: Michael Ellerman @ 2014-12-18  5:11 UTC (permalink / raw)
  To: Alexander Graf
  Cc: ulrich.weigand, paulus, Anton Blanchard, Scott Wood, linuxppc-dev

On Wed, 2014-12-17 at 02:16 +0100, Alexander Graf wrote:
> On 31.10.14 04:47, Anton Blanchard wrote:
> > LLVM doesn't support local named register variables and is unlikely
> > to. current_thread_info is using one, fix it by moving it out and
> > calling it __current_r1().
> > 
> > I gave it a bit of an obscure name because we don't want anyone else
> > using it - they should use current_stack_pointer(). This specific
> > case is performance critical and we can't afford to call a function
> > to get it. Furthermore it isn't important to know exactly where in
> > the stack we are since we mask the lower bits.
> > 
> > Signed-off-by: Anton Blanchard <anton@samba.org>
> 
> Git bisect managed to point me to this commit as the offender for OOPSes
> on e5500 and e6500 (and maybe the G4 as well, not sure).
> 
> Doing a git revert of this commit on top of linus/master makes things
> work fine for me again.
> 
> 
> Alex
> 
> Oops: Kernel access of bad area, sig: 11 [#2]
> SMP NR_CPUS=16 CoreNet Generic
> Modules linked in:
> CPU: 1 PID: 339 Comm: kworker/1:1 Tainted: G      D
> 3.18.0-09423-g988adfd #1
> Workqueue: rpciod .rpc_async_schedule
> task: c0000001f6397500 ti: c0000001f6638000 task.ti: c0000001f6638000
> NIP: c0000000004817a4 LR: c0000000004817a4 CTR: 0000000000000000
> REGS: c0000001f663b0e0 TRAP: 0300   Tainted: G      D
> (3.18.0-09423-g988adfd)
> MSR: 0000000080029000 <CE,EE,ME>  CR: 24ad2e42  XER: 00000000
> DEAR: 202031303438355f ESR: 0000000000000000 SOFTE: 1
      = r9 + 40

> GPR00: c0000000004817a4 c0000001f663b360 c000000000988028 000000007f24333d
> GPR04: 5ff5738c1f2ebfb1 0000000000000000 0000000000000000 00000000000008f8
> GPR08: c000000000480ae8 2020313034383537 36204b4220617320 6469726563740a31
> GPR12: 3937302d30312d30 c00000000fff8780 c00000000007f988 c0000001f64c1600

GPRs 9-12 say: " 1048576 KB as direct\n1970-01-0"

Which is rarely a good sign :)

Looks like it might be part of your dmesg from setup_page_sizes().

> GPR16: 0000000000000000 0000000000000000 0000000000000000 00000000000005dc
> GPR20: c0000000009b8028 c00000007e034200 0000000000000548 c000000000000000
> GPR24: c0000001f663b4b0 00000000b225831e 0000000000000000 0000000000000080
> GPR28: 0000000000000548 00000000000008f8 0000000000000548 0000000000000094
> NIP [c0000000004817a4] .__skb_checksum+0x194/0x378
> LR [c0000000004817a4] .__skb_checksum+0x194/0x378
> Call Trace:
> [c0000001f663b360] [c0000000004817a4] .__skb_checksum+0x194/0x378
> (unreliable)
> [c0000001f663b440] [c0000000004819b4] .skb_checksum+0x2c/0x3c
> [c0000001f663b4c0] [c0000000004fd0a8] .udp4_hwcsum+0xa8/0x16c
> [c0000001f663b560] [c0000000004fd440] .udp_send_skb+0x2d4/0x370
> [c0000001f663b600] [c0000000004fd51c] .udp_push_pending_frames+0x40/0x94
> [c0000001f663b680] [c0000000004fec08] .udp_sendpage+0x150/0x1b4
> [c0000001f663b770] [c00000000050ae54] .inet_sendpage+0xa0/0x120
> [c0000001f663b810] [c00000000059c8cc] .xs_sendpages+0x2d0/0x30c
> [c0000001f663b8d0] [c00000000059cae4] .xs_udp_send_request+0x58/0x120
> [c0000001f663b970] [c000000000598f04] .xprt_transmit+0x80/0x36c
> [c0000001f663ba20] [c0000000005942d8] .call_transmit+0x19c/0x254
> [c0000001f663bab0] [c00000000059ff64] .__rpc_execute+0xbc/0x3c0
> [c0000001f663bb90] [c0000000000797f8] .process_one_work+0x1c0/0x474
> [c0000001f663bc40] [c00000000007a518] .worker_thread+0x17c/0x54c
> [c0000001f663bd30] [c00000000007fa8c] .kthread+0x104/0x124
> [c0000001f663be30] [c000000000000884] .ret_from_kernel_thread+0x58/0xd4
> Instruction dump:
> 7d1f3a14 7c6a1850 e9580000 7fbd4050 786334e4 e90a0000 7c63ba14 f8490028
> 7c63ea14 7d0903a6 e84a0008 4e800421 <e8490028> 7c641b78 78270464 e9580008

Which is:

add      r8, r31, r7
subf     r3, r10, r3
ld      r10, 0(r24)
subf    r29, r29, r8
rldicr   r3, r3, 6, 51
ld       r8, 0(r10)
add      r3, r3, r23
std      r2, 40(r9)
add      r3, r3, r29
mtctr    r8
ld       r2, 8(r10)
bctrl
ld       r2, 40(r9)	<---
mr       r4, r3
rldicr   r7, r1, 0, 49
ld      r10, 8(r24)


Which looks a bit odd. I'd expect us to be saving/restoring r2 to the stack,
though maybe r9 was pointing at the stack?

Looking at your vmlinux.broken I don't see the same code gen.

Can you get an oops from a kernel and upload the exact binary? Or just post us
the full code dump of __skb_checksum() (or wherever it oopses).

cheers

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
  2014-12-18  5:11   ` Michael Ellerman
@ 2014-12-18  6:25     ` Anton Blanchard
  2014-12-18 15:02       ` Alexander Graf
  2014-12-31 12:24       ` Alan Modra
  2014-12-18 14:56     ` Alexander Graf
  1 sibling, 2 replies; 13+ messages in thread
From: Anton Blanchard @ 2014-12-18  6:25 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Alan Modra, Alexander Graf, ulrich.weigand, paulus, Scott Wood,
	linuxppc-dev

On Thu, 18 Dec 2014 16:11:54 +1100
Michael Ellerman <mpe@ellerman.id.au> wrote:

> On Wed, 2014-12-17 at 02:16 +0100, Alexander Graf wrote:
> > On 31.10.14 04:47, Anton Blanchard wrote:
> > > LLVM doesn't support local named register variables and is
> > > unlikely to. current_thread_info is using one, fix it by moving
> > > it out and calling it __current_r1().
> > > 
> > > I gave it a bit of an obscure name because we don't want anyone
> > > else using it - they should use current_stack_pointer(). This
> > > specific case is performance critical and we can't afford to call
> > > a function to get it. Furthermore it isn't important to know
> > > exactly where in the stack we are since we mask the lower bits.
> > > 
> > > Signed-off-by: Anton Blanchard <anton@samba.org>
> > 
> > Git bisect managed to point me to this commit as the offender for
> > OOPSes on e5500 and e6500 (and maybe the G4 as well, not sure).
> > 
> > Doing a git revert of this commit on top of linus/master makes
> > things work fine for me again.
> > 
> > 
> > Alex
> > 
> > Oops: Kernel access of bad area, sig: 11 [#2]
> > SMP NR_CPUS=16 CoreNet Generic
> > Modules linked in:
> > CPU: 1 PID: 339 Comm: kworker/1:1 Tainted: G      D
> > 3.18.0-09423-g988adfd #1
> > Workqueue: rpciod .rpc_async_schedule
> > task: c0000001f6397500 ti: c0000001f6638000 task.ti:
> > c0000001f6638000 NIP: c0000000004817a4 LR: c0000000004817a4 CTR:
> > 0000000000000000 REGS: c0000001f663b0e0 TRAP: 0300   Tainted:
> > G      D (3.18.0-09423-g988adfd)
> > MSR: 0000000080029000 <CE,EE,ME>  CR: 24ad2e42  XER: 00000000
> > DEAR: 202031303438355f ESR: 0000000000000000 SOFTE: 1
>       = r9 + 40
> 
> > GPR00: c0000000004817a4 c0000001f663b360 c000000000988028
> > 000000007f24333d GPR04: 5ff5738c1f2ebfb1 0000000000000000
> > 0000000000000000 00000000000008f8 GPR08: c000000000480ae8
> > 2020313034383537 36204b4220617320 6469726563740a31 GPR12:
> > 3937302d30312d30 c00000000fff8780 c00000000007f988 c0000001f64c1600
> 
> GPRs 9-12 say: " 1048576 KB as direct\n1970-01-0"
> 
> Which is rarely a good sign :)
> 
> Looks like it might be part of your dmesg from setup_page_sizes().
> 
> > GPR16: 0000000000000000 0000000000000000 0000000000000000
> > 00000000000005dc GPR20: c0000000009b8028 c00000007e034200
> > 0000000000000548 c000000000000000 GPR24: c0000001f663b4b0
> > 00000000b225831e 0000000000000000 0000000000000080 GPR28:
> > 0000000000000548 00000000000008f8 0000000000000548 0000000000000094
> > NIP [c0000000004817a4] .__skb_checksum+0x194/0x378 LR
> > [c0000000004817a4] .__skb_checksum+0x194/0x378 Call Trace:
> > [c0000001f663b360] [c0000000004817a4] .__skb_checksum+0x194/0x378
> > (unreliable)
> > [c0000001f663b440] [c0000000004819b4] .skb_checksum+0x2c/0x3c
> > [c0000001f663b4c0] [c0000000004fd0a8] .udp4_hwcsum+0xa8/0x16c
> > [c0000001f663b560] [c0000000004fd440] .udp_send_skb+0x2d4/0x370
> > [c0000001f663b600]
> > [c0000000004fd51c] .udp_push_pending_frames+0x40/0x94
> > [c0000001f663b680] [c0000000004fec08] .udp_sendpage+0x150/0x1b4
> > [c0000001f663b770] [c00000000050ae54] .inet_sendpage+0xa0/0x120
> > [c0000001f663b810] [c00000000059c8cc] .xs_sendpages+0x2d0/0x30c
> > [c0000001f663b8d0]
> > [c00000000059cae4] .xs_udp_send_request+0x58/0x120
> > [c0000001f663b970] [c000000000598f04] .xprt_transmit+0x80/0x36c
> > [c0000001f663ba20] [c0000000005942d8] .call_transmit+0x19c/0x254
> > [c0000001f663bab0] [c00000000059ff64] .__rpc_execute+0xbc/0x3c0
> > [c0000001f663bb90] [c0000000000797f8] .process_one_work+0x1c0/0x474
> > [c0000001f663bc40] [c00000000007a518] .worker_thread+0x17c/0x54c
> > [c0000001f663bd30] [c00000000007fa8c] .kthread+0x104/0x124
> > [c0000001f663be30]
> > [c000000000000884] .ret_from_kernel_thread+0x58/0xd4 Instruction
> > dump: 7d1f3a14 7c6a1850 e9580000 7fbd4050 786334e4 e90a0000
> > 7c63ba14 f8490028 7c63ea14 7d0903a6 e84a0008 4e800421 <e8490028>
> > 7c641b78 78270464 e9580008
> 
> Which is:
> 
> add      r8, r31, r7
> subf     r3, r10, r3
> ld      r10, 0(r24)
> subf    r29, r29, r8
> rldicr   r3, r3, 6, 51
> ld       r8, 0(r10)
> add      r3, r3, r23
> std      r2, 40(r9)
> add      r3, r3, r29
> mtctr    r8
> ld       r2, 8(r10)
> bctrl
> ld       r2, 40(r9)	<---
> mr       r4, r3
> rldicr   r7, r1, 0, 49
> ld      r10, 8(r24)
> 
> 
> Which looks a bit odd. I'd expect us to be saving/restoring r2 to the
> stack, though maybe r9 was pointing at the stack?

Nice catch! This looks like a compiler bug.

> Looking at your vmlinux.broken I don't see the same code gen.

For whatever reason we ended up with r10 this time:

7c 2a 0b 78     mr      r10,r1
...
f8 4a 00 28     std     r2,40(r10)
7c 63 ba 14     add     r3,r3,r23
7c e9 03 a6     mtctr   r7
7c 63 ea 14     add     r3,r3,r29
38 a0 00 00     li      r5,0
e8 48 00 08     ld      r2,8(r8)
4e 80 04 21     bctrl
e8 4a 00 28     ld      r2,40(r10)

The indirect function call is allowed to clobber r10, gcc is doing
something very wrong here.

Anton

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
  2014-12-18  5:11   ` Michael Ellerman
  2014-12-18  6:25     ` Anton Blanchard
@ 2014-12-18 14:56     ` Alexander Graf
  1 sibling, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2014-12-18 14:56 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: ulrich.weigand, paulus, Anton Blanchard, Scott Wood, linuxppc-dev



On 18.12.14 06:11, Michael Ellerman wrote:
> On Wed, 2014-12-17 at 02:16 +0100, Alexander Graf wrote:
>> On 31.10.14 04:47, Anton Blanchard wrote:
>>> LLVM doesn't support local named register variables and is unlikely
>>> to. current_thread_info is using one, fix it by moving it out and
>>> calling it __current_r1().
>>>
>>> I gave it a bit of an obscure name because we don't want anyone else
>>> using it - they should use current_stack_pointer(). This specific
>>> case is performance critical and we can't afford to call a function
>>> to get it. Furthermore it isn't important to know exactly where in
>>> the stack we are since we mask the lower bits.
>>>
>>> Signed-off-by: Anton Blanchard <anton@samba.org>
>>
>> Git bisect managed to point me to this commit as the offender for OOPSes
>> on e5500 and e6500 (and maybe the G4 as well, not sure).
>>
>> Doing a git revert of this commit on top of linus/master makes things
>> work fine for me again.
>>
>>
>> Alex
>>
>> Oops: Kernel access of bad area, sig: 11 [#2]
>> SMP NR_CPUS=16 CoreNet Generic
>> Modules linked in:
>> CPU: 1 PID: 339 Comm: kworker/1:1 Tainted: G      D
>> 3.18.0-09423-g988adfd #1
>> Workqueue: rpciod .rpc_async_schedule
>> task: c0000001f6397500 ti: c0000001f6638000 task.ti: c0000001f6638000
>> NIP: c0000000004817a4 LR: c0000000004817a4 CTR: 0000000000000000
>> REGS: c0000001f663b0e0 TRAP: 0300   Tainted: G      D
>> (3.18.0-09423-g988adfd)
>> MSR: 0000000080029000 <CE,EE,ME>  CR: 24ad2e42  XER: 00000000
>> DEAR: 202031303438355f ESR: 0000000000000000 SOFTE: 1
>       = r9 + 40
> 
>> GPR00: c0000000004817a4 c0000001f663b360 c000000000988028 000000007f24333d
>> GPR04: 5ff5738c1f2ebfb1 0000000000000000 0000000000000000 00000000000008f8
>> GPR08: c000000000480ae8 2020313034383537 36204b4220617320 6469726563740a31
>> GPR12: 3937302d30312d30 c00000000fff8780 c00000000007f988 c0000001f64c1600
> 
> GPRs 9-12 say: " 1048576 KB as direct\n1970-01-0"
> 
> Which is rarely a good sign :)
> 
> Looks like it might be part of your dmesg from setup_page_sizes().
> 
>> GPR16: 0000000000000000 0000000000000000 0000000000000000 00000000000005dc
>> GPR20: c0000000009b8028 c00000007e034200 0000000000000548 c000000000000000
>> GPR24: c0000001f663b4b0 00000000b225831e 0000000000000000 0000000000000080
>> GPR28: 0000000000000548 00000000000008f8 0000000000000548 0000000000000094
>> NIP [c0000000004817a4] .__skb_checksum+0x194/0x378
>> LR [c0000000004817a4] .__skb_checksum+0x194/0x378
>> Call Trace:
>> [c0000001f663b360] [c0000000004817a4] .__skb_checksum+0x194/0x378
>> (unreliable)
>> [c0000001f663b440] [c0000000004819b4] .skb_checksum+0x2c/0x3c
>> [c0000001f663b4c0] [c0000000004fd0a8] .udp4_hwcsum+0xa8/0x16c
>> [c0000001f663b560] [c0000000004fd440] .udp_send_skb+0x2d4/0x370
>> [c0000001f663b600] [c0000000004fd51c] .udp_push_pending_frames+0x40/0x94
>> [c0000001f663b680] [c0000000004fec08] .udp_sendpage+0x150/0x1b4
>> [c0000001f663b770] [c00000000050ae54] .inet_sendpage+0xa0/0x120
>> [c0000001f663b810] [c00000000059c8cc] .xs_sendpages+0x2d0/0x30c
>> [c0000001f663b8d0] [c00000000059cae4] .xs_udp_send_request+0x58/0x120
>> [c0000001f663b970] [c000000000598f04] .xprt_transmit+0x80/0x36c
>> [c0000001f663ba20] [c0000000005942d8] .call_transmit+0x19c/0x254
>> [c0000001f663bab0] [c00000000059ff64] .__rpc_execute+0xbc/0x3c0
>> [c0000001f663bb90] [c0000000000797f8] .process_one_work+0x1c0/0x474
>> [c0000001f663bc40] [c00000000007a518] .worker_thread+0x17c/0x54c
>> [c0000001f663bd30] [c00000000007fa8c] .kthread+0x104/0x124
>> [c0000001f663be30] [c000000000000884] .ret_from_kernel_thread+0x58/0xd4
>> Instruction dump:
>> 7d1f3a14 7c6a1850 e9580000 7fbd4050 786334e4 e90a0000 7c63ba14 f8490028
>> 7c63ea14 7d0903a6 e84a0008 4e800421 <e8490028> 7c641b78 78270464 e9580008
> 
> Which is:
> 
> add      r8, r31, r7
> subf     r3, r10, r3
> ld      r10, 0(r24)
> subf    r29, r29, r8
> rldicr   r3, r3, 6, 51
> ld       r8, 0(r10)
> add      r3, r3, r23
> std      r2, 40(r9)
> add      r3, r3, r29
> mtctr    r8
> ld       r2, 8(r10)
> bctrl
> ld       r2, 40(r9)	<---
> mr       r4, r3
> rldicr   r7, r1, 0, 49
> ld      r10, 8(r24)
> 
> 
> Which looks a bit odd. I'd expect us to be saving/restoring r2 to the stack,
> though maybe r9 was pointing at the stack?
> 
> Looking at your vmlinux.broken I don't see the same code gen.
> 
> Can you get an oops from a kernel and upload the exact binary? Or just post us
> the full code dump of __skb_checksum() (or wherever it oopses).

Ugh, sorry - I must've copied the wrong one. The serial output below is
from the uImage that (hopefully) is belongs to the vmlinux.broken:

  http://csgraf.de/agraf/current_thread_info/dmesg.txt


Alex

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
  2014-12-18  6:25     ` Anton Blanchard
@ 2014-12-18 15:02       ` Alexander Graf
  2014-12-31 12:24       ` Alan Modra
  1 sibling, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2014-12-18 15:02 UTC (permalink / raw)
  To: Anton Blanchard, Michael Ellerman
  Cc: Alan Modra, ulrich.weigand, paulus, Scott Wood, linuxppc-dev



On 18.12.14 07:25, Anton Blanchard wrote:
> On Thu, 18 Dec 2014 16:11:54 +1100
> Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
>> On Wed, 2014-12-17 at 02:16 +0100, Alexander Graf wrote:
>>> On 31.10.14 04:47, Anton Blanchard wrote:
>>>> LLVM doesn't support local named register variables and is
>>>> unlikely to. current_thread_info is using one, fix it by moving
>>>> it out and calling it __current_r1().
>>>>
>>>> I gave it a bit of an obscure name because we don't want anyone
>>>> else using it - they should use current_stack_pointer(). This
>>>> specific case is performance critical and we can't afford to call
>>>> a function to get it. Furthermore it isn't important to know
>>>> exactly where in the stack we are since we mask the lower bits.
>>>>
>>>> Signed-off-by: Anton Blanchard <anton@samba.org>
>>>
>>> Git bisect managed to point me to this commit as the offender for
>>> OOPSes on e5500 and e6500 (and maybe the G4 as well, not sure).
>>>
>>> Doing a git revert of this commit on top of linus/master makes
>>> things work fine for me again.
>>>
>>>
>>> Alex
>>>
>>> Oops: Kernel access of bad area, sig: 11 [#2]
>>> SMP NR_CPUS=16 CoreNet Generic
>>> Modules linked in:
>>> CPU: 1 PID: 339 Comm: kworker/1:1 Tainted: G      D
>>> 3.18.0-09423-g988adfd #1
>>> Workqueue: rpciod .rpc_async_schedule
>>> task: c0000001f6397500 ti: c0000001f6638000 task.ti:
>>> c0000001f6638000 NIP: c0000000004817a4 LR: c0000000004817a4 CTR:
>>> 0000000000000000 REGS: c0000001f663b0e0 TRAP: 0300   Tainted:
>>> G      D (3.18.0-09423-g988adfd)
>>> MSR: 0000000080029000 <CE,EE,ME>  CR: 24ad2e42  XER: 00000000
>>> DEAR: 202031303438355f ESR: 0000000000000000 SOFTE: 1
>>       = r9 + 40
>>
>>> GPR00: c0000000004817a4 c0000001f663b360 c000000000988028
>>> 000000007f24333d GPR04: 5ff5738c1f2ebfb1 0000000000000000
>>> 0000000000000000 00000000000008f8 GPR08: c000000000480ae8
>>> 2020313034383537 36204b4220617320 6469726563740a31 GPR12:
>>> 3937302d30312d30 c00000000fff8780 c00000000007f988 c0000001f64c1600
>>
>> GPRs 9-12 say: " 1048576 KB as direct\n1970-01-0"
>>
>> Which is rarely a good sign :)
>>
>> Looks like it might be part of your dmesg from setup_page_sizes().
>>
>>> GPR16: 0000000000000000 0000000000000000 0000000000000000
>>> 00000000000005dc GPR20: c0000000009b8028 c00000007e034200
>>> 0000000000000548 c000000000000000 GPR24: c0000001f663b4b0
>>> 00000000b225831e 0000000000000000 0000000000000080 GPR28:
>>> 0000000000000548 00000000000008f8 0000000000000548 0000000000000094
>>> NIP [c0000000004817a4] .__skb_checksum+0x194/0x378 LR
>>> [c0000000004817a4] .__skb_checksum+0x194/0x378 Call Trace:
>>> [c0000001f663b360] [c0000000004817a4] .__skb_checksum+0x194/0x378
>>> (unreliable)
>>> [c0000001f663b440] [c0000000004819b4] .skb_checksum+0x2c/0x3c
>>> [c0000001f663b4c0] [c0000000004fd0a8] .udp4_hwcsum+0xa8/0x16c
>>> [c0000001f663b560] [c0000000004fd440] .udp_send_skb+0x2d4/0x370
>>> [c0000001f663b600]
>>> [c0000000004fd51c] .udp_push_pending_frames+0x40/0x94
>>> [c0000001f663b680] [c0000000004fec08] .udp_sendpage+0x150/0x1b4
>>> [c0000001f663b770] [c00000000050ae54] .inet_sendpage+0xa0/0x120
>>> [c0000001f663b810] [c00000000059c8cc] .xs_sendpages+0x2d0/0x30c
>>> [c0000001f663b8d0]
>>> [c00000000059cae4] .xs_udp_send_request+0x58/0x120
>>> [c0000001f663b970] [c000000000598f04] .xprt_transmit+0x80/0x36c
>>> [c0000001f663ba20] [c0000000005942d8] .call_transmit+0x19c/0x254
>>> [c0000001f663bab0] [c00000000059ff64] .__rpc_execute+0xbc/0x3c0
>>> [c0000001f663bb90] [c0000000000797f8] .process_one_work+0x1c0/0x474
>>> [c0000001f663bc40] [c00000000007a518] .worker_thread+0x17c/0x54c
>>> [c0000001f663bd30] [c00000000007fa8c] .kthread+0x104/0x124
>>> [c0000001f663be30]
>>> [c000000000000884] .ret_from_kernel_thread+0x58/0xd4 Instruction
>>> dump: 7d1f3a14 7c6a1850 e9580000 7fbd4050 786334e4 e90a0000
>>> 7c63ba14 f8490028 7c63ea14 7d0903a6 e84a0008 4e800421 <e8490028>
>>> 7c641b78 78270464 e9580008
>>
>> Which is:
>>
>> add      r8, r31, r7
>> subf     r3, r10, r3
>> ld      r10, 0(r24)
>> subf    r29, r29, r8
>> rldicr   r3, r3, 6, 51
>> ld       r8, 0(r10)
>> add      r3, r3, r23
>> std      r2, 40(r9)
>> add      r3, r3, r29
>> mtctr    r8
>> ld       r2, 8(r10)
>> bctrl
>> ld       r2, 40(r9)	<---
>> mr       r4, r3
>> rldicr   r7, r1, 0, 49
>> ld      r10, 8(r24)
>>
>>
>> Which looks a bit odd. I'd expect us to be saving/restoring r2 to the
>> stack, though maybe r9 was pointing at the stack?
> 
> Nice catch! This looks like a compiler bug.
> 
>> Looking at your vmlinux.broken I don't see the same code gen.
> 
> For whatever reason we ended up with r10 this time:
> 
> 7c 2a 0b 78     mr      r10,r1
> ...
> f8 4a 00 28     std     r2,40(r10)
> 7c 63 ba 14     add     r3,r3,r23
> 7c e9 03 a6     mtctr   r7
> 7c 63 ea 14     add     r3,r3,r29
> 38 a0 00 00     li      r5,0
> e8 48 00 08     ld      r2,8(r8)
> 4e 80 04 21     bctrl
> e8 4a 00 28     ld      r2,40(r10)
> 
> The indirect function call is allowed to clobber r10, gcc is doing
> something very wrong here.

Yeah, I couldn't see why the patch really would break anything either,
but it does for me with this (not quite old) version of gcc.

I also don't see the breakage on LE machines that compile with a newer
version of gcc (4.8.3).


Alex

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
  2014-12-17  9:27     ` Alexander Graf
@ 2014-12-22  6:49       ` Michael Ellerman
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2014-12-22  6:49 UTC (permalink / raw)
  To: Alexander Graf
  Cc: ulrich.weigand, paulus, Anton Blanchard, Scott Wood, linuxppc-dev

On Wed, 2014-12-17 at 10:27 +0100, Alexander Graf wrote:
> 
> On 17.12.14 04:44, Anton Blanchard wrote:
> > Hi Alex,
> > 
> >> Git bisect managed to point me to this commit as the offender for
> >> OOPSes on e5500 and e6500 (and maybe the G4 as well, not sure).
> >>
> >> Doing a git revert of this commit on top of linus/master makes things
> >> work fine for me again.
> > 
> > Ouch, sorry for that, I'll work to reproduce. What gcc version are you
> > using?
> 
> I'm running
> 
>   gcc (SUSE Linux) 4.7.2 20130108 [gcc-4_7-branch revision 195012]
> 
> which is basically the one from openSUSE 12.3 for ppc64.

OK. I've also reproduced it with what I think is stock 4.7.3.

I only see it in __skb_checksum() though, which is a bit odd. It is a fairly
large routine, so maybe it's just register pressure?

I tried for a while to create a self contained test case, but couldn't get
anything to work.

I guess we'll have to revert the kernel patch, unless someone can identify the
gcc bug soon.

I also wonder if it's fixed in later versions, or we're just not seeing it due
to good luck.

Below is the broken vs working code. For some reason in the bad case it backs
up r1, as if it thinks it will be clobbered by the rldicr.

Bad:

c00000000076df84:	7c 29 0b 78 	mr      r9,r1		<-- save of r1 ?
c00000000076df88:	78 27 04 64 	rldicr  r7,r1,0,49	<-- current_thread_info()
c00000000076df8c:	7d 08 ea 14 	add     r8,r8,r29
c00000000076df90:	7c 9f 40 50 	subf    r4,r31,r8
c00000000076df94:	7d 12 07 b4 	extsw   r18,r8
c00000000076df98:	2f 04 00 00 	cmpwi   cr6,r4,0
c00000000076df9c:	7c 9c 23 78 	mr      r28,r4
c00000000076dfa0:	7f 84 f0 00 	cmpw    cr7,r4,r30
c00000000076dfa4:	40 99 00 cc 	ble-    cr6,c00000000076e070 <.__skb_checksum+0x220>
c00000000076dfa8:	40 9d 00 08 	ble-    cr7,c00000000076dfb0 <.__skb_checksum+0x160>
c00000000076dfac:	7f dc f3 78 	mr      r28,r30
c00000000076dfb0:	81 07 00 14 	lwz     r8,20(r7)
c00000000076dfb4:	e8 6a 00 00 	ld      r3,0(r10)
c00000000076dfb8:	7f 94 07 b4 	extsw   r20,r28
c00000000076dfbc:	39 08 00 01 	addi    r8,r8,1
c00000000076dfc0:	91 07 00 14 	stw     r8,20(r7)
c00000000076dfc4:	7c 63 b2 14 	add     r3,r3,r22
c00000000076dfc8:	7e 84 a3 78 	mr      r4,r20
c00000000076dfcc:	7c 63 1e 74 	sradi   r3,r3,3
c00000000076dfd0:	38 a0 00 00 	li      r5,0
c00000000076dfd4:	7c 63 c1 d2 	mulld   r3,r3,r24
c00000000076dfd8:	81 0a 00 08 	lwz     r8,8(r10)
c00000000076dfdc:	7d 1f 42 14 	add     r8,r31,r8
c00000000076dfe0:	78 63 83 e4 	rldicr  r3,r3,16,47
c00000000076dfe4:	7f bd 40 50 	subf    r29,r29,r8
c00000000076dfe8:	7c 63 bb 78 	or      r3,r3,r23
c00000000076dfec:	7c 63 ea 14 	add     r3,r3,r29
c00000000076dff0:	e9 59 00 00 	ld      r10,0(r25)
c00000000076dff4:	e9 0a 00 00 	ld      r8,0(r10)
c00000000076dff8:	f8 49 00 28 	std     r2,40(r9)	<-- use of r9 which is OK
c00000000076dffc:	7d 09 03 a6 	mtctr   r8
c00000000076e000:	e8 4a 00 08 	ld      r2,8(r10)
c00000000076e004:	4e 80 04 21 	bctrl
c00000000076e008:	e8 49 00 28 	ld      r2,40(r9)	<-- but not OK here


Good:

c00000000076e0b0:       78 39 04 64     rldicr  r25,r1,0,49	<-- current_thread_info()
c00000000076e0b4:       7a 94 07 c6     rldicr  r20,r20,32,31
c00000000076e0b8:       62 d6 6d b7     ori     r22,r22,28087
c00000000076e0bc:       7a b5 00 44     rldicr  r21,r21,0,1
c00000000076e0c0:       7d 5f f2 14     add     r10,r31,r30
c00000000076e0c4:       39 1a 00 03     addi    r8,r26,3
c00000000076e0c8:       79 08 26 e4     rldicr  r8,r8,4,59
c00000000076e0cc:       7f 8a e8 00     cmpw    cr7,r10,r29
c00000000076e0d0:       7d 40 00 26     mfcr    r10
c00000000076e0d4:       55 4a ef fe     rlwinm  r10,r10,29,31,31
c00000000076e0d8:       7d 29 42 14     add     r9,r9,r8
c00000000076e0dc:       0b 0a 00 00     tdnei   r10,0
c00000000076e0e0:       81 49 00 0c     lwz     r10,12(r9)
c00000000076e0e4:       7d 4a ea 14     add     r10,r10,r29
c00000000076e0e8:       7d 1f 50 50     subf    r8,r31,r10
c00000000076e0ec:       7d 51 07 b4     extsw   r17,r10
c00000000076e0f0:       2f 08 00 00     cmpwi   cr6,r8,0
c00000000076e0f4:       7d 1c 43 78     mr      r28,r8
c00000000076e0f8:       7f 88 f0 00     cmpw    cr7,r8,r30
c00000000076e0fc:       40 99 00 c8     ble-    cr6,c00000000076e1c4 <.__skb_checksum+0x214>
c00000000076e100:       40 9d 00 08     ble-    cr7,c00000000076e108 <.__skb_checksum+0x158>
c00000000076e104:       7f dc f3 78     mr      r28,r30
c00000000076e108:       81 59 00 14     lwz     r10,20(r25)
c00000000076e10c:       e8 69 00 00     ld      r3,0(r9)
c00000000076e110:       7f 93 07 b4     extsw   r19,r28
c00000000076e114:       39 4a 00 01     addi    r10,r10,1
c00000000076e118:       91 59 00 14     stw     r10,20(r25)
c00000000076e11c:       7c 63 a2 14     add     r3,r3,r20
c00000000076e120:       7e 64 9b 78     mr      r4,r19
c00000000076e124:       7c 63 1e 74     sradi   r3,r3,3
c00000000076e128:       38 a0 00 00     li      r5,0
c00000000076e12c:       7c 63 b1 d2     mulld   r3,r3,r22
c00000000076e130:       81 49 00 08     lwz     r10,8(r9)
c00000000076e134:       7d 5f 52 14     add     r10,r31,r10
c00000000076e138:       78 63 83 e4     rldicr  r3,r3,16,47
c00000000076e13c:       7f bd 50 50     subf    r29,r29,r10
c00000000076e140:       7c 63 ab 78     or      r3,r3,r21
c00000000076e144:       7c 63 ea 14     add     r3,r3,r29
c00000000076e148:       e9 37 00 00     ld      r9,0(r23)
c00000000076e14c:       e9 49 00 00     ld      r10,0(r9)
c00000000076e150:       f8 41 00 28     std     r2,40(r1)	<-- correct use of r1
c00000000076e154:       7d 49 03 a6     mtctr   r10
c00000000076e158:       e8 49 00 08     ld      r2,8(r9)
c00000000076e15c:       4e 80 04 21     bctrl
c00000000076e160:       e8 41 00 28     ld      r2,40(r1)	<-- correct use of r1


cheers

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
  2014-12-18  6:25     ` Anton Blanchard
  2014-12-18 15:02       ` Alexander Graf
@ 2014-12-31 12:24       ` Alan Modra
  2015-01-07  5:12         ` Anton Blanchard
  1 sibling, 1 reply; 13+ messages in thread
From: Alan Modra @ 2014-12-31 12:24 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: Alexander Graf, ulrich.weigand, paulus, Scott Wood, linuxppc-dev

On Thu, Dec 18, 2014 at 05:25:46PM +1100, Anton Blanchard wrote:
> On Thu, 18 Dec 2014 16:11:54 +1100
> Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> > On Wed, 2014-12-17 at 02:16 +0100, Alexander Graf wrote:
> > > On 31.10.14 04:47, Anton Blanchard wrote:
> > > > LLVM doesn't support local named register variables and is
> > > > unlikely to. current_thread_info is using one, fix it by moving
> > > > it out and calling it __current_r1().
> > > > 
> > > > I gave it a bit of an obscure name because we don't want anyone
> > > > else using it - they should use current_stack_pointer(). This
> > > > specific case is performance critical and we can't afford to call
> > > > a function to get it. Furthermore it isn't important to know
> > > > exactly where in the stack we are since we mask the lower bits.
> > > > 
> > > > Signed-off-by: Anton Blanchard <anton@samba.org>
> > > 
> > > Git bisect managed to point me to this commit as the offender for
> > > OOPSes on e5500 and e6500 (and maybe the G4 as well, not sure).
> > > 
> > > Doing a git revert of this commit on top of linus/master makes
> > > things work fine for me again.
> > > 
> > > 
> > > Alex
> > > 
> > > Oops: Kernel access of bad area, sig: 11 [#2]
> > > SMP NR_CPUS=16 CoreNet Generic
> > > Modules linked in:
> > > CPU: 1 PID: 339 Comm: kworker/1:1 Tainted: G      D
> > > 3.18.0-09423-g988adfd #1
> > > Workqueue: rpciod .rpc_async_schedule
> > > task: c0000001f6397500 ti: c0000001f6638000 task.ti:
> > > c0000001f6638000 NIP: c0000000004817a4 LR: c0000000004817a4 CTR:
> > > 0000000000000000 REGS: c0000001f663b0e0 TRAP: 0300   Tainted:
> > > G      D (3.18.0-09423-g988adfd)
> > > MSR: 0000000080029000 <CE,EE,ME>  CR: 24ad2e42  XER: 00000000
> > > DEAR: 202031303438355f ESR: 0000000000000000 SOFTE: 1
> >       = r9 + 40
> > 
> > > GPR00: c0000000004817a4 c0000001f663b360 c000000000988028
> > > 000000007f24333d GPR04: 5ff5738c1f2ebfb1 0000000000000000
> > > 0000000000000000 00000000000008f8 GPR08: c000000000480ae8
> > > 2020313034383537 36204b4220617320 6469726563740a31 GPR12:
> > > 3937302d30312d30 c00000000fff8780 c00000000007f988 c0000001f64c1600
> > 
> > GPRs 9-12 say: " 1048576 KB as direct\n1970-01-0"
> > 
> > Which is rarely a good sign :)
> > 
> > Looks like it might be part of your dmesg from setup_page_sizes().
> > 
> > > GPR16: 0000000000000000 0000000000000000 0000000000000000
> > > 00000000000005dc GPR20: c0000000009b8028 c00000007e034200
> > > 0000000000000548 c000000000000000 GPR24: c0000001f663b4b0
> > > 00000000b225831e 0000000000000000 0000000000000080 GPR28:
> > > 0000000000000548 00000000000008f8 0000000000000548 0000000000000094
> > > NIP [c0000000004817a4] .__skb_checksum+0x194/0x378 LR
> > > [c0000000004817a4] .__skb_checksum+0x194/0x378 Call Trace:
> > > [c0000001f663b360] [c0000000004817a4] .__skb_checksum+0x194/0x378
> > > (unreliable)
> > > [c0000001f663b440] [c0000000004819b4] .skb_checksum+0x2c/0x3c
> > > [c0000001f663b4c0] [c0000000004fd0a8] .udp4_hwcsum+0xa8/0x16c
> > > [c0000001f663b560] [c0000000004fd440] .udp_send_skb+0x2d4/0x370
> > > [c0000001f663b600]
> > > [c0000000004fd51c] .udp_push_pending_frames+0x40/0x94
> > > [c0000001f663b680] [c0000000004fec08] .udp_sendpage+0x150/0x1b4
> > > [c0000001f663b770] [c00000000050ae54] .inet_sendpage+0xa0/0x120
> > > [c0000001f663b810] [c00000000059c8cc] .xs_sendpages+0x2d0/0x30c
> > > [c0000001f663b8d0]
> > > [c00000000059cae4] .xs_udp_send_request+0x58/0x120
> > > [c0000001f663b970] [c000000000598f04] .xprt_transmit+0x80/0x36c
> > > [c0000001f663ba20] [c0000000005942d8] .call_transmit+0x19c/0x254
> > > [c0000001f663bab0] [c00000000059ff64] .__rpc_execute+0xbc/0x3c0
> > > [c0000001f663bb90] [c0000000000797f8] .process_one_work+0x1c0/0x474
> > > [c0000001f663bc40] [c00000000007a518] .worker_thread+0x17c/0x54c
> > > [c0000001f663bd30] [c00000000007fa8c] .kthread+0x104/0x124
> > > [c0000001f663be30]
> > > [c000000000000884] .ret_from_kernel_thread+0x58/0xd4 Instruction
> > > dump: 7d1f3a14 7c6a1850 e9580000 7fbd4050 786334e4 e90a0000
> > > 7c63ba14 f8490028 7c63ea14 7d0903a6 e84a0008 4e800421 <e8490028>
> > > 7c641b78 78270464 e9580008
> > 
> > Which is:
> > 
> > add      r8, r31, r7
> > subf     r3, r10, r3
> > ld      r10, 0(r24)
> > subf    r29, r29, r8
> > rldicr   r3, r3, 6, 51
> > ld       r8, 0(r10)
> > add      r3, r3, r23
> > std      r2, 40(r9)
> > add      r3, r3, r29
> > mtctr    r8
> > ld       r2, 8(r10)
> > bctrl
> > ld       r2, 40(r9)	<---
> > mr       r4, r3
> > rldicr   r7, r1, 0, 49
> > ld      r10, 8(r24)
> > 
> > 
> > Which looks a bit odd. I'd expect us to be saving/restoring r2 to the
> > stack, though maybe r9 was pointing at the stack?
> 
> Nice catch! This looks like a compiler bug.

Yes, it is.

> > Looking at your vmlinux.broken I don't see the same code gen.
> 
> For whatever reason we ended up with r10 this time:
> 
> 7c 2a 0b 78     mr      r10,r1
> ...
> f8 4a 00 28     std     r2,40(r10)
> 7c 63 ba 14     add     r3,r3,r23
> 7c e9 03 a6     mtctr   r7
> 7c 63 ea 14     add     r3,r3,r29
> 38 a0 00 00     li      r5,0
> e8 48 00 08     ld      r2,8(r8)
> 4e 80 04 21     bctrl
> e8 4a 00 28     ld      r2,40(r10)
> 
> The indirect function call is allowed to clobber r10, gcc is doing
> something very wrong here.

Right.  This is really an rs6000 backend bug.  We describe one of the
indirect calls that go wrong here as

(call_insn 108 107 109 13 (parallel [
            (set (reg:DI 3 3)
                (call (mem:SI (reg:DI 288) [0 *_67 S4 A8])
                    (const_int 64 [0x40])))
            (use (mem:DI (plus:DI (reg/f:DI 287 [ ops_44(D)->update ])
                        (const_int 8 [0x8])) [0  S8 A8]))
            (set (reg:DI 2 2)
                (mem/v/c:DI (plus:DI (reg/f:DI 1 1)
                        (const_int 40 [0x28])) [0  S8 A8]))
            (clobber (reg:DI 65 lr))
        ]) net/core/skbuff.c:2085 680 {*call_value_indirect_aixdi}
	<notes and arg uses omitted for clarity>
)

Notice that the RTL contains a "parallel".  As you might guess, gcc
treats the vector of expressions inside the square brackets of the
parallel as happening "in parallel".  Meaning that as far as gcc is
concerned the toc restore part (third element) happens at the same
time as the call (first element).  So if gcc replaces (reg:DI 1) in
the toc restore with some other register known to have the same value
*before* the call, gcc's RTL analysis will conclude that such a
replacement is valid.

That's what happens in the cprop1 pass.  The rtl dump shows
LOCAL COPY-PROP: Replacing reg 1 in insn 108 with reg 203
and then it's a matter of luck just what hard register is allocated to
pseudo-reg 203.

Of course, replacing r1 with some other register is a completely
useless thing to do, but trying to tell gcc that in our particular
case we want this generic optimisation disabled isn't so easy.  (Well,
it's dead easy if you want to hack cprop.c:do_local_cprop, just rip
out
          || (GET_CODE (PATTERN (insn)) != USE
	      && asm_noperands (PATTERN (insn)) < 0)))
but maybe not so easy to get such patches committed..)  Instead, the
way I'd go about fixing this is removing the r1 reference in our toc
save/restore RTL, ie. don't use a mem, use an unspec.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
  2014-12-31 12:24       ` Alan Modra
@ 2015-01-07  5:12         ` Anton Blanchard
  2015-01-07 17:59           ` Scott Wood
  2015-01-08  6:36           ` Alan Modra
  0 siblings, 2 replies; 13+ messages in thread
From: Anton Blanchard @ 2015-01-07  5:12 UTC (permalink / raw)
  To: Alan Modra
  Cc: Alexander Graf, ulrich.weigand, paulus, Scott Wood, linuxppc-dev

Hi Alan,

> Right.  This is really an rs6000 backend bug.  We describe one of the
> indirect calls that go wrong here as
> 
> (call_insn 108 107 109 13 (parallel [
>             (set (reg:DI 3 3)
>                 (call (mem:SI (reg:DI 288) [0 *_67 S4 A8])
>                     (const_int 64 [0x40])))
>             (use (mem:DI (plus:DI (reg/f:DI 287 [ ops_44(D)->update ])
>                         (const_int 8 [0x8])) [0  S8 A8]))
>             (set (reg:DI 2 2)
>                 (mem/v/c:DI (plus:DI (reg/f:DI 1 1)
>                         (const_int 40 [0x28])) [0  S8 A8]))
>             (clobber (reg:DI 65 lr))
>         ]) net/core/skbuff.c:2085 680 {*call_value_indirect_aixdi}
> 	<notes and arg uses omitted for clarity>
> )
> 
> Notice that the RTL contains a "parallel".  As you might guess, gcc
> treats the vector of expressions inside the square brackets of the
> parallel as happening "in parallel".  Meaning that as far as gcc is
> concerned the toc restore part (third element) happens at the same
> time as the call (first element).  So if gcc replaces (reg:DI 1) in
> the toc restore with some other register known to have the same value
> *before* the call, gcc's RTL analysis will conclude that such a
> replacement is valid.

Thanks for looking into this. Does that mean we were just getting lucky
with the previous version:

static inline struct thread_info *current_thread_info(void)
{
        register unsigned long sp asm("r1");

        return (struct thread_info *)(sp & ~(THREAD_SIZE-1));
}
 
ie a static register asm instead of a global one. If so the safest fix
for now might be to just eat the overead of a register move:

static inline struct thread_info *current_thread_info(void)
{
        unsigned long sp;

        asm("mr %0,1": "=r"(sp));
        return (struct thread_info *)(sp & ~(THREAD_SIZE-1));
}

Anton

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
  2015-01-07  5:12         ` Anton Blanchard
@ 2015-01-07 17:59           ` Scott Wood
  2015-01-08  6:36           ` Alan Modra
  1 sibling, 0 replies; 13+ messages in thread
From: Scott Wood @ 2015-01-07 17:59 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: Alan Modra, Alexander Graf, ulrich.weigand, paulus, linuxppc-dev

On Wed, 2015-01-07 at 16:12 +1100, Anton Blanchard wrote:
> Thanks for looking into this. Does that mean we were just getting lucky
> with the previous version:
> 
> static inline struct thread_info *current_thread_info(void)
> {
>         register unsigned long sp asm("r1");
> 
>         return (struct thread_info *)(sp & ~(THREAD_SIZE-1));
> }
>  
> ie a static register asm instead of a global one. If so the safest fix
> for now might be to just eat the overead of a register move:
> 
> static inline struct thread_info *current_thread_info(void)
> {
>         unsigned long sp;
> 
>         asm("mr %0,1": "=r"(sp));
>         return (struct thread_info *)(sp & ~(THREAD_SIZE-1));

You could avoid the register move by doing a rlwinm/rldicr in inline
asm, if it matters enough.

-Scott

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
  2015-01-07  5:12         ` Anton Blanchard
  2015-01-07 17:59           ` Scott Wood
@ 2015-01-08  6:36           ` Alan Modra
  1 sibling, 0 replies; 13+ messages in thread
From: Alan Modra @ 2015-01-08  6:36 UTC (permalink / raw)
  To: Anton Blanchard
  Cc: Alexander Graf, ulrich.weigand, paulus, Scott Wood, linuxppc-dev

On Wed, Jan 07, 2015 at 04:12:47PM +1100, Anton Blanchard wrote:
> Hi Alan,
> 
> > Right.  This is really an rs6000 backend bug.  We describe one of the
> > indirect calls that go wrong here as
> > 
> > (call_insn 108 107 109 13 (parallel [
> >             (set (reg:DI 3 3)
> >                 (call (mem:SI (reg:DI 288) [0 *_67 S4 A8])
> >                     (const_int 64 [0x40])))
> >             (use (mem:DI (plus:DI (reg/f:DI 287 [ ops_44(D)->update ])
> >                         (const_int 8 [0x8])) [0  S8 A8]))
> >             (set (reg:DI 2 2)
> >                 (mem/v/c:DI (plus:DI (reg/f:DI 1 1)
> >                         (const_int 40 [0x28])) [0  S8 A8]))
> >             (clobber (reg:DI 65 lr))
> >         ]) net/core/skbuff.c:2085 680 {*call_value_indirect_aixdi}
> > 	<notes and arg uses omitted for clarity>
> > )
> > 
> > Notice that the RTL contains a "parallel".  As you might guess, gcc
> > treats the vector of expressions inside the square brackets of the
> > parallel as happening "in parallel".  Meaning that as far as gcc is
> > concerned the toc restore part (third element) happens at the same
> > time as the call (first element).  So if gcc replaces (reg:DI 1) in
> > the toc restore with some other register known to have the same value
> > *before* the call, gcc's RTL analysis will conclude that such a
> > replacement is valid.
> 
> Thanks for looking into this. Does that mean we were just getting lucky
> with the previous version:
> 
> static inline struct thread_info *current_thread_info(void)
> {
>         register unsigned long sp asm("r1");
> 
>         return (struct thread_info *)(sp & ~(THREAD_SIZE-1));
> }

With both versions, the original rtl for current_thread_info consists
of two instructions, copy r1 to a pseudo reg, then the "and".  With
the above version, fwprop1 manages to combine this to a single "and"
insn.  When using a global reg var, fprop1 leaves the two
instructions, the copy causing the trouble in the following cprop1
pass.  So it's not that we are getting lucky in cprop1, but that
fwprop1 behaves differently with global vs. local registers.

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info
@ 2014-10-31  3:47 Anton Blanchard
  0 siblings, 0 replies; 13+ messages in thread
From: Anton Blanchard @ 2014-10-31  3:47 UTC (permalink / raw)
  To: benh, paulus, mpe, ulrich.weigand; +Cc: linuxppc-dev

LLVM doesn't support local named register variables and is unlikely
to. current_thread_info is using one, fix it by moving it out and
calling it __current_r1().

I gave it a bit of an obscure name because we don't want anyone else
using it - they should use current_stack_pointer(). This specific
case is performance critical and we can't afford to call a function
to get it. Furthermore it isn't important to know exactly where in
the stack we are since we mask the lower bits.

Signed-off-by: Anton Blanchard <anton@samba.org>
---
 arch/powerpc/include/asm/thread_info.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index b034ecd..ebc4f16 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -71,13 +71,12 @@ struct thread_info {
 #define THREAD_SIZE_ORDER	(THREAD_SHIFT - PAGE_SHIFT)
 
 /* how to get the thread information struct from C */
+register unsigned long __current_r1 asm("r1");
 static inline struct thread_info *current_thread_info(void)
 {
-	register unsigned long sp asm("r1");
-
 	/* gcc4, at least, is smart enough to turn this into a single
 	 * rlwinm for ppc32 and clrrdi for ppc64 */
-	return (struct thread_info *)(sp & ~(THREAD_SIZE-1));
+	return (struct thread_info *)(__current_r1 & ~(THREAD_SIZE-1));
 }
 
 #endif /* __ASSEMBLY__ */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-01-08  6:36 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1414727247-31838-1-git-send-email-anton__19440.5086375356$1414727300$gmane$org@samba.org>
2014-12-17  1:16 ` [PATCH 1/3] powerpc: Don't use local named register variable in current_thread_info Alexander Graf
2014-12-17  3:44   ` Anton Blanchard
2014-12-17  9:27     ` Alexander Graf
2014-12-22  6:49       ` Michael Ellerman
2014-12-18  5:11   ` Michael Ellerman
2014-12-18  6:25     ` Anton Blanchard
2014-12-18 15:02       ` Alexander Graf
2014-12-31 12:24       ` Alan Modra
2015-01-07  5:12         ` Anton Blanchard
2015-01-07 17:59           ` Scott Wood
2015-01-08  6:36           ` Alan Modra
2014-12-18 14:56     ` Alexander Graf
2014-10-31  3:47 Anton Blanchard

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.