All of lore.kernel.org
 help / color / mirror / Atom feed
* How is this possible - Register r30 contains 0xc2236400 instead of 0xc6236400
@ 2018-07-03 13:20 Christophe LEROY
  2018-07-04 13:11 ` Michael Ellerman
  0 siblings, 1 reply; 4+ messages in thread
From: Christophe LEROY @ 2018-07-03 13:20 UTC (permalink / raw)
  To: linuxppc-dev, Segher Boessenkool, Michael Ellerman,
	Benjamin Herrenschmidt, Paul Mackerras

Kernel Oops at 0xc0334d5c for reading at address 0xc2236450 which 
corresponds to r30 + 80

But r30 should contain what's at r3 + 16 that is at 0xc619ec10 so r30 
should be c6236400 as shown below (print_hex_dump(regs->gpr[3]) added at 
end of __die() )

So how can r30 contain 0xc2236400 instead ?

And this is not random, it happens at most if not every startup.

c0334d44 <sock_wfree>:
c0334d44:       7c 08 02 a6     mflr    r0
c0334d48:       94 21 ff f0     stwu    r1,-16(r1)
c0334d4c:       bf c1 00 08     stmw    r30,8(r1)
c0334d50:       90 01 00 14     stw     r0,20(r1)
c0334d54:       83 c3 00 10     lwz     r30,16(r3)
c0334d58:       81 23 00 a8     lwz     r9,168(r3)
c0334d5c:       81 5e 00 50     lwz     r10,80(r30)


[  152.288237] Unable to handle kernel paging request for data at 
address 0xc2236450
[  152.295444] Faulting instruction address: 0xc0334d5c
[  152.300369] Oops: Kernel access of bad area, sig: 11 [#1]
[  152.305665] BE PREEMPT DEBUG_PAGEALLOC CMPC885
[  152.313630] CPU: 0 PID: 269 Comm: in:imuxsock Not tainted 
4.14.52-00025-g5bada429cf-dirty #36
[  152.322729] task: c623e100 task.stack: c650c000
[  152.327202] NIP:  c0334d5c LR: c043602c CTR: c0435fb8
[  152.332200] REGS: c650dc00 TRAP: 0300   Not tainted 
(4.14.52-00025-g5bada429cf-dirty)
[  152.340699] MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 28002822 XER: 20000000
[  152.347333] DAR: c2236450 DSISR: c0000000
[  152.347333] GPR00: c043602c c650dcb0 c623e100 c619ec00 c642c060 
00000008 00000018 c650dd4c
[  152.347333] GPR08: c0435fb8 000002b0 c068d830 00000004 28004822 
100d4208 00000000 7780c848
[  152.347333] GPR16: 0ff58398 777674b0 1024b050 1024b0a8 1005ddbc 
0ff5a7bc 000003e8 00000000
[  152.347333] GPR24: 0000008e c5011650 c650deb8 0000008e c619ec00 
00000040 c2236400 c619ec00
[  152.385015] NIP [c0334d5c] sock_wfree+0x18/0xa4
[  152.389458] LR [c043602c] unix_destruct_scm+0x74/0x88
[  152.394399] Call Trace:
[  152.396868] [c650dcb0] [c006348c] ns_to_timeval+0x4c/0x7c (unreliable)
[  152.403305] [c650dcc0] [c043602c] unix_destruct_scm+0x74/0x88
[  152.408999] [c650dcf0] [c033a10c] skb_release_head_state+0x8c/0x110
[  152.415184] [c650dd00] [c033a3c4] skb_release_all+0x18/0x50
[  152.420690] [c650dd10] [c033a7cc] consume_skb+0x38/0xec
[  152.425869] [c650dd20] [c0342d7c] skb_free_datagram+0x1c/0x68
[  152.431535] [c650dd30] [c0435c8c] unix_dgram_recvmsg+0x19c/0x4ac
[  152.437476] [c650ddb0] [c0331370] ___sys_recvmsg+0x98/0x138
[  152.442984] [c650deb0] [c0333280] __sys_recvmsg+0x40/0x84
[  152.448321] [c650df10] [c0333680] SyS_socketcall+0xb8/0x1d4
[  152.453832] [c650df40] [c000d1ac] ret_from_syscall+0x0/0x38
[  152.459286] Instruction dump:
[  152.462225] 41beffac 4bffff58 38800003 4bffffa0 38800001 4bffff98 
7c0802a6 9421fff0
[  152.469881] bfc10008 90010014 83c30010 812300a8 <815e0050> 3bfe00e0 
71480200 4082003c
[  152.477739] c619ec00: 00 00 00 00 00 00 00 00 00 00 00 23 6f d9 b1 65
[  152.484100] c619ec10: c6 23 64 00 00 00 00 00 c6 42 c0 60 00 00 03 e8
[  152.490471] c619ec20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  152.496837] c619ec30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  152.503205] c619ec40: 00 00 00 00 00 00 00 00 00 00 00 00 c0 43 5f b8
[  152.509575] c619ec50: 00 00 00 00 00 00 00 00 00 00 00 8e 00 00 00 00
[  152.515943] c619ec60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  152.522311] c619ec70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  152.528680] c619ec80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  152.535048] c619ec90: 00 00 ff ff 00 00 ff ff c6 42 30 8e c6 42 31 50
[  152.541417] c619eca0: c6 42 30 00 c6 42 30 00 00 00 02 b0 00 00 00 01
[  152.547781] ---[ end trace 0710a9d231876a27 ]---

Christophe

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: How is this possible - Register r30 contains 0xc2236400 instead of 0xc6236400
  2018-07-03 13:20 How is this possible - Register r30 contains 0xc2236400 instead of 0xc6236400 Christophe LEROY
@ 2018-07-04 13:11 ` Michael Ellerman
  2018-07-04 13:45   ` Segher Boessenkool
  0 siblings, 1 reply; 4+ messages in thread
From: Michael Ellerman @ 2018-07-04 13:11 UTC (permalink / raw)
  To: Christophe LEROY, linuxppc-dev, Segher Boessenkool,
	Benjamin Herrenschmidt, Paul Mackerras

Christophe LEROY <christophe.leroy@c-s.fr> writes:

> Kernel Oops at 0xc0334d5c for reading at address 0xc2236450 which=20
> corresponds to r30 + 80
>
> But r30 should contain what's at r3 + 16 that is at 0xc619ec10 so r30=20
> should be c6236400 as shown below (print_hex_dump(regs->gpr[3]) added at=
=20
> end of __die() )
>
> So how can r30 contain 0xc2236400 instead ?

The simplest answer is that memory was modified between the time we
loaded it into r30 and when you print it.

So it did contain 0xc2236400 but has since been modified to now contain
0xc6236400.

The thing that makes me less certain, is that c6 would be the correct
value (I think?), so it's been modified back to the correct value, which
seems lucky.

Mysterious.

cheers

> And this is not random, it happens at most if not every startup.
>
> c0334d44 <sock_wfree>:
> c0334d44:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 7c 08 02 a6=C2=A0=C2=A0=C2=
=A0=C2=A0 mflr=C2=A0=C2=A0=C2=A0 r0
> c0334d48:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 94 21 ff f0=C2=A0=C2=A0=C2=
=A0=C2=A0 stwu=C2=A0=C2=A0=C2=A0 r1,-16(r1)
> c0334d4c:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 bf c1 00 08=C2=A0=C2=A0=C2=
=A0=C2=A0 stmw=C2=A0=C2=A0=C2=A0 r30,8(r1)
> c0334d50:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 90 01 00 14=C2=A0=C2=A0=C2=
=A0=C2=A0 stw=C2=A0=C2=A0=C2=A0=C2=A0 r0,20(r1)
> c0334d54:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 83 c3 00 10=C2=A0=C2=A0=C2=
=A0=C2=A0 lwz=C2=A0=C2=A0=C2=A0=C2=A0 r30,16(r3)
> c0334d58:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 81 23 00 a8=C2=A0=C2=A0=C2=
=A0=C2=A0 lwz=C2=A0=C2=A0=C2=A0=C2=A0 r9,168(r3)
> c0334d5c:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 81 5e 00 50=C2=A0=C2=A0=C2=
=A0=C2=A0 lwz=C2=A0=C2=A0=C2=A0=C2=A0 r10,80(r30)
>
>
> [=C2=A0 152.288237] Unable to handle kernel paging request for data at=20
> address 0xc2236450
> [=C2=A0 152.295444] Faulting instruction address: 0xc0334d5c
> [=C2=A0 152.300369] Oops: Kernel access of bad area, sig: 11 [#1]
> [=C2=A0 152.305665] BE PREEMPT DEBUG_PAGEALLOC CMPC885
> [=C2=A0 152.313630] CPU: 0 PID: 269 Comm: in:imuxsock Not tainted=20
> 4.14.52-00025-g5bada429cf-dirty #36
> [=C2=A0 152.322729] task: c623e100 task.stack: c650c000
> [=C2=A0 152.327202] NIP:=C2=A0 c0334d5c LR: c043602c CTR: c0435fb8
> [=C2=A0 152.332200] REGS: c650dc00 TRAP: 0300=C2=A0=C2=A0 Not tainted=20
> (4.14.52-00025-g5bada429cf-dirty)
> [=C2=A0 152.340699] MSR:=C2=A0 00009032 <EE,ME,IR,DR,RI>=C2=A0 CR: 280028=
22 XER: 20000000
> [=C2=A0 152.347333] DAR: c2236450 DSISR: c0000000
> [=C2=A0 152.347333] GPR00: c043602c c650dcb0 c623e100 c619ec00 c642c060=20
> 00000008 00000018 c650dd4c
> [=C2=A0 152.347333] GPR08: c0435fb8 000002b0 c068d830 00000004 28004822=20
> 100d4208 00000000 7780c848
> [=C2=A0 152.347333] GPR16: 0ff58398 777674b0 1024b050 1024b0a8 1005ddbc=20
> 0ff5a7bc 000003e8 00000000
> [=C2=A0 152.347333] GPR24: 0000008e c5011650 c650deb8 0000008e c619ec00=20
> 00000040 c2236400 c619ec00
> [=C2=A0 152.385015] NIP [c0334d5c] sock_wfree+0x18/0xa4
> [=C2=A0 152.389458] LR [c043602c] unix_destruct_scm+0x74/0x88
> [=C2=A0 152.394399] Call Trace:
> [=C2=A0 152.396868] [c650dcb0] [c006348c] ns_to_timeval+0x4c/0x7c (unreli=
able)
> [=C2=A0 152.403305] [c650dcc0] [c043602c] unix_destruct_scm+0x74/0x88
> [=C2=A0 152.408999] [c650dcf0] [c033a10c] skb_release_head_state+0x8c/0x1=
10
> [=C2=A0 152.415184] [c650dd00] [c033a3c4] skb_release_all+0x18/0x50
> [=C2=A0 152.420690] [c650dd10] [c033a7cc] consume_skb+0x38/0xec
> [=C2=A0 152.425869] [c650dd20] [c0342d7c] skb_free_datagram+0x1c/0x68
> [=C2=A0 152.431535] [c650dd30] [c0435c8c] unix_dgram_recvmsg+0x19c/0x4ac
> [=C2=A0 152.437476] [c650ddb0] [c0331370] ___sys_recvmsg+0x98/0x138
> [=C2=A0 152.442984] [c650deb0] [c0333280] __sys_recvmsg+0x40/0x84
> [=C2=A0 152.448321] [c650df10] [c0333680] SyS_socketcall+0xb8/0x1d4
> [=C2=A0 152.453832] [c650df40] [c000d1ac] ret_from_syscall+0x0/0x38
> [=C2=A0 152.459286] Instruction dump:
> [=C2=A0 152.462225] 41beffac 4bffff58 38800003 4bffffa0 38800001 4bffff98=
=20
> 7c0802a6 9421fff0
> [=C2=A0 152.469881] bfc10008 90010014 83c30010 812300a8 <815e0050> 3bfe00=
e0=20
> 71480200 4082003c
> [=C2=A0 152.477739] c619ec00: 00 00 00 00 00 00 00 00 00 00 00 23 6f d9 b=
1 65
> [=C2=A0 152.484100] c619ec10: c6 23 64 00 00 00 00 00 c6 42 c0 60 00 00 0=
3 e8
> [=C2=A0 152.490471] c619ec20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0=
0 00
> [=C2=A0 152.496837] c619ec30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0=
0 00
> [=C2=A0 152.503205] c619ec40: 00 00 00 00 00 00 00 00 00 00 00 00 c0 43 5=
f b8
> [=C2=A0 152.509575] c619ec50: 00 00 00 00 00 00 00 00 00 00 00 8e 00 00 0=
0 00
> [=C2=A0 152.515943] c619ec60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0=
0 00
> [=C2=A0 152.522311] c619ec70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0=
0 00
> [=C2=A0 152.528680] c619ec80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0=
0 00
> [=C2=A0 152.535048] c619ec90: 00 00 ff ff 00 00 ff ff c6 42 30 8e c6 42 3=
1 50
> [=C2=A0 152.541417] c619eca0: c6 42 30 00 c6 42 30 00 00 00 02 b0 00 00 0=
0 01
> [=C2=A0 152.547781] ---[ end trace 0710a9d231876a27 ]---
>
> Christophe

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: How is this possible - Register r30 contains 0xc2236400 instead of 0xc6236400
  2018-07-04 13:11 ` Michael Ellerman
@ 2018-07-04 13:45   ` Segher Boessenkool
  2018-07-04 13:59     ` Christophe LEROY
  0 siblings, 1 reply; 4+ messages in thread
From: Segher Boessenkool @ 2018-07-04 13:45 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Christophe LEROY, linuxppc-dev, Benjamin Herrenschmidt, Paul Mackerras

On Wed, Jul 04, 2018 at 11:11:59PM +1000, Michael Ellerman wrote:
> Christophe LEROY <christophe.leroy@c-s.fr> writes:
> 
> > Kernel Oops at 0xc0334d5c for reading at address 0xc2236450 which 
> > corresponds to r30 + 80
> >
> > But r30 should contain what's at r3 + 16 that is at 0xc619ec10 so r30 
> > should be c6236400 as shown below (print_hex_dump(regs->gpr[3]) added at 
> > end of __die() )
> >
> > So how can r30 contain 0xc2236400 instead ?
> 
> The simplest answer is that memory was modified between the time we
> loaded it into r30 and when you print it.
> 
> So it did contain 0xc2236400 but has since been modified to now contain
> 0xc6236400.
> 
> The thing that makes me less certain, is that c6 would be the correct
> value (I think?), so it's been modified back to the correct value, which
> seems lucky.
> 
> Mysterious.

That depends.  Is this reproducible at all?  It is a single bit flip.


Segher

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: How is this possible - Register r30 contains 0xc2236400 instead of 0xc6236400
  2018-07-04 13:45   ` Segher Boessenkool
@ 2018-07-04 13:59     ` Christophe LEROY
  0 siblings, 0 replies; 4+ messages in thread
From: Christophe LEROY @ 2018-07-04 13:59 UTC (permalink / raw)
  To: Segher Boessenkool, Michael Ellerman
  Cc: linuxppc-dev, Benjamin Herrenschmidt, Paul Mackerras



Le 04/07/2018 à 15:45, Segher Boessenkool a écrit :
> On Wed, Jul 04, 2018 at 11:11:59PM +1000, Michael Ellerman wrote:
>> Christophe LEROY <christophe.leroy@c-s.fr> writes:
>>
>>> Kernel Oops at 0xc0334d5c for reading at address 0xc2236450 which
>>> corresponds to r30 + 80
>>>
>>> But r30 should contain what's at r3 + 16 that is at 0xc619ec10 so r30
>>> should be c6236400 as shown below (print_hex_dump(regs->gpr[3]) added at
>>> end of __die() )
>>>
>>> So how can r30 contain 0xc2236400 instead ?
>>
>> The simplest answer is that memory was modified between the time we
>> loaded it into r30 and when you print it.
>>
>> So it did contain 0xc2236400 but has since been modified to now contain
>> 0xc6236400.
>>
>> The thing that makes me less certain, is that c6 would be the correct
>> value (I think?), so it's been modified back to the correct value, which
>> seems lucky.
>>
>> Mysterious.
> 
> That depends.  Is this reproducible at all?  It is a single bit flip.

Yes it is reproductible.

It isn't reproduced if I modify the function in such a way that there is 
an additional (unrelated) instruction before that read.

It isn't reproduced if I move the kernel base address to 0xd0000000 or 
0xb0000000 instead of 0xc0000000

If I force a second read of this address in the function, I get the same 
value (in another register).

If I add a dcbi before the second read I get the correct address.

So it still looks mysterious to me ...

Christophe

> 
> 
> Segher
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-07-04 13:59 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-03 13:20 How is this possible - Register r30 contains 0xc2236400 instead of 0xc6236400 Christophe LEROY
2018-07-04 13:11 ` Michael Ellerman
2018-07-04 13:45   ` Segher Boessenkool
2018-07-04 13:59     ` Christophe LEROY

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.