* How is this possible - Register r30 contains 0xc2236400 instead of 0xc6236400
@ 2018-07-03 13:20 Christophe LEROY
2018-07-04 13:11 ` Michael Ellerman
0 siblings, 1 reply; 4+ messages in thread
From: Christophe LEROY @ 2018-07-03 13:20 UTC (permalink / raw)
To: linuxppc-dev, Segher Boessenkool, Michael Ellerman,
Benjamin Herrenschmidt, Paul Mackerras
Kernel Oops at 0xc0334d5c for reading at address 0xc2236450 which
corresponds to r30 + 80
But r30 should contain what's at r3 + 16 that is at 0xc619ec10 so r30
should be c6236400 as shown below (print_hex_dump(regs->gpr[3]) added at
end of __die() )
So how can r30 contain 0xc2236400 instead ?
And this is not random, it happens at most if not every startup.
c0334d44 <sock_wfree>:
c0334d44: 7c 08 02 a6 mflr r0
c0334d48: 94 21 ff f0 stwu r1,-16(r1)
c0334d4c: bf c1 00 08 stmw r30,8(r1)
c0334d50: 90 01 00 14 stw r0,20(r1)
c0334d54: 83 c3 00 10 lwz r30,16(r3)
c0334d58: 81 23 00 a8 lwz r9,168(r3)
c0334d5c: 81 5e 00 50 lwz r10,80(r30)
[ 152.288237] Unable to handle kernel paging request for data at
address 0xc2236450
[ 152.295444] Faulting instruction address: 0xc0334d5c
[ 152.300369] Oops: Kernel access of bad area, sig: 11 [#1]
[ 152.305665] BE PREEMPT DEBUG_PAGEALLOC CMPC885
[ 152.313630] CPU: 0 PID: 269 Comm: in:imuxsock Not tainted
4.14.52-00025-g5bada429cf-dirty #36
[ 152.322729] task: c623e100 task.stack: c650c000
[ 152.327202] NIP: c0334d5c LR: c043602c CTR: c0435fb8
[ 152.332200] REGS: c650dc00 TRAP: 0300 Not tainted
(4.14.52-00025-g5bada429cf-dirty)
[ 152.340699] MSR: 00009032 <EE,ME,IR,DR,RI> CR: 28002822 XER: 20000000
[ 152.347333] DAR: c2236450 DSISR: c0000000
[ 152.347333] GPR00: c043602c c650dcb0 c623e100 c619ec00 c642c060
00000008 00000018 c650dd4c
[ 152.347333] GPR08: c0435fb8 000002b0 c068d830 00000004 28004822
100d4208 00000000 7780c848
[ 152.347333] GPR16: 0ff58398 777674b0 1024b050 1024b0a8 1005ddbc
0ff5a7bc 000003e8 00000000
[ 152.347333] GPR24: 0000008e c5011650 c650deb8 0000008e c619ec00
00000040 c2236400 c619ec00
[ 152.385015] NIP [c0334d5c] sock_wfree+0x18/0xa4
[ 152.389458] LR [c043602c] unix_destruct_scm+0x74/0x88
[ 152.394399] Call Trace:
[ 152.396868] [c650dcb0] [c006348c] ns_to_timeval+0x4c/0x7c (unreliable)
[ 152.403305] [c650dcc0] [c043602c] unix_destruct_scm+0x74/0x88
[ 152.408999] [c650dcf0] [c033a10c] skb_release_head_state+0x8c/0x110
[ 152.415184] [c650dd00] [c033a3c4] skb_release_all+0x18/0x50
[ 152.420690] [c650dd10] [c033a7cc] consume_skb+0x38/0xec
[ 152.425869] [c650dd20] [c0342d7c] skb_free_datagram+0x1c/0x68
[ 152.431535] [c650dd30] [c0435c8c] unix_dgram_recvmsg+0x19c/0x4ac
[ 152.437476] [c650ddb0] [c0331370] ___sys_recvmsg+0x98/0x138
[ 152.442984] [c650deb0] [c0333280] __sys_recvmsg+0x40/0x84
[ 152.448321] [c650df10] [c0333680] SyS_socketcall+0xb8/0x1d4
[ 152.453832] [c650df40] [c000d1ac] ret_from_syscall+0x0/0x38
[ 152.459286] Instruction dump:
[ 152.462225] 41beffac 4bffff58 38800003 4bffffa0 38800001 4bffff98
7c0802a6 9421fff0
[ 152.469881] bfc10008 90010014 83c30010 812300a8 <815e0050> 3bfe00e0
71480200 4082003c
[ 152.477739] c619ec00: 00 00 00 00 00 00 00 00 00 00 00 23 6f d9 b1 65
[ 152.484100] c619ec10: c6 23 64 00 00 00 00 00 c6 42 c0 60 00 00 03 e8
[ 152.490471] c619ec20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 152.496837] c619ec30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 152.503205] c619ec40: 00 00 00 00 00 00 00 00 00 00 00 00 c0 43 5f b8
[ 152.509575] c619ec50: 00 00 00 00 00 00 00 00 00 00 00 8e 00 00 00 00
[ 152.515943] c619ec60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 152.522311] c619ec70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 152.528680] c619ec80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 152.535048] c619ec90: 00 00 ff ff 00 00 ff ff c6 42 30 8e c6 42 31 50
[ 152.541417] c619eca0: c6 42 30 00 c6 42 30 00 00 00 02 b0 00 00 00 01
[ 152.547781] ---[ end trace 0710a9d231876a27 ]---
Christophe
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: How is this possible - Register r30 contains 0xc2236400 instead of 0xc6236400
2018-07-03 13:20 How is this possible - Register r30 contains 0xc2236400 instead of 0xc6236400 Christophe LEROY
@ 2018-07-04 13:11 ` Michael Ellerman
2018-07-04 13:45 ` Segher Boessenkool
0 siblings, 1 reply; 4+ messages in thread
From: Michael Ellerman @ 2018-07-04 13:11 UTC (permalink / raw)
To: Christophe LEROY, linuxppc-dev, Segher Boessenkool,
Benjamin Herrenschmidt, Paul Mackerras
Christophe LEROY <christophe.leroy@c-s.fr> writes:
> Kernel Oops at 0xc0334d5c for reading at address 0xc2236450 which=20
> corresponds to r30 + 80
>
> But r30 should contain what's at r3 + 16 that is at 0xc619ec10 so r30=20
> should be c6236400 as shown below (print_hex_dump(regs->gpr[3]) added at=
=20
> end of __die() )
>
> So how can r30 contain 0xc2236400 instead ?
The simplest answer is that memory was modified between the time we
loaded it into r30 and when you print it.
So it did contain 0xc2236400 but has since been modified to now contain
0xc6236400.
The thing that makes me less certain, is that c6 would be the correct
value (I think?), so it's been modified back to the correct value, which
seems lucky.
Mysterious.
cheers
> And this is not random, it happens at most if not every startup.
>
> c0334d44 <sock_wfree>:
> c0334d44:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 7c 08 02 a6=C2=A0=C2=A0=C2=
=A0=C2=A0 mflr=C2=A0=C2=A0=C2=A0 r0
> c0334d48:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 94 21 ff f0=C2=A0=C2=A0=C2=
=A0=C2=A0 stwu=C2=A0=C2=A0=C2=A0 r1,-16(r1)
> c0334d4c:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 bf c1 00 08=C2=A0=C2=A0=C2=
=A0=C2=A0 stmw=C2=A0=C2=A0=C2=A0 r30,8(r1)
> c0334d50:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 90 01 00 14=C2=A0=C2=A0=C2=
=A0=C2=A0 stw=C2=A0=C2=A0=C2=A0=C2=A0 r0,20(r1)
> c0334d54:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 83 c3 00 10=C2=A0=C2=A0=C2=
=A0=C2=A0 lwz=C2=A0=C2=A0=C2=A0=C2=A0 r30,16(r3)
> c0334d58:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 81 23 00 a8=C2=A0=C2=A0=C2=
=A0=C2=A0 lwz=C2=A0=C2=A0=C2=A0=C2=A0 r9,168(r3)
> c0334d5c:=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 81 5e 00 50=C2=A0=C2=A0=C2=
=A0=C2=A0 lwz=C2=A0=C2=A0=C2=A0=C2=A0 r10,80(r30)
>
>
> [=C2=A0 152.288237] Unable to handle kernel paging request for data at=20
> address 0xc2236450
> [=C2=A0 152.295444] Faulting instruction address: 0xc0334d5c
> [=C2=A0 152.300369] Oops: Kernel access of bad area, sig: 11 [#1]
> [=C2=A0 152.305665] BE PREEMPT DEBUG_PAGEALLOC CMPC885
> [=C2=A0 152.313630] CPU: 0 PID: 269 Comm: in:imuxsock Not tainted=20
> 4.14.52-00025-g5bada429cf-dirty #36
> [=C2=A0 152.322729] task: c623e100 task.stack: c650c000
> [=C2=A0 152.327202] NIP:=C2=A0 c0334d5c LR: c043602c CTR: c0435fb8
> [=C2=A0 152.332200] REGS: c650dc00 TRAP: 0300=C2=A0=C2=A0 Not tainted=20
> (4.14.52-00025-g5bada429cf-dirty)
> [=C2=A0 152.340699] MSR:=C2=A0 00009032 <EE,ME,IR,DR,RI>=C2=A0 CR: 280028=
22 XER: 20000000
> [=C2=A0 152.347333] DAR: c2236450 DSISR: c0000000
> [=C2=A0 152.347333] GPR00: c043602c c650dcb0 c623e100 c619ec00 c642c060=20
> 00000008 00000018 c650dd4c
> [=C2=A0 152.347333] GPR08: c0435fb8 000002b0 c068d830 00000004 28004822=20
> 100d4208 00000000 7780c848
> [=C2=A0 152.347333] GPR16: 0ff58398 777674b0 1024b050 1024b0a8 1005ddbc=20
> 0ff5a7bc 000003e8 00000000
> [=C2=A0 152.347333] GPR24: 0000008e c5011650 c650deb8 0000008e c619ec00=20
> 00000040 c2236400 c619ec00
> [=C2=A0 152.385015] NIP [c0334d5c] sock_wfree+0x18/0xa4
> [=C2=A0 152.389458] LR [c043602c] unix_destruct_scm+0x74/0x88
> [=C2=A0 152.394399] Call Trace:
> [=C2=A0 152.396868] [c650dcb0] [c006348c] ns_to_timeval+0x4c/0x7c (unreli=
able)
> [=C2=A0 152.403305] [c650dcc0] [c043602c] unix_destruct_scm+0x74/0x88
> [=C2=A0 152.408999] [c650dcf0] [c033a10c] skb_release_head_state+0x8c/0x1=
10
> [=C2=A0 152.415184] [c650dd00] [c033a3c4] skb_release_all+0x18/0x50
> [=C2=A0 152.420690] [c650dd10] [c033a7cc] consume_skb+0x38/0xec
> [=C2=A0 152.425869] [c650dd20] [c0342d7c] skb_free_datagram+0x1c/0x68
> [=C2=A0 152.431535] [c650dd30] [c0435c8c] unix_dgram_recvmsg+0x19c/0x4ac
> [=C2=A0 152.437476] [c650ddb0] [c0331370] ___sys_recvmsg+0x98/0x138
> [=C2=A0 152.442984] [c650deb0] [c0333280] __sys_recvmsg+0x40/0x84
> [=C2=A0 152.448321] [c650df10] [c0333680] SyS_socketcall+0xb8/0x1d4
> [=C2=A0 152.453832] [c650df40] [c000d1ac] ret_from_syscall+0x0/0x38
> [=C2=A0 152.459286] Instruction dump:
> [=C2=A0 152.462225] 41beffac 4bffff58 38800003 4bffffa0 38800001 4bffff98=
=20
> 7c0802a6 9421fff0
> [=C2=A0 152.469881] bfc10008 90010014 83c30010 812300a8 <815e0050> 3bfe00=
e0=20
> 71480200 4082003c
> [=C2=A0 152.477739] c619ec00: 00 00 00 00 00 00 00 00 00 00 00 23 6f d9 b=
1 65
> [=C2=A0 152.484100] c619ec10: c6 23 64 00 00 00 00 00 c6 42 c0 60 00 00 0=
3 e8
> [=C2=A0 152.490471] c619ec20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0=
0 00
> [=C2=A0 152.496837] c619ec30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0=
0 00
> [=C2=A0 152.503205] c619ec40: 00 00 00 00 00 00 00 00 00 00 00 00 c0 43 5=
f b8
> [=C2=A0 152.509575] c619ec50: 00 00 00 00 00 00 00 00 00 00 00 8e 00 00 0=
0 00
> [=C2=A0 152.515943] c619ec60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0=
0 00
> [=C2=A0 152.522311] c619ec70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0=
0 00
> [=C2=A0 152.528680] c619ec80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0=
0 00
> [=C2=A0 152.535048] c619ec90: 00 00 ff ff 00 00 ff ff c6 42 30 8e c6 42 3=
1 50
> [=C2=A0 152.541417] c619eca0: c6 42 30 00 c6 42 30 00 00 00 02 b0 00 00 0=
0 01
> [=C2=A0 152.547781] ---[ end trace 0710a9d231876a27 ]---
>
> Christophe
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: How is this possible - Register r30 contains 0xc2236400 instead of 0xc6236400
2018-07-04 13:11 ` Michael Ellerman
@ 2018-07-04 13:45 ` Segher Boessenkool
2018-07-04 13:59 ` Christophe LEROY
0 siblings, 1 reply; 4+ messages in thread
From: Segher Boessenkool @ 2018-07-04 13:45 UTC (permalink / raw)
To: Michael Ellerman
Cc: Christophe LEROY, linuxppc-dev, Benjamin Herrenschmidt, Paul Mackerras
On Wed, Jul 04, 2018 at 11:11:59PM +1000, Michael Ellerman wrote:
> Christophe LEROY <christophe.leroy@c-s.fr> writes:
>
> > Kernel Oops at 0xc0334d5c for reading at address 0xc2236450 which
> > corresponds to r30 + 80
> >
> > But r30 should contain what's at r3 + 16 that is at 0xc619ec10 so r30
> > should be c6236400 as shown below (print_hex_dump(regs->gpr[3]) added at
> > end of __die() )
> >
> > So how can r30 contain 0xc2236400 instead ?
>
> The simplest answer is that memory was modified between the time we
> loaded it into r30 and when you print it.
>
> So it did contain 0xc2236400 but has since been modified to now contain
> 0xc6236400.
>
> The thing that makes me less certain, is that c6 would be the correct
> value (I think?), so it's been modified back to the correct value, which
> seems lucky.
>
> Mysterious.
That depends. Is this reproducible at all? It is a single bit flip.
Segher
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: How is this possible - Register r30 contains 0xc2236400 instead of 0xc6236400
2018-07-04 13:45 ` Segher Boessenkool
@ 2018-07-04 13:59 ` Christophe LEROY
0 siblings, 0 replies; 4+ messages in thread
From: Christophe LEROY @ 2018-07-04 13:59 UTC (permalink / raw)
To: Segher Boessenkool, Michael Ellerman
Cc: linuxppc-dev, Benjamin Herrenschmidt, Paul Mackerras
Le 04/07/2018 à 15:45, Segher Boessenkool a écrit :
> On Wed, Jul 04, 2018 at 11:11:59PM +1000, Michael Ellerman wrote:
>> Christophe LEROY <christophe.leroy@c-s.fr> writes:
>>
>>> Kernel Oops at 0xc0334d5c for reading at address 0xc2236450 which
>>> corresponds to r30 + 80
>>>
>>> But r30 should contain what's at r3 + 16 that is at 0xc619ec10 so r30
>>> should be c6236400 as shown below (print_hex_dump(regs->gpr[3]) added at
>>> end of __die() )
>>>
>>> So how can r30 contain 0xc2236400 instead ?
>>
>> The simplest answer is that memory was modified between the time we
>> loaded it into r30 and when you print it.
>>
>> So it did contain 0xc2236400 but has since been modified to now contain
>> 0xc6236400.
>>
>> The thing that makes me less certain, is that c6 would be the correct
>> value (I think?), so it's been modified back to the correct value, which
>> seems lucky.
>>
>> Mysterious.
>
> That depends. Is this reproducible at all? It is a single bit flip.
Yes it is reproductible.
It isn't reproduced if I modify the function in such a way that there is
an additional (unrelated) instruction before that read.
It isn't reproduced if I move the kernel base address to 0xd0000000 or
0xb0000000 instead of 0xc0000000
If I force a second read of this address in the function, I get the same
value (in another register).
If I add a dcbi before the second read I get the correct address.
So it still looks mysterious to me ...
Christophe
>
>
> Segher
>
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2018-07-04 13:59 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-03 13:20 How is this possible - Register r30 contains 0xc2236400 instead of 0xc6236400 Christophe LEROY
2018-07-04 13:11 ` Michael Ellerman
2018-07-04 13:45 ` Segher Boessenkool
2018-07-04 13:59 ` Christophe LEROY
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.