linux-kernel.vger.kernel.org archive mirror
* [PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI)
@ 2021-08-03 15:14 Christophe Leroy
  2021-08-04  4:04 ` Finn Thain
  2021-08-13 11:57 ` Michael Ellerman
  0 siblings, 2 replies; 9+ messages in thread
From: Christophe Leroy @ 2021-08-03 15:14 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Finn Thain, userm57
  Cc: linux-kernel, linuxppc-dev

When a DSI (Data Storage Interrupt) is taken while in NAP mode,
r11 doesn't survive the call to power_save_ppc32_restore().

So use r1 instead of r11 as they both contain the virtual stack
pointer at that point.

Reported-by: Finn Thain <fthain@linux-m68k.org>
Fixes: 4c0104a83fc3 ("powerpc/32: Dismantle EXC_XFER_STD/LITE/TEMPLATE")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
 arch/powerpc/kernel/head_book3s_32.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 764edd860ed4..68e5c0a7e99d 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -300,7 +300,7 @@ ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_HPTE_TABLE)
 	EXCEPTION_PROLOG_1
 	EXCEPTION_PROLOG_2 INTERRUPT_DATA_STORAGE DataAccess handle_dar_dsisr=1
 	prepare_transfer_to_handler
-	lwz	r5, _DSISR(r11)
+	lwz	r5, _DSISR(r1)
 	andis.	r0, r5, DSISR_DABRMATCH@h
 	bne-	1f
 	bl	do_page_fault
-- 
2.25.0



* Re: [PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI)
  2021-08-03 15:14 [PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI) Christophe Leroy
@ 2021-08-04  4:04 ` Finn Thain
  2021-08-04  6:07   ` Christophe Leroy
  2021-08-13 11:57 ` Michael Ellerman
  1 sibling, 1 reply; 9+ messages in thread
From: Finn Thain @ 2021-08-04  4:04 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	userm57, linux-kernel, linuxppc-dev

On Tue, 3 Aug 2021, Christophe Leroy wrote:

> When a DSI (Data Storage Interrupt) is taken while in NAP mode, r11 
> doesn't survive the call to power_save_ppc32_restore().
> 
> So use r1 instead of r11 as they both contain the virtual stack pointer 
> at that point.
> 
> Reported-by: Finn Thain <fthain@linux-m68k.org>
> Fixes: 4c0104a83fc3 ("powerpc/32: Dismantle EXC_XFER_STD/LITE/TEMPLATE")

Regarding that 'Fixes' tag, this patch has not fixed the failure below, 
unfortunately. But there appear to be several bugs in play here. Can you 
tell us which failure mode is associated with the bug addressed by this 
patch?

------------[ cut here ]------------
kernel BUG at arch/powerpc/kernel/interrupt.c:49!
Oops: Exception in kernel mode, sig: 5 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 1859 Comm: xfce4-session Not tainted 5.13.0-pmac-VMAP #10
NIP:  c0011474 LR: c0011464 CTR: 00000000
REGS: e2f75e40 TRAP: 0700   Not tainted  (5.13.0-pmac-VMAP)
MSR:  00021032 <ME,IR,DR,RI>  CR: 2400446c  XER: 20000000

GPR00: c001604c e2f75f00 ca284a60 00000000 00000000 a5205eb0 00000008 00000020
GPR08: ffffffc0 00000001 501200d9 ce030005 ca285010 00c1f778 00000000 00000000
GPR16: 00945b20 009402f8 00000001 a6b87550 a51fd000 afb73220 a6b22c78 a6a6aecc
GPR24: 00000000 ffffffc0 00000020 00000008 a5205eb0 00000000 e2f75f40 000000ae
NIP [c0011474] system_call_exception+0x60/0x164
LR [c0011464] system_call_exception+0x50/0x164
Call Trace:
[e2f75f00] [00009000] 0x9000 (unreliable)
[e2f75f30] [c001604c] ret_from_syscall+0x0/0x28
--- interrupt: c00 at 0xa69d6cb0
NIP:  a69d6cb0 LR: a69d6c3c CTR: 00000000
REGS: e2f75f40 TRAP: 0c00   Not tainted  (5.13.0-pmac-VMAP)
MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 2400446c  XER: 20000000

GPR00: 000000ae a5205de0 a5687ca0 00000000 00000000 a5205eb0 00000008 00000020
GPR08: ffffffc0 401201ea 401200d9 ffffffff c158f230 00c1f778 00000000 00000000
GPR16: 00945b20 009402f8 00000001 a6b87550 a51fd000 afb73220 a6b22c78 a6a6aecc
GPR24: afb72fc8 00000000 00000001 a5205f30 afb733dc 00000000 a6b85ff4 a5205eb0
NIP [a69d6cb0] 0xa69d6cb0
LR [a69d6c3c] 0xa69d6c3c
--- interrupt: c00
Instruction dump:
7cdb3378 93810020 7cbc2b78 93a10024 7c9d2378 93e1002c 7d3f4b78 4800d629
817e0084 931e0088 69690002 5529fffe <0f090000> 69694000 552997fe 0f090000
---[ end trace c66c6c3c44806276 ]---


* Re: [PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI)
  2021-08-04  4:04 ` Finn Thain
@ 2021-08-04  6:07   ` Christophe Leroy
  2021-08-04  6:21     ` Christophe Leroy
  2021-08-05  1:54     ` Finn Thain
  0 siblings, 2 replies; 9+ messages in thread
From: Christophe Leroy @ 2021-08-04  6:07 UTC (permalink / raw)
  To: Finn Thain
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	userm57, linux-kernel, linuxppc-dev



On 04/08/2021 at 06:04, Finn Thain wrote:
> On Tue, 3 Aug 2021, Christophe Leroy wrote:
> 
>> When a DSI (Data Storage Interrupt) is taken while in NAP mode, r11
>> doesn't survive the call to power_save_ppc32_restore().
>>
>> So use r1 instead of r11 as they both contain the virtual stack pointer
>> at that point.
>>
>> Reported-by: Finn Thain <fthain@linux-m68k.org>
>> Fixes: 4c0104a83fc3 ("powerpc/32: Dismantle EXC_XFER_STD/LITE/TEMPLATE")
> 
> Regarding that 'Fixes' tag, this patch has not fixed the failure below,
> unfortunately. But there appear to be several bugs in play here. Can you
> tell us which failure mode is associated with the bug addressed by this
> patch?


This is unrelated to the failure below. This patch is related to the bisect you did that pointed to 
4c0104a83fc3 ("powerpc/32: Dismantle EXC_XFER_STD/LITE/TEMPLATE")

I think the starting point should be to (manually) apply the patch on top of that commit in 
order to check that the bug that led to that commit being flagged as the 'first bad commit' is now gone.

The BUG below is likely something completely different.

And the other bug involving KUAP write is also something else to be investigated separately.

> 
> ------------[ cut here ]------------
> kernel BUG at arch/powerpc/kernel/interrupt.c:49!
> Oops: Exception in kernel mode, sig: 5 [#1]
> BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
> Modules linked in:
> CPU: 0 PID: 1859 Comm: xfce4-session Not tainted 5.13.0-pmac-VMAP #10
> NIP:  c0011474 LR: c0011464 CTR: 00000000
> REGS: e2f75e40 TRAP: 0700   Not tainted  (5.13.0-pmac-VMAP)
> MSR:  00021032 <ME,IR,DR,RI>  CR: 2400446c  XER: 20000000
> 
> GPR00: c001604c e2f75f00 ca284a60 00000000 00000000 a5205eb0 00000008 00000020
> GPR08: ffffffc0 00000001 501200d9 ce030005 ca285010 00c1f778 00000000 00000000
> GPR16: 00945b20 009402f8 00000001 a6b87550 a51fd000 afb73220 a6b22c78 a6a6aecc
> GPR24: 00000000 ffffffc0 00000020 00000008 a5205eb0 00000000 e2f75f40 000000ae
> NIP [c0011474] system_call_exception+0x60/0x164
> LR [c0011464] system_call_exception+0x50/0x164
> Call Trace:
> [e2f75f00] [00009000] 0x9000 (unreliable)
> [e2f75f30] [c001604c] ret_from_syscall+0x0/0x28
> --- interrupt: c00 at 0xa69d6cb0
> NIP:  a69d6cb0 LR: a69d6c3c CTR: 00000000
> REGS: e2f75f40 TRAP: 0c00   Not tainted  (5.13.0-pmac-VMAP)
> MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 2400446c  XER: 20000000
> 
> GPR00: 000000ae a5205de0 a5687ca0 00000000 00000000 a5205eb0 00000008 00000020
> GPR08: ffffffc0 401201ea 401200d9 ffffffff c158f230 00c1f778 00000000 00000000
> GPR16: 00945b20 009402f8 00000001 a6b87550 a51fd000 afb73220 a6b22c78 a6a6aecc
> GPR24: afb72fc8 00000000 00000001 a5205f30 afb733dc 00000000 a6b85ff4 a5205eb0
> NIP [a69d6cb0] 0xa69d6cb0
> LR [a69d6c3c] 0xa69d6c3c
> --- interrupt: c00
> Instruction dump:
> 7cdb3378 93810020 7cbc2b78 93a10024 7c9d2378 93e1002c 7d3f4b78 4800d629
> 817e0084 931e0088 69690002 5529fffe <0f090000> 69694000 552997fe 0f090000
> ---[ end trace c66c6c3c44806276 ]---
> 


* Re: [PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI)
  2021-08-04  6:07   ` Christophe Leroy
@ 2021-08-04  6:21     ` Christophe Leroy
  2021-08-04 11:36       ` Nicholas Piggin
  2021-08-05  1:54     ` Finn Thain
  1 sibling, 1 reply; 9+ messages in thread
From: Christophe Leroy @ 2021-08-04  6:21 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	userm57, linux-kernel, linuxppc-dev, Finn Thain

Hi Nic,

I think I'll need your help on that one.

On 04/08/2021 at 08:07, Christophe Leroy wrote:
> 
> 
> On 04/08/2021 at 06:04, Finn Thain wrote:
>> On Tue, 3 Aug 2021, Christophe Leroy wrote:
>>
...
>>
>> ------------[ cut here ]------------
>> kernel BUG at arch/powerpc/kernel/interrupt.c:49!
>> Oops: Exception in kernel mode, sig: 5 [#1]
>> BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
>> Modules linked in:
>> CPU: 0 PID: 1859 Comm: xfce4-session Not tainted 5.13.0-pmac-VMAP #10
>> NIP:  c0011474 LR: c0011464 CTR: 00000000
>> REGS: e2f75e40 TRAP: 0700   Not tainted  (5.13.0-pmac-VMAP)
>> MSR:  00021032 <ME,IR,DR,RI>  CR: 2400446c  XER: 20000000
>>
>> GPR00: c001604c e2f75f00 ca284a60 00000000 00000000 a5205eb0 00000008 00000020
>> GPR08: ffffffc0 00000001 501200d9 ce030005 ca285010 00c1f778 00000000 00000000
>> GPR16: 00945b20 009402f8 00000001 a6b87550 a51fd000 afb73220 a6b22c78 a6a6aecc
>> GPR24: 00000000 ffffffc0 00000020 00000008 a5205eb0 00000000 e2f75f40 000000ae
>> NIP [c0011474] system_call_exception+0x60/0x164
>> LR [c0011464] system_call_exception+0x50/0x164
>> Call Trace:
>> [e2f75f00] [00009000] 0x9000 (unreliable)
>> [e2f75f30] [c001604c] ret_from_syscall+0x0/0x28
>> --- interrupt: c00 at 0xa69d6cb0
>> NIP:  a69d6cb0 LR: a69d6c3c CTR: 00000000
>> REGS: e2f75f40 TRAP: 0c00   Not tainted  (5.13.0-pmac-VMAP)
>> MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 2400446c  XER: 20000000
>>
>> GPR00: 000000ae a5205de0 a5687ca0 00000000 00000000 a5205eb0 00000008 00000020
>> GPR08: ffffffc0 401201ea 401200d9 ffffffff c158f230 00c1f778 00000000 00000000
>> GPR16: 00945b20 009402f8 00000001 a6b87550 a51fd000 afb73220 a6b22c78 a6a6aecc
>> GPR24: afb72fc8 00000000 00000001 a5205f30 afb733dc 00000000 a6b85ff4 a5205eb0
>> NIP [a69d6cb0] 0xa69d6cb0
>> LR [a69d6c3c] 0xa69d6c3c
>> --- interrupt: c00
>> Instruction dump:
>> 7cdb3378 93810020 7cbc2b78 93a10024 7c9d2378 93e1002c 7d3f4b78 4800d629
>> 817e0084 931e0088 69690002 5529fffe <0f090000> 69694000 552997fe 0f090000
>> ---[ end trace c66c6c3c44806276 ]---
>>

Getting a BUG at arch/powerpc/kernel/interrupt.c:49 means MSR_RI is not set, but the c00 interrupt 
frame shows MSR_RI properly set, so what is going on?
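
For reference, the two MSR values in the trace above can be decoded with a small sketch like the one below. It is only an illustration: the table covers a subset of the classic 32-bit MSR bits, and decode_msr is a hypothetical helper, not kernel code.

```python
# Subset of the classic 32-bit PowerPC MSR bits (mask values assumed from
# the usual definitions; not an exhaustive list).
MSR_BITS = [
    ("EE", 0x8000),  # external interrupts enabled
    ("PR", 0x4000),  # problem state (user mode)
    ("FP", 0x2000),  # floating point available
    ("ME", 0x1000),  # machine check enabled
    ("IR", 0x0020),  # instruction relocation
    ("DR", 0x0010),  # data relocation
    ("RI", 0x0002),  # recoverable interrupt
    ("LE", 0x0001),  # little-endian mode
]


def decode_msr(value):
    """Return the names of the known MSR bits set in value (hypothetical helper)."""
    return [name for name, mask in MSR_BITS if value & mask]


trap_frame_msr = 0x00021032     # MSR shown for the 0700 frame
syscall_frame_msr = 0x0000D032  # MSR shown for the c00 frame

print(decode_msr(trap_frame_msr))
print(decode_msr(syscall_frame_msr))
```

Both decoded lists include RI, which is why the BUG looks inconsistent with the saved c00 frame.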

Thanks
Christophe


* Re: [PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI)
  2021-08-04  6:21     ` Christophe Leroy
@ 2021-08-04 11:36       ` Nicholas Piggin
  2021-08-04 13:28         ` Christophe Leroy
  0 siblings, 1 reply; 9+ messages in thread
From: Nicholas Piggin @ 2021-08-04 11:36 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Finn Thain, linux-kernel, linuxppc-dev,
	Michael Ellerman, Paul Mackerras, userm57

Excerpts from Christophe Leroy's message of August 4, 2021 4:21 pm:
> Hi Nic,
> 
> I think I'll need your help on that one.
> 
> On 04/08/2021 at 08:07, Christophe Leroy wrote:
>> 
>> 
>> On 04/08/2021 at 06:04, Finn Thain wrote:

Hi Finn!

>>> On Tue, 3 Aug 2021, Christophe Leroy wrote:
>>>
> ...
>>>
>>> ------------[ cut here ]------------
>>> kernel BUG at arch/powerpc/kernel/interrupt.c:49!
>>> Oops: Exception in kernel mode, sig: 5 [#1]
>>> BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
>>> Modules linked in:
>>> CPU: 0 PID: 1859 Comm: xfce4-session Not tainted 5.13.0-pmac-VMAP #10
>>> NIP:  c0011474 LR: c0011464 CTR: 00000000
>>> REGS: e2f75e40 TRAP: 0700   Not tainted  (5.13.0-pmac-VMAP)
>>> MSR:  00021032 <ME,IR,DR,RI>  CR: 2400446c  XER: 20000000
>>>
>>> GPR00: c001604c e2f75f00 ca284a60 00000000 00000000 a5205eb0 00000008 00000020
>>> GPR08: ffffffc0 00000001 501200d9 ce030005 ca285010 00c1f778 00000000 00000000
>>> GPR16: 00945b20 009402f8 00000001 a6b87550 a51fd000 afb73220 a6b22c78 a6a6aecc
>>> GPR24: 00000000 ffffffc0 00000020 00000008 a5205eb0 00000000 e2f75f40 000000ae
>>> NIP [c0011474] system_call_exception+0x60/0x164
>>> LR [c0011464] system_call_exception+0x50/0x164
>>> Call Trace:
>>> [e2f75f00] [00009000] 0x9000 (unreliable)
>>> [e2f75f30] [c001604c] ret_from_syscall+0x0/0x28
>>> --- interrupt: c00 at 0xa69d6cb0
>>> NIP:  a69d6cb0 LR: a69d6c3c CTR: 00000000
>>> REGS: e2f75f40 TRAP: 0c00   Not tainted  (5.13.0-pmac-VMAP)
>>> MSR:  0000d032 <EE,PR,ME,IR,DR,RI>  CR: 2400446c  XER: 20000000
>>>
>>> GPR00: 000000ae a5205de0 a5687ca0 00000000 00000000 a5205eb0 00000008 00000020
>>> GPR08: ffffffc0 401201ea 401200d9 ffffffff c158f230 00c1f778 00000000 00000000
>>> GPR16: 00945b20 009402f8 00000001 a6b87550 a51fd000 afb73220 a6b22c78 a6a6aecc
>>> GPR24: afb72fc8 00000000 00000001 a5205f30 afb733dc 00000000 a6b85ff4 a5205eb0
>>> NIP [a69d6cb0] 0xa69d6cb0
>>> LR [a69d6c3c] 0xa69d6c3c
>>> --- interrupt: c00
>>> Instruction dump:
>>> 7cdb3378 93810020 7cbc2b78 93a10024 7c9d2378 93e1002c 7d3f4b78 4800d629
>>> 817e0084 931e0088 69690002 5529fffe <0f090000> 69694000 552997fe 0f090000
>>> ---[ end trace c66c6c3c44806276 ]---
>>>
> 
> Getting a BUG at arch/powerpc/kernel/interrupt.c:49 means MSR_RI is not set, but the c00 interrupt 
> frame shows MSR_RI properly set, so what is going on?

Could the stack be correct but regs pointer incorrect?

Instruction dump is

   0:   78 33 db 7c     mr      r27,r6
   4:   20 00 81 93     stw     r28,32(r1)
   8:   78 2b bc 7c     mr      r28,r5
   c:   24 00 a1 93     stw     r29,36(r1)
  10:   78 23 9d 7c     mr      r29,r4
  14:   2c 00 e1 93     stw     r31,44(r1)
  18:   78 4b 3f 7d     mr      r31,r9
  1c:   29 d6 00 48     bl      0xd644
  20:   84 00 7e 81     lwz     r11,132(r30)
  24:   88 00 1e 93     stw     r24,136(r30)
  28:   02 00 69 69     xori    r9,r11,2
  2c:   fe ff 29 55     rlwinm  r9,r9,31,31,31
  30:   00 00 09 0f     twnei   r9,0
  34:   00 40 69 69     xori    r9,r11,16384
  38:   fe 97 29 55     rlwinm  r9,r9,18,31,31
  3c:   00 00 09 0f     twnei   r9,0

regs->msr is in r11 == 0xce030005 so some kernel address?
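
To make the check in the dump concrete, here is a rough emulation of the xori/rlwinm/twnei sequences. This is a sketch under assumptions: the helper names are made up, and reading the two sequences as "trap if MSR_RI (0x2) is clear" and "trap if MSR_PR (0x4000) is clear" is my interpretation of the disassembly.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, shift):
    """Rotate a 32-bit value left by shift bits."""
    shift %= 32
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def ppc_mask(mb, me):
    """Mask with IBM-numbered bits mb..me set (bit 0 is the MSB); assumes mb <= me."""
    return (MASK32 >> mb) & (MASK32 << (31 - me)) & MASK32


def rlwinm(rs, sh, mb, me):
    """Rotate left word immediate, then AND with mask."""
    return rotl32(rs, sh) & ppc_mask(mb, me)


def traps_if_bit_clear(reg, bit_mask, sh):
    """Emulate: xori r9,reg,bit_mask ; rlwinm r9,r9,sh,31,31 ; twnei r9,0."""
    r9 = reg ^ bit_mask          # flip the bit under test
    r9 = rlwinm(r9, sh, 31, 31)  # move that bit into the least significant position
    return r9 != 0               # twnei traps when r9 is non-zero


r11 = 0xCE030005       # GPR11 from the 0700 frame
saved_msr = 0x0000D032  # MSR from the c00 frame

print(traps_if_bit_clear(r11, 0x2, 31))        # RI check on the value actually loaded
print(traps_if_bit_clear(saved_msr, 0x2, 31))  # RI check on the MSR the frame shows
```

With the r11 value from the trace the first check traps, while the MSR value printed for the c00 frame would pass both checks.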

r1  == 0xe2f75f00
r30 == 0xe2f75f40

I think that matches if the function allocates 48 bytes of stack. 
STACK_FRAME_OVERHEAD is 16, so the difference would be 0x40 in that
case. Seems okay.
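
The arithmetic can be checked against the addresses in the Call Trace. A minimal sketch, assuming the second Call Trace entry (e2f75f30) is the caller's frame and taking STACK_FRAME_OVERHEAD as the 16 bytes quoted above:

```python
r1 = 0xE2F75F00            # stack pointer at the BUG
caller_frame = 0xE2F75F30  # second Call Trace entry (assumed to be the caller's frame)
r30 = 0xE2F75F40           # regs pointer held in r30
STACK_FRAME_OVERHEAD = 16  # ppc32 value as quoted in this message

frame_size = caller_frame - r1                         # bytes allocated by the function
expected_regs = r1 + frame_size + STACK_FRAME_OVERHEAD  # where pt_regs should start

print(frame_size, hex(expected_regs), expected_regs == r30)
```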

I'm not sure then. Could a hash fault interrupt come in here
because of the vmap stack access and clobber r11? Hmm...

fast_hash_page_return:
        andis.  r10, r9, SRR1_ISI_NOPT@h        /* Set on ISI, cleared on DSI */

Is that really right? DSI can set this bit for NOHPTE as well, no?
That'd do it.

Thanks,
Nick


* Re: [PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI)
  2021-08-04 11:36       ` Nicholas Piggin
@ 2021-08-04 13:28         ` Christophe Leroy
  2021-08-04 14:45           ` Nicholas Piggin
  0 siblings, 1 reply; 9+ messages in thread
From: Christophe Leroy @ 2021-08-04 13:28 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Benjamin Herrenschmidt, Finn Thain, linux-kernel, linuxppc-dev,
	Michael Ellerman, Paul Mackerras, userm57



On 04/08/2021 at 13:36, Nicholas Piggin wrote:
> Excerpts from Christophe Leroy's message of August 4, 2021 4:21 pm:
>> Hi Nic,
>>
>> I think I'll need your help on that one.
>>
>> On 04/08/2021 at 08:07, Christophe Leroy wrote:
>>>
>>>
>>> On 04/08/2021 at 06:04, Finn Thain wrote:
> 
> Hi Finn!
> 
>>>> On Tue, 3 Aug 2021, Christophe Leroy wrote:
>>>>
>> ...
>>
>> Getting a BUG at arch/powerpc/kernel/interrupt.c:49 means MSR_RI is not set, but the c00 interrupt
>> frame shows MSR_RI properly set, so what is going on?
> 
> Could the stack be correct but regs pointer incorrect?
> 
> Instruction dump is
> 
>     0:   78 33 db 7c     mr      r27,r6
>     4:   20 00 81 93     stw     r28,32(r1)
>     8:   78 2b bc 7c     mr      r28,r5
>     c:   24 00 a1 93     stw     r29,36(r1)
>    10:   78 23 9d 7c     mr      r29,r4
>    14:   2c 00 e1 93     stw     r31,44(r1)
>    18:   78 4b 3f 7d     mr      r31,r9
>    1c:   29 d6 00 48     bl      0xd644
>    20:   84 00 7e 81     lwz     r11,132(r30)
>    24:   88 00 1e 93     stw     r24,136(r30)
>    28:   02 00 69 69     xori    r9,r11,2
>    2c:   fe ff 29 55     rlwinm  r9,r9,31,31,31
>    30:   00 00 09 0f     twnei   r9,0
>    34:   00 40 69 69     xori    r9,r11,16384
>    38:   fe 97 29 55     rlwinm  r9,r9,18,31,31
>    3c:   00 00 09 0f     twnei   r9,0
> 
> regs->msr is in r11 == 0xce030005 so some kernel address?
> 
> r1  == 0xe2f75f00
> r30 == 0xe2f75f40
> 
> I think that matches if the function allocates 48 bytes of stack.
> STACK_FRAME_OVERHEAD is 16, so the difference would be 0x40 in that
> case. Seems okay.
> 
> I'm not sure then. Can you get a hash fault interrupt come in here
> because of the vmap stack access and clobber r11? Hmm...
> 
> fast_hash_page_return:
>          andis.  r10, r9, SRR1_ISI_NOPT@h        /* Set on ISI, cleared on DSI */
> 
> Is that really right? DSI can set this bit for NOHPTE as well no?

On DSI, the error bits are in DSISR while they are in SRR1 on ISI.

r9 is supposed to contain SRR1 in both cases. The PowerPC 32-bit programming manual explicitly says 
that bits 1-4 and 10-15 of SRR1 are cleared on DSI.
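
As a quick illustration of the bit numbering, the sketch below converts those IBM-style bit ranges (bit 0 is the MSB) to masks and checks that the bit tested in fast_hash_page_return falls inside the cleared range. The helper names are hypothetical, and SRR1_ISI_NOPT = 0x40000000 is assumed here rather than quoted from the source.

```python
def ibm_bit(n, width=32):
    """Mask for IBM-numbered bit n, where bit 0 is the most significant bit."""
    return 1 << (width - 1 - n)


def ibm_range(first, last):
    """Mask covering IBM-numbered bits first..last inclusive."""
    mask = 0
    for n in range(first, last + 1):
        mask |= ibm_bit(n)
    return mask


SRR1_ISI_NOPT = 0x40000000  # assumed value, i.e. IBM bit 1

# SRR1 bits that the manual says are cleared on DSI.
cleared_on_dsi = ibm_range(1, 4) | ibm_range(10, 15)

# andis. uses the high half of the constant (@h) shifted back up by 16.
andis_mask = (SRR1_ISI_NOPT >> 16) << 16

print(hex(cleared_on_dsi))
print(bool(andis_mask & cleared_on_dsi))
```

Since the tested bit lies within the range cleared on DSI, a non-zero result from the andis. should only be possible on ISI.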

Thanks,
Christophe


* Re: [PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI)
  2021-08-04 13:28         ` Christophe Leroy
@ 2021-08-04 14:45           ` Nicholas Piggin
  0 siblings, 0 replies; 9+ messages in thread
From: Nicholas Piggin @ 2021-08-04 14:45 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Finn Thain, linux-kernel, linuxppc-dev,
	Michael Ellerman, Paul Mackerras, userm57

Excerpts from Christophe Leroy's message of August 4, 2021 11:28 pm:
> 
> 
> On 04/08/2021 at 13:36, Nicholas Piggin wrote:
>> Excerpts from Christophe Leroy's message of August 4, 2021 4:21 pm:
>>> Hi Nic,
>>>
>>> I think I'll need your help on that one.
>>>
>>> On 04/08/2021 at 08:07, Christophe Leroy wrote:
>>>>
>>>>
>>>> On 04/08/2021 at 06:04, Finn Thain wrote:
>> 
>> Hi Finn!
>> 
>>>>> On Tue, 3 Aug 2021, Christophe Leroy wrote:
>>>>>
>>> ...
>>>
>>> Getting a BUG at arch/powerpc/kernel/interrupt.c:49 means MSR_RI is not set, but the c00 interrupt
>>> frame shows MSR_RI properly set, so what is going on?
>> 
>> Could the stack be correct but regs pointer incorrect?
>> 
>> Instruction dump is
>> 
>>     0:   78 33 db 7c     mr      r27,r6
>>     4:   20 00 81 93     stw     r28,32(r1)
>>     8:   78 2b bc 7c     mr      r28,r5
>>     c:   24 00 a1 93     stw     r29,36(r1)
>>    10:   78 23 9d 7c     mr      r29,r4
>>    14:   2c 00 e1 93     stw     r31,44(r1)
>>    18:   78 4b 3f 7d     mr      r31,r9
>>    1c:   29 d6 00 48     bl      0xd644
>>    20:   84 00 7e 81     lwz     r11,132(r30)
>>    24:   88 00 1e 93     stw     r24,136(r30)
>>    28:   02 00 69 69     xori    r9,r11,2
>>    2c:   fe ff 29 55     rlwinm  r9,r9,31,31,31
>>    30:   00 00 09 0f     twnei   r9,0
>>    34:   00 40 69 69     xori    r9,r11,16384
>>    38:   fe 97 29 55     rlwinm  r9,r9,18,31,31
>>    3c:   00 00 09 0f     twnei   r9,0
>> 
>> regs->msr is in r11 == 0xce030005 so some kernel address?
>> 
>> r1  == 0xe2f75f00
>> r30 == 0xe2f75f40
>> 
>> I think that matches if the function allocates 48 bytes of stack.
>> STACK_FRAME_OVERHEAD is 16, so the difference would be 0x40 in that
>> case. Seems okay.
>> 
>> I'm not sure then. Can you get a hash fault interrupt come in here
>> because of the vmap stack access and clobber r11? Hmm...
>> 
>> fast_hash_page_return:
>>          andis.  r10, r9, SRR1_ISI_NOPT@h        /* Set on ISI, cleared on DSI */
>> 
>> Is that really right? DSI can set this bit for NOHPTE as well no?
> 
> On DSI, the error bits are in DSISR while they are in SRR1 on ISI.
> 
> r9 is supposed to contain SRR1 in both cases. The PowerPC 32-bit programming manual explicitly says 
> that bits 1-4 and 10-15 of SRR1 are cleared on DSI.

Ah right, I had in mind it was DSISR on DSI and SRR1 on ISI because
we put them together early on 64s. Can't think of anything else at the 
moment.

Thanks,
Nick



* Re: [PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI)
  2021-08-04  6:07   ` Christophe Leroy
  2021-08-04  6:21     ` Christophe Leroy
@ 2021-08-05  1:54     ` Finn Thain
  1 sibling, 0 replies; 9+ messages in thread
From: Finn Thain @ 2021-08-05  1:54 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	userm57, linux-kernel, linuxppc-dev


On Wed, 4 Aug 2021, Christophe Leroy wrote:

> 
> This patch is related to the bisect you did that pointed to 4c0104a83fc3 
> ("powerpc/32: Dismantle EXC_XFER_STD/LITE/TEMPLATE")
> 
> I think maybe the starting point should be to (manually) apply the patch 
> on top of that commit in order to check that the bug that led to that 
> commit being flagged as the 'first bad commit' is now gone.
> 

Stan has now confirmed this. He applied this patch on top of 4c0104a83fc3, 
and it did indeed resolve the bug that 'git bisect' isolated [1]. Thanks 
Christophe.

[1]
https://lore.kernel.org/lkml/666e3ab4-372-27c2-4621-7cc3933756dd@linux-m68k.org/


* Re: [PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI)
  2021-08-03 15:14 [PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI) Christophe Leroy
  2021-08-04  4:04 ` Finn Thain
@ 2021-08-13 11:57 ` Michael Ellerman
  1 sibling, 0 replies; 9+ messages in thread
From: Michael Ellerman @ 2021-08-13 11:57 UTC (permalink / raw)
  To: Finn Thain, Benjamin Herrenschmidt, Michael Ellerman, userm57,
	Christophe Leroy, Paul Mackerras
  Cc: linuxppc-dev, linux-kernel

On Tue, 3 Aug 2021 15:14:27 +0000 (UTC), Christophe Leroy wrote:
> When a DSI (Data Storage Interrupt) is taken while in NAP mode,
> r11 doesn't survive the call to power_save_ppc32_restore().
> 
> So use r1 instead of r11 as they both contain the virtual stack
> pointer at that point.

Applied to powerpc/fixes.

[1/1] powerpc/32s: Fix napping restore in data storage interrupt (DSI)
      https://git.kernel.org/powerpc/c/62376365048878f770d8b7d11b89b8b3e18018f1

cheers
