* [PATCH] powerpc/64s/exception: Fix machine check early corrupting AMR
@ 2019-06-21 22:55 Nicholas Piggin
2019-06-29 11:37 ` Michael Ellerman
2019-06-29 12:25 ` Michael Ellerman
0 siblings, 2 replies; 3+ messages in thread
From: Nicholas Piggin @ 2019-06-21 22:55 UTC
To: linuxppc-dev; +Cc: Nicholas Piggin
The early machine check runs in real mode, so locking is unnecessary.
Worse, the windup does not restore AMR, so this can result in a false
KUAP fault after a recoverable machine check hits inside a user copy
operation.
Fix this similarly to HMI by just avoiding the kuap lock in the
early machine check handler (it will be set by the late handler that
runs in virtual mode if that runs). If the virtual mode handler is
reached, it will lock and restore the AMR.
Fixes: 890274c2dc4c0 ("powerpc/64s: Implement KUAP for Radix MMU")
Cc: Russell Currey <ruscur@russell.cc>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/exceptions-64s.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 6b86055e5251..73ba246ca11d 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -315,7 +315,7 @@ TRAMP_REAL_BEGIN(machine_check_common_early)
 	mfspr	r11,SPRN_DSISR		/* Save DSISR */
 	std	r11,_DSISR(r1)
 	std	r9,_CCR(r1)		/* Save CR in stackframe */
-	kuap_save_amr_and_lock r9, r10, cr1
+	/* We don't touch AMR here, we never go to virtual mode */
 	/* Save r9 through r13 from EXMC save area to stack frame. */
 	EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
 	mfmsr	r11			/* get MSR value */
--
2.20.1
* Re: [PATCH] powerpc/64s/exception: Fix machine check early corrupting AMR
From: Michael Ellerman @ 2019-06-29 11:37 UTC
To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin
On Fri, 2019-06-21 at 22:55:54 UTC, Nicholas Piggin wrote:
> The early machine check runs in real mode, so locking is unnecessary.
> Worse, the windup does not restore AMR, so this can result in a false
> KUAP fault after a recoverable machine check hits inside a user copy
> operation.
>
> Fix this similarly to HMI by just avoiding the kuap lock in the
> early machine check handler (it will be set by the late handler that
> runs in virtual mode if that runs). If the virtual mode handler is
> reached, it will lock and restore the AMR.
>
> Fixes: 890274c2dc4c0 ("powerpc/64s: Implement KUAP for Radix MMU")
> Cc: Russell Currey <ruscur@russell.cc>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Applied to powerpc fixes, thanks.
https://git.kernel.org/powerpc/c/e13e7cd4c0c1cc9984d9b6a8663e10d76b53f2aa
cheers
* Re: [PATCH] powerpc/64s/exception: Fix machine check early corrupting AMR
From: Michael Ellerman @ 2019-06-29 12:25 UTC
To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin
Nicholas Piggin <npiggin@gmail.com> writes:
> The early machine check runs in real mode, so locking is unnecessary.
> Worse, the windup does not restore AMR, so this can result in a false
> KUAP fault after a recoverable machine check hits inside a user copy
> operation.
>
> Fix this similarly to HMI by just avoiding the kuap lock in the
> early machine check handler (it will be set by the late handler that
> runs in virtual mode if that runs). If the virtual mode handler is
> reached, it will lock and restore the AMR.
For the archives, this is how I tested this.
Build with KUAP enabled and disassemble load_elf_binary(). In there is a
call to __copy_tofrom_user(), preceded by a write to the AMR (SPR 29), e.g.:
  c00000000045eec8: a6 03 3d 7d mtspr 29,r9
  c00000000045eecc: 2c 01 00 4c isync
  c00000000045eed0: 78 93 44 7e mr r4,r18
  c00000000045eed4: 78 e3 83 7f mr r3,r28
  c00000000045eed8: b1 c1 c3 4b bl c00000000009b088 <__copy_tofrom_user+0x8>
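To confirm that the mtspr above targets the AMR, the instruction word can
be decoded by hand. The following is a minimal Python sketch, not part of
the original test; the helper name decode_mtspr is hypothetical. It assumes
the four bytes shown in the disassembly are in little-endian memory order
and uses the Power ISA XFX-form layout, in which the two 5-bit halves of
the SPR number are stored swapped:

```python
import struct


def decode_mtspr(insn_bytes):
    """Decode a little-endian PowerPC mtspr word into (rs, spr).

    Returns None if the word is not an mtspr, i.e. if it does not have
    primary opcode 31 and extended opcode 467.
    """
    (insn,) = struct.unpack("<I", insn_bytes)
    opcode = insn >> 26
    xo = (insn >> 1) & 0x3FF
    if opcode != 31 or xo != 467:
        return None
    rs = (insn >> 21) & 0x1F
    field = (insn >> 11) & 0x3FF
    # The SPR number is encoded with its two 5-bit halves swapped.
    spr = ((field & 0x1F) << 5) | (field >> 5)
    return rs, spr


# Bytes from the disassembly line at c00000000045eec8.
print(decode_mtspr(bytes.fromhex("a6033d7d")))  # prints (9, 29)
```

SPR 29 is the AMR (SPRN_AMR in the kernel headers), so this instruction
writes r9 to the AMR just before the call.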
Boot mambo using skiboot.tcl and break into the mambo shell. Add a
breakpoint at the branch to __copy_tofrom_user():
  systemsim % b 0xc00000000045eed8
  breakpoint set at [0:0:0]: 0xc00000000045eed8 (0xC00000000045EED8) Enc:0x00000000 : INVALID
Continue, then run `ls` in the system shell. It should stop at the breakpoint:
  systemsim % c
  4439260000000: [0:0]: (PC:0x00007FFFB43B2F00) : 2.1 Mega-Inst/Sec : 2.1 Mega-Cycles/Sec [1 Zaps 0 PA-Zaps] *ON* [0:0] pri=4 extra=0
  4440009381609: (7208208132): # ls
  [0:0:0]: 0xC00000000045EED8 (0x000000000045EED8) Enc:0xB1C1C34B : bl $-0x3C3E50
  INFO: 4440936223969: (8135050536): ** Execution stopped: user (tcl), **
  4440936223969: ** finished running 8135050536 instructions **
Print the AMR. It has been cleared, so user access is unlocked for the copy:
  systemsim % p amr
  0x0000000000000000
Then inject a machine check exception, and continue:
  systemsim % exc_mce
  systemsim % c
  4440936231861: (8135058428): [ 8673.510176] Disabling lock debugging due to kernel taint
  4440936246871: (8135073438): [ 8673.510205] MCE: CPU0: machine check (Warning) Host TLB Multihit [Recovered]
  4440936266680: (8135093247): [ 8673.510244] MCE: CPU0: NIP: [c00000000045eed8] load_elf_binary+0xef8/0x1970
  4440936282657: (8135109224): [ 8673.510275] MCE: CPU0: Probable Software error (some chance of hardware cause)
  [0:0:0]: 0xC00000000045EED8 (0x000000000045EED8) Enc:0xB1C1C34B : bl $-0x3C3E50
  INFO: 4440936296116: (8135122683): ** Execution stopped: user (tcl), **
  4440936296116: ** finished running 8135122683 instructions **
Now we're back at our breakpoint. Continue again and we should get an
oops due to a bad AMR fault:
systemsim % c
4440936301692: (8135128259): [ 8673.510312] ------------[ cut here ]------------
4440936321016: (8135147583): [ 8673.510336] Bug: Write fault blocked by AMR!
4440936331347: (8135157914): [ 8673.510350] WARNING: CPU: 0 PID: 95 at arch/powerpc/include/asm/book3s/64/kup-radix.h:102 __do_page_fault+0x604/0xe60
4440936352510: (8135179077): [ 8673.510410] Modules linked in:
4440936365222: (8135191789): [ 8673.510436] CPU: 0 PID: 95 Comm: ls Tainted: G M 5.2.0-rc2-gcc-8.2.0 #273
4440936383775: (8135210342): [ 8673.510473] NIP: c0000000000716b4 LR: c0000000000716b0 CTR: c000000000ca88b0
4440936401995: (8135228562): [ 8673.510508] REGS: c0000000ec883530 TRAP: 0700 Tainted: G M (5.2.0-rc2-gcc-8.2.0)
4440936430641: (8135257208): [ 8673.510545] MSR: 9000000000021033 <SF,HV,ME,IR,DR,RI,LE> CR: 28002422 XER: 20040000
4440936498754: (8135325321): [ 8673.510597] CFAR: c00000000011b8e4 IRQMASK: 1
4440936505159: (8135331726): [ 8673.510597] GPR00: c0000000000716b0 c0000000ec8837c0 c0000000015f4900 0000000000000020
4440936515814: (8135342381): [ 8673.510597] GPR04: c000000001824550 0000000000000000 746c756166206574 64656b636f6c6220
4440936528594: (8135355161): [ 8673.510597] GPR08: 00000000fed30000 c000000001130de8 0000000000000000 9000000030001033
4440936541374: (8135367941): [ 8673.510597] GPR12: 0000000000002000 c0000000018e0000 0000000080000000 00007fffe2e3de09
4440936554154: (8135380721): [ 8673.510597] GPR16: c000000000ed2c50 0000000010000000 c000000000ed2c50 00000000100d3648
4440936564809: (8135391376): [ 8673.510597] GPR20: c0000000f0968b00 00000000100e3648 00007fff930a0000 0000000002000000
4440936577589: (8135404156): [ 8673.510597] GPR24: 0000000002000000 c0000000ee830600 0000000000000301 00007fffe2e3de09
4440936590369: (8135416936): [ 8673.510597] GPR28: 0000000000000000 000000000a000000 0000000000000000 c0000000ec883900
4440936611699: (8135438266): [ 8673.510918] NIP [c0000000000716b4] __do_page_fault+0x604/0xe60
4440936628747: (8135455314): [ 8673.510951] LR [c0000000000716b0] __do_page_fault+0x600/0xe60
4440936642325: (8135468892): [ 8673.510978] Call Trace:
4440936655614: (8135482181): [ 8673.511000] [c0000000ec8837c0] [c0000000000716b0] __do_page_fault+0x600/0xe60 (unreliable)
4440936677874: (8135504441): [ 8673.511045] [c0000000ec883890] [c00000000000b0d4] handle_page_fault+0x18/0x38
4440936700658: (8135527225): [ 8673.511091] --- interrupt: 301 at __copy_tofrom_user_power7+0x230/0x7ac
4440936709188: (8135535755): [ 8673.511091] LR = load_elf_binary+0xefc/0x1970
4440936728082: (8135554649): [ 8673.511142] [c0000000ec883b90] [c00000000045ee80] load_elf_binary+0xea0/0x1970 (unreliable)
4440936750368: (8135576935): [ 8673.511187] [c0000000ec883c90] [c0000000003d2f88] search_binary_handler.part.12+0xb8/0x2b0
4440936772446: (8135599013): [ 8673.511230] [c0000000ec883d20] [c0000000003d3934] __do_execve_file.isra.14+0x684/0xa10
4440936793891: (8135620458): [ 8673.511272] [c0000000ec883df0] [c0000000003d41b8] sys_execve+0x38/0x50
4440936813829: (8135640396): [ 8673.511311] [c0000000ec883e20] [c00000000000bdf4] system_call+0x5c/0x70
4440936828817: (8135655384): [ 8673.511340] Instruction dump:
4440936848134: (8135674701): [ 8673.511361] 60000000 2fb70000 e93f0168 419e0620 2fa90000 409cfba4 3c82ff8e 38846b88
4440936874244: (8135700811): [ 8673.511412] 3c62ff8e 38636c98 480aa1d1 60000000 <0fe00000> e80100e0 3b80000b eae10088
4440936891327: (8135717894): [ 8673.511464] ---[ end trace 0698ac8ff1068918 ]---
4440938377906: (8137204473): Segmentation fault
Apply the fix, retest, and no oops is seen.
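The failure mode exercised above can be summarised with a toy model. This
is a minimal Python sketch under stated assumptions, not kernel code: the
Cpu class, the function names, and the AMR_BLOCKED constant are
hypothetical stand-ins for the AMR register, the early handler plus its
windup, and the KUAP lock value.

```python
AMR_UNLOCKED = 0x0
# Assumed illustrative lock value: the two top AMR bits set.
AMR_BLOCKED = 0xC000000000000000


class Cpu:
    """Holds only the state this model cares about: the AMR."""

    def __init__(self):
        self.amr = AMR_BLOCKED  # the kernel normally runs with user access locked


def early_machine_check(cpu, lock_in_early_handler):
    """Model the real-mode early handler followed by the windup."""
    if lock_in_early_handler:
        # Stand-in for the effect of kuap_save_amr_and_lock before the fix.
        cpu.amr = AMR_BLOCKED
    # Windup: returns to the interrupted code without writing the AMR back.


def user_copy_with_mce(lock_in_early_handler):
    """Unlock the AMR, take a recoverable machine check, then try the copy."""
    cpu = Cpu()
    cpu.amr = AMR_UNLOCKED  # the mtspr before __copy_tofrom_user()
    early_machine_check(cpu, lock_in_early_handler)
    return "KUAP fault" if cpu.amr & AMR_BLOCKED else "copy ok"


print(user_copy_with_mce(lock_in_early_handler=True))   # before the fix
print(user_copy_with_mce(lock_in_early_handler=False))  # after the fix
```

With the lock in the early handler, the interrupted copy resumes with the
AMR still locked and faults. Without it, the AMR is left untouched and the
copy proceeds.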
cheers