* [PATCH 0/4] Fixes for 3 separate NMI reentrancy bugs
@ 2019-01-22 6:46 Nicholas Piggin
2019-01-22 6:46 ` [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test Nicholas Piggin
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Nicholas Piggin @ 2019-01-22 6:46 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
This series fixes several similar but unrelated bugs with NMIs
clobbering live registers and getting away with it because
MSR[RI] is set. Pretty rare bugs, but serious silent corruption
consequences.
For the most part these can be observed and tested quite easily
with the mambo simulator except that it does not seem to follow
the architecture wrt leaving MSR[RI] unchanged for HV interrupts.
Manually fixing that up in the sim environment can trigger that
case.
Thanks,
Nick
Nicholas Piggin (4):
powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
powerpc/64s: system reset interrupt preserve HSRRs
powerpc/64s: Prepare to handle data interrupts vs d-side MCE
reentrancy
powerpc/64s: Fix data interrupts vs d-side MCE reentrancy
arch/powerpc/include/asm/nmi.h | 2 +
arch/powerpc/kernel/exceptions-64s.S | 76 +++++++++++++++++++-----
arch/powerpc/kernel/mce.c | 3 +
arch/powerpc/kernel/traps.c | 87 +++++++++++++++++++++++++++-
4 files changed, 151 insertions(+), 17 deletions(-)
--
2.18.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
2019-01-22 6:46 [PATCH 0/4] Fixes for 3 separate NMI reentrancy bugs Nicholas Piggin
@ 2019-01-22 6:46 ` Nicholas Piggin
2019-02-20 14:31 ` Nicholas Piggin
2019-01-22 6:46 ` [PATCH 2/4] powerpc/64s: system reset interrupt preserve HSRRs Nicholas Piggin
` (2 subsequent siblings)
3 siblings, 1 reply; 8+ messages in thread
From: Nicholas Piggin @ 2019-01-22 6:46 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
HV interrupts that use HSRR registers do not clear MSR[RI], but
NMI entry code is not recoverable early on due to both using HSPRG
for a scratch register.
This bug means that a system reset or machine check can cause silent
data corruption (due to loss of r13 register) if it hits in a small
window when taking an HV interrupt.
Fix this by marking NMIs non-recoverable if they land in HV interrupt
ranges.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/include/asm/nmi.h | 2 ++
arch/powerpc/kernel/mce.c | 3 ++
arch/powerpc/kernel/traps.c | 62 ++++++++++++++++++++++++++++++++++
3 files changed, 67 insertions(+)
diff --git a/arch/powerpc/include/asm/nmi.h b/arch/powerpc/include/asm/nmi.h
index bd9ba8defd72..84b4cfe73edd 100644
--- a/arch/powerpc/include/asm/nmi.h
+++ b/arch/powerpc/include/asm/nmi.h
@@ -14,4 +14,6 @@ extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
#endif
+extern void hv_nmi_check_nonrecoverable(struct pt_regs *regs);
+
#endif /* _ASM_NMI_H */
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index bd933a75f0bc..d653b5de4537 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -31,6 +31,7 @@
#include <asm/machdep.h>
#include <asm/mce.h>
+#include <asm/nmi.h>
static DEFINE_PER_CPU(int, mce_nest_count);
static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event);
@@ -488,6 +489,8 @@ long machine_check_early(struct pt_regs *regs)
{
long handled = 0;
+ hv_nmi_check_nonrecoverable(regs);
+
/*
* See if platform is capable of handling machine check.
*/
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 64936b60d521..b429b2264a1f 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -376,6 +376,66 @@ void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
force_sig_fault(signr, code, (void __user *)addr, current);
}
+/*
+ * The interrupt architecture has a quirk in that the HV interrupts excluding
+ * the NMIs (0x100 and 0x200) do not clear MSR[RI] at entry. The first thing
+ * that an interrupt handler must do is save off a GPR into a scratch register,
+ * and all interrupts on POWERNV (HV=1) use the same HSPRG register as scratch.
+ * Therefore an NMI can clobber an HV interrupt's live HSPRG without noticing
+ * that it is non-reentrant, which leads to random data corruption.
+ *
+ * The solution is for NMI interrupts in HV mode to check if they originated
+ * from these critical HV interrupt regions. If so, then mark them not
+ * recoverable.
+ *
+ * An alternative would be for HV NMIs to use SPRG for scratch to avoid the
+ * HSPRG clobber, however this would cause guest SPRG to be clobbered. Linux
+ * guests should always have MSR[RI]=0 when its scratch SPRG is in use, so
+ * that would work. However any other guest OS that may have the SPRG live
+ * and MSR[RI]=1 could encounter silent corruption.
+ *
+ * Builds that do not support KVM could take this second option to increase
+ * the recoverability of NMIs.
+ */
+void hv_nmi_check_nonrecoverable(struct pt_regs *regs)
+{
+#ifdef CONFIG_POWERNV
+ unsigned long kbase = (unsigned long)_stext;
+ unsigned long nip = regs->nip;
+
+ if (!(regs->msr & MSR_RI))
+ return;
+ if (!(regs->msr & MSR_HV))
+ return;
+ if (regs->msr & MSR_PR)
+ return;
+again:
+ if (nip >= 0x500 && nip < 0x600)
+ goto nonrecoverable;
+ if (nip >= 0x980 && nip < 0xa00)
+ goto nonrecoverable;
+ if (nip >= 0xe00 && nip < 0xec0)
+ goto nonrecoverable;
+ if (nip >= 0xf80 && nip < 0xfa0)
+ goto nonrecoverable;
+ /* Trampolines are not relocated. */
+ if (nip >= real_trampolines_start - kbase &&
+ nip < real_trampolines_end - kbase)
+ goto nonrecoverable;
+ if (nip >= virt_trampolines_start - kbase &&
+ nip < virt_trampolines_end - kbase)
+ goto nonrecoverable;
+ if (nip >= 0xc000000000000000ULL) {
+ nip -= 0xc000000000000000ULL;
+ goto again;
+ }
+ return;
+
+nonrecoverable:
+ regs->msr &= ~MSR_RI;
+#endif
+}
+
void system_reset_exception(struct pt_regs *regs)
{
/*
@@ -386,6 +446,8 @@ void system_reset_exception(struct pt_regs *regs)
if (!nested)
nmi_enter();
+ hv_nmi_check_nonrecoverable(regs);
+
__this_cpu_inc(irq_stat.sreset_irqs);
/* See if any machine dependent calls */
--
2.18.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/4] powerpc/64s: system reset interrupt preserve HSRRs
2019-01-22 6:46 [PATCH 0/4] Fixes for 3 separate NMI reentrancy bugs Nicholas Piggin
2019-01-22 6:46 ` [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test Nicholas Piggin
@ 2019-01-22 6:46 ` Nicholas Piggin
2019-01-22 6:46 ` [PATCH 3/4] powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy Nicholas Piggin
2019-01-22 6:46 ` [PATCH 4/4] powerpc/64s: Fix " Nicholas Piggin
3 siblings, 0 replies; 8+ messages in thread
From: Nicholas Piggin @ 2019-01-22 6:46 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
The system reset interrupt may use HSRR registers (e.g., to call in to
OPAL), but code that uses HSRR registers is not required to clear
MSR[RI] by convention.
Rather than introduce that requirement, have system reset interrupt
save HSRRs before they might be used.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/traps.c | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index b429b2264a1f..d54459152a2b 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -438,14 +438,32 @@ void hv_nmi_check_nonrecoverable(struct pt_regs *regs)
void system_reset_exception(struct pt_regs *regs)
{
+ unsigned long hsrr0, hsrr1;
+ bool hsrrs_saved = false;
+ bool nested = in_nmi();
+
/*
* Avoid crashes in case of nested NMI exceptions. Recoverability
* is determined by RI and in_nmi
*/
- bool nested = in_nmi();
if (!nested)
nmi_enter();
+ /*
+ * System reset can interrupt a region where HSRRs are live and
+ * MSR[RI]=1, and it may clobber HSRRs itself (e.g., to call OPAL),
+ * so save them before doing anything.
+ *
+ * Machine checks should be okay to avoid this, as the real mode
+ * handler is careful to avoid HSRRs, and the virt code is not
+ * delivered as an NMI.
+ */
+ if (cpu_has_feature(CPU_FTR_HVMODE)) {
+ hsrrs_saved = true;
+ hsrr0 = mfspr(SPRN_HSRR0);
+ hsrr1 = mfspr(SPRN_HSRR1);
+ }
+
hv_nmi_check_nonrecoverable(regs);
__this_cpu_inc(irq_stat.sreset_irqs);
@@ -495,6 +513,11 @@ void system_reset_exception(struct pt_regs *regs)
if (!(regs->msr & MSR_RI))
nmi_panic(regs, "Unrecoverable System Reset");
+ if (hsrrs_saved) {
+ mtspr(SPRN_HSRR0, hsrr0);
+ mtspr(SPRN_HSRR1, hsrr1);
+ }
+
if (!nested)
nmi_exit();
--
2.18.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 3/4] powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy
2019-01-22 6:46 [PATCH 0/4] Fixes for 3 separate NMI reentrancy bugs Nicholas Piggin
2019-01-22 6:46 ` [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test Nicholas Piggin
2019-01-22 6:46 ` [PATCH 2/4] powerpc/64s: system reset interrupt preserve HSRRs Nicholas Piggin
@ 2019-01-22 6:46 ` Nicholas Piggin
2019-01-22 6:46 ` [PATCH 4/4] powerpc/64s: Fix " Nicholas Piggin
3 siblings, 0 replies; 8+ messages in thread
From: Nicholas Piggin @ 2019-01-22 6:46 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
The subsequent patch needs to change data interrupts (those that set
DAR / DSISR) in order to cope with a recoverability bug with MCE
interrupts.
This patch prepares for the fix by removing a couple of layers of
macro indirection from the interrupt handlers, and also makes the
0x300 interrupt out of line (as required by the next patch).
Functionality should be unchanged, and code unchanged except for the
OOL change.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/exceptions-64s.S | 45 ++++++++++++++++++++++++----
1 file changed, 39 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 9e253ce27e08..0d91d1c8aad1 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -566,8 +566,23 @@ EXC_COMMON_BEGIN(mce_return)
RFI_TO_KERNEL
b .
-EXC_REAL(data_access, 0x300, 0x80)
-EXC_VIRT(data_access, 0x4300, 0x80, 0x300)
+EXC_REAL_BEGIN(data_access, 0x300, 0x80)
+SET_SCRATCH0(r13) /* save r13 */
+EXCEPTION_PROLOG_0(PACA_EXGEN)
+ b tramp_real_data_access
+EXC_REAL_END(data_access, 0x300, 0x80)
+
+TRAMP_REAL_BEGIN(tramp_real_data_access)
+EXCEPTION_PROLOG_1(PACA_EXGEN, KVMTEST_PR, 0x300)
+EXCEPTION_PROLOG_2(data_access_common, EXC_STD)
+
+EXC_VIRT_BEGIN(data_access, 0x4300, 0x80)
+SET_SCRATCH0(r13) /* save r13 */
+EXCEPTION_PROLOG_0(PACA_EXGEN)
+EXCEPTION_PROLOG_1(PACA_EXGEN, NOTEST, 0x300)
+EXCEPTION_PROLOG_2_RELON(data_access_common, EXC_STD)
+EXC_VIRT_END(data_access, 0x4300, 0x80)
+
TRAMP_KVM_SKIP(PACA_EXGEN, 0x300)
EXC_COMMON_BEGIN(data_access_common)
@@ -596,11 +611,17 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
EXC_REAL_BEGIN(data_access_slb, 0x380, 0x80)
-EXCEPTION_PROLOG(PACA_EXSLB, data_access_slb_common, EXC_STD, KVMTEST_PR, 0x380);
+SET_SCRATCH0(r13) /* save r13 */
+EXCEPTION_PROLOG_0(PACA_EXSLB)
+EXCEPTION_PROLOG_1(PACA_EXSLB, KVMTEST_PR, 0x380)
+EXCEPTION_PROLOG_2(data_access_slb_common, EXC_STD)
EXC_REAL_END(data_access_slb, 0x380, 0x80)
EXC_VIRT_BEGIN(data_access_slb, 0x4380, 0x80)
-EXCEPTION_RELON_PROLOG(PACA_EXSLB, data_access_slb_common, EXC_STD, NOTEST, 0x380);
+SET_SCRATCH0(r13) /* save r13 */
+EXCEPTION_PROLOG_0(PACA_EXSLB)
+EXCEPTION_PROLOG_1(PACA_EXSLB, NOTEST, 0x380)
+EXCEPTION_PROLOG_2_RELON(data_access_slb_common, EXC_STD)
EXC_VIRT_END(data_access_slb, 0x4380, 0x80)
TRAMP_KVM_SKIP(PACA_EXSLB, 0x380)
@@ -703,8 +724,20 @@ TRAMP_KVM_HV(PACA_EXGEN, 0x500)
EXC_COMMON_ASYNC(hardware_interrupt_common, 0x500, do_IRQ)
-EXC_REAL(alignment, 0x600, 0x100)
-EXC_VIRT(alignment, 0x4600, 0x100, 0x600)
+EXC_REAL_BEGIN(alignment, 0x600, 0x100)
+SET_SCRATCH0(r13) /* save r13 */
+EXCEPTION_PROLOG_0(PACA_EXGEN)
+EXCEPTION_PROLOG_1(PACA_EXGEN, KVMTEST_PR, 0x600)
+EXCEPTION_PROLOG_2(alignment_common, EXC_STD)
+EXC_REAL_END(alignment, 0x600, 0x100)
+
+EXC_VIRT_BEGIN(alignment, 0x4600, 0x100)
+SET_SCRATCH0(r13) /* save r13 */
+EXCEPTION_PROLOG_0(PACA_EXGEN)
+EXCEPTION_PROLOG_1(PACA_EXGEN, NOTEST, 0x600)
+EXCEPTION_PROLOG_2_RELON(alignment_common, EXC_STD)
+EXC_VIRT_END(alignment, 0x4600, 0x100)
+
TRAMP_KVM(PACA_EXGEN, 0x600)
EXC_COMMON_BEGIN(alignment_common)
mfspr r10,SPRN_DAR
--
2.18.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 4/4] powerpc/64s: Fix data interrupts vs d-side MCE reentrancy
2019-01-22 6:46 [PATCH 0/4] Fixes for 3 separate NMI reentrancy bugs Nicholas Piggin
` (2 preceding siblings ...)
2019-01-22 6:46 ` [PATCH 3/4] powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy Nicholas Piggin
@ 2019-01-22 6:46 ` Nicholas Piggin
2019-01-22 10:30 ` kbuild test robot
3 siblings, 1 reply; 8+ messages in thread
From: Nicholas Piggin @ 2019-01-22 6:46 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Nicholas Piggin
Handlers for interrupts that set DAR / DSISR, set MSR[RI] before those
SPRs are read and saved away.
If a d-side machine check hits in this window, DAR / DSISR will be
clobbered silently, leading to random behaviour.
Fix this by saving those registers before MSR[RI] is set.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
arch/powerpc/kernel/exceptions-64s.S | 31 +++++++++++++++++++---------
1 file changed, 21 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 0d91d1c8aad1..8008ea6730ba 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -574,12 +574,20 @@ EXC_REAL_END(data_access, 0x300, 0x80)
TRAMP_REAL_BEGIN(tramp_real_data_access)
EXCEPTION_PROLOG_1(PACA_EXGEN, KVMTEST_PR, 0x300)
+ mfspr r10,SPRN_DAR
+ mfspr r11,SPRN_DSISR
+ std r10,PACA_EXGEN+EX_DAR(r13)
+ stw r11,PACA_EXGEN+EX_DSISR(r13)
EXCEPTION_PROLOG_2(data_access_common, EXC_STD)
EXC_VIRT_BEGIN(data_access, 0x4300, 0x80)
SET_SCRATCH0(r13) /* save r13 */
EXCEPTION_PROLOG_0(PACA_EXGEN)
EXCEPTION_PROLOG_1(PACA_EXGEN, NOTEST, 0x300)
+ mfspr r10,SPRN_DAR
+ mfspr r11,SPRN_DSISR
+ std r10,PACA_EXGEN+EX_DAR(r13)
+ stw r11,PACA_EXGEN+EX_DSISR(r13)
EXCEPTION_PROLOG_2_RELON(data_access_common, EXC_STD)
EXC_VIRT_END(data_access, 0x4300, 0x80)
@@ -590,11 +598,8 @@ EXC_COMMON_BEGIN(data_access_common)
* Here r13 points to the paca, r9 contains the saved CR,
* SRR0 and SRR1 are saved in r11 and r12,
* r9 - r13 are saved in paca->exgen.
+ * EX_DAR and EX_DSISR have saved DAR/DSISR
*/
- mfspr r10,SPRN_DAR
- std r10,PACA_EXGEN+EX_DAR(r13)
- mfspr r10,SPRN_DSISR
- stw r10,PACA_EXGEN+EX_DSISR(r13)
EXCEPTION_PROLOG_COMMON(0x300, PACA_EXGEN)
RECONCILE_IRQ_STATE(r10, r11)
ld r12,_MSR(r1)
@@ -614,6 +619,8 @@ EXC_REAL_BEGIN(data_access_slb, 0x380, 0x80)
SET_SCRATCH0(r13) /* save r13 */
EXCEPTION_PROLOG_0(PACA_EXSLB)
EXCEPTION_PROLOG_1(PACA_EXSLB, KVMTEST_PR, 0x380)
+ mfspr r10,SPRN_DAR
+ std r10,PACA_EXGEN+EX_DAR(r13)
EXCEPTION_PROLOG_2(data_access_slb_common, EXC_STD)
EXC_REAL_END(data_access_slb, 0x380, 0x80)
@@ -621,14 +628,14 @@ EXC_VIRT_BEGIN(data_access_slb, 0x4380, 0x80)
SET_SCRATCH0(r13) /* save r13 */
EXCEPTION_PROLOG_0(PACA_EXSLB)
EXCEPTION_PROLOG_1(PACA_EXSLB, NOTEST, 0x380)
+ mfspr r10,SPRN_DAR
+ std r10,PACA_EXGEN+EX_DAR(r13)
EXCEPTION_PROLOG_2_RELON(data_access_slb_common, EXC_STD)
EXC_VIRT_END(data_access_slb, 0x4380, 0x80)
TRAMP_KVM_SKIP(PACA_EXSLB, 0x380)
EXC_COMMON_BEGIN(data_access_slb_common)
- mfspr r10,SPRN_DAR
- std r10,PACA_EXSLB+EX_DAR(r13)
EXCEPTION_PROLOG_COMMON(0x380, PACA_EXSLB)
ld r4,PACA_EXSLB+EX_DAR(r13)
std r4,_DAR(r1)
@@ -728,6 +735,10 @@ EXC_REAL_BEGIN(alignment, 0x600, 0x100)
SET_SCRATCH0(r13) /* save r13 */
EXCEPTION_PROLOG_0(PACA_EXGEN)
EXCEPTION_PROLOG_1(PACA_EXGEN, KVMTEST_PR, 0x600)
+ mfspr r10,SPRN_DAR
+ mfspr r11,SPRN_DSISR
+ std r10,PACA_EXGEN+EX_DAR(r13)
+ stw r11,PACA_EXGEN+EX_DSISR(r13)
EXCEPTION_PROLOG_2(alignment_common, EXC_STD)
EXC_REAL_END(alignment, 0x600, 0x100)
@@ -735,15 +746,15 @@ EXC_VIRT_BEGIN(alignment, 0x4600, 0x100)
SET_SCRATCH0(r13) /* save r13 */
EXCEPTION_PROLOG_0(PACA_EXGEN)
EXCEPTION_PROLOG_1(PACA_EXGEN, NOTEST, 0x600)
+ mfspr r10,SPRN_DAR
+ mfspr r11,SPRN_DSISR
+ std r10,PACA_EXGEN+EX_DAR(r13)
+ stw r11,PACA_EXGEN+EX_DSISR(r13)
EXCEPTION_PROLOG_2_RELON(alignment_common, EXC_STD)
EXC_VIRT_END(alignment, 0x4600, 0x100)
TRAMP_KVM(PACA_EXGEN, 0x600)
EXC_COMMON_BEGIN(alignment_common)
- mfspr r10,SPRN_DAR
- std r10,PACA_EXGEN+EX_DAR(r13)
- mfspr r10,SPRN_DSISR
- stw r10,PACA_EXGEN+EX_DSISR(r13)
EXCEPTION_PROLOG_COMMON(0x600, PACA_EXGEN)
ld r3,PACA_EXGEN+EX_DAR(r13)
lwz r4,PACA_EXGEN+EX_DSISR(r13)
--
2.18.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 4/4] powerpc/64s: Fix data interrupts vs d-side MCE reentrancy
2019-01-22 6:46 ` [PATCH 4/4] powerpc/64s: Fix " Nicholas Piggin
@ 2019-01-22 10:30 ` kbuild test robot
0 siblings, 0 replies; 8+ messages in thread
From: kbuild test robot @ 2019-01-22 10:30 UTC (permalink / raw)
To: Nicholas Piggin; +Cc: linuxppc-dev, kbuild-all, Nicholas Piggin
[-- Attachment #1: Type: text/plain, Size: 2435 bytes --]
Hi Nicholas,
I love your patch! Yet something to improve:
[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.0-rc2 next-20190116]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Nicholas-Piggin/Fixes-for-3-separate-NMI-reentrancy-bugs/20190122-154328
base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 8.2.0-11) 8.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=8.2.0 make.cross ARCH=powerpc
All errors (new ones prefixed by >>):
arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
>> arch/powerpc/kernel/exceptions-64s.S:625: Error: Fixed entry overflow
arch/powerpc/kernel/exceptions-64s.S:625: Fatal error: .abort detected. Abandoning ship.
vim +625 arch/powerpc/kernel/exceptions-64s.S
80795e6cb Nicholas Piggin 2016-09-21 616
0ebc4cdaa Benjamin Herrenschmidt 2009-06-02 617
1a6822d19 Nicholas Piggin 2016-12-06 618 EXC_REAL_BEGIN(data_access_slb, 0x380, 0x80)
7cfab0bae Nicholas Piggin 2019-01-22 619 SET_SCRATCH0(r13) /* save r13 */
7cfab0bae Nicholas Piggin 2019-01-22 620 EXCEPTION_PROLOG_0(PACA_EXSLB)
7cfab0bae Nicholas Piggin 2019-01-22 621 EXCEPTION_PROLOG_1(PACA_EXSLB, KVMTEST_PR, 0x380)
00038b16f Nicholas Piggin 2019-01-22 622 mfspr r10,SPRN_DAR
00038b16f Nicholas Piggin 2019-01-22 623 std r10,PACA_EXGEN+EX_DAR(r13)
7cfab0bae Nicholas Piggin 2019-01-22 624 EXCEPTION_PROLOG_2(data_access_slb_common, EXC_STD)
1a6822d19 Nicholas Piggin 2016-12-06 @625 EXC_REAL_END(data_access_slb, 0x380, 0x80)
0ebc4cdaa Benjamin Herrenschmidt 2009-06-02 626
:::::: The code at line 625 was first introduced by commit
:::::: 1a6822d194c3f627eeb6aaca6688a5d0a444663e powerpc/64s: Use (start, size) rather than (start, end) for exception handlers
:::::: TO: Nicholas Piggin <npiggin@gmail.com>
:::::: CC: Michael Ellerman <mpe@ellerman.id.au>
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24049 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
2019-01-22 6:46 ` [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test Nicholas Piggin
@ 2019-02-20 14:31 ` Nicholas Piggin
2019-02-21 10:05 ` Michael Ellerman
0 siblings, 1 reply; 8+ messages in thread
From: Nicholas Piggin @ 2019-02-20 14:31 UTC (permalink / raw)
To: linuxppc-dev
Nicholas Piggin's on January 22, 2019 4:46 pm:
> HV interrupts that use HSRR registers do not clear MSR[RI], but
> NMI entry code is not recoverable early on due to both using HSPRG
> for a scratch register.
>
> This bug means that a system reset or machine check can cause silent
> data corruption (due to loss of r13 register) if it hits in a small
> window when taking an HV interrupt.
>
> Fix this by marking NMIs non-recoverable if they land in HV interrupt
> ranges.
Hum, I had a v2 that I didn't send properly with a small compile fix,
but I've also just noticed this:
> +void hv_nmi_check_nonrecoverable(struct pt_regs *regs)
> +{
> +#ifdef CONFIG_POWERNV
> + unsigned long kbase = (unsigned long)_stext;
> + unsigned long nip = regs->nip;
> +
> + if (!(regs->msr & MSR_RI))
> + return;
> + if (!(regs->msr & MSR_HV))
> + return;
> + if (regs->msr & MSR_PR)
> + return;
> +again:
> + if (nip >= 0x500 && nip < 0x600)
> + goto nonrecoverable;
> + if (nip >= 0x980 && nip < 0xa00)
> + goto nonrecoverable;
> + if (nip >= 0xe00 && nip < 0xec0)
> + goto nonrecoverable;
> + if (nip >= 0xf80 && nip < 0xfa0)
> + goto nonrecoverable;
> + /* Trampolines are not relocated. */
> + if (nip >= real_trampolines_start - kbase &&
> + nip < real_trampolines_end - kbase)
> + goto nonrecoverable;
> + if (nip >= virt_trampolines_start - kbase &&
> + nip < virt_trampolines_end - kbase)
> + goto nonrecoverable;
> + if (nip >= 0xc000000000000000ULL) {
> + nip -= 0xc000000000000000ULL;
> + goto again;
Tried to be a bit too clever here. The 0xc... vectors also have a
+0x4000 offset so this won't catch them properly. I'll respin.
Thanks,
Nick
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
2019-02-20 14:31 ` Nicholas Piggin
@ 2019-02-21 10:05 ` Michael Ellerman
0 siblings, 0 replies; 8+ messages in thread
From: Michael Ellerman @ 2019-02-21 10:05 UTC (permalink / raw)
To: Nicholas Piggin, linuxppc-dev
Nicholas Piggin <npiggin@gmail.com> writes:
> Nicholas Piggin's on January 22, 2019 4:46 pm:
>> HV interrupts that use HSRR registers do not clear MSR[RI], but
>> NMI entry code is not recoverable early on due to both using HSPRG
>> for a scratch register.
>>
>> This bug means that a system reset or machine check can cause silent
>> data corruption (due to loss of r13 register) if it hits in a small
>> window when taking an HV interrupt.
>>
>> Fix this by marking NMIs non-recoverable if they land in HV interrupt
>> ranges.
>
> Hum, I had a v2 that I didn't send properly with a small compile fix,
> but I've also just noticed this:
>
>> +void hv_nmi_check_nonrecoverable(struct pt_regs *regs)
>> +{
>> +#ifdef CONFIG_POWERNV
>> + unsigned long kbase = (unsigned long)_stext;
>> + unsigned long nip = regs->nip;
>> +
>> + if (!(regs->msr & MSR_RI))
>> + return;
>> + if (!(regs->msr & MSR_HV))
>> + return;
>> + if (regs->msr & MSR_PR)
>> + return;
>> +again:
>> + if (nip >= 0x500 && nip < 0x600)
>> + goto nonrecoverable;
>> + if (nip >= 0x980 && nip < 0xa00)
>> + goto nonrecoverable;
>> + if (nip >= 0xe00 && nip < 0xec0)
>> + goto nonrecoverable;
>> + if (nip >= 0xf80 && nip < 0xfa0)
>> + goto nonrecoverable;
>> + /* Trampolines are not relocated. */
>> + if (nip >= real_trampolines_start - kbase &&
>> + nip < real_trampolines_end - kbase)
>> + goto nonrecoverable;
>> + if (nip >= virt_trampolines_start - kbase &&
>> + nip < virt_trampolines_end - kbase)
>> + goto nonrecoverable;
>> + if (nip >= 0xc000000000000000ULL) {
>> + nip -= 0xc000000000000000ULL;
>> + goto again;
>
> Tried to be a bit too clever here. The 0xc... vectors also have a
> +0x4000 offset so this won't catch them properly. I'll respin.
Thanks.
I'm seeing the same build error as the kbuild robot too.
cheers
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2019-02-21 10:07 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-22 6:46 [PATCH 0/4] Fixes for 3 separate NMI reentrancy bugs Nicholas Piggin
2019-01-22 6:46 ` [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test Nicholas Piggin
2019-02-20 14:31 ` Nicholas Piggin
2019-02-21 10:05 ` Michael Ellerman
2019-01-22 6:46 ` [PATCH 2/4] powerpc/64s: system reset interrupt preserve HSRRs Nicholas Piggin
2019-01-22 6:46 ` [PATCH 3/4] powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy Nicholas Piggin
2019-01-22 6:46 ` [PATCH 4/4] powerpc/64s: Fix " Nicholas Piggin
2019-01-22 10:30 ` kbuild test robot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).