linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] Fixes for 3 separate NMI reentrancy bugs
@ 2019-01-22  6:46 Nicholas Piggin
  2019-01-22  6:46 ` [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test Nicholas Piggin
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Nicholas Piggin @ 2019-01-22  6:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

This series fixes several similar but unrelated bugs with NMIs
clobbering live registers and getting away with it because
MSR[RI] is set. Pretty rare bugs, but serious silent corruption
consequences.

For the most part these can be observed and tested quite easily
with the mambo simulator except that it does not seem to follow
the architecture wrt leaving MSR[RI] unchanged for HV interrupts.
Manually fixing that up in the sim environment can trigger that
case.

Thanks,
Nick

Nicholas Piggin (4):
  powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
  powerpc/64s: system reset interrupt preserve HSRRs
  powerpc/64s: Prepare to handle data interrupts vs d-side MCE
    reentrancy
  powerpc/64s: Fix data interrupts vs d-side MCE reentrancy

 arch/powerpc/include/asm/nmi.h       |  2 +
 arch/powerpc/kernel/exceptions-64s.S | 76 +++++++++++++++++++-----
 arch/powerpc/kernel/mce.c            |  3 +
 arch/powerpc/kernel/traps.c          | 87 +++++++++++++++++++++++++++-
 4 files changed, 151 insertions(+), 17 deletions(-)

-- 
2.18.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
  2019-01-22  6:46 [PATCH 0/4] Fixes for 3 separate NMI reentrancy bugs Nicholas Piggin
@ 2019-01-22  6:46 ` Nicholas Piggin
  2019-02-20 14:31   ` Nicholas Piggin
  2019-01-22  6:46 ` [PATCH 2/4] powerpc/64s: system reset interrupt preserve HSRRs Nicholas Piggin
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Nicholas Piggin @ 2019-01-22  6:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

HV interrupts that use HSRR registers do not clear MSR[RI], but
NMI entry code is not recoverable early on due to both using HSPRG
for a scratch register.

This bug means that a system reset or machine check can cause silent
data corruption (due to loss of r13 register) if it hits in a small
window when taking an HV interrupt.

Fix this by marking NMIs non-recoverable if they land in HV interrupt
ranges.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/nmi.h |  2 ++
 arch/powerpc/kernel/mce.c      |  3 ++
 arch/powerpc/kernel/traps.c    | 62 ++++++++++++++++++++++++++++++++++
 3 files changed, 67 insertions(+)

diff --git a/arch/powerpc/include/asm/nmi.h b/arch/powerpc/include/asm/nmi.h
index bd9ba8defd72..84b4cfe73edd 100644
--- a/arch/powerpc/include/asm/nmi.h
+++ b/arch/powerpc/include/asm/nmi.h
@@ -14,4 +14,6 @@ extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
 #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
 #endif
 
+extern void hv_nmi_check_nonrecoverable(struct pt_regs *regs);
+
 #endif /* _ASM_NMI_H */
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index bd933a75f0bc..d653b5de4537 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -31,6 +31,7 @@
 
 #include <asm/machdep.h>
 #include <asm/mce.h>
+#include <asm/nmi.h>
 
 static DEFINE_PER_CPU(int, mce_nest_count);
 static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event);
@@ -488,6 +489,8 @@ long machine_check_early(struct pt_regs *regs)
 {
 	long handled = 0;
 
+	hv_nmi_check_nonrecoverable(regs);
+
 	/*
 	 * See if platform is capable of handling machine check.
 	 */
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 64936b60d521..b429b2264a1f 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -376,6 +376,66 @@ void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
 	force_sig_fault(signr, code, (void __user *)addr, current);
 }
 
+/*
+ * The interrupt architecture has a quirk in that the HV interrupts excluding
+ * the NMIs (0x100 and 0x200) do not clear MSR[RI] at entry. The first thing
+ * that an interrupt handler must do is save off a GPR into a scratch register,
+ * and all interrupts on POWERNV (HV=1) use the same HSPRG register as scratch.
+ * Therefore an NMI can clobber an HV interrupt's live HSPRG without noticing
+ * that it is non-reentrant, which leads to random data corruption.
+ *
+ * The solution is for NMI interrupts in HV mode to check if they originated
+ * from these critical HV interrupt regions. If so, then mark them not
+ * recoverable.
+ *
+ * An alternative would be for HV NMIs to use SPRG for scratch to avoid the
+ * HSPRG clobber, however this would cause guest SPRG to be clobbered. Linux
+ * guests should always have MSR[RI]=0 when its scratch SPRG is in use, so
+ * that would work. However any other guest OS that may have the SPRG live
+ * and MSR[RI]=1 could encounter silent corruption.
+ *
+ * Builds that do not support KVM could take this second option to increase
+ * the recoverability of NMIs.
+ */
+void hv_nmi_check_nonrecoverable(struct pt_regs *regs)
+{
+#ifdef CONFIG_POWERNV
+	unsigned long kbase = (unsigned long)_stext;
+	unsigned long nip = regs->nip;
+
+	if (!(regs->msr & MSR_RI))
+		return;
+	if (!(regs->msr & MSR_HV))
+		return;
+	if (regs->msr & MSR_PR)
+		return;
+again:
+	if (nip >= 0x500 && nip < 0x600)
+		goto nonrecoverable;
+	if (nip >= 0x980 && nip < 0xa00)
+		goto nonrecoverable;
+	if (nip >= 0xe00 && nip < 0xec0)
+		goto nonrecoverable;
+	if (nip >= 0xf80 && nip < 0xfa0)
+		goto nonrecoverable;
+	/* Trampolines are not relocated. */
+	if (nip >= real_trampolines_start - kbase &&
+			nip < real_trampolines_end - kbase)
+		goto nonrecoverable;
+	if (nip >= virt_trampolines_start - kbase &&
+			nip < virt_trampolines_end - kbase)
+		goto nonrecoverable;
+	if (nip >= 0xc000000000000000ULL) {
+		nip -= 0xc000000000000000ULL;
+		goto again;
+	}
+	return;
+
+nonrecoverable:
+	regs->msr &= ~MSR_RI;
+#endif
+}
+
 void system_reset_exception(struct pt_regs *regs)
 {
 	/*
@@ -386,6 +446,8 @@ void system_reset_exception(struct pt_regs *regs)
 	if (!nested)
 		nmi_enter();
 
+	hv_nmi_check_nonrecoverable(regs);
+
 	__this_cpu_inc(irq_stat.sreset_irqs);
 
 	/* See if any machine dependent calls */
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/4] powerpc/64s: system reset interrupt preserve HSRRs
  2019-01-22  6:46 [PATCH 0/4] Fixes for 3 separate NMI reentrancy bugs Nicholas Piggin
  2019-01-22  6:46 ` [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test Nicholas Piggin
@ 2019-01-22  6:46 ` Nicholas Piggin
  2019-01-22  6:46 ` [PATCH 3/4] powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy Nicholas Piggin
  2019-01-22  6:46 ` [PATCH 4/4] powerpc/64s: Fix " Nicholas Piggin
  3 siblings, 0 replies; 8+ messages in thread
From: Nicholas Piggin @ 2019-01-22  6:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

The system reset interrupt may use HSRR registers (e.g., to call in to
OPAL), but code that uses HSRR registers is not required to clear
MSR[RI] by convention.

Rather than introduce that requirement, have system reset interrupt
save HSRRs before they might be used.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/traps.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index b429b2264a1f..d54459152a2b 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -438,14 +438,32 @@ void hv_nmi_check_nonrecoverable(struct pt_regs *regs)
 
 void system_reset_exception(struct pt_regs *regs)
 {
+	unsigned long hsrr0, hsrr1;
+	bool hsrrs_saved = false;
+	bool nested = in_nmi();
+
 	/*
 	 * Avoid crashes in case of nested NMI exceptions. Recoverability
 	 * is determined by RI and in_nmi
 	 */
-	bool nested = in_nmi();
 	if (!nested)
 		nmi_enter();
 
+	/*
+	 * System reset can interrupt a region where HSRRs are live and
+	 * MSR[RI]=1, and it may clobber HSRRs itself (e.g., to call OPAL),
+	 * so save them before doing anything.
+	 *
+	 * Machine checks should be okay to avoid this, as the real mode
+	 * handler is careful to avoid HSRRs, and the virt code is not
+	 * delivered as an NMI.
+	 */
+	if (cpu_has_feature(CPU_FTR_HVMODE)) {
+		hsrrs_saved = true;
+		hsrr0 = mfspr(SPRN_HSRR0);
+		hsrr1 = mfspr(SPRN_HSRR1);
+	}
+
 	hv_nmi_check_nonrecoverable(regs);
 
 	__this_cpu_inc(irq_stat.sreset_irqs);
@@ -495,6 +513,11 @@ void system_reset_exception(struct pt_regs *regs)
 	if (!(regs->msr & MSR_RI))
 		nmi_panic(regs, "Unrecoverable System Reset");
 
+	if (hsrrs_saved) {
+		mtspr(SPRN_HSRR0, hsrr0);
+		mtspr(SPRN_HSRR1, hsrr1);
+	}
+
 	if (!nested)
 		nmi_exit();
 
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/4] powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy
  2019-01-22  6:46 [PATCH 0/4] Fixes for 3 separate NMI reentrancy bugs Nicholas Piggin
  2019-01-22  6:46 ` [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test Nicholas Piggin
  2019-01-22  6:46 ` [PATCH 2/4] powerpc/64s: system reset interrupt preserve HSRRs Nicholas Piggin
@ 2019-01-22  6:46 ` Nicholas Piggin
  2019-01-22  6:46 ` [PATCH 4/4] powerpc/64s: Fix " Nicholas Piggin
  3 siblings, 0 replies; 8+ messages in thread
From: Nicholas Piggin @ 2019-01-22  6:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

The subsequent patch needs to change data interrupts (those that set
DAR / DSISR) in order to cope with a recoverability bug with MCE
interrupts.

This patch prepares for the fix by removing a couple of layers of
macro indirection from the interrupt handlers, and also makes the
0x300 interrupt out of line (as required by the next patch).

Functionality should be unchanged, and code unchanged except for the
OOL change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/exceptions-64s.S | 45 ++++++++++++++++++++++++----
 1 file changed, 39 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 9e253ce27e08..0d91d1c8aad1 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -566,8 +566,23 @@ EXC_COMMON_BEGIN(mce_return)
 	RFI_TO_KERNEL
 	b	.
 
-EXC_REAL(data_access, 0x300, 0x80)
-EXC_VIRT(data_access, 0x4300, 0x80, 0x300)
+EXC_REAL_BEGIN(data_access, 0x300, 0x80)
+SET_SCRATCH0(r13)		/* save r13 */
+EXCEPTION_PROLOG_0(PACA_EXGEN)
+	b	tramp_real_data_access
+EXC_REAL_END(data_access, 0x300, 0x80)
+
+TRAMP_REAL_BEGIN(tramp_real_data_access)
+EXCEPTION_PROLOG_1(PACA_EXGEN, KVMTEST_PR, 0x300)
+EXCEPTION_PROLOG_2(data_access_common, EXC_STD)
+
+EXC_VIRT_BEGIN(data_access, 0x4300, 0x80)
+SET_SCRATCH0(r13)		/* save r13 */
+EXCEPTION_PROLOG_0(PACA_EXGEN)
+EXCEPTION_PROLOG_1(PACA_EXGEN, NOTEST, 0x300)
+EXCEPTION_PROLOG_2_RELON(data_access_common, EXC_STD)
+EXC_VIRT_END(data_access, 0x4300, 0x80)
+
 TRAMP_KVM_SKIP(PACA_EXGEN, 0x300)
 
 EXC_COMMON_BEGIN(data_access_common)
@@ -596,11 +611,17 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 
 
 EXC_REAL_BEGIN(data_access_slb, 0x380, 0x80)
-EXCEPTION_PROLOG(PACA_EXSLB, data_access_slb_common, EXC_STD, KVMTEST_PR, 0x380);
+SET_SCRATCH0(r13)		/* save r13 */
+EXCEPTION_PROLOG_0(PACA_EXSLB)
+EXCEPTION_PROLOG_1(PACA_EXSLB, KVMTEST_PR, 0x380)
+EXCEPTION_PROLOG_2(data_access_slb_common, EXC_STD)
 EXC_REAL_END(data_access_slb, 0x380, 0x80)
 
 EXC_VIRT_BEGIN(data_access_slb, 0x4380, 0x80)
-EXCEPTION_RELON_PROLOG(PACA_EXSLB, data_access_slb_common, EXC_STD, NOTEST, 0x380);
+SET_SCRATCH0(r13)		/* save r13 */
+EXCEPTION_PROLOG_0(PACA_EXSLB)
+EXCEPTION_PROLOG_1(PACA_EXSLB, NOTEST, 0x380)
+EXCEPTION_PROLOG_2_RELON(data_access_slb_common, EXC_STD)
 EXC_VIRT_END(data_access_slb, 0x4380, 0x80)
 
 TRAMP_KVM_SKIP(PACA_EXSLB, 0x380)
@@ -703,8 +724,20 @@ TRAMP_KVM_HV(PACA_EXGEN, 0x500)
 EXC_COMMON_ASYNC(hardware_interrupt_common, 0x500, do_IRQ)
 
 
-EXC_REAL(alignment, 0x600, 0x100)
-EXC_VIRT(alignment, 0x4600, 0x100, 0x600)
+EXC_REAL_BEGIN(alignment, 0x600, 0x100)
+SET_SCRATCH0(r13)		/* save r13 */
+EXCEPTION_PROLOG_0(PACA_EXGEN)
+EXCEPTION_PROLOG_1(PACA_EXGEN, KVMTEST_PR, 0x600)
+EXCEPTION_PROLOG_2(alignment_common, EXC_STD)
+EXC_REAL_END(alignment, 0x600, 0x100)
+
+EXC_VIRT_BEGIN(alignment, 0x4600, 0x100)
+SET_SCRATCH0(r13)		/* save r13 */
+EXCEPTION_PROLOG_0(PACA_EXGEN)
+EXCEPTION_PROLOG_1(PACA_EXGEN, NOTEST, 0x600)
+EXCEPTION_PROLOG_2_RELON(alignment_common, EXC_STD)
+EXC_VIRT_END(alignment, 0x4600, 0x100)
+
 TRAMP_KVM(PACA_EXGEN, 0x600)
 EXC_COMMON_BEGIN(alignment_common)
 	mfspr	r10,SPRN_DAR
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 4/4] powerpc/64s: Fix data interrupts vs d-side MCE reentrancy
  2019-01-22  6:46 [PATCH 0/4] Fixes for 3 separate NMI reentrancy bugs Nicholas Piggin
                   ` (2 preceding siblings ...)
  2019-01-22  6:46 ` [PATCH 3/4] powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy Nicholas Piggin
@ 2019-01-22  6:46 ` Nicholas Piggin
  2019-01-22 10:30   ` kbuild test robot
  3 siblings, 1 reply; 8+ messages in thread
From: Nicholas Piggin @ 2019-01-22  6:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Handlers for interrupts that set DAR / DSISR, set MSR[RI] before those
SPRs are read and saved away.

If a d-side machine check hits in this window, DAR / DSISR will be
clobbered silently, leading to random behaviour.

Fix this by saving those registers before MSR[RI] is set.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/exceptions-64s.S | 31 +++++++++++++++++++---------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 0d91d1c8aad1..8008ea6730ba 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -574,12 +574,20 @@ EXC_REAL_END(data_access, 0x300, 0x80)
 
 TRAMP_REAL_BEGIN(tramp_real_data_access)
 EXCEPTION_PROLOG_1(PACA_EXGEN, KVMTEST_PR, 0x300)
+	mfspr	r10,SPRN_DAR
+	mfspr	r11,SPRN_DSISR
+	std	r10,PACA_EXGEN+EX_DAR(r13)
+	stw	r11,PACA_EXGEN+EX_DSISR(r13)
 EXCEPTION_PROLOG_2(data_access_common, EXC_STD)
 
 EXC_VIRT_BEGIN(data_access, 0x4300, 0x80)
 SET_SCRATCH0(r13)		/* save r13 */
 EXCEPTION_PROLOG_0(PACA_EXGEN)
 EXCEPTION_PROLOG_1(PACA_EXGEN, NOTEST, 0x300)
+	mfspr	r10,SPRN_DAR
+	mfspr	r11,SPRN_DSISR
+	std	r10,PACA_EXGEN+EX_DAR(r13)
+	stw	r11,PACA_EXGEN+EX_DSISR(r13)
 EXCEPTION_PROLOG_2_RELON(data_access_common, EXC_STD)
 EXC_VIRT_END(data_access, 0x4300, 0x80)
 
@@ -590,11 +598,8 @@ EXC_COMMON_BEGIN(data_access_common)
 	 * Here r13 points to the paca, r9 contains the saved CR,
 	 * SRR0 and SRR1 are saved in r11 and r12,
 	 * r9 - r13 are saved in paca->exgen.
+	 * EX_DAR and EX_DSISR have saved DAR/DSISR
 	 */
-	mfspr	r10,SPRN_DAR
-	std	r10,PACA_EXGEN+EX_DAR(r13)
-	mfspr	r10,SPRN_DSISR
-	stw	r10,PACA_EXGEN+EX_DSISR(r13)
 	EXCEPTION_PROLOG_COMMON(0x300, PACA_EXGEN)
 	RECONCILE_IRQ_STATE(r10, r11)
 	ld	r12,_MSR(r1)
@@ -614,6 +619,8 @@ EXC_REAL_BEGIN(data_access_slb, 0x380, 0x80)
 SET_SCRATCH0(r13)		/* save r13 */
 EXCEPTION_PROLOG_0(PACA_EXSLB)
 EXCEPTION_PROLOG_1(PACA_EXSLB, KVMTEST_PR, 0x380)
+	mfspr	r10,SPRN_DAR
+	std	r10,PACA_EXGEN+EX_DAR(r13)
 EXCEPTION_PROLOG_2(data_access_slb_common, EXC_STD)
 EXC_REAL_END(data_access_slb, 0x380, 0x80)
 
@@ -621,14 +628,14 @@ EXC_VIRT_BEGIN(data_access_slb, 0x4380, 0x80)
 SET_SCRATCH0(r13)		/* save r13 */
 EXCEPTION_PROLOG_0(PACA_EXSLB)
 EXCEPTION_PROLOG_1(PACA_EXSLB, NOTEST, 0x380)
+	mfspr	r10,SPRN_DAR
+	std	r10,PACA_EXGEN+EX_DAR(r13)
 EXCEPTION_PROLOG_2_RELON(data_access_slb_common, EXC_STD)
 EXC_VIRT_END(data_access_slb, 0x4380, 0x80)
 
 TRAMP_KVM_SKIP(PACA_EXSLB, 0x380)
 
 EXC_COMMON_BEGIN(data_access_slb_common)
-	mfspr	r10,SPRN_DAR
-	std	r10,PACA_EXSLB+EX_DAR(r13)
 	EXCEPTION_PROLOG_COMMON(0x380, PACA_EXSLB)
 	ld	r4,PACA_EXSLB+EX_DAR(r13)
 	std	r4,_DAR(r1)
@@ -728,6 +735,10 @@ EXC_REAL_BEGIN(alignment, 0x600, 0x100)
 SET_SCRATCH0(r13)		/* save r13 */
 EXCEPTION_PROLOG_0(PACA_EXGEN)
 EXCEPTION_PROLOG_1(PACA_EXGEN, KVMTEST_PR, 0x600)
+	mfspr	r10,SPRN_DAR
+	mfspr	r11,SPRN_DSISR
+	std	r10,PACA_EXGEN+EX_DAR(r13)
+	stw	r11,PACA_EXGEN+EX_DSISR(r13)
 EXCEPTION_PROLOG_2(alignment_common, EXC_STD)
 EXC_REAL_END(alignment, 0x600, 0x100)
 
@@ -735,15 +746,15 @@ EXC_VIRT_BEGIN(alignment, 0x4600, 0x100)
 SET_SCRATCH0(r13)		/* save r13 */
 EXCEPTION_PROLOG_0(PACA_EXGEN)
 EXCEPTION_PROLOG_1(PACA_EXGEN, NOTEST, 0x600)
+	mfspr	r10,SPRN_DAR
+	mfspr	r11,SPRN_DSISR
+	std	r10,PACA_EXGEN+EX_DAR(r13)
+	stw	r11,PACA_EXGEN+EX_DSISR(r13)
 EXCEPTION_PROLOG_2_RELON(alignment_common, EXC_STD)
 EXC_VIRT_END(alignment, 0x4600, 0x100)
 
 TRAMP_KVM(PACA_EXGEN, 0x600)
 EXC_COMMON_BEGIN(alignment_common)
-	mfspr	r10,SPRN_DAR
-	std	r10,PACA_EXGEN+EX_DAR(r13)
-	mfspr	r10,SPRN_DSISR
-	stw	r10,PACA_EXGEN+EX_DSISR(r13)
 	EXCEPTION_PROLOG_COMMON(0x600, PACA_EXGEN)
 	ld	r3,PACA_EXGEN+EX_DAR(r13)
 	lwz	r4,PACA_EXGEN+EX_DSISR(r13)
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 4/4] powerpc/64s: Fix data interrupts vs d-side MCE reentrancy
  2019-01-22  6:46 ` [PATCH 4/4] powerpc/64s: Fix " Nicholas Piggin
@ 2019-01-22 10:30   ` kbuild test robot
  0 siblings, 0 replies; 8+ messages in thread
From: kbuild test robot @ 2019-01-22 10:30 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linuxppc-dev, kbuild-all, Nicholas Piggin

[-- Attachment #1: Type: text/plain, Size: 2435 bytes --]

Hi Nicholas,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.0-rc2 next-20190116]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Nicholas-Piggin/Fixes-for-3-separate-NMI-reentrancy-bugs/20190122-154328
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 8.2.0-11) 8.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=8.2.0 make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
>> arch/powerpc/kernel/exceptions-64s.S:625: Error: Fixed entry overflow
   arch/powerpc/kernel/exceptions-64s.S:625: Fatal error: .abort detected.  Abandoning ship.

vim +625 arch/powerpc/kernel/exceptions-64s.S

80795e6cb Nicholas Piggin        2016-09-21  616  
0ebc4cdaa Benjamin Herrenschmidt 2009-06-02  617  
1a6822d19 Nicholas Piggin        2016-12-06  618  EXC_REAL_BEGIN(data_access_slb, 0x380, 0x80)
7cfab0bae Nicholas Piggin        2019-01-22  619  SET_SCRATCH0(r13)		/* save r13 */
7cfab0bae Nicholas Piggin        2019-01-22  620  EXCEPTION_PROLOG_0(PACA_EXSLB)
7cfab0bae Nicholas Piggin        2019-01-22  621  EXCEPTION_PROLOG_1(PACA_EXSLB, KVMTEST_PR, 0x380)
00038b16f Nicholas Piggin        2019-01-22  622  	mfspr	r10,SPRN_DAR
00038b16f Nicholas Piggin        2019-01-22  623  	std	r10,PACA_EXGEN+EX_DAR(r13)
7cfab0bae Nicholas Piggin        2019-01-22  624  EXCEPTION_PROLOG_2(data_access_slb_common, EXC_STD)
1a6822d19 Nicholas Piggin        2016-12-06 @625  EXC_REAL_END(data_access_slb, 0x380, 0x80)
0ebc4cdaa Benjamin Herrenschmidt 2009-06-02  626  

:::::: The code at line 625 was first introduced by commit
:::::: 1a6822d194c3f627eeb6aaca6688a5d0a444663e powerpc/64s: Use (start, size) rather than (start, end) for exception handlers

:::::: TO: Nicholas Piggin <npiggin@gmail.com>
:::::: CC: Michael Ellerman <mpe@ellerman.id.au>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24049 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
  2019-01-22  6:46 ` [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test Nicholas Piggin
@ 2019-02-20 14:31   ` Nicholas Piggin
  2019-02-21 10:05     ` Michael Ellerman
  0 siblings, 1 reply; 8+ messages in thread
From: Nicholas Piggin @ 2019-02-20 14:31 UTC (permalink / raw)
  To: linuxppc-dev

Nicholas Piggin's on January 22, 2019 4:46 pm:
> HV interrupts that use HSRR registers do not clear MSR[RI], but
> NMI entry code is not recoverable early on due to both using HSPRG
> for a scratch register.
> 
> This bug means that a system reset or machine check can cause silent
> data corruption (due to loss of r13 register) if it hits in a small
> window when taking an HV interrupt.
> 
> Fix this by marking NMIs non-recoverable if they land in HV interrupt
> ranges.

Hum, I had a v2 that I didn't send properly with a small compile fix,
but I've also just noticed this:

> +void hv_nmi_check_nonrecoverable(struct pt_regs *regs)
> +{
> +#ifdef CONFIG_POWERNV
> +	unsigned long kbase = (unsigned long)_stext;
> +	unsigned long nip = regs->nip;
> +
> +	if (!(regs->msr & MSR_RI))
> +		return;
> +	if (!(regs->msr & MSR_HV))
> +		return;
> +	if (regs->msr & MSR_PR)
> +		return;
> +again:
> +	if (nip >= 0x500 && nip < 0x600)
> +		goto nonrecoverable;
> +	if (nip >= 0x980 && nip < 0xa00)
> +		goto nonrecoverable;
> +	if (nip >= 0xe00 && nip < 0xec0)
> +		goto nonrecoverable;
> +	if (nip >= 0xf80 && nip < 0xfa0)
> +		goto nonrecoverable;
> +	/* Trampolines are not relocated. */
> +	if (nip >= real_trampolines_start - kbase &&
> +			nip < real_trampolines_end - kbase)
> +		goto nonrecoverable;
> +	if (nip >= virt_trampolines_start - kbase &&
> +			nip < virt_trampolines_end - kbase)
> +		goto nonrecoverable;
> +	if (nip >= 0xc000000000000000ULL) {
> +		nip -= 0xc000000000000000ULL;
> +		goto again;

Tried to be a bit too clever here. The 0xc... vectors also have a 
+0x4000 offset so this won't catch them properly. I'll respin.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
  2019-02-20 14:31   ` Nicholas Piggin
@ 2019-02-21 10:05     ` Michael Ellerman
  0 siblings, 0 replies; 8+ messages in thread
From: Michael Ellerman @ 2019-02-21 10:05 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev

Nicholas Piggin <npiggin@gmail.com> writes:
> Nicholas Piggin's on January 22, 2019 4:46 pm:
>> HV interrupts that use HSRR registers do not clear MSR[RI], but
>> NMI entry code is not recoverable early on due to both using HSPRG
>> for a scratch register.
>> 
>> This bug means that a system reset or machine check can cause silent
>> data corruption (due to loss of r13 register) if it hits in a small
>> window when taking an HV interrupt.
>> 
>> Fix this by marking NMIs non-recoverable if they land in HV interrupt
>> ranges.
>
> Hum, I had a v2 that I didn't send properly with a small compile fix,
> but I've also just noticed this:
>
>> +void hv_nmi_check_nonrecoverable(struct pt_regs *regs)
>> +{
>> +#ifdef CONFIG_POWERNV
>> +	unsigned long kbase = (unsigned long)_stext;
>> +	unsigned long nip = regs->nip;
>> +
>> +	if (!(regs->msr & MSR_RI))
>> +		return;
>> +	if (!(regs->msr & MSR_HV))
>> +		return;
>> +	if (regs->msr & MSR_PR)
>> +		return;
>> +again:
>> +	if (nip >= 0x500 && nip < 0x600)
>> +		goto nonrecoverable;
>> +	if (nip >= 0x980 && nip < 0xa00)
>> +		goto nonrecoverable;
>> +	if (nip >= 0xe00 && nip < 0xec0)
>> +		goto nonrecoverable;
>> +	if (nip >= 0xf80 && nip < 0xfa0)
>> +		goto nonrecoverable;
>> +	/* Trampolines are not relocated. */
>> +	if (nip >= real_trampolines_start - kbase &&
>> +			nip < real_trampolines_end - kbase)
>> +		goto nonrecoverable;
>> +	if (nip >= virt_trampolines_start - kbase &&
>> +			nip < virt_trampolines_end - kbase)
>> +		goto nonrecoverable;
>> +	if (nip >= 0xc000000000000000ULL) {
>> +		nip -= 0xc000000000000000ULL;
>> +		goto again;
>
> Tried to be a bit too clever here. The 0xc... vectors also have a 
> +0x4000 offset so this won't catch them properly. I'll respin.

Thanks.

I'm seeing the same build error as the kbuild robot too.

cheers

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-02-21 10:07 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-22  6:46 [PATCH 0/4] Fixes for 3 separate NMI reentrancy bugs Nicholas Piggin
2019-01-22  6:46 ` [PATCH 1/4] powerpc/64s: Fix HV NMI vs HV interrupt recoverability test Nicholas Piggin
2019-02-20 14:31   ` Nicholas Piggin
2019-02-21 10:05     ` Michael Ellerman
2019-01-22  6:46 ` [PATCH 2/4] powerpc/64s: system reset interrupt preserve HSRRs Nicholas Piggin
2019-01-22  6:46 ` [PATCH 3/4] powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy Nicholas Piggin
2019-01-22  6:46 ` [PATCH 4/4] powerpc/64s: Fix " Nicholas Piggin
2019-01-22 10:30   ` kbuild test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).