linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/10] Improving debugging with NMIs
@ 2016-12-19 18:30 Nicholas Piggin
  2016-12-19 18:30 ` [PATCH 01/10] powerpc/64s: add exception macro that does not enable RI Nicholas Piggin
                   ` (9 more replies)
  0 siblings, 10 replies; 15+ messages in thread
From: Nicholas Piggin @ 2016-12-19 18:30 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Hi,

These patches have been posted a few times before. Since then
I've split them up and tried to simplify the NMI IPI stuff as
much as possible.

I've been testing using David Gibson's QEMU branch ppc-for-2.9,
which implements the required hcall.

Thanks,
Nick

Nicholas Piggin (10):
  powerpc/64s: add exception macro that does not enable RI
  powerpc/64s: exception macro for stack frame and initial register save
  powerpc/64s: fix system reset vs general interrupt reentrancy
  powerpc/64s: disallow system reset vs system reset reentrancy
  powerpc/64s: dedicated system reset interrupt stack
  powerpc: nmi_enter for system reset
  powerpc: add NMI IPI infrastructure
  powerpc: add struct smp_ops_t.cause_nmi_ipi operation
  powerpc/pseries: implement NMI IPI with H_SIGNAL_SYS_RESET
  powerpc: xmon wait for secondaries before sending IPI

 arch/powerpc/Kconfig                      |   5 +
 arch/powerpc/include/asm/exception-64s.h  |  69 +++++++--
 arch/powerpc/include/asm/paca.h           |  14 +-
 arch/powerpc/include/asm/smp.h            |  15 +-
 arch/powerpc/kernel/asm-offsets.c         |   5 +-
 arch/powerpc/kernel/exceptions-64s.S      |  72 +++++----
 arch/powerpc/kernel/setup_64.c            |   5 +
 arch/powerpc/kernel/smp.c                 | 250 +++++++++++++++++++++++++-----
 arch/powerpc/kernel/traps.c               |  19 ++-
 arch/powerpc/platforms/85xx/smp.c         |   1 +
 arch/powerpc/platforms/86xx/mpc86xx_smp.c |   1 +
 arch/powerpc/platforms/cell/interrupt.c   |   2 +-
 arch/powerpc/platforms/chrp/smp.c         |   1 +
 arch/powerpc/platforms/powermac/smp.c     |   1 +
 arch/powerpc/platforms/powernv/smp.c      |   1 +
 arch/powerpc/platforms/ps3/smp.c          |   4 +-
 arch/powerpc/platforms/pseries/ras.c      |   4 +
 arch/powerpc/platforms/pseries/smp.c      |  22 +++
 arch/powerpc/xmon/xmon.c                  |  15 +-
 19 files changed, 418 insertions(+), 88 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 01/10] powerpc/64s: add exception macro that does not enable RI
  2016-12-19 18:30 [PATCH 00/10] Improving debugging with NMIs Nicholas Piggin
@ 2016-12-19 18:30 ` Nicholas Piggin
  2017-05-01  2:58   ` [01/10] " Michael Ellerman
  2016-12-19 18:30 ` [PATCH 02/10] powerpc/64s: exception macro for stack frame and initial register save Nicholas Piggin
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 15+ messages in thread
From: Nicholas Piggin @ 2016-12-19 18:30 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Subsequent patches will add more non-RI variant exceptions, so
create a macro for it rather than open-code it.

This does not change generated instructions.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/exception-64s.h | 15 +++++++++++++++
 arch/powerpc/kernel/exceptions-64s.S     | 17 ++++-------------
 2 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 9a3eee661297..b261fb4658b4 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -194,6 +194,21 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 #define EXCEPTION_PROLOG_PSERIES_1(label, h)				\
 	__EXCEPTION_PROLOG_PSERIES_1(label, h)
 
+/* _NORI variant keeps MSR_RI clear */
+#define __EXCEPTION_PROLOG_PSERIES_1_NORI(label, h)			\
+	ld	r10,PACAKMSR(r13);	/* get MSR value for kernel */	\
+	xori	r10,r10,MSR_RI;		/* Clear MSR_RI */		\
+	mfspr	r11,SPRN_##h##SRR0;	/* save SRR0 */			\
+	LOAD_HANDLER(r12,label)						\
+	mtspr	SPRN_##h##SRR0,r12;					\
+	mfspr	r12,SPRN_##h##SRR1;	/* and SRR1 */			\
+	mtspr	SPRN_##h##SRR1,r10;					\
+	h##rfid;							\
+	b	.	/* prevent speculative execution */
+
+#define EXCEPTION_PROLOG_PSERIES_1_NORI(label, h)			\
+	__EXCEPTION_PROLOG_PSERIES_1_NORI(label, h)
+
 #define EXCEPTION_PROLOG_PSERIES(area, label, h, extra, vec)		\
 	EXCEPTION_PROLOG_0(area);					\
 	EXCEPTION_PROLOG_1(area, extra, vec);				\
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index d39d6118c6e9..ceff7d6c0518 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -268,20 +268,11 @@ machine_check_fwnmi:
 machine_check_pSeries_0:
 	EXCEPTION_PROLOG_1(PACA_EXMC, KVMTEST_PR, 0x200)
 	/*
-	 * The following is essentially EXCEPTION_PROLOG_PSERIES_1 with the
-	 * difference that MSR_RI is not enabled, because PACA_EXMC is being
-	 * used, so nested machine check corrupts it. machine_check_common
-	 * enables MSR_RI.
+	 * MSR_RI is not enabled, because PACA_EXMC is being used, so a
+	 * nested machine check corrupts it. machine_check_common enables
+	 * MSR_RI.
 	 */
-	ld	r10,PACAKMSR(r13)
-	xori	r10,r10,MSR_RI
-	mfspr	r11,SPRN_SRR0
-	LOAD_HANDLER(r12, machine_check_common)
-	mtspr	SPRN_SRR0,r12
-	mfspr	r12,SPRN_SRR1
-	mtspr	SPRN_SRR1,r10
-	rfid
-	b	.	/* prevent speculative execution */
+	EXCEPTION_PROLOG_PSERIES_1_NORI(machine_check_common, EXC_STD)
 
 TRAMP_KVM_SKIP(PACA_EXMC, 0x200)
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 02/10] powerpc/64s: exception macro for stack frame and initial register save
  2016-12-19 18:30 [PATCH 00/10] Improving debugging with NMIs Nicholas Piggin
  2016-12-19 18:30 ` [PATCH 01/10] powerpc/64s: add exception macro that does not enable RI Nicholas Piggin
@ 2016-12-19 18:30 ` Nicholas Piggin
  2016-12-19 18:30 ` [PATCH 03/10] powerpc/64s: fix system reset vs general interrupt reentrancy Nicholas Piggin
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Nicholas Piggin @ 2016-12-19 18:30 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

This code is common to a few exceptions, and another user will be added.
This causes a trivial change to generated code:

-     604: std     r9,416(r1)
-     608: mfspr   r11,314
-     60c: std     r11,368(r1)
-     610: mfspr   r12,315
+     604: mfspr   r11,314
+     608: mfspr   r12,315
+     60c: std     r9,416(r1)
+     610: std     r11,368(r1)

machine_check_powernv_early could also use this, but that requires non
trivial changes to generated code, so that's for another patch.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/exception-64s.h | 16 ++++++++++------
 arch/powerpc/kernel/exceptions-64s.S     | 13 ++++---------
 2 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index b261fb4658b4..1e985382a73a 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -297,6 +297,15 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 
 #define NOTEST(n)
 
+#define EXCEPTION_PROLOG_COMMON_1()					   \
+	std	r9,_CCR(r1);		/* save CR in stackframe	*/ \
+	std	r11,_NIP(r1);		/* save SRR0 in stackframe	*/ \
+	std	r12,_MSR(r1);		/* save SRR1 in stackframe	*/ \
+	std	r10,0(r1);		/* make stack chain pointer	*/ \
+	std	r0,GPR0(r1);		/* save r0 in stackframe	*/ \
+	std	r10,GPR1(r1);		/* save r1 in stackframe	*/ \
+
+
 /*
  * The common exception prolog is used for all except a few exceptions
  * such as a segment miss on a kernel address.  We have to be prepared
@@ -321,12 +330,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 	addi	r3,r13,area;		/* r3 -> where regs are saved*/	   \
 	RESTORE_CTR(r1, area);						   \
 	b	bad_stack;						   \
-3:	std	r9,_CCR(r1);		/* save CR in stackframe	*/ \
-	std	r11,_NIP(r1);		/* save SRR0 in stackframe	*/ \
-	std	r12,_MSR(r1);		/* save SRR1 in stackframe	*/ \
-	std	r10,0(r1);		/* make stack chain pointer	*/ \
-	std	r0,GPR0(r1);		/* save r0 in stackframe	*/ \
-	std	r10,GPR1(r1);		/* save r1 in stackframe	*/ \
+3:	EXCEPTION_PROLOG_COMMON_1();					   \
 	beq	4f;			/* if from kernel mode		*/ \
 	ACCOUNT_CPU_USER_ENTRY(r13, r9, r10);				   \
 	SAVE_PPR(area, r9, r10);					   \
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index ceff7d6c0518..4573853dfff7 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -956,17 +956,12 @@ EXC_VIRT_NONE(0x4e60, 0x4e80)
 TRAMP_KVM_HV(PACA_EXGEN, 0xe60)
 TRAMP_REAL_BEGIN(hmi_exception_early)
 	EXCEPTION_PROLOG_1(PACA_EXGEN, KVMTEST_HV, 0xe60)
-	mr	r10,r1			/* Save r1			*/
-	ld	r1,PACAEMERGSP(r13)	/* Use emergency stack		*/
+	mr	r10,r1			/* Save r1 */
+	ld	r1,PACAEMERGSP(r13)	/* Use emergency stack for realmode */
 	subi	r1,r1,INT_FRAME_SIZE	/* alloc stack frame		*/
-	std	r9,_CCR(r1)		/* save CR in stackframe	*/
 	mfspr	r11,SPRN_HSRR0		/* Save HSRR0 */
-	std	r11,_NIP(r1)		/* save HSRR0 in stackframe	*/
-	mfspr	r12,SPRN_HSRR1		/* Save SRR1 */
-	std	r12,_MSR(r1)		/* save SRR1 in stackframe	*/
-	std	r10,0(r1)		/* make stack chain pointer	*/
-	std	r0,GPR0(r1)		/* save r0 in stackframe	*/
-	std	r10,GPR1(r1)		/* save r1 in stackframe	*/
+	mfspr	r12,SPRN_HSRR1		/* Save HSRR1 */
+	EXCEPTION_PROLOG_COMMON_1()
 	EXCEPTION_PROLOG_COMMON_2(PACA_EXGEN)
 	EXCEPTION_PROLOG_COMMON_3(0xe60)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 03/10] powerpc/64s: fix system reset vs general interrupt reentrancy
  2016-12-19 18:30 [PATCH 00/10] Improving debugging with NMIs Nicholas Piggin
  2016-12-19 18:30 ` [PATCH 01/10] powerpc/64s: add exception macro that does not enable RI Nicholas Piggin
  2016-12-19 18:30 ` [PATCH 02/10] powerpc/64s: exception macro for stack frame and initial register save Nicholas Piggin
@ 2016-12-19 18:30 ` Nicholas Piggin
  2016-12-19 18:30 ` [PATCH 04/10] powerpc/64s: disallow system reset vs system reset reentrancy Nicholas Piggin
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Nicholas Piggin @ 2016-12-19 18:30 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

The system reset interrupt can occur when MSR_EE=0, and it currently
uses the PACA_EXGEN save area.

Some PACA_EXGEN interrupts have a window where MSR_RI=1 and MSR_EE=0
when the save area is still in use. A system reset interrupt in this
window can lead to undetected corruption when the save area gets
overwritten.

This patch introduces PACA_EXNMI save area for system reset exceptions,
which closes this corruption window. It's also helpful to retain the
EXGEN state for debugging situations, even if not considering the
recoverability aspect.

This patch also moves the PACA_EXMC area down to a less frequently used
part of the paca with the new save area.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/exception-64s.h | 14 +++++++-------
 arch/powerpc/include/asm/paca.h          |  6 +++++-
 arch/powerpc/kernel/asm-offsets.c        |  3 ++-
 arch/powerpc/kernel/exceptions-64s.S     | 11 +++++++----
 4 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 1e985382a73a..18eceba135cb 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -523,8 +523,8 @@ BEGIN_FTR_SECTION				\
 	beql	ppc64_runlatch_on_trampoline;	\
 END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
 
-#define EXCEPTION_COMMON(trap, label, hdlr, ret, additions)	\
-	EXCEPTION_PROLOG_COMMON(trap, PACA_EXGEN);		\
+#define EXCEPTION_COMMON(area, trap, label, hdlr, ret, additions) \
+	EXCEPTION_PROLOG_COMMON(trap, area);			\
 	/* Volatile regs are potentially clobbered here */	\
 	additions;						\
 	addi	r3,r1,STACK_FRAME_OVERHEAD;			\
@@ -532,17 +532,17 @@ END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
 	b	ret
 
 #define STD_EXCEPTION_COMMON(trap, label, hdlr)			\
-	EXCEPTION_COMMON(trap, label, hdlr, ret_from_except,	\
-			 ADD_NVGPRS;ADD_RECONCILE)
+	EXCEPTION_COMMON(PACA_EXGEN, trap, label, hdlr,		\
+		ret_from_except, ADD_NVGPRS;ADD_RECONCILE)
 
 /*
  * Like STD_EXCEPTION_COMMON, but for exceptions that can occur
  * in the idle task and therefore need the special idle handling
  * (finish nap and runlatch)
  */
-#define STD_EXCEPTION_COMMON_ASYNC(trap, label, hdlr)		  \
-	EXCEPTION_COMMON(trap, label, hdlr, ret_from_except_lite, \
-			 FINISH_NAP;ADD_RECONCILE;RUNLATCH_ON)
+#define STD_EXCEPTION_COMMON_ASYNC(trap, label, hdlr)		\
+	EXCEPTION_COMMON(PACA_EXGEN, trap, label, hdlr,		\
+		ret_from_except_lite, FINISH_NAP;ADD_RECONCILE;RUNLATCH_ON)
 
 /*
  * When the idle code in power4_idle puts the CPU into NAP mode,
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 6a6792bb39fb..06e96648c1cb 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -99,7 +99,6 @@ struct paca_struct {
 	 */
 	/* used for most interrupts/exceptions */
 	u64 exgen[13] __attribute__((aligned(0x80)));
-	u64 exmc[13];		/* used for machine checks */
 	u64 exslb[13];		/* used for SLB/segment table misses
  				 * on the linear mapping */
 	/* SLB related definitions */
@@ -174,6 +173,11 @@ struct paca_struct {
 	u8 subcore_sibling_mask;
 #endif
 
+#ifdef CONFIG_PPC_STD_MMU_64
+	/* Non-maskable exceptions that are not performance critical */
+	u64 exnmi[13];		/* used for system reset (nmi) */
+	u64 exmc[13];		/* used for machine checks */
+#endif
 #ifdef CONFIG_PPC_BOOK3S_64
 	/* Exclusive emergency stack pointer for machine check exception. */
 	void *mc_emergency_sp;
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 0601e6a7297c..fa7087a2d825 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -225,8 +225,9 @@ int main(void)
 	DEFINE(PACACONTEXTSLLP, offsetof(struct paca_struct, mm_ctx_sllp));
 #endif /* CONFIG_PPC_MM_SLICES */
 	DEFINE(PACA_EXGEN, offsetof(struct paca_struct, exgen));
-	DEFINE(PACA_EXMC, offsetof(struct paca_struct, exmc));
 	DEFINE(PACA_EXSLB, offsetof(struct paca_struct, exslb));
+	DEFINE(PACA_EXNMI, offsetof(struct paca_struct, exnmi));
+	DEFINE(PACA_EXMC, offsetof(struct paca_struct, exmc));
 	DEFINE(PACALPPACAPTR, offsetof(struct paca_struct, lppaca_ptr));
 	DEFINE(PACA_SLBSHADOWPTR, offsetof(struct paca_struct, slb_shadow_ptr));
 	DEFINE(SLBSHADOW_STACKVSID,
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 4573853dfff7..dd2bb0d785c6 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -118,8 +118,8 @@ EXC_REAL_BEGIN(system_reset, 0x100, 0x200)
 	SET_SCRATCH0(r13)
 	GET_PACA(r13)
 	clrrdi	r13,r13,1 /* Last bit of HSPRG0 is set if waking from winkle */
-	EXCEPTION_PROLOG_PSERIES_PACA(PACA_EXGEN, system_reset_common, EXC_STD,
-				 IDLETEST, 0x100)
+	EXCEPTION_PROLOG_PSERIES_PACA(PACA_EXNMI, system_reset_common,
+						EXC_STD, IDLETEST, 0x100)
 
 EXC_REAL_END(system_reset, 0x100, 0x200)
 EXC_VIRT_NONE(0x4100, 0x4200)
@@ -153,7 +153,10 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
 2:	b	pnv_wakeup_noloss
 #endif
 
-EXC_COMMON(system_reset_common, 0x100, system_reset_exception)
+EXC_COMMON_BEGIN(system_reset_common)
+	EXCEPTION_COMMON(PACA_EXNMI, 0x100,
+			system_reset, system_reset_exception,
+			ret_from_except, ADD_NVGPRS;ADD_RECONCILE)
 
 #ifdef CONFIG_PPC_PSERIES
 /*
@@ -161,7 +164,7 @@ EXC_COMMON(system_reset_common, 0x100, system_reset_exception)
  */
 TRAMP_REAL_BEGIN(system_reset_fwnmi)
 	SET_SCRATCH0(r13)		/* save r13 */
-	EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, system_reset_common, EXC_STD,
+	EXCEPTION_PROLOG_PSERIES(PACA_EXNMI, system_reset_common, EXC_STD,
 				 NOTEST, 0x100)
 #endif /* CONFIG_PPC_PSERIES */
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 04/10] powerpc/64s: disallow system reset vs system reset reentrancy
  2016-12-19 18:30 [PATCH 00/10] Improving debugging with NMIs Nicholas Piggin
                   ` (2 preceding siblings ...)
  2016-12-19 18:30 ` [PATCH 03/10] powerpc/64s: fix system reset vs general interrupt reentrancy Nicholas Piggin
@ 2016-12-19 18:30 ` Nicholas Piggin
  2016-12-19 18:30 ` [PATCH 05/10] powerpc/64s: dedicated system reset interrupt stack Nicholas Piggin
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Nicholas Piggin @ 2016-12-19 18:30 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

In preparation for using a dedicated stack for system reset interrupts,
prevent a nested system reset from recovering, in order to simplify
code that is called in crash/debug path. This allows a system reset
interrupt to just use the base stack pointer.

Keep an in_nmi nesting counter similarly to the in_mce counter. Consider
the interrrupt non-recoverable if it is taken inside another system
reset.

Interrupt nesting could be allowed similarly to MCE, but system reset
is a special case that's not for normal operation, so simplicity wins
until there is requirement for nested system reset interrupts.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/exception-64s.h | 11 ++++++++++
 arch/powerpc/include/asm/paca.h          |  5 ++++-
 arch/powerpc/kernel/asm-offsets.c        |  1 +
 arch/powerpc/kernel/exceptions-64s.S     | 37 +++++++++++++++++++++++++++-----
 arch/powerpc/kernel/traps.c              |  8 ++++++-
 arch/powerpc/xmon/xmon.c                 |  1 +
 6 files changed, 56 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 18eceba135cb..7e47fb67c696 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -214,12 +214,23 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 	EXCEPTION_PROLOG_1(area, extra, vec);				\
 	EXCEPTION_PROLOG_PSERIES_1(label, h);
 
+/* Do not enable RI */
+#define EXCEPTION_PROLOG_PSERIES_NORI(area, label, h, extra, vec)	\
+	EXCEPTION_PROLOG_0(area);					\
+	EXCEPTION_PROLOG_1(area, extra, vec);				\
+	EXCEPTION_PROLOG_PSERIES_1_NORI(label, h);
+
 /* Have the PACA in r13 already */
 #define EXCEPTION_PROLOG_PSERIES_PACA(area, label, h, extra, vec)	\
 	EXCEPTION_PROLOG_0_PACA(area);					\
 	EXCEPTION_PROLOG_1(area, extra, vec);				\
 	EXCEPTION_PROLOG_PSERIES_1(label, h);
 
+#define EXCEPTION_PROLOG_PSERIES_PACA_NORI(area, label, h, extra, vec)	\
+	EXCEPTION_PROLOG_0_PACA(area);					\
+	EXCEPTION_PROLOG_1(area, extra, vec);				\
+	EXCEPTION_PROLOG_PSERIES_1_NORI(label, h);
+
 #define __KVMTEST(h, n)							\
 	lbz	r10,HSTATE_IN_GUEST(r13);				\
 	cmpwi	r10,0;							\
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 06e96648c1cb..98da152b257b 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -181,12 +181,15 @@ struct paca_struct {
 #ifdef CONFIG_PPC_BOOK3S_64
 	/* Exclusive emergency stack pointer for machine check exception. */
 	void *mc_emergency_sp;
+
+	u16 in_nmi;			/* In nmi handler */
+
 	/*
 	 * Flag to check whether we are in machine check early handler
 	 * and already using emergency stack.
 	 */
 	u16 in_mce;
-	u8 hmi_event_available;		 /* HMI event is available */
+	u8 hmi_event_available;		/* HMI event is available */
 #endif
 
 	/* Stuff for accurate time accounting */
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index fa7087a2d825..893da286ecc3 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -244,6 +244,7 @@ int main(void)
 #ifdef CONFIG_PPC_BOOK3S_64
 	DEFINE(PACAMCEMERGSP, offsetof(struct paca_struct, mc_emergency_sp));
 	DEFINE(PACA_IN_MCE, offsetof(struct paca_struct, in_mce));
+	DEFINE(PACA_IN_NMI, offsetof(struct paca_struct, in_nmi));
 #endif
 	DEFINE(PACAHWCPUID, offsetof(struct paca_struct, hw_cpu_id));
 	DEFINE(PACAKEXECSTATE, offsetof(struct paca_struct, kexec_state));
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index dd2bb0d785c6..4034f7db73b7 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -118,7 +118,11 @@ EXC_REAL_BEGIN(system_reset, 0x100, 0x200)
 	SET_SCRATCH0(r13)
 	GET_PACA(r13)
 	clrrdi	r13,r13,1 /* Last bit of HSPRG0 is set if waking from winkle */
-	EXCEPTION_PROLOG_PSERIES_PACA(PACA_EXNMI, system_reset_common,
+	/*
+	 * MSR_RI is not enabled, because PACA_EXNMI and nmi stack is
+	 * being used, so a nested NMI exception would corrupt it.
+	 */
+	EXCEPTION_PROLOG_PSERIES_PACA_NORI(PACA_EXNMI, system_reset_common,
 						EXC_STD, IDLETEST, 0x100)
 
 EXC_REAL_END(system_reset, 0x100, 0x200)
@@ -154,9 +158,31 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_300)
 #endif
 
 EXC_COMMON_BEGIN(system_reset_common)
+	/*
+	 * Increment paca->in_nmi then enable MSR_RI. SLB or MCE will be able
+	 * to recover, but nested NMI will notice in_nmi and not recover
+	 * because of the use of the NMI stack. in_nmi reentrancy is tested in
+	 * system_reset_exception.
+	 */
+	lhz	r10,PACA_IN_NMI(r13)
+	addi	r10,r10,1
+	sth	r10,PACA_IN_NMI(r13)
+	li	r10,MSR_RI
+	mtmsrd 	r10,1
+
 	EXCEPTION_COMMON(PACA_EXNMI, 0x100,
-			system_reset, system_reset_exception,
-			ret_from_except, ADD_NVGPRS;ADD_RECONCILE)
+			system_reset, system_reset_exception, 1f,
+			ADD_NVGPRS;ADD_RECONCILE)
+1: /* EXCEPTION_COMMON continues here */
+
+	/*
+	 * The stack is no longer in use, decrement in_nmi.
+	 */
+	lhz	r10,PACA_IN_NMI(r13)
+	subi	r10,r10,1
+	sth	r10,PACA_IN_NMI(r13)
+
+	b	ret_from_except
 
 #ifdef CONFIG_PPC_PSERIES
 /*
@@ -164,8 +190,9 @@ EXC_COMMON_BEGIN(system_reset_common)
  */
 TRAMP_REAL_BEGIN(system_reset_fwnmi)
 	SET_SCRATCH0(r13)		/* save r13 */
-	EXCEPTION_PROLOG_PSERIES(PACA_EXNMI, system_reset_common, EXC_STD,
-				 NOTEST, 0x100)
+	/* See comment at system_reset exception */
+	EXCEPTION_PROLOG_PSERIES_NORI(PACA_EXNMI, system_reset_common,
+						EXC_STD, NOTEST, 0x100)
 #endif /* CONFIG_PPC_PSERIES */
 
 
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 4239aaf74886..802aa6bbe97b 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -281,11 +281,17 @@ void system_reset_exception(struct pt_regs *regs)
 	/* See if any machine dependent calls */
 	if (ppc_md.system_reset_exception) {
 		if (ppc_md.system_reset_exception(regs))
-			return;
+			goto out;
 	}
 
 	die("System Reset", regs, SIGABRT);
 
+out:
+#ifdef CONFIG_PPC_BOOK3S_64
+	BUG_ON(get_paca()->in_nmi == 0);
+	if (get_paca()->in_nmi > 1)
+		panic("Unrecoverable nested System Reset");
+#endif
 	/* Must die if the interrupt is not recoverable */
 	if (!(regs->msr & MSR_RI))
 		panic("Unrecoverable System Reset");
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 9c0e17cf6886..832cbc097416 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2228,6 +2228,7 @@ static void dump_one_paca(int cpu)
 	DUMP(p, emergency_sp, "p");
 #ifdef CONFIG_PPC_BOOK3S_64
 	DUMP(p, mc_emergency_sp, "p");
+	DUMP(p, in_nmi, "x");
 	DUMP(p, in_mce, "x");
 	DUMP(p, hmi_event_available, "x");
 #endif
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 05/10] powerpc/64s: dedicated system reset interrupt stack
  2016-12-19 18:30 [PATCH 00/10] Improving debugging with NMIs Nicholas Piggin
                   ` (3 preceding siblings ...)
  2016-12-19 18:30 ` [PATCH 04/10] powerpc/64s: disallow system reset vs system reset reentrancy Nicholas Piggin
@ 2016-12-19 18:30 ` Nicholas Piggin
  2016-12-19 18:30 ` [PATCH 06/10] powerpc: nmi_enter for system reset Nicholas Piggin
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Nicholas Piggin @ 2016-12-19 18:30 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

The system reset interrupt is used for crash/debug situations, so it is
desirable to have as little impact on the normal state of the system as
possible.

Currently it uses the current kernel stack to process the exception.
This stores into the stack which may be involved with the crash. The
stack pointer may be corrupted, or it may have overflowed.

Avoid or minimise these problems by creating a dedicated NMI stack for
the system reset interrupt to use.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/exception-64s.h | 13 +++++++++++++
 arch/powerpc/include/asm/paca.h          |  3 ++-
 arch/powerpc/kernel/asm-offsets.c        |  1 +
 arch/powerpc/kernel/exceptions-64s.S     |  8 +++++---
 arch/powerpc/kernel/setup_64.c           |  5 +++++
 arch/powerpc/xmon/xmon.c                 |  1 +
 6 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 7e47fb67c696..7884d9263b98 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -542,6 +542,19 @@ END_FTR_SECTION_IFSET(CPU_FTR_CTRL)
 	bl	hdlr;						\
 	b	ret
 
+/*
+ * Exception where stack is already set in r1, r1 is saved in r10, and it
+ * continues rather than returns.
+ */
+#define EXCEPTION_COMMON_NORET_STACK(area, trap, label, hdlr, additions) \
+	EXCEPTION_PROLOG_COMMON_1();				\
+	EXCEPTION_PROLOG_COMMON_2(area);			\
+	EXCEPTION_PROLOG_COMMON_3(trap);			\
+	/* Volatile regs are potentially clobbered here */	\
+	additions;						\
+	addi	r3,r1,STACK_FRAME_OVERHEAD;			\
+	bl	hdlr
+
 #define STD_EXCEPTION_COMMON(trap, label, hdlr)			\
 	EXCEPTION_COMMON(PACA_EXGEN, trap, label, hdlr,		\
 		ret_from_except, ADD_NVGPRS;ADD_RECONCILE)
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 98da152b257b..c4887c67157b 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -179,7 +179,8 @@ struct paca_struct {
 	u64 exmc[13];		/* used for machine checks */
 #endif
 #ifdef CONFIG_PPC_BOOK3S_64
-	/* Exclusive emergency stack pointer for machine check exception. */
+	/* Exclusive stacks for system reset and machine check exception. */
+	void *nmi_emergency_sp;
 	void *mc_emergency_sp;
 
 	u16 in_nmi;			/* In nmi handler */
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 893da286ecc3..ba1e3bb09f9f 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -242,6 +242,7 @@ int main(void)
 #endif /* CONFIG_PPC_STD_MMU_64 */
 	DEFINE(PACAEMERGSP, offsetof(struct paca_struct, emergency_sp));
 #ifdef CONFIG_PPC_BOOK3S_64
+	DEFINE(PACANMIEMERGSP, offsetof(struct paca_struct, nmi_emergency_sp));
 	DEFINE(PACAMCEMERGSP, offsetof(struct paca_struct, mc_emergency_sp));
 	DEFINE(PACA_IN_MCE, offsetof(struct paca_struct, in_mce));
 	DEFINE(PACA_IN_NMI, offsetof(struct paca_struct, in_nmi));
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 4034f7db73b7..59f1a53f45c7 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -170,10 +170,12 @@ EXC_COMMON_BEGIN(system_reset_common)
 	li	r10,MSR_RI
 	mtmsrd 	r10,1
 
-	EXCEPTION_COMMON(PACA_EXNMI, 0x100,
-			system_reset, system_reset_exception, 1f,
+	mr	r10,r1
+	ld	r1,PACANMIEMERGSP(r13)
+	subi	r1,r1,INT_FRAME_SIZE
+	EXCEPTION_COMMON_NORET_STACK(PACA_EXNMI, 0x100,
+			system_reset, system_reset_exception,
 			ADD_NVGPRS;ADD_RECONCILE)
-1: /* EXCEPTION_COMMON continues here */
 
 	/*
 	 * The stack is no longer in use, decrement in_nmi.
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 6824157e4d2e..0823064c57c1 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -573,6 +573,11 @@ void __init emergency_stack_init(void)
 		paca[i].emergency_sp = (void *)ti + THREAD_SIZE;
 
 #ifdef CONFIG_PPC_BOOK3S_64
+		/* emergency stack for NMI exception handling. */
+		ti = __va(memblock_alloc_base(THREAD_SIZE, THREAD_SIZE, limit));
+		klp_init_thread_info(ti);
+		paca[i].nmi_emergency_sp = (void *)ti + THREAD_SIZE;
+
 		/* emergency stack for machine check exception handling. */
 		ti = __va(memblock_alloc_base(THREAD_SIZE, THREAD_SIZE, limit));
 		klp_init_thread_info(ti);
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 832cbc097416..77a88319f494 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2227,6 +2227,7 @@ static void dump_one_paca(int cpu)
 	DUMP(p, kernel_msr, "lx");
 	DUMP(p, emergency_sp, "p");
 #ifdef CONFIG_PPC_BOOK3S_64
+	DUMP(p, nmi_emergency_sp, "p");
 	DUMP(p, mc_emergency_sp, "p");
 	DUMP(p, in_nmi, "x");
 	DUMP(p, in_mce, "x");
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 06/10] powerpc: nmi_enter for system reset
  2016-12-19 18:30 [PATCH 00/10] Improving debugging with NMIs Nicholas Piggin
                   ` (4 preceding siblings ...)
  2016-12-19 18:30 ` [PATCH 05/10] powerpc/64s: dedicated system reset interrupt stack Nicholas Piggin
@ 2016-12-19 18:30 ` Nicholas Piggin
  2017-02-07  4:06   ` Michael Ellerman
  2016-12-19 18:30 ` [PATCH 07/10] powerpc: add NMI IPI infrastructure Nicholas Piggin
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 15+ messages in thread
From: Nicholas Piggin @ 2016-12-19 18:30 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

System reset is a non-maskable interrupt from Linux's point of view
(occurs under local_irq_disable()), so it should use nmi_enter/exit.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/traps.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 802aa6bbe97b..c65c88fb6482 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -278,6 +278,14 @@ void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
 
 void system_reset_exception(struct pt_regs *regs)
 {
+	/*
+	 * Avoid crashes in case of nested NMI exceptions. Recoverability
+	 * is determined by RI and in_nmi
+	 */
+	bool nested = in_nmi();
+	if (!nested)
+		nmi_enter();
+
 	/* See if any machine dependent calls */
 	if (ppc_md.system_reset_exception) {
 		if (ppc_md.system_reset_exception(regs))
@@ -296,6 +304,9 @@ void system_reset_exception(struct pt_regs *regs)
 	if (!(regs->msr & MSR_RI))
 		panic("Unrecoverable System Reset");
 
+	if (!nested)
+		nmi_exit();
+
 	/* What should we do here? We could issue a shutdown or hard reset. */
 }
 
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 07/10] powerpc: add NMI IPI infrastructure
  2016-12-19 18:30 [PATCH 00/10] Improving debugging with NMIs Nicholas Piggin
                   ` (5 preceding siblings ...)
  2016-12-19 18:30 ` [PATCH 06/10] powerpc: nmi_enter for system reset Nicholas Piggin
@ 2016-12-19 18:30 ` Nicholas Piggin
  2017-02-03 11:45   ` Nicholas Piggin
  2016-12-19 18:30 ` [PATCH 08/10] powerpc: add struct smp_ops_t.cause_nmi_ipi operation Nicholas Piggin
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 15+ messages in thread
From: Nicholas Piggin @ 2016-12-19 18:30 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Add a simple NMI IPI system that handles concurrency and reentrancy.

The platform does not have to implement a true non-maskable interrupt,
the default is to simply use the debugger break IPI message. This has
now been co-opted for a general IPI message, and users (debugger and
crash) have been reimplemented on top of the NMI system.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/Kconfig                    |   5 +
 arch/powerpc/include/asm/smp.h          |  14 +-
 arch/powerpc/kernel/smp.c               | 247 +++++++++++++++++++++++++++-----
 arch/powerpc/platforms/cell/interrupt.c |   2 +-
 arch/powerpc/platforms/ps3/smp.c        |   4 +-
 5 files changed, 230 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 3da87e198878..e7186db87e18 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -38,6 +38,11 @@ config NR_IRQS
 	  /proc/interrupts. If you configure your system to have too few,
 	  drivers will fail to load or worse - handle with care.
 
+config NMI_IPI
+	bool
+	depends on SMP && (DEBUGGER || KEXEC_CORE)
+	default y
+
 config STACKTRACE_SUPPORT
 	bool
 	default y
diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 32db16d2e7ad..693a68f4ac8b 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -112,14 +112,22 @@ extern int cpu_to_core_id(int cpu);
  *
  * Make sure this matches openpic_request_IPIs in open_pic.c, or what shows up
  * in /proc/interrupts will be wrong!!! --Troy */
-#define PPC_MSG_CALL_FUNCTION   0
-#define PPC_MSG_RESCHEDULE      1
+#define PPC_MSG_CALL_FUNCTION	0
+#define PPC_MSG_RESCHEDULE	1
 #define PPC_MSG_TICK_BROADCAST	2
-#define PPC_MSG_DEBUGGER_BREAK  3
+#define PPC_MSG_NMI_IPI		3
 
 /* This is only used by the powernv kernel */
 #define PPC_MSG_RM_HOST_ACTION	4
 
+#define NMI_IPI_ALL_OTHERS		-2
+
+#ifdef CONFIG_NMI_IPI
+extern int smp_handle_nmi_ipi(struct pt_regs *regs);
+#else
+static inline int smp_handle_nmi_ipi(struct pt_regs *regs) { return 0; }
+#endif
+
 /* for irq controllers that have dedicated ipis per message (4) */
 extern int smp_request_message_ipi(int virq, int message);
 extern const char *smp_ipi_name[];
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 893bd7f79be6..81256522985d 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -85,8 +85,6 @@ volatile unsigned int cpu_callin_map[NR_CPUS];
 
 int smt_enabled_at_boot = 1;
 
-static void (*crash_ipi_function_ptr)(struct pt_regs *) = NULL;
-
 /*
  * Returns 1 if the specified cpu should be brought up during boot.
  * Used to inhibit booting threads if they've been disabled or
@@ -157,32 +155,33 @@ static irqreturn_t tick_broadcast_ipi_action(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
-static irqreturn_t debug_ipi_action(int irq, void *data)
+#ifdef CONFIG_NMI_IPI
+static irqreturn_t nmi_ipi_action(int irq, void *data)
 {
-	if (crash_ipi_function_ptr) {
-		crash_ipi_function_ptr(get_irq_regs());
-		return IRQ_HANDLED;
-	}
-
-#ifdef CONFIG_DEBUGGER
-	debugger_ipi(get_irq_regs());
-#endif /* CONFIG_DEBUGGER */
-
+	smp_handle_nmi_ipi(get_irq_regs());
 	return IRQ_HANDLED;
 }
+#endif
 
 static irq_handler_t smp_ipi_action[] = {
 	[PPC_MSG_CALL_FUNCTION] =  call_function_action,
 	[PPC_MSG_RESCHEDULE] = reschedule_action,
 	[PPC_MSG_TICK_BROADCAST] = tick_broadcast_ipi_action,
-	[PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action,
+#ifdef CONFIG_NMI_IPI
+	[PPC_MSG_NMI_IPI] = nmi_ipi_action,
+#endif
 };
 
+/*
+ * The NMI IPI is a fallback and not truly non-maskable. It is simpler
+ * than going through the call function infrastructure, and strongly
+ * serialized, so it is more appropriate for debugging.
+ */
 const char *smp_ipi_name[] = {
 	[PPC_MSG_CALL_FUNCTION] =  "ipi call function",
 	[PPC_MSG_RESCHEDULE] = "ipi reschedule",
 	[PPC_MSG_TICK_BROADCAST] = "ipi tick-broadcast",
-	[PPC_MSG_DEBUGGER_BREAK] = "ipi debugger",
+	[PPC_MSG_NMI_IPI] = "nmi ipi",
 };
 
 /* optional function to request ipi, for controllers with >= 4 ipis */
@@ -190,14 +189,13 @@ int smp_request_message_ipi(int virq, int msg)
 {
 	int err;
 
-	if (msg < 0 || msg > PPC_MSG_DEBUGGER_BREAK) {
+	if (msg < 0 || msg > PPC_MSG_NMI_IPI)
 		return -EINVAL;
-	}
-#if !defined(CONFIG_DEBUGGER) && !defined(CONFIG_KEXEC_CORE)
-	if (msg == PPC_MSG_DEBUGGER_BREAK) {
+#ifndef CONFIG_NMI_IPI
+	if (msg == PPC_MSG_NMI_IPI)
 		return 1;
-	}
 #endif
+
 	err = request_irq(virq, smp_ipi_action[msg],
 			  IRQF_PERCPU | IRQF_NO_THREAD | IRQF_NO_SUSPEND,
 			  smp_ipi_name[msg], NULL);
@@ -277,8 +275,10 @@ irqreturn_t smp_ipi_demux(void)
 			scheduler_ipi();
 		if (all & IPI_MESSAGE(PPC_MSG_TICK_BROADCAST))
 			tick_broadcast_ipi_handler();
-		if (all & IPI_MESSAGE(PPC_MSG_DEBUGGER_BREAK))
-			debug_ipi_action(0, NULL);
+#ifdef CONFIG_NMI_IPI
+		if (all & IPI_MESSAGE(PPC_MSG_NMI_IPI))
+			nmi_ipi_action(0, NULL);
+#endif
 	} while (info->messages);
 
 	return IRQ_HANDLED;
@@ -315,6 +315,188 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask)
 		do_message_pass(cpu, PPC_MSG_CALL_FUNCTION);
 }
 
+#ifdef CONFIG_NMI_IPI
+
+/*
+ * "NMI IPI" system.
+ *
+ * NMI IPIs may not be recoverable, so should not be used as ongoing part of
+ * a running system. They can be used for crash, debug, halt/reboot, etc.
+ *
+ * NMI IPIs are globally single threaded. No more than one in progress at
+ * any time.
+ *
+ * The IPI call waits with interrupts disabled until all targets enter the
+ * NMI handler, then the call returns.
+ *
+ * No new NMI can be initiated until targets exit the handler.
+ *
+ * The IPI call may time out without all targets entering the NMI handler.
+ * In that case, there is some logic to recover (and ignore subsequent
+ * NMI interrupts that may eventually be raised), but the platform interrupt
+ * handler may not be able to distinguish this from other exception causes,
+ * which may cause a crash.
+ */
+
+static atomic_t __nmi_ipi_lock = ATOMIC_INIT(0);
+static struct cpumask nmi_ipi_pending_mask;
+static int nmi_ipi_busy_count = 0;
+static void (*nmi_ipi_function)(struct pt_regs *) = NULL;
+
+static void nmi_ipi_lock_start(unsigned long *flags)
+{
+	raw_local_irq_save(*flags);
+	hard_irq_disable();
+	while (atomic_cmpxchg(&__nmi_ipi_lock, 0, 1) == 1) {
+		raw_local_irq_restore(*flags);
+		cpu_relax();
+		raw_local_irq_save(*flags);
+		hard_irq_disable();
+	}
+}
+
+static void nmi_ipi_lock(void)
+{
+	while (atomic_cmpxchg(&__nmi_ipi_lock, 0, 1) == 1)
+		cpu_relax();
+}
+
+static void nmi_ipi_unlock(void)
+{
+	smp_mb();
+	WARN_ON(atomic_read(&__nmi_ipi_lock) != 1);
+	atomic_set(&__nmi_ipi_lock, 0);
+}
+
+static void nmi_ipi_unlock_end(unsigned long *flags)
+{
+	nmi_ipi_unlock();
+	raw_local_irq_restore(*flags);
+}
+
+/*
+ * Platform NMI handler calls this to ack
+ */
+int smp_handle_nmi_ipi(struct pt_regs *regs)
+{
+	void (*fn)(struct pt_regs *);
+	unsigned long flags;
+	int me = raw_smp_processor_id();
+	int ret = 0;
+
+	/*
+	 * Unexpected NMIs are possible here because the interrupt may not
+	 * be able to distinguish NMI IPIs from other types of NMIs, or
+	 * because the caller may have timed out.
+	 */
+	nmi_ipi_lock_start(&flags);
+	if (!nmi_ipi_busy_count)
+		goto out;
+	if (!cpumask_test_cpu(me, &nmi_ipi_pending_mask))
+		goto out;
+
+	fn = nmi_ipi_function;
+	if (!fn)
+		goto out;
+
+	cpumask_clear_cpu(me, &nmi_ipi_pending_mask);
+	nmi_ipi_busy_count++;
+	nmi_ipi_unlock();
+
+	ret = 1;
+
+	fn(regs);
+
+	nmi_ipi_lock();
+	nmi_ipi_busy_count--;
+out:
+	nmi_ipi_unlock_end(&flags);
+
+	return ret;
+}
+
+static void do_smp_send_nmi_ipi(int cpu)
+{
+	if (cpu >= 0) {
+		do_message_pass(cpu, PPC_MSG_NMI_IPI);
+	} else {
+		unsigned int c;
+
+		for_each_online_cpu(c) {
+			if (c == raw_smp_processor_id())
+				continue;
+			do_message_pass(cpu, PPC_MSG_NMI_IPI);
+		}
+	}
+}
+
+/*
+ * - cpu is the target CPU (must not be this CPU), or NMI_IPI_ALL_OTHERS.
+ * - fn is the target callback function.
+ * - delay_us > 0 is the delay before giving up waiting for targets to
+ *   enter the handler, == 0 specifies indefinite delay.
+ */
+static int smp_send_nmi_ipi(int cpu, void (*fn)(struct pt_regs *), u64 delay_us)
+{
+	unsigned long flags;
+	int me = raw_smp_processor_id();
+	int ret = 1;
+
+	BUG_ON(cpu == me);
+	BUG_ON(cpu < 0 && cpu != NMI_IPI_ALL_OTHERS);
+
+	if (unlikely(!smp_ops))
+		return 0;
+
+	get_online_cpus();
+
+	/* Take the nmi_ipi_busy count/lock with interrupts hard disabled */
+	nmi_ipi_lock_start(&flags);
+	while (nmi_ipi_busy_count) {
+		nmi_ipi_unlock_end(&flags);
+		cpu_relax();
+		nmi_ipi_lock_start(&flags);
+	}
+
+	nmi_ipi_function = fn;
+
+	if (cpu < 0) {
+		/* ALL_OTHERS */
+		cpumask_copy(&nmi_ipi_pending_mask, cpu_online_mask);
+		cpumask_clear_cpu(me, &nmi_ipi_pending_mask);
+	} else {
+		/* cpumask starts clear */
+		cpumask_set_cpu(cpu, &nmi_ipi_pending_mask);
+	}
+	nmi_ipi_busy_count++;
+	nmi_ipi_unlock();
+
+	do_smp_send_nmi_ipi(cpu);
+
+	while (!cpumask_empty(&nmi_ipi_pending_mask)) {
+		udelay(1);
+		if (delay_us) {
+			delay_us--;
+			if (!delay_us)
+				break;
+		}
+	}
+
+	nmi_ipi_lock();
+	if (!cpumask_empty(&nmi_ipi_pending_mask)) {
+		/* Could not gather all CPUs */
+		ret = 0;
+		cpumask_clear(&nmi_ipi_pending_mask);
+	}
+	nmi_ipi_busy_count--;
+	nmi_ipi_unlock_end(&flags);
+
+	put_online_cpus();
+
+	return ret;
+}
+#endif /* CONFIG_NMI_IPI */
+
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 void tick_broadcast(const struct cpumask *mask)
 {
@@ -325,29 +507,22 @@ void tick_broadcast(const struct cpumask *mask)
 }
 #endif
 
-#if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC_CORE)
-void smp_send_debugger_break(void)
+#ifdef CONFIG_DEBUGGER
+void debugger_ipi_callback(struct pt_regs *regs)
 {
-	int cpu;
-	int me = raw_smp_processor_id();
-
-	if (unlikely(!smp_ops))
-		return;
+	debugger_ipi(regs);
+}
 
-	for_each_online_cpu(cpu)
-		if (cpu != me)
-			do_message_pass(cpu, PPC_MSG_DEBUGGER_BREAK);
+void smp_send_debugger_break(void)
+{
+	smp_send_nmi_ipi(NMI_IPI_ALL_OTHERS, debugger_ipi_callback, 1000000);
 }
 #endif
 
 #ifdef CONFIG_KEXEC_CORE
 void crash_send_ipi(void (*crash_ipi_callback)(struct pt_regs *))
 {
-	crash_ipi_function_ptr = crash_ipi_callback;
-	if (crash_ipi_callback) {
-		mb();
-		smp_send_debugger_break();
-	}
+	smp_send_nmi_ipi(NMI_IPI_ALL_OTHERS, crash_ipi_callback, 1000000);
 }
 #endif
 
diff --git a/arch/powerpc/platforms/cell/interrupt.c b/arch/powerpc/platforms/cell/interrupt.c
index a6bbbaba14a3..871d38479a25 100644
--- a/arch/powerpc/platforms/cell/interrupt.c
+++ b/arch/powerpc/platforms/cell/interrupt.c
@@ -211,7 +211,7 @@ void iic_request_IPIs(void)
 	iic_request_ipi(PPC_MSG_CALL_FUNCTION);
 	iic_request_ipi(PPC_MSG_RESCHEDULE);
 	iic_request_ipi(PPC_MSG_TICK_BROADCAST);
-	iic_request_ipi(PPC_MSG_DEBUGGER_BREAK);
+	iic_request_ipi(PPC_MSG_NMI_IPI);
 }
 
 #endif /* CONFIG_SMP */
diff --git a/arch/powerpc/platforms/ps3/smp.c b/arch/powerpc/platforms/ps3/smp.c
index 60154d08debf..1d1ad5df106f 100644
--- a/arch/powerpc/platforms/ps3/smp.c
+++ b/arch/powerpc/platforms/ps3/smp.c
@@ -77,7 +77,7 @@ static void __init ps3_smp_probe(void)
 		BUILD_BUG_ON(PPC_MSG_CALL_FUNCTION    != 0);
 		BUILD_BUG_ON(PPC_MSG_RESCHEDULE       != 1);
 		BUILD_BUG_ON(PPC_MSG_TICK_BROADCAST   != 2);
-		BUILD_BUG_ON(PPC_MSG_DEBUGGER_BREAK   != 3);
+		BUILD_BUG_ON(PPC_MSG_NMI_IPI          != 3);
 
 		for (i = 0; i < MSG_COUNT; i++) {
 			result = ps3_event_receive_port_setup(cpu, &virqs[i]);
@@ -96,7 +96,7 @@ static void __init ps3_smp_probe(void)
 				ps3_register_ipi_irq(cpu, virqs[i]);
 		}
 
-		ps3_register_ipi_debug_brk(cpu, virqs[PPC_MSG_DEBUGGER_BREAK]);
+		ps3_register_ipi_debug_brk(cpu, virqs[PPC_MSG_NMI_IPI]);
 
 		DBG(" <- %s:%d: (%d)\n", __func__, __LINE__, cpu);
 	}
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 08/10] powerpc: add struct smp_ops_t.cause_nmi_ipi operation
  2016-12-19 18:30 [PATCH 00/10] Improving debugging with NMIs Nicholas Piggin
                   ` (6 preceding siblings ...)
  2016-12-19 18:30 ` [PATCH 07/10] powerpc: add NMI IPI infrastructure Nicholas Piggin
@ 2016-12-19 18:30 ` Nicholas Piggin
  2016-12-19 18:30 ` [PATCH 09/10] powerpc/pseries: implement NMI IPI with H_SIGNAL_SYS_RESET Nicholas Piggin
  2016-12-19 18:30 ` [PATCH 10/10] powerpc: xmon wait for secondaries before sending IPI Nicholas Piggin
  9 siblings, 0 replies; 15+ messages in thread
From: Nicholas Piggin @ 2016-12-19 18:30 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Have the NMI IPI code use this op when the platform defines it.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/smp.h            | 1 +
 arch/powerpc/kernel/smp.c                 | 3 +++
 arch/powerpc/platforms/85xx/smp.c         | 1 +
 arch/powerpc/platforms/86xx/mpc86xx_smp.c | 1 +
 arch/powerpc/platforms/chrp/smp.c         | 1 +
 arch/powerpc/platforms/powermac/smp.c     | 1 +
 arch/powerpc/platforms/powernv/smp.c      | 1 +
 arch/powerpc/platforms/pseries/smp.c      | 1 +
 8 files changed, 10 insertions(+)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 693a68f4ac8b..5b1c8f64acce 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -42,6 +42,7 @@ struct smp_ops_t {
 #ifdef CONFIG_PPC_SMP_MUXED_IPI
 	void  (*cause_ipi)(int cpu, unsigned long data);
 #endif
+	int   (*cause_nmi_ipi)(int cpu);
 	void  (*probe)(void);
 	int   (*kick_cpu)(int nr);
 	void  (*setup_cpu)(int nr);
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 81256522985d..564b7dc9e250 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -417,6 +417,9 @@ int smp_handle_nmi_ipi(struct pt_regs *regs)
 
 static void do_smp_send_nmi_ipi(int cpu)
 {
+	if (smp_ops->cause_nmi_ipi && smp_ops->cause_nmi_ipi(cpu))
+		return;
+
 	if (cpu >= 0) {
 		do_message_pass(cpu, PPC_MSG_NMI_IPI);
 	} else {
diff --git a/arch/powerpc/platforms/85xx/smp.c b/arch/powerpc/platforms/85xx/smp.c
index a83a6d26090d..8534bc47c38b 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -343,6 +343,7 @@ static int smp_85xx_kick_cpu(int nr)
 }
 
 struct smp_ops_t smp_85xx_ops = {
+	.cause_nmi_ipi = NULL,
 	.kick_cpu = smp_85xx_kick_cpu,
 	.cpu_bootable = smp_generic_cpu_bootable,
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/powerpc/platforms/86xx/mpc86xx_smp.c b/arch/powerpc/platforms/86xx/mpc86xx_smp.c
index af09baee22cb..020e84a47a32 100644
--- a/arch/powerpc/platforms/86xx/mpc86xx_smp.c
+++ b/arch/powerpc/platforms/86xx/mpc86xx_smp.c
@@ -105,6 +105,7 @@ smp_86xx_setup_cpu(int cpu_nr)
 
 
 struct smp_ops_t smp_86xx_ops = {
+	.cause_nmi_ipi = NULL,
 	.message_pass = smp_mpic_message_pass,
 	.probe = smp_mpic_probe,
 	.kick_cpu = smp_86xx_kick_cpu,
diff --git a/arch/powerpc/platforms/chrp/smp.c b/arch/powerpc/platforms/chrp/smp.c
index b6c9a0dcc924..14515040f7cd 100644
--- a/arch/powerpc/platforms/chrp/smp.c
+++ b/arch/powerpc/platforms/chrp/smp.c
@@ -44,6 +44,7 @@ static void smp_chrp_setup_cpu(int cpu_nr)
 
 /* CHRP with openpic */
 struct smp_ops_t chrp_smp_ops = {
+	.cause_nmi_ipi = NULL,
 	.message_pass = smp_mpic_message_pass,
 	.probe = smp_mpic_probe,
 	.kick_cpu = smp_chrp_kick_cpu,
diff --git a/arch/powerpc/platforms/powermac/smp.c b/arch/powerpc/platforms/powermac/smp.c
index c9eb7d6540ea..1d76e1567018 100644
--- a/arch/powerpc/platforms/powermac/smp.c
+++ b/arch/powerpc/platforms/powermac/smp.c
@@ -446,6 +446,7 @@ void __init smp_psurge_give_timebase(void)
 struct smp_ops_t psurge_smp_ops = {
 	.message_pass	= NULL,	/* Use smp_muxed_ipi_message_pass */
 	.cause_ipi	= smp_psurge_cause_ipi,
+	.cause_nmi_ipi	= NULL,
 	.probe		= smp_psurge_probe,
 	.kick_cpu	= smp_psurge_kick_cpu,
 	.setup_cpu	= smp_psurge_setup_cpu,
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index c789258ae1e1..092ec1f7b58d 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -244,6 +244,7 @@ static int pnv_cpu_bootable(unsigned int nr)
 static struct smp_ops_t pnv_smp_ops = {
 	.message_pass	= smp_muxed_ipi_message_pass,
 	.cause_ipi	= NULL,	/* Filled at runtime by xics_smp_probe() */
+	.cause_nmi_ipi	= NULL,
 	.probe		= xics_smp_probe,
 	.kick_cpu	= pnv_smp_kick_cpu,
 	.setup_cpu	= pnv_smp_setup_cpu,
diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
index f6f83aeccaaa..0f6522c6518b 100644
--- a/arch/powerpc/platforms/pseries/smp.c
+++ b/arch/powerpc/platforms/pseries/smp.c
@@ -209,6 +209,7 @@ static __init void pSeries_smp_probe(void)
 static struct smp_ops_t pseries_smp_ops = {
 	.message_pass	= NULL,	/* Use smp_muxed_ipi_message_pass */
 	.cause_ipi	= NULL,	/* Filled at runtime by pSeries_smp_probe() */
+	.cause_nmi_ipi	= NULL,
 	.probe		= pSeries_smp_probe,
 	.kick_cpu	= smp_pSeries_kick_cpu,
 	.setup_cpu	= smp_setup_cpu,
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 09/10] powerpc/pseries: implement NMI IPI with H_SIGNAL_SYS_RESET
  2016-12-19 18:30 [PATCH 00/10] Improving debugging with NMIs Nicholas Piggin
                   ` (7 preceding siblings ...)
  2016-12-19 18:30 ` [PATCH 08/10] powerpc: add struct smp_ops_t.cause_nmi_ipi operation Nicholas Piggin
@ 2016-12-19 18:30 ` Nicholas Piggin
  2016-12-19 18:30 ` [PATCH 10/10] powerpc: xmon wait for secondaries before sending IPI Nicholas Piggin
  9 siblings, 0 replies; 15+ messages in thread
From: Nicholas Piggin @ 2016-12-19 18:30 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/platforms/pseries/ras.c |  4 ++++
 arch/powerpc/platforms/pseries/smp.c | 23 ++++++++++++++++++++++-
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 904a677208d1..bb70b26334f0 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -386,6 +386,10 @@ int pSeries_system_reset_exception(struct pt_regs *regs)
 		}
 		fwnmi_release_errinfo();
 	}
+
+	if (smp_handle_nmi_ipi(regs))
+		return 1;
+
 	return 0; /* need to perform reset */
 }
 
diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
index 0f6522c6518b..fac59c212bc4 100644
--- a/arch/powerpc/platforms/pseries/smp.c
+++ b/arch/powerpc/platforms/pseries/smp.c
@@ -196,6 +196,27 @@ static void pSeries_cause_ipi_mux(int cpu, unsigned long data)
 		xics_cause_ipi(cpu, data);
 }
 
+static int pseries_cause_nmi_ipi(int cpu)
+{
+	int hwcpu;
+
+	if (cpu == NMI_IPI_ALL_OTHERS) {
+		hwcpu = H_SIGNAL_SYS_RESET_ALL_OTHERS;
+	} else {
+		if (cpu < 0) {
+			WARN_ONCE(true, "incorrect cpu parameter %d", cpu);
+			return 0;
+		}
+
+		hwcpu = get_hard_smp_processor_id(cpu);
+	}
+
+	if (plapr_signal_sys_reset(hwcpu) == H_SUCCESS)
+		return 1;
+
+	return 0;
+}
+
 static __init void pSeries_smp_probe(void)
 {
 	xics_smp_probe();
@@ -209,7 +230,7 @@ static __init void pSeries_smp_probe(void)
 static struct smp_ops_t pseries_smp_ops = {
 	.message_pass	= NULL,	/* Use smp_muxed_ipi_message_pass */
 	.cause_ipi	= NULL,	/* Filled at runtime by pSeries_smp_probe() */
-	.cause_nmi_ipi	= NULL,
+	.cause_nmi_ipi	= pseries_cause_nmi_ipi,
 	.probe		= pSeries_smp_probe,
 	.kick_cpu	= smp_pSeries_kick_cpu,
 	.setup_cpu	= smp_setup_cpu,
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 10/10] powerpc: xmon wait for secondaries before sending IPI
  2016-12-19 18:30 [PATCH 00/10] Improving debugging with NMIs Nicholas Piggin
                   ` (8 preceding siblings ...)
  2016-12-19 18:30 ` [PATCH 09/10] powerpc/pseries: implement NMI IPI with H_SIGNAL_SYS_RESET Nicholas Piggin
@ 2016-12-19 18:30 ` Nicholas Piggin
  9 siblings, 0 replies; 15+ messages in thread
From: Nicholas Piggin @ 2016-12-19 18:30 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Nicholas Piggin

An externally triggered system reset (e.g., via QEMU nmi command, or
pseries reset button) can cause system reset interrupts on all CPUs.
In case this causes xmon to be entered, it is undesirable for the primary
(first) CPU into xmon to trigger an NMI IPI to others, because this may
cause a nested system reset interrupt.

So spin for a time waiting for secondaries to join xmon before performing
the NMI IPI, similarly to what the crash dump code does.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/xmon/xmon.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 77a88319f494..9f88ec4af0b3 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -516,14 +516,25 @@ static int xmon_core(struct pt_regs *regs, int fromipi)
 		xmon_owner = cpu;
 		mb();
 		if (ncpus > 1) {
+			/*
+			 * system resets can be triggered on all CPUs, so
+			 * wait for others to come in and avoid the IPI. This
+			 * is similar to crash_kexec_secondary()
+			 */
+			for (timeout = 100000000; timeout != 0; --timeout) {
+				if (cpumask_weight(&cpus_in_xmon) >= ncpus)
+					goto got_cpus;
+				barrier();
+			}
 			smp_send_debugger_break();
 			/* wait for other cpus to come in */
 			for (timeout = 100000000; timeout != 0; --timeout) {
 				if (cpumask_weight(&cpus_in_xmon) >= ncpus)
-					break;
+					goto got_cpus;
 				barrier();
 			}
 		}
+got_cpus:
 		remove_bpts();
 		disable_surveillance();
 		/* for breakpoint or single step, print the current instr. */
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 07/10] powerpc: add NMI IPI infrastructure
  2016-12-19 18:30 ` [PATCH 07/10] powerpc: add NMI IPI infrastructure Nicholas Piggin
@ 2017-02-03 11:45   ` Nicholas Piggin
  2017-02-04  2:49     ` Nicholas Piggin
  0 siblings, 1 reply; 15+ messages in thread
From: Nicholas Piggin @ 2017-02-03 11:45 UTC (permalink / raw)
  To: linuxppc-dev, Michael Ellerman

On Tue, 20 Dec 2016 04:30:08 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> +static void do_smp_send_nmi_ipi(int cpu)
> +{
> +	if (cpu >= 0) {
> +		do_message_pass(cpu, PPC_MSG_NMI_IPI);
> +	} else {
> +		unsigned int c;
> +
> +		for_each_online_cpu(c) {
> +			if (c == raw_smp_processor_id())
> +				continue;
> +			do_message_pass(cpu, PPC_MSG_NMI_IPI);

Okay this has a bug in the fallback path. Needs the following
incremental patch. sysrq+x works okay with this (tested in mambo),
and also recovers with "x" okay.

Thanks,
Nick

---
 arch/powerpc/kernel/smp.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 81256522985d..a9f8c70d1033 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -420,12 +420,12 @@ static void do_smp_send_nmi_ipi(int cpu)
 	if (cpu >= 0) {
 		do_message_pass(cpu, PPC_MSG_NMI_IPI);
 	} else {
-		unsigned int c;
+		int c;
 
 		for_each_online_cpu(c) {
 			if (c == raw_smp_processor_id())
 				continue;
-			do_message_pass(cpu, PPC_MSG_NMI_IPI);
+			do_message_pass(c, PPC_MSG_NMI_IPI);
 		}
 	}
 }
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 07/10] powerpc: add NMI IPI infrastructure
  2017-02-03 11:45   ` Nicholas Piggin
@ 2017-02-04  2:49     ` Nicholas Piggin
  0 siblings, 0 replies; 15+ messages in thread
From: Nicholas Piggin @ 2017-02-04  2:49 UTC (permalink / raw)
  To: linuxppc-dev, Michael Ellerman

On Fri, 3 Feb 2017 21:45:24 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> On Tue, 20 Dec 2016 04:30:08 +1000
> Nicholas Piggin <npiggin@gmail.com> wrote:
> 
> > +static void do_smp_send_nmi_ipi(int cpu)
> > +{
> > +	if (cpu >= 0) {
> > +		do_message_pass(cpu, PPC_MSG_NMI_IPI);
> > +	} else {
> > +		unsigned int c;
> > +
> > +		for_each_online_cpu(c) {
> > +			if (c == raw_smp_processor_id())
> > +				continue;
> > +			do_message_pass(cpu, PPC_MSG_NMI_IPI);  
> 
> Okay this has a bug in the fallback path. Needs the following
> incremental patch. sysrq+x works okay with this (tested in mambo),
> and also recovers with "x" okay.

Also requires removal of get_online_cpus/put_online_cpus.

---
 arch/powerpc/kernel/smp.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 81256522985d..e1a429251a7d 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -448,8 +448,6 @@ static int smp_send_nmi_ipi(int cpu, void (*fn)(struct pt_regs *), u64 delay_us)
 	if (unlikely(!smp_ops))
 		return 0;
 
-	get_online_cpus();
-
 	/* Take the nmi_ipi_busy count/lock with interrupts hard disabled */
 	nmi_ipi_lock_start(&flags);
 	while (nmi_ipi_busy_count) {
@@ -491,8 +489,6 @@ static int smp_send_nmi_ipi(int cpu, void (*fn)(struct pt_regs *), u64 delay_us)
 	nmi_ipi_busy_count--;
 	nmi_ipi_unlock_end(&flags);
 
-	put_online_cpus();
-
 	return ret;
 }
 #endif /* CONFIG_NMI_IPI */
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 06/10] powerpc: nmi_enter for system reset
  2016-12-19 18:30 ` [PATCH 06/10] powerpc: nmi_enter for system reset Nicholas Piggin
@ 2017-02-07  4:06   ` Michael Ellerman
  0 siblings, 0 replies; 15+ messages in thread
From: Michael Ellerman @ 2017-02-07  4:06 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin

Nicholas Piggin <npiggin@gmail.com> writes:

> System reset is a non-maskable interrupt from Linux's point of view
> (occurs under local_irq_disable()), so it should use nmi_enter/exit.
...
> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index 802aa6bbe97b..c65c88fb6482 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -278,6 +278,14 @@ void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
>  
>  void system_reset_exception(struct pt_regs *regs)
>  {
> +	/*
> +	 * Avoid crashes in case of nested NMI exceptions. Recoverability
> +	 * is determined by RI and in_nmi
> +	 */
> +	bool nested = in_nmi();
> +	if (!nested)
> +		nmi_enter();
> +
>  	/* See if any machine dependent calls */
>  	if (ppc_md.system_reset_exception) {
>  		if (ppc_md.system_reset_exception(regs))


This breaks my QS22 (Cell blade), I get lots of RCU stalls such as:

  INFO: rcu_sched self-detected stall on CPU
  	0-...: (5249 ticks this GP) idle=ad6/1/1 softirq=3/3 fqs=3 
  	 (t=5250 jiffies g=-298 c=-299 q=1289)
  rcu_sched kthread starved for 5234 jiffies! g18446744073709551318 c18446744073709551317 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1
  rcu_sched       S    0     8      2 0x00000800
  Call Trace:
  [c0000003fb9d7950] [c000000000014730] .__switch_to+0x218/0x2b0
  [c0000003fb9d7a00] [c0000000006a0668] .__schedule+0x268/0x778
  [c0000003fb9d7ae0] [c0000000006a0bb0] .schedule+0x38/0xb0
  [c0000003fb9d7b60] [c0000000006a7ba4] .schedule_timeout+0x184/0x2f0
  [c0000003fb9d7c50] [c000000000106c5c] .rcu_gp_kthread+0x5ec/0xa60
  [c0000003fb9d7d70] [c0000000000c69d0] .kthread+0x148/0x188
  [c0000003fb9d7e30] [c00000000000ba70] .ret_from_kernel_thread+0x58/0x68

And I never get to userspace.

This is because cbe_system_reset_exception() doesn't like being called
after nmi_enter() - though I don't know exactly what the problem is.

Moving the nmi_enter() after the ppc_md hook (and fixing up the goto
etc.) fixes it, but that's not really a great solution.

I suspect it will also break pasemi, because it does something similar.

I'm not clear on how best to fix it ATM.

cheers

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [01/10] powerpc/64s: add exception macro that does not enable RI
  2016-12-19 18:30 ` [PATCH 01/10] powerpc/64s: add exception macro that does not enable RI Nicholas Piggin
@ 2017-05-01  2:58   ` Michael Ellerman
  0 siblings, 0 replies; 15+ messages in thread
From: Michael Ellerman @ 2017-05-01  2:58 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin

On Mon, 2016-12-19 at 18:30:02 UTC, Nicholas Piggin wrote:
> Subsequent patches will add more non-RI variant exceptions, so
> create a macro for it rather than open-code it.
> 
> This does not change generated instructions.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/83a980f7f4769c0673f0f966350d1d

cheers

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-05-01  2:58 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-19 18:30 [PATCH 00/10] Improving debugging with NMIs Nicholas Piggin
2016-12-19 18:30 ` [PATCH 01/10] powerpc/64s: add exception macro that does not enable RI Nicholas Piggin
2017-05-01  2:58   ` [01/10] " Michael Ellerman
2016-12-19 18:30 ` [PATCH 02/10] powerpc/64s: exception macro for stack frame and initial register save Nicholas Piggin
2016-12-19 18:30 ` [PATCH 03/10] powerpc/64s: fix system reset vs general interrupt reentrancy Nicholas Piggin
2016-12-19 18:30 ` [PATCH 04/10] powerpc/64s: disallow system reset vs system reset reentrancy Nicholas Piggin
2016-12-19 18:30 ` [PATCH 05/10] powerpc/64s: dedicated system reset interrupt stack Nicholas Piggin
2016-12-19 18:30 ` [PATCH 06/10] powerpc: nmi_enter for system reset Nicholas Piggin
2017-02-07  4:06   ` Michael Ellerman
2016-12-19 18:30 ` [PATCH 07/10] powerpc: add NMI IPI infrastructure Nicholas Piggin
2017-02-03 11:45   ` Nicholas Piggin
2017-02-04  2:49     ` Nicholas Piggin
2016-12-19 18:30 ` [PATCH 08/10] powerpc: add struct smp_ops_t.cause_nmi_ipi operation Nicholas Piggin
2016-12-19 18:30 ` [PATCH 09/10] powerpc/pseries: implement NMI IPI with H_SIGNAL_SYS_RESET Nicholas Piggin
2016-12-19 18:30 ` [PATCH 10/10] powerpc: xmon wait for secondaries before sending IPI Nicholas Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).