linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v8 0/5] powerpc/pseries: Machine check handler improvements.
@ 2018-08-19 17:08 Mahesh J Salgaonkar
  2018-08-19 17:08 ` [PATCH v8 1/5] powerpc/pseries: Define MCE error event section Mahesh J Salgaonkar
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Mahesh J Salgaonkar @ 2018-08-19 17:08 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Ananth Narayan, Nicholas Piggin, Laurent Dufour,
	Aneesh Kumar K.V, Michal Suchanek, Michael Ellerman

This patch series includes improvements to the machine check handler
for pSeries. The first 3 patches from the v7 revision are already in powerpc
next; this posting carries the remaining patches with review comments
addressed. The series drops the sysctl knob patch that was proposed in v7.
The SLB recovery code now uses flush_and_reload_slb() from mce_power.c. This
series depends on the patch at http://patchwork.ozlabs.org/patch/958720/ for
successful SLB error recovery.

Patch 1 defines the MCE error event section.
Patch 2 implements a real mode MCE handler and flushes the SLBs on SLB errors.
Patch 3 displays the MCE error details on the console.
Patch 4 saves and dumps the SLB contents on SLB MCE errors to improve
debuggability.
Patch 5 consolidates the MCE early real mode handling code.

Change in V8:
- Move mce error log structure definition to ras.c
- Use flush_and_reload_slb() from mce_power.c.
- Limit the slb saving to single level of mce recursion.
- Move mce_faulty_slbs and slb_save_cache_ptr under CONFIG_PPC_BOOK3S_64
  instead of CONFIG_PPC_PSERIES.
- Drop the sysctl knob patch.

Change in V7:
- Fold Michal's patch into patch 5
- Handle MSR_RI=0 and evil context case in MC handler in patch 5.
- Patch 7: Print slb cache ptr value and slb cache data.
- Move patch 8 to patch 9.
- Introduce patch 8: add a sysctl knob for recovery action on recovered MCEs.

Change in V6:
- Introduce patch 8 to consolidate early real mode handling code.
- Address Nick's comment on erroneous hunk.

Change in V5:
- Use min_t instead of max_t.
- Fix an issue reported by kbuild test robot and address review comments.

Change in V4:
- Flush the SLBs in real mode mce handler to handle SLB errors for entry 0.
- Allocate buffers per cpu to hold rtas error log and old slb contents.
- Defer the logging of rtas error log to irq work queue.

Change in V3:
- Moved patch 5 to patch 2

Change in V2:
- patch 3: Display additional info (NIP and task info) in MCE error details.
- patch 5: Fix endian bug while restoring r3 in MCE handler.

---

Mahesh Salgaonkar (5):
      powerpc/pseries: Define MCE error event section.
      powerpc/pseries: flush SLB contents on SLB MCE errors.
      powerpc/pseries: Display machine check error details.
      powerpc/pseries: Dump the SLB contents on SLB MCE errors.
      powernv/pseries: consolidate code for mce early handling.


 arch/powerpc/include/asm/book3s/64/mmu-hash.h |    7 +
 arch/powerpc/include/asm/machdep.h            |    1 
 arch/powerpc/include/asm/mce.h                |    3 
 arch/powerpc/include/asm/paca.h               |    6 +
 arch/powerpc/include/asm/rtas.h               |   13 +
 arch/powerpc/kernel/exceptions-64s.S          |   42 +++-
 arch/powerpc/kernel/mce.c                     |   15 +
 arch/powerpc/kernel/mce_power.c               |    2 
 arch/powerpc/mm/slb.c                         |   73 ++++++
 arch/powerpc/platforms/powernv/setup.c        |   11 +
 arch/powerpc/platforms/pseries/pseries.h      |    1 
 arch/powerpc/platforms/pseries/ras.c          |  297 +++++++++++++++++++++++++
 arch/powerpc/platforms/pseries/setup.c        |   14 +
 13 files changed, 474 insertions(+), 11 deletions(-)

--
Thanks,
-Mahesh

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v8 1/5] powerpc/pseries: Define MCE error event section.
  2018-08-19 17:08 [PATCH v8 0/5] powerpc/pseries: Machine check handler improvements Mahesh J Salgaonkar
@ 2018-08-19 17:08 ` Mahesh J Salgaonkar
  2018-08-19 17:08 ` [PATCH v8 2/5] powerpc/pseries: flush SLB contents on SLB MCE errors Mahesh J Salgaonkar
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 12+ messages in thread
From: Mahesh J Salgaonkar @ 2018-08-19 17:08 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Ananth Narayan, Nicholas Piggin, Laurent Dufour,
	Aneesh Kumar K.V, Michal Suchanek, Michael Ellerman

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

On pseries, the machine check error details are part of the RTAS extended
event log passed under the Machine Check exception section. This patch adds
the definition of the RTAS MCE event section and related helper
functions.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
Change in V8:
- Move mce error log structure definition to ras.c
---
 arch/powerpc/include/asm/rtas.h      |    8 +++
 arch/powerpc/platforms/pseries/ras.c |   96 ++++++++++++++++++++++++++++++++++
 2 files changed, 104 insertions(+)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 71e393c46a49..adefa6493d29 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -185,6 +185,13 @@ static inline uint8_t rtas_error_disposition(const struct rtas_error_log *elog)
 	return (elog->byte1 & 0x18) >> 3;
 }
 
+static inline
+void rtas_set_disposition_recovered(struct rtas_error_log *elog)
+{
+	elog->byte1 &= ~0x18;
+	elog->byte1 |= (RTAS_DISP_FULLY_RECOVERED << 3);
+}
+
 static inline uint8_t rtas_error_extended(const struct rtas_error_log *elog)
 {
 	return (elog->byte1 & 0x04) >> 2;
@@ -275,6 +282,7 @@ inline uint32_t rtas_ext_event_company_id(struct rtas_ext_event_log_v6 *ext_log)
 #define PSERIES_ELOG_SECT_ID_CALL_HOME		(('C' << 8) | 'H')
 #define PSERIES_ELOG_SECT_ID_USER_DEF		(('U' << 8) | 'D')
 #define PSERIES_ELOG_SECT_ID_HOTPLUG		(('H' << 8) | 'P')
+#define PSERIES_ELOG_SECT_ID_MCE		(('M' << 8) | 'C')
 
 /* Vendor specific Platform Event Log Format, Version 6, section header */
 struct pseries_errorlog {
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 851ce326874a..4a0b201e25aa 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -50,6 +50,102 @@ static irqreturn_t ras_hotplug_interrupt(int irq, void *dev_id);
 static irqreturn_t ras_epow_interrupt(int irq, void *dev_id);
 static irqreturn_t ras_error_interrupt(int irq, void *dev_id);
 
+/* RTAS pseries MCE errorlog section. */
+struct pseries_mc_errorlog {
+	__be32	fru_id;
+	__be32	proc_id;
+	u8	error_type;
+	/*
+	 * sub_err_type (1 byte). Bit fields depend on error_type
+	 *
+	 *   MSB0
+	 *   |
+	 *   V
+	 *   01234567
+	 *   XXXXXXXX
+	 *
+	 * For error_type == MC_ERROR_TYPE_UE
+	 *   XXXXXXXX
+	 *   X		1: Permanent or Transient UE.
+	 *    X		1: Effective address provided.
+	 *     X	1: Logical address provided.
+	 *      XX	2: Reserved.
+	 *        XXX	3: Type of UE error.
+	 *
+	 * For error_type != MC_ERROR_TYPE_UE
+	 *   XXXXXXXX
+	 *   X		1: Effective address provided.
+	 *    XXXXX	5: Reserved.
+	 *         XX	2: Type of SLB/ERAT/TLB error.
+	 */
+	u8	sub_err_type;
+	u8	reserved_1[6];
+	__be64	effective_address;
+	__be64	logical_address;
+} __packed;
+
+/* RTAS pseries MCE error types */
+#define MC_ERROR_TYPE_UE		0x00
+#define MC_ERROR_TYPE_SLB		0x01
+#define MC_ERROR_TYPE_ERAT		0x02
+#define MC_ERROR_TYPE_TLB		0x04
+#define MC_ERROR_TYPE_D_CACHE		0x05
+#define MC_ERROR_TYPE_I_CACHE		0x07
+
+/* RTAS pseries MCE error sub types */
+#define MC_ERROR_UE_INDETERMINATE		0
+#define MC_ERROR_UE_IFETCH			1
+#define MC_ERROR_UE_PAGE_TABLE_WALK_IFETCH	2
+#define MC_ERROR_UE_LOAD_STORE			3
+#define MC_ERROR_UE_PAGE_TABLE_WALK_LOAD_STORE	4
+
+#define MC_ERROR_SLB_PARITY		0
+#define MC_ERROR_SLB_MULTIHIT		1
+#define MC_ERROR_SLB_INDETERMINATE	2
+
+#define MC_ERROR_ERAT_PARITY		1
+#define MC_ERROR_ERAT_MULTIHIT		2
+#define MC_ERROR_ERAT_INDETERMINATE	3
+
+#define MC_ERROR_TLB_PARITY		1
+#define MC_ERROR_TLB_MULTIHIT		2
+#define MC_ERROR_TLB_INDETERMINATE	3
+
+static inline uint8_t rtas_mc_error_sub_type(
+					const struct pseries_mc_errorlog *mlog)
+{
+	switch (mlog->error_type) {
+	case	MC_ERROR_TYPE_UE:
+		return (mlog->sub_err_type & 0x07);
+	case	MC_ERROR_TYPE_SLB:
+	case	MC_ERROR_TYPE_ERAT:
+	case	MC_ERROR_TYPE_TLB:
+		return (mlog->sub_err_type & 0x03);
+	default:
+		return 0;
+	}
+}
+
+static inline uint64_t rtas_mc_get_effective_addr(
+					const struct pseries_mc_errorlog *mlog)
+{
+	__be64 addr = 0;
+
+	switch (mlog->error_type) {
+	case	MC_ERROR_TYPE_UE:
+		if (mlog->sub_err_type & 0x40)
+			addr = mlog->effective_address;
+		break;
+	case	MC_ERROR_TYPE_SLB:
+	case	MC_ERROR_TYPE_ERAT:
+	case	MC_ERROR_TYPE_TLB:
+		if (mlog->sub_err_type & 0x80)
+			addr = mlog->effective_address;
+	default:
+		break;
+	}
+	return be64_to_cpu(addr);
+}
 
 /*
  * Enable the hotplug interrupt late because processing them may touch other


* [PATCH v8 2/5] powerpc/pseries: flush SLB contents on SLB MCE errors.
  2018-08-19 17:08 [PATCH v8 0/5] powerpc/pseries: Machine check handler improvements Mahesh J Salgaonkar
  2018-08-19 17:08 ` [PATCH v8 1/5] powerpc/pseries: Define MCE error event section Mahesh J Salgaonkar
@ 2018-08-19 17:08 ` Mahesh J Salgaonkar
  2018-08-20 10:58   ` Nicholas Piggin
  2018-08-19 17:08 ` [PATCH v8 3/5] powerpc/pseries: Display machine check error details Mahesh J Salgaonkar
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Mahesh J Salgaonkar @ 2018-08-19 17:08 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Ananth Narayan, Nicholas Piggin, Laurent Dufour,
	Aneesh Kumar K.V, Michal Suchanek, Michael Ellerman

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

On pseries, as of today the system crashes if we get a machine check
exception due to SLB errors. These are soft errors and can be fixed by
flushing the SLBs, so the kernel can continue to function instead of
crashing. We do this in real mode before turning on the MMU; otherwise
we would run into nested machine checks. This patch fetches the RTAS
error log in real mode and flushes the SLBs on SLB errors.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michal Suchanek <msuchanek@suse.com>
---

Changes in V8:
- Use flush_and_reload_slb() from mce_power.c.
---
 arch/powerpc/include/asm/machdep.h       |    1 
 arch/powerpc/include/asm/mce.h           |    3 +
 arch/powerpc/kernel/exceptions-64s.S     |  129 ++++++++++++++++++++++++++++++
 arch/powerpc/kernel/mce.c                |   15 +++
 arch/powerpc/kernel/mce_power.c          |    2 
 arch/powerpc/platforms/powernv/setup.c   |   11 +++
 arch/powerpc/platforms/pseries/pseries.h |    1 
 arch/powerpc/platforms/pseries/ras.c     |   54 ++++++++++++-
 arch/powerpc/platforms/pseries/setup.c   |    1 
 9 files changed, 212 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index a47de82fb8e2..b4831f1338db 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -108,6 +108,7 @@ struct machdep_calls {
 
 	/* Early exception handlers called in realmode */
 	int		(*hmi_exception_early)(struct pt_regs *regs);
+	long		(*machine_check_early)(struct pt_regs *regs);
 
 	/* Called during machine check exception to retrive fixup address. */
 	bool		(*mce_check_early_recovery)(struct pt_regs *regs);
diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index 3a1226e9b465..78a1da95a394 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -210,4 +210,7 @@ extern void release_mce_event(void);
 extern void machine_check_queue_event(void);
 extern void machine_check_print_event_info(struct machine_check_event *evt,
 					   bool user_mode);
+#ifdef CONFIG_PPC_BOOK3S_64
+extern void flush_and_reload_slb(void);
+#endif /* CONFIG_PPC_BOOK3S_64 */
 #endif /* __ASM_PPC64_MCE_H__ */
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 285c6465324a..12f056179112 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -332,6 +332,9 @@ TRAMP_REAL_BEGIN(machine_check_pSeries)
 machine_check_fwnmi:
 	SET_SCRATCH0(r13)		/* save r13 */
 	EXCEPTION_PROLOG_0(PACA_EXMC)
+BEGIN_FTR_SECTION
+	b	machine_check_pSeries_early
+END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
 machine_check_pSeries_0:
 	EXCEPTION_PROLOG_1(PACA_EXMC, KVMTEST_PR, 0x200)
 	/*
@@ -343,6 +346,103 @@ machine_check_pSeries_0:
 
 TRAMP_KVM_SKIP(PACA_EXMC, 0x200)
 
+TRAMP_REAL_BEGIN(machine_check_pSeries_early)
+BEGIN_FTR_SECTION
+	EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
+	mr	r10,r1			/* Save r1 */
+	lhz	r11,PACA_IN_MCE(r13)
+	cmpwi	r11,0			/* Are we in nested machine check */
+	bne	0f			/* Yes, we are. */
+	/* First machine check entry */
+	ld	r1,PACAMCEMERGSP(r13)	/* Use MC emergency stack */
+0:	subi	r1,r1,INT_FRAME_SIZE	/* alloc stack frame */
+	addi	r11,r11,1		/* increment paca->in_mce */
+	sth	r11,PACA_IN_MCE(r13)
+	/* Limit nested MCE to level 4 to avoid stack overflow */
+	cmpwi	r11,MAX_MCE_DEPTH
+	bgt	1f			/* Check if we hit limit of 4 */
+	mfspr	r11,SPRN_SRR0		/* Save SRR0 */
+	mfspr	r12,SPRN_SRR1		/* Save SRR1 */
+	EXCEPTION_PROLOG_COMMON_1()
+	EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
+	EXCEPTION_PROLOG_COMMON_3(0x200)
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	BRANCH_LINK_TO_FAR(machine_check_early) /* Function call ABI */
+	ld	r12,_MSR(r1)
+	andi.	r11,r12,MSR_PR		/* See if coming from user. */
+	bne	2f			/* continue in V mode if we are. */
+
+	/*
+	 * At this point we are not sure what context we came from.
+	 * We may be in the middle of switching stacks, so r1 may not be
+	 * valid. Hence stay on the emergency stack, call
+	 * machine_check_exception and return from the interrupt.
+	 * But before that, check if this is an unrecoverable exception.
+	 * If yes, then stay on the emergency stack and panic.
+	 */
+	andi.	r11,r12,MSR_RI
+	beq	1f
+
+	/*
+	 * Check if we have successfully handled/recovered from error, if not
+	 * then stay on emergency stack and panic.
+	 */
+	cmpdi	r3,0		/* see if we handled MCE successfully */
+	beq	1f		/* if !handled then panic */
+
+	/* Stay on emergency stack and return from interrupt. */
+	LOAD_HANDLER(r10,mce_return)
+	mtspr	SPRN_SRR0,r10
+	ld	r10,PACAKMSR(r13)
+	mtspr	SPRN_SRR1,r10
+	RFI_TO_KERNEL
+	b	.
+
+1:	LOAD_HANDLER(r10,unrecover_mce)
+	mtspr	SPRN_SRR0,r10
+	ld	r10,PACAKMSR(r13)
+	/*
+	 * We are going down. But there is a chance that we might get hit
+	 * by another MCE during the panic path and run into an unstable
+	 * state with no way out. Hence, turn the ME bit off while going
+	 * down, so that if another MCE hits during the panic path, the
+	 * hypervisor will power cycle the LPAR instead of getting into
+	 * an MCE loop.
+	 */
+	li	r3,MSR_ME
+	andc	r10,r10,r3		/* Turn off MSR_ME */
+	mtspr	SPRN_SRR1,r10
+	RFI_TO_KERNEL
+	b	.
+
+	/* Move original SRR0 and SRR1 into the respective regs */
+2:	ld	r9,_MSR(r1)
+	mtspr	SPRN_SRR1,r9
+	ld	r3,_NIP(r1)
+	mtspr	SPRN_SRR0,r3
+	ld	r9,_CTR(r1)
+	mtctr	r9
+	ld	r9,_XER(r1)
+	mtxer	r9
+	ld	r9,_LINK(r1)
+	mtlr	r9
+	REST_GPR(0, r1)
+	REST_8GPRS(2, r1)
+	REST_GPR(10, r1)
+	ld	r11,_CCR(r1)
+	mtcr	r11
+	/* Decrement paca->in_mce. */
+	lhz	r12,PACA_IN_MCE(r13)
+	subi	r12,r12,1
+	sth	r12,PACA_IN_MCE(r13)
+	REST_GPR(11, r1)
+	REST_2GPRS(12, r1)
+	/* restore original r1. */
+	ld	r1,GPR1(r1)
+	SET_SCRATCH0(r13)		/* save r13 */
+	EXCEPTION_PROLOG_0(PACA_EXMC)
+	b	machine_check_pSeries_0
+END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
+
 EXC_COMMON_BEGIN(machine_check_common)
 	/*
 	 * Machine check is different because we use a different
@@ -536,6 +636,35 @@ EXC_COMMON_BEGIN(unrecover_mce)
 	bl	unrecoverable_exception
 	b	1b
 
+EXC_COMMON_BEGIN(mce_return)
+	/* Invoke machine_check_exception to print MCE event and return. */
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	bl	machine_check_exception
+	ld	r9,_MSR(r1)
+	mtspr	SPRN_SRR1,r9
+	ld	r3,_NIP(r1)
+	mtspr	SPRN_SRR0,r3
+	ld	r9,_CTR(r1)
+	mtctr	r9
+	ld	r9,_XER(r1)
+	mtxer	r9
+	ld	r9,_LINK(r1)
+	mtlr	r9
+	REST_GPR(0, r1)
+	REST_8GPRS(2, r1)
+	REST_GPR(10, r1)
+	ld	r11,_CCR(r1)
+	mtcr	r11
+	/* Decrement paca->in_mce. */
+	lhz	r12,PACA_IN_MCE(r13)
+	subi	r12,r12,1
+	sth	r12,PACA_IN_MCE(r13)
+	REST_GPR(11, r1)
+	REST_2GPRS(12, r1)
+	/* restore original r1. */
+	ld	r1,GPR1(r1)
+	RFI_TO_KERNEL
+	b	.
 
 EXC_REAL(data_access, 0x300, 0x80)
 EXC_VIRT(data_access, 0x4300, 0x80, 0x300)
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index efdd16a79075..ae17d8aa60c4 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -488,10 +488,19 @@ long machine_check_early(struct pt_regs *regs)
 {
 	long handled = 0;
 
-	__this_cpu_inc(irq_stat.mce_exceptions);
+	/*
+	 * For pSeries we count MCEs when we enter the virtual mode machine
+	 * check handler, hence skip it here. Also, we can't access per-cpu
+	 * variables in real mode on an LPAR.
+	 */
+	if (early_cpu_has_feature(CPU_FTR_HVMODE))
+		__this_cpu_inc(irq_stat.mce_exceptions);
 
-	if (cur_cpu_spec && cur_cpu_spec->machine_check_early)
-		handled = cur_cpu_spec->machine_check_early(regs);
+	/*
+	 * See if platform is capable of handling machine check.
+	 */
+	if (ppc_md.machine_check_early)
+		handled = ppc_md.machine_check_early(regs);
 	return handled;
 }
 
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index 368eb23f27c2..135b0b5a702e 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -60,7 +60,7 @@ static unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
 
 /* flush SLBs and reload */
 #ifdef CONFIG_PPC_BOOK3S_64
-static void flush_and_reload_slb(void)
+void flush_and_reload_slb(void)
 {
 	/* Invalidate all SLBs */
 	slb_flush_all_realmode();
diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index f96df0a25d05..b74c93bc2e55 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -431,6 +431,16 @@ static unsigned long pnv_get_proc_freq(unsigned int cpu)
 	return ret_freq;
 }
 
+static long pnv_machine_check_early(struct pt_regs *regs)
+{
+	long handled = 0;
+
+	if (cur_cpu_spec && cur_cpu_spec->machine_check_early)
+		handled = cur_cpu_spec->machine_check_early(regs);
+
+	return handled;
+}
+
 define_machine(powernv) {
 	.name			= "PowerNV",
 	.probe			= pnv_probe,
@@ -442,6 +452,7 @@ define_machine(powernv) {
 	.machine_shutdown	= pnv_shutdown,
 	.power_save             = NULL,
 	.calibrate_decr		= generic_calibrate_decr,
+	.machine_check_early	= pnv_machine_check_early,
 #ifdef CONFIG_KEXEC_CORE
 	.kexec_cpu_down		= pnv_kexec_cpu_down,
 #endif
diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
index 60db2ee511fb..ec2a5f61d4a4 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -24,6 +24,7 @@ struct pt_regs;
 
 extern int pSeries_system_reset_exception(struct pt_regs *regs);
 extern int pSeries_machine_check_exception(struct pt_regs *regs);
+extern long pSeries_machine_check_realmode(struct pt_regs *regs);
 
 #ifdef CONFIG_SMP
 extern void smp_init_pseries(void);
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 4a0b201e25aa..73500a24e9c2 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -27,6 +27,7 @@
 #include <asm/machdep.h>
 #include <asm/rtas.h>
 #include <asm/firmware.h>
+#include <asm/mce.h>
 
 #include "pseries.h"
 
@@ -523,6 +524,37 @@ int pSeries_system_reset_exception(struct pt_regs *regs)
 	return 0; /* need to perform reset */
 }
 
+static int mce_handle_error(struct rtas_error_log *errp)
+{
+	struct pseries_errorlog *pseries_log;
+	struct pseries_mc_errorlog *mce_log;
+	int disposition = rtas_error_disposition(errp);
+	uint8_t error_type;
+
+	if (!rtas_error_extended(errp))
+		goto out;
+
+	pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE);
+	if (pseries_log == NULL)
+		goto out;
+
+	mce_log = (struct pseries_mc_errorlog *)pseries_log->data;
+	error_type = mce_log->error_type;
+
+#ifdef CONFIG_PPC_BOOK3S_64
+	if ((disposition == RTAS_DISP_NOT_RECOVERED) &&
+			(error_type == MC_ERROR_TYPE_SLB)) {
+		/* Store the old slb content someplace. */
+		flush_and_reload_slb();
+		disposition = RTAS_DISP_FULLY_RECOVERED;
+		rtas_set_disposition_recovered(errp);
+	}
+#endif
+
+out:
+	return disposition;
+}
+
 /*
  * Process MCE rtas errlog event.
  */
@@ -599,11 +631,31 @@ int pSeries_machine_check_exception(struct pt_regs *regs)
 	struct rtas_error_log *errp;
 
 	if (fwnmi_active) {
-		errp = fwnmi_get_errinfo(regs);
 		fwnmi_release_errinfo();
+		errp = fwnmi_get_errlog();
 		if (errp && recover_mce(regs, errp))
 			return 1;
 	}
 
 	return 0;
 }
+
+long pSeries_machine_check_realmode(struct pt_regs *regs)
+{
+	struct rtas_error_log *errp;
+	int disposition;
+
+	if (fwnmi_active) {
+		errp = fwnmi_get_errinfo(regs);
+		/*
+		 * Calling fwnmi_release_errinfo() in real mode causes the
+		 * kernel to panic. Hence we call it as soon as we go into
+		 * virtual mode.
+		 */
+		disposition = mce_handle_error(errp);
+		if (disposition == RTAS_DISP_FULLY_RECOVERED)
+			return 1;
+	}
+
+	return 0;
+}
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index c557f45733bf..cbd1adf3e14f 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -1001,6 +1001,7 @@ define_machine(pseries) {
 	.calibrate_decr		= generic_calibrate_decr,
 	.progress		= rtas_progress,
 	.system_reset_exception = pSeries_system_reset_exception,
+	.machine_check_early	= pSeries_machine_check_realmode,
 	.machine_check_exception = pSeries_machine_check_exception,
 #ifdef CONFIG_KEXEC_CORE
 	.machine_kexec          = pSeries_machine_kexec,


* [PATCH v8 3/5] powerpc/pseries: Display machine check error details.
  2018-08-19 17:08 [PATCH v8 0/5] powerpc/pseries: Machine check handler improvements Mahesh J Salgaonkar
  2018-08-19 17:08 ` [PATCH v8 1/5] powerpc/pseries: Define MCE error event section Mahesh J Salgaonkar
  2018-08-19 17:08 ` [PATCH v8 2/5] powerpc/pseries: flush SLB contents on SLB MCE errors Mahesh J Salgaonkar
@ 2018-08-19 17:08 ` Mahesh J Salgaonkar
  2018-08-19 17:08 ` [PATCH v8 4/5] powerpc/pseries: Dump the SLB contents on SLB MCE errors Mahesh J Salgaonkar
  2018-08-19 17:08 ` [PATCH v8 5/5] powernv/pseries: consolidate code for mce early handling Mahesh J Salgaonkar
  4 siblings, 0 replies; 12+ messages in thread
From: Mahesh J Salgaonkar @ 2018-08-19 17:08 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Ananth Narayan, Nicholas Piggin, Laurent Dufour,
	Aneesh Kumar K.V, Michal Suchanek, Michael Ellerman

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Extract the MCE error details from the RTAS extended log and display them
on the console.

With this patch you should now see MCE logs like the ones below:

[  142.371818] Severe Machine check interrupt [Recovered]
[  142.371822]   NIP [d00000000ca301b8]: init_module+0x1b8/0x338 [bork_kernel]
[  142.371822]   Initiator: CPU
[  142.371823]   Error type: SLB [Multihit]
[  142.371824]     Effective address: d00000000ca70000

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/rtas.h      |    5 +
 arch/powerpc/platforms/pseries/ras.c |  132 ++++++++++++++++++++++++++++++++++
 2 files changed, 137 insertions(+)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index adefa6493d29..0183e9595acc 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -197,6 +197,11 @@ static inline uint8_t rtas_error_extended(const struct rtas_error_log *elog)
 	return (elog->byte1 & 0x04) >> 2;
 }
 
+static inline uint8_t rtas_error_initiator(const struct rtas_error_log *elog)
+{
+	return (elog->byte2 & 0xf0) >> 4;
+}
+
 #define rtas_error_type(x)	((x)->byte3)
 
 static inline
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 73500a24e9c2..d042f852afe9 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -524,6 +524,135 @@ int pSeries_system_reset_exception(struct pt_regs *regs)
 	return 0; /* need to perform reset */
 }
 
+#define VAL_TO_STRING(ar, val)	(((val) < ARRAY_SIZE(ar)) ? (ar)[(val)] : "Unknown")
+
+static void pseries_print_mce_info(struct pt_regs *regs,
+						struct rtas_error_log *errp)
+{
+	const char *level, *sevstr;
+	struct pseries_errorlog *pseries_log;
+	struct pseries_mc_errorlog *mce_log;
+	uint8_t error_type, err_sub_type;
+	uint64_t addr;
+	uint8_t initiator = rtas_error_initiator(errp);
+	int disposition = rtas_error_disposition(errp);
+
+	static const char * const initiators[] = {
+		"Unknown",
+		"CPU",
+		"PCI",
+		"ISA",
+		"Memory",
+		"Power Mgmt",
+	};
+	/* Indexed by error_type; values 0x03 and 0x06 are not defined. */
+	static const char * const mc_err_types[] = {
+		"UE",
+		"SLB",
+		"ERAT",
+		"Unknown",
+		"TLB",
+		"D-Cache",
+		"Unknown",
+		"I-Cache",
+	};
+	static const char * const mc_ue_types[] = {
+		"Indeterminate",
+		"Instruction fetch",
+		"Page table walk ifetch",
+		"Load/Store",
+		"Page table walk Load/Store",
+	};
+
+	/* SLB sub errors valid values are 0x0, 0x1, 0x2 */
+	static const char * const mc_slb_types[] = {
+		"Parity",
+		"Multihit",
+		"Indeterminate",
+	};
+
+	/* TLB and ERAT sub errors valid values are 0x1, 0x2, 0x3 */
+	static const char * const mc_soft_types[] = {
+		"Unknown",
+		"Parity",
+		"Multihit",
+		"Indeterminate",
+	};
+
+	if (!rtas_error_extended(errp)) {
+		pr_err("Machine check interrupt: Missing extended error log\n");
+		return;
+	}
+
+	pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE);
+	if (pseries_log == NULL)
+		return;
+
+	mce_log = (struct pseries_mc_errorlog *)pseries_log->data;
+
+	error_type = mce_log->error_type;
+	err_sub_type = rtas_mc_error_sub_type(mce_log);
+
+	switch (rtas_error_severity(errp)) {
+	case RTAS_SEVERITY_NO_ERROR:
+		level = KERN_INFO;
+		sevstr = "Harmless";
+		break;
+	case RTAS_SEVERITY_WARNING:
+		level = KERN_WARNING;
+		sevstr = "";
+		break;
+	case RTAS_SEVERITY_ERROR:
+	case RTAS_SEVERITY_ERROR_SYNC:
+		level = KERN_ERR;
+		sevstr = "Severe";
+		break;
+	case RTAS_SEVERITY_FATAL:
+	default:
+		level = KERN_ERR;
+		sevstr = "Fatal";
+		break;
+	}
+
+	printk("%s%s Machine check interrupt [%s]\n", level, sevstr,
+		disposition == RTAS_DISP_FULLY_RECOVERED ?
+		"Recovered" : "Not recovered");
+	if (user_mode(regs)) {
+		printk("%s  NIP: [%016lx] PID: %d Comm: %s\n", level,
+			regs->nip, current->pid, current->comm);
+	} else {
+		printk("%s  NIP [%016lx]: %pS\n", level, regs->nip,
+			(void *)regs->nip);
+	}
+	printk("%s  Initiator: %s\n", level,
+				VAL_TO_STRING(initiators, initiator));
+
+	switch (error_type) {
+	case MC_ERROR_TYPE_UE:
+		printk("%s  Error type: %s [%s]\n", level,
+			VAL_TO_STRING(mc_err_types, error_type),
+			VAL_TO_STRING(mc_ue_types, err_sub_type));
+		break;
+	case MC_ERROR_TYPE_SLB:
+		printk("%s  Error type: %s [%s]\n", level,
+			VAL_TO_STRING(mc_err_types, error_type),
+			VAL_TO_STRING(mc_slb_types, err_sub_type));
+		break;
+	case MC_ERROR_TYPE_ERAT:
+	case MC_ERROR_TYPE_TLB:
+		printk("%s  Error type: %s [%s]\n", level,
+			VAL_TO_STRING(mc_err_types, error_type),
+			VAL_TO_STRING(mc_soft_types, err_sub_type));
+		break;
+	default:
+		printk("%s  Error type: %s\n", level,
+			VAL_TO_STRING(mc_err_types, error_type));
+		break;
+	}
+
+	addr = rtas_mc_get_effective_addr(mce_log);
+	if (addr)
+		printk("%s    Effective address: %016llx\n", level, addr);
+}
+
 static int mce_handle_error(struct rtas_error_log *errp)
 {
 	struct pseries_errorlog *pseries_log;
@@ -580,8 +709,11 @@ static int recover_mce(struct pt_regs *regs, struct rtas_error_log *err)
 	int recovered = 0;
 	int disposition = rtas_error_disposition(err);
 
+	pseries_print_mce_info(regs, err);
+
 	if (!(regs->msr & MSR_RI)) {
 		/* If MSR_RI isn't set, we cannot recover */
+		pr_err("Machine check interrupt unrecoverable: MSR(RI=0)\n");
 		recovered = 0;
 
 	} else if (disposition == RTAS_DISP_FULLY_RECOVERED) {


* [PATCH v8 4/5] powerpc/pseries: Dump the SLB contents on SLB MCE errors.
  2018-08-19 17:08 [PATCH v8 0/5] powerpc/pseries: Machine check handler improvements Mahesh J Salgaonkar
                   ` (2 preceding siblings ...)
  2018-08-19 17:08 ` [PATCH v8 3/5] powerpc/pseries: Display machine check error details Mahesh J Salgaonkar
@ 2018-08-19 17:08 ` Mahesh J Salgaonkar
  2018-08-20 11:20   ` Nicholas Piggin
  2018-08-19 17:08 ` [PATCH v8 5/5] powernv/pseries: consolidate code for mce early handling Mahesh J Salgaonkar
  4 siblings, 1 reply; 12+ messages in thread
From: Mahesh J Salgaonkar @ 2018-08-19 17:08 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Aneesh Kumar K.V, Michael Ellerman, Ananth Narayan,
	Nicholas Piggin, Laurent Dufour, Aneesh Kumar K.V,
	Michal Suchanek, Michael Ellerman

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

If we get a machine check exception due to SLB errors, dump the current
SLB contents, which will be very helpful in debugging the root cause of
the SLB errors. Introduce an exclusive per-cpu buffer to hold the faulty
SLB entries. The real mode MCE handler saves the old SLB contents into
this buffer, accessible through the paca, and prints them out later in
virtual mode.

With this patch the console will log SLB contents like below on SLB MCE
errors:

[  507.297236] SLB contents of cpu 0x1
[  507.297237] Last SLB entry inserted at slot 16
[  507.297238] 00 c000000008000000 400ea1b217000500
[  507.297239]   1T  ESID=   c00000  VSID=      ea1b217 LLP:100
[  507.297240] 01 d000000008000000 400d43642f000510
[  507.297242]   1T  ESID=   d00000  VSID=      d43642f LLP:110
[  507.297243] 11 f000000008000000 400a86c85f000500
[  507.297244]   1T  ESID=   f00000  VSID=      a86c85f LLP:100
[  507.297245] 12 00007f0008000000 4008119624000d90
[  507.297246]   1T  ESID=       7f  VSID=      8119624 LLP:110
[  507.297247] 13 0000000018000000 00092885f5150d90
[  507.297247]  256M ESID=        1  VSID=   92885f5150 LLP:110
[  507.297248] 14 0000010008000000 4009e7cb50000d90
[  507.297249]   1T  ESID=        1  VSID=      9e7cb50 LLP:110
[  507.297250] 15 d000000008000000 400d43642f000510
[  507.297251]   1T  ESID=   d00000  VSID=      d43642f LLP:110
[  507.297252] 16 d000000008000000 400d43642f000510
[  507.297253]   1T  ESID=   d00000  VSID=      d43642f LLP:110
[  507.297253] ----------------------------------
[  507.297254] SLB cache ptr value = 3
[  507.297254] Valid SLB cache entries:
[  507.297255] 00 EA[0-35]=    7f000
[  507.297256] 01 EA[0-35]=        1
[  507.297257] 02 EA[0-35]=     1000
[  507.297257] Rest of SLB cache entries:
[  507.297258] 03 EA[0-35]=    7f000
[  507.297258] 04 EA[0-35]=        1
[  507.297259] 05 EA[0-35]=     1000
[  507.297260] 06 EA[0-35]=       12
[  507.297260] 07 EA[0-35]=    7f000

Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---

Changes in V8:
- Limit the slb saving to single level of mce recursion.
- Move mce_faulty_slbs and slb_save_cache_ptr under CONFIG_PPC_BOOK3S_64
  instead of CONFIG_PPC_PSERIES.
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |    7 ++
 arch/powerpc/include/asm/paca.h               |    6 ++
 arch/powerpc/mm/slb.c                         |   73 +++++++++++++++++++++++++
 arch/powerpc/platforms/pseries/ras.c          |   17 +++++-
 arch/powerpc/platforms/pseries/setup.c        |   13 ++++
 5 files changed, 115 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index afca8c11d996..925271d95122 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -485,11 +485,18 @@ static inline void hpte_init_pseries(void) { }
 
 extern void hpte_init_native(void);
 
+struct slb_entry {
+	u64	esid;
+	u64	vsid;
+};
+
 extern void slb_initialize(void);
 extern void slb_flush_and_rebolt(void);
 extern void slb_flush_all_realmode(void);
 extern void __slb_restore_bolted_realmode(void);
 extern void slb_restore_bolted_realmode(void);
+extern void slb_save_contents(struct slb_entry *slb_ptr);
+extern void slb_dump_contents(struct slb_entry *slb_ptr);
 
 extern void slb_vmalloc_update(void);
 extern void slb_set_size(u16 size);
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 7f22929ce915..8767abb521c2 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -255,6 +255,12 @@ struct paca_struct {
 #ifdef CONFIG_PPC_PSERIES
 	u8 *mce_data_buf;		/* buffer to hold per cpu rtas errlog */
 #endif /* CONFIG_PPC_PSERIES */
+
+#ifdef CONFIG_PPC_BOOK3S_64
+	/* Capture SLB related old contents in MCE handler. */
+	struct slb_entry *mce_faulty_slbs;
+	u16 slb_save_cache_ptr;
+#endif /* CONFIG_PPC_BOOK3S_64 */
 } ____cacheline_aligned;
 
 extern void copy_mm_to_paca(struct mm_struct *mm);
diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index 6dd9913425bc..09a2f325b231 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -187,6 +187,79 @@ void slb_flush_and_rebolt(void)
 	get_paca()->slb_cache_ptr = 0;
 }
 
+void slb_save_contents(struct slb_entry *slb_ptr)
+{
+	int i;
+	unsigned long e, v;
+
+	/* Save slb_cache_ptr value. */
+	get_paca()->slb_save_cache_ptr = get_paca()->slb_cache_ptr;
+
+	if (!slb_ptr)
+		return;
+
+	for (i = 0; i < mmu_slb_size; i++) {
+		asm volatile("slbmfee  %0,%1" : "=r" (e) : "r" (i));
+		asm volatile("slbmfev  %0,%1" : "=r" (v) : "r" (i));
+		slb_ptr->esid = e;
+		slb_ptr->vsid = v;
+		slb_ptr++;
+	}
+}
+
+void slb_dump_contents(struct slb_entry *slb_ptr)
+{
+	int i, n;
+	unsigned long e, v;
+	unsigned long llp;
+
+	if (!slb_ptr)
+		return;
+
+	pr_err("SLB contents of cpu 0x%x\n", smp_processor_id());
+	pr_err("Last SLB entry inserted at slot %lld\n", get_paca()->stab_rr);
+
+	for (i = 0; i < mmu_slb_size; i++) {
+		e = slb_ptr->esid;
+		v = slb_ptr->vsid;
+		slb_ptr++;
+
+		if (!e && !v)
+			continue;
+
+		pr_err("%02d %016lx %016lx\n", i, e, v);
+
+		if (!(e & SLB_ESID_V)) {
+			pr_err("\n");
+			continue;
+		}
+		llp = v & SLB_VSID_LLP;
+		if (v & SLB_VSID_B_1T) {
+			pr_err("  1T  ESID=%9lx  VSID=%13lx LLP:%3lx\n",
+				GET_ESID_1T(e),
+				(v & ~SLB_VSID_B) >> SLB_VSID_SHIFT_1T,
+				llp);
+		} else {
+			pr_err(" 256M ESID=%9lx  VSID=%13lx LLP:%3lx\n",
+				GET_ESID(e),
+				(v & ~SLB_VSID_B) >> SLB_VSID_SHIFT,
+				llp);
+		}
+	}
+	pr_err("----------------------------------\n");
+
+	/* Dump slb cache entries as well. */
+	pr_err("SLB cache ptr value = %d\n", get_paca()->slb_save_cache_ptr);
+	pr_err("Valid SLB cache entries:\n");
+	n = min_t(int, get_paca()->slb_save_cache_ptr, SLB_CACHE_ENTRIES);
+	for (i = 0; i < n; i++)
+		pr_err("%02d EA[0-35]=%9x\n", i, get_paca()->slb_cache[i]);
+	pr_err("Rest of SLB cache entries:\n");
+	for (i = n; i < SLB_CACHE_ENTRIES; i++)
+		pr_err("%02d EA[0-35]=%9x\n", i, get_paca()->slb_cache[i]);
+
+}
+
 void slb_vmalloc_update(void)
 {
 	unsigned long vflags;
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index d042f852afe9..61103a3c5651 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -612,6 +612,12 @@ static void pseries_print_mce_info(struct pt_regs *regs,
 		break;
 	}
 
+#ifdef CONFIG_PPC_BOOK3S_64
+	/* Display faulty slb contents for SLB errors. */
+	if (error_type == MC_ERROR_TYPE_SLB)
+		slb_dump_contents(local_paca->mce_faulty_slbs);
+#endif
+
 	printk("%s%s Machine check interrupt [%s]\n", level, sevstr,
 		disposition == RTAS_DISP_FULLY_RECOVERED ?
 		"Recovered" : "Not recovered");
@@ -673,7 +679,16 @@ static int mce_handle_error(struct rtas_error_log *errp)
 #ifdef CONFIG_PPC_BOOK3S_64
 	if ((disposition == RTAS_DISP_NOT_RECOVERED) &&
 			(error_type == MC_ERROR_TYPE_SLB)) {
-		/* Store the old slb content someplace. */
+		/*
+		 * Store the old slb content in paca before flushing. Print
+		 * this when we go to virtual mode.
+		 * There is a chance that we may hit MCE again if there
+		 * is a parity error on the SLB entry we are trying to
+		 * read for saving. Hence limit the slb saving to a single
+		 * level of recursion.
+		 */
+		if (local_paca->in_mce == 1)
+			slb_save_contents(local_paca->mce_faulty_slbs);
 		flush_and_reload_slb();
 		disposition = RTAS_DISP_FULLY_RECOVERED;
 		rtas_set_disposition_recovered(errp);
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index cbd1adf3e14f..47b2b91c759b 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -106,6 +106,10 @@ static void __init fwnmi_init(void)
 	u8 *mce_data_buf;
 	unsigned int i;
 	int nr_cpus = num_possible_cpus();
+#ifdef CONFIG_PPC_BOOK3S_64
+	struct slb_entry *slb_ptr;
+	size_t size;
+#endif
 
 	int ibm_nmi_register = rtas_token("ibm,nmi-register");
 	if (ibm_nmi_register == RTAS_UNKNOWN_SERVICE)
@@ -131,6 +135,15 @@ static void __init fwnmi_init(void)
 		paca_ptrs[i]->mce_data_buf = mce_data_buf +
 						(RTAS_ERROR_LOG_MAX * i);
 	}
+
+#ifdef CONFIG_PPC_BOOK3S_64
+	/* Allocate per cpu slb area to save old slb contents during MCE */
+	size = sizeof(struct slb_entry) * mmu_slb_size * nr_cpus;
+	slb_ptr = __va(memblock_alloc_base(size, sizeof(struct slb_entry),
+							ppc64_rma_size));
+	for_each_possible_cpu(i)
+		paca_ptrs[i]->mce_faulty_slbs = slb_ptr + (mmu_slb_size * i);
+#endif
 }
 
 static void pseries_8259_cascade(struct irq_desc *desc)

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v8 5/5] powernv/pseries: consolidate code for mce early handling.
  2018-08-19 17:08 [PATCH v8 0/5] powerpc/pseries: Machine check handler improvements Mahesh J Salgaonkar
                   ` (3 preceding siblings ...)
  2018-08-19 17:08 ` [PATCH v8 4/5] powerpc/pseries: Dump the SLB contents on SLB MCE errors Mahesh J Salgaonkar
@ 2018-08-19 17:08 ` Mahesh J Salgaonkar
  2018-08-20 11:34   ` Nicholas Piggin
  4 siblings, 1 reply; 12+ messages in thread
From: Mahesh J Salgaonkar @ 2018-08-19 17:08 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Ananth Narayan, Nicholas Piggin, Laurent Dufour,
	Aneesh Kumar K.V, Michal Suchanek, Michael Ellerman

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Now that other platforms also implement a real mode mce handler,
let's consolidate the code by sharing the existing powernv machine check
early code. Rename machine_check_powernv_early to
machine_check_common_early and reuse the code.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/exceptions-64s.S |  155 ++++++----------------------------
 1 file changed, 28 insertions(+), 127 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 12f056179112..2f85a7baf026 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -243,14 +243,13 @@ EXC_REAL_BEGIN(machine_check, 0x200, 0x100)
 	SET_SCRATCH0(r13)		/* save r13 */
 	EXCEPTION_PROLOG_0(PACA_EXMC)
 BEGIN_FTR_SECTION
-	b	machine_check_powernv_early
+	b	machine_check_common_early
 FTR_SECTION_ELSE
 	b	machine_check_pSeries_0
 ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
 EXC_REAL_END(machine_check, 0x200, 0x100)
 EXC_VIRT_NONE(0x4200, 0x100)
-TRAMP_REAL_BEGIN(machine_check_powernv_early)
-BEGIN_FTR_SECTION
+TRAMP_REAL_BEGIN(machine_check_common_early)
 	EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
 	/*
 	 * Register contents:
@@ -306,7 +305,9 @@ BEGIN_FTR_SECTION
 	/* Save r9 through r13 from EXMC save area to stack frame. */
 	EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
 	mfmsr	r11			/* get MSR value */
+BEGIN_FTR_SECTION
 	ori	r11,r11,MSR_ME		/* turn on ME bit */
+END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
 	ori	r11,r11,MSR_RI		/* turn on RI bit */
 	LOAD_HANDLER(r12, machine_check_handle_early)
 1:	mtspr	SPRN_SRR0,r12
@@ -325,7 +326,6 @@ BEGIN_FTR_SECTION
 	andc	r11,r11,r10		/* Turn off MSR_ME */
 	b	1b
 	b	.	/* prevent speculative execution */
-END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
 
 TRAMP_REAL_BEGIN(machine_check_pSeries)
 	.globl machine_check_fwnmi
@@ -333,7 +333,7 @@ machine_check_fwnmi:
 	SET_SCRATCH0(r13)		/* save r13 */
 	EXCEPTION_PROLOG_0(PACA_EXMC)
 BEGIN_FTR_SECTION
-	b	machine_check_pSeries_early
+	b	machine_check_common_early
 END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
 machine_check_pSeries_0:
 	EXCEPTION_PROLOG_1(PACA_EXMC, KVMTEST_PR, 0x200)
@@ -346,103 +346,6 @@ machine_check_pSeries_0:
 
 TRAMP_KVM_SKIP(PACA_EXMC, 0x200)
 
-TRAMP_REAL_BEGIN(machine_check_pSeries_early)
-BEGIN_FTR_SECTION
-	EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
-	mr	r10,r1			/* Save r1 */
-	lhz	r11,PACA_IN_MCE(r13)
-	cmpwi	r11,0			/* Are we in nested machine check */
-	bne	0f			/* Yes, we are. */
-	/* First machine check entry */
-	ld	r1,PACAMCEMERGSP(r13)	/* Use MC emergency stack */
-0:	subi	r1,r1,INT_FRAME_SIZE	/* alloc stack frame */
-	addi	r11,r11,1		/* increment paca->in_mce */
-	sth	r11,PACA_IN_MCE(r13)
-	/* Limit nested MCE to level 4 to avoid stack overflow */
-	cmpwi	r11,MAX_MCE_DEPTH
-	bgt	1f			/* Check if we hit limit of 4 */
-	mfspr	r11,SPRN_SRR0		/* Save SRR0 */
-	mfspr	r12,SPRN_SRR1		/* Save SRR1 */
-	EXCEPTION_PROLOG_COMMON_1()
-	EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
-	EXCEPTION_PROLOG_COMMON_3(0x200)
-	addi	r3,r1,STACK_FRAME_OVERHEAD
-	BRANCH_LINK_TO_FAR(machine_check_early) /* Function call ABI */
-	ld	r12,_MSR(r1)
-	andi.	r11,r12,MSR_PR		/* See if coming from user. */
-	bne	2f			/* continue in V mode if we are. */
-
-	/*
-	 * At this point we are not sure about what context we come from.
-	 * We may be in the middle of switching stack. r1 may not be valid.
-	 * Hence stay on emergency stack, call machine_check_exception and
-	 * return from the interrupt.
-	 * But before that, check if this is an un-recoverable exception.
-	 * If yes, then stay on emergency stack and panic.
-	 */
-	andi.	r11,r12,MSR_RI
-	beq	1f
-
-	/*
-	 * Check if we have successfully handled/recovered from error, if not
-	 * then stay on emergency stack and panic.
-	 */
-	cmpdi	r3,0		/* see if we handled MCE successfully */
-	beq	1f		/* if !handled then panic */
-
-	/* Stay on emergency stack and return from interrupt. */
-	LOAD_HANDLER(r10,mce_return)
-	mtspr	SPRN_SRR0,r10
-	ld	r10,PACAKMSR(r13)
-	mtspr	SPRN_SRR1,r10
-	RFI_TO_KERNEL
-	b	.
-
-1:	LOAD_HANDLER(r10,unrecover_mce)
-	mtspr	SPRN_SRR0,r10
-	ld	r10,PACAKMSR(r13)
-	/*
-	 * We are going down. But there are chances that we might get hit by
-	 * another MCE during panic path and we may run into unstable state
-	 * with no way out. Hence, turn ME bit off while going down, so that
-	 * when another MCE is hit during panic path, hypervisor will
-	 * power cycle the lpar, instead of getting into MCE loop.
-	 */
-	li	r3,MSR_ME
-	andc	r10,r10,r3		/* Turn off MSR_ME */
-	mtspr	SPRN_SRR1,r10
-	RFI_TO_KERNEL
-	b	.
-
-	/* Move original SRR0 and SRR1 into the respective regs */
-2:	ld	r9,_MSR(r1)
-	mtspr	SPRN_SRR1,r9
-	ld	r3,_NIP(r1)
-	mtspr	SPRN_SRR0,r3
-	ld	r9,_CTR(r1)
-	mtctr	r9
-	ld	r9,_XER(r1)
-	mtxer	r9
-	ld	r9,_LINK(r1)
-	mtlr	r9
-	REST_GPR(0, r1)
-	REST_8GPRS(2, r1)
-	REST_GPR(10, r1)
-	ld	r11,_CCR(r1)
-	mtcr	r11
-	/* Decrement paca->in_mce. */
-	lhz	r12,PACA_IN_MCE(r13)
-	subi	r12,r12,1
-	sth	r12,PACA_IN_MCE(r13)
-	REST_GPR(11, r1)
-	REST_2GPRS(12, r1)
-	/* restore original r1. */
-	ld	r1,GPR1(r1)
-	SET_SCRATCH0(r13)		/* save r13 */
-	EXCEPTION_PROLOG_0(PACA_EXMC)
-	b	machine_check_pSeries_0
-END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
-
 EXC_COMMON_BEGIN(machine_check_common)
 	/*
 	 * Machine check is different because we use a different
@@ -541,6 +444,9 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
 	bl	machine_check_early
 	std	r3,RESULT(r1)	/* Save result */
 	ld	r12,_MSR(r1)
+BEGIN_FTR_SECTION
+	b	4f
+END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
 
 #ifdef	CONFIG_PPC_P7_NAP
 	/*
@@ -564,10 +470,11 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
 	 */
 	rldicl.	r11,r12,4,63		/* See if MC hit while in HV mode. */
 	beq	5f
-	andi.	r11,r12,MSR_PR		/* See if coming from user. */
+4:	andi.	r11,r12,MSR_PR		/* See if coming from user. */
 	bne	9f			/* continue in V mode if we are. */
 
 5:
+BEGIN_FTR_SECTION
 #ifdef CONFIG_KVM_BOOK3S_64_HANDLER
 	/*
 	 * We are coming from kernel context. Check if we are coming from
@@ -578,6 +485,7 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
 	cmpwi	r11,0			/* Check if coming from guest */
 	bne	9f			/* continue if we are. */
 #endif
+END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
 	/*
 	 * At this point we are not sure about what context we come from.
 	 * Queue up the MCE event and return from the interrupt.
@@ -611,6 +519,7 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
 	cmpdi	r3,0		/* see if we handled MCE successfully */
 
 	beq	1b		/* if !handled then panic */
+BEGIN_FTR_SECTION
 	/*
 	 * Return from MC interrupt.
 	 * Queue up the MCE event so that we can log it later, while
@@ -619,10 +528,24 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
 	bl	machine_check_queue_event
 	MACHINE_CHECK_HANDLER_WINDUP
 	RFI_TO_USER_OR_KERNEL
+FTR_SECTION_ELSE
+	/*
+	 * pSeries: Return from MC interrupt. Before that stay on emergency
+	 * stack and call machine_check_exception to log the MCE event.
+	 */
+	LOAD_HANDLER(r10,mce_return)
+	mtspr	SPRN_SRR0,r10
+	ld	r10,PACAKMSR(r13)
+	mtspr	SPRN_SRR1,r10
+	RFI_TO_KERNEL
+	b	.
+ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
 9:
 	/* Deliver the machine check to host kernel in V mode. */
 	MACHINE_CHECK_HANDLER_WINDUP
-	b	machine_check_pSeries
+	SET_SCRATCH0(r13)		/* save r13 */
+	EXCEPTION_PROLOG_0(PACA_EXMC)
+	b	machine_check_pSeries_0
 
 EXC_COMMON_BEGIN(unrecover_mce)
 	/* Invoke machine_check_exception to print MCE event and panic. */
@@ -640,29 +563,7 @@ EXC_COMMON_BEGIN(mce_return)
 	/* Invoke machine_check_exception to print MCE event and return. */
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	machine_check_exception
-	ld	r9,_MSR(r1)
-	mtspr	SPRN_SRR1,r9
-	ld	r3,_NIP(r1)
-	mtspr	SPRN_SRR0,r3
-	ld	r9,_CTR(r1)
-	mtctr	r9
-	ld	r9,_XER(r1)
-	mtxer	r9
-	ld	r9,_LINK(r1)
-	mtlr	r9
-	REST_GPR(0, r1)
-	REST_8GPRS(2, r1)
-	REST_GPR(10, r1)
-	ld	r11,_CCR(r1)
-	mtcr	r11
-	/* Decrement paca->in_mce. */
-	lhz	r12,PACA_IN_MCE(r13)
-	subi	r12,r12,1
-	sth	r12,PACA_IN_MCE(r13)
-	REST_GPR(11, r1)
-	REST_2GPRS(12, r1)
-	/* restore original r1. */
-	ld	r1,GPR1(r1)
+	MACHINE_CHECK_HANDLER_WINDUP
 	RFI_TO_KERNEL
 	b	.
 


* Re: [PATCH v8 2/5] powerpc/pseries: flush SLB contents on SLB MCE errors.
  2018-08-19 17:08 ` [PATCH v8 2/5] powerpc/pseries: flush SLB contents on SLB MCE errors Mahesh J Salgaonkar
@ 2018-08-20 10:58   ` Nicholas Piggin
  0 siblings, 0 replies; 12+ messages in thread
From: Nicholas Piggin @ 2018-08-20 10:58 UTC (permalink / raw)
  To: Mahesh J Salgaonkar
  Cc: linuxppc-dev, Michal Suchanek, Ananth Narayan, Laurent Dufour,
	Aneesh Kumar K.V, Michael Ellerman

On Sun, 19 Aug 2018 22:38:17 +0530
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:

> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> 
> On pseries, as of today the system crashes if we get a machine check
> exception due to SLB errors. These are soft errors and can be fixed by
> flushing the SLBs, so the kernel can continue to function instead of
> crashing. We do this in real mode before turning on the MMU; otherwise
> we would run into nested machine checks. This patch fetches the
> rtas error log in real mode and flushes the SLBs on SLB errors.
> 
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> Signed-off-by: Michal Suchanek <msuchanek@suse.com>
> ---
> 
> Changes in V8:
> - Use flush_and_reload_slb() from mce_power.c.
> ---
>  arch/powerpc/include/asm/machdep.h       |    1 
>  arch/powerpc/include/asm/mce.h           |    3 +
>  arch/powerpc/kernel/exceptions-64s.S     |  129 ++++++++++++++++++++++++++++++
>  arch/powerpc/kernel/mce.c                |   15 +++
>  arch/powerpc/kernel/mce_power.c          |    2 
>  arch/powerpc/platforms/powernv/setup.c   |   11 +++
>  arch/powerpc/platforms/pseries/pseries.h |    1 
>  arch/powerpc/platforms/pseries/ras.c     |   54 ++++++++++++-
>  arch/powerpc/platforms/pseries/setup.c   |    1 
>  9 files changed, 212 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
> index a47de82fb8e2..b4831f1338db 100644
> --- a/arch/powerpc/include/asm/machdep.h
> +++ b/arch/powerpc/include/asm/machdep.h
> @@ -108,6 +108,7 @@ struct machdep_calls {
>  
>  	/* Early exception handlers called in realmode */
>  	int		(*hmi_exception_early)(struct pt_regs *regs);
> +	long		(*machine_check_early)(struct pt_regs *regs);
>  
>  	/* Called during machine check exception to retrieve fixup address. */
>  	bool		(*mce_check_early_recovery)(struct pt_regs *regs);
> diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
> index 3a1226e9b465..78a1da95a394 100644
> --- a/arch/powerpc/include/asm/mce.h
> +++ b/arch/powerpc/include/asm/mce.h
> @@ -210,4 +210,7 @@ extern void release_mce_event(void);
>  extern void machine_check_queue_event(void);
>  extern void machine_check_print_event_info(struct machine_check_event *evt,
>  					   bool user_mode);
> +#ifdef CONFIG_PPC_BOOK3S_64
> +extern void flush_and_reload_slb(void);
> +#endif /* CONFIG_PPC_BOOK3S_64 */
>  #endif /* __ASM_PPC64_MCE_H__ */
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 285c6465324a..12f056179112 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -332,6 +332,9 @@ TRAMP_REAL_BEGIN(machine_check_pSeries)
>  machine_check_fwnmi:
>  	SET_SCRATCH0(r13)		/* save r13 */
>  	EXCEPTION_PROLOG_0(PACA_EXMC)
> +BEGIN_FTR_SECTION
> +	b	machine_check_pSeries_early
> +END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
>  machine_check_pSeries_0:
>  	EXCEPTION_PROLOG_1(PACA_EXMC, KVMTEST_PR, 0x200)
>  	/*
> @@ -343,6 +346,103 @@ machine_check_pSeries_0:
>  
>  TRAMP_KVM_SKIP(PACA_EXMC, 0x200)
>  
> +TRAMP_REAL_BEGIN(machine_check_pSeries_early)
> +BEGIN_FTR_SECTION
> +	EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
> +	mr	r10,r1			/* Save r1 */
> +	lhz	r11,PACA_IN_MCE(r13)
> +	cmpwi	r11,0			/* Are we in nested machine check */
> +	bne	0f			/* Yes, we are. */
> +	/* First machine check entry */
> +	ld	r1,PACAMCEMERGSP(r13)	/* Use MC emergency stack */
> +0:	subi	r1,r1,INT_FRAME_SIZE	/* alloc stack frame */
> +	addi	r11,r11,1		/* increment paca->in_mce */
> +	sth	r11,PACA_IN_MCE(r13)
> +	/* Limit nested MCE to level 4 to avoid stack overflow */
> +	cmpwi	r11,MAX_MCE_DEPTH
> +	bgt	1f			/* Check if we hit limit of 4 */
> +	mfspr	r11,SPRN_SRR0		/* Save SRR0 */
> +	mfspr	r12,SPRN_SRR1		/* Save SRR1 */
> +	EXCEPTION_PROLOG_COMMON_1()
> +	EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
> +	EXCEPTION_PROLOG_COMMON_3(0x200)
> +	addi	r3,r1,STACK_FRAME_OVERHEAD
> +	BRANCH_LINK_TO_FAR(machine_check_early) /* Function call ABI */
> +	ld	r12,_MSR(r1)
> +	andi.	r11,r12,MSR_PR		/* See if coming from user. */
> +	bne	2f			/* continue in V mode if we are. */
> +
> +	/*
> +	 * At this point we are not sure about what context we come from.
> +	 * We may be in the middle of switching stack. r1 may not be valid.
> +	 * Hence stay on emergency stack, call machine_check_exception and
> +	 * return from the interrupt.
> +	 * But before that, check if this is an un-recoverable exception.
> +	 * If yes, then stay on emergency stack and panic.
> +	 */
> +	andi.	r11,r12,MSR_RI
> +	beq	1f
> +
> +	/*
> +	 * Check if we have successfully handled/recovered from error, if not
> +	 * then stay on emergency stack and panic.
> +	 */
> +	cmpdi	r3,0		/* see if we handled MCE successfully */
> +	beq	1f		/* if !handled then panic */
> +
> +	/* Stay on emergency stack and return from interrupt. */
> +	LOAD_HANDLER(r10,mce_return)
> +	mtspr	SPRN_SRR0,r10
> +	ld	r10,PACAKMSR(r13)
> +	mtspr	SPRN_SRR1,r10
> +	RFI_TO_KERNEL
> +	b	.
> +
> +1:	LOAD_HANDLER(r10,unrecover_mce)
> +	mtspr	SPRN_SRR0,r10
> +	ld	r10,PACAKMSR(r13)
> +	/*
> +	 * We are going down. But there are chances that we might get hit by
> +	 * another MCE during panic path and we may run into unstable state
> +	 * with no way out. Hence, turn ME bit off while going down, so that
> +	 * when another MCE is hit during panic path, hypervisor will
> +	 * power cycle the lpar, instead of getting into MCE loop.
> +	 */
> +	li	r3,MSR_ME
> +	andc	r10,r10,r3		/* Turn off MSR_ME */
> +	mtspr	SPRN_SRR1,r10
> +	RFI_TO_KERNEL
> +	b	.
> +
> +	/* Move original SRR0 and SRR1 into the respective regs */
> +2:	ld	r9,_MSR(r1)
> +	mtspr	SPRN_SRR1,r9
> +	ld	r3,_NIP(r1)
> +	mtspr	SPRN_SRR0,r3
> +	ld	r9,_CTR(r1)
> +	mtctr	r9
> +	ld	r9,_XER(r1)
> +	mtxer	r9
> +	ld	r9,_LINK(r1)
> +	mtlr	r9
> +	REST_GPR(0, r1)
> +	REST_8GPRS(2, r1)
> +	REST_GPR(10, r1)
> +	ld	r11,_CCR(r1)
> +	mtcr	r11
> +	/* Decrement paca->in_mce. */
> +	lhz	r12,PACA_IN_MCE(r13)
> +	subi	r12,r12,1
> +	sth	r12,PACA_IN_MCE(r13)
> +	REST_GPR(11, r1)
> +	REST_2GPRS(12, r1)
> +	/* restore original r1. */
> +	ld	r1,GPR1(r1)
> +	SET_SCRATCH0(r13)		/* save r13 */
> +	EXCEPTION_PROLOG_0(PACA_EXMC)
> +	b	machine_check_pSeries_0
> +END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
> +
>  EXC_COMMON_BEGIN(machine_check_common)
>  	/*
>  	 * Machine check is different because we use a different
> @@ -536,6 +636,35 @@ EXC_COMMON_BEGIN(unrecover_mce)
>  	bl	unrecoverable_exception
>  	b	1b
>  
> +EXC_COMMON_BEGIN(mce_return)
> +	/* Invoke machine_check_exception to print MCE event and return. */
> +	addi	r3,r1,STACK_FRAME_OVERHEAD
> +	bl	machine_check_exception
> +	ld	r9,_MSR(r1)
> +	mtspr	SPRN_SRR1,r9
> +	ld	r3,_NIP(r1)
> +	mtspr	SPRN_SRR0,r3
> +	ld	r9,_CTR(r1)
> +	mtctr	r9
> +	ld	r9,_XER(r1)
> +	mtxer	r9
> +	ld	r9,_LINK(r1)
> +	mtlr	r9
> +	REST_GPR(0, r1)
> +	REST_8GPRS(2, r1)
> +	REST_GPR(10, r1)
> +	ld	r11,_CCR(r1)
> +	mtcr	r11
> +	/* Decrement paca->in_mce. */
> +	lhz	r12,PACA_IN_MCE(r13)
> +	subi	r12,r12,1
> +	sth	r12,PACA_IN_MCE(r13)
> +	REST_GPR(11, r1)
> +	REST_2GPRS(12, r1)
> +	/* restore original r1. */
> +	ld	r1,GPR1(r1)
> +	RFI_TO_KERNEL
> +	b	.
>  
>  EXC_REAL(data_access, 0x300, 0x80)
>  EXC_VIRT(data_access, 0x4300, 0x80, 0x300)
> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> index efdd16a79075..ae17d8aa60c4 100644
> --- a/arch/powerpc/kernel/mce.c
> +++ b/arch/powerpc/kernel/mce.c
> @@ -488,10 +488,19 @@ long machine_check_early(struct pt_regs *regs)
>  {
>  	long handled = 0;
>  
> -	__this_cpu_inc(irq_stat.mce_exceptions);
> +	/*
> +	 * For pSeries we count mce when we go into virtual mode machine
> +	 * check handler. Hence skip it. Also, we can't access per cpu
> +	 * variables in real mode for LPAR.
> +	 */
> +	if (early_cpu_has_feature(CPU_FTR_HVMODE))
> +		__this_cpu_inc(irq_stat.mce_exceptions);

Could this be moved into powernv's virtual mode handler as well, do you
think?

>  
> -	if (cur_cpu_spec && cur_cpu_spec->machine_check_early)
> -		handled = cur_cpu_spec->machine_check_early(regs);
> +	/*
> +	 * See if platform is capable of handling machine check.
> +	 */
> +	if (ppc_md.machine_check_early)
> +		handled = ppc_md.machine_check_early(regs);
>  	return handled;
>  }
>  
> diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
> index 368eb23f27c2..135b0b5a702e 100644
> --- a/arch/powerpc/kernel/mce_power.c
> +++ b/arch/powerpc/kernel/mce_power.c
> @@ -60,7 +60,7 @@ static unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr)
>  
>  /* flush SLBs and reload */
>  #ifdef CONFIG_PPC_BOOK3S_64
> -static void flush_and_reload_slb(void)
> +void flush_and_reload_slb(void)
>  {
>  	/* Invalidate all SLBs */
>  	slb_flush_all_realmode();
> diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
> index f96df0a25d05..b74c93bc2e55 100644
> --- a/arch/powerpc/platforms/powernv/setup.c
> +++ b/arch/powerpc/platforms/powernv/setup.c
> @@ -431,6 +431,16 @@ static unsigned long pnv_get_proc_freq(unsigned int cpu)
>  	return ret_freq;
>  }
>  
> +static long pnv_machine_check_early(struct pt_regs *regs)
> +{
> +	long handled = 0;
> +
> +	if (cur_cpu_spec && cur_cpu_spec->machine_check_early)
> +		handled = cur_cpu_spec->machine_check_early(regs);
> +
> +	return handled;
> +}
> +
>  define_machine(powernv) {
>  	.name			= "PowerNV",
>  	.probe			= pnv_probe,
> @@ -442,6 +452,7 @@ define_machine(powernv) {
>  	.machine_shutdown	= pnv_shutdown,
>  	.power_save             = NULL,
>  	.calibrate_decr		= generic_calibrate_decr,
> +	.machine_check_early	= pnv_machine_check_early,
>  #ifdef CONFIG_KEXEC_CORE
>  	.kexec_cpu_down		= pnv_kexec_cpu_down,
>  #endif
> diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
> index 60db2ee511fb..ec2a5f61d4a4 100644
> --- a/arch/powerpc/platforms/pseries/pseries.h
> +++ b/arch/powerpc/platforms/pseries/pseries.h
> @@ -24,6 +24,7 @@ struct pt_regs;
>  
>  extern int pSeries_system_reset_exception(struct pt_regs *regs);
>  extern int pSeries_machine_check_exception(struct pt_regs *regs);
> +extern long pSeries_machine_check_realmode(struct pt_regs *regs);
>  
>  #ifdef CONFIG_SMP
>  extern void smp_init_pseries(void);
> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
> index 4a0b201e25aa..73500a24e9c2 100644
> --- a/arch/powerpc/platforms/pseries/ras.c
> +++ b/arch/powerpc/platforms/pseries/ras.c
> @@ -27,6 +27,7 @@
>  #include <asm/machdep.h>
>  #include <asm/rtas.h>
>  #include <asm/firmware.h>
> +#include <asm/mce.h>
>  
>  #include "pseries.h"
>  
> @@ -523,6 +524,37 @@ int pSeries_system_reset_exception(struct pt_regs *regs)
>  	return 0; /* need to perform reset */
>  }
>  
> +static int mce_handle_error(struct rtas_error_log *errp)
> +{
> +	struct pseries_errorlog *pseries_log;
> +	struct pseries_mc_errorlog *mce_log;
> +	int disposition = rtas_error_disposition(errp);
> +	uint8_t error_type;
> +
> +	if (!rtas_error_extended(errp))
> +		goto out;
> +
> +	pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE);
> +	if (pseries_log == NULL)
> +		goto out;
> +
> +	mce_log = (struct pseries_mc_errorlog *)pseries_log->data;
> +	error_type = mce_log->error_type;
> +
> +#ifdef CONFIG_PPC_BOOK3S_64
> +	if ((disposition == RTAS_DISP_NOT_RECOVERED) &&
> +			(error_type == MC_ERROR_TYPE_SLB)) {
> +		/* Store the old slb content someplace. */
> +		flush_and_reload_slb();
> +		disposition = RTAS_DISP_FULLY_RECOVERED;
> +		rtas_set_disposition_recovered(errp);
> +	}
> +#endif

I suppose this is the right thing to do here, and the hardware or
firmware should upgrade to a UE error if this keeps failing?

For a later patch series, but you could flush the ERAT and recover
ERAT errors here too. TLB would be possible in the guest too when
hypervisor allows tlbie access. For phyp presumably the HV should
take care of flushing the TLB and not pass that down to the guest
(unless there is a guest hypercall to flush the TLB).

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

Thanks,
Nick


* Re: [PATCH v8 4/5] powerpc/pseries: Dump the SLB contents on SLB MCE errors.
  2018-08-19 17:08 ` [PATCH v8 4/5] powerpc/pseries: Dump the SLB contents on SLB MCE errors Mahesh J Salgaonkar
@ 2018-08-20 11:20   ` Nicholas Piggin
  0 siblings, 0 replies; 12+ messages in thread
From: Nicholas Piggin @ 2018-08-20 11:20 UTC (permalink / raw)
  To: Mahesh J Salgaonkar
  Cc: linuxppc-dev, Aneesh Kumar K.V, Michael Ellerman, Ananth Narayan,
	Laurent Dufour, Aneesh Kumar K.V, Michal Suchanek

On Sun, 19 Aug 2018 22:38:32 +0530
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:

> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> 
> If we get a machine check exception due to SLB errors, dump the
> current SLB contents, which will be very helpful in debugging the
> root cause of the SLB errors. Introduce an exclusive per-cpu buffer to
> hold the faulty SLB entries. The real mode mce handler saves the old SLB
> contents into this buffer, accessible through the paca, and prints them
> out later in virtual mode.
> 
> With this patch the console will log SLB contents like below on SLB MCE
> errors:
> 
> [  507.297236] SLB contents of cpu 0x1
> [  507.297237] Last SLB entry inserted at slot 16
> [  507.297238] 00 c000000008000000 400ea1b217000500
> [  507.297239]   1T  ESID=   c00000  VSID=      ea1b217 LLP:100
> [  507.297240] 01 d000000008000000 400d43642f000510
> [  507.297242]   1T  ESID=   d00000  VSID=      d43642f LLP:110
> [  507.297243] 11 f000000008000000 400a86c85f000500
> [  507.297244]   1T  ESID=   f00000  VSID=      a86c85f LLP:100
> [  507.297245] 12 00007f0008000000 4008119624000d90
> [  507.297246]   1T  ESID=       7f  VSID=      8119624 LLP:110
> [  507.297247] 13 0000000018000000 00092885f5150d90
> [  507.297247]  256M ESID=        1  VSID=   92885f5150 LLP:110
> [  507.297248] 14 0000010008000000 4009e7cb50000d90
> [  507.297249]   1T  ESID=        1  VSID=      9e7cb50 LLP:110
> [  507.297250] 15 d000000008000000 400d43642f000510
> [  507.297251]   1T  ESID=   d00000  VSID=      d43642f LLP:110
> [  507.297252] 16 d000000008000000 400d43642f000510
> [  507.297253]   1T  ESID=   d00000  VSID=      d43642f LLP:110
> [  507.297253] ----------------------------------
> [  507.297254] SLB cache ptr value = 3
> [  507.297254] Valid SLB cache entries:
> [  507.297255] 00 EA[0-35]=    7f000
> [  507.297256] 01 EA[0-35]=        1
> [  507.297257] 02 EA[0-35]=     1000
> [  507.297257] Rest of SLB cache entries:
> [  507.297258] 03 EA[0-35]=    7f000
> [  507.297258] 04 EA[0-35]=        1
> [  507.297259] 05 EA[0-35]=     1000
> [  507.297260] 06 EA[0-35]=       12
> [  507.297260] 07 EA[0-35]=    7f000
> 
> Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
> 
> Changes in V8:
> - Limit the slb saving to single level of mce recursion.

Thanks, that looks good now.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>


* Re: [PATCH v8 5/5] powernv/pseries: consolidate code for mce early handling.
  2018-08-19 17:08 ` [PATCH v8 5/5] powernv/pseries: consolidate code for mce early handling Mahesh J Salgaonkar
@ 2018-08-20 11:34   ` Nicholas Piggin
  2018-08-23  8:43     ` Mahesh Jagannath Salgaonkar
  0 siblings, 1 reply; 12+ messages in thread
From: Nicholas Piggin @ 2018-08-20 11:34 UTC (permalink / raw)
  To: Mahesh J Salgaonkar
  Cc: linuxppc-dev, Ananth Narayan, Laurent Dufour, Aneesh Kumar K.V,
	Michal Suchanek, Michael Ellerman

On Sun, 19 Aug 2018 22:38:39 +0530
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:

> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> 
> Now that other platforms also implement a real mode mce handler,
> let's consolidate the code by sharing the existing powernv machine check
> early code. Rename machine_check_powernv_early to
> machine_check_common_early and reuse the code.
> 
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
>  arch/powerpc/kernel/exceptions-64s.S |  155 ++++++----------------------------
>  1 file changed, 28 insertions(+), 127 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index 12f056179112..2f85a7baf026 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -243,14 +243,13 @@ EXC_REAL_BEGIN(machine_check, 0x200, 0x100)
>  	SET_SCRATCH0(r13)		/* save r13 */
>  	EXCEPTION_PROLOG_0(PACA_EXMC)
>  BEGIN_FTR_SECTION
> -	b	machine_check_powernv_early
> +	b	machine_check_common_early
>  FTR_SECTION_ELSE
>  	b	machine_check_pSeries_0
>  ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>  EXC_REAL_END(machine_check, 0x200, 0x100)
>  EXC_VIRT_NONE(0x4200, 0x100)
> -TRAMP_REAL_BEGIN(machine_check_powernv_early)
> -BEGIN_FTR_SECTION
> +TRAMP_REAL_BEGIN(machine_check_common_early)
>  	EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
>  	/*
>  	 * Register contents:
> @@ -306,7 +305,9 @@ BEGIN_FTR_SECTION
>  	/* Save r9 through r13 from EXMC save area to stack frame. */
>  	EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
>  	mfmsr	r11			/* get MSR value */
> +BEGIN_FTR_SECTION
>  	ori	r11,r11,MSR_ME		/* turn on ME bit */
> +END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>  	ori	r11,r11,MSR_RI		/* turn on RI bit */
>  	LOAD_HANDLER(r12, machine_check_handle_early)
>  1:	mtspr	SPRN_SRR0,r12
> @@ -325,7 +326,6 @@ BEGIN_FTR_SECTION
>  	andc	r11,r11,r10		/* Turn off MSR_ME */
>  	b	1b
>  	b	.	/* prevent speculative execution */
> -END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>  
>  TRAMP_REAL_BEGIN(machine_check_pSeries)
>  	.globl machine_check_fwnmi
> @@ -333,7 +333,7 @@ machine_check_fwnmi:
>  	SET_SCRATCH0(r13)		/* save r13 */
>  	EXCEPTION_PROLOG_0(PACA_EXMC)
>  BEGIN_FTR_SECTION
> -	b	machine_check_pSeries_early
> +	b	machine_check_common_early
>  END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
>  machine_check_pSeries_0:
>  	EXCEPTION_PROLOG_1(PACA_EXMC, KVMTEST_PR, 0x200)
> @@ -346,103 +346,6 @@ machine_check_pSeries_0:
>  
>  TRAMP_KVM_SKIP(PACA_EXMC, 0x200)
>  
> -TRAMP_REAL_BEGIN(machine_check_pSeries_early)
> -BEGIN_FTR_SECTION
> -	EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
> -	mr	r10,r1			/* Save r1 */
> -	lhz	r11,PACA_IN_MCE(r13)
> -	cmpwi	r11,0			/* Are we in nested machine check */
> -	bne	0f			/* Yes, we are. */
> -	/* First machine check entry */
> -	ld	r1,PACAMCEMERGSP(r13)	/* Use MC emergency stack */
> -0:	subi	r1,r1,INT_FRAME_SIZE	/* alloc stack frame */
> -	addi	r11,r11,1		/* increment paca->in_mce */
> -	sth	r11,PACA_IN_MCE(r13)
> -	/* Limit nested MCE to level 4 to avoid stack overflow */
> -	cmpwi	r11,MAX_MCE_DEPTH
> -	bgt	1f			/* Check if we hit limit of 4 */
> -	mfspr	r11,SPRN_SRR0		/* Save SRR0 */
> -	mfspr	r12,SPRN_SRR1		/* Save SRR1 */
> -	EXCEPTION_PROLOG_COMMON_1()
> -	EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
> -	EXCEPTION_PROLOG_COMMON_3(0x200)
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	BRANCH_LINK_TO_FAR(machine_check_early) /* Function call ABI */
> -	ld	r12,_MSR(r1)
> -	andi.	r11,r12,MSR_PR		/* See if coming from user. */
> -	bne	2f			/* continue in V mode if we are. */
> -
> -	/*
> -	 * At this point we are not sure about what context we come from.
> -	 * We may be in the middle of swithing stack. r1 may not be valid.
> -	 * Hence stay on emergency stack, call machine_check_exception and
> -	 * return from the interrupt.
> -	 * But before that, check if this is an un-recoverable exception.
> -	 * If yes, then stay on emergency stack and panic.
> -	 */
> -	andi.	r11,r12,MSR_RI
> -	beq	1f
> -
> -	/*
> -	 * Check if we have successfully handled/recovered from error, if not
> -	 * then stay on emergency stack and panic.
> -	 */
> -	cmpdi	r3,0		/* see if we handled MCE successfully */
> -	beq	1f		/* if !handled then panic */
> -
> -	/* Stay on emergency stack and return from interrupt. */
> -	LOAD_HANDLER(r10,mce_return)
> -	mtspr	SPRN_SRR0,r10
> -	ld	r10,PACAKMSR(r13)
> -	mtspr	SPRN_SRR1,r10
> -	RFI_TO_KERNEL
> -	b	.
> -
> -1:	LOAD_HANDLER(r10,unrecover_mce)
> -	mtspr	SPRN_SRR0,r10
> -	ld	r10,PACAKMSR(r13)
> -	/*
> -	 * We are going down. But there are chances that we might get hit by
> -	 * another MCE during panic path and we may run into unstable state
> -	 * with no way out. Hence, turn ME bit off while going down, so that
> -	 * when another MCE is hit during panic path, hypervisor will
> -	 * power cycle the lpar, instead of getting into MCE loop.
> -	 */
> -	li	r3,MSR_ME
> -	andc	r10,r10,r3		/* Turn off MSR_ME */
> -	mtspr	SPRN_SRR1,r10
> -	RFI_TO_KERNEL
> -	b	.
> -
> -	/* Move original SRR0 and SRR1 into the respective regs */
> -2:	ld	r9,_MSR(r1)
> -	mtspr	SPRN_SRR1,r9
> -	ld	r3,_NIP(r1)
> -	mtspr	SPRN_SRR0,r3
> -	ld	r9,_CTR(r1)
> -	mtctr	r9
> -	ld	r9,_XER(r1)
> -	mtxer	r9
> -	ld	r9,_LINK(r1)
> -	mtlr	r9
> -	REST_GPR(0, r1)
> -	REST_8GPRS(2, r1)
> -	REST_GPR(10, r1)
> -	ld	r11,_CCR(r1)
> -	mtcr	r11
> -	/* Decrement paca->in_mce. */
> -	lhz	r12,PACA_IN_MCE(r13)
> -	subi	r12,r12,1
> -	sth	r12,PACA_IN_MCE(r13)
> -	REST_GPR(11, r1)
> -	REST_2GPRS(12, r1)
> -	/* restore original r1. */
> -	ld	r1,GPR1(r1)
> -	SET_SCRATCH0(r13)		/* save r13 */
> -	EXCEPTION_PROLOG_0(PACA_EXMC)
> -	b	machine_check_pSeries_0
> -END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
> -
>  EXC_COMMON_BEGIN(machine_check_common)
>  	/*
>  	 * Machine check is different because we use a different
> @@ -541,6 +444,9 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>  	bl	machine_check_early
>  	std	r3,RESULT(r1)	/* Save result */
>  	ld	r12,_MSR(r1)
> +BEGIN_FTR_SECTION
> +	b	4f
> +END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
>  
>  #ifdef	CONFIG_PPC_P7_NAP
>  	/*
> @@ -564,10 +470,11 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>  	 */
>  	rldicl.	r11,r12,4,63		/* See if MC hit while in HV mode. */
>  	beq	5f
> -	andi.	r11,r12,MSR_PR		/* See if coming from user. */
> +4:	andi.	r11,r12,MSR_PR		/* See if coming from user. */
>  	bne	9f			/* continue in V mode if we are. */
>  
>  5:
> +BEGIN_FTR_SECTION
>  #ifdef CONFIG_KVM_BOOK3S_64_HANDLER
>  	/*
>  	 * We are coming from kernel context. Check if we are coming from
> @@ -578,6 +485,7 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>  	cmpwi	r11,0			/* Check if coming from guest */
>  	bne	9f			/* continue if we are. */
>  #endif
> +END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)

Put these inside the ifdef?


>  	/*
>  	 * At this point we are not sure about what context we come from.
>  	 * Queue up the MCE event and return from the interrupt.
> @@ -611,6 +519,7 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>  	cmpdi	r3,0		/* see if we handled MCE successfully */
>  
>  	beq	1b		/* if !handled then panic */
> +BEGIN_FTR_SECTION
>  	/*
>  	 * Return from MC interrupt.
>  	 * Queue up the MCE event so that we can log it later, while
> @@ -619,10 +528,24 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>  	bl	machine_check_queue_event
>  	MACHINE_CHECK_HANDLER_WINDUP
>  	RFI_TO_USER_OR_KERNEL
> +FTR_SECTION_ELSE
> +	/*
> +	 * pSeries: Return from MC interrupt. Before that stay on emergency
> +	 * stack and call machine_check_exception to log the MCE event.
> +	 */
> +	LOAD_HANDLER(r10,mce_return)
> +	mtspr	SPRN_SRR0,r10
> +	ld	r10,PACAKMSR(r13)
> +	mtspr	SPRN_SRR1,r10
> +	RFI_TO_KERNEL
> +	b	.
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)

Do you still need mce_return? Why can't you consolidate it as well? ...
Hmm, okay, so now that I look back at patch 2, I don't think you should
call machine_check_exception there. You're supposed to call
machine_check_queue_event here, and it will be handled by irq work.

I think when you do that, more of this code should fall out and be
consolidated. Sorry for not picking that up earlier.

Thanks,
Nick


* Re: [PATCH v8 5/5] powernv/pseries: consolidate code for mce early handling.
  2018-08-20 11:34   ` Nicholas Piggin
@ 2018-08-23  8:43     ` Mahesh Jagannath Salgaonkar
  2018-08-23  9:02       ` Nicholas Piggin
  0 siblings, 1 reply; 12+ messages in thread
From: Mahesh Jagannath Salgaonkar @ 2018-08-23  8:43 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: linuxppc-dev, Ananth Narayan, Laurent Dufour, Aneesh Kumar K.V,
	Michal Suchanek, Michael Ellerman

On 08/20/2018 05:04 PM, Nicholas Piggin wrote:
> On Sun, 19 Aug 2018 22:38:39 +0530
> Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:
> 
>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>
>> Now that other platforms also implement a real mode MCE handler,
>> let's consolidate the code by sharing the existing powernv machine
>> check early code. Rename machine_check_powernv_early to
>> machine_check_common_early and reuse the code.
>>
>> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>> [...]
>> +BEGIN_FTR_SECTION
>>  	/*
>>  	 * Return from MC interrupt.
>>  	 * Queue up the MCE event so that we can log it later, while
>> @@ -619,10 +528,24 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>>  	bl	machine_check_queue_event
>>  	MACHINE_CHECK_HANDLER_WINDUP
>>  	RFI_TO_USER_OR_KERNEL
>> +FTR_SECTION_ELSE
>> +	/*
>> +	 * pSeries: Return from MC interrupt. Before that stay on emergency
>> +	 * stack and call machine_check_exception to log the MCE event.
>> +	 */
>> +	LOAD_HANDLER(r10,mce_return)
>> +	mtspr	SPRN_SRR0,r10
>> +	ld	r10,PACAKMSR(r13)
>> +	mtspr	SPRN_SRR1,r10
>> +	RFI_TO_KERNEL
>> +	b	.
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> 
> Do you still need mce_return? Why can't you consolidate it as well? ...
> Hmm, okay so now I look back at patch 2, I don't think you should call
> machine_check_exception there. You're supposed to call
> machine_check_queue_event here and it will be handled by irq work.

machine_check_queue_event() does not handle the RTAS MCE event. Also,
we need to call fwnmi_release_errinfo() as early as possible, which is
why I am calling machine_check_exception() in the mce_return path for
pSeries. Otherwise, if we get another MCE before fwnmi_release_errinfo()
is called, the LPAR will get rebooted without any logs getting printed.

Thanks,
-Mahesh.


* Re: [PATCH v8 5/5] powernv/pseries: consolidate code for mce early handling.
  2018-08-23  8:43     ` Mahesh Jagannath Salgaonkar
@ 2018-08-23  9:02       ` Nicholas Piggin
  2018-08-27 10:32         ` Mahesh Jagannath Salgaonkar
  0 siblings, 1 reply; 12+ messages in thread
From: Nicholas Piggin @ 2018-08-23  9:02 UTC (permalink / raw)
  To: Mahesh Jagannath Salgaonkar
  Cc: linuxppc-dev, Ananth Narayan, Laurent Dufour, Aneesh Kumar K.V,
	Michal Suchanek, Michael Ellerman

On Thu, 23 Aug 2018 14:13:13 +0530
Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:

> On 08/20/2018 05:04 PM, Nicholas Piggin wrote:
> > On Sun, 19 Aug 2018 22:38:39 +0530
> > Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:
> >> [...]
> > 
> > Do you still need mce_return? Why can't you consolidate it as well? ...
> > Hmm, okay so now I look back at patch 2, I don't think you should call
> > machine_check_exception there. You're supposed to call
> > machine_check_queue_event here and it will be handled by irq work.  
> 
> machine_check_queue_event does not handle RTAS mce event.

Yes it would need a bit of work.

> Also, we need
> to call fwnmi_release_errinfo() as early as possible which is why I am
> calling machine_check_exception() in mce_return path for pSeries.
> Otherwise if we get another MCE before calling fwnmi_release_errinfo()
> then lpar will get rebooted without any logs getting printed.

I think you can call that in your early handler, but then defer
the printing to the irq work.

Although hmm, maybe that's less of a problem now that we do nmi_enter
in the machine check exception, so I think printk will use an NMI-safe
buffer.

We actually have to be careful of the soft-irq state if we take a
machine check in an un-reconciled state or in the middle of the irq
replay code; I'm not sure we do the right thing there, but that would
be a bug in the existing code too. And we definitely have MSR[RI] vs
DAR/DSISR bugs in the existing code, sigh.

I don't know... maybe just push what you have and we'll try to do
some more fixes and cleanups on top of that.

Thanks,
Nick


* Re: [PATCH v8 5/5] powernv/pseries: consolidate code for mce early handling.
  2018-08-23  9:02       ` Nicholas Piggin
@ 2018-08-27 10:32         ` Mahesh Jagannath Salgaonkar
  0 siblings, 0 replies; 12+ messages in thread
From: Mahesh Jagannath Salgaonkar @ 2018-08-27 10:32 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: linuxppc-dev, Ananth Narayan, Laurent Dufour, Aneesh Kumar K.V,
	Michal Suchanek, Michael Ellerman

On 08/23/2018 02:32 PM, Nicholas Piggin wrote:
> On Thu, 23 Aug 2018 14:13:13 +0530
> Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:
> 
>> On 08/20/2018 05:04 PM, Nicholas Piggin wrote:
>>> On Sun, 19 Aug 2018 22:38:39 +0530
>>> Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:
>>>   
>>>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>>>
>>>> Now that other platforms also implement a real mode MCE handler,
>>>> let's consolidate the code by sharing the existing powernv machine
>>>> check early code. Rename machine_check_powernv_early to
>>>> machine_check_common_early and reuse the code.
>>>>
>>>> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>>> ---
>>>>  arch/powerpc/kernel/exceptions-64s.S |  155 ++++++----------------------------
>>>>  1 file changed, 28 insertions(+), 127 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>>>> index 12f056179112..2f85a7baf026 100644
>>>> --- a/arch/powerpc/kernel/exceptions-64s.S
>>>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>>>> @@ -243,14 +243,13 @@ EXC_REAL_BEGIN(machine_check, 0x200, 0x100)
>>>>  	SET_SCRATCH0(r13)		/* save r13 */
>>>>  	EXCEPTION_PROLOG_0(PACA_EXMC)
>>>>  BEGIN_FTR_SECTION
>>>> -	b	machine_check_powernv_early
>>>> +	b	machine_check_common_early
>>>>  FTR_SECTION_ELSE
>>>>  	b	machine_check_pSeries_0
>>>>  ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>>>>  EXC_REAL_END(machine_check, 0x200, 0x100)
>>>>  EXC_VIRT_NONE(0x4200, 0x100)
>>>> -TRAMP_REAL_BEGIN(machine_check_powernv_early)
>>>> -BEGIN_FTR_SECTION
>>>> +TRAMP_REAL_BEGIN(machine_check_common_early)
>>>>  	EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
>>>>  	/*
>>>>  	 * Register contents:
>>>> @@ -306,7 +305,9 @@ BEGIN_FTR_SECTION
>>>>  	/* Save r9 through r13 from EXMC save area to stack frame. */
>>>>  	EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
>>>>  	mfmsr	r11			/* get MSR value */
>>>> +BEGIN_FTR_SECTION
>>>>  	ori	r11,r11,MSR_ME		/* turn on ME bit */
>>>> +END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>>>>  	ori	r11,r11,MSR_RI		/* turn on RI bit */
>>>>  	LOAD_HANDLER(r12, machine_check_handle_early)
>>>>  1:	mtspr	SPRN_SRR0,r12
>>>> @@ -325,7 +326,6 @@ BEGIN_FTR_SECTION
>>>>  	andc	r11,r11,r10		/* Turn off MSR_ME */
>>>>  	b	1b
>>>>  	b	.	/* prevent speculative execution */
>>>> -END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>>>>  
>>>>  TRAMP_REAL_BEGIN(machine_check_pSeries)
>>>>  	.globl machine_check_fwnmi
>>>> @@ -333,7 +333,7 @@ machine_check_fwnmi:
>>>>  	SET_SCRATCH0(r13)		/* save r13 */
>>>>  	EXCEPTION_PROLOG_0(PACA_EXMC)
>>>>  BEGIN_FTR_SECTION
>>>> -	b	machine_check_pSeries_early
>>>> +	b	machine_check_common_early
>>>>  END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
>>>>  machine_check_pSeries_0:
>>>>  	EXCEPTION_PROLOG_1(PACA_EXMC, KVMTEST_PR, 0x200)
>>>> @@ -346,103 +346,6 @@ machine_check_pSeries_0:
>>>>  
>>>>  TRAMP_KVM_SKIP(PACA_EXMC, 0x200)
>>>>  
>>>> -TRAMP_REAL_BEGIN(machine_check_pSeries_early)
>>>> -BEGIN_FTR_SECTION
>>>> -	EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
>>>> -	mr	r10,r1			/* Save r1 */
>>>> -	lhz	r11,PACA_IN_MCE(r13)
>>>> -	cmpwi	r11,0			/* Are we in nested machine check */
>>>> -	bne	0f			/* Yes, we are. */
>>>> -	/* First machine check entry */
>>>> -	ld	r1,PACAMCEMERGSP(r13)	/* Use MC emergency stack */
>>>> -0:	subi	r1,r1,INT_FRAME_SIZE	/* alloc stack frame */
>>>> -	addi	r11,r11,1		/* increment paca->in_mce */
>>>> -	sth	r11,PACA_IN_MCE(r13)
>>>> -	/* Limit nested MCE to level 4 to avoid stack overflow */
>>>> -	cmpwi	r11,MAX_MCE_DEPTH
>>>> -	bgt	1f			/* Check if we hit limit of 4 */
>>>> -	mfspr	r11,SPRN_SRR0		/* Save SRR0 */
>>>> -	mfspr	r12,SPRN_SRR1		/* Save SRR1 */
>>>> -	EXCEPTION_PROLOG_COMMON_1()
>>>> -	EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
>>>> -	EXCEPTION_PROLOG_COMMON_3(0x200)
>>>> -	addi	r3,r1,STACK_FRAME_OVERHEAD
>>>> -	BRANCH_LINK_TO_FAR(machine_check_early) /* Function call ABI */
>>>> -	ld	r12,_MSR(r1)
>>>> -	andi.	r11,r12,MSR_PR		/* See if coming from user. */
>>>> -	bne	2f			/* continue in V mode if we are. */
>>>> -
>>>> -	/*
>>>> -	 * At this point we are not sure about what context we come from.
>>>> -	 * We may be in the middle of swithing stack. r1 may not be valid.
>>>> -	 * Hence stay on emergency stack, call machine_check_exception and
>>>> -	 * return from the interrupt.
>>>> -	 * But before that, check if this is an un-recoverable exception.
>>>> -	 * If yes, then stay on emergency stack and panic.
>>>> -	 */
>>>> -	andi.	r11,r12,MSR_RI
>>>> -	beq	1f
>>>> -
>>>> -	/*
>>>> -	 * Check if we have successfully handled/recovered from error, if not
>>>> -	 * then stay on emergency stack and panic.
>>>> -	 */
>>>> -	cmpdi	r3,0		/* see if we handled MCE successfully */
>>>> -	beq	1f		/* if !handled then panic */
>>>> -
>>>> -	/* Stay on emergency stack and return from interrupt. */
>>>> -	LOAD_HANDLER(r10,mce_return)
>>>> -	mtspr	SPRN_SRR0,r10
>>>> -	ld	r10,PACAKMSR(r13)
>>>> -	mtspr	SPRN_SRR1,r10
>>>> -	RFI_TO_KERNEL
>>>> -	b	.
>>>> -
>>>> -1:	LOAD_HANDLER(r10,unrecover_mce)
>>>> -	mtspr	SPRN_SRR0,r10
>>>> -	ld	r10,PACAKMSR(r13)
>>>> -	/*
>>>> -	 * We are going down. But there is a chance that we might be hit by
>>>> -	 * another MCE during the panic path and run into an unstable state
>>>> -	 * with no way out. Hence, turn the ME bit off while going down, so
>>>> -	 * that when another MCE is hit during the panic path, the hypervisor
>>>> -	 * will power cycle the lpar instead of getting into an MCE loop.
>>>> -	 */
>>>> -	li	r3,MSR_ME
>>>> -	andc	r10,r10,r3		/* Turn off MSR_ME */
>>>> -	mtspr	SPRN_SRR1,r10
>>>> -	RFI_TO_KERNEL
>>>> -	b	.
>>>> -
>>>> -	/* Move original SRR0 and SRR1 into the respective regs */
>>>> -2:	ld	r9,_MSR(r1)
>>>> -	mtspr	SPRN_SRR1,r9
>>>> -	ld	r3,_NIP(r1)
>>>> -	mtspr	SPRN_SRR0,r3
>>>> -	ld	r9,_CTR(r1)
>>>> -	mtctr	r9
>>>> -	ld	r9,_XER(r1)
>>>> -	mtxer	r9
>>>> -	ld	r9,_LINK(r1)
>>>> -	mtlr	r9
>>>> -	REST_GPR(0, r1)
>>>> -	REST_8GPRS(2, r1)
>>>> -	REST_GPR(10, r1)
>>>> -	ld	r11,_CCR(r1)
>>>> -	mtcr	r11
>>>> -	/* Decrement paca->in_mce. */
>>>> -	lhz	r12,PACA_IN_MCE(r13)
>>>> -	subi	r12,r12,1
>>>> -	sth	r12,PACA_IN_MCE(r13)
>>>> -	REST_GPR(11, r1)
>>>> -	REST_2GPRS(12, r1)
>>>> -	/* restore original r1. */
>>>> -	ld	r1,GPR1(r1)
>>>> -	SET_SCRATCH0(r13)		/* save r13 */
>>>> -	EXCEPTION_PROLOG_0(PACA_EXMC)
>>>> -	b	machine_check_pSeries_0
>>>> -END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
>>>> -
>>>>  EXC_COMMON_BEGIN(machine_check_common)
>>>>  	/*
>>>>  	 * Machine check is different because we use a different
>>>> @@ -541,6 +444,9 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>>>>  	bl	machine_check_early
>>>>  	std	r3,RESULT(r1)	/* Save result */
>>>>  	ld	r12,_MSR(r1)
>>>> +BEGIN_FTR_SECTION
>>>> +	b	4f
>>>> +END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
>>>>  
>>>>  #ifdef	CONFIG_PPC_P7_NAP
>>>>  	/*
>>>> @@ -564,10 +470,11 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>>>>  	 */
>>>>  	rldicl.	r11,r12,4,63		/* See if MC hit while in HV mode. */
>>>>  	beq	5f
>>>> -	andi.	r11,r12,MSR_PR		/* See if coming from user. */
>>>> +4:	andi.	r11,r12,MSR_PR		/* See if coming from user. */
>>>>  	bne	9f			/* continue in V mode if we are. */
>>>>  
>>>>  5:
>>>> +BEGIN_FTR_SECTION
>>>>  #ifdef CONFIG_KVM_BOOK3S_64_HANDLER
>>>>  	/*
>>>>  	 * We are coming from kernel context. Check if we are coming from
>>>> @@ -578,6 +485,7 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>>>>  	cmpwi	r11,0			/* Check if coming from guest */
>>>>  	bne	9f			/* continue if we are. */
>>>>  #endif
>>>> +END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)  
>>>
>>> Put these inside the ifdef?
>>>
>>>   
>>>>  	/*
>>>>  	 * At this point we are not sure about what context we come from.
>>>>  	 * Queue up the MCE event and return from the interrupt.
>>>> @@ -611,6 +519,7 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>>>>  	cmpdi	r3,0		/* see if we handled MCE successfully */
>>>>  
>>>>  	beq	1b		/* if !handled then panic */
>>>> +BEGIN_FTR_SECTION
>>>>  	/*
>>>>  	 * Return from MC interrupt.
>>>>  	 * Queue up the MCE event so that we can log it later, while
>>>> @@ -619,10 +528,24 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>>>>  	bl	machine_check_queue_event
>>>>  	MACHINE_CHECK_HANDLER_WINDUP
>>>>  	RFI_TO_USER_OR_KERNEL
>>>> +FTR_SECTION_ELSE
>>>> +	/*
>>>> +	 * pSeries: Return from MC interrupt. Before that stay on emergency
>>>> +	 * stack and call machine_check_exception to log the MCE event.
>>>> +	 */
>>>> +	LOAD_HANDLER(r10,mce_return)
>>>> +	mtspr	SPRN_SRR0,r10
>>>> +	ld	r10,PACAKMSR(r13)
>>>> +	mtspr	SPRN_SRR1,r10
>>>> +	RFI_TO_KERNEL
>>>> +	b	.
>>>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)  
>>>
>>> Do you still need mce_return? Why can't you consolidate it as well? ...
>>> Hmm, okay so now I look back at patch 2, I don't think you should call
>>> machine_check_exception there. You're supposed to call
>>> machine_check_queue_event here and it will be handled by irq work.  
>>
>> machine_check_queue_event does not handle the RTAS MCE event.
> 
> Yes it would need a bit of work.
> 
>> Also, we need
>> to call fwnmi_release_errinfo() as early as possible, which is why I am
>> calling machine_check_exception() in the mce_return path for pSeries.
>> Otherwise, if we get another MCE before fwnmi_release_errinfo() is
>> called, the LPAR will be rebooted without any logs being printed.
> 
> I think you can call that in your early handler, but then defer
> the printing to the irq work.
> 
> Although hmm, maybe that's less of a problem now that we do
> nmi_enter() in the machine check exception, so I think printk will
> use an NMI-safe buffer.
> 
> We also have to be careful about the soft-irq state: if we take a
> machine check in an un-reconciled state, or in the middle of the irq
> replay code, I'm not actually sure we do the right thing, but that
> would be a bug in the existing code too. And we definitely have
> MSR[RI] vs DAR/DSISR bugs in the existing code, sigh.
> 
> I don't know... maybe just push what you have and we'll try to do
> some more fixes and cleanups on top of that.

Sure. I will respin the next version addressing your other minor
comments, and will work on further improvements as a separate change.

Thanks,
-Mahesh.
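[For readers following the quoted assembly: the nested machine-check
accounting it implements (increment paca->in_mce on entry, switch to the
MC emergency stack on the first entry, treat nesting beyond MAX_MCE_DEPTH
as unrecoverable to avoid stack overflow, decrement on return) can be
sketched as a simplified C model. The struct and function names below are
illustrative only, not the kernel's actual code.]

```c
#include <assert.h>

#define MAX_MCE_DEPTH 4	/* matches the limit checked in the assembly */

/* Simplified stand-in for the per-CPU paca fields used by the handler. */
struct paca_model {
	int in_mce;		/* models PACA_IN_MCE nesting counter */
	int on_emerg_stack;	/* set when PACAMCEMERGSP stack is taken */
};

/*
 * Models the handler entry path: returns 1 if recovery may proceed,
 * 0 if the nesting limit was exceeded (the "bgt 1f" unrecoverable path).
 */
int mce_enter(struct paca_model *p)
{
	if (p->in_mce == 0)
		p->on_emerg_stack = 1;	/* first MCE: use emergency stack */
	p->in_mce++;			/* increment paca->in_mce */
	return p->in_mce <= MAX_MCE_DEPTH;
}

/* Models the return path: decrement paca->in_mce before RFI. */
void mce_exit(struct paca_model *p)
{
	p->in_mce--;
	if (p->in_mce == 0)
		p->on_emerg_stack = 0;	/* back off the emergency stack */
}
```

With MAX_MCE_DEPTH of 4, four nested machine checks can be taken; a
fifth takes the unrecoverable path, which is what the depth check in the
quoted prolog guards against.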



Thread overview: 12+ messages
-- links below jump to the message on this page --
2018-08-19 17:08 [PATCH v8 0/5] powerpc/pseries: Machine check handler improvements Mahesh J Salgaonkar
2018-08-19 17:08 ` [PATCH v8 1/5] powerpc/pseries: Define MCE error event section Mahesh J Salgaonkar
2018-08-19 17:08 ` [PATCH v8 2/5] powerpc/pseries: flush SLB contents on SLB MCE errors Mahesh J Salgaonkar
2018-08-20 10:58   ` Nicholas Piggin
2018-08-19 17:08 ` [PATCH v8 3/5] powerpc/pseries: Display machine check error details Mahesh J Salgaonkar
2018-08-19 17:08 ` [PATCH v8 4/5] powerpc/pseries: Dump the SLB contents on SLB MCE errors Mahesh J Salgaonkar
2018-08-20 11:20   ` Nicholas Piggin
2018-08-19 17:08 ` [PATCH v8 5/5] powernv/pseries: consolidate code for mce early handling Mahesh J Salgaonkar
2018-08-20 11:34   ` Nicholas Piggin
2018-08-23  8:43     ` Mahesh Jagannath Salgaonkar
2018-08-23  9:02       ` Nicholas Piggin
2018-08-27 10:32         ` Mahesh Jagannath Salgaonkar
