* [RFC PATCH 0/7] Machine check handling for Power9 with backward compatibility.
@ 2017-02-21  1:51 Mahesh J Salgaonkar
  2017-02-21  1:51 ` [RFC PATCH 1/7] powerpc/book3s: Move machine check event structure to opal-api.h Mahesh J Salgaonkar
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Mahesh J Salgaonkar @ 2017-02-21  1:51 UTC (permalink / raw)
  To: linuxppc-dev, Benjamin Herrenschmidt; +Cc: Paul Mackerras, Nicholas Piggin

This RFC patch series adds machine check handling for Power9. Starting with
Power9, Linux will depend on OPAL to handle the chip-specific processing
needed to extract the MCE error reason, so Linux no longer has to know the
chip/CPU-specific bit encodings. Linux will make an OPAL call during the MCE
interrupt to let OPAL extract the reason and provide a high-level machine
check event that should help Linux decide what further action to take.

The OPAL machine check handler will be supported on Power9 and above. Linux
will populate the ppc_md.machine_check_early() function pointer if OPAL
supports the OPAL_HANDLE_MACHINE_CHECK token. But if the OPAL MCE handler does
not support processing/extracting the MCE reason for the current chip, it will
return OPAL_UNSUPPORTED, e.g. when new OPAL FW is installed on a system with
Power8 or below. In that case (i.e. on a Power8 system) Linux will fall back
to the in-kernel MCE handler, as it does today.
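
The intended fallback order can be modelled in user space roughly as follows.
This is only a sketch of the behaviour described above, not code from the
series: struct mce_hooks, dispatch_mce() and the stub handlers are
hypothetical names, and the 0 / -1 return convention for the platform hook is
an assumption.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumed return convention for the platform hook:
 *   0  = firmware decoded the MCE and filled in *handled
 *  -1  = firmware cannot decode MCEs for this chip
 */
typedef int (*platform_mce_fn)(long *handled);
typedef long (*cpu_mce_fn)(void);

struct mce_hooks {
	platform_mce_fn platform_early;	/* NULL if the firmware token is absent */
	cpu_mce_fn cpu_early;		/* in-kernel, chip-specific handler */
};

static long dispatch_mce(const struct mce_hooks *hooks)
{
	long handled = 0;

	/* Prefer the firmware path when it is registered and succeeds. */
	if (hooks->platform_early && hooks->platform_early(&handled) == 0)
		return handled;

	/* Otherwise fall back to the in-kernel handler, as on Power8. */
	handled = 0;
	if (hooks->cpu_early)
		handled = hooks->cpu_early();
	return handled;
}

/* Stub handlers used only to exercise the dispatch logic. */
static int fw_handles(long *handled) { *handled = 1; return 0; }
static int fw_unsupported(long *handled) { *handled = 0; return -1; }
static long kernel_handler(void) { return 2; }
```

With these stubs, a hook table using fw_handles() never reaches
kernel_handler(), while one using fw_unsupported() or a NULL platform hook
always does.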

I have done all my testing in Mambo only. There are a few TODOs that I am
working on and will address in v2. This early version is to get comments on
this approach to supporting backward compatibility for the machine check
handler starting from Power9.

Comments are welcome.


---

Mahesh Salgaonkar (7):
      powerpc/book3s: Move machine check event structure to opal-api.h
      powerpc/book3s: mce: Call opal mce handler to extract MCE error reason.
      powerpc/book3s: mce: Process the MCE event and recover if possible.
      powerpc/book3s: Print additional MCE errors introduced in power9.
      powerpc/book3s: Don't turn on the MSR[ME] bit until opal processes the reason.
      powerpc/book3s: Display more info for MCE error console log.
      powerpc/book3s: Display task info for MCE error in user mode.


 arch/powerpc/include/asm/machdep.h             |    2 
 arch/powerpc/include/asm/mce.h                 |  114 +--------------
 arch/powerpc/include/asm/opal-api.h            |  179 ++++++++++++++++++++++++
 arch/powerpc/include/asm/opal.h                |    3 
 arch/powerpc/kernel/exceptions-64s.S           |   12 +-
 arch/powerpc/kernel/mce.c                      |  122 +++++++++++++++-
 arch/powerpc/kernel/mce_power.c                |   38 +++++
 arch/powerpc/kernel/traps.c                    |   15 ++
 arch/powerpc/platforms/powernv/opal-wrappers.S |    1 
 arch/powerpc/platforms/powernv/opal.c          |   24 +++
 arch/powerpc/platforms/powernv/setup.c         |    4 +
 11 files changed, 387 insertions(+), 127 deletions(-)

--
Mahesh Salgaonkar


* [RFC PATCH 1/7] powerpc/book3s: Move machine check event structure to opal-api.h
  2017-02-21  1:51 [RFC PATCH 0/7] Machine check handling for Power9 with backward compatibility Mahesh J Salgaonkar
@ 2017-02-21  1:51 ` Mahesh J Salgaonkar
  2017-02-21  2:35   ` Nicholas Piggin
  2017-02-21  1:52 ` [RFC PATCH 2/7] powerpc/book3s: mce: Call opal mce handler to extract MCE error reason Mahesh J Salgaonkar
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Mahesh J Salgaonkar @ 2017-02-21  1:51 UTC (permalink / raw)
  To: linuxppc-dev, Benjamin Herrenschmidt; +Cc: Paul Mackerras, Nicholas Piggin

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Going forward from Power9, Linux will rely on firmware (OPAL) to
extract the MCE error reason. OPAL will now handle all chip-specific processing
to extract the error reason and send high-level information through this
machine check event structure. Hence this structure is now an ABI.
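
Because the structure is shared with firmware, its layout has to stay fixed.
One way to guard that, sketched here as an assumption rather than as part of
this patch, is a set of compile-time checks on the documented offsets. The
struct below is a trimmed, hypothetical copy (mce_event_sketch) with only two
of the union members; it uses plain uint8_t fields, since the layout of enum
bit-fields is compiler-dependent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed, illustrative copy of the event layout; not the real header. */
struct mce_event_sketch {
	uint8_t  version;		/* 0x00 */
	uint8_t  in_use;		/* 0x01 */
	uint8_t  severity;		/* 0x02 */
	uint8_t  initiator;		/* 0x03 */
	uint8_t  error_type;		/* 0x04 */
	uint8_t  disposition;		/* 0x05 */
	uint8_t  reserved_1[2];		/* 0x06 */
	uint64_t gpr3;			/* 0x08 */
	uint64_t srr0;			/* 0x10 */
	uint64_t srr1;			/* 0x18 */
	union {				/* 0x20, 32 bytes of per-type detail */
		struct {
			uint8_t  ue_error_type;
			uint8_t  effective_address_provided;
			uint8_t  physical_address_provided;
			uint8_t  reserved_1[5];
			uint64_t effective_address;
			uint64_t physical_address;
			uint8_t  reserved_2[8];
		} ue_error;
		struct {
			uint8_t  async_error_type;
			uint8_t  reserved_1[7];
			uint8_t  reserved_2[24];
		} async_error;
	} u;
};

/* Fail the build if any documented offset drifts. */
_Static_assert(offsetof(struct mce_event_sketch, gpr3) == 0x08, "gpr3 offset");
_Static_assert(offsetof(struct mce_event_sketch, srr0) == 0x10, "srr0 offset");
_Static_assert(offsetof(struct mce_event_sketch, srr1) == 0x18, "srr1 offset");
_Static_assert(offsetof(struct mce_event_sketch, u) == 0x20, "union offset");
_Static_assert(sizeof(struct mce_event_sketch) == 0x40, "event size");
```

In kernel code the equivalent check would normally be written with
BUILD_BUG_ON() inside a function.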

No functionality change.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/mce.h        |  110 +--------------------
 arch/powerpc/include/asm/opal-api.h   |  176 +++++++++++++++++++++++++++++++++
 arch/powerpc/kernel/mce.c             |   18 ++-
 arch/powerpc/platforms/powernv/opal.c |    4 -
 4 files changed, 191 insertions(+), 117 deletions(-)

diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index f97d8cb..36db6b0 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -23,6 +23,7 @@
 #define __ASM_PPC64_MCE_H__
 
 #include <linux/bitops.h>
+#include <asm/opal.h>
 
 /*
  * Machine Check bits on power7 and power8
@@ -66,109 +67,6 @@
 
 #define P8_DSISR_MC_SLB_ERRORS		(P7_DSISR_MC_SLB_ERRORS | \
 					 P8_DSISR_MC_ERAT_MULTIHIT_SEC)
-enum MCE_Version {
-	MCE_V1 = 1,
-};
-
-enum MCE_Severity {
-	MCE_SEV_NO_ERROR = 0,
-	MCE_SEV_WARNING = 1,
-	MCE_SEV_ERROR_SYNC = 2,
-	MCE_SEV_FATAL = 3,
-};
-
-enum MCE_Disposition {
-	MCE_DISPOSITION_RECOVERED = 0,
-	MCE_DISPOSITION_NOT_RECOVERED = 1,
-};
-
-enum MCE_Initiator {
-	MCE_INITIATOR_UNKNOWN = 0,
-	MCE_INITIATOR_CPU = 1,
-};
-
-enum MCE_ErrorType {
-	MCE_ERROR_TYPE_UNKNOWN = 0,
-	MCE_ERROR_TYPE_UE = 1,
-	MCE_ERROR_TYPE_SLB = 2,
-	MCE_ERROR_TYPE_ERAT = 3,
-	MCE_ERROR_TYPE_TLB = 4,
-};
-
-enum MCE_UeErrorType {
-	MCE_UE_ERROR_INDETERMINATE = 0,
-	MCE_UE_ERROR_IFETCH = 1,
-	MCE_UE_ERROR_PAGE_TABLE_WALK_IFETCH = 2,
-	MCE_UE_ERROR_LOAD_STORE = 3,
-	MCE_UE_ERROR_PAGE_TABLE_WALK_LOAD_STORE = 4,
-};
-
-enum MCE_SlbErrorType {
-	MCE_SLB_ERROR_INDETERMINATE = 0,
-	MCE_SLB_ERROR_PARITY = 1,
-	MCE_SLB_ERROR_MULTIHIT = 2,
-};
-
-enum MCE_EratErrorType {
-	MCE_ERAT_ERROR_INDETERMINATE = 0,
-	MCE_ERAT_ERROR_PARITY = 1,
-	MCE_ERAT_ERROR_MULTIHIT = 2,
-};
-
-enum MCE_TlbErrorType {
-	MCE_TLB_ERROR_INDETERMINATE = 0,
-	MCE_TLB_ERROR_PARITY = 1,
-	MCE_TLB_ERROR_MULTIHIT = 2,
-};
-
-struct machine_check_event {
-	enum MCE_Version	version:8;	/* 0x00 */
-	uint8_t			in_use;		/* 0x01 */
-	enum MCE_Severity	severity:8;	/* 0x02 */
-	enum MCE_Initiator	initiator:8;	/* 0x03 */
-	enum MCE_ErrorType	error_type:8;	/* 0x04 */
-	enum MCE_Disposition	disposition:8;	/* 0x05 */
-	uint8_t			reserved_1[2];	/* 0x06 */
-	uint64_t		gpr3;		/* 0x08 */
-	uint64_t		srr0;		/* 0x10 */
-	uint64_t		srr1;		/* 0x18 */
-	union {					/* 0x20 */
-		struct {
-			enum MCE_UeErrorType ue_error_type:8;
-			uint8_t		effective_address_provided;
-			uint8_t		physical_address_provided;
-			uint8_t		reserved_1[5];
-			uint64_t	effective_address;
-			uint64_t	physical_address;
-			uint8_t		reserved_2[8];
-		} ue_error;
-
-		struct {
-			enum MCE_SlbErrorType slb_error_type:8;
-			uint8_t		effective_address_provided;
-			uint8_t		reserved_1[6];
-			uint64_t	effective_address;
-			uint8_t		reserved_2[16];
-		} slb_error;
-
-		struct {
-			enum MCE_EratErrorType erat_error_type:8;
-			uint8_t		effective_address_provided;
-			uint8_t		reserved_1[6];
-			uint64_t	effective_address;
-			uint8_t		reserved_2[16];
-		} erat_error;
-
-		struct {
-			enum MCE_TlbErrorType tlb_error_type:8;
-			uint8_t		effective_address_provided;
-			uint8_t		reserved_1[6];
-			uint64_t	effective_address;
-			uint8_t		reserved_2[16];
-		} tlb_error;
-	} u;
-};
-
 struct mce_error_info {
 	enum MCE_ErrorType error_type:8;
 	union {
@@ -189,10 +87,10 @@ struct mce_error_info {
 extern void save_mce_event(struct pt_regs *regs, long handled,
 			   struct mce_error_info *mce_err, uint64_t nip,
 			   uint64_t addr);
-extern int get_mce_event(struct machine_check_event *mce, bool release);
+extern int get_mce_event(struct OpalMachineCheckEvent *mce, bool release);
 extern void release_mce_event(void);
 extern void machine_check_queue_event(void);
-extern void machine_check_print_event_info(struct machine_check_event *evt);
-extern uint64_t get_mce_fault_addr(struct machine_check_event *evt);
+extern void machine_check_print_event_info(struct OpalMachineCheckEvent *evt);
+extern uint64_t get_mce_fault_addr(struct OpalMachineCheckEvent *evt);
 
 #endif /* __ASM_PPC64_MCE_H__ */
diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 0e2e57b..6851be6 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -585,6 +585,182 @@ struct OpalHMIEvent {
 	} u;
 };
 
+/* Machine Check event */
+enum MCE_Version {
+	MCE_V1 = 1,
+};
+
+enum MCE_Severity {
+	MCE_SEV_NO_ERROR = 0,
+	MCE_SEV_WARNING = 1,
+	MCE_SEV_ERROR_SYNC = 2,
+	MCE_SEV_FATAL = 3,
+	MCE_SEV_ERROR_ASYNC = 4,
+};
+
+enum MCE_Disposition {
+	MCE_DISPOSITION_RECOVERED = 0,
+	MCE_DISPOSITION_NOT_RECOVERED = 1,
+};
+
+enum MCE_Initiator {
+	MCE_INITIATOR_UNKNOWN = 0,
+	MCE_INITIATOR_CPU = 1,
+};
+
+enum MCE_ErrorType {
+	MCE_ERROR_TYPE_UNKNOWN = 0,
+	MCE_ERROR_TYPE_UE = 1,
+	MCE_ERROR_TYPE_SLB = 2,
+	MCE_ERROR_TYPE_ERAT = 3,
+	MCE_ERROR_TYPE_TLB = 4,
+	MCE_ERROR_TYPE_NEST = 5,
+	MCE_ERROR_TYPE_CRESP = 6,
+	MCE_ERROR_TYPE_FSPACE = 7,
+	MCE_ERROR_TYPE_ASYNC = 8,
+};
+
+enum MCE_UeErrorType {
+	MCE_UE_ERROR_INDETERMINATE = 0,
+	MCE_UE_ERROR_IFETCH = 1,
+	MCE_UE_ERROR_PAGE_TABLE_WALK_IFETCH = 2,
+	MCE_UE_ERROR_LOAD_STORE = 3,
+	MCE_UE_ERROR_PAGE_TABLE_WALK_LOAD_STORE = 4,
+};
+
+enum MCE_SlbErrorType {
+	MCE_SLB_ERROR_INDETERMINATE = 0,
+	MCE_SLB_ERROR_PARITY = 1,
+	MCE_SLB_ERROR_MULTIHIT = 2,
+};
+
+enum MCE_EratErrorType {
+	MCE_ERAT_ERROR_INDETERMINATE = 0,
+	MCE_ERAT_ERROR_PARITY = 1,
+	MCE_ERAT_ERROR_MULTIHIT = 2,
+};
+
+enum MCE_TlbErrorType {
+	MCE_TLB_ERROR_INDETERMINATE = 0,
+	MCE_TLB_ERROR_PARITY = 1,
+	MCE_TLB_ERROR_MULTIHIT = 2,
+	MCE_TLB_ERROR_TLBIEL_PROG_ERROR = 3,
+};
+
+enum MCE_NestErrorType {
+	MCE_NEST_ERROR_ABRT_IFETCH = 0,
+	MCE_NEST_ERROR_ABRT_IFETCH_TABLEWALK = 1,
+	MCE_NEST_ERROR_ABRT_LOAD = 2,
+	MCE_NEST_ERROR_ABRT_LOAD_TABLEWALK = 3,
+};
+
+enum MCE_CrespErrorType {
+	MCE_CRESP_ERROR_BAD_RADDR_IFETCH = 0,
+	MCE_CRESP_ERROR_BAD_RADDR_IFETCH_TABLEWALK = 1,
+	MCE_CRESP_ERROR_BAD_RADDR_LOAD = 2,
+	MCE_CRESP_ERROR_BAD_RADDR_LOAD_TABLEWALK = 3,
+};
+
+enum MCE_FspaceErrorType {
+	MCE_FSPACE_ERROR_IFETCH = 0,
+	MCE_FSPACE_ERROR_IFETCH_TABLEWALK = 1,
+	MCE_FSPACE_ERROR_RADDR_TRANSLATION = 2,
+	MCE_FSPACE_ERROR_RADDR_LOAD = 3,
+};
+
+enum MCE_AsyncErrorType {
+	MCE_ASYNC_ERROR_REAL_ADDR_STORE = 0,
+	MCE_ASYNC_ERROR_NEST_ABRT_STORE = 1,
+};
+
+struct OpalMachineCheckEvent {
+	uint8_t			version;	/* 0x00 */
+	uint8_t			in_use;		/* 0x01 */
+	uint8_t			severity;	/* 0x02 */
+	uint8_t			initiator;	/* 0x03 */
+	uint8_t			error_type;	/* 0x04 */
+	uint8_t			disposition;	/* 0x05 */
+	uint8_t			reserved_1[2];	/* 0x06 */
+	uint64_t		gpr3;		/* 0x08 */
+	uint64_t		srr0;		/* 0x10 */
+	uint64_t		srr1;		/* 0x18 */
+
+	union {					/* 0x20 */
+		/* Next 32 bytes contains specific information of MCE error. */
+		struct {
+			/* enum MCE_UeErrorType */
+			uint8_t		ue_error_type;
+			uint8_t		effective_address_provided;
+			uint8_t		physical_address_provided;
+			uint8_t		reserved_1[5];
+			uint64_t	effective_address;
+			uint64_t	physical_address;
+			uint8_t		reserved_2[8];
+		} ue_error;
+
+		struct {
+			/* enum MCE_SlbErrorType */
+			uint8_t		slb_error_type;
+			uint8_t		effective_address_provided;
+			uint8_t		reserved_1[6];
+			uint64_t	effective_address;
+			uint8_t		reserved_2[16];
+		} slb_error;
+
+		struct {
+			/* enum MCE_EratErrorType */
+			uint8_t		erat_error_type;
+			uint8_t		effective_address_provided;
+			uint8_t		reserved_1[6];
+			uint64_t	effective_address;
+			uint8_t		reserved_2[16];
+		} erat_error;
+
+		struct {
+			/* enum MCE_TlbErrorType  */
+			uint8_t		tlb_error_type;
+			uint8_t		effective_address_provided;
+			uint8_t		reserved_1[6];
+			uint64_t	effective_address;
+			uint8_t		reserved_2[16];
+		} tlb_error;
+
+		struct {
+			/* enum MCE_NestErrorType */
+			uint8_t		nest_error_type;
+			uint8_t		effective_address_provided;
+			uint8_t		reserved_1[6];
+			uint64_t	effective_address;
+			uint8_t		reserved_2[16];
+		} nest_error;
+
+		struct {
+			/* enum MCE_CrespErrorType */
+			uint8_t		cresp_error_type;
+			uint8_t		effective_address_provided;
+			uint8_t		reserved_1[6];
+			uint64_t	effective_address;
+			uint8_t		reserved_2[16];
+		} cresp_error;
+
+		struct {
+			/* enum MCE_FspaceErrorType */
+			uint8_t		fspace_error_type;
+			uint8_t		effective_address_provided;
+			uint8_t		reserved_1[6];
+			uint64_t	effective_address;
+			uint8_t		reserved_2[16];
+		} fspace_error;
+
+		struct {
+			/* enum MCE_AsyncErrorType */
+			uint8_t		async_error_type;
+			uint8_t		reserved_1[7];
+			uint8_t		reserved_2[24];
+		} async_error;
+	} u;
+};
+
 enum {
 	OPAL_P7IOC_DIAG_TYPE_NONE	= 0,
 	OPAL_P7IOC_DIAG_TYPE_RGC	= 1,
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index c6923ff..51a7c64 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -30,18 +30,18 @@
 #include <asm/mce.h>
 
 static DEFINE_PER_CPU(int, mce_nest_count);
-static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event);
+static DEFINE_PER_CPU(struct OpalMachineCheckEvent[MAX_MC_EVT], mce_event);
 
 /* Queue for delayed MCE events. */
 static DEFINE_PER_CPU(int, mce_queue_count);
-static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event_queue);
+static DEFINE_PER_CPU(struct OpalMachineCheckEvent[MAX_MC_EVT], mce_event_queue);
 
 static void machine_check_process_queued_event(struct irq_work *work);
 static struct irq_work mce_event_process_work = {
         .func = machine_check_process_queued_event,
 };
 
-static void mce_set_error_info(struct machine_check_event *mce,
+static void mce_set_error_info(struct OpalMachineCheckEvent *mce,
 			       struct mce_error_info *mce_err)
 {
 	mce->error_type = mce_err->error_type;
@@ -73,7 +73,7 @@ void save_mce_event(struct pt_regs *regs, long handled,
 		    uint64_t nip, uint64_t addr)
 {
 	int index = __this_cpu_inc_return(mce_nest_count) - 1;
-	struct machine_check_event *mce = this_cpu_ptr(&mce_event[index]);
+	struct OpalMachineCheckEvent *mce = this_cpu_ptr(&mce_event[index]);
 
 	/*
 	 * Return if we don't have enough space to log mce event.
@@ -139,10 +139,10 @@ void save_mce_event(struct pt_regs *regs, long handled,
  * preemption will not be scheduled until ret_from_expect() routine
  * is called.
  */
-int get_mce_event(struct machine_check_event *mce, bool release)
+int get_mce_event(struct OpalMachineCheckEvent *mce, bool release)
 {
 	int index = __this_cpu_read(mce_nest_count) - 1;
-	struct machine_check_event *mc_evt;
+	struct OpalMachineCheckEvent *mc_evt;
 	int ret = 0;
 
 	/* Sanity check */
@@ -177,7 +177,7 @@ void release_mce_event(void)
 void machine_check_queue_event(void)
 {
 	int index;
-	struct machine_check_event evt;
+	struct OpalMachineCheckEvent evt;
 
 	if (!get_mce_event(&evt, MCE_EVENT_RELEASE))
 		return;
@@ -214,7 +214,7 @@ static void machine_check_process_queued_event(struct irq_work *work)
 	}
 }
 
-void machine_check_print_event_info(struct machine_check_event *evt)
+void machine_check_print_event_info(struct OpalMachineCheckEvent *evt)
 {
 	const char *level, *sevstr, *subtype;
 	static const char *mc_ue_types[] = {
@@ -322,7 +322,7 @@ void machine_check_print_event_info(struct machine_check_event *evt)
 	}
 }
 
-uint64_t get_mce_fault_addr(struct machine_check_event *evt)
+uint64_t get_mce_fault_addr(struct OpalMachineCheckEvent *evt)
 {
 	switch (evt->error_type) {
 	case MCE_ERROR_TYPE_UE:
diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
index 2822935..3eb0e94 100644
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -392,7 +392,7 @@ int opal_put_chars(uint32_t vtermno, const char *data, int total_len)
 }
 
 static int opal_recover_mce(struct pt_regs *regs,
-					struct machine_check_event *evt)
+					struct OpalMachineCheckEvent *evt)
 {
 	int recovered = 0;
 	uint64_t ea = get_mce_fault_addr(evt);
@@ -432,7 +432,7 @@ static int opal_recover_mce(struct pt_regs *regs,
 
 int opal_machine_check(struct pt_regs *regs)
 {
-	struct machine_check_event evt;
+	struct OpalMachineCheckEvent evt;
 	int ret;
 
 	if (!get_mce_event(&evt, MCE_EVENT_RELEASE))


* [RFC PATCH 2/7] powerpc/book3s: mce: Call opal mce handler to extract MCE error reason.
  2017-02-21  1:51 [RFC PATCH 0/7] Machine check handling for Power9 with backward compatibility Mahesh J Salgaonkar
  2017-02-21  1:51 ` [RFC PATCH 1/7] powerpc/book3s: Move machine check event structure to opal-api.h Mahesh J Salgaonkar
@ 2017-02-21  1:52 ` Mahesh J Salgaonkar
  2017-02-21  1:52 ` [RFC PATCH 3/7] powerpc/book3s: mce: Process the MCE event and recover if possible Mahesh J Salgaonkar
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Mahesh J Salgaonkar @ 2017-02-21  1:52 UTC (permalink / raw)
  To: linuxppc-dev, Benjamin Herrenschmidt; +Cc: Paul Mackerras, Nicholas Piggin

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

This is an RFC implementation that proposes an approach where Linux would
still take the MCE interrupt but would make an OPAL call to handle all
chip-specific processing to extract the error reason.

OPAL will support the OPAL_HANDLE_MACHINE_CHECK token if the currently
installed FW implements a machine check handler. If this token is supported,
Linux will populate the ppc_md.machine_check_early() function pointer.

But if the OPAL MCE handler does not support processing/extracting the MCE
reason for the current chip, it will return OPAL_UNSUPPORTED, e.g. when new
OPAL FW is installed on a system with Power8 or below. In that case (i.e. on a
Power8 system) Linux will fall back to the in-kernel MCE handler, as it does
today.

This patch uses the OPAL_CALL_REAL() wrapper to be able to go back into OPAL
on a machine check interrupt. The reason is that the OPAL_CALL() wrapper makes
this difficult: it uses r13 to save PACASAVEDMSR, which overwrites the
previous MSR if we are already in OPAL when the MCE hits us.

We also need to make OPAL stick to the caller's stack to make the OPAL call
re-entrant. I will send out a separate patch to the skiboot mailing list.
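
The re-entrancy hazard can be illustrated with a small user-space model. This
is not the wrapper assembly and does not claim to show how OPAL_CALL_REAL()
behaves; saved_msr_slot and both functions are hypothetical, standing in for
a single per-CPU save area versus state kept on the caller's stack.

```c
#include <assert.h>
#include <stdint.h>

/* One shared slot, standing in for a per-CPU save area. */
static uint64_t saved_msr_slot;

/*
 * Model a firmware call that saves the caller's MSR in the shared slot.
 * A non-zero nested_msr simulates a second call made while the first is
 * still in progress (e.g. an MCE taken while already in firmware).
 * Returns the MSR value that would be restored on exit.
 */
static uint64_t call_with_shared_slot(uint64_t msr, uint64_t nested_msr)
{
	saved_msr_slot = msr;
	if (nested_msr)
		call_with_shared_slot(nested_msr, 0);	/* clobbers the slot */
	return saved_msr_slot;
}

/* Same model, but the saved value lives in this call's own stack frame. */
static uint64_t call_with_stack_save(uint64_t msr, uint64_t nested_msr)
{
	uint64_t saved = msr;

	if (nested_msr)
		call_with_stack_save(nested_msr, 0);
	return saved;
}
```

In this model the outer call through the shared slot comes back with the
nested call's MSR, whereas the stack-saved variant restores its own value.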

The other approach could be to let OPAL patch up the 0x200 vector so that OPAL
takes the interrupt and does the processing. Comments welcome.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/machdep.h             |    2 ++
 arch/powerpc/include/asm/opal-api.h            |    3 ++-
 arch/powerpc/include/asm/opal.h                |    3 +++
 arch/powerpc/kernel/traps.c                    |   15 ++++++++++++++-
 arch/powerpc/platforms/powernv/opal-wrappers.S |    1 +
 arch/powerpc/platforms/powernv/opal.c          |   16 ++++++++++++++++
 arch/powerpc/platforms/powernv/setup.c         |    4 ++++
 7 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 5011b69..037ea19 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -108,6 +108,8 @@ struct machdep_calls {
 
 	/* Early exception handlers called in realmode */
 	int		(*hmi_exception_early)(struct pt_regs *regs);
+	int		(*machine_check_early)(struct pt_regs *regs,
+							long *handled);
 
 	/* Called during machine check exception to retrive fixup address. */
 	bool		(*mce_check_early_recovery)(struct pt_regs *regs);
diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index 6851be6..8485e67 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -167,7 +167,8 @@
 #define OPAL_INT_EOI				124
 #define OPAL_INT_SET_MFRR			125
 #define OPAL_PCI_TCE_KILL			126
-#define OPAL_LAST				126
+#define OPAL_HANDLE_MACHINE_CHECK		145
+#define OPAL_LAST				145
 
 /* Device tree flags */
 
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 5c7db0f..4834e0d 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -189,6 +189,8 @@ int64_t opal_set_param(uint64_t token, uint32_t param_id, uint64_t buffer,
 		uint64_t length);
 int64_t opal_sensor_read(uint32_t sensor_hndl, int token, __be32 *sensor_data);
 int64_t opal_handle_hmi(void);
+int64_t opal_rm_handle_machine_check(uint64_t srr0, uint64_t srr1, uint64_t dar,
+					uint64_t dsisr, uint64_t *mce);
 int64_t opal_register_dump_region(uint32_t id, uint64_t start, uint64_t end);
 int64_t opal_unregister_dump_region(uint32_t id);
 int64_t opal_slw_set_reg(uint64_t cpu_pir, uint64_t sprn, uint64_t val);
@@ -279,6 +281,7 @@ extern int opal_hmi_handler_init(void);
 extern int opal_event_init(void);
 
 extern int opal_machine_check(struct pt_regs *regs);
+extern int opal_machine_check_early(struct pt_regs *regs, long *handled);
 extern bool opal_mce_check_early_recovery(struct pt_regs *regs);
 extern int opal_hmi_exception_early(struct pt_regs *regs);
 extern int opal_handle_hmi_exception(struct pt_regs *regs);
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index e6cc56b..62b587f 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -302,12 +302,25 @@ void system_reset_exception(struct pt_regs *regs)
 long machine_check_early(struct pt_regs *regs)
 {
 	long handled = 0;
+	int rc = 0;
 
 	__this_cpu_inc(irq_stat.mce_exceptions);
 
 	add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
 
-	if (cur_cpu_spec && cur_cpu_spec->machine_check_early)
+	/*
+	 * See if platform is capable of handling machine check. (e.g. PowerNV
+	 * platform may take help from OPAL firmware to handle machine check.)
+	 * Otherwise fallthrough and allow CPU to handle this machine check.
+	 *
+	 * If rc == -1 then it means firmware does not provide support for
+	 * machine check handling for this CPU chip. Fallback to in-kernel
+	 * machine check handler.
+	 */
+	if (ppc_md.machine_check_early)
+		rc = ppc_md.machine_check_early(regs, &handled);
+
+	if (!rc && cur_cpu_spec && cur_cpu_spec->machine_check_early)
 		handled = cur_cpu_spec->machine_check_early(regs);
 	return handled;
 }
diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
index 3aa40f1..3120d2e 100644
--- a/arch/powerpc/platforms/powernv/opal-wrappers.S
+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
@@ -312,3 +312,4 @@ OPAL_CALL(opal_int_set_mfrr,			OPAL_INT_SET_MFRR);
 OPAL_CALL_REAL(opal_rm_int_set_mfrr,		OPAL_INT_SET_MFRR);
 OPAL_CALL(opal_pci_tce_kill,			OPAL_PCI_TCE_KILL);
 OPAL_CALL_REAL(opal_rm_pci_tce_kill,		OPAL_PCI_TCE_KILL);
+OPAL_CALL_REAL(opal_rm_handle_machine_check,		OPAL_HANDLE_MACHINE_CHECK);
diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
index 3eb0e94..263c57e 100644
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -488,6 +488,22 @@ int opal_machine_check(struct pt_regs *regs)
 	return 0;
 }
 
+/* Early mce handler called in real mode. */
+int opal_machine_check_early(struct pt_regs *regs, long *handled)
+{
+	int rc;
+	struct OpalMachineCheckEvent evt = { 0 };
+
+	*handled = 0;
+
+	rc = opal_rm_handle_machine_check(regs->nip, regs->msr, regs->dar,
+					regs->dsisr, (uint64_t *)&evt);
+	if (rc != OPAL_SUCCESS)
+		return -1;
+
+	return 0;
+}
+
 /* Early hmi handler called in real mode. */
 int opal_hmi_exception_early(struct pt_regs *regs)
 {
diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index d50c7d9..781478a 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -264,6 +264,10 @@ static void __init pnv_setup_machdep_opal(void)
 	ppc_md.mce_check_early_recovery = opal_mce_check_early_recovery;
 	ppc_md.hmi_exception_early = opal_hmi_exception_early;
 	ppc_md.handle_hmi_exception = opal_handle_hmi_exception;
+
+	/* Check if OPAL is capable of handling machine check. */
+	if (opal_check_token(OPAL_HANDLE_MACHINE_CHECK))
+		ppc_md.machine_check_early = opal_machine_check_early;
 }
 
 static int __init pnv_probe(void)


* [RFC PATCH 3/7] powerpc/book3s: mce: Process the MCE event and recover if possible.
  2017-02-21  1:51 [RFC PATCH 0/7] Machine check handling for Power9 with backward compatibility Mahesh J Salgaonkar
  2017-02-21  1:51 ` [RFC PATCH 1/7] powerpc/book3s: Move machine check event structure to opal-api.h Mahesh J Salgaonkar
  2017-02-21  1:52 ` [RFC PATCH 2/7] powerpc/book3s: mce: Call opal mce handler to extract MCE error reason Mahesh J Salgaonkar
@ 2017-02-21  1:52 ` Mahesh J Salgaonkar
  2017-02-21  1:52 ` [RFC PATCH 4/7] powerpc/book3s: Print additional MCE errors introduced in power9 Mahesh J Salgaonkar
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Mahesh J Salgaonkar @ 2017-02-21  1:52 UTC (permalink / raw)
  To: linuxppc-dev, Benjamin Herrenschmidt; +Cc: Paul Mackerras, Nicholas Piggin

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Once we get the high-level MCE error event from OPAL, process it and figure
out whether it is recoverable. If so, take corrective action.

TODO:
- Rework handling of asynchronous MCE errors.
  - Update opal_recover_mce() to ignore async errors.
- Update flush_and_reload_slb() to avoid SLB reload in radix mode.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/mce.h        |    3 +++
 arch/powerpc/kernel/mce.c             |   26 +++++++++++++++++++++++
 arch/powerpc/kernel/mce_power.c       |   38 +++++++++++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/opal.c |    2 ++
 4 files changed, 69 insertions(+)

diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index 36db6b0..69e4a42 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -88,9 +88,12 @@ extern void save_mce_event(struct pt_regs *regs, long handled,
 			   struct mce_error_info *mce_err, uint64_t nip,
 			   uint64_t addr);
 extern int get_mce_event(struct OpalMachineCheckEvent *mce, bool release);
+extern int set_mce_event(struct OpalMachineCheckEvent *mce);
 extern void release_mce_event(void);
 extern void machine_check_queue_event(void);
 extern void machine_check_print_event_info(struct OpalMachineCheckEvent *evt);
 extern uint64_t get_mce_fault_addr(struct OpalMachineCheckEvent *evt);
+extern long handle_mce_errors(struct pt_regs *regs,
+					struct OpalMachineCheckEvent *evt);
 
 #endif /* __ASM_PPC64_MCE_H__ */
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 51a7c64..36da14a3 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -166,6 +166,32 @@ int get_mce_event(struct OpalMachineCheckEvent *mce, bool release)
 	return ret;
 }
 
+int set_mce_event(struct OpalMachineCheckEvent *mce)
+{
+	int index = __this_cpu_inc_return(mce_nest_count) - 1;
+	struct OpalMachineCheckEvent *mc_evt = this_cpu_ptr(&mce_event[index]);
+	int ret = 0;
+
+	/* Sanity check */
+	if (index < 0)
+		return ret;
+
+	/* Check if we have MCE info slot within array limit. */
+	if (index < MAX_MC_EVT) {
+		/* Copy the event structure and release the original */
+		if (mce) {
+			*mc_evt = *mce;
+			/* endian conversions */
+			mc_evt->srr0 = be64_to_cpu(mce->srr0);
+			mc_evt->srr1 = be64_to_cpu(mce->srr1);
+			mc_evt->u.ue_error.effective_address =
+				be64_to_cpu(mce->u.ue_error.effective_address);
+		}
+		ret = 1;
+	}
+	return ret;
+}
+
 void release_mce_event(void)
 {
 	get_mce_event(NULL, true);
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index 7353991..91ed2ef 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -372,3 +372,41 @@ long __machine_check_early_realmode_p8(struct pt_regs *regs)
 	save_mce_event(regs, handled, &mce_error_info, nip, addr);
 	return handled;
 }
+
+static long flush_tlb(void)
+{
+	long handled = 0;
+
+	if (cur_cpu_spec && cur_cpu_spec->flush_tlb) {
+		cur_cpu_spec->flush_tlb(TLB_INVAL_SCOPE_GLOBAL);
+		handled = 1;
+	}
+	return handled;
+}
+
+long handle_mce_errors(struct pt_regs *regs, struct OpalMachineCheckEvent *evt)
+{
+	long handled = 1;
+
+	if (evt->disposition == MCE_DISPOSITION_RECOVERED)
+		return handled;
+
+	switch (evt->error_type) {
+	case MCE_ERROR_TYPE_UE:
+		handled = mce_handle_ue_error(regs);
+		break;
+	case MCE_ERROR_TYPE_SLB:
+	case MCE_ERROR_TYPE_ERAT:
+		flush_and_reload_slb();
+		handled = 1;
+		break;
+	case MCE_ERROR_TYPE_TLB:
+		handled = flush_tlb();
+		break;
+	default:
+		handled = 0;
+	}
+	if (handled)
+		evt->disposition = MCE_DISPOSITION_RECOVERED;
+	return handled;
+}
diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
index 263c57e..f1115c4 100644
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -501,6 +501,8 @@ int opal_machine_check_early(struct pt_regs *regs, long *handled)
 	if (rc != OPAL_SUCCESS)
 		return -1;
 
+	*handled = handle_mce_errors(regs, &evt);
+	set_mce_event(&evt);
 	return 0;
 }
 


* [RFC PATCH 4/7] powerpc/book3s: Print additional MCE errors introduced in power9.
  2017-02-21  1:51 [RFC PATCH 0/7] Machine check handling for Power9 with backward compatibility Mahesh J Salgaonkar
                   ` (2 preceding siblings ...)
  2017-02-21  1:52 ` [RFC PATCH 3/7] powerpc/book3s: mce: Process the MCE event and recover if possible Mahesh J Salgaonkar
@ 2017-02-21  1:52 ` Mahesh J Salgaonkar
  2017-02-21  1:52 ` [RFC PATCH 5/7] powerpc/book3s: Don't turn on the MSR[ME] bit until opal processes the reason Mahesh J Salgaonkar
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Mahesh J Salgaonkar @ 2017-02-21  1:52 UTC (permalink / raw)
  To: linuxppc-dev, Benjamin Herrenschmidt; +Cc: Paul Mackerras, Nicholas Piggin

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Print out details about new MCE errors from Power9.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/mce.c |   68 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 36da14a3..da12992 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -264,6 +264,29 @@ void machine_check_print_event_info(struct OpalMachineCheckEvent *evt)
 		"Indeterminate",
 		"Parity",
 		"Multihit",
+		"Tlbiel programming error",
+	};
+	static const char *mc_nest_types[] = {
+		"Ifetch due to foreign link time out",
+		"Instruction tablewalk due to foreign link time out",
+		"Foreign Link Time out for load",
+		"Foreign Link Time out for tablewalk",
+	};
+	static const char *mc_cresp_types[] = {
+		"Real Address error for an ifetch",
+		"Real Address error for an ifetch tablewalk",
+		"Bad real address for a load",
+		"Bad address for load/store tablewalk",
+	};
+	static const char *mc_fspace_types[] = {
+		"Instruction Fetch to foreign address space",
+		"I-side tablewalk host real addr in the foreign address range",
+		"Host real address to foreign space",
+		"Load/store real address went to a foreign address",
+	};
+	static const char *mc_async_types[] = {
+		"Real address error (CRESP) from store",
+		"Foreign link time out (nest abort) due to a store instruction",
 	};
 
 	/* Print things out */
@@ -341,6 +364,43 @@ void machine_check_print_event_info(struct OpalMachineCheckEvent *evt)
 			printk("%s    Effective address: %016llx\n",
 			       level, evt->u.tlb_error.effective_address);
 		break;
+	case MCE_ERROR_TYPE_NEST:
+		subtype = evt->u.nest_error.nest_error_type <
+			ARRAY_SIZE(mc_nest_types) ?
+			mc_nest_types[evt->u.nest_error.nest_error_type]
+			: "Unknown";
+		printk("%s  Error type: NEST ABORT [%s]\n", level, subtype);
+		if (evt->u.nest_error.effective_address_provided)
+			printk("%s    Effective address: %016llx\n",
+			       level, evt->u.nest_error.effective_address);
+		break;
+	case MCE_ERROR_TYPE_CRESP:
+		subtype = evt->u.cresp_error.cresp_error_type <
+			ARRAY_SIZE(mc_cresp_types) ?
+			mc_cresp_types[evt->u.cresp_error.cresp_error_type]
+			: "Unknown";
+		printk("%s  Error type: CRESP [%s]\n", level, subtype);
+		if (evt->u.cresp_error.effective_address_provided)
+			printk("%s    Effective address: %016llx\n",
+			       level, evt->u.cresp_error.effective_address);
+		break;
+	case MCE_ERROR_TYPE_FSPACE:
+		subtype = evt->u.fspace_error.fspace_error_type <
+			ARRAY_SIZE(mc_fspace_types) ?
+			mc_fspace_types[evt->u.fspace_error.fspace_error_type]
+			: "Unknown";
+		printk("%s  Error type: FOREIGN SPACE [%s]\n", level, subtype);
+		if (evt->u.fspace_error.effective_address_provided)
+			printk("%s    Effective address: %016llx\n",
+			       level, evt->u.fspace_error.effective_address);
+		break;
+	case MCE_ERROR_TYPE_ASYNC:
+		subtype = evt->u.async_error.async_error_type <
+			ARRAY_SIZE(mc_async_types) ?
+			mc_async_types[evt->u.async_error.async_error_type]
+			: "Unknown";
+		printk("%s  Error type: ASYNC MC [%s]\n", level, subtype);
+		break;
 	default:
 	case MCE_ERROR_TYPE_UNKNOWN:
 		printk("%s  Error type: Unknown\n", level);
@@ -367,6 +427,14 @@ uint64_t get_mce_fault_addr(struct OpalMachineCheckEvent *evt)
 		if (evt->u.tlb_error.effective_address_provided)
 			return evt->u.tlb_error.effective_address;
 		break;
+	case MCE_ERROR_TYPE_NEST:
+		if (evt->u.nest_error.effective_address_provided)
+			return evt->u.nest_error.effective_address;
+		break;
+	case MCE_ERROR_TYPE_CRESP:
+		if (evt->u.cresp_error.effective_address_provided)
+			return evt->u.cresp_error.effective_address;
+		break;
 	default:
 	case MCE_ERROR_TYPE_UNKNOWN:
 		break;

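Each new case in the diff above repeats the same pattern: the subtype code is compared against ARRAY_SIZE() of its string table, and out-of-range codes fall back to "Unknown". A minimal userspace sketch of that pattern follows. The helper name subtype_name() is hypothetical and is not part of the patch; only the two strings are copied from mc_async_types.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Strings copied from the mc_async_types table in the patch. */
static const char *const mc_async_types[] = {
	"Real address error (CRESP) from store",
	"Foreign link time out (nest abort) due to a store instruction",
};

/*
 * Hypothetical helper (not in the patch): return the string for idx,
 * or "Unknown" when idx is past the end of the table.
 */
static const char *subtype_name(const char *const *table, size_t n,
				unsigned int idx)
{
	return idx < n ? table[idx] : "Unknown";
}
```

With such a helper, a subtype code that firmware adds later degrades to "Unknown" rather than indexing past the end of the table.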

* [RFC PATCH 5/7] powerpc/book3s: Don't turn on the MSR[ME] bit until opal processes the reason.
  2017-02-21  1:51 [RFC PATCH 0/7] Machine check handling for Power9 with bacward compatibility Mahesh J Salgaonkar
                   ` (3 preceding siblings ...)
  2017-02-21  1:52 ` [RFC PATCH 4/7] powerpc/book3s: Print additional MCE errors introduced in power9 Mahesh J Salgaonkar
@ 2017-02-21  1:52 ` Mahesh J Salgaonkar
  2017-02-21  2:47   ` Nicholas Piggin
  2017-02-21  1:53 ` [RFC PATCH 6/7] powerpc/book3s: Display more info for MCE error console log Mahesh J Salgaonkar
  2017-02-21  1:53 ` [RFC PATCH 7/7] powerpc/book3s: Display task info for MCE error in user mode Mahesh J Salgaonkar
  6 siblings, 1 reply; 13+ messages in thread
From: Mahesh J Salgaonkar @ 2017-02-21  1:52 UTC (permalink / raw)
  To: linuxppc-dev, Benjamin Herrenschmidt; +Cc: Paul Mackerras, Nicholas Piggin

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Delay turning on MSR[ME] until the machine_check_early() call has returned,
i.e. until OPAL is done processing the MCE.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/exceptions-64s.S |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index d39d611..fa768a7 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -238,6 +238,12 @@ BEGIN_FTR_SECTION
 	std	r9,_CCR(r1)		/* Save CR in stackframe */
 	/* Save r9 through r13 from EXMC save area to stack frame. */
 	EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
+	std	r0,GPR0(r1)	/* Save r0 */
+	EXCEPTION_PROLOG_COMMON_3(0x200)
+	bl	save_nvgprs
+	addi	r3,r1,STACK_FRAME_OVERHEAD
+	bl	machine_check_early
+	std	r3,RESULT(r1)	/* Save result */
 	mfmsr	r11			/* get MSR value */
 	ori	r11,r11,MSR_ME		/* turn on ME bit */
 	ori	r11,r11,MSR_RI		/* turn on RI bit */
@@ -345,12 +351,6 @@ EXC_COMMON_BEGIN(machine_check_common)
 	 * ME=1, MMU (IR=0 and DR=0) off and using MC emergency stack.
 	 */
 EXC_COMMON_BEGIN(machine_check_handle_early)
-	std	r0,GPR0(r1)	/* Save r0 */
-	EXCEPTION_PROLOG_COMMON_3(0x200)
-	bl	save_nvgprs
-	addi	r3,r1,STACK_FRAME_OVERHEAD
-	bl	machine_check_early
-	std	r3,RESULT(r1)	/* Save result */
 	ld	r12,_MSR(r1)
 #ifdef	CONFIG_PPC_P7_NAP
 	/*


* [RFC PATCH 6/7] powerpc/book3s: Display more info for MCE error console log.
  2017-02-21  1:51 [RFC PATCH 0/7] Machine check handling for Power9 with bacward compatibility Mahesh J Salgaonkar
                   ` (4 preceding siblings ...)
  2017-02-21  1:52 ` [RFC PATCH 5/7] powerpc/book3s: Don't turn on the MSR[ME] bit until opal processes the reason Mahesh J Salgaonkar
@ 2017-02-21  1:53 ` Mahesh J Salgaonkar
  2017-02-21  1:53 ` [RFC PATCH 7/7] powerpc/book3s: Display task info for MCE error in user mode Mahesh J Salgaonkar
  6 siblings, 0 replies; 13+ messages in thread
From: Mahesh J Salgaonkar @ 2017-02-21  1:53 UTC (permalink / raw)
  To: linuxppc-dev, Benjamin Herrenschmidt; +Cc: Paul Mackerras, Nicholas Piggin

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

For D-side errors we print the data load/store address that caused the MC
as 'Effective address'. In addition, print the NIP at which the interrupt
was taken.

After this patch the MCE console log would look like:

[1150485.962090] Severe Machine check interrupt [Recovered]
[1150485.962114]   Initiator: CPU
[1150485.962139]   NIP [c000000000018b00]: sched_clock+0x8/0x34
[1150485.962166]   Error type: ERAT [Multihit]
[1150485.962190]     Effective address: 00003fff8f6b0000

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/mce.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index da12992..035ef53 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -320,6 +320,8 @@ void machine_check_print_event_info(struct OpalMachineCheckEvent *evt)
 	       "Recovered" : "[Not recovered");
 	printk("%s  Initiator: %s\n", level,
 	       evt->initiator == MCE_INITIATOR_CPU ? "CPU" : "Unknown");
+	printk("%s  NIP [%016llx]: %pS\n", level, evt->srr0,
+							(void *)evt->srr0);
 	switch (evt->error_type) {
 	case MCE_ERROR_TYPE_UE:
 		subtype = evt->u.ue_error.ue_error_type <


* [RFC PATCH 7/7] powerpc/book3s: Display task info for MCE error in user mode.
  2017-02-21  1:51 [RFC PATCH 0/7] Machine check handling for Power9 with bacward compatibility Mahesh J Salgaonkar
                   ` (5 preceding siblings ...)
  2017-02-21  1:53 ` [RFC PATCH 6/7] powerpc/book3s: Display more info for MCE error console log Mahesh J Salgaonkar
@ 2017-02-21  1:53 ` Mahesh J Salgaonkar
  6 siblings, 0 replies; 13+ messages in thread
From: Mahesh J Salgaonkar @ 2017-02-21  1:53 UTC (permalink / raw)
  To: linuxppc-dev, Benjamin Herrenschmidt; +Cc: Paul Mackerras, Nicholas Piggin

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

For an MCE that hits while in user mode, MSR(HV=1,PR=1), print the task info
in the console MCE error log. This helps to identify the application that
hit the MCE.

After this patch the MCE console log would look like:

[    2.246155] Severe Machine check interrupt [Recovered]
[    2.246178]   Initiator: CPU
[    2.246199]   NIP: [0000000010039778] PID: 813 Comm: ebizzy
[    2.246223]   Error type: ERAT [Multihit]
[    2.246244]     Effective address: 00003fff94070000

[114560.247515] Severe Machine check interrupt [Recovered]
[114560.247562]   Initiator: CPU
[114560.247599]   NIP [d00000000d2e019c]: init_module+0x19c/0x260 [bork_kernel]
[114560.247666]   Error type: SLB [Multihit]
[114560.247701]     Effective address: d000000023db0000


Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/mce.h        |    3 ++-
 arch/powerpc/kernel/mce.c             |   12 +++++++++---
 arch/powerpc/platforms/powernv/opal.c |    2 +-
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index 69e4a42..99dd1f3 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -91,7 +91,8 @@ extern int get_mce_event(struct OpalMachineCheckEvent *mce, bool release);
 extern int set_mce_event(struct OpalMachineCheckEvent *mce);
 extern void release_mce_event(void);
 extern void machine_check_queue_event(void);
-extern void machine_check_print_event_info(struct OpalMachineCheckEvent *evt);
+extern void machine_check_print_event_info(struct OpalMachineCheckEvent *evt,
+							bool user_mode);
 extern uint64_t get_mce_fault_addr(struct OpalMachineCheckEvent *evt);
 extern long handle_mce_errors(struct pt_regs *regs,
 					struct OpalMachineCheckEvent *evt);
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 035ef53..af36824 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -235,12 +235,13 @@ static void machine_check_process_queued_event(struct irq_work *work)
 	while (__this_cpu_read(mce_queue_count) > 0) {
 		index = __this_cpu_read(mce_queue_count) - 1;
 		machine_check_print_event_info(
-				this_cpu_ptr(&mce_event_queue[index]));
+				this_cpu_ptr(&mce_event_queue[index]), false);
 		__this_cpu_dec(mce_queue_count);
 	}
 }
 
-void machine_check_print_event_info(struct OpalMachineCheckEvent *evt)
+void machine_check_print_event_info(struct OpalMachineCheckEvent *evt,
+							bool user_mode)
 {
 	const char *level, *sevstr, *subtype;
 	static const char *mc_ue_types[] = {
@@ -320,8 +321,13 @@ void machine_check_print_event_info(struct OpalMachineCheckEvent *evt)
 	       "Recovered" : "[Not recovered");
 	printk("%s  Initiator: %s\n", level,
 	       evt->initiator == MCE_INITIATOR_CPU ? "CPU" : "Unknown");
-	printk("%s  NIP [%016llx]: %pS\n", level, evt->srr0,
+	if (user_mode) {
+		printk("%s  NIP: [%016llx] PID: %d Comm: %s\n", level,
+				evt->srr0, current->pid, current->comm);
+	} else {
+		printk("%s  NIP [%016llx]: %pS\n", level, evt->srr0,
 							(void *)evt->srr0);
+	}
 	switch (evt->error_type) {
 	case MCE_ERROR_TYPE_UE:
 		subtype = evt->u.ue_error.ue_error_type <
diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
index f1115c4..49f193c 100644
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -444,7 +444,7 @@ int opal_machine_check(struct pt_regs *regs)
 		       evt.version);
 		return 0;
 	}
-	machine_check_print_event_info(&evt);
+	machine_check_print_event_info(&evt, user_mode(regs));
 
 	if (opal_recover_mce(regs, &evt))
 		return 1;

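The user_mode branch added to machine_check_print_event_info() can be sketched in userspace as below. This is only an illustration under stated assumptions: format_nip_line() is a hypothetical name, the pid and comm arguments stand in for current->pid and current->comm, and the sym argument stands in for the symbol text that %pS resolves inside the kernel.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the NIP line printed by the patch.
 * User mode: report the task (PID and command name).
 * Kernel mode: report the symbol; 'sym' replaces the kernel-only %pS.
 */
static int format_nip_line(char *buf, size_t len, uint64_t nip,
			   bool user_mode, int pid, const char *comm,
			   const char *sym)
{
	if (user_mode)
		return snprintf(buf, len,
				"  NIP: [%016" PRIx64 "] PID: %d Comm: %s",
				nip, pid, comm);
	return snprintf(buf, len, "  NIP [%016" PRIx64 "]: %s", nip, sym);
}
```

Called with the values from the two sample logs in the commit message, this reproduces their NIP lines: the first with PID 813 and Comm ebizzy, the second with the init_module symbol.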

* Re: [RFC PATCH 1/7] powerpc/book3s: Move machine check event structure to opal-api.h
  2017-02-21  1:51 ` [RFC PATCH 1/7] powerpc/book3s: Move machine check event structure to opal-api.h Mahesh J Salgaonkar
@ 2017-02-21  2:35   ` Nicholas Piggin
  2017-02-21  6:51     ` Mahesh Jagannath Salgaonkar
  0 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2017-02-21  2:35 UTC (permalink / raw)
  To: Mahesh J Salgaonkar
  Cc: linuxppc-dev, Benjamin Herrenschmidt, Paul Mackerras, skiboot

On Tue, 21 Feb 2017 07:21:56 +0530
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:

> +enum MCE_TlbErrorType {
> +	MCE_TLB_ERROR_INDETERMINATE = 0,
> +	MCE_TLB_ERROR_PARITY = 1,
> +	MCE_TLB_ERROR_MULTIHIT = 2,
> +	MCE_TLB_ERROR_TLBIEL_PROG_ERROR = 3,
> +};

The new TLBIE error isn't really a TLB error as such. Not a hardware error.
I added a new "user" type for it.

I don't think we can handle it just by flushing the TLB, because it can also
be raised in response to an invalid non-local tlbie. We could maybe flush all
TLBs, but I think we would also have to advance the nip to return to.

> +
> +enum MCE_NestErrorType {
> +	MCE_NEST_ERROR_ABRT_IFETCH = 0,
> +	MCE_NEST_ERROR_ABRT_IFETCH_TABLEWALK = 1,
> +	MCE_NEST_ERROR_ABRT_LOAD = 2,
> +	MCE_NEST_ERROR_ABRT_LOAD_TABLEWALK = 3,
> +};
> +
> +enum MCE_CrespErrorType {
> +	MCE_CRESP_ERROR_BAD_RADDR_IFETCH = 0,
> +	MCE_CRESP_ERROR_BAD_RADDR_IFETCH_TABLEWALK = 1,
> +	MCE_CRESP_ERROR_BAD_RADDR_LOAD = 2,
> +	MCE_CRESP_ERROR_BAD_RADDR_LOAD_TABLEWALK = 3,
> +};
> +
> +enum MCE_FspaceErrorType {
> +	MCE_FSPACE_ERROR_IFETCH = 0,
> +	MCE_FSPACE_ERROR_IFETCH_TABLEWALK = 1,
> +	MCE_FSPACE_ERROR_RADDR_TRANSLATION = 2,
> +	MCE_FSPACE_ERROR_RADDR_LOAD = 3,
> +};
> +
> +enum MCE_AsyncErrorType {
> +	MCE_ASYNC_ERROR_REAL_ADDR_STORE = 0,
> +	MCE_ASYNC_ERROR_NEST_ABRT_STORE = 1,
> +};
> +
> +struct OpalMachineCheckEvent {

Can we have more of a think about this structure and error types
before making it an OPAL API?

Errors don't always fit neatly into a simple classification like
this. For example "async" is not really an error. It's a property
of how the error is reported. The error is a timeout or real
address error. And it's caused by a store. And initiated by nest
or cResp... Other errors are caused by a table walk that was
caused by a store, etc.

I shoehorned these async errors into realaddr/link types in my
patch along with a different severity (i.e., not SYNC). But I
think we can do a lot better with a clean slate for OPAL.

More general thing is, I wonder how much we need to know of the
implementation details in this API? This still seems like it's
unnecessarily split between OS and FW. I think it would be much
nicer if we just return a set of things that the OS can usefully
respond to and have firmware construct the detailed messages for
logging.

That way we'll have far fewer new types of errors we don't know
how to handle, and never have to report an unknown error.

Thanks,
Nick


* Re: [RFC PATCH 5/7] powerpc/book3s: Don't turn on the MSR[ME] bit until opal processes the reason.
  2017-02-21  1:52 ` [RFC PATCH 5/7] powerpc/book3s: Don't turn on the MSR[ME] bit until opal processes the reason Mahesh J Salgaonkar
@ 2017-02-21  2:47   ` Nicholas Piggin
  2017-02-21  4:17     ` Mahesh Jagannath Salgaonkar
  0 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2017-02-21  2:47 UTC (permalink / raw)
  To: Mahesh J Salgaonkar; +Cc: linuxppc-dev, Benjamin Herrenschmidt, Paul Mackerras

On Tue, 21 Feb 2017 07:22:56 +0530
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:

> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> 
> Delay it until we are done with machine_check_early() call. Turn on MSR[ME]
> once opal is done with processing MCE.

Why? This seems like quite a regression -- the MCE handler today
has about 60 instructions and 30 l/st with ME clear.


> 
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
>  arch/powerpc/kernel/exceptions-64s.S |   12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index d39d611..fa768a7 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -238,6 +238,12 @@ BEGIN_FTR_SECTION
>  	std	r9,_CCR(r1)		/* Save CR in stackframe */
>  	/* Save r9 through r13 from EXMC save area to stack frame. */
>  	EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
> +	std	r0,GPR0(r1)	/* Save r0 */
> +	EXCEPTION_PROLOG_COMMON_3(0x200)
> +	bl	save_nvgprs
> +	addi	r3,r1,STACK_FRAME_OVERHEAD
> +	bl	machine_check_early
> +	std	r3,RESULT(r1)	/* Save result */
>  	mfmsr	r11			/* get MSR value */
>  	ori	r11,r11,MSR_ME		/* turn on ME bit */
>  	ori	r11,r11,MSR_RI		/* turn on RI bit */
> @@ -345,12 +351,6 @@ EXC_COMMON_BEGIN(machine_check_common)
>  	 * ME=1, MMU (IR=0 and DR=0) off and using MC emergency stack.
>  	 */
>  EXC_COMMON_BEGIN(machine_check_handle_early)
> -	std	r0,GPR0(r1)	/* Save r0 */
> -	EXCEPTION_PROLOG_COMMON_3(0x200)
> -	bl	save_nvgprs
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
> -	bl	machine_check_early
> -	std	r3,RESULT(r1)	/* Save result */
>  	ld	r12,_MSR(r1)
>  #ifdef	CONFIG_PPC_P7_NAP
>  	/*
> 


* Re: [RFC PATCH 5/7] powerpc/book3s: Don't turn on the MSR[ME] bit until opal processes the reason.
  2017-02-21  2:47   ` Nicholas Piggin
@ 2017-02-21  4:17     ` Mahesh Jagannath Salgaonkar
  2017-02-21  4:43       ` Nicholas Piggin
  0 siblings, 1 reply; 13+ messages in thread
From: Mahesh Jagannath Salgaonkar @ 2017-02-21  4:17 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: linuxppc-dev, Benjamin Herrenschmidt, Paul Mackerras

On 02/21/2017 08:17 AM, Nicholas Piggin wrote:
> On Tue, 21 Feb 2017 07:22:56 +0530
> Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:
> 
>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>
>> Delay it until we are done with machine_check_early() call. Turn on MSR[ME]
>> once opal is done with processing MCE.
> 
> Why? This seems like quite a regression -- the MCE handler today
> has about 60 instructions and 30 l/st with ME clear.

I understand that this is a bit of a long window. But we are in MCE handling
code, and if we hit an MCE while doing that we may end up with recursive
MCE interrupts anyway, without really being able to recover from them.
Instead, let's risk a checkstop, which would get us rebooted with hostboot
throwing a proper error callout.

-Mahesh.


* Re: [RFC PATCH 5/7] powerpc/book3s: Don't turn on the MSR[ME] bit until opal processes the reason.
  2017-02-21  4:17     ` Mahesh Jagannath Salgaonkar
@ 2017-02-21  4:43       ` Nicholas Piggin
  0 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2017-02-21  4:43 UTC (permalink / raw)
  To: Mahesh Jagannath Salgaonkar
  Cc: linuxppc-dev, Benjamin Herrenschmidt, Paul Mackerras

On Tue, 21 Feb 2017 09:47:53 +0530
Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:

> On 02/21/2017 08:17 AM, Nicholas Piggin wrote:
> > On Tue, 21 Feb 2017 07:22:56 +0530
> > Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:
> >   
> >> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> >>
> >> Delay it until we are done with machine_check_early() call. Turn on MSR[ME]
> >> once opal is done with processing MCE.  
> > 
> > Why? This seems like quite a regression -- the MCE handler today
> > has about 60 instructions and 30 l/st with ME clear.  
> 
> I understand that this is bit long window. But we are in MCE handling
> code and if we hit MCE while doing that we may anyway end up with
> recursive MCE interrupts without really be able to recover from it.

There is careful code to handle recursive machine checks though.
Things should be structured so we will handle recursive MCEs and
recover/fail/checkstop properly.

> Instead lets risk checkstop which would get us rebooted with hostboot
> throwing proper error call out.

I'd like more justification for the proposed change. How is it an
improvement?

Thanks,
Nick


* Re: [RFC PATCH 1/7] powerpc/book3s: Move machine check event structure to opal-api.h
  2017-02-21  2:35   ` Nicholas Piggin
@ 2017-02-21  6:51     ` Mahesh Jagannath Salgaonkar
  0 siblings, 0 replies; 13+ messages in thread
From: Mahesh Jagannath Salgaonkar @ 2017-02-21  6:51 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: linuxppc-dev, Benjamin Herrenschmidt, Paul Mackerras, skiboot

On 02/21/2017 08:05 AM, Nicholas Piggin wrote:
> On Tue, 21 Feb 2017 07:21:56 +0530
> Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:
> 
>> +enum MCE_TlbErrorType {
>> +	MCE_TLB_ERROR_INDETERMINATE = 0,
>> +	MCE_TLB_ERROR_PARITY = 1,
>> +	MCE_TLB_ERROR_MULTIHIT = 2,
>> +	MCE_TLB_ERROR_TLBIEL_PROG_ERROR = 3,
>> +};
> 
> The new TLBIE error isn't really a TLB error as such. Not a hardware error.
> I added a new "user" type for it.
> 
> I don't think we can handle it just by flushing TLB because it can also be
> raised in response to invalid non-local tlbie. We could flush all TLBs maybe
> but I think also have to advance nip to return to.

ok got it.

> 
>> +
>> +enum MCE_NestErrorType {
>> +	MCE_NEST_ERROR_ABRT_IFETCH = 0,
>> +	MCE_NEST_ERROR_ABRT_IFETCH_TABLEWALK = 1,
>> +	MCE_NEST_ERROR_ABRT_LOAD = 2,
>> +	MCE_NEST_ERROR_ABRT_LOAD_TABLEWALK = 3,
>> +};
>> +
>> +enum MCE_CrespErrorType {
>> +	MCE_CRESP_ERROR_BAD_RADDR_IFETCH = 0,
>> +	MCE_CRESP_ERROR_BAD_RADDR_IFETCH_TABLEWALK = 1,
>> +	MCE_CRESP_ERROR_BAD_RADDR_LOAD = 2,
>> +	MCE_CRESP_ERROR_BAD_RADDR_LOAD_TABLEWALK = 3,
>> +};
>> +
>> +enum MCE_FspaceErrorType {
>> +	MCE_FSPACE_ERROR_IFETCH = 0,
>> +	MCE_FSPACE_ERROR_IFETCH_TABLEWALK = 1,
>> +	MCE_FSPACE_ERROR_RADDR_TRANSLATION = 2,
>> +	MCE_FSPACE_ERROR_RADDR_LOAD = 3,
>> +};
>> +
>> +enum MCE_AsyncErrorType {
>> +	MCE_ASYNC_ERROR_REAL_ADDR_STORE = 0,
>> +	MCE_ASYNC_ERROR_NEST_ABRT_STORE = 1,
>> +};
>> +
>> +struct OpalMachineCheckEvent {
> 
> Can we have more of a think about this structure and error types
> before making it an OPAL API?

Agreed. I was thinking we could just replace the entire union with the
fields below:

	uint8_t 	specific_error_type;		/* 0x20 */
	uint8_t		effective_address_provided;	/* 0x21 */
	uint8_t		physical_address_provided;	/* 0x22 */
	uint8_t		reserved_1[5];			/* 0x23 */
	uint64_t	effective_address;		/* 0x28 */
	uint64_t	physical_address;		/* 0x30 */
	uint8_t		reserved_2[8];			/* 0x38 */
};

What do you say? We may add a few more reserved bytes for future use.
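
The offsets annotated in the proposed layout can be checked at compile time. The sketch below assumes that everything before these fields occupies exactly 0x20 bytes, as the 0x20 annotation on the first field implies. The struct name and the opaque header[] member are hypothetical stand-ins, not the real OPAL definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mce_event_sketch {
	uint8_t  header[0x20];               /* assumption: fields before 0x20 */
	uint8_t  specific_error_type;        /* 0x20 */
	uint8_t  effective_address_provided; /* 0x21 */
	uint8_t  physical_address_provided;  /* 0x22 */
	uint8_t  reserved_1[5];              /* 0x23 */
	uint64_t effective_address;          /* 0x28 */
	uint64_t physical_address;           /* 0x30 */
	uint8_t  reserved_2[8];              /* 0x38 */
};

/* Fail the build if any annotated offset does not hold. */
static_assert(offsetof(struct mce_event_sketch, specific_error_type) == 0x20,
	      "specific_error_type offset");
static_assert(offsetof(struct mce_event_sketch, reserved_1) == 0x23,
	      "reserved_1 offset");
static_assert(offsetof(struct mce_event_sketch, effective_address) == 0x28,
	      "effective_address offset");
static_assert(offsetof(struct mce_event_sketch, physical_address) == 0x30,
	      "physical_address offset");
static_assert(offsetof(struct mce_event_sketch, reserved_2) == 0x38,
	      "reserved_2 offset");
static_assert(sizeof(struct mce_event_sketch) == 0x40, "total size");
```

The five reserved bytes at 0x23 are what bring effective_address to an 8-byte boundary, so the layout needs no compiler-inserted padding and would total 0x40 bytes under this assumption.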

> 
> Errors don't always fit neatly into a simple classification like
> this. For example "async" is not really an error. It's a property
> of how the error is reported. The error is a timeout or real
> address error. And it's caused by a store. And initiated by nest
> or cResp... Other errors are caused by a table walk that was
> caused by a store, etc.
> 
> I shoehorned these async errors into realaddr/link types in my
> patch along with a different severity (i.e., not SYNC). But I
> think we can do a lot better with a clean slate for OPAL.

I see.

> 
> More general thing is, I wonder how much we need to know of the
> implementation details in this API? This still seems like it's
> unnecessarily split between OS and FW. I think it would be much
> nicer if we just return a set of things that the OS can usefully
> respond to and have firmware construct the detailed messages for
> logging.
> 
> That way we'll have much fewer new types of errors we don't know
> how to handle, and never have to report unknown error.

Makes sense. That would make Linux MCE error printing much simpler, and
we may never have to modify it to add new strings. We can probably add a
char buffer to the machine check struct or send it as a separate string
buffer.
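
One possible shape for such an interface is sketched below, purely as an illustration. Every name here (mce_action_sketch, mce_report_sketch, handle_report) and the 64-byte message size are hypothetical; nothing in this thread defines them. Treating an unrecognised action as unrecoverable is also an assumption of the sketch.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical set of actions the OS knows how to carry out. */
enum mce_action_sketch {
	MCE_ACT_NONE = 0,               /* nothing to do, resume */
	MCE_ACT_FLUSH_TRANSLATIONS = 1, /* flush SLB/ERAT/TLB, resume */
	MCE_ACT_FATAL = 2,              /* cannot recover */
};

/* Hypothetical report: an action code plus firmware-built log text. */
struct mce_report_sketch {
	uint8_t action;      /* one of enum mce_action_sketch */
	char    message[64]; /* NUL-terminated text from firmware */
};

/*
 * Log the firmware text verbatim and decide whether to resume.
 * Returns 1 if execution may continue, 0 otherwise. Actions this
 * OS does not recognise are treated as unrecoverable.
 */
static int handle_report(const struct mce_report_sketch *r,
			 char *log, size_t len)
{
	snprintf(log, len, "Machine check: %.*s",
		 (int)sizeof(r->message), r->message);

	switch (r->action) {
	case MCE_ACT_NONE:
	case MCE_ACT_FLUSH_TRANSLATIONS:
		return 1;
	default:
		return 0;
	}
}
```

In this arrangement the OS side needs no string tables: a new error class only needs a new message from firmware, provided it maps onto an action the OS already implements.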

Thanks,
-Mahesh.
