* [PATCH 1/2] x86/MCE/AMD: Always give panic severity for UC errors in kernel context
@ 2017-11-06 17:46 Borislav Petkov
  2017-11-06 17:46 ` [PATCH 2/2] x86/MCE/AMD: Fix mce_severity_amd_smca() signature Borislav Petkov
  2017-11-07 10:15   ` tip-bot for Borislav Petkov
From: Borislav Petkov @ 2017-11-06 17:46 UTC
  To: X86 ML; +Cc: LKML

From: Yazen Ghannam <yazen.ghannam@amd.com>

The AMD severity grading function was introduced in kernel 4.1. The
current logic can return MCE_AR_SEVERITY for uncorrectable errors in
kernel context, for example when MCG_STATUS_RIPV is set. The system may
then get stuck in a loop, as memory_failure() will try to handle the bad
kernel memory and find it busy.

Return MCE_PANIC_SEVERITY for all UC errors in IN_KERNEL context on AMD
systems.

After:

  b2f9d678e28c ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries")

was accepted in v4.6, this issue was masked because of the tail-end attempt
at kernel mode recovery in the #MC handler.

However, uncorrectable errors in IN_KERNEL context should always be
considered unrecoverable and cause a panic.
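
The behavioural change can be illustrated with a small userspace sketch.
It is not the kernel code: struct mce_sketch, grade_before() and
grade_after() are hypothetical names, the enums are trimmed-down
stand-ins for the kernel ones, and only the non-SMCA branch visible in
the diff below is modelled. Grading of non-UC errors is omitted.

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel definitions (assumptions). */
enum context  { IN_KERNEL = 1, IN_USER = 2 };
enum severity { MCE_NO_SEVERITY, MCE_AR_SEVERITY, MCE_PANIC_SEVERITY };

#define MCI_STATUS_UC   (1ULL << 61)	/* uncorrected error */
#define MCG_STATUS_RIPV (1ULL << 0)	/* restart IP valid */

/* Hypothetical reduced form of struct mce: only the fields used here. */
struct mce_sketch {
	uint64_t status;
	uint64_t mcgstatus;
};

/* Before the patch: kernel context panics only when RIPV is clear. */
static enum severity grade_before(const struct mce_sketch *m, enum context ctx)
{
	if (!(m->status & MCI_STATUS_UC))
		return MCE_NO_SEVERITY;	/* non-UC grading omitted */

	if (!(m->mcgstatus & MCG_STATUS_RIPV) && ctx == IN_KERNEL)
		return MCE_PANIC_SEVERITY;

	return MCE_AR_SEVERITY;		/* kill current process */
}

/* After the patch: any UC error in kernel context panics. */
static enum severity grade_after(const struct mce_sketch *m, enum context ctx)
{
	if (!(m->status & MCI_STATUS_UC))
		return MCE_NO_SEVERITY;	/* non-UC grading omitted */

	if (ctx == IN_KERNEL)
		return MCE_PANIC_SEVERITY;

	return MCE_AR_SEVERITY;		/* kill current process */
}
```

With MCI_STATUS_UC and MCG_STATUS_RIPV both set and ctx == IN_KERNEL,
grade_before() yields MCE_AR_SEVERITY while grade_after() yields
MCE_PANIC_SEVERITY. In user context both return MCE_AR_SEVERITY.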

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1509562746-6313-1-git-send-email-Yazen.Ghannam@amd.com
Fixes: bf80bbd7dcf5 ("x86/mce: Add an AMD severities-grading function")
Cc: <stable@vger.kernel.org> # 4.9.x
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/cpu/mcheck/mce-severity.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index 87cc9ab7a13c..4b8187639c2d 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -245,6 +245,9 @@ static int mce_severity_amd(struct mce *m, int tolerant, char **msg, bool is_exc
 
 	if (m->status & MCI_STATUS_UC) {
 
+		if (ctx == IN_KERNEL)
+			return MCE_PANIC_SEVERITY;
+
 		/*
 		 * On older systems where overflow_recov flag is not present, we
 		 * should simply panic if an error overflow occurs. If
@@ -255,10 +258,6 @@ static int mce_severity_amd(struct mce *m, int tolerant, char **msg, bool is_exc
 			if (mce_flags.smca)
 				return mce_severity_amd_smca(m, ctx);
 
-			/* software can try to contain */
-			if (!(m->mcgstatus & MCG_STATUS_RIPV) && (ctx == IN_KERNEL))
-				return MCE_PANIC_SEVERITY;
-
 			/* kill current process */
 			return MCE_AR_SEVERITY;
 		} else {
-- 
2.13.0


