Linux-EDAC Archive on lore.kernel.org
 help / color / Atom feed
* [tip:ras/core] x86/mce: Don't check for the overflow bit on action optional machine checks
       [not found] <20190718182920.32621-1-tony.luck@intel.com>
@ 2019-08-05  7:46 ` tip-bot for Tony Luck
  0 siblings, 0 replies; only message in thread
From: tip-bot for Tony Luck @ 2019-08-05  7:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-edac, x86, mingo, tony.luck, tglx, bp, linux-kernel,
	yongkaiwu, mingo, hpa

Commit-ID:  aaefca8e30d9df7a4ca13c9c8e135dd227b8ff19
Gitweb:     https://git.kernel.org/tip/aaefca8e30d9df7a4ca13c9c8e135dd227b8ff19
Author:     Tony Luck <tony.luck@intel.com>
AuthorDate: Thu, 18 Jul 2019 11:29:20 -0700
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Mon, 5 Aug 2019 09:34:02 +0200

x86/mce: Don't check for the overflow bit on action optional machine checks

We currently do not process SRAO (Software Recoverable Action Optional)
machine checks if they are logged with the overflow bit set to 1 in the
machine check bank status register. This is overly conservative.

There are two cases where we could end up with an SRAO+OVER log based
on the SDM volume 3 overwrite rules in "Table 15-8. Overwrite Rules for
UC, CE, and UCR Errors"

1) First a corrected error is logged, then the SRAO error overwrites.
   The second error overwrites the first because uncorrected errors
   have a higher severity than corrected errors.
2) The SRAO error was logged first, followed by a correcetd error.
   In this case the first error is retained in the bank.

So in either case the machine check bank will contain the address
of the SRAO error. So we can process that even if the overflow bit
was set.

Reported-by: Yongkai Wu <yongkaiwu@tencent.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190718182920.32621-1-tony.luck@intel.com
---
 arch/x86/kernel/cpu/mce/severity.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index 210f1f5db5f7..87bcdc6dc2f0 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -107,11 +107,11 @@ static struct severity {
 	 */
 	MCESEV(
 		AO, "Action optional: memory scrubbing error",
-		SER, MASK(MCI_STATUS_OVER|MCI_UC_AR|MCACOD_SCRUBMSK, MCI_STATUS_UC|MCACOD_SCRUB)
+		SER, MASK(MCI_UC_AR|MCACOD_SCRUBMSK, MCI_STATUS_UC|MCACOD_SCRUB)
 		),
 	MCESEV(
 		AO, "Action optional: last level cache writeback error",
-		SER, MASK(MCI_STATUS_OVER|MCI_UC_AR|MCACOD, MCI_STATUS_UC|MCACOD_L3WB)
+		SER, MASK(MCI_UC_AR|MCACOD, MCI_STATUS_UC|MCACOD_L3WB)
 		),
 
 	/* ignore OVER for UCNA */

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, back to index

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20190718182920.32621-1-tony.luck@intel.com>
2019-08-05  7:46 ` [tip:ras/core] x86/mce: Don't check for the overflow bit on action optional machine checks tip-bot for Tony Luck

Linux-EDAC Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-edac/0 linux-edac/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-edac linux-edac/ https://lore.kernel.org/linux-edac \
		linux-edac@vger.kernel.org linux-edac@archiver.kernel.org
	public-inbox-index linux-edac


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-edac


AGPL code for this site: git clone https://public-inbox.org/ public-inbox