linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] x86/mce: Fix mce regression from recent cleanup
  2013-07-24 20:58 [PATCH 0/2] machine check decode fixes Tony Luck
@ 2013-07-24 17:09 ` Tony Luck
  2013-07-24 20:54 ` [PATCH 2/2] x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors Tony Luck
  2013-07-25 10:40 ` [PATCH 0/2] machine check decode fixes Naveen N. Rao
  2 siblings, 0 replies; 4+ messages in thread
From: Tony Luck @ 2013-07-24 17:09 UTC
  To: linux-kernel; +Cc: Naveen N. Rao, Borislav Petkov, Chen Gong

    commit 33d7885b594e169256daef652e8d3527b2298e75
    x86/mce: Update MCE severity condition check

Simplified the rules for recognising each class of recoverable
machine check, combining the instruction and data fetch rules into a
single entry. That was based on clarifications in the June 2013 SDM that
all recoverable events are reported on the unaffected processor with
MCG_STATUS.EIPV=0 and MCG_STATUS.RIPV=1.  Unfortunately the simplified
rule has two bugs: it includes MCACOD in the status mask without a
matching value, so it only matches when MCACOD is zero, and it does not
check that MCG_STATUS.EIPV is clear.  Fix both here.

Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/mcheck/mce-severity.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index e2703520..c370e1c 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -111,8 +111,8 @@ static struct severity {
 #ifdef	CONFIG_MEMORY_FAILURE
 	MCESEV(
 		KEEP, "Action required but unaffected thread is continuable",
-		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, MCI_UC_SAR|MCI_ADDR),
-		MCGMASK(MCG_STATUS_RIPV, MCG_STATUS_RIPV)
+		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR, MCI_UC_SAR|MCI_ADDR),
+		MCGMASK(MCG_STATUS_RIPV|MCG_STATUS_EIPV, MCG_STATUS_RIPV)
 		),
 	MCESEV(
 		AR, "Action required: data load error in a user process",
-- 
1.8.1.4
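
For readers following the mask/result logic in this patch, here is a
minimal standalone sketch of how a severity rule could be matched. It is
not the kernel code: struct rule and rule_matches() are hypothetical
names, the comparison is assumed to be "(value & mask) == result" for
both registers, and the bit positions not shown in this thread (OVER,
UC, MISCV, ADDRV, RIPV, EIPV) plus the MCI_UC_SAR and MCI_ADDR
groupings are assumptions based on the SDM register layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_MCi_STATUS bits (assumed from the SDM layout). */
#define MCI_STATUS_OVER  (1ULL << 62)
#define MCI_STATUS_UC    (1ULL << 61)
#define MCI_STATUS_MISCV (1ULL << 59)
#define MCI_STATUS_ADDRV (1ULL << 58)
#define MCI_STATUS_S     (1ULL << 56)
#define MCI_STATUS_AR    (1ULL << 55)
#define MCACOD           0xffffULL   /* value before patch 2/2 */

/* IA32_MCG_STATUS bits (assumed from the SDM layout). */
#define MCG_STATUS_RIPV  (1ULL << 0)
#define MCG_STATUS_EIPV  (1ULL << 1)

/* Assumed groupings used by the severity table. */
#define MCI_UC_SAR (MCI_STATUS_UC | MCI_STATUS_S | MCI_STATUS_AR)
#define MCI_ADDR   (MCI_STATUS_ADDRV | MCI_STATUS_MISCV)

/* Hypothetical stand-in for one entry of the severity table. */
struct rule {
	uint64_t mask, result;       /* applied to MCi_STATUS */
	uint64_t mcgmask, mcgres;    /* applied to MCG_STATUS */
};

/* A rule matches when the masked bits of both registers equal the expected values. */
static bool rule_matches(const struct rule *r, uint64_t status, uint64_t mcgstatus)
{
	return (status & r->mask) == r->result &&
	       (mcgstatus & r->mcgmask) == r->mcgres;
}

/* Rule as it stood before this patch: MCACOD is in the mask but not in the result. */
static const struct rule old_rule = {
	.mask    = MCI_STATUS_OVER | MCI_UC_SAR | MCI_ADDR | MCACOD,
	.result  = MCI_UC_SAR | MCI_ADDR,
	.mcgmask = MCG_STATUS_RIPV,
	.mcgres  = MCG_STATUS_RIPV,
};

/* Rule after this patch: MCACOD is ignored and EIPV must be clear. */
static const struct rule new_rule = {
	.mask    = MCI_STATUS_OVER | MCI_UC_SAR | MCI_ADDR,
	.result  = MCI_UC_SAR | MCI_ADDR,
	.mcgmask = MCG_STATUS_RIPV | MCG_STATUS_EIPV,
	.mcgres  = MCG_STATUS_RIPV,
};
```

Under these assumptions, a status word carrying a non-zero error code
such as 0x0134 with RIPV=1, EIPV=0 is rejected by old_rule and accepted
by new_rule, while a status with EIPV=1 is accepted by old_rule (if its
MCACOD happens to be zero) and rejected by new_rule.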



* [PATCH 2/2] x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors
  2013-07-24 20:58 [PATCH 0/2] machine check decode fixes Tony Luck
  2013-07-24 17:09 ` [PATCH 1/2] x86/mce: Fix mce regression from recent cleanup Tony Luck
@ 2013-07-24 20:54 ` Tony Luck
  2013-07-25 10:40 ` [PATCH 0/2] machine check decode fixes Naveen N. Rao
  2 siblings, 0 replies; 4+ messages in thread
From: Tony Luck @ 2013-07-24 20:54 UTC
  To: linux-kernel; +Cc: Naveen N. Rao, Borislav Petkov, Chen Gong

The 0x1000 bit of the MCACOD field of machine check MCi_STATUS
registers is only defined for corrected errors (where it means
that hardware may be filtering errors; see SDM section 15.9.2.1).

For uncorrected errors it may or may not be set, so we should mask
it out when checking for the architecturally defined recoverable
error signatures (see SDM 15.9.3.1 and 15.9.3.2).

Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/mce.h | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 29e3093..aa97342 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -32,11 +32,20 @@
 #define MCI_STATUS_PCC   (1ULL<<57)  /* processor context corrupt */
 #define MCI_STATUS_S	 (1ULL<<56)  /* Signaled machine check */
 #define MCI_STATUS_AR	 (1ULL<<55)  /* Action required */
-#define MCACOD		  0xffff     /* MCA Error Code */
+
+/*
+ * Note that the full MCACOD field of IA32_MCi_STATUS MSR is
+ * bits 15:0.  But bit 12 is the 'F' bit, defined for corrected
+ * errors to indicate that errors are being filtered by hardware.
+ * We should mask out bit 12 when looking for specific signatures
+ * of uncorrected errors - so the F bit is deliberately skipped
+ * in this #define.
+ */
+#define MCACOD		  0xefff     /* MCA Error Code */
 
 /* Architecturally defined codes from SDM Vol. 3B Chapter 15 */
 #define MCACOD_SCRUB	0x00C0	/* 0xC0-0xCF Memory Scrubbing */
-#define MCACOD_SCRUBMSK	0xfff0
+#define MCACOD_SCRUBMSK	0xeff0	/* Skip bit 12 ('F' bit) */
 #define MCACOD_L3WB	0x017A	/* L3 Explicit Writeback */
 #define MCACOD_DATA	0x0134	/* Data Load */
 #define MCACOD_INSTR	0x0150	/* Instruction Fetch */
-- 
1.8.1.4
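
The effect of dropping bit 12 from the masks can be illustrated with a
small standalone sketch. The constants are copied from the diff above;
the helper functions is_data_load(), is_data_load_unmasked() and
is_scrub() are hypothetical names used only for illustration and are
not taken from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

#define MCACOD_FULL     0xffffULL  /* old mask: all of bits 15:0 */
#define MCACOD          0xefffULL  /* new mask: bit 12 ('F') skipped */
#define MCACOD_SCRUB    0x00C0ULL  /* 0xC0-0xCF Memory Scrubbing */
#define MCACOD_SCRUBMSK 0xeff0ULL  /* skip bit 12 ('F' bit) */
#define MCACOD_DATA     0x0134ULL  /* Data Load */

/* New behaviour: the 'F' bit does not affect the comparison. */
static bool is_data_load(uint64_t status)
{
	return (status & MCACOD) == MCACOD_DATA;
}

/* Old behaviour, for contrast: a set 'F' bit makes the compare fail. */
static bool is_data_load_unmasked(uint64_t status)
{
	return (status & MCACOD_FULL) == MCACOD_DATA;
}

/* Scrub codes occupy 0xC0-0xCF, so the low nibble is masked out as well. */
static bool is_scrub(uint64_t status)
{
	return (status & MCACOD_SCRUBMSK) == MCACOD_SCRUB;
}
```

With these helpers, a status whose low 16 bits are 0x1134 (data load
with bit 12 set) is recognised by is_data_load() but not by
is_data_load_unmasked(), which is the mismatch this patch is meant to
avoid.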



* [PATCH 0/2] machine check decode fixes
@ 2013-07-24 20:58 Tony Luck
  2013-07-24 17:09 ` [PATCH 1/2] x86/mce: Fix mce regression from recent cleanup Tony Luck
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Tony Luck @ 2013-07-24 20:58 UTC
  To: linux-kernel; +Cc: Naveen N. Rao, Borislav Petkov, Chen Gong

V2 of this:
* Broken into two patches - by suggestion of Chen Gong
* Just change MCACOD #define value - by suggestion of Naveen

Tony Luck (2):
  x86/mce: Fix mce regression from recent cleanup
  x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC'
    errors

 arch/x86/include/asm/mce.h                | 13 +++++++++++--
 arch/x86/kernel/cpu/mcheck/mce-severity.c |  4 ++--
 2 files changed, 13 insertions(+), 4 deletions(-)

-- 
1.8.1.4



* Re: [PATCH 0/2] machine check decode fixes
  2013-07-24 20:58 [PATCH 0/2] machine check decode fixes Tony Luck
  2013-07-24 17:09 ` [PATCH 1/2] x86/mce: Fix mce regression from recent cleanup Tony Luck
  2013-07-24 20:54 ` [PATCH 2/2] x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC' errors Tony Luck
@ 2013-07-25 10:40 ` Naveen N. Rao
  2 siblings, 0 replies; 4+ messages in thread
From: Naveen N. Rao @ 2013-07-25 10:40 UTC
  To: Tony Luck; +Cc: linux-kernel, Borislav Petkov, Chen Gong

On 07/25/2013 02:28 AM, Tony Luck wrote:
> V2 of this:
> * Broken into two patches - by suggestion of Chen Gong
> * Just change MCACOD #define value - by suggestion of Naveen
>
> Tony Luck (2):
>    x86/mce: Fix mce regression from recent cleanup
>    x86/mce: Pay no attention to 'F' bit in MCACOD when parsing 'UC'
>      errors
>
>   arch/x86/include/asm/mce.h                | 13 +++++++++++--
>   arch/x86/kernel/cpu/mcheck/mce-severity.c |  4 ++--
>   2 files changed, 13 insertions(+), 4 deletions(-)
>

Series Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>


- Naveen


