linux-kernel.vger.kernel.org archive mirror
* mce: Add errata workaround for Skylake SKX37
@ 2021-10-29 20:57 Dave Jones
  2021-11-02 19:55 ` Luck, Tony
  2021-11-12 21:03 ` [tip: x86/urgent] x86/mce: " tip-bot2 for Dave Jones
  0 siblings, 2 replies; 3+ messages in thread
From: Dave Jones @ 2021-10-29 20:57 UTC (permalink / raw)
  To: Tony Luck; +Cc: Linux Kernel

Erratum SKX37 is word-for-word identical to the other errata listed in
this workaround.  I happened to notice this after investigating a CMCI
storm on a Skylake host.  While I can't confirm this was the root cause,
spurious corrected errors do sound like a likely suspect.

Signed-off-by: Dave Jones <davej@codemonkey.org.uk>

diff --git arch/x86/kernel/cpu/mce/intel.c arch/x86/kernel/cpu/mce/intel.c
index acfd5d9f93c6..bb9a46a804bf 100644
--- arch/x86/kernel/cpu/mce/intel.c
+++ arch/x86/kernel/cpu/mce/intel.c
@@ -547,12 +547,13 @@ bool intel_filter_mce(struct mce *m)
 {
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	/* MCE errata HSD131, HSM142, HSW131, BDM48, and HSM142 */
+	/* MCE errata HSD131, HSM142, HSW131, BDM48, HSM142 and SKX37 */
 	if ((c->x86 == 6) &&
 	    ((c->x86_model == INTEL_FAM6_HASWELL) ||
 	     (c->x86_model == INTEL_FAM6_HASWELL_L) ||
 	     (c->x86_model == INTEL_FAM6_BROADWELL) ||
-	     (c->x86_model == INTEL_FAM6_HASWELL_G)) &&
+	     (c->x86_model == INTEL_FAM6_HASWELL_G) ||
+	     (c->x86_model == INTEL_FAM6_SKYLAKE_X)) &&
 	    (m->bank == 0) &&
 	    ((m->status & 0xa0000000ffffffff) == 0x80000000000f0005))
 		return true;


* Re: mce: Add errata workaround for Skylake SKX37
  2021-10-29 20:57 mce: Add errata workaround for Skylake SKX37 Dave Jones
@ 2021-11-02 19:55 ` Luck, Tony
  2021-11-12 21:03 ` [tip: x86/urgent] x86/mce: " tip-bot2 for Dave Jones
  1 sibling, 0 replies; 3+ messages in thread
From: Luck, Tony @ 2021-11-02 19:55 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel; +Cc: Borislav Petkov

On Fri, Oct 29, 2021 at 04:57:59PM -0400, Dave Jones wrote:
> Erratum SKX37 is word-for-word identical to the other errata listed in
> this workaround.  I happened to notice this after investigating a CMCI
> storm on a Skylake host.  While I can't confirm this was the root cause,
> spurious corrected errors do sound like a likely suspect.
> 
> Signed-off-by: Dave Jones <davej@codemonkey.org.uk>

Needs:

Fixes: 2976908e4198 ("x86/mce: Do not log spurious corrected mce errors")
Cc: <stable@vger.kernel.org>

otherwise:

Reviewed-by: Tony Luck <tony.luck@intel.com>

> 
> diff --git arch/x86/kernel/cpu/mce/intel.c arch/x86/kernel/cpu/mce/intel.c
> index acfd5d9f93c6..bb9a46a804bf 100644
> --- arch/x86/kernel/cpu/mce/intel.c
> +++ arch/x86/kernel/cpu/mce/intel.c
> @@ -547,12 +547,13 @@ bool intel_filter_mce(struct mce *m)
>  {
>  	struct cpuinfo_x86 *c = &boot_cpu_data;
>  
> -	/* MCE errata HSD131, HSM142, HSW131, BDM48, and HSM142 */
> +	/* MCE errata HSD131, HSM142, HSW131, BDM48, HSM142 and SKX37 */
>  	if ((c->x86 == 6) &&
>  	    ((c->x86_model == INTEL_FAM6_HASWELL) ||
>  	     (c->x86_model == INTEL_FAM6_HASWELL_L) ||
>  	     (c->x86_model == INTEL_FAM6_BROADWELL) ||
> -	     (c->x86_model == INTEL_FAM6_HASWELL_G)) &&
> +	     (c->x86_model == INTEL_FAM6_HASWELL_G) ||
> +	     (c->x86_model == INTEL_FAM6_SKYLAKE_X)) &&
>  	    (m->bank == 0) &&
>  	    ((m->status & 0xa0000000ffffffff) == 0x80000000000f0005))
>  		return true;


* [tip: x86/urgent] x86/mce: Add errata workaround for Skylake SKX37
  2021-10-29 20:57 mce: Add errata workaround for Skylake SKX37 Dave Jones
  2021-11-02 19:55 ` Luck, Tony
@ 2021-11-12 21:03 ` tip-bot2 for Dave Jones
  1 sibling, 0 replies; 3+ messages in thread
From: tip-bot2 for Dave Jones @ 2021-11-12 21:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dave Jones, Dave Hansen, Tony Luck, stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     e629fc1407a63dbb748f828f9814463ffc2a0af0
Gitweb:        https://git.kernel.org/tip/e629fc1407a63dbb748f828f9814463ffc2a0af0
Author:        Dave Jones <davej@codemonkey.org.uk>
AuthorDate:    Fri, 29 Oct 2021 16:57:59 -04:00
Committer:     Dave Hansen <dave.hansen@linux.intel.com>
CommitterDate: Fri, 12 Nov 2021 11:43:35 -08:00

x86/mce: Add errata workaround for Skylake SKX37

Erratum SKX37 is word-for-word identical to the other errata listed in
this workaround.  I happened to notice this after investigating a CMCI
storm on a Skylake host.  While I can't confirm this was the root cause,
spurious corrected errors do sound like a likely suspect.

Fixes: 2976908e4198 ("x86/mce: Do not log spurious corrected mce errors")
Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20211029205759.GA7385@codemonkey.org.uk
---
 arch/x86/kernel/cpu/mce/intel.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c
index acfd5d9..bb9a46a 100644
--- a/arch/x86/kernel/cpu/mce/intel.c
+++ b/arch/x86/kernel/cpu/mce/intel.c
@@ -547,12 +547,13 @@ bool intel_filter_mce(struct mce *m)
 {
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	/* MCE errata HSD131, HSM142, HSW131, BDM48, and HSM142 */
+	/* MCE errata HSD131, HSM142, HSW131, BDM48, HSM142 and SKX37 */
 	if ((c->x86 == 6) &&
 	    ((c->x86_model == INTEL_FAM6_HASWELL) ||
 	     (c->x86_model == INTEL_FAM6_HASWELL_L) ||
 	     (c->x86_model == INTEL_FAM6_BROADWELL) ||
-	     (c->x86_model == INTEL_FAM6_HASWELL_G)) &&
+	     (c->x86_model == INTEL_FAM6_HASWELL_G) ||
+	     (c->x86_model == INTEL_FAM6_SKYLAKE_X)) &&
 	    (m->bank == 0) &&
 	    ((m->status & 0xa0000000ffffffff) == 0x80000000000f0005))
 		return true;
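
For readers who want to see what the status check in intel_filter_mce()
actually matches, here is a rough userspace sketch of the same predicate.
It is not the kernel code: the struct and function names (sketch_cpu,
sketch_mce, sketch_filter_mce) are made up for illustration, and the model
numbers are assumed to match the INTEL_FAM6_* values in
arch/x86/include/asm/intel-family.h. The mask keeps bit 63 (VAL), bit 61
(UC) and the low 32 bits (MSCOD and MCACOD) of MCi_STATUS, so the compare
succeeds only for a valid, corrected error with MSCOD 0x000f and MCACOD
0x0005; the overflow bit and the corrected error count are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct cpuinfo_x86 and struct mce. */
struct sketch_cpu {
	uint8_t family;
	uint8_t model;
};

struct sketch_mce {
	uint8_t  bank;
	uint64_t status;	/* raw MCi_STATUS value */
};

/* Model numbers assumed to match the INTEL_FAM6_* definitions. */
#define SKETCH_HASWELL		0x3C
#define SKETCH_BROADWELL	0x3D
#define SKETCH_HASWELL_L	0x45
#define SKETCH_HASWELL_G	0x46
#define SKETCH_SKYLAKE_X	0x55

/* Keep VAL (bit 63), UC (bit 61), MSCOD (31:16) and MCACOD (15:0). */
#define SKETCH_STATUS_MASK	0xa0000000ffffffffULL
/* VAL=1, UC=0, MSCOD=0x000f, MCACOD=0x0005. */
#define SKETCH_STATUS_MATCH	0x80000000000f0005ULL

/* True for the models covered by the workaround after this patch. */
static bool sketch_model_affected(uint8_t model)
{
	switch (model) {
	case SKETCH_HASWELL:
	case SKETCH_HASWELL_L:
	case SKETCH_HASWELL_G:
	case SKETCH_BROADWELL:
	case SKETCH_SKYLAKE_X:
		return true;
	default:
		return false;
	}
}

/* True when the record looks like the spurious bank 0 corrected error. */
static bool sketch_filter_mce(const struct sketch_cpu *c,
			      const struct sketch_mce *m)
{
	return c->family == 6 &&
	       sketch_model_affected(c->model) &&
	       m->bank == 0 &&
	       (m->status & SKETCH_STATUS_MASK) == SKETCH_STATUS_MATCH;
}
```

With this sketch, a Skylake-X record in bank 0 with status
0x80000000000f0005 is filtered, and so is 0xc0000040000f0005 because the
overflow bit and the error count fall outside the mask. Setting UC
(0xa0000000000f0005) or reporting from any other bank makes the check fail,
so uncorrected errors are never dropped by it.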

