* mce: Add errata workaround for Skylake SKX37
@ 2021-10-29 20:57 Dave Jones
  2021-11-02 19:55 ` Luck, Tony
  2021-11-12 21:03 ` [tip: x86/urgent] x86/mce: " tip-bot2 for Dave Jones
  0 siblings, 2 replies; 3+ messages in thread
From: Dave Jones @ 2021-10-29 20:57 UTC (permalink / raw)
  To: Tony Luck; +Cc: Linux Kernel

Errata SKX37 is word-for-word identical to the other errata listed in
this workaround. I happened to notice this after investigating a CMCI
storm on a Skylake host. While I can't confirm this was the root cause,
spurious corrected errors do sound like a likely suspect.

Signed-off-by: Dave Jones <davej@codemonkey.org.uk>

diff --git arch/x86/kernel/cpu/mce/intel.c arch/x86/kernel/cpu/mce/intel.c
index acfd5d9f93c6..bb9a46a804bf 100644
--- arch/x86/kernel/cpu/mce/intel.c
+++ arch/x86/kernel/cpu/mce/intel.c
@@ -547,12 +547,13 @@ bool intel_filter_mce(struct mce *m)
 {
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	/* MCE errata HSD131, HSM142, HSW131, BDM48, and HSM142 */
+	/* MCE errata HSD131, HSM142, HSW131, BDM48, HSM142 and SKX37 */
 	if ((c->x86 == 6) &&
 	    ((c->x86_model == INTEL_FAM6_HASWELL) ||
 	     (c->x86_model == INTEL_FAM6_HASWELL_L) ||
 	     (c->x86_model == INTEL_FAM6_BROADWELL) ||
-	     (c->x86_model == INTEL_FAM6_HASWELL_G)) &&
+	     (c->x86_model == INTEL_FAM6_HASWELL_G) ||
+	     (c->x86_model == INTEL_FAM6_SKYLAKE_X)) &&
 	    (m->bank == 0) &&
 	    ((m->status & 0xa0000000ffffffff) == 0x80000000000f0005))
 		return true;
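
For readers outside the kernel tree, the condition being extended by this patch can be sketched as a standalone predicate. This is only an illustration, not the kernel's code: `struct cpu_id`, `struct mce_record`, `model_is_affected()` and `is_spurious_bank0_error()` are hypothetical names, and the model numbers are what I believe the `INTEL_FAM6_*` macros expand to, so treat them as assumptions and check `intel-family.h` before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's struct cpuinfo_x86 and struct mce. */
struct cpu_id {
	uint8_t family;
	uint8_t model;
};

struct mce_record {
	uint8_t bank;
	uint64_t status;
};

/* Assumed values of the INTEL_FAM6_* model macros used in the patch. */
enum {
	MODEL_HASWELL   = 0x3C,
	MODEL_HASWELL_L = 0x45,
	MODEL_HASWELL_G = 0x46,
	MODEL_BROADWELL = 0x3D,
	MODEL_SKYLAKE_X = 0x55,	/* the model added by this patch */
};

/* Mask and expected value copied from the patch. */
#define SPURIOUS_STATUS_MASK  0xa0000000ffffffffULL
#define SPURIOUS_STATUS_VALUE 0x80000000000f0005ULL

/* True for the CPU models covered by the errata list in the comment. */
static bool model_is_affected(uint8_t model)
{
	switch (model) {
	case MODEL_HASWELL:
	case MODEL_HASWELL_L:
	case MODEL_HASWELL_G:
	case MODEL_BROADWELL:
	case MODEL_SKYLAKE_X:
		return true;
	default:
		return false;
	}
}

/* Same shape as the patched condition: family, model, bank, status signature. */
static bool is_spurious_bank0_error(const struct cpu_id *c,
				    const struct mce_record *m)
{
	return c->family == 6 &&
	       model_is_affected(c->model) &&
	       m->bank == 0 &&
	       (m->status & SPURIOUS_STATUS_MASK) == SPURIOUS_STATUS_VALUE;
}
```

Writing the model list as a `switch` is just a readability choice for the sketch; the patch itself keeps the existing chain of `||` comparisons so the change stays a minimal, backportable diff.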


* Re: mce: Add errata workaround for Skylake SKX37
  2021-10-29 20:57 mce: Add errata workaround for Skylake SKX37 Dave Jones
@ 2021-11-02 19:55 ` Luck, Tony
  2021-11-12 21:03 ` [tip: x86/urgent] x86/mce: " tip-bot2 for Dave Jones
  1 sibling, 0 replies; 3+ messages in thread
From: Luck, Tony @ 2021-11-02 19:55 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel; +Cc: Borislav Petkov

On Fri, Oct 29, 2021 at 04:57:59PM -0400, Dave Jones wrote:
> Errata SKX37 is word-for-word identical to the other errata listed in
> this workaround. I happened to notice this after investigating a CMCI
> storm on a Skylake host. While I can't confirm this was the root cause,
> spurious corrected errors do sound like a likely suspect.
> 
> Signed-off-by: Dave Jones <davej@codemonkey.org.uk>

Needs:

Fixes: 2976908e4198 ("x86/mce: Do not log spurious corrected mce errors")
Cc: <stable@vger.kernel.org>

otherwise:

Reviewed-by: Tony Luck <tony.luck@intel.com>

> 
> diff --git arch/x86/kernel/cpu/mce/intel.c arch/x86/kernel/cpu/mce/intel.c
> index acfd5d9f93c6..bb9a46a804bf 100644
> --- arch/x86/kernel/cpu/mce/intel.c
> +++ arch/x86/kernel/cpu/mce/intel.c
> @@ -547,12 +547,13 @@ bool intel_filter_mce(struct mce *m)
>  {
>  	struct cpuinfo_x86 *c = &boot_cpu_data;
>  
> -	/* MCE errata HSD131, HSM142, HSW131, BDM48, and HSM142 */
> +	/* MCE errata HSD131, HSM142, HSW131, BDM48, HSM142 and SKX37 */
>  	if ((c->x86 == 6) &&
>  	    ((c->x86_model == INTEL_FAM6_HASWELL) ||
>  	     (c->x86_model == INTEL_FAM6_HASWELL_L) ||
>  	     (c->x86_model == INTEL_FAM6_BROADWELL) ||
> -	     (c->x86_model == INTEL_FAM6_HASWELL_G)) &&
> +	     (c->x86_model == INTEL_FAM6_HASWELL_G) ||
> +	     (c->x86_model == INTEL_FAM6_SKYLAKE_X)) &&
>  	    (m->bank == 0) &&
>  	    ((m->status & 0xa0000000ffffffff) == 0x80000000000f0005))
>  		return true;


* [tip: x86/urgent] x86/mce: Add errata workaround for Skylake SKX37
  2021-10-29 20:57 mce: Add errata workaround for Skylake SKX37 Dave Jones
  2021-11-02 19:55 ` Luck, Tony
@ 2021-11-12 21:03 ` tip-bot2 for Dave Jones
  1 sibling, 0 replies; 3+ messages in thread
From: tip-bot2 for Dave Jones @ 2021-11-12 21:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dave Jones, Dave Hansen, Tony Luck, stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     e629fc1407a63dbb748f828f9814463ffc2a0af0
Gitweb:        https://git.kernel.org/tip/e629fc1407a63dbb748f828f9814463ffc2a0af0
Author:        Dave Jones <davej@codemonkey.org.uk>
AuthorDate:    Fri, 29 Oct 2021 16:57:59 -04:00
Committer:     Dave Hansen <dave.hansen@linux.intel.com>
CommitterDate: Fri, 12 Nov 2021 11:43:35 -08:00

x86/mce: Add errata workaround for Skylake SKX37

Errata SKX37 is word-for-word identical to the other errata listed in
this workaround. I happened to notice this after investigating a CMCI
storm on a Skylake host. While I can't confirm this was the root cause,
spurious corrected errors do sound like a likely suspect.

Fixes: 2976908e4198 ("x86/mce: Do not log spurious corrected mce errors")
Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20211029205759.GA7385@codemonkey.org.uk
---
 arch/x86/kernel/cpu/mce/intel.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/intel.c b/arch/x86/kernel/cpu/mce/intel.c
index acfd5d9..bb9a46a 100644
--- a/arch/x86/kernel/cpu/mce/intel.c
+++ b/arch/x86/kernel/cpu/mce/intel.c
@@ -547,12 +547,13 @@ bool intel_filter_mce(struct mce *m)
 {
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	/* MCE errata HSD131, HSM142, HSW131, BDM48, and HSM142 */
+	/* MCE errata HSD131, HSM142, HSW131, BDM48, HSM142 and SKX37 */
 	if ((c->x86 == 6) &&
 	    ((c->x86_model == INTEL_FAM6_HASWELL) ||
 	     (c->x86_model == INTEL_FAM6_HASWELL_L) ||
 	     (c->x86_model == INTEL_FAM6_BROADWELL) ||
-	     (c->x86_model == INTEL_FAM6_HASWELL_G)) &&
+	     (c->x86_model == INTEL_FAM6_HASWELL_G) ||
+	     (c->x86_model == INTEL_FAM6_SKYLAKE_X)) &&
 	    (m->bank == 0) &&
 	    ((m->status & 0xa0000000ffffffff) == 0x80000000000f0005))
 		return true;
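
The status comparison in the hunk above is easier to read once the mask is split into fields. A rough sketch follows, assuming the usual IA32_MCi_STATUS layout (VAL in bit 63, UC in bit 61, model-specific error code in bits 31:16, MCA error code in bits 15:0); `struct status_fields`, `decode_status()` and `matches_erratum_signature()` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Only the fields that the 0xa0000000ffffffff mask actually selects. */
struct status_fields {
	bool val;		/* bit 63: register contents are valid */
	bool uc;		/* bit 61: error was uncorrected */
	uint16_t mscod;		/* bits 31:16: model-specific error code */
	uint16_t mcacod;	/* bits 15:0: MCA error code */
};

static struct status_fields decode_status(uint64_t status)
{
	struct status_fields f = {
		.val    = (status >> 63) & 1,
		.uc     = (status >> 61) & 1,
		.mscod  = (uint16_t)(status >> 16),
		.mcacod = (uint16_t)status,
	};
	return f;
}

/*
 * Field-by-field form of:
 *   (status & 0xa0000000ffffffff) == 0x80000000000f0005
 * i.e. a valid, corrected error with MSCOD 0x000f and MCACOD 0x0005.
 * All other status bits (OVER, EN, MISCV, ADDRV, ...) are ignored.
 */
static bool matches_erratum_signature(uint64_t status)
{
	struct status_fields f = decode_status(status);

	return f.val && !f.uc && f.mscod == 0x000f && f.mcacod == 0x0005;
}
```

Because the mask includes the UC bit and requires it to be clear, an uncorrected error reporting the same codes would not be filtered by this check.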
