linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Ghannam, Yazen" <Yazen.Ghannam@amd.com>
To: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: "Ghannam, Yazen" <Yazen.Ghannam@amd.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"bp@suse.de" <bp@suse.de>,
	"tony.luck@intel.com" <tony.luck@intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"rafal@milecki.pl" <rafal@milecki.pl>,
	"clemej@gmail.com" <clemej@gmail.com>
Subject: [PATCH v2 2/2] x86/MCE/AMD: Don't report L1 BTB MCA errors on some Family 17h models
Date: Thu, 21 Mar 2019 20:25:18 +0000	[thread overview]
Message-ID: <20190321202505.5553-2-Yazen.Ghannam@amd.com> (raw)
In-Reply-To: <20190321202505.5553-1-Yazen.Ghannam@amd.com>

From: Yazen Ghannam <yazen.ghannam@amd.com>

AMD Family 17h Models 10h-2Fh may report a high number of L1 BTB MCA
errors under certain conditions. The errors are benign and can safely be
ignored. However, the high error rate may cause the MCA threshold
counter to overflow causing a high rate of thresholding interrupts. In
addition, users may see the errors reported through the AMD MCE decoder
module, even with the interrupt disabled, due to MCA polling.

This error is reported through the Instruction Fetch bank.

Clear the "Counter Present" bit in the Instruction Fetch bank's
MCA_MISC0 register. This will prevent enabling MCA thresholding on this
bank which will prevent the high interrupt rate due to this error.

Define a function to filter these errors from the MCE event pool.
Install this function during AMD vendor init. The MCA banks are enabled
after vendor init, so the filter function will be installed before the
spurious errors will be reported.

Cc: <stable@vger.kernel.org> # 4.14.x: c95b323dcd35: x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models
Cc: <stable@vger.kernel.org> # 4.14.x: 30aa3d26edb0: x86/MCE/AMD: Carve out the MC4_MISC thresholding quirk
Cc: <stable@vger.kernel.org> # 4.14.x
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
Link:
https://lkml.kernel.org/r/20190307212552.8865-2-Yazen.Ghannam@amd.com

v1->v2:
* Filter out the error earlier in MCE code rather than later in EDAC.

 arch/x86/kernel/cpu/mce/amd.c | 57 ++++++++++++++++++++++++++++-------
 1 file changed, 46 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index e64de5149e50..2db85f65b41e 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -563,22 +563,55 @@ prepare_threshold_block(unsigned int bank, unsigned int block, u32 addr,
 	return offset;
 }
 
+bool filter_mce_rv(struct mce *m)
+{
+	enum smca_bank_types bank_type = smca_get_bank_type(m->bank);
+	u8 xec = (m->status >> 16) & 0x3F;
+
+	/*
+	 * Spurious errors of this type may be reported.
+	 * See Family 17h Models 10h-2Fh Erratum #1114.
+	 */
+	if (bank_type == SMCA_IF && xec == 10)
+		return true;
+
+	return false;
+}
+
+static void filter_mce_check(struct cpuinfo_x86 *c)
+{
+	if (c->x86 == 0x17 && (c->x86_model >= 0x10 && c->x86_model <= 0x2F))
+		filter_mce = filter_mce_rv;
+}
+
 /*
- * Turn off MC4_MISC thresholding banks on all family 0x15 models since
- * they're not supported there.
+ * Turn off thresholding banks for the following conditions:
+ * - MC4_MISC thresholding is not support on Family 0x15.
+ * - Prevent possible spurious interrupts from the IF bank on Family 0x17
+ *   Models 0x10-0x2F due to Erratum #1114.
  */
-void disable_err_thresholding(struct cpuinfo_x86 *c)
+void disable_err_thresholding(struct cpuinfo_x86 *c, unsigned int bank)
 {
-	int i;
+	int i, num_msrs;
 	u64 hwcr;
 	bool need_toggle;
-	u32 msrs[] = {
-		0x00000413, /* MC4_MISC0 */
-		0xc0000408, /* MC4_MISC1 */
-	};
+	u32 msrs[NR_BLOCKS];
 
-	if (c->x86 != 0x15)
+	if (c->x86 == 0x15 && bank == 4) {
+		msrs[0] = 0x00000413; /* MC4_MISC0 */
+		msrs[1] = 0xc0000408; /* MC4_MISC1 */
+		num_msrs = 2;
+	} else if (c->x86 == 0x17 &&
+		   (c->x86_model >= 0x10 && c->x86_model <= 0x2F)) {
+
+		if (smca_get_bank_type(bank) != SMCA_IF)
+			return;
+
+		msrs[0] = MSR_AMD64_SMCA_MCx_MISC(bank);
+		num_msrs = 1;
+	} else {
 		return;
+	}
 
 	rdmsrl(MSR_K7_HWCR, hwcr);
 
@@ -589,7 +622,7 @@ void disable_err_thresholding(struct cpuinfo_x86 *c)
 		wrmsrl(MSR_K7_HWCR, hwcr | BIT(18));
 
 	/* Clear CntP bit safely */
-	for (i = 0; i < ARRAY_SIZE(msrs); i++)
+	for (i = 0; i < num_msrs; i++)
 		msr_clear_bit(msrs[i], 62);
 
 	/* restore old settings */
@@ -604,12 +637,14 @@ void mce_amd_feature_init(struct cpuinfo_x86 *c)
 	unsigned int bank, block, cpu = smp_processor_id();
 	int offset = -1;
 
-	disable_err_thresholding(c);
+	filter_mce_check(c);
 
 	for (bank = 0; bank < mca_cfg.banks; ++bank) {
 		if (mce_flags.smca)
 			smca_configure(bank, cpu);
 
+		disable_err_thresholding(c, bank);
+
 		for (block = 0; block < NR_BLOCKS; ++block) {
 			address = get_block_address(address, low, high, bank, block);
 			if (!address)
-- 
2.17.1


  reply	other threads:[~2019-03-21 20:25 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-21 20:25 [PATCH v2 1/2] x86/MCE: Add function to allow filtering of MCA errors Ghannam, Yazen
2019-03-21 20:25 ` Ghannam, Yazen [this message]
2019-03-22 17:34   ` [PATCH v2 2/2] x86/MCE/AMD: Don't report L1 BTB MCA errors on some Family 17h models Borislav Petkov
2019-03-22 19:24     ` Ghannam, Yazen
2019-03-22 19:32       ` Borislav Petkov
2019-03-22 19:33         ` Ghannam, Yazen
2019-03-22 17:24 ` [PATCH v2 1/2] x86/MCE: Add function to allow filtering of MCA errors Borislav Petkov
2019-03-22 19:05   ` Ghannam, Yazen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190321202505.5553-2-Yazen.Ghannam@amd.com \
    --to=yazen.ghannam@amd.com \
    --cc=bp@suse.de \
    --cc=clemej@gmail.com \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rafal@milecki.pl \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).