From: Borislav Petkov <bp@alien8.de>
To: "Ghannam, Yazen" <Yazen.Ghannam@amd.com>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"tony.luck@intel.com" <tony.luck@intel.com>,
"x86@kernel.org" <x86@kernel.org>
Subject: Re: [PATCH RESEND 2/5] x86/MCE: Handle MCA controls in a per_cpu way
Date: Wed, 10 Apr 2019 19:25:39 +0200 [thread overview]
Message-ID: <20190410172539.GF26580@zn.tnic> (raw)
In-Reply-To: <SN6PR12MB2639F37C634BC7F6D4B86194F82E0@SN6PR12MB2639.namprd12.prod.outlook.com>
On Wed, Apr 10, 2019 at 04:58:12PM +0000, Ghannam, Yazen wrote:
> Yes, unused banks in the middle are counted in the MCG_CAP[Count] value.
Good.
> Okay, so you're saying the sysfs access should fail if a bank is
> disabled. Is that correct?
Well, think about it. If a bank is not operational for whatever reason,
we should tell the user that.
> Does "disabled" mean one or both of these?
> Unused = RAZ/WI in hardware
> Uninitialized = Not initialized by kernel due to quirks, etc.
>
> For an unused bank, it doesn't hurt to write MCA_CTL, but really
> there's no reason to do so and go through mce_restart().
Yes, but that bank is non-operational in some form. So we should prevent
all writes to it because, well, it is not going to do anything. And this
would be a good way to give feedback to the user that that is the case.
> For an uninitialized bank, should we prevent users from overriding the
> kernel's settings?
That all depends on the quirks. Whether we should allow them to be
overridden or not. I don't think we've ever thought about it, though.
Let's look at one:
if (c->x86_vendor == X86_VENDOR_AMD) {
if (c->x86 == 15 && cfg->banks > 4) {
/*
* disable GART TBL walk error reporting, which
* trips off incorrectly with the IOMMU & 3ware
* & Cerberus:
*/
clear_bit(10, (unsigned long *)&mce_banks[4].ctl);
Yah, so if the user reenables those GART errors, then she/he will see a
lot of MCEs reported and will maybe complain about it. And then we'll
say, but why did you enable them then. And she/he'll say: uh, didn't
know. Or, I was just poking at sysfs and this happened.
Then we can say, well, don't do that then! :-)
So my current position is, meh, who cares. But then I'm looking at
another quirk:
if (c->x86_vendor == X86_VENDOR_INTEL) {
/*
* SDM documents that on family 6 bank 0 should not be written
* because it aliases to another special BIOS controlled
* register.
* But it's not aliased anymore on model 0x1a+
* Don't ignore bank 0 completely because there could be a
* valid event later, merely don't write CTL0.
*/
if (c->x86 == 6 && c->x86_model < 0x1A && cfg->banks > 0)
mce_banks[0].init = 0;
which basically prevents that bank from being reinitialized. So I guess
we have that functionality already - we simply need to pay attention to
w->init.
Right?
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
next prev parent reply other threads:[~2019-04-10 17:25 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-08 14:12 [PATCH RESEND 0/5] Handle MCA banks in a per_cpu way Ghannam, Yazen
2019-04-08 14:12 ` [PATCH RESEND 1/5] x86/MCE: Make struct mce_banks[] static Ghannam, Yazen
2019-04-08 14:12 ` [PATCH RESEND 2/5] x86/MCE: Handle MCA controls in a per_cpu way Ghannam, Yazen
2019-04-08 17:51 ` Borislav Petkov
2019-04-08 18:55 ` Ghannam, Yazen
2019-04-09 20:34 ` Borislav Petkov
2019-04-10 16:36 ` Ghannam, Yazen
2019-04-10 16:40 ` Borislav Petkov
2019-04-10 16:58 ` Ghannam, Yazen
2019-04-10 17:25 ` Borislav Petkov [this message]
2019-04-10 19:41 ` Ghannam, Yazen
2019-04-10 20:04 ` Borislav Petkov
2019-04-08 14:12 ` [PATCH RESEND 3/5] x86/MCE/AMD: Don't cache block addresses on SMCA systems Ghannam, Yazen
2019-04-08 14:12 ` [PATCH RESEND 4/5] x86/MCE: Make number of MCA banks per_cpu Ghannam, Yazen
2019-04-08 20:26 ` Luck, Tony
2019-04-08 20:34 ` Borislav Petkov
2019-04-08 20:42 ` Luck, Tony
2019-04-08 20:50 ` Borislav Petkov
2019-04-08 22:48 ` Ghannam, Yazen
2019-04-08 23:23 ` Luck, Tony
2019-04-08 23:37 ` Ghannam, Yazen
2019-04-09 11:38 ` Borislav Petkov
2019-04-08 14:12 ` [PATCH RESEND 5/5] x86/MCE: Save MCA control bits that get set in hardware Ghannam, Yazen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190410172539.GF26580@zn.tnic \
--to=bp@alien8.de \
--cc=Yazen.Ghannam@amd.com \
--cc=linux-edac@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=tony.luck@intel.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).