From: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
To: "tipbot@zytor.com" <tipbot@zytor.com>,
"ashok.raj@intel.com" <ashok.raj@intel.com>
Cc: "bp@suse.de" <bp@suse.de>, "hpa@zytor.com" <hpa@zytor.com>,
"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-tip-commits@vger.kernel.org"
<linux-tip-commits@vger.kernel.org>,
"mingo@kernel.org" <mingo@kernel.org>,
"peterz@infradead.org" <peterz@infradead.org>,
"stable@vger.kernel.org" <stable@vger.kernel.org>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"tony.luck@intel.com" <tony.luck@intel.com>,
"torvalds@linux-foundation.org" <torvalds@linux-foundation.org>,
David Wang <DavidWang@zhaoxin.com>
Subject: 答复: Re: [tip:x86/urgent] x86/mce: Ensure offline CPUs don' t participate in rendezvous process
Date: Thu, 30 May 2019 09:13:39 +0000 [thread overview]
Message-ID: <985acf114ab245fbab52caabf03bd280@zhaoxin.com> (raw)
On Thu, May 30, 2019, Tony W Wang-oc wrote:
> Hi Ashok,
> I have two questions about this patch, could you help to check:
>
> 1, for broadcast #MC exceptions, this patch seems require #MC exception
> errors
> set MCG_STATUS_RIPV = 1.
> But for Intel CPU, some #MC exception errors set MCG_STATUS_RIPV = 0
> (like "Recoverable-not-continuable SRAR Type" Errors), for these errors
> the patch doesn't seem to work, is that okay?
>
> 2, for LMCE exceptions, this patch seems require #MC exception errors
> set MCG_STATUS_RIPV = 0 to make sure LMCE be handled normally even
> on offline CPU.
> For LMCE errors set MCG_STAUS_RIPV = 1, the patch prevents offline CPU
> handle these LMCE errors, is that okay?
>
More specifically, this patch seems require #MC exceptions meet the condition
"MCG_STATUS_RIPV ^ MCG_STATUS_LMCES == 1"; But on a Xeon X5650 machine (SMP),
"Data CACHE Level-2 Generic Error" does not meet this condition.
I got below message from: https://www.centos.org/forums/viewtopic.php?p=292742
Hardware event. This is not a software error.
MCE 0
CPU 4 BANK 6 TSC b7065eeaa18b0
TIME 1545643603 Mon Dec 24 10:26:43 2018
MCG status:MCIP
MCi status:
Uncorrected error
Error enabled
Processor context corrupt
MCA: Data CACHE Level-2 Generic Error
STATUS b200000080000106 MCGSTATUS 4
MCGCAP 1c09 APICID 4 SOCKETID 0
CPUID Vendor Intel Family 6 Model 44
> Thanks
> Tony W Wang-oc
next reply other threads:[~2019-05-30 10:13 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-05-30 9:13 Tony W Wang-oc [this message]
2019-05-30 17:10 ` 答复: Re: [tip:x86/urgent] x86/mce: Ensure offline CPUs don' t participate in rendezvous process Raj, Ashok
2019-05-31 3:17 ` 答复: " David Wang
2019-05-31 4:41 ` Tony W Wang-oc
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=985acf114ab245fbab52caabf03bd280@zhaoxin.com \
--to=tonywwang-oc@zhaoxin.com \
--cc=DavidWang@zhaoxin.com \
--cc=ashok.raj@intel.com \
--cc=bp@suse.de \
--cc=hpa@zytor.com \
--cc=linux-edac@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-tip-commits@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=stable@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=tipbot@zytor.com \
--cc=tony.luck@intel.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).