linux-edac.vger.kernel.org archive mirror
From: "Luck, Tony" <tony.luck@intel.com>
To: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Cc: "Borislav Petkov (bp@alien8.de)" <bp@alien8.de>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"yazen.ghannam@amd.com" <yazen.ghannam@amd.com>,
	"vishal.l.verma@intel.com" <vishal.l.verma@intel.com>,
	"qiuxu.zhuo@intel.com" <qiuxu.zhuo@intel.com>,
	David Wang <DavidWang@zhaoxin.com>,
	"Cooper Yan(BJ-RD)" <CooperYan@zhaoxin.com>,
	"Qiyuan Wang(BJ-RD)" <QiyuanWang@zhaoxin.com>,
	"Herry Yang(BJ-RD)" <HerryYang@zhaoxin.com>
Subject: Re: [PATCH v3 4/4] x86/mce: Add Zhaoxin LMCE support
Date: Tue, 17 Sep 2019 09:37:05 -0700
Message-ID: <20190917163704.GA1922@agluck-desk2.amr.corp.intel.com>
In-Reply-To: <1da27840413348febf301ef39305de12@zhaoxin.com>

On Tue, Sep 17, 2019 at 06:54:05AM +0000, Tony W Wang-oc wrote:
> But I have a question about the code below:
> 	if (mcgstatus & MCG_STATUS_RIPV) {
> 		mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
> 		return true;
> 	}
> This seems to require every #MC exception to set MCG_STATUS_RIPV = 1
> in order to skip synchronization, which is what "return true;" does here.
> 
> As the Intel SDM shows, "Recoverable-not-continuable SRAR Type" errors may
> set MCG_STATUS_RIPV = 0, PCC = 0. When such #MC errors are broadcast
> to an offline CPU, they may cause a kernel panic with a synchronization
> timeout (the offline CPU can't skip synchronization in this case).
> 
> Could "return true;" be moved outside the if block?
> 	if (mcgstatus & MCG_STATUS_RIPV) {
> 		mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
> 	}
> 	return true;

If the RIPV bit is not set in mcgstatus, then where will the CPU return
to if you simply return from the #MC handler? RIPV=1 means that the
CPU pushed a valid return instruction pointer onto the stack.

Take the not-continuable case you mention above: the CPU will
likely do something undefined if you try to continue a
not-continuable instruction.

-Tony

