From: Borislav Petkov <bp@alien8.de>
To: ruiv.wang@gmail.com
Cc: linux-kernel@vger.kernel.org, tony.luck@intel.com,
gong.chen@linux.intel.com, rui.y.wang@intel.com
Subject: Re: [PATCH v3] x86/mce: Try printing all machine check banks known before panic
Date: Wed, 19 Nov 2014 11:29:54 +0100
Message-ID: <20141119102954.GA5617@pd.tnic>
In-Reply-To: <1416388961-24159-1-git-send-email-ruiv.wang@gmail.com>
On Wed, Nov 19, 2014 at 05:22:41PM +0800, ruiv.wang@gmail.com wrote:
> From: Rui Wang <rui.y.wang@intel.com>
>
> There are cases when a machine check panics without giving any information
> about the error:
>
> [ 177.806166] Kernel panic - not syncing: Machine check from unknown source
>
> No information besides that it is a machine check. This happens in two cases:
> 1) The CPU logs the error with the MCi_STATUS.EN bit set to zero, and Linux
> ignores EN=0 entries (as it should).
Well, I guess we shouldn't anymore. Apparently hw forgets to set the
bit when raising an MCE, so we should ignore the bit in mce-severity
too: either delete the piece below or grade it as higher severity based
on, I dunno, b0rked hardware family/model/stepping or whatever bit we set...
	MCESEV(
		NO, "Not enabled",
		BITCLR(MCI_STATUS_EN)
		),
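To illustrate why that rule hides the error, here is a rough standalone sketch of a first-match severity table. It is not the kernel's code: struct sketch_rule, sketch_grade(), the two tables and the SEV_* names are made up for illustration, and the rules other than "Not enabled" are simplified stand-ins. Only the MCI_STATUS_* bit positions are taken from the architectural layout of IA32_MCi_STATUS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural bit positions in IA32_MCi_STATUS. */
#define MCI_STATUS_VAL (1ULL << 63)
#define MCI_STATUS_EN  (1ULL << 60)
#define MCI_STATUS_PCC (1ULL << 57)

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical severity grades, lowest to highest. */
enum sketch_sev { SEV_NO, SEV_KEEP, SEV_PANIC };

/* Hypothetical rule: matches when (status & mask) == result. */
struct sketch_rule {
	uint64_t mask;
	uint64_t result;
	enum sketch_sev sev;
	const char *msg;
};

/* Simplified table that still contains the EN=0 rule. */
static const struct sketch_rule rules_today[] = {
	{ MCI_STATUS_VAL, 0,              SEV_NO,    "Invalid" },
	{ MCI_STATUS_EN,  0,              SEV_NO,    "Not enabled" },
	{ MCI_STATUS_PCC, MCI_STATUS_PCC, SEV_PANIC, "Processor context corrupt" },
	{ 0,              0,              SEV_KEEP,  "Something else" },
};

/* Same table with the EN=0 rule deleted. */
static const struct sketch_rule rules_without_en[] = {
	{ MCI_STATUS_VAL, 0,              SEV_NO,    "Invalid" },
	{ MCI_STATUS_PCC, MCI_STATUS_PCC, SEV_PANIC, "Processor context corrupt" },
	{ 0,              0,              SEV_KEEP,  "Something else" },
};

/* Return the severity of the first rule that matches status. */
static enum sketch_sev sketch_grade(const struct sketch_rule *rules, size_t n,
				    uint64_t status, const char **msg)
{
	for (size_t i = 0; i < n; i++) {
		if ((status & rules[i].mask) == rules[i].result) {
			if (msg)
				*msg = rules[i].msg;
			return rules[i].sev;
		}
	}
	/* Both tables end with a catch-all, so this is not reached. */
	assert(!"rule table must end with a catch-all rule");
	return SEV_NO;
}
```

With a status of MCI_STATUS_VAL | MCI_STATUS_PCC and EN clear, the first table stops at "Not enabled" and grades the bank as SEV_NO, while the second table reaches the PCC rule and grades it as SEV_PANIC.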
> 2) In normal processing the MCE handler ignores banks that do not contain fatal
> or unrecoverable errors (these would later be found and logged by the CMCI
> handler). If we panic, these will never be logged, but could be important
> to diagnose the problem.
Well, we do this:
	/*
	 * Non uncorrected or non signaled errors are handled by
	 * machine_check_poll. Leave them alone, unless this panics.
	 */
	if (!(m.status & (cfg->ser ? MCI_STATUS_S : MCI_STATUS_UC)) &&
		!no_way_out)
		continue;
so no_way_out gets indirectly controlled by mce-severity too. So I guess
mce-severity would need adjusting instead of adding more stuff to the #MC
handler.
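As a rough standalone model of that filter (not the handler itself), the sketch below counts how many banks survive the check for a given no_way_out value. The function sketch_banks_examined() and its parameters are hypothetical names; the VAL check in front is an assumption about the surrounding loop, and only the MCI_STATUS_* bit positions are architectural.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural bit positions in IA32_MCi_STATUS. */
#define MCI_STATUS_VAL (1ULL << 63)
#define MCI_STATUS_UC  (1ULL << 61)
#define MCI_STATUS_S   (1ULL << 56)

/*
 * Hypothetical helper: count the banks that would not be skipped.
 * ser selects which bit marks a "signaled" error, as in the snippet
 * above; no_way_out disables the skip entirely.
 */
static size_t sketch_banks_examined(const uint64_t *status, size_t nbanks,
				    bool ser, bool no_way_out)
{
	size_t examined = 0;

	assert(status != NULL || nbanks == 0);

	for (size_t i = 0; i < nbanks; i++) {
		/* Assumed: banks without a valid entry are never looked at. */
		if (!(status[i] & MCI_STATUS_VAL))
			continue;

		/* Same condition as the quoted check. */
		if (!(status[i] & (ser ? MCI_STATUS_S : MCI_STATUS_UC)) &&
		    !no_way_out)
			continue;

		examined++;
	}
	return examined;
}
```

For three banks holding { VAL, VAL | UC, 0 } with ser off, this gives 1 when no_way_out is 0 and 2 when it is 1, so whatever drives no_way_out (the severity grading) also decides which banks get looked at.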
Btw, the panic message comes from
	/*
	 * No machine check event found. Must be some external
	 * source or one CPU is hung. Panic.
	 */
	if (global_worst <= MCE_KEEP_SEVERITY && mca_cfg.tolerant < 3)
		mce_panic("Machine check from unknown source", NULL, NULL);
so fixing mce_severity is what should happen here instead, IMO.
Thanks.
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.