From: Borislav Petkov <bp@alien8.de>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Yazen Ghannam <yazen.ghannam@amd.com>,
Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>,
Carlos Bilbao <carlos.bilbao@amd.com>,
"x86@kernel.org" <x86@kernel.org>,
"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] x86/mce: Dump the stack for recoverable machine checks in kernel context
Date: Mon, 31 Oct 2022 19:36:08 +0100
Message-ID: <Y2AVmOdEtTl5e68l@zn.tnic>
In-Reply-To: <SJ1PR11MB6083593A0D18EEE0074E77ABFC379@SJ1PR11MB6083.namprd11.prod.outlook.com>
On Mon, Oct 31, 2022 at 05:13:10PM +0000, Luck, Tony wrote:
> > 1. If the error has raised a MCE, then we will dump stack anyway.
>
> I don't see stack dumps for machine check panics. I don't have any non-standard
> settings (I think). Nor do I see them in the panic messages that other folks send
> to me.
>
> Are you setting some CONFIG or command line option to get a stack dump?
Well, if one were sane, one would assume that one would expect to see a
stack dump when the machine panics, right? I mean, it is only fair...
And there's an attempt:
#ifdef CONFIG_DEBUG_BUGVERBOSE
	/*
	 * Avoid nested stack-dumping if a panic occurs during oops processing
	 */
	if (!test_taint(TAINT_DIE) && oops_in_progress <= 1)
		dump_stack();
#endif
but that oops_in_progress thing is stopping us:
[ 13.706764] mce: [Hardware Error]: CPU 2: Machine Check Exception: 6 Bank 4: fe000010000b0c0f
[ 13.706781] mce: [Hardware Error]: RIP 10:<ffffffff8103bbcb> {trigger_mce+0xb/0x10}
[ 13.706791] mce: [Hardware Error]: TSC c83826d14 ADDR e1101add1e550012 MISC cafebeef
[ 13.706795] mce: [Hardware Error]: PROCESSOR 2:a00f11 TIME 1667244167 SOCKET 0 APIC 2 microcode 1000065
[ 13.706809] mce: [Hardware Error]: Machine check: Processor Context Corrupt
[ 13.706810] panic: on entry: oops_in_progress: 1
[ 13.706812] panic: before bust_spinlocks oops_in_progress: 1
[ 13.706813] Kernel panic - not syncing: Fatal local machine check
[ 13.706814] panic: taint: 0, oops_in_progress: 2
[ 13.707133] Kernel Offset: disabled
as panic() is being entered with oops_in_progress already set to 1. That
oops_in_progress thing looks like it is being used for console unblanking.
Commit
026ee1f66aaa ("panic: fix stack dump print on direct call to panic()")
hints that panic() might've been called twice for oops_in_progress to
already be 1 on entry.
I guess we need to figure out why that is...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette