All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Day, Michael" <Michael.Day@amd.com>
To: "Bilbao, Carlos" <Carlos.Bilbao@amd.com>, "bp@alien8.de" <bp@alien8.de>
Cc: "tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"Ghannam, Yazen" <Yazen.Ghannam@amd.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"bilbao@vt.edu" <bilbao@vt.edu>
Subject: RE: [PATCH v2 2/2] x86/mce: Add messages to describe panic machine errors on AMD's MCEs grading
Date: Thu, 31 Mar 2022 17:17:45 +0000	[thread overview]
Message-ID: <SN6PR12MB46850731F74982FFAE6433BFE8E19@SN6PR12MB4685.namprd12.prod.outlook.com> (raw)
In-Reply-To: <20220331163849.6850-3-carlos.bilbao@amd.com>

[AMD Official Use Only]

-----Original Message-----
From: Bilbao, Carlos <Carlos.Bilbao@amd.com> 
Sent: Thursday, March 31, 2022 12:39 PM
To: bp@alien8.de
Cc: tglx@linutronix.de; mingo@redhat.com; dave.hansen@linux.intel.com; x86@kernel.org; Ghannam, Yazen <Yazen.Ghannam@amd.com>; linux-kernel@vger.kernel.org; linux-edac@vger.kernel.org; bilbao@vt.edu; Bilbao, Carlos <Carlos.Bilbao@amd.com>
Subject: [PATCH v2 2/2] x86/mce: Add messages to describe panic machine errors on AMD's MCEs grading

When a machine error is graded as PANIC by AMD grading logic, the MCE handler calls mce_panic(). The notification chain does not come into effect so the AMD EDAC driver does not decode the errors. In these cases, the messages displayed to the user are more cryptic and miss information that might be relevant, like the context in which the error took place.

Fix the above issue including messages on AMD's grading logic for machine errors graded as PANIC.

Signed-off-by: Carlos Bilbao <carlos.bilbao@amd.com>
---
 arch/x86/kernel/cpu/mce/severity.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index 4d52eef21230..ea4b9407bbad 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -307,6 +307,7 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs)  static noinstr int mce_severity_amd(struct mce *m, struct pt_regs *regs, char **msg, bool is_excp)  {
 	int ret;
+	char *panic_msg;

I suggest char *panic_msg = NULL;

Simply because its stack-based and a non-null value may get assigned to *msg after the amd_severity: label. 
 
 	/*
 	 * Default return value: Action required, the error must be handled @@ -316,6 +317,7 @@ static noinstr int mce_severity_amd(struct mce *m, struct pt_regs *regs, char **
 
 	/* Processor Context Corrupt, no need to fumble too much, die! */
 	if (m->status & MCI_STATUS_PCC) {
+		panic_msg = "Processor Context Corrupt";
 		ret = MCE_PANIC_SEVERITY;
 		goto amd_severity;
 	}
@@ -339,16 +341,21 @@ static noinstr int mce_severity_amd(struct mce *m, struct pt_regs *regs, char **
 
 	if (((m->status & MCI_STATUS_OVER) && !mce_flags.overflow_recov)
 	     || !mce_flags.succor) {
+		panic_msg = "Uncorrected unrecoverable error";
 		ret = MCE_PANIC_SEVERITY;
 		goto amd_severity;
 	}
 
 	if (error_context(m, regs) == IN_KERNEL) {
+		panic_msg = "Uncorrected error in kernel context";
 		ret = MCE_PANIC_SEVERITY;
 	}
 
 amd_severity:
 
+	if (msg && panic_msg)
+		*msg = panic_msg;
+
 	return ret;
 }
 
--
2.31.1

  reply	other threads:[~2022-03-31 17:17 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-31 16:38 [PATCH v2 0/2] x86/mce: Grade new machine errors for AMD MCEs and include messages for panic cases Carlos Bilbao
2022-03-31 16:38 ` [PATCH v2 1/2] x86/mce: Extend AMD severity grading function with new types of errors Carlos Bilbao
2022-04-05 17:18   ` Yazen Ghannam
2022-04-05 17:24     ` Carlos Bilbao
2022-04-05 17:41       ` Yazen Ghannam
2022-04-05 17:46         ` Carlos Bilbao
2022-03-31 16:38 ` [PATCH v2 2/2] x86/mce: Add messages to describe panic machine errors on AMD's MCEs grading Carlos Bilbao
2022-03-31 17:17   ` Day, Michael [this message]
2022-04-05 17:38   ` Yazen Ghannam
  -- strict thread matches above, loose matches on Subject: below --
2022-03-31 16:32 [PATCH v2 0/2] x86/mce: Grade new machine errors for AMD MCEs and include messages for panic cases Carlos Bilbao
2022-03-31 16:32 ` [PATCH v2 2/2] x86/mce: Add messages to describe panic machine errors on AMD's MCEs grading Carlos Bilbao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=SN6PR12MB46850731F74982FFAE6433BFE8E19@SN6PR12MB4685.namprd12.prod.outlook.com \
    --to=michael.day@amd.com \
    --cc=Carlos.Bilbao@amd.com \
    --cc=Yazen.Ghannam@amd.com \
    --cc=bilbao@vt.edu \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.