linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] New SMCA Syndrome registers and FRU Text feature
@ 2022-04-18 17:44 Yazen Ghannam
  2022-04-18 17:44 ` [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers Yazen Ghannam
                   ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Yazen Ghannam @ 2022-04-18 17:44 UTC (permalink / raw)
  To: linux-edac
  Cc: linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa,
	Yazen Ghannam

Hello,

This patchset adds support for two new "syndrome" registers used in
future AMD Scalable MCA (SMCA) systems and also a new "FRU Text in MCA"
feature that uses these new registers.

Patch 1 adds support for the new registers. They are read/printed
wherever the existing MCA_SYND register is used.

Patch 2 updates the function that pulls MCA information from UEFI x86
Common Platform Error Records (CPERs) to handle systems that support the
new registers.

Patch 3 adds support to the AMD MCE decoder module to detect and use the
"FRU Text in MCA" feature which leverages the new registers.

Thanks,
Yazen

Yazen Ghannam (3):
  x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  x86/MCE/APEI: Handle variable register array size
  EDAC/mce_amd: Add support for FRU Text in MCA

 arch/x86/include/asm/mce.h      |  5 +++
 arch/x86/include/uapi/asm/mce.h |  2 +
 arch/x86/kernel/cpu/mce/amd.c   |  5 ++-
 arch/x86/kernel/cpu/mce/apei.c  | 73 ++++++++++++++++++++++++++-------
 arch/x86/kernel/cpu/mce/core.c  |  9 +++-
 drivers/edac/mce_amd.c          | 24 ++++++++---
 include/trace/events/mce.h      |  7 +++-
 7 files changed, 103 insertions(+), 22 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-04-18 17:44 [PATCH 0/3] New SMCA Syndrome registers and FRU Text feature Yazen Ghannam
@ 2022-04-18 17:44 ` Yazen Ghannam
  2022-06-30 11:01   ` Borislav Petkov
  2022-04-18 17:44 ` [PATCH 2/3] x86/MCE/APEI: Handle variable register array size Yazen Ghannam
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 28+ messages in thread
From: Yazen Ghannam @ 2022-04-18 17:44 UTC (permalink / raw)
  To: linux-edac
  Cc: linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa,
	Yazen Ghannam

Future Scalable MCA systems will include two new registers: MCA_SYND1
and MCA_SYND2.

These registers will include supplemental error information in addition
to the existing MCA_SYND register. The data within the registers is
considered valid if MCA_STATUS[SyndV] is set.

Add fields for these registers in struct mce. Save and print these
registers wherever MCA_STATUS[SyndV]/MCA_SYND is currently used.

Note: Checkpatch warnings/errors are ignored to maintain coding style.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
 arch/x86/include/asm/mce.h      | 5 +++++
 arch/x86/include/uapi/asm/mce.h | 2 ++
 arch/x86/kernel/cpu/mce/amd.c   | 5 ++++-
 arch/x86/kernel/cpu/mce/core.c  | 9 ++++++++-
 drivers/edac/mce_amd.c          | 7 +++++--
 include/trace/events/mce.h      | 7 ++++++-
 6 files changed, 30 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index cc73061e7255..25e3e2bc8c0a 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -116,6 +116,9 @@
 #define MSR_AMD64_SMCA_MC0_DESTAT	0xc0002008
 #define MSR_AMD64_SMCA_MC0_DEADDR	0xc0002009
 #define MSR_AMD64_SMCA_MC0_MISC1	0xc000200a
+/* Registers MISC2 to MISC4 are at offsets B to D. */
+#define MSR_AMD64_SMCA_MC0_SYND1	0xc000200e
+#define MSR_AMD64_SMCA_MC0_SYND2	0xc000200f
 #define MSR_AMD64_SMCA_MCx_CTL(x)	(MSR_AMD64_SMCA_MC0_CTL + 0x10*(x))
 #define MSR_AMD64_SMCA_MCx_STATUS(x)	(MSR_AMD64_SMCA_MC0_STATUS + 0x10*(x))
 #define MSR_AMD64_SMCA_MCx_ADDR(x)	(MSR_AMD64_SMCA_MC0_ADDR + 0x10*(x))
@@ -126,6 +129,8 @@
 #define MSR_AMD64_SMCA_MCx_DESTAT(x)	(MSR_AMD64_SMCA_MC0_DESTAT + 0x10*(x))
 #define MSR_AMD64_SMCA_MCx_DEADDR(x)	(MSR_AMD64_SMCA_MC0_DEADDR + 0x10*(x))
 #define MSR_AMD64_SMCA_MCx_MISCy(x, y)	((MSR_AMD64_SMCA_MC0_MISC1 + y) + (0x10*(x)))
+#define MSR_AMD64_SMCA_MCx_SYND1(x)	(MSR_AMD64_SMCA_MC0_SYND1 + 0x10*(x))
+#define MSR_AMD64_SMCA_MCx_SYND2(x)	(MSR_AMD64_SMCA_MC0_SYND2 + 0x10*(x))
 
 #define XEC(x, mask)			(((x) >> 16) & mask)
 
diff --git a/arch/x86/include/uapi/asm/mce.h b/arch/x86/include/uapi/asm/mce.h
index db9adc081c5a..e77663a4abfa 100644
--- a/arch/x86/include/uapi/asm/mce.h
+++ b/arch/x86/include/uapi/asm/mce.h
@@ -36,6 +36,8 @@ struct mce {
 	__u64 ppin;		/* Protected Processor Inventory Number */
 	__u32 microcode;	/* Microcode revision */
 	__u64 kflags;		/* Internal kernel use */
+	__u64 synd1;		/* MCA_SYND1 MSR: only valid on SMCA systems */
+	__u64 synd2;		/* MCA_SYND2 MSR: only valid on SMCA systems */
 };
 
 #define MCE_GET_RECORD_LEN   _IOR('M', 1, int)
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 1c87501e0fa3..23e34e5be7ed 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -750,8 +750,11 @@ static void __log_error(unsigned int bank, u64 status, u64 addr, u64 misc)
 	if (mce_flags.smca) {
 		rdmsrl(MSR_AMD64_SMCA_MCx_IPID(bank), m.ipid);
 
-		if (m.status & MCI_STATUS_SYNDV)
+		if (m.status & MCI_STATUS_SYNDV) {
 			rdmsrl(MSR_AMD64_SMCA_MCx_SYND(bank), m.synd);
+			rdmsrl(MSR_AMD64_SMCA_MCx_SYND1(bank), m.synd1);
+			rdmsrl(MSR_AMD64_SMCA_MCx_SYND2(bank), m.synd2);
+		}
 	}
 
 	mce_log(&m);
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index d775fcd74e98..28e7a3c9ecfe 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -190,6 +190,10 @@ static void __print_mce(struct mce *m)
 	if (mce_flags.smca) {
 		if (m->synd)
 			pr_cont("SYND %llx ", m->synd);
+		if (m->synd1)
+			pr_cont("SYND1 %llx ", m->synd1);
+		if (m->synd2)
+			pr_cont("SYND2 %llx ", m->synd2);
 		if (m->ipid)
 			pr_cont("IPID %llx ", m->ipid);
 	}
@@ -647,8 +651,11 @@ static noinstr void mce_read_aux(struct mce *m, int i)
 	if (mce_flags.smca) {
 		m->ipid = mce_rdmsrl(MSR_AMD64_SMCA_MCx_IPID(i));
 
-		if (m->status & MCI_STATUS_SYNDV)
+		if (m->status & MCI_STATUS_SYNDV) {
 			m->synd = mce_rdmsrl(MSR_AMD64_SMCA_MCx_SYND(i));
+			m->synd1 = mce_rdmsrl(MSR_AMD64_SMCA_MCx_SYND1(i));
+			m->synd2 = mce_rdmsrl(MSR_AMD64_SMCA_MCx_SYND2(i));
+		}
 	}
 }
 
diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index cc5c63feb26a..28b48c711fe0 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1291,8 +1291,11 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
 	if (boot_cpu_has(X86_FEATURE_SMCA)) {
 		pr_emerg(HW_ERR "IPID: 0x%016llx", m->ipid);
 
-		if (m->status & MCI_STATUS_SYNDV)
-			pr_cont(", Syndrome: 0x%016llx", m->synd);
+		if (m->status & MCI_STATUS_SYNDV) {
+			pr_cont(", Syndrome: 0x%016llx\n", m->synd);
+			pr_emerg(HW_ERR "Syndrome1: 0x%016llx, Syndrome2: 0x%016llx",
+					m->synd1, m->synd2);
+		}
 
 		pr_cont("\n");
 
diff --git a/include/trace/events/mce.h b/include/trace/events/mce.h
index 1391ada0da3b..a6826c34a185 100644
--- a/include/trace/events/mce.h
+++ b/include/trace/events/mce.h
@@ -22,6 +22,8 @@ TRACE_EVENT(mce_record,
 		__field(	u64,		addr		)
 		__field(	u64,		misc		)
 		__field(	u64,		synd		)
+		__field(	u64,		synd1		)
+		__field(	u64,		synd2		)
 		__field(	u64,		ipid		)
 		__field(	u64,		ip		)
 		__field(	u64,		tsc		)
@@ -42,6 +44,8 @@ TRACE_EVENT(mce_record,
 		__entry->addr		= m->addr;
 		__entry->misc		= m->misc;
 		__entry->synd		= m->synd;
+		__entry->synd1		= m->synd1;
+		__entry->synd2		= m->synd2;
 		__entry->ipid		= m->ipid;
 		__entry->ip		= m->ip;
 		__entry->tsc		= m->tsc;
@@ -55,12 +59,13 @@ TRACE_EVENT(mce_record,
 		__entry->cpuvendor	= m->cpuvendor;
 	),
 
-	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR/MISC/SYND: %016Lx/%016Lx/%016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PROCESSOR: %u:%x, TIME: %llu, SOCKET: %u, APIC: %x",
+	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR/MISC/SYND: %016Lx/%016Lx/%016Lx, SYND1/SYND2: %016Lx/%016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PROCESSOR: %u:%x, TIME: %llu, SOCKET: %u, APIC: %x",
 		__entry->cpu,
 		__entry->mcgcap, __entry->mcgstatus,
 		__entry->bank, __entry->status,
 		__entry->ipid,
 		__entry->addr, __entry->misc, __entry->synd,
+		__entry->synd1, __entry->synd2,
 		__entry->cs, __entry->ip,
 		__entry->tsc,
 		__entry->cpuvendor, __entry->cpuid,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/3] x86/MCE/APEI: Handle variable register array size
  2022-04-18 17:44 [PATCH 0/3] New SMCA Syndrome registers and FRU Text feature Yazen Ghannam
  2022-04-18 17:44 ` [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers Yazen Ghannam
@ 2022-04-18 17:44 ` Yazen Ghannam
  2022-07-03 12:30   ` Borislav Petkov
  2022-04-18 17:44 ` [PATCH 3/3] EDAC/mce_amd: Add support for FRU Text in MCA Yazen Ghannam
  2022-06-10 16:29 ` [PATCH 0/3] New SMCA Syndrome registers and FRU Text feature Yazen Ghannam
  3 siblings, 1 reply; 28+ messages in thread
From: Yazen Ghannam @ 2022-04-18 17:44 UTC (permalink / raw)
  To: linux-edac
  Cc: linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa,
	Yazen Ghannam

Recent AMD systems may provide an x86 Common Platform Error Record
(CPER) for errors reported in the ACPI Boot Error Record Table (BERT).
The x86 CPER may contain one or more Processor Context Information
Structures. The context structures may represent an x86 MSR range where
a starting address is given, and the data represents a contiguous set of
MSRs starting from, and including, the starting address.

It's common, for AMD systems that implement this behavior, that the MSR
range represents the MCAX register space used for the Scalable MCA
feature. The apei_smca_report_x86_error() function decodes and passes
this information through the MCE notifier chain. However, this function
assumes a fixed register size based on the original HW/FW
implementation.

This assumption breaks with the addition of two new MCAX registers:
MCA_SYND1 and MCA_SYND2. These registers are added at the end of the
MCAX register space, so they won't be included when decoding the CPER
data.

Rework apei_smca_report_x86_error() to support a variable register array
size. This covers any case where the MSR context information starts at
the MCAX address for MCA_STATUS and ends at any other register within
the MCAX register space.

Add code comments indicating the MCAX register at each offset.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
 arch/x86/kernel/cpu/mce/apei.c | 73 +++++++++++++++++++++++++++-------
 1 file changed, 59 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
index 0e3ae64d3b76..7510cd88f7eb 100644
--- a/arch/x86/kernel/cpu/mce/apei.c
+++ b/arch/x86/kernel/cpu/mce/apei.c
@@ -55,7 +55,7 @@ EXPORT_SYMBOL_GPL(apei_mce_report_mem_error);
 int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 {
 	const u64 *i_mce = ((const u64 *) (ctx_info + 1));
-	unsigned int cpu;
+	unsigned int cpu, num_registers;
 	struct mce m;
 
 	if (!boot_cpu_has(X86_FEATURE_SMCA))
@@ -74,16 +74,12 @@ int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 		return -EINVAL;
 
 	/*
-	 * The register array size must be large enough to include all the
-	 * SMCA registers which need to be extracted.
-	 *
 	 * The number of registers in the register array is determined by
 	 * Register Array Size/8 as defined in UEFI spec v2.8, sec N.2.4.2.2.
-	 * The register layout is fixed and currently the raw data in the
-	 * register array includes 6 SMCA registers which the kernel can
-	 * extract.
+	 * Ensure that the array size includes at least 1 register.
 	 */
-	if (ctx_info->reg_arr_size < 48)
+	num_registers = ctx_info->reg_arr_size >> 3;
+	if (!num_registers)
 		return -EINVAL;
 
 	mce_setup(&m);
@@ -101,12 +97,61 @@ int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 
 	m.apicid = lapic_id;
 	m.bank = (ctx_info->msr_addr >> 4) & 0xFF;
-	m.status = *i_mce;
-	m.addr = *(i_mce + 1);
-	m.misc = *(i_mce + 2);
-	/* Skipping MCA_CONFIG */
-	m.ipid = *(i_mce + 4);
-	m.synd = *(i_mce + 5);
+
+	/*
+	 * The SMCA register layout is fixed and includes 16 registers.
+	 * The end of the array may be variable, but the beginning is known.
+	 * Switch on the number of registers. Cap the number of registers to
+	 * expected max (15).
+	 */
+	if (num_registers > 15)
+		num_registers = 15;
+
+	switch (num_registers) {
+	/* MCA_SYND2 */
+	case 15:
+		m.synd2 = *(i_mce + 14);
+		fallthrough;
+	/* MCA_SYND1 */
+	case 14:
+		m.synd1 = *(i_mce + 13);
+		fallthrough;
+	/* MCA_MISC4 */
+	case 13:
+	/* MCA_MISC3 */
+	case 12:
+	/* MCA_MISC2 */
+	case 11:
+	/* MCA_MISC1 */
+	case 10:
+	/* MCA_DEADDR */
+	case 9:
+	/* MCA_DESTAT */
+	case 8:
+	/* reserved */
+	case 7:
+	/* MCA_SYND */
+	case 6:
+		m.synd = *(i_mce + 5);
+		fallthrough;
+	/* MCA_IPID */
+	case 5:
+		m.ipid = *(i_mce + 4);
+		fallthrough;
+	/* MCA_CONFIG */
+	case 4:
+	/* MCA_MISC0 */
+	case 3:
+		m.misc = *(i_mce + 2);
+		fallthrough;
+	/* MCA_ADDR */
+	case 2:
+		m.addr = *(i_mce + 1);
+		fallthrough;
+	/* MCA_STATUS */
+	case 1:
+		m.status = *i_mce;
+	}
 
 	mce_log(&m);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/3] EDAC/mce_amd: Add support for FRU Text in MCA
  2022-04-18 17:44 [PATCH 0/3] New SMCA Syndrome registers and FRU Text feature Yazen Ghannam
  2022-04-18 17:44 ` [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers Yazen Ghannam
  2022-04-18 17:44 ` [PATCH 2/3] x86/MCE/APEI: Handle variable register array size Yazen Ghannam
@ 2022-04-18 17:44 ` Yazen Ghannam
  2022-07-04  9:13   ` Borislav Petkov
  2022-06-10 16:29 ` [PATCH 0/3] New SMCA Syndrome registers and FRU Text feature Yazen Ghannam
  3 siblings, 1 reply; 28+ messages in thread
From: Yazen Ghannam @ 2022-04-18 17:44 UTC (permalink / raw)
  To: linux-edac
  Cc: linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa,
	Yazen Ghannam

A new "FRU Text in MCA" feature is defined where the Field Replaceable
Unit (FRU) Text for a device is represented by a string in the new
MCA_SYND1 and MCA_SYND2 registers. This feature is supported per MCA
bank, and it is advertised by the McaFruTextInMca bit (MCA_CONFIG[9]).

The FRU Text is populated dynamically for each individual error state
(MCA_STATUS, MCA_ADDR, et al.). This handles the case where an MCA bank
covers multiple devices, for example, a Unified Memory Controller (UMC)
bank that manages two DIMMs.

Print the FRU Text string, if available, when decoding an MCA error.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
 drivers/edac/mce_amd.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/edac/mce_amd.c b/drivers/edac/mce_amd.c
index 28b48c711fe0..3cacc3f22379 100644
--- a/drivers/edac/mce_amd.c
+++ b/drivers/edac/mce_amd.c
@@ -1235,6 +1235,7 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
 {
 	struct mce *m = (struct mce *)data;
 	unsigned int fam = x86_family(m->cpuid);
+	u64 mca_config = 0;
 	int ecc;
 
 	if (m->kflags & MCE_HANDLED_CEC)
@@ -1254,11 +1255,10 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
 		((m->status & MCI_STATUS_PCC)	? "PCC"	  : "-"));
 
 	if (boot_cpu_has(X86_FEATURE_SMCA)) {
-		u32 low, high;
 		u32 addr = MSR_AMD64_SMCA_MCx_CONFIG(m->bank);
 
-		if (!rdmsr_safe(addr, &low, &high) &&
-		    (low & MCI_CONFIG_MCAX))
+		if (!rdmsrl_safe_on_cpu(m->extcpu, addr, &mca_config) &&
+		    (mca_config & MCI_CONFIG_MCAX))
 			pr_cont("|%s", ((m->status & MCI_STATUS_TCC) ? "TCC" : "-"));
 
 		pr_cont("|%s", ((m->status & MCI_STATUS_SYNDV) ? "SyndV" : "-"));
@@ -1300,6 +1300,17 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
 		pr_cont("\n");
 
 		decode_smca_error(m);
+
+		if (mca_config & BIT(9)) {
+			char frutext[32];
+
+			memset(frutext, 0, sizeof(frutext));
+			memcpy(&frutext[0], &m->synd1, 8);
+			memcpy(&frutext[8], &m->synd2, 8);
+
+			pr_emerg(HW_ERR "FRU Text: %s\n", frutext);
+		}
+
 		goto err_code;
 	}
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/3] New SMCA Syndrome registers and FRU Text feature
  2022-04-18 17:44 [PATCH 0/3] New SMCA Syndrome registers and FRU Text feature Yazen Ghannam
                   ` (2 preceding siblings ...)
  2022-04-18 17:44 ` [PATCH 3/3] EDAC/mce_amd: Add support for FRU Text in MCA Yazen Ghannam
@ 2022-06-10 16:29 ` Yazen Ghannam
  3 siblings, 0 replies; 28+ messages in thread
From: Yazen Ghannam @ 2022-06-10 16:29 UTC (permalink / raw)
  To: linux-edac; +Cc: linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa

On Mon, Apr 18, 2022 at 05:44:37PM +0000, Yazen Ghannam wrote:
> Hello,
> 
> This patchset adds support for two new "syndrome" registers used in
> future AMD Scalable MCA (SMCA) systems and also a new "FRU Text in MCA"
> feature that uses these new registers.
> 
> Patch 1 adds support for the new registers. They are read/printed
> wherever the existing MCA_SYND register is used.
> 
> Patch 2 updates the function that pulls MCA information from UEFI x86
> Common Platform Error Records (CPERs) to handle systems that support the
> new registers.
> 
> Patch 3 adds support to the AMD MCE decoder module to detect and use the
> "FRU Text in MCA" feature which leverages the new registers.
> 
> Thanks,
> Yazen
> 
> Yazen Ghannam (3):
>   x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
>   x86/MCE/APEI: Handle variable register array size
>   EDAC/mce_amd: Add support for FRU Text in MCA
> 
>  arch/x86/include/asm/mce.h      |  5 +++
>  arch/x86/include/uapi/asm/mce.h |  2 +
>  arch/x86/kernel/cpu/mce/amd.c   |  5 ++-
>  arch/x86/kernel/cpu/mce/apei.c  | 73 ++++++++++++++++++++++++++-------
>  arch/x86/kernel/cpu/mce/core.c  |  9 +++-
>  drivers/edac/mce_amd.c          | 24 ++++++++---
>  include/trace/events/mce.h      |  7 +++-
>  7 files changed, 103 insertions(+), 22 deletions(-)
> 
> -- 

Hi everyone,

Any comments on this set? Thanks!

-Yazen

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-04-18 17:44 ` [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers Yazen Ghannam
@ 2022-06-30 11:01   ` Borislav Petkov
  2022-07-11 17:31     ` Yazen Ghannam
  0 siblings, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2022-06-30 11:01 UTC (permalink / raw)
  To: Yazen Ghannam
  Cc: linux-edac, linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa

On Mon, Apr 18, 2022 at 05:44:38PM +0000, Yazen Ghannam wrote:
> Future Scalable MCA systems will include two new registers: MCA_SYND1
> and MCA_SYND2.
> 
> These registers will include supplemental error information in addition
> to the existing MCA_SYND register. The data within the registers is
> considered valid if MCA_STATUS[SyndV] is set.
> 
> Add fields for these registers in struct mce. Save and print these
> registers wherever MCA_STATUS[SyndV]/MCA_SYND is currently used.

That's all fine and good but what kind of supplemental error information
are we talking about here? Example?

And how is that error info going to be used in luserspace?

I don't want to increase struct mce record size by 16 bytes and those
end up unused.

Can the information from MCA_SYND{,1,2} be synthesized into a smaller
quantity an then fed to userspace?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] x86/MCE/APEI: Handle variable register array size
  2022-04-18 17:44 ` [PATCH 2/3] x86/MCE/APEI: Handle variable register array size Yazen Ghannam
@ 2022-07-03 12:30   ` Borislav Petkov
  2022-07-11 17:32     ` Yazen Ghannam
  0 siblings, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2022-07-03 12:30 UTC (permalink / raw)
  To: Yazen Ghannam
  Cc: linux-edac, linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa

On Mon, Apr 18, 2022 at 05:44:39PM +0000, Yazen Ghannam wrote:
> Recent AMD systems may provide an x86 Common Platform Error Record
> (CPER) for errors reported in the ACPI Boot Error Record Table (BERT).
> The x86 CPER may contain one or more Processor Context Information
> Structures. The context structures may represent an x86 MSR range where
> a starting address is given, and the data represents a contiguous set of
> MSRs starting from, and including, the starting address.

You're killing me with these "may" formulations. Just say it once and
then drop it. I mean, we know some future hw "may" support something
new - you can just as well drop the "may" thing because if it only may
and it turns out it might not, you don't even have to do the work and
enabling it and sending the patch.

So no need to do that - the patch commit message should talk purely
about functionality and not sound like some vendor doc - there are
enough of those.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] EDAC/mce_amd: Add support for FRU Text in MCA
  2022-04-18 17:44 ` [PATCH 3/3] EDAC/mce_amd: Add support for FRU Text in MCA Yazen Ghannam
@ 2022-07-04  9:13   ` Borislav Petkov
  2022-07-11 17:41     ` Yazen Ghannam
  0 siblings, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2022-07-04  9:13 UTC (permalink / raw)
  To: Yazen Ghannam
  Cc: linux-edac, linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa

On Mon, Apr 18, 2022 at 05:44:40PM +0000, Yazen Ghannam wrote:
> @@ -1254,11 +1255,10 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
>  		((m->status & MCI_STATUS_PCC)	? "PCC"	  : "-"));
>  
>  	if (boot_cpu_has(X86_FEATURE_SMCA)) {
> -		u32 low, high;
>  		u32 addr = MSR_AMD64_SMCA_MCx_CONFIG(m->bank);
>  
> -		if (!rdmsr_safe(addr, &low, &high) &&
> -		    (low & MCI_CONFIG_MCAX))
> +		if (!rdmsrl_safe_on_cpu(m->extcpu, addr, &mca_config) &&

This change needs to be mentioned in the commit message.

> +		    (mca_config & MCI_CONFIG_MCAX))
>  			pr_cont("|%s", ((m->status & MCI_STATUS_TCC) ? "TCC" : "-"));
>  
>  		pr_cont("|%s", ((m->status & MCI_STATUS_SYNDV) ? "SyndV" : "-"));
> @@ -1300,6 +1300,17 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
>  		pr_cont("\n");

So here above the code prints SYND1 and SYND2.

If they contain FRU strings, then this printing should be an if-else by
checking bit 9.

	if (BIT(9))
		print fru text
	else
		print naked syndromes


>  		decode_smca_error(m);
> +
> +		if (mca_config & BIT(9)) {
> +			char frutext[32];

Why 32?

> +			memset(frutext, 0, sizeof(frutext));
> +			memcpy(&frutext[0], &m->synd1, 8);
> +			memcpy(&frutext[8], &m->synd2, 8);
> +
> +			pr_emerg(HW_ERR "FRU Text: %s\n", frutext);
> +		}
> +
>  		goto err_code;
>  	}

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-06-30 11:01   ` Borislav Petkov
@ 2022-07-11 17:31     ` Yazen Ghannam
  2022-07-18  8:57       ` Borislav Petkov
  0 siblings, 1 reply; 28+ messages in thread
From: Yazen Ghannam @ 2022-07-11 17:31 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-edac, linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa

On Thu, Jun 30, 2022 at 01:01:58PM +0200, Borislav Petkov wrote:
> On Mon, Apr 18, 2022 at 05:44:38PM +0000, Yazen Ghannam wrote:
> > Future Scalable MCA systems will include two new registers: MCA_SYND1
> > and MCA_SYND2.
> > 
> > These registers will include supplemental error information in addition
> > to the existing MCA_SYND register. The data within the registers is
> > considered valid if MCA_STATUS[SyndV] is set.
> > 
> > Add fields for these registers in struct mce. Save and print these
> > registers wherever MCA_STATUS[SyndV]/MCA_SYND is currently used.
> 
> That's all fine and good but what kind of supplemental error information
> are we talking about here? Example?
> 
> And how is that error info going to be used in luserspace?
>

I think the general case will be more bank-specific information. For example,
if the bank is a cache type then the info one format and if the bank is a CPU
type then it'll be a different format, etc. So I think the new info will be
treated the same as the old info, i.e. collect all the raw data and share it
with a hardware debug person.

The one example where this is different is the "FRU Text" case covered in a
following patch in this set. 

> I don't want to increase struct mce record size by 16 bytes and those
> end up unused.
> 
> Can the information from MCA_SYND{,1,2} be synthesized into a smaller
> quantity an then fed to userspace?
>

I don't think so, at least not at the moment. There aren't any "architectural"
fields that can be interpreted the same accross multiple errors types and
banks.

Is your concern specifically on growing/changing struct mce, or is it more
about limiting info sent to userspace?

If it's the former, then I've been thinking it would be good to introduce a
new internal "struct mce_ext" that includes struct mce plus other things. This
way struct mce can still be uapi, and things like mcelog can use it. And at
the same time we can new data used in the kernel or shared through
tracepoints.

/* Extended MCE structure */
struct mce_ext {
	struct mce *m;
	/* new stuff here */
};

What do you think?

Thanks,
Yazen

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/3] x86/MCE/APEI: Handle variable register array size
  2022-07-03 12:30   ` Borislav Petkov
@ 2022-07-11 17:32     ` Yazen Ghannam
  0 siblings, 0 replies; 28+ messages in thread
From: Yazen Ghannam @ 2022-07-11 17:32 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-edac, linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa

On Sun, Jul 03, 2022 at 02:30:24PM +0200, Borislav Petkov wrote:
> On Mon, Apr 18, 2022 at 05:44:39PM +0000, Yazen Ghannam wrote:
> > Recent AMD systems may provide an x86 Common Platform Error Record
> > (CPER) for errors reported in the ACPI Boot Error Record Table (BERT).
> > The x86 CPER may contain one or more Processor Context Information
> > Structures. The context structures may represent an x86 MSR range where
> > a starting address is given, and the data represents a contiguous set of
> > MSRs starting from, and including, the starting address.
> 
> You're killing me with these "may" formulations. Just say it once and
> then drop it. I mean, we know some future hw "may" support something
> new - you can just as well drop the "may" thing because if it only may
> and it turns out it might not, you don't even have to do the work and
> enabling it and sending the patch.
> 
> So no need to do that - the patch commit message should talk purely
> about functionality and not sound like some vendor doc - there are
> enough of those.
>

Understood.

Thanks,
Yazen

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/3] EDAC/mce_amd: Add support for FRU Text in MCA
  2022-07-04  9:13   ` Borislav Petkov
@ 2022-07-11 17:41     ` Yazen Ghannam
  0 siblings, 0 replies; 28+ messages in thread
From: Yazen Ghannam @ 2022-07-11 17:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-edac, linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa

On Mon, Jul 04, 2022 at 11:13:56AM +0200, Borislav Petkov wrote:
> On Mon, Apr 18, 2022 at 05:44:40PM +0000, Yazen Ghannam wrote:
> > @@ -1254,11 +1255,10 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
> >  		((m->status & MCI_STATUS_PCC)	? "PCC"	  : "-"));
> >  
> >  	if (boot_cpu_has(X86_FEATURE_SMCA)) {
> > -		u32 low, high;
> >  		u32 addr = MSR_AMD64_SMCA_MCx_CONFIG(m->bank);
> >  
> > -		if (!rdmsr_safe(addr, &low, &high) &&
> > -		    (low & MCI_CONFIG_MCAX))
> > +		if (!rdmsrl_safe_on_cpu(m->extcpu, addr, &mca_config) &&
> 
> This change needs to be mentioned in the commit message.
>

Okay.

> > +		    (mca_config & MCI_CONFIG_MCAX))
> >  			pr_cont("|%s", ((m->status & MCI_STATUS_TCC) ? "TCC" : "-"));
> >  
> >  		pr_cont("|%s", ((m->status & MCI_STATUS_SYNDV) ? "SyndV" : "-"));
> > @@ -1300,6 +1300,17 @@ amd_decode_mce(struct notifier_block *nb, unsigned long val, void *data)
> >  		pr_cont("\n");
> 
> So here above the code prints SYND1 and SYND2.
> 
> If they contain FRU strings, then this printing should be an if-else by
> checking bit 9.
> 
> 	if (BIT(9))
> 		print fru text
> 	else
> 		print naked syndromes
> 
>

Okay, will change.

> >  		decode_smca_error(m);
> > +
> > +		if (mca_config & BIT(9)) {
> > +			char frutext[32];
> 
> Why 32?
>

I picked the next power of 2 after 16 in order to have some space for
terminating NULL in the string. I can't think of a good reason it needs to be
a power of 2, so I can change this to 17.

> > +			memset(frutext, 0, sizeof(frutext));
> > +			memcpy(&frutext[0], &m->synd1, 8);
> > +			memcpy(&frutext[8], &m->synd2, 8);
> > +
> > +			pr_emerg(HW_ERR "FRU Text: %s\n", frutext);
> > +		}
> > +
> >  		goto err_code;
> >  	}
>

Thanks,
Yazen

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-07-11 17:31     ` Yazen Ghannam
@ 2022-07-18  8:57       ` Borislav Petkov
  2022-07-18 13:50         ` Borislav Petkov
  0 siblings, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2022-07-18  8:57 UTC (permalink / raw)
  To: Yazen Ghannam
  Cc: linux-edac, linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa

On Mon, Jul 11, 2022 at 05:31:41PM +0000, Yazen Ghannam wrote:
> Is your concern specifically on growing/changing struct mce, or is it more
> about limiting info sent to userspace?

Well, both, kinda.

> If it's the former, then I've been thinking it would be good to introduce a
> new internal "struct mce_ext" that includes struct mce plus other things. This
> way struct mce can still be uapi, and things like mcelog can use it. And at
> the same time we can new data used in the kernel or shared through
> tracepoints.
> 
> /* Extended MCE structure */
> struct mce_ext {
> 	struct mce *m;
> 	/* new stuff here */
> };
> 
> What do you think?

The moment you make it part of a trace record, it practically becomes
ABI.

So we could have some opaque blob which is called vendor-specific data
and which is dumped raw to userspace, without any specification to its
format so that luserspace doesn't get any ideas...

Lemme talk to rostedt.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-07-18  8:57       ` Borislav Petkov
@ 2022-07-18 13:50         ` Borislav Petkov
  2022-08-02 12:22           ` Yazen Ghannam
  0 siblings, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2022-07-18 13:50 UTC (permalink / raw)
  To: Yazen Ghannam
  Cc: linux-edac, linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa

On Mon, Jul 18, 2022 at 10:57:19AM +0200, Borislav Petkov wrote:
> Lemme talk to rostedt.

Right, he says __dynamic_array(). Grep the tree for examples.

I'm thinking we can add a field called vendor_info or so, at the end of
the TP and then dump whatever fields we want there.

We can even slap a flag in front of the whole thing saying what the
vendor info type is. But we don't have to get ahead of ourselves and
keep it simple first.

How does that sound?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-07-18 13:50         ` Borislav Petkov
@ 2022-08-02 12:22           ` Yazen Ghannam
  2022-08-02 16:58             ` Luck, Tony
  2022-10-24 16:09             ` Borislav Petkov
  0 siblings, 2 replies; 28+ messages in thread
From: Yazen Ghannam @ 2022-08-02 12:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-edac, linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa

On Mon, Jul 18, 2022 at 03:50:46PM +0200, Borislav Petkov wrote:
> On Mon, Jul 18, 2022 at 10:57:19AM +0200, Borislav Petkov wrote:
> > Lemme talk to rostedt.
> 
> Right, he says __dynamic_array(). Grep the tree for examples.
> 
> I'm thinking we can add a field called vendor_info or so, at the end of
> the TP and then dump whatever fields we want there.
> 
> We can even slap a flag in front of the whole thing saying what the
> vendor info type is. But we don't have to get ahead of ourselves and
> keep it simple first.
> 
> How does that sound?
>

Sounds okay to me. But how should this look in the internal kernel data
structures?

struct mce {
    ...
    void *vendor_info; /* Points to a vendor-defined struct. */
};

..or..

struct mce_ext {
    struct mce *m;
    void *vendor_info;
};

I ask because struct mce is UAPI. But I think this is just for /dev/mcelog,
and this has been deprecated for a while. So on a related note, should
/dev/mcelog be removed and struct mce moved out of UAPI? Then changes to
struct mce won't affect user space, and we can just consider the mce trace
event when reporting to user space.

Thanks,
Yazen

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-08-02 12:22           ` Yazen Ghannam
@ 2022-08-02 16:58             ` Luck, Tony
  2022-10-24 16:09             ` Borislav Petkov
  1 sibling, 0 replies; 28+ messages in thread
From: Luck, Tony @ 2022-08-02 16:58 UTC (permalink / raw)
  To: Yazen Ghannam, Borislav Petkov
  Cc: linux-edac, linux-kernel, x86, Smita.KoralahalliChannabasappa

> I ask because struct mce is UAPI. But I think this is just for /dev/mcelog,
> and this has been deprecated for a while. So on a related note, should
> /dev/mcelog be removed and struct mce moved out of UAPI? Then changes to
> struct mce won't affect user space, and we can just consider the mce trace
> event when reporting to user space.

Even though deprecated, mcelog still has users, so I don't think it should go away
yet.

-Tony

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-08-02 12:22           ` Yazen Ghannam
  2022-08-02 16:58             ` Luck, Tony
@ 2022-10-24 16:09             ` Borislav Petkov
  2022-10-24 16:38               ` Tony Luck
  1 sibling, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2022-10-24 16:09 UTC (permalink / raw)
  To: Yazen Ghannam
  Cc: linux-edac, linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa

On Tue, Aug 02, 2022 at 12:22:20PM +0000, Yazen Ghannam wrote:
> I ask because struct mce is UAPI. But I think this is just for /dev/mcelog,
> and this has been deprecated for a while. So on a related note, should
> /dev/mcelog be removed and struct mce moved out of UAPI? Then changes to
> struct mce won't affect user space, and we can just consider the mce trace
> event when reporting to user space.

Question is, do you want those error records to be fed into mcelog on
AMD too?

And I remember you guys supporting it at some point.

The answer to that question will tell you how exactly to build your
structure of data you shuffle to luserspace.

I'd say.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-10-24 16:09             ` Borislav Petkov
@ 2022-10-24 16:38               ` Tony Luck
  2022-10-24 20:30                 ` Borislav Petkov
  0 siblings, 1 reply; 28+ messages in thread
From: Tony Luck @ 2022-10-24 16:38 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Yazen Ghannam, linux-edac, linux-kernel, x86,
	Smita.KoralahalliChannabasappa

On Mon, Oct 24, 2022 at 06:09:10PM +0200, Borislav Petkov wrote:
> On Tue, Aug 02, 2022 at 12:22:20PM +0000, Yazen Ghannam wrote:
> > I ask because struct mce is UAPI. But I think this is just for /dev/mcelog,
> > and this has been deprecated for a while. So on a related note, should
> > /dev/mcelog be removed and struct mce moved out of UAPI? Then changes to
> > struct mce won't affect user space, and we can just consider the mce trace
> > event when reporting to user space.
> 
> Question is, do you want those error records to be fed into mcelog on
> AMD too?
> 
> And I remember you guys supporting it at some point.
> 
> The answer to that question will tell you how exactly to build your
> structure of data you shuffle to luserspace.

There are still a fair number of users of mcelog, so I think it needs
to remain in its half-undead state a while longer.

Changes to "struct mce" have always been supported. Several have
been made over the years. The rules are quite simple:

1) Do not remove any existing fields
2) Legacy fields that are no longer used should have value 0.
3) Kernel internal values (currently just "kflags") should be
   zeroed in the structures passed out to user space.
3) New fields must be added at the end.

-Tony

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-10-24 16:38               ` Tony Luck
@ 2022-10-24 20:30                 ` Borislav Petkov
  2022-10-24 21:08                   ` Luck, Tony
  0 siblings, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2022-10-24 20:30 UTC (permalink / raw)
  To: Tony Luck
  Cc: Yazen Ghannam, linux-edac, linux-kernel, x86,
	Smita.KoralahalliChannabasappa

On Mon, Oct 24, 2022 at 09:38:44AM -0700, Tony Luck wrote:
> There are still a fair number of users of mcelog, so I think it needs
> to remain in its half-undead state a while longer.

That's not the question - the question is how should vendor-specific
info should be logged so that the struct mce record doesn't get blown
up or ends up containing unused fields on the other vendor.

I.e., how to keep it as small as possible and to share the space there
in the most compact way.

That vendor-specific "space" in there could be used by each vendor
differently. As in this case, Intel doesn't have MCA_SYND{1,2} u64
values. So they could be part of a vendor_info which gets interpreted
based on vendor.

When Intel wants to carry more info through struct mce to userspace,
it can reuse those 2 u64s which are vendor_info but interpret them
differently.

Which then begs the question, how should those get logged etc.

I guess a u8 vendor_info[VENDOR_INFO_SIZE] or so which we can extend
later if needed.

Perhaps prepend it with its length too:

	error_record {
		struct mce;
		unsigned int vendor_info_len;
		u8 vendor_info[vendor_info_len];
	};

For example.

Not saying this is how it should be done - this is just what is swirling
around in my head right now.

Hmm.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-10-24 20:30                 ` Borislav Petkov
@ 2022-10-24 21:08                   ` Luck, Tony
  2022-10-24 21:23                     ` Borislav Petkov
  0 siblings, 1 reply; 28+ messages in thread
From: Luck, Tony @ 2022-10-24 21:08 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Yazen Ghannam, linux-edac, linux-kernel, x86,
	Smita.KoralahalliChannabasappa

We already have:

        __u8  cpuvendor;        /* Kernel's X86_VENDOR enum */

So the mce structure contains which vendor created it.

> I guess a u8 vendor_info[VENDOR_INFO_SIZE] or so which we can extend
> later if needed.

Extending is hard because we already boxed in the two AMD specific fields
with some generic fields that follow (ppin & microcode).

But we could change the current form to be something like:

	union {
		struct vendor_info {
			__u64	vendor_info[2];
		};
		struct vendor_amd_info {
			__u64 synd;             /* MCA_SYND MSR: only valid on SMCA systems */
			__u64 ipid;             /* MCA_IPID MSR: only valid on SMCA systems */
		};
	};

to make it clear that these 16 bytes are up for grabs to be re-interpreted based on
the value of "cpuvendor" (and possibly also "cpuid" if a vendor wants different data
logged for different models).

-Tony

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-10-24 21:08                   ` Luck, Tony
@ 2022-10-24 21:23                     ` Borislav Petkov
  2022-10-24 21:32                       ` Luck, Tony
  0 siblings, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2022-10-24 21:23 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Yazen Ghannam, linux-edac, linux-kernel, x86,
	Smita.KoralahalliChannabasappa

On Mon, Oct 24, 2022 at 09:08:54PM +0000, Luck, Tony wrote:
> We already have:
> 
>         __u8  cpuvendor;        /* Kernel's X86_VENDOR enum */
> 
> So the mce structure contains which vendor created it.
> 
> > I guess a u8 vendor_info[VENDOR_INFO_SIZE] or so which we can extend
> > later if needed.
> 
> Extending is hard because we already boxed in the two AMD specific fields
> with some generic fields that follow (ppin & microcode).
> 
> But we could change the current form to be something like:
> 
> 	union {
> 		struct vendor_info {
> 			__u64	vendor_info[2];
> 		};
> 		struct vendor_amd_info {
> 			__u64 synd;             /* MCA_SYND MSR: only valid on SMCA systems */
> 			__u64 ipid;             /* MCA_IPID MSR: only valid on SMCA systems */
> 		};
> 	};
> 
> to make it clear that these 16 bytes are up for grabs to be re-interpreted based on
> the value of "cpuvendor" (and possibly also "cpuid" if a vendor wants different data
> logged for different models).

Yes, there's that. But this thread contains a patch which wants to add
two more fields.

So the above you're proposing we could do if you wanna reuse that info
on Intel.

But again, there's this other question how to add a *new* vendor_info
at the *end* of the structure which can be shared too *and* which could
potentially be enlarged.

And, if we do

	struct mce;
	vendor_info;
	<field shared by the two vendors>

then we're boxing in that vendor_info again and we cannot enlarge it
either.

That's why I'm proposing this prepended length in front of the
vendor_info field so that it can be extended properly in the future
and new shared members can also be added and this whole scheme can be
forward-compatible, so to speak, without having to cast anything in
stone.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-10-24 21:23                     ` Borislav Petkov
@ 2022-10-24 21:32                       ` Luck, Tony
  2022-10-24 21:52                         ` Luck, Tony
  0 siblings, 1 reply; 28+ messages in thread
From: Luck, Tony @ 2022-10-24 21:32 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Yazen Ghannam, linux-edac, linux-kernel, x86,
	Smita.KoralahalliChannabasappa

> That's why I'm proposing this prepended length in front of the
> vendor_info field so that it can be extended properly in the future
> and new shared members can also be added and this whole scheme can be
> forward-compatible, so to speak, without having to cast anything in
> stone.

I missed the pre-pended length bit ... with that it all makes sense.

-Tony

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-10-24 21:32                       ` Luck, Tony
@ 2022-10-24 21:52                         ` Luck, Tony
  2022-10-25 16:35                           ` Yazen Ghannam
  0 siblings, 1 reply; 28+ messages in thread
From: Luck, Tony @ 2022-10-24 21:52 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Yazen Ghannam, linux-edac, linux-kernel, x86,
	Smita.KoralahalliChannabasappa

> I missed the pre-pended length bit ... with that it all makes sense.

Though the other place where mce records are visible to user space
is in trace records. See:

    include/trace/events/mce.h

(N.B. this needs an update to include the ppin and microcode fields).

If these new fields need to be in the trace log, and we want to make
it easy to extend, then have to figure out how to handle this in a way
that doesn't confuse applications (rasdaemon ... are there others)
that consume the trace records.

-Tony

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-10-24 21:52                         ` Luck, Tony
@ 2022-10-25 16:35                           ` Yazen Ghannam
  2022-10-25 16:46                             ` Luck, Tony
  2022-10-25 18:05                             ` Borislav Petkov
  0 siblings, 2 replies; 28+ messages in thread
From: Yazen Ghannam @ 2022-10-25 16:35 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Borislav Petkov, linux-edac, linux-kernel, x86,
	Smita.KoralahalliChannabasappa

On Mon, Oct 24, 2022 at 09:52:45PM +0000, Luck, Tony wrote:
> > I missed the pre-pended length bit ... with that it all makes sense.
> 
> Though the other place where mce records are visible to user space
> is in trace records. See:
> 
>     include/trace/events/mce.h
> 
> (N.B. this needs an update to include the ppin and microcode fields).
> 
> If these new fields need to be in the trace log, and we want to make
> it easy to extend, then have to figure out how to handle this in a way
> that doesn't confuse applications (rasdaemon ... are there others)
> that consume the trace records.
>

Hi folks,

I think the "right way" to use tracepoints is to parse the data according to
the format provided by the tracepoint itself. You can see an example of
rasdaemon doing that here:
https://github.com/mchehab/rasdaemon/blob/c2255178a49f62c53009a456bc37dd5e37332f09/ras-mce-handler.c#L386

A tracepoint user should not assume a fixed struct layout, so adding and
rearranging fields shouldn't be a problem. I'm not sure about removing a
field. It seems to me that this should be okay in the interface, and it's up
to the application how it wants to handle a field that isn't found.

Also, rasdaemon does already support raw, variable-length data:
https://github.com/mchehab/rasdaemon/blob/master/ras-non-standard-handler.c

This could be an example used to update the MCE part.

I think the only (or popular?) userspace tool that relies on the layout of
struct mce is mcelog. This is not supported on modern AMD systems, and we
refer users to rasdaemon instead.

Another option could be to define a new tracepoint. Userspace already needs to
be updated to recognize new fields, so I don't think it's much of a stretch to
add a new tracepoint handler. This may be simpler than trying to fit
vendor-specific info into an existing tracepoint and then decoding it later in
userspace.

I do like the suggestion from Boris to have a length and vendor data in struct
mce. This should keep mcelog happy while letting us treat struct mce as a
semi-internal kernel structure. Also, this avoids having to mess around with
all the notifier chain definitions.

I'll start working on an implementation if that's okay with you all. I'll
include kernel and rasdaemon patches together so we can have context for
discussion.

Thanks,
Yazen

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-10-25 16:35                           ` Yazen Ghannam
@ 2022-10-25 16:46                             ` Luck, Tony
  2022-10-25 18:05                             ` Borislav Petkov
  1 sibling, 0 replies; 28+ messages in thread
From: Luck, Tony @ 2022-10-25 16:46 UTC (permalink / raw)
  To: Yazen Ghannam
  Cc: Borislav Petkov, linux-edac, linux-kernel, x86,
	Smita.KoralahalliChannabasappa

> I do like the suggestion from Boris to have a length and vendor data in struct
> mce. This should keep mcelog happy while letting us treat struct mce as a
> semi-internal kernel structure. Also, this avoids having to mess around with
> all the notifier chain definitions.
>
> I'll start working on an implementation if that's okay with you all. I'll
> include kernel and rasdaemon patches together so we can have context for
> discussion.

Sounds good to me. I'll work on the mcelog changes once you have some
agreed definition for the "struct mce" changes.

-Tony

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-10-25 16:35                           ` Yazen Ghannam
  2022-10-25 16:46                             ` Luck, Tony
@ 2022-10-25 18:05                             ` Borislav Petkov
  2022-10-25 19:28                               ` Steven Rostedt
  1 sibling, 1 reply; 28+ messages in thread
From: Borislav Petkov @ 2022-10-25 18:05 UTC (permalink / raw)
  To: Yazen Ghannam
  Cc: Luck, Tony, linux-edac, linux-kernel, x86,
	Smita.KoralahalliChannabasappa, Steven Rostedt

On Tue, Oct 25, 2022 at 04:35:15PM +0000, Yazen Ghannam wrote:
> I think the "right way" to use tracepoints is to parse the data according to
> the format provided by the tracepoint itself. You can see an example of
> rasdaemon doing that here:
> https://github.com/mchehab/rasdaemon/blob/c2255178a49f62c53009a456bc37dd5e37332f09/ras-mce-handler.c#L386

Lemme add Rostedt.

So now we have libtraceevent and here's an example how to do it:

https://patchwork.kernel.org/project/linux-trace-devel/patch/20221021182345.092cdb50@gandalf.local.home/
https://www.trace-cmd.org/Documentation/libtracefs/libtracefs-kprobes.html

Reportedly, rasdaemon is still using the old libtraceevent method. So it
probably should be updated to use the new library version.

> A tracepoint user should not assume a fixed struct layout, so adding
> and rearranging fields shouldn't be a problem. I'm not sure about
> removing a field. It seems to me that this should be okay in the
> interface, and it's up to the application how it wants to handle a
> field that isn't found.

From looking at the examples, I think the new libtraceevent should be
able to handle all that just fine.

> Another option could be to define a new tracepoint.

Yeah, no. Let's get this one to work pls.

trace_mce_record2() looks like the old attempts to design a syscall
better. :)

>  Userspace already needs to be updated to recognize new fields, so I
> don't think it's much of a stretch to add a new tracepoint handler.

Syncing with other components is always a stretch. You're forgetting how
distros lag behind and don't always have the needed resources to update
their userspace.

Or they can't update because of enterprise raisins.

So no, it is not trivial, trust me. It's a bunch of politics and effort.

> This may be simpler than trying to fit vendor-specific info into an
> existing tracepoint and then decoding it later in userspace.

Until that new tracepoint becomes insufficient in the future and we need
to add trace_mce_record3(). No, you don't want that. :)

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-10-25 18:05                             ` Borislav Petkov
@ 2022-10-25 19:28                               ` Steven Rostedt
  2022-11-01 17:27                                 ` Yazen Ghannam
  0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2022-10-25 19:28 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Yazen Ghannam, Luck, Tony, linux-edac, linux-kernel, x86,
	Smita.KoralahalliChannabasappa

On Tue, 25 Oct 2022 20:05:48 +0200
Borislav Petkov <bp@alien8.de> wrote:

> On Tue, Oct 25, 2022 at 04:35:15PM +0000, Yazen Ghannam wrote:
> > I think the "right way" to use tracepoints is to parse the data according to
> > the format provided by the tracepoint itself. You can see an example of
> > rasdaemon doing that here:
> > https://github.com/mchehab/rasdaemon/blob/c2255178a49f62c53009a456bc37dd5e37332f09/ras-mce-handler.c#L386  
> 
> Lemme add Rostedt.
> 
> So now we have libtraceevent and here's an example how to do it:

Yes, I'm really grateful to Mauro for adapting an earlier version of
libtraceevent, although it was just cut and pasted into rasdaemon. But it
is time to use the official library, which had a bit of changes to the
interface.

> 
> https://patchwork.kernel.org/project/linux-trace-devel/patch/20221021182345.092cdb50@gandalf.local.home/
> https://www.trace-cmd.org/Documentation/libtracefs/libtracefs-kprobes.html
> 
> Reportedly, rasdaemon is still using the old libtraceevent method. So it
> probably should be updated to use the new library version.

Definitely.

> 
> > A tracepoint user should not assume a fixed struct layout, so adding
> > and rearranging fields shouldn't be a problem. I'm not sure about
> > removing a field. It seems to me that this should be okay in the
> > interface, and it's up to the application how it wants to handle a
> > field that isn't found.  
> 
> >From looking at the examples, I think the new libtraceevent should be  
> able to handle all that just fine.

As long as the code can handle a field removed or renamed. It allows the
application to check if it is there or not.

> 
> > Another option could be to define a new tracepoint.  
> 
> Yeah, no. Let's get this one to work pls.

You can always add a trace event on top of an existing trace event with a
"custom" trace event.

See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/trace_events/trace_custom_sched.h

Also, take a look at some of the code that is coming with libtracefs:

  https://patchwork.kernel.org/project/linux-trace-devel/list/?series=688772

 (Note, it's not there just yet)

Basically you can just do:

	instance = tracefs_instance_create(TOOL_NAME);

	for (i = 0; i < nr_cpus; i++) {
		tcpus[i] = tracefs_cpu_open(instance, i, false);

And then you can read from the raw trace buffers;

	/* Read all "ras" events */
	tep = tracefs_local_events_system(NULL, "ras");

	/* Note pevent was renamed to tep */

	/* I need to write up kbuffer man pages :-/ */
	kbuf = kbuffer_alloc(sizeof(long) == 8, !tep_is_bigendian());

	/* I guess you want to retrieve any data */
	tracefs_instance_file_write(instance, "buffer_percent", "0");

	buf = malloc(tracefs_cpu_read_size(tcpu));

	/* false means block until there's data */
	tracefs_cpu_read(tcpus[i], buf, false);

	struct tep_record *record;
	unsigned long long ts;

	/* Load the read raw data into the kbuffer parser */
	kbuffer_load_subbuffer(kbuf, buf);

	for (;;) {
		record.data = kbuffer_read_event(kbuf, &ts);
		if (!record.data)
			break;
		record.ts = ts;

		process(tep, record);

		kbuffer_next_event(kbuf, NULL);

		/* There's tracefs iterators that do this too, but I'm
		 * working on adding more features to them. */
	}


static void process(struct tep_handle *tep, struct tep_record *record)
{
	static struct tep_event *mc_event;
	static struct tep_event *aer_event;
	[..]
	static struct trace_seq s;
	struct tep_event *event;
	unsigned long long val;

	/* Initialize the static events to use them for data */
	if (!mc_event) {
		trace_seq_init(&s);
		mc_event = tep_find_event_by_name(tep, "ras", "mc_event");
		/* Do error checking? */
		aer_event = tep_find_event_by_name(tep, "ras", "aer_event");
		[..]
	}

	/* Remove any previous strings in s. */
	trace_seq_reset(&s);

	event = tep_find_event_by_record(tep, record);
	if (event->id == mc_event->id) {
		tep_get_field_val(&s, event, "address", record, &val, 0);
		[..]
	}
}


With libtracefs and libtraceevent, process trace events is so much easier
than it use to be!

-- Steve

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  2022-10-25 19:28                               ` Steven Rostedt
@ 2022-11-01 17:27                                 ` Yazen Ghannam
  0 siblings, 0 replies; 28+ messages in thread
From: Yazen Ghannam @ 2022-11-01 17:27 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Borislav Petkov, Luck, Tony, linux-edac, linux-kernel, x86,
	Smita.KoralahalliChannabasappa

On Tue, Oct 25, 2022 at 03:28:47PM -0400, Steven Rostedt wrote:

...

> With libtracefs and libtraceevent, process trace events is so much easier
> than it use to be!
>

Thanks Boris and Steve!

I'll start digging into this.

-Yazen

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 0/3] New SMCA Syndrome registers and FRU Text feature
@ 2022-04-18 17:40 Yazen Ghannam
  0 siblings, 0 replies; 28+ messages in thread
From: Yazen Ghannam @ 2022-04-18 17:40 UTC (permalink / raw)
  To: linux-edac
  Cc: linux-kernel, tony.luck, x86, Smita.KoralahalliChannabasappa,
	Yazen Ghannam

Hello,

This patchset adds support for two new "syndrome" registers used in
future AMD Scalable MCA (SMCA) systems and also a new "FRU Text in MCA"
feature that uses these new registers.

Patch 1 adds support for the new registers. They are read/printed
wherever the existing MCA_SYND register is used.

Patch 2 updates the function that pulls MCA information from UEFI x86
Common Platform Error Records (CPERs) to handle systems that support the
new registers.

Patch 3 adds support to the AMD MCE decoder module to detect and use the
"FRU Text in MCA" feature which leverages the new registers.

Thanks,
Yazen

Yazen Ghannam (3):
  x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers
  x86/MCE/APEI: Handle variable register array size
  EDAC/mce_amd: Add support for FRU Text in MCA

 arch/x86/include/asm/mce.h      |  5 +++
 arch/x86/include/uapi/asm/mce.h |  2 +
 arch/x86/kernel/cpu/mce/amd.c   |  5 ++-
 arch/x86/kernel/cpu/mce/apei.c  | 73 ++++++++++++++++++++++++++-------
 arch/x86/kernel/cpu/mce/core.c  |  9 +++-
 drivers/edac/mce_amd.c          | 24 ++++++++---
 include/trace/events/mce.h      |  7 +++-
 7 files changed, 103 insertions(+), 22 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2022-11-01 17:28 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-18 17:44 [PATCH 0/3] New SMCA Syndrome registers and FRU Text feature Yazen Ghannam
2022-04-18 17:44 ` [PATCH 1/3] x86/MCE, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers Yazen Ghannam
2022-06-30 11:01   ` Borislav Petkov
2022-07-11 17:31     ` Yazen Ghannam
2022-07-18  8:57       ` Borislav Petkov
2022-07-18 13:50         ` Borislav Petkov
2022-08-02 12:22           ` Yazen Ghannam
2022-08-02 16:58             ` Luck, Tony
2022-10-24 16:09             ` Borislav Petkov
2022-10-24 16:38               ` Tony Luck
2022-10-24 20:30                 ` Borislav Petkov
2022-10-24 21:08                   ` Luck, Tony
2022-10-24 21:23                     ` Borislav Petkov
2022-10-24 21:32                       ` Luck, Tony
2022-10-24 21:52                         ` Luck, Tony
2022-10-25 16:35                           ` Yazen Ghannam
2022-10-25 16:46                             ` Luck, Tony
2022-10-25 18:05                             ` Borislav Petkov
2022-10-25 19:28                               ` Steven Rostedt
2022-11-01 17:27                                 ` Yazen Ghannam
2022-04-18 17:44 ` [PATCH 2/3] x86/MCE/APEI: Handle variable register array size Yazen Ghannam
2022-07-03 12:30   ` Borislav Petkov
2022-07-11 17:32     ` Yazen Ghannam
2022-04-18 17:44 ` [PATCH 3/3] EDAC/mce_amd: Add support for FRU Text in MCA Yazen Ghannam
2022-07-04  9:13   ` Borislav Petkov
2022-07-11 17:41     ` Yazen Ghannam
2022-06-10 16:29 ` [PATCH 0/3] New SMCA Syndrome registers and FRU Text feature Yazen Ghannam
  -- strict thread matches above, loose matches on Subject: below --
2022-04-18 17:40 Yazen Ghannam

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).