linux-edac.vger.kernel.org archive mirror
* [PATCH v4 0/2] Update mce_record tracepoint
@ 2024-03-27 20:54 Avadhut Naik
  2024-03-27 20:54 ` [PATCH v4 1/2] tracing: Include PPIN in " Avadhut Naik
  2024-03-27 20:54 ` [PATCH v4 2/2] tracing: Include Microcode Revision " Avadhut Naik
  0 siblings, 2 replies; 11+ messages in thread
From: Avadhut Naik @ 2024-03-27 20:54 UTC
  To: linux-trace-kernel, linux-edac
  Cc: rostedt, tony.luck, bp, x86, linux-kernel, yazen.ghannam, avadnaik

This patchset updates the mce_record tracepoint so that the recently added
fields of struct mce are exported through it to userspace.

The first patch adds the PPIN (Protected Processor Inventory Number) field
to the tracepoint.

The second patch adds the microcode field (Microcode Revision) to the
tracepoint.

Changes in v2:
 - Export microcode field (Microcode Revision) through the tracepoint in
   addition to PPIN.

Changes in v3:
 - Change format specifier for microcode revision from %u to %x
 - Fix tab alignments
 - Add Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>

Changes in v4:
 - Update commit messages to reflect the reason for the fields being
   added to the tracepoint.
 - Add comment to explicitly state the type of information that should
   be added to the tracepoint.
 - Add Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

[NOTE:

 - Since only a comment has been added and only the commit messages have
   been reworked, i.e., no code changes have been made in this version,
   I have retained the below tags from v3:
    Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
    Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>]

Avadhut Naik (2):
  tracing: Include PPIN in mce_record tracepoint
  tracing: Include Microcode Revision in mce_record tracepoint

 include/trace/events/mce.h | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)


base-commit: 818ea9b4c8237f96ac99dc0b2f02dd6d3f2adb97
-- 
2.34.1



* [PATCH v4 1/2] tracing: Include PPIN in mce_record tracepoint
  2024-03-27 20:54 [PATCH v4 0/2] Update mce_record tracepoint Avadhut Naik
@ 2024-03-27 20:54 ` Avadhut Naik
  2024-03-27 20:59   ` Luck, Tony
  2024-03-27 20:54 ` [PATCH v4 2/2] tracing: Include Microcode Revision " Avadhut Naik
  1 sibling, 1 reply; 11+ messages in thread
From: Avadhut Naik @ 2024-03-27 20:54 UTC
  To: linux-trace-kernel, linux-edac
  Cc: rostedt, tony.luck, bp, x86, linux-kernel, yazen.ghannam, avadnaik

Machine Check Error information from struct mce is exported to userspace
through the mce_record tracepoint.

Currently, however, the PPIN (Protected Processor Inventory Number) field
of struct mce is not exported through the tracepoint.

Export PPIN through the tracepoint as it provides a unique identifier for
the system (or socket in case of multi-socket systems) on which the MCE
has been received.

Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 include/trace/events/mce.h | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/mce.h b/include/trace/events/mce.h
index 1391ada0da3b..959ba7b775b1 100644
--- a/include/trace/events/mce.h
+++ b/include/trace/events/mce.h
@@ -9,6 +9,14 @@
 #include <linux/tracepoint.h>
 #include <asm/mce.h>
 
+/*
+ * MCE Event Record.
+ *
+ * Only very relevant and transient information which cannot be
+ * gathered from a system by any other means or which can only be
+ * acquired arduously should be added to this record.
+ */
+
 TRACE_EVENT(mce_record,
 
 	TP_PROTO(struct mce *m),
@@ -25,6 +33,7 @@ TRACE_EVENT(mce_record,
 		__field(	u64,		ipid		)
 		__field(	u64,		ip		)
 		__field(	u64,		tsc		)
+		__field(	u64,		ppin		)
 		__field(	u64,		walltime	)
 		__field(	u32,		cpu		)
 		__field(	u32,		cpuid		)
@@ -45,6 +54,7 @@ TRACE_EVENT(mce_record,
 		__entry->ipid		= m->ipid;
 		__entry->ip		= m->ip;
 		__entry->tsc		= m->tsc;
+		__entry->ppin		= m->ppin;
 		__entry->walltime	= m->time;
 		__entry->cpu		= m->extcpu;
 		__entry->cpuid		= m->cpuid;
@@ -55,7 +65,7 @@ TRACE_EVENT(mce_record,
 		__entry->cpuvendor	= m->cpuvendor;
 	),
 
-	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR/MISC/SYND: %016Lx/%016Lx/%016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PROCESSOR: %u:%x, TIME: %llu, SOCKET: %u, APIC: %x",
+	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR/MISC/SYND: %016Lx/%016Lx/%016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, PROCESSOR: %u:%x, TIME: %llu, SOCKET: %u, APIC: %x",
 		__entry->cpu,
 		__entry->mcgcap, __entry->mcgstatus,
 		__entry->bank, __entry->status,
@@ -63,6 +73,7 @@ TRACE_EVENT(mce_record,
 		__entry->addr, __entry->misc, __entry->synd,
 		__entry->cs, __entry->ip,
 		__entry->tsc,
+		__entry->ppin,
 		__entry->cpuvendor, __entry->cpuid,
 		__entry->walltime,
 		__entry->socketid,
-- 
2.34.1
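
For readers consuming the formatted trace text in userspace, here is a
minimal sketch of pulling the new PPIN value out of one mce_record line.
It is not part of the patch. The helper name parse_ppin() is hypothetical,
and the sketch assumes the "PPIN: %llx" label from the TP_printk() format
string above appears unchanged in the line being parsed.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: extract the PPIN from one
 * formatted mce_record line. Assumes the "PPIN: %llx" label used by the
 * TP_printk() format string in this patch.
 *
 * Returns 0 on success, -1 if the label or a hex value is missing.
 */
static int parse_ppin(const char *line, uint64_t *ppin)
{
	static const char label[] = "PPIN: ";
	const char *p;
	char *end;
	unsigned long long val;

	assert(line != NULL && ppin != NULL);

	/* Locate the label, then step past it to the hex digits. */
	p = strstr(line, label);
	if (!p)
		return -1;
	p += sizeof(label) - 1;

	/* The tracepoint prints the value in hex without a 0x prefix. */
	errno = 0;
	val = strtoull(p, &end, 16);
	if (end == p || errno == ERANGE)
		return -1;

	*ppin = val;
	return 0;
}
```

A tool that reads the raw binary event instead of the formatted text would
look up the ppin field by name in the event format description rather than
parse the string.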



* [PATCH v4 2/2] tracing: Include Microcode Revision in mce_record tracepoint
  2024-03-27 20:54 [PATCH v4 0/2] Update mce_record tracepoint Avadhut Naik
  2024-03-27 20:54 ` [PATCH v4 1/2] tracing: Include PPIN in " Avadhut Naik
@ 2024-03-27 20:54 ` Avadhut Naik
  2024-03-27 21:00   ` Luck, Tony
  2024-03-27 22:31   ` Sohil Mehta
  1 sibling, 2 replies; 11+ messages in thread
From: Avadhut Naik @ 2024-03-27 20:54 UTC
  To: linux-trace-kernel, linux-edac
  Cc: rostedt, tony.luck, bp, x86, linux-kernel, yazen.ghannam, avadnaik

Currently, the microcode field (Microcode Revision) of struct mce is not
exported to userspace through the mce_record tracepoint.

Knowing the microcode version on which the MCE was received is critical
information for debugging. If the version is not recorded, later attempts
to acquire the version might result in discrepancies since it can be
changed at runtime.

Export microcode version through the tracepoint to prevent ambiguity over
the active version on the system when the MCE was received.

Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 include/trace/events/mce.h | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/trace/events/mce.h b/include/trace/events/mce.h
index 959ba7b775b1..69438550252c 100644
--- a/include/trace/events/mce.h
+++ b/include/trace/events/mce.h
@@ -42,6 +42,7 @@ TRACE_EVENT(mce_record,
 		__field(	u8,		cs		)
 		__field(	u8,		bank		)
 		__field(	u8,		cpuvendor	)
+		__field(	u32,		microcode	)
 	),
 
 	TP_fast_assign(
@@ -63,9 +64,10 @@ TRACE_EVENT(mce_record,
 		__entry->cs		= m->cs;
 		__entry->bank		= m->bank;
 		__entry->cpuvendor	= m->cpuvendor;
+		__entry->microcode	= m->microcode;
 	),
 
-	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR/MISC/SYND: %016Lx/%016Lx/%016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, PROCESSOR: %u:%x, TIME: %llu, SOCKET: %u, APIC: %x",
+	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR/MISC/SYND: %016Lx/%016Lx/%016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, PROCESSOR: %u:%x, TIME: %llu, SOCKET: %u, APIC: %x, MICROCODE REVISION: %x",
 		__entry->cpu,
 		__entry->mcgcap, __entry->mcgstatus,
 		__entry->bank, __entry->status,
@@ -77,7 +79,8 @@ TRACE_EVENT(mce_record,
 		__entry->cpuvendor, __entry->cpuid,
 		__entry->walltime,
 		__entry->socketid,
-		__entry->apicid)
+		__entry->apicid,
+		__entry->microcode)
 );
 
 #endif /* _TRACE_MCE_H */
-- 
2.34.1
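
The cover letter notes that v3 changed the format specifier for the
microcode revision from %u to %x. The userspace sketch below only
illustrates the difference between the two renderings. The helper names
and the sample revision value used in the usage note are hypothetical and
do not come from the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helpers, not part of the patch: render a 32-bit microcode
 * revision in hex (the specifier used since v3) and in decimal (the
 * specifier used before v3). Both return the snprintf() result.
 */
static int format_microcode_hex(char *buf, size_t len, uint32_t rev)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRIx32, rev);
}

static int format_microcode_dec(char *buf, size_t len, uint32_t rev)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRIu32, rev);
}
```

With a made-up revision of 0x0a20120a, the hex helper produces "a20120a",
which matches how microcode revisions are usually written, while the
decimal helper produces "169873930".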



* RE: [PATCH v4 1/2] tracing: Include PPIN in mce_record tracepoint
  2024-03-27 20:54 ` [PATCH v4 1/2] tracing: Include PPIN in " Avadhut Naik
@ 2024-03-27 20:59   ` Luck, Tony
  0 siblings, 0 replies; 11+ messages in thread
From: Luck, Tony @ 2024-03-27 20:59 UTC
  To: Avadhut Naik, linux-trace-kernel, linux-edac
  Cc: rostedt, bp, x86, linux-kernel, yazen.ghannam, avadnaik

> Export PPIN through the tracepoint as it provides a unique identifier for
> the system (or socket in case of multi-socket systems) on which the MCE
> has been received.
>
> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Reviewed-by: Tony Luck <tony.luck@intel.com>


* RE: [PATCH v4 2/2] tracing: Include Microcode Revision in mce_record tracepoint
  2024-03-27 20:54 ` [PATCH v4 2/2] tracing: Include Microcode Revision " Avadhut Naik
@ 2024-03-27 21:00   ` Luck, Tony
  2024-03-27 22:31   ` Sohil Mehta
  1 sibling, 0 replies; 11+ messages in thread
From: Luck, Tony @ 2024-03-27 21:00 UTC
  To: Avadhut Naik, linux-trace-kernel, linux-edac
  Cc: rostedt, bp, x86, linux-kernel, yazen.ghannam, avadnaik

> Export microcode version through the tracepoint to prevent ambiguity over
> the active version on the system when the MCE was received.
>
> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
> Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Reviewed-by: Tony Luck <tony.luck@intel.com>


* Re: [PATCH v4 2/2] tracing: Include Microcode Revision in mce_record tracepoint
  2024-03-27 20:54 ` [PATCH v4 2/2] tracing: Include Microcode Revision " Avadhut Naik
  2024-03-27 21:00   ` Luck, Tony
@ 2024-03-27 22:31   ` Sohil Mehta
  2024-03-27 22:35     ` Borislav Petkov
  2024-03-28  6:16     ` Naik, Avadhut
  1 sibling, 2 replies; 11+ messages in thread
From: Sohil Mehta @ 2024-03-27 22:31 UTC
  To: Avadhut Naik, linux-trace-kernel, linux-edac
  Cc: rostedt, tony.luck, bp, x86, linux-kernel, yazen.ghannam, avadnaik

On 3/27/2024 1:54 PM, Avadhut Naik wrote:

> -	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR/MISC/SYND: %016Lx/%016Lx/%016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, PROCESSOR: %u:%x, TIME: %llu, SOCKET: %u, APIC: %x",
> +	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR/MISC/SYND: %016Lx/%016Lx/%016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, PROCESSOR: %u:%x, TIME: %llu, SOCKET: %u, APIC: %x, MICROCODE REVISION: %x",

Nit: s/MICROCODE REVISION/MICROCODE/g

You could probably get rid of the word REVISION in the interest of
brevity similar to __print_mce().

	pr_emerg(HW_ERR "PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x microcode %x\n",
		m->cpuvendor, m->cpuid, m->time, m->socketid, m->apicid,
		m->microcode);


-Sohil


* Re: [PATCH v4 2/2] tracing: Include Microcode Revision in mce_record tracepoint
  2024-03-27 22:31   ` Sohil Mehta
@ 2024-03-27 22:35     ` Borislav Petkov
  2024-03-28  1:52       ` Sohil Mehta
  2024-03-28  6:17       ` Naik, Avadhut
  2024-03-28  6:16     ` Naik, Avadhut
  1 sibling, 2 replies; 11+ messages in thread
From: Borislav Petkov @ 2024-03-27 22:35 UTC
  To: Sohil Mehta
  Cc: Avadhut Naik, linux-trace-kernel, linux-edac, rostedt, tony.luck,
	x86, linux-kernel, yazen.ghannam, avadnaik

On Wed, Mar 27, 2024 at 03:31:01PM -0700, Sohil Mehta wrote:
> On 3/27/2024 1:54 PM, Avadhut Naik wrote:
> 
> > -	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR/MISC/SYND: %016Lx/%016Lx/%016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, PROCESSOR: %u:%x, TIME: %llu, SOCKET: %u, APIC: %x",
> > +	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR/MISC/SYND: %016Lx/%016Lx/%016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, PROCESSOR: %u:%x, TIME: %llu, SOCKET: %u, APIC: %x, MICROCODE REVISION: %x",
> 
> Nit: s/MICROCODE REVISION/MICROCODE/g
> 
> You could probably get rid of the word REVISION in the interest of
> brevity similar to __print_mce().

You *definitely* want to do that - good catch.

And TBH, all the screaming words aren't helping either... :)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v4 2/2] tracing: Include Microcode Revision in mce_record tracepoint
  2024-03-27 22:35     ` Borislav Petkov
@ 2024-03-28  1:52       ` Sohil Mehta
  2024-03-28  6:17       ` Naik, Avadhut
  1 sibling, 0 replies; 11+ messages in thread
From: Sohil Mehta @ 2024-03-28  1:52 UTC
  To: Borislav Petkov
  Cc: Avadhut Naik, linux-trace-kernel, linux-edac, rostedt, tony.luck,
	x86, linux-kernel, yazen.ghannam, avadnaik


> 
> You *definitely* want to do that - good catch.
> 
> And TBH, all the screaming words aren't helping either... :)
> 

:) I thought the same as well. But, I felt inconsistently screaming
words might be worse. Maybe just update all the words that are not
acronyms (such as Processor, Time, Socket, etc.)


* Re: [PATCH v4 2/2] tracing: Include Microcode Revision in mce_record tracepoint
  2024-03-27 22:31   ` Sohil Mehta
  2024-03-27 22:35     ` Borislav Petkov
@ 2024-03-28  6:16     ` Naik, Avadhut
  1 sibling, 0 replies; 11+ messages in thread
From: Naik, Avadhut @ 2024-03-28  6:16 UTC
  To: Sohil Mehta, linux-trace-kernel, linux-edac
  Cc: rostedt, tony.luck, bp, x86, linux-kernel, yazen.ghannam, Avadhut Naik



On 3/27/2024 17:31, Sohil Mehta wrote:
> On 3/27/2024 1:54 PM, Avadhut Naik wrote:
> 
>> -	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR/MISC/SYND: %016Lx/%016Lx/%016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, PROCESSOR: %u:%x, TIME: %llu, SOCKET: %u, APIC: %x",
>> +	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR/MISC/SYND: %016Lx/%016Lx/%016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, PROCESSOR: %u:%x, TIME: %llu, SOCKET: %u, APIC: %x, MICROCODE REVISION: %x",
> 
> Nit: s/MICROCODE REVISION/MICROCODE/g
> 
> You could probably get rid of the word REVISION in the interest of
> brevity similar to __print_mce().
> 
> 	pr_emerg(HW_ERR "PROCESSOR %u:%x TIME %llu SOCKET %u APIC %x microcode %x\n",
> 		m->cpuvendor, m->cpuid, m->time, m->socketid, m->apicid,
> 		m->microcode);
> 
>
Okay. Will remove "REVISION".
> -Sohil

-- 
Thanks,
Avadhut Naik


* Re: [PATCH v4 2/2] tracing: Include Microcode Revision in mce_record tracepoint
  2024-03-27 22:35     ` Borislav Petkov
  2024-03-28  1:52       ` Sohil Mehta
@ 2024-03-28  6:17       ` Naik, Avadhut
  2024-03-28 15:44         ` Borislav Petkov
  1 sibling, 1 reply; 11+ messages in thread
From: Naik, Avadhut @ 2024-03-28  6:17 UTC
  To: Borislav Petkov
  Cc: Avadhut Naik, linux-trace-kernel, linux-edac, rostedt, tony.luck,
	x86, linux-kernel, yazen.ghannam, Sohil Mehta



On 3/27/2024 17:35, Borislav Petkov wrote:
> On Wed, Mar 27, 2024 at 03:31:01PM -0700, Sohil Mehta wrote:
>> On 3/27/2024 1:54 PM, Avadhut Naik wrote:
>>
>>> -	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR/MISC/SYND: %016Lx/%016Lx/%016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, PROCESSOR: %u:%x, TIME: %llu, SOCKET: %u, APIC: %x",
>>> +	TP_printk("CPU: %d, MCGc/s: %llx/%llx, MC%d: %016Lx, IPID: %016Lx, ADDR/MISC/SYND: %016Lx/%016Lx/%016Lx, RIP: %02x:<%016Lx>, TSC: %llx, PPIN: %llx, PROCESSOR: %u:%x, TIME: %llu, SOCKET: %u, APIC: %x, MICROCODE REVISION: %x",
>>
>> Nit: s/MICROCODE REVISION/MICROCODE/g
>>
>> You could probably get rid of the word REVISION in the interest of
>> brevity similar to __print_mce().
> 
> You *definitely* want to do that - good catch.
> 
Will do.

> And TBH, all the screaming words aren't helping either... :)
> 
Are you suggesting changing the ALL CAPS format of words which are
not acronyms to normal capitalization style, as Sohil suggested
in his other mail on this thread?

Somewhat like below:

SOCKET -> Socket
PROCESSOR -> Processor
MICROCODE -> Microcode

-- 
Thanks,
Avadhut Naik


* Re: [PATCH v4 2/2] tracing: Include Microcode Revision in mce_record tracepoint
  2024-03-28  6:17       ` Naik, Avadhut
@ 2024-03-28 15:44         ` Borislav Petkov
  0 siblings, 0 replies; 11+ messages in thread
From: Borislav Petkov @ 2024-03-28 15:44 UTC
  To: Naik, Avadhut
  Cc: Avadhut Naik, linux-trace-kernel, linux-edac, rostedt, tony.luck,
	x86, linux-kernel, yazen.ghannam, Sohil Mehta

On Thu, Mar 28, 2024 at 01:17:43AM -0500, Naik, Avadhut wrote:
> SOCKET -> Socket
> PROCESSOR -> Processor
> MICROCODE -> Microcode

SOCKET -> socket
PROCESSOR -> processor
MICROCODE -> microcode

And yeah, the acronyms need to obviously stay in all caps.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
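
To make the outcome of this sub-thread concrete, here is a userspace
sketch of how the tail of the format string might read with both
suggestions applied: "REVISION" dropped, non-acronym labels in lower case,
and acronyms such as APIC left in caps. This is only an illustration of
the review comments, not the text of any posted follow-up version. The
struct mce_tail type and format_mce_tail() function are hypothetical
stand-ins for the relevant struct mce fields.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the struct mce fields printed at the tail. */
struct mce_tail {
	unsigned int cpuvendor;
	unsigned int cpuid;
	unsigned long long time;
	unsigned int socketid;
	unsigned int apicid;
	unsigned int microcode;
};

/*
 * Format the tail of a record using lower-case labels for non-acronyms,
 * as suggested in review. Returns the snprintf() result.
 */
static int format_mce_tail(char *buf, size_t len, const struct mce_tail *m)
{
	assert(buf != NULL && len > 0 && m != NULL);
	return snprintf(buf, len,
			"processor: %u:%x, time: %llu, socket: %u, APIC: %x, microcode: %x",
			m->cpuvendor, m->cpuid, m->time, m->socketid,
			m->apicid, m->microcode);
}
```

Labels earlier in the string (CPU, MCGc/s, IPID, ADDR/MISC/SYND, RIP, TSC,
PPIN) are acronyms or register names and would stay as they are under the
same rule.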

