All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC V2 PATCH 00/11] CXL: Process event logs
@ 2022-10-10 22:41 ira.weiny
  2022-10-10 22:41 ` [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code ira.weiny
                   ` (10 more replies)
  0 siblings, 11 replies; 36+ messages in thread
From: ira.weiny @ 2022-10-10 22:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, linux-kernel,
	linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Changes from RFC v1
	Add event irqs
	General simplification of the code.
	Resolve field alignment questions
	Update to rev 3.0 for comments and structures
	Add reserved fields and output them

Event records inform the OS of various device events.  Events are not needed
for any kernel operation but various user level software will want to track
events.

Add event reporting through the trace event mechanism.  On driver load read and
clear all device events.

Enable all event logs for interrupts and process each log on interrupt.

Testing of this was performed with additions to QEMU posted here:

	https://lore.kernel.org/all/20221010222944.3923556-1-ira.weiny@intel.com/

Ira Weiny (11):
  cxl/mbox: Add debug of hardware error code
  cxl/mem: Implement Get Event Records command
  cxl/mem: Implement Clear Event Records command
  cxl/mem: Clear events on driver load
  cxl/mem: Trace General Media Event Record
  cxl/mem: Trace DRAM Event Record
  cxl/mem: Trace Memory Module Event Record
  cxl/test: Add generic mock events
  cxl/test: Add specific events
  cxl/test: Simulate event log overflow
  cxl/mem: Wire up event interrupts

 MAINTAINERS                     |   1 +
 drivers/cxl/core/mbox.c         | 186 ++++++++++++-
 drivers/cxl/cxlmem.h            | 193 +++++++++++++
 drivers/cxl/pci.c               | 154 ++++++++++
 include/trace/events/cxl.h      | 478 ++++++++++++++++++++++++++++++++
 include/uapi/linux/cxl_mem.h    |   4 +
 tools/testing/cxl/test/Kbuild   |   2 +-
 tools/testing/cxl/test/events.c | 329 ++++++++++++++++++++++
 tools/testing/cxl/test/events.h |   9 +
 tools/testing/cxl/test/mem.c    |  34 +++
 10 files changed, 1388 insertions(+), 2 deletions(-)
 create mode 100644 include/trace/events/cxl.h
 create mode 100644 tools/testing/cxl/test/events.c
 create mode 100644 tools/testing/cxl/test/events.h


base-commit: e2302539dd4f1c62d96651c07ddb05aa2461d29c
-- 
2.37.2


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code
  2022-10-10 22:41 [RFC V2 PATCH 00/11] CXL: Process event logs ira.weiny
@ 2022-10-10 22:41 ` ira.weiny
  2022-10-11 10:41   ` Jonathan Cameron
  2022-10-14 16:29   ` Davidlohr Bueso
  2022-10-10 22:41 ` [RFC V2 PATCH 02/11] cxl/mem: Implement Get Event Records command ira.weiny
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 36+ messages in thread
From: ira.weiny @ 2022-10-10 22:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, linux-kernel,
	linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

If a mailbox command fails the driver always reports ENXIO.  But this
may not be enough information to understand why the hardware, or in my
case Qemu, was failing.

Add a debug print of the error code returned from the hardware.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 drivers/cxl/core/mbox.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 16176b9278b4..6c4d024ad0e8 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -181,8 +181,11 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
 	if (rc)
 		return rc;
 
-	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
+	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
+		dev_dbg(cxlds->dev, "MB error : %s\n",
+			cxl_mbox_cmd_rc2str(&mbox_cmd));
 		return cxl_mbox_cmd_rc2errno(&mbox_cmd);
+	}
 
 	/*
 	 * Variable sized commands can't be validated and so it's up to the
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC V2 PATCH 02/11] cxl/mem: Implement Get Event Records command
  2022-10-10 22:41 [RFC V2 PATCH 00/11] CXL: Process event logs ira.weiny
  2022-10-10 22:41 ` [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code ira.weiny
@ 2022-10-10 22:41 ` ira.weiny
  2022-10-11 12:39   ` Jonathan Cameron
                     ` (2 more replies)
  2022-10-10 22:41 ` [RFC V2 PATCH 03/11] cxl/mem: Implement Clear " ira.weiny
                   ` (8 subsequent siblings)
  10 siblings, 3 replies; 36+ messages in thread
From: ira.weiny @ 2022-10-10 22:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Steven Rostedt, Alison Schofield, Vishal Verma,
	Ben Widawsky, Jonathan Cameron, Davidlohr Bueso, linux-kernel,
	linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Event records are defined for CXL devices.  Each record is reported in
one event log.  Devices are required to support the storage of at least
one event record in each event log type.

Devices track event log overflow by incrementing a counter and tracking
the time of the first and last overflow event seen.

Software queries events via the Get Event Record mailbox command; CXL
rev 3.0 section 8.2.9.2.2.

Issue the Get Event Record mailbox command on driver load.  Trace each
record found, as well as any overflow conditions.  Only 1 event is
requested for each query.  Optimization of multiple record queries is
deferred.

This patch traces a raw event record only and leaves the specific event
record types to subsequent patches.

Macros are created to use for the common CXL Event header fields.

Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Change from RFC:
	Remove redundant error message in get event records loop
	s/EVENT_RECORD_DATA_LENGTH/CXL_EVENT_RECORD_DATA_LENGTH
	Use hdr_uuid for the header UUID field
	Use cxl_event_log_type_str() for the trace events
	Create macros for the header fields and common entries of each event
	Add reserved buffer output dump
	Report error if event query fails
	Remove unused record_cnt variable
	Steven - reorder overflow record
		Remove NOTE about checkpatch
	Jonathan
		check for exactly 1 record
		s/v3.0/rev 3.0
		Use 3 byte fields for 24bit fields
		Add 3.0 Maintenance Operation Class
		Add Dynamic Capacity log type
		Fix spelling
	Dave Jiang/Dan/Alison
		s/cxl-event/cxl
		trace/events/cxl-events => trace/events/cxl.h
		s/cxl_event_overflow/overflow
		s/cxl_event/generic_event
---
 MAINTAINERS                  |   1 +
 drivers/cxl/core/mbox.c      |  53 ++++++++++++++
 drivers/cxl/cxlmem.h         |  75 ++++++++++++++++++++
 include/trace/events/cxl.h   | 130 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/cxl_mem.h |   1 +
 5 files changed, 260 insertions(+)
 create mode 100644 include/trace/events/cxl.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 7547ffce5032..49bec9cc8bda 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5219,6 +5219,7 @@ M:	Dan Williams <dan.j.williams@intel.com>
 L:	linux-cxl@vger.kernel.org
 S:	Maintained
 F:	drivers/cxl/
+F:	include/trace/events/cxl.h
 F:	include/uapi/linux/cxl_mem.h
 
 CONEXANT ACCESSRUNNER USB DRIVER
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 6c4d024ad0e8..5f258c3f2190 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -7,6 +7,9 @@
 #include <cxlmem.h>
 #include <cxl.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/cxl.h>
+
 #include "core.h"
 
 static bool cxl_raw_allow_all;
@@ -48,6 +51,7 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
 	CXL_CMD(RAW, CXL_VARIABLE_PAYLOAD, CXL_VARIABLE_PAYLOAD, 0),
 #endif
 	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
+	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
 	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
 	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
 	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
@@ -707,6 +711,55 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
+static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
+				    enum cxl_event_log_type type)
+{
+	struct cxl_get_event_payload payload;
+
+	do {
+		u8 log_type = type;
+		int rc;
+
+		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVENT_RECORD,
+				       &log_type, sizeof(log_type),
+				       &payload, sizeof(payload));
+		if (rc) {
+			dev_err(cxlds->dev, "Event log '%s': Failed to query event records : %d",
+				cxl_event_log_type_str(type), rc);
+			return;
+		}
+
+		if (le16_to_cpu(payload.record_count) == 1)
+			trace_generic_event(dev_name(cxlds->dev), type,
+					    &payload.record);
+
+		if (payload.flags & CXL_GET_EVENT_FLAG_OVERFLOW)
+			trace_overflow(dev_name(cxlds->dev), type, &payload);
+
+	} while (payload.flags & CXL_GET_EVENT_FLAG_MORE_RECORDS);
+}
+
+/**
+ * cxl_mem_get_event_records - Get Event Records from the device
+ * @cxlds: The device data for the operation
+ *
+ * Retrieve all event records available on the device and report them as trace
+ * events.
+ *
+ * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
+ */
+void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
+{
+	enum cxl_event_log_type log_type;
+
+	dev_dbg(cxlds->dev, "Reading event logs\n");
+
+	for (log_type = CXL_EVENT_TYPE_INFO;
+	     log_type < CXL_EVENT_TYPE_MAX; log_type++)
+		cxl_mem_get_records_log(cxlds, log_type);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
+
 /**
  * cxl_mem_get_partition_info - Get partition info
  * @cxlds: The device data for the operation
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 88e3a8e54b6a..fa8d248fb299 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -4,6 +4,7 @@
 #define __CXL_MEM_H__
 #include <uapi/linux/cxl_mem.h>
 #include <linux/cdev.h>
+#include <linux/uuid.h>
 #include "cxl.h"
 
 /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
@@ -253,6 +254,7 @@ struct cxl_dev_state {
 enum cxl_opcode {
 	CXL_MBOX_OP_INVALID		= 0x0000,
 	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
+	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
 	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
 	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
 	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
@@ -322,6 +324,78 @@ struct cxl_mbox_identify {
 	u8 qos_telemetry_caps;
 } __packed;
 
+/*
+ * Common Event Record Format
+ * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
+ */
+#define CXL_EVENT_REC_HDR_RES_LEN 0xf
+struct cxl_event_record_hdr {
+	uuid_t id;
+	u8 length;
+	u8 flags[3];
+	__le16 handle;
+	__le16 related_handle;
+	__le64 timestamp;
+	u8 maint_op_class;
+	u8 reserved[CXL_EVENT_REC_HDR_RES_LEN];
+} __packed;
+
+#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
+struct cxl_event_record_raw {
+	struct cxl_event_record_hdr hdr;
+	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
+} __packed;
+
+/*
+ * Get Event Records output payload
+ * CXL rev 3.0 section 8.2.9.2.2; Table 8-50
+ *
+ * Space given for 1 record
+ */
+#define CXL_GET_EVENT_FLAG_OVERFLOW		BIT(0)
+#define CXL_GET_EVENT_FLAG_MORE_RECORDS	BIT(1)
+struct cxl_get_event_payload {
+	u8 flags;
+	u8 reserved1;
+	__le16 overflow_err_count;
+	__le64 first_overflow_timestamp;
+	__le64 last_overflow_timestamp;
+	__le16 record_count;
+	u8 reserved2[0xa];
+	struct cxl_event_record_raw record;
+} __packed;
+
+/*
+ * CXL rev 3.0 section 8.2.9.2.2; Table 8-49
+ */
+enum cxl_event_log_type {
+	CXL_EVENT_TYPE_INFO = 0x00,
+	CXL_EVENT_TYPE_WARN,
+	CXL_EVENT_TYPE_FAIL,
+	CXL_EVENT_TYPE_FATAL,
+	CXL_EVENT_TYPE_DYNAMIC_CAP,
+	CXL_EVENT_TYPE_MAX
+};
+
+static inline const char *cxl_event_log_type_str(enum cxl_event_log_type type)
+{
+	switch (type) {
+	case CXL_EVENT_TYPE_INFO:
+		return "Informational";
+	case CXL_EVENT_TYPE_WARN:
+		return "Warning";
+	case CXL_EVENT_TYPE_FAIL:
+		return "Failure";
+	case CXL_EVENT_TYPE_FATAL:
+		return "Fatal";
+	case CXL_EVENT_TYPE_DYNAMIC_CAP:
+		return "Dynamic Capacity";
+	default:
+		break;
+	}
+	return "<unknown>";
+}
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
@@ -381,6 +455,7 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
 struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
 void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
 void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
+void cxl_mem_get_event_records(struct cxl_dev_state *cxlds);
 #ifdef CONFIG_CXL_SUSPEND
 void cxl_mem_active_inc(void);
 void cxl_mem_active_dec(void);
diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
new file mode 100644
index 000000000000..318ba5fe046e
--- /dev/null
+++ b/include/trace/events/cxl.h
@@ -0,0 +1,130 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM cxl
+
+#if !defined(_CXL_TRACE_EVENTS_H) ||  defined(TRACE_HEADER_MULTI_READ)
+#define _CXL_TRACE_EVENTS_H
+
+#include <asm-generic/unaligned.h>
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(overflow,
+
+	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
+		 struct cxl_get_event_payload *payload),
+
+	TP_ARGS(dev_name, log, payload),
+
+	TP_STRUCT__entry(
+		__string(dev_name, dev_name)
+		__field(int, log)
+		__field(u64, first)
+		__field(u64, last)
+		__field(u16, count)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev_name, dev_name);
+		__entry->log = log;
+		__entry->count = le16_to_cpu(payload->overflow_err_count);
+		__entry->first = le64_to_cpu(payload->first_overflow_timestamp);
+		__entry->last = le64_to_cpu(payload->last_overflow_timestamp);
+	),
+
+	TP_printk("%s: EVENT LOG OVERFLOW log=%s : %u records from %llu to %llu",
+		__get_str(dev_name), cxl_event_log_type_str(__entry->log),
+		__entry->count, __entry->first, __entry->last)
+
+);
+
+/*
+ * Common Event Record Format
+ * CXL 3.0 section 8.2.9.2.1; Table 8-42
+ */
+#define CXL_EVENT_RECORD_FLAG_PERMANENT		BIT(2)
+#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED	BIT(3)
+#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED	BIT(4)
+#define CXL_EVENT_RECORD_FLAG_HW_REPLACE	BIT(5)
+#define show_hdr_flags(flags)	__print_flags(flags, " | ",			   \
+	{ CXL_EVENT_RECORD_FLAG_PERMANENT,	"Permanent Condition"		}, \
+	{ CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,	"Maintenance Needed"		}, \
+	{ CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,	"Performance Degraded"		}, \
+	{ CXL_EVENT_RECORD_FLAG_HW_REPLACE,	"Hardware Replacement Needed"	}  \
+)
+
+/*
+ * Define macros for the common header of each CXL event.
+ *
+ * Tracepoints using these macros must do 3 things:
+ *
+ *	1) Add CXL_EVT_TP_entry to TP_STRUCT__entry
+ *	2) Use CXL_EVT_TP_fast_assign within TP_fast_assign;
+ *	   pass the dev_name, log, and CXL event header
+ *	3) Use CXL_EVT_TP_printk() instead of TP_printk()
+ *
+ * See the generic_event tracepoint as an example.
+ */
+#define CXL_EVT_TP_entry					\
+	__string(dev_name, dev_name)				\
+	__field(int, log)					\
+	__array(u8, hdr_uuid, UUID_SIZE)			\
+	__field(u32, hdr_flags)					\
+	__field(u16, hdr_handle)				\
+	__field(u16, hdr_related_handle)			\
+	__field(u64, hdr_timestamp)				\
+	__array(u8, hdr_res, CXL_EVENT_REC_HDR_RES_LEN)		\
+	__field(u8, hdr_length)					\
+	__field(u8, hdr_maint_op_class)
+
+#define CXL_EVT_TP_fast_assign(dname, l, hdr)					\
+	__assign_str(dev_name, (dname));					\
+	__entry->log = (l);							\
+	memcpy(__entry->hdr_uuid, &(hdr).id, UUID_SIZE);			\
+	__entry->hdr_length = (hdr).length;					\
+	__entry->hdr_flags = get_unaligned_le24((hdr).flags);			\
+	__entry->hdr_handle = le16_to_cpu((hdr).handle);			\
+	__entry->hdr_related_handle = le16_to_cpu((hdr).related_handle);	\
+	__entry->hdr_timestamp = le64_to_cpu((hdr).timestamp);			\
+	__entry->hdr_maint_op_class = (hdr).maint_op_class;			\
+	memcpy(__entry->hdr_res, &(hdr).reserved,				\
+		CXL_EVENT_REC_HDR_RES_LEN)
+
+
+#define CXL_EVT_TP_printk(fmt, ...) \
+	TP_printk("%s log=%s : time=%llu uuid=%pUl len=%d flags='%s' "		\
+		"handle=%x related_handle=%x maint_op_class=%u res='%s' "	\
+		" : " fmt,							\
+		__get_str(dev_name), cxl_event_log_type_str(__entry->log),	\
+		__entry->hdr_timestamp, __entry->hdr_uuid, __entry->hdr_length,	\
+		show_hdr_flags(__entry->hdr_flags), __entry->hdr_handle,	\
+		__entry->hdr_related_handle, __entry->hdr_maint_op_class,	\
+		__print_hex(__entry->hdr_res, CXL_EVENT_REC_HDR_RES_LEN),	\
+		##__VA_ARGS__)
+
+TRACE_EVENT(generic_event,
+
+	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
+		 struct cxl_event_record_raw *rec),
+
+	TP_ARGS(dev_name, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+		__array(u8, data, CXL_EVENT_RECORD_DATA_LENGTH)
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
+		memcpy(__entry->data, &rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
+	),
+
+	CXL_EVT_TP_printk("%s",
+		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
+);
+
+#endif /* _CXL_TRACE_EVENTS_H */
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE cxl
+#include <trace/define_trace.h>
diff --git a/include/uapi/linux/cxl_mem.h b/include/uapi/linux/cxl_mem.h
index c71021a2a9ed..70459be5bdd4 100644
--- a/include/uapi/linux/cxl_mem.h
+++ b/include/uapi/linux/cxl_mem.h
@@ -24,6 +24,7 @@
 	___C(IDENTIFY, "Identify Command"),                               \
 	___C(RAW, "Raw device command"),                                  \
 	___C(GET_SUPPORTED_LOGS, "Get Supported Logs"),                   \
+	___C(GET_EVENT_RECORD, "Get Event Record"),                       \
 	___C(GET_FW_INFO, "Get FW Info"),                                 \
 	___C(GET_PARTITION_INFO, "Get Partition Information"),            \
 	___C(GET_LSA, "Get Label Storage Area"),                          \
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC V2 PATCH 03/11] cxl/mem: Implement Clear Event Records command
  2022-10-10 22:41 [RFC V2 PATCH 00/11] CXL: Process event logs ira.weiny
  2022-10-10 22:41 ` [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code ira.weiny
  2022-10-10 22:41 ` [RFC V2 PATCH 02/11] cxl/mem: Implement Get Event Records command ira.weiny
@ 2022-10-10 22:41 ` ira.weiny
  2022-10-10 22:41 ` [RFC V2 PATCH 04/11] cxl/mem: Clear events on driver load ira.weiny
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 36+ messages in thread
From: ira.weiny @ 2022-10-10 22:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ben Widawsky, Steven Rostedt, Davidlohr Bueso, linux-kernel,
	linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL rev 3.0 section 8.2.9.2.3 defines the Clear Event Records mailbox
command.  After an event record is read it needs to be cleared from the
event log.

Implement cxl_clear_event_record() and call it for each record retrieved
from the device.

Each record is cleared individually.  A clear all bit is specified but
events could arrive between a get and the final clear all operation.
Therefore each event is cleared specifically.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC:
	Jonathan
		Clean up init of payload and use return code.
		Also report any error to clear the event.
		s/v3.0/rev 3.0
---
 drivers/cxl/core/mbox.c      | 31 ++++++++++++++++++++++++++++---
 drivers/cxl/cxlmem.h         | 15 +++++++++++++++
 include/uapi/linux/cxl_mem.h |  1 +
 3 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 5f258c3f2190..bc4e42b3e01b 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -52,6 +52,7 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
 #endif
 	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
 	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
+	CXL_CMD(CLEAR_EVENT_RECORD, CXL_VARIABLE_PAYLOAD, 0, 0),
 	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
 	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
 	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
@@ -711,6 +712,20 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
+static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
+				  enum cxl_event_log_type log,
+				  __le16 handle)
+{
+	struct cxl_mbox_clear_event_payload payload = {
+		.event_log = log,
+		.nr_recs = 1,
+		.handle = handle,
+	};
+
+	return cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_CLEAR_EVENT_RECORD,
+				 &payload, sizeof(payload), NULL, 0);
+}
+
 static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
 				    enum cxl_event_log_type type)
 {
@@ -729,9 +744,18 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
 			return;
 		}
 
-		if (le16_to_cpu(payload.record_count) == 1)
+		if (le16_to_cpu(payload.record_count) == 1) {
 			trace_generic_event(dev_name(cxlds->dev), type,
 					    &payload.record);
+			rc = cxl_clear_event_record(cxlds, type,
+						    payload.record.hdr.handle);
+			if (rc) {
+				dev_err(cxlds->dev, "Event log '%s': Failed to clear event (%u) : %d",
+					cxl_event_log_type_str(type),
+					payload.record.hdr.handle, rc);
+				return;
+			}
+		}
 
 		if (payload.flags & CXL_GET_EVENT_FLAG_OVERFLOW)
 			trace_overflow(dev_name(cxlds->dev), type, &payload);
@@ -743,10 +767,11 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
  * cxl_mem_get_event_records - Get Event Records from the device
  * @cxlds: The device data for the operation
  *
- * Retrieve all event records available on the device and report them as trace
- * events.
+ * Retrieve all event records available on the device, report them as trace
+ * events, and clear them.
  *
  * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
+ * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
  */
 void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
 {
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index fa8d248fb299..75aea34f3ffb 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -255,6 +255,7 @@ enum cxl_opcode {
 	CXL_MBOX_OP_INVALID		= 0x0000,
 	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
 	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
+	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
 	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
 	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
 	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
@@ -396,6 +397,20 @@ static inline const char *cxl_event_log_type_str(enum cxl_event_log_type type)
 	return "<unknown>";
 }
 
+/*
+ * Clear Event Records input payload
+ * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
+ *
+ * Space given for 1 record
+ */
+struct cxl_mbox_clear_event_payload {
+	u8 event_log;		/* enum cxl_event_log_type */
+	u8 clear_flags;
+	u8 nr_recs;		/* 1 for this struct */
+	u8 reserved[3];
+	__le16 handle;
+};
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
diff --git a/include/uapi/linux/cxl_mem.h b/include/uapi/linux/cxl_mem.h
index 70459be5bdd4..7c1ad8062792 100644
--- a/include/uapi/linux/cxl_mem.h
+++ b/include/uapi/linux/cxl_mem.h
@@ -25,6 +25,7 @@
 	___C(RAW, "Raw device command"),                                  \
 	___C(GET_SUPPORTED_LOGS, "Get Supported Logs"),                   \
 	___C(GET_EVENT_RECORD, "Get Event Record"),                       \
+	___C(CLEAR_EVENT_RECORD, "Clear Event Record"),                   \
 	___C(GET_FW_INFO, "Get FW Info"),                                 \
 	___C(GET_PARTITION_INFO, "Get Partition Information"),            \
 	___C(GET_LSA, "Get Label Storage Area"),                          \
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC V2 PATCH 04/11] cxl/mem: Clear events on driver load
  2022-10-10 22:41 [RFC V2 PATCH 00/11] CXL: Process event logs ira.weiny
                   ` (2 preceding siblings ...)
  2022-10-10 22:41 ` [RFC V2 PATCH 03/11] cxl/mem: Implement Clear " ira.weiny
@ 2022-10-10 22:41 ` ira.weiny
  2022-10-11 12:42   ` Jonathan Cameron
  2022-10-10 22:41 ` [RFC V2 PATCH 05/11] cxl/mem: Trace General Media Event Record ira.weiny
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: ira.weiny @ 2022-10-10 22:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, linux-kernel,
	linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

The information contained in the events prior to the driver loading can
be queried at any time through other mailbox commands.

Ensure a clean slate of events by reading and clearing the events.  The
events are sent to the trace buffer but it is not anticipated to have
anyone listening to it at driver load time.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 drivers/cxl/pci.c            | 2 ++
 tools/testing/cxl/test/mem.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index faeb5d9d7a7a..5f1b492bd388 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -498,6 +498,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
+	cxl_mem_get_event_records(cxlds);
+
 	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
 		rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
 
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index aa2df3a15051..e2f5445d24ff 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -285,6 +285,8 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
+	cxl_mem_get_event_records(cxlds);
+
 	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
 		rc = devm_cxl_add_nvdimm(dev, cxlmd);
 
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC V2 PATCH 05/11] cxl/mem: Trace General Media Event Record
  2022-10-10 22:41 [RFC V2 PATCH 00/11] CXL: Process event logs ira.weiny
                   ` (3 preceding siblings ...)
  2022-10-10 22:41 ` [RFC V2 PATCH 04/11] cxl/mem: Clear events on driver load ira.weiny
@ 2022-10-10 22:41 ` ira.weiny
  2022-10-11 12:57   ` Jonathan Cameron
  2022-10-15 11:30   ` Steven Rostedt
  2022-10-10 22:41 ` [RFC V2 PATCH 06/11] cxl/mem: Trace DRAM " ira.weiny
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 36+ messages in thread
From: ira.weiny @ 2022-10-10 22:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, linux-kernel,
	linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record.

Determine if the event read is a general media record and if so trace
the record.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC:
	Add reserved byte array
	Use common CXL event header record macros
	Jonathan
		Use unaligned_le{24,16} for unaligned fields
		Don't use the inverse of phy addr mask
	Dave Jiang
		s/cxl_gen_media_event/general_media
		s/cxl_evt_gen_media/cxl_event_gen_media
---
 drivers/cxl/core/mbox.c    |  30 ++++++++++-
 drivers/cxl/cxlmem.h       |  20 +++++++
 include/trace/events/cxl.h | 108 +++++++++++++++++++++++++++++++++++++
 3 files changed, 156 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index bc4e42b3e01b..1097250c115a 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -712,6 +712,32 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
+/*
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+static const uuid_t gen_media_event_uuid =
+	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
+		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
+
+static void cxl_trace_event_record(const char *dev_name,
+				   enum cxl_event_log_type type,
+				   struct cxl_get_event_payload *payload)
+{
+	uuid_t *id = &payload->record.hdr.id;
+
+	if (uuid_equal(id, &gen_media_event_uuid)) {
+		struct cxl_event_gen_media *rec =
+				(struct cxl_event_gen_media *)&payload->record;
+
+		trace_general_media(dev_name, type, rec);
+		return;
+	}
+
+	/* For unknown record types print just the header */
+	trace_generic_event(dev_name, type, &payload->record);
+}
+
 static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
 				  enum cxl_event_log_type log,
 				  __le16 handle)
@@ -745,8 +771,8 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
 		}
 
 		if (le16_to_cpu(payload.record_count) == 1) {
-			trace_generic_event(dev_name(cxlds->dev), type,
-					    &payload.record);
+			cxl_trace_event_record(dev_name(cxlds->dev), type,
+					       &payload);
 			rc = cxl_clear_event_record(cxlds, type,
 						    payload.record.hdr.handle);
 			if (rc) {
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 75aea34f3ffb..b5c120bd4068 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -411,6 +411,26 @@ struct cxl_mbox_clear_event_payload {
 	__le16 handle;
 };
 
+/*
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+#define CXL_EVENT_GEN_MED_COMP_ID_SIZE	0x10
+#define CXL_EVENT_GEN_MED_RES_SIZE	0x2e
+struct cxl_event_gen_media {
+	struct cxl_event_record_hdr hdr;
+	__le64 phys_addr;
+	u8 descriptor;
+	u8 type;
+	u8 transaction_type;
+	u8 validity_flags[2];
+	u8 channel;
+	u8 rank;
+	u8 device[3];
+	u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
+	u8 reserved[CXL_EVENT_GEN_MED_RES_SIZE];
+} __packed;
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
index 318ba5fe046e..82a8d3b750a2 100644
--- a/include/trace/events/cxl.h
+++ b/include/trace/events/cxl.h
@@ -122,6 +122,114 @@ TRACE_EVENT(generic_event,
 		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
 );
 
+/*
+ * General Media Event Record - GMER
+ * CXL v2.0 Section 8.2.9.1.1.1; Table 154
+ */
+#define CXL_GMER_PHYS_ADDR_VOLATILE			BIT(0)
+#define CXL_GMER_PHYS_ADDR_MASK				0xFFFFFFFFFFFFFF80
+
+#define CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT		BIT(0)
+#define CXL_GMER_EVT_DESC_THRESHOLD_EVENT		BIT(1)
+#define CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW		BIT(2)
+#define show_event_desc_flags(flags)	__print_flags(flags, "|",		   \
+	{ CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,		"Uncorrectable Event"	}, \
+	{ CXL_GMER_EVT_DESC_THRESHOLD_EVENT,		"Threshold event"	}, \
+	{ CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW,	"Poison List Overflow"	}  \
+)
+
+#define CXL_GMER_MEM_EVT_TYPE_ECC_ERROR			0x00
+#define CXL_GMER_MEM_EVT_TYPE_INV_ADDR			0x01
+#define CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR		0x02
+#define show_mem_event_type(type)	__print_symbolic(type,			\
+	{ CXL_GMER_MEM_EVT_TYPE_ECC_ERROR,		"ECC Error" },		\
+	{ CXL_GMER_MEM_EVT_TYPE_INV_ADDR,		"Invalid Address" },	\
+	{ CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,	"Data Path Error" }	\
+)
+
+#define CXL_GMER_TRANS_UNKNOWN				0x00
+#define CXL_GMER_TRANS_HOST_READ			0x01
+#define CXL_GMER_TRANS_HOST_WRITE			0x02
+#define CXL_GMER_TRANS_HOST_SCAN_MEDIA			0x03
+#define CXL_GMER_TRANS_HOST_INJECT_POISON		0x04
+#define CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB		0x05
+#define CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT	0x06
+#define show_trans_type(type)	__print_symbolic(type,					\
+	{ CXL_GMER_TRANS_UNKNOWN,			"Unknown" },			\
+	{ CXL_GMER_TRANS_HOST_READ,			"Host Read" },			\
+	{ CXL_GMER_TRANS_HOST_WRITE,			"Host Write" },			\
+	{ CXL_GMER_TRANS_HOST_SCAN_MEDIA,		"Host Scan Media" },		\
+	{ CXL_GMER_TRANS_HOST_INJECT_POISON,		"Host Inject Poison" },		\
+	{ CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,		"Internal Media Scrub" },	\
+	{ CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT,	"Internal Media Management" }	\
+)
+
+#define CXL_GMER_VALID_CHANNEL				BIT(0)
+#define CXL_GMER_VALID_RANK				BIT(1)
+#define CXL_GMER_VALID_DEVICE				BIT(2)
+#define CXL_GMER_VALID_COMPONENT			BIT(3)
+#define show_valid_flags(flags)	__print_flags(flags, "|",		   \
+	{ CXL_GMER_VALID_CHANNEL,			"CHANNEL"	}, \
+	{ CXL_GMER_VALID_RANK,				"RANK"		}, \
+	{ CXL_GMER_VALID_DEVICE,			"DEVICE"	}, \
+	{ CXL_GMER_VALID_COMPONENT,			"COMPONENT"	}  \
+)
+
+TRACE_EVENT(general_media,
+
+	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
+		 struct cxl_event_gen_media *rec),
+
+	TP_ARGS(dev_name, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+		/* General Media */
+		__field(u64, phys_addr)
+		__field(u8, descriptor)
+		__field(u8, type)
+		__field(u8, transaction_type)
+		__field(u8, channel)
+		__field(u32, device)
+		__array(u8, comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE)
+		__array(u8, reserved, CXL_EVENT_GEN_MED_RES_SIZE)
+		__field(u16, validity_flags)
+		__field(u8, rank) /* Out of order to pack trace record */
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
+
+		/* General Media */
+		__entry->phys_addr = le64_to_cpu(rec->phys_addr);
+		__entry->descriptor = rec->descriptor;
+		__entry->type = rec->type;
+		__entry->transaction_type = rec->transaction_type;
+		__entry->channel = rec->channel;
+		__entry->rank = rec->rank;
+		__entry->device = get_unaligned_le24(rec->device);
+		memcpy(__entry->comp_id, &rec->component_id,
+			CXL_EVENT_GEN_MED_COMP_ID_SIZE);
+		memcpy(__entry->reserved, &rec->reserved,
+			CXL_EVENT_GEN_MED_RES_SIZE);
+		__entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
+	),
+
+	CXL_EVT_TP_printk("phys_addr=%llx volatile=%s desc='%s' type='%s' " \
+		"trans_type='%s' channel=%u rank=%u device=%x comp_id=%s " \
+		"valid_flags='%s' reserved=%s",
+		__entry->phys_addr & CXL_GMER_PHYS_ADDR_MASK,
+		(__entry->phys_addr & CXL_GMER_PHYS_ADDR_VOLATILE) ? "TRUE" : "FALSE",
+		show_event_desc_flags(__entry->descriptor),
+		show_mem_event_type(__entry->type),
+		show_trans_type(__entry->transaction_type),
+		__entry->channel, __entry->rank, __entry->device,
+		__print_hex(__entry->comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE),
+		show_valid_flags(__entry->validity_flags),
+		__print_hex(__entry->reserved, CXL_EVENT_GEN_MED_RES_SIZE)
+		)
+);
+
 #endif /* _CXL_TRACE_EVENTS_H */
 
 /* This part must be outside protection */
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC V2 PATCH 06/11] cxl/mem: Trace DRAM Event Record
  2022-10-10 22:41 [RFC V2 PATCH 00/11] CXL: Process event logs ira.weiny
                   ` (4 preceding siblings ...)
  2022-10-10 22:41 ` [RFC V2 PATCH 05/11] cxl/mem: Trace General Media Event Record ira.weiny
@ 2022-10-10 22:41 ` ira.weiny
  2022-10-11 13:47   ` Jonathan Cameron
  2022-10-15 11:31   ` Steven Rostedt
  2022-10-10 22:41 ` [RFC V2 PATCH 07/11] cxl/mem: Trace Memory Module " ira.weiny
                   ` (4 subsequent siblings)
  10 siblings, 2 replies; 36+ messages in thread
From: ira.weiny @ 2022-10-10 22:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, linux-kernel,
	linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.

Determine if the event read is a DRAM event record and if so trace the
record.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC:
	Add reserved byte data
	Use new CXL header macros
	Jonathan
		Use get_unaligned_le{24,16}() for unaligned fields
		Use 'else if'
	Dave Jiang
		s/cxl_dram_event/dram
		s/cxl_evt_dram_rec/cxl_event_dram
	Adjust for new phys addr mask
---
 drivers/cxl/core/mbox.c    | 14 ++++++
 drivers/cxl/cxlmem.h       | 24 ++++++++++
 include/trace/events/cxl.h | 94 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 132 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 1097250c115a..72b589edc074 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -720,6 +720,14 @@ static const uuid_t gen_media_event_uuid =
 	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
 		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
 
+/*
+ * DRAM Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+static const uuid_t dram_event_uuid =
+	UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
+		  0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
+
 static void cxl_trace_event_record(const char *dev_name,
 				   enum cxl_event_log_type type,
 				   struct cxl_get_event_payload *payload)
@@ -732,6 +740,12 @@ static void cxl_trace_event_record(const char *dev_name,
 
 		trace_general_media(dev_name, type, rec);
 		return;
+	} else if (uuid_equal(id, &dram_event_uuid)) {
+		struct cxl_event_dram *rec =
+				(struct cxl_event_dram *)&payload->record;
+
+		trace_dram(dev_name, type, rec);
+		return;
 	}
 
 	/* For unknown record types print just the header */
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index b5c120bd4068..d0253e5f1187 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -431,6 +431,30 @@ struct cxl_event_gen_media {
 	u8 reserved[CXL_EVENT_GEN_MED_RES_SIZE];
 } __packed;
 
+/*
+ * DRAM Event Record - DER
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 3-44
+ */
+#define CXL_EVENT_DER_CORRECTION_MASK_SIZE	0x20
+#define CXL_EVENT_DER_RES_SIZE			0x17
+struct cxl_event_dram {
+	struct cxl_event_record_hdr hdr;
+	__le64 phys_addr;
+	u8 descriptor;
+	u8 type;
+	u8 transaction_type;
+	u8 validity_flags[2];
+	u8 channel;
+	u8 rank;
+	u8 nibble_mask[3];
+	u8 bank_group;
+	u8 bank;
+	u8 row[3];
+	u8 column[2];
+	u8 correction_mask[CXL_EVENT_DER_CORRECTION_MASK_SIZE];
+	u8 reserved[CXL_EVENT_DER_RES_SIZE];
+} __packed;
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
index 82a8d3b750a2..7a90cfea348b 100644
--- a/include/trace/events/cxl.h
+++ b/include/trace/events/cxl.h
@@ -230,6 +230,100 @@ TRACE_EVENT(general_media,
 		)
 );
 
+/*
+ * DRAM Event Record - DER
+ *
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+/*
+ * DRAM Event Record defines many fields the same as the General Media Event
+ * Record.  Reuse those definitions as appropriate.
+ */
+#define CXL_DER_VALID_CHANNEL				BIT(0)
+#define CXL_DER_VALID_RANK				BIT(1)
+#define CXL_DER_VALID_NIBBLE				BIT(2)
+#define CXL_DER_VALID_BANK_GROUP			BIT(3)
+#define CXL_DER_VALID_BANK				BIT(4)
+#define CXL_DER_VALID_ROW				BIT(5)
+#define CXL_DER_VALID_COLUMN				BIT(6)
+#define CXL_DER_VALID_CORRECTION_MASK			BIT(7)
+#define show_dram_valid_flags(flags)	__print_flags(flags, "|",			   \
+	{ CXL_DER_VALID_CHANNEL,			"CHANNEL"		}, \
+	{ CXL_DER_VALID_RANK,				"RANK"			}, \
+	{ CXL_DER_VALID_NIBBLE,				"NIBBLE"		}, \
+	{ CXL_DER_VALID_BANK_GROUP,			"BANK GROUP"		}, \
+	{ CXL_DER_VALID_BANK,				"BANK"			}, \
+	{ CXL_DER_VALID_ROW,				"ROW"			}, \
+	{ CXL_DER_VALID_COLUMN,				"COLUMN"		}, \
+	{ CXL_DER_VALID_CORRECTION_MASK,		"CORRECTION MASK"	}  \
+)
+
+TRACE_EVENT(dram,
+
+	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
+		 struct cxl_event_dram *rec),
+
+	TP_ARGS(dev_name, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+		/* DRAM */
+		__field(u64, phys_addr)
+		__field(u8, descriptor)
+		__field(u8, type)
+		__field(u8, transaction_type)
+		__field(u8, channel)
+		__field(u16, validity_flags)
+		__field(u16, column)	/* Out of order to pack trace record */
+		__field(u32, nibble_mask)
+		__field(u32, row)
+		__array(u8, cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE)
+		__array(u8, reserved, CXL_EVENT_DER_RES_SIZE)
+		__field(u8, rank)	/* Out of order to pack trace record */
+		__field(u8, bank_group)	/* Out of order to pack trace record */
+		__field(u8, bank)	/* Out of order to pack trace record */
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
+
+		/* DRAM */
+		__entry->phys_addr = le64_to_cpu(rec->phys_addr);
+		__entry->descriptor = rec->descriptor;
+		__entry->type = rec->type;
+		__entry->transaction_type = rec->transaction_type;
+		__entry->validity_flags = get_unaligned_le16(rec->validity_flags);
+		__entry->channel = rec->channel;
+		__entry->rank = rec->rank;
+		__entry->nibble_mask = get_unaligned_le24(rec->nibble_mask);
+		__entry->bank_group = rec->bank_group;
+		__entry->bank = rec->bank;
+		__entry->row = get_unaligned_le24(rec->row);
+		__entry->column = get_unaligned_le16(rec->column);
+		memcpy(__entry->cor_mask, &rec->correction_mask,
+			CXL_EVENT_DER_CORRECTION_MASK_SIZE);
+		memcpy(__entry->reserved, &rec->reserved,
+			CXL_EVENT_DER_RES_SIZE);
+	),
+
+	CXL_EVT_TP_printk("phys_addr=%llx volatile=%s desc='%s' type='%s' " \
+		"trans_type='%s' channel=%u rank=%u nibble_mask=%x " \
+		"bank_group=%u bank=%u row=%u column=%u cor_mask=%s " \
+		"valid_flags='%s' reserved=%s",
+		__entry->phys_addr & CXL_GMER_PHYS_ADDR_MASK,
+		(__entry->phys_addr & CXL_GMER_PHYS_ADDR_VOLATILE) ? "TRUE" : "FALSE",
+		show_event_desc_flags(__entry->descriptor),
+		show_mem_event_type(__entry->type),
+		show_trans_type(__entry->transaction_type),
+		__entry->channel, __entry->rank, __entry->nibble_mask,
+		__entry->bank_group, __entry->bank,
+		__entry->row, __entry->column,
+		__print_hex(__entry->cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE),
+		show_dram_valid_flags(__entry->validity_flags),
+		__print_hex(__entry->reserved, CXL_EVENT_DER_RES_SIZE)
+		)
+);
+
 #endif /* _CXL_TRACE_EVENTS_H */
 
 /* This part must be outside protection */
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC V2 PATCH 07/11] cxl/mem: Trace Memory Module Event Record
  2022-10-10 22:41 [RFC V2 PATCH 00/11] CXL: Process event logs ira.weiny
                   ` (5 preceding siblings ...)
  2022-10-10 22:41 ` [RFC V2 PATCH 06/11] cxl/mem: Trace DRAM " ira.weiny
@ 2022-10-10 22:41 ` ira.weiny
  2022-10-11 13:54   ` Jonathan Cameron
  2022-10-15 11:32   ` Steven Rostedt
  2022-10-10 22:41 ` [RFC V2 PATCH 08/11] cxl/test: Add generic mock events ira.weiny
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 36+ messages in thread
From: ira.weiny @ 2022-10-10 22:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, linux-kernel,
	linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.

Determine if the event read is memory module record and if so trace the
record.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC:
	Clean up spec reference
	Add reserved data
	Use new CXL header macros
	Jonathan
		Use else if
		Use get_unaligned_le*() for unaligned fields
	Dave Jiang
		s/cxl_mem_mod_event/memory_module
		s/cxl_evt_mem_mod_rec/cxl_event_mem_module
---
 drivers/cxl/core/mbox.c    |  14 ++++
 drivers/cxl/cxlmem.h       |  27 +++++++
 include/trace/events/cxl.h | 146 +++++++++++++++++++++++++++++++++++++
 3 files changed, 187 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 72b589edc074..6b3119bc83d2 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -728,6 +728,14 @@ static const uuid_t dram_event_uuid =
 	UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
 		  0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
 
+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+static const uuid_t mem_mod_event_uuid =
+	UUID_INIT(0xfe927475, 0xdd59, 0x4339,
+		  0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74);
+
 static void cxl_trace_event_record(const char *dev_name,
 				   enum cxl_event_log_type type,
 				   struct cxl_get_event_payload *payload)
@@ -746,6 +754,12 @@ static void cxl_trace_event_record(const char *dev_name,
 
 		trace_dram(dev_name, type, rec);
 		return;
+	} else if (uuid_equal(id, &mem_mod_event_uuid)) {
+		struct cxl_event_mem_module *rec =
+				(struct cxl_event_mem_module *)&payload->record;
+
+		trace_memory_module(dev_name, type, rec);
+		return;
 	}
 
 	/* For unknown record types print just the header */
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index d0253e5f1187..79b3fac6d9ef 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -455,6 +455,33 @@ struct cxl_event_dram {
 	u8 reserved[CXL_EVENT_DER_RES_SIZE];
 } __packed;
 
+/*
+ * Get Health Info Record
+ * CXL rev 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+struct cxl_get_health_info {
+	u8 health_status;
+	u8 media_status;
+	u8 add_status;
+	u8 life_used;
+	u8 device_temp[2];
+	u8 dirty_shutdown_cnt[4];
+	u8 cor_vol_err_cnt[4];
+	u8 cor_per_err_cnt[4];
+} __packed;
+
+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+#define CXL_EVENT_MEM_MOD_RES_SIZE	0x3d
+struct cxl_event_mem_module {
+	struct cxl_event_record_hdr hdr;
+	u8 event_type;
+	struct cxl_get_health_info info;
+	u8 reserved[CXL_EVENT_MEM_MOD_RES_SIZE];
+} __packed;
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
index 7a90cfea348b..e2082862ed94 100644
--- a/include/trace/events/cxl.h
+++ b/include/trace/events/cxl.h
@@ -324,6 +324,152 @@ TRACE_EVENT(dram,
 		)
 );
 
+/*
+ * Memory Module Event Record - MMER
+ *
+ * CXL res 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+#define CXL_MMER_HEALTH_STATUS_CHANGE		0x00
+#define CXL_MMER_MEDIA_STATUS_CHANGE		0x01
+#define CXL_MMER_LIFE_USED_CHANGE		0x02
+#define CXL_MMER_TEMP_CHANGE			0x03
+#define CXL_MMER_DATA_PATH_ERROR		0x04
+#define CXL_MMER_LAS_ERROR			0x05
+#define show_dev_evt_type(type)	__print_symbolic(type,			   \
+	{ CXL_MMER_HEALTH_STATUS_CHANGE,	"Health Status Change"	}, \
+	{ CXL_MMER_MEDIA_STATUS_CHANGE,		"Media Status Change"	}, \
+	{ CXL_MMER_LIFE_USED_CHANGE,		"Life Used Change"	}, \
+	{ CXL_MMER_TEMP_CHANGE,			"Temperature Change"	}, \
+	{ CXL_MMER_DATA_PATH_ERROR,		"Data Path Error"	}, \
+	{ CXL_MMER_LAS_ERROR,			"LSA Error"		}  \
+)
+
+/*
+ * Device Health Information - DHI
+ *
+ * CXL res 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+#define CXL_DHI_HS_MAINTENANCE_NEEDED				BIT(0)
+#define CXL_DHI_HS_PERFORMANCE_DEGRADED				BIT(1)
+#define CXL_DHI_HS_HW_REPLACEMENT_NEEDED			BIT(2)
+#define show_health_status_flags(flags)	__print_flags(flags, "|",	   \
+	{ CXL_DHI_HS_MAINTENANCE_NEEDED,	"Maintenance Needed"	}, \
+	{ CXL_DHI_HS_PERFORMANCE_DEGRADED,	"Performance Degraded"	}, \
+	{ CXL_DHI_HS_HW_REPLACEMENT_NEEDED,	"Replacement Needed"	}  \
+)
+
+#define CXL_DHI_MS_NORMAL							0x00
+#define CXL_DHI_MS_NOT_READY							0x01
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOST					0x02
+#define CXL_DHI_MS_ALL_DATA_LOST						0x03
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS			0x04
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN			0x05
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT				0x06
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS				0x07
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN				0x08
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT					0x09
+#define show_media_status(ms)	__print_symbolic(ms,			   \
+	{ CXL_DHI_MS_NORMAL,						   \
+		"Normal"						}, \
+	{ CXL_DHI_MS_NOT_READY,						   \
+		"Not Ready"						}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOST,				   \
+		"Write Persistency Lost"				}, \
+	{ CXL_DHI_MS_ALL_DATA_LOST,					   \
+		"All Data Lost"						}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS,		   \
+		"Write Persistency Loss in the Event of Power Loss"	}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN,		   \
+		"Write Persistency Loss in Event of Shutdown"		}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT,			   \
+		"Write Persistency Loss Imminent"			}, \
+	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS,		   \
+		"All Data Loss in Event of Power Loss"			}, \
+	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN,		   \
+		"All Data loss in the Event of Shutdown"		}, \
+	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT,			   \
+		"All Data Loss Imminent"				}  \
+)
+
+#define CXL_DHI_AS_NORMAL		0x0
+#define CXL_DHI_AS_WARNING		0x1
+#define CXL_DHI_AS_CRITICAL		0x2
+#define show_add_status(as) __print_symbolic(as,	   \
+	{ CXL_DHI_AS_NORMAL,		"Normal"	}, \
+	{ CXL_DHI_AS_WARNING,		"Warning"	}, \
+	{ CXL_DHI_AS_CRITICAL,		"Critical"	}  \
+)
+
+#define CXL_DHI_AS_LIFE_USED(as)			(as & 0x3)
+#define CXL_DHI_AS_DEV_TEMP(as)				((as & 0xC) >> 2)
+#define CXL_DHI_AS_COR_VOL_ERR_CNT(as)			((as & 0x10) >> 4)
+#define CXL_DHI_AS_COR_PER_ERR_CNT(as)			((as & 0x20) >> 5)
+
+TRACE_EVENT(memory_module,
+
+	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
+		 struct cxl_event_mem_module *rec),
+
+	TP_ARGS(dev_name, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+
+		/* Memory Module Event */
+		__field(u8, event_type)
+
+		/* Device Health Info */
+		__field(u8, health_status)
+		__field(u8, media_status)
+		__field(u8, life_used)
+		__field(u32, dirty_shutdown_cnt)
+		__field(u32, cor_vol_err_cnt)
+		__field(u32, cor_per_err_cnt)
+		__field(s16, device_temp)
+		__field(u8, add_status)
+
+		__array(u8, reserved, CXL_EVENT_MEM_MOD_RES_SIZE)
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
+
+		/* Memory Module Event */
+		__entry->event_type = rec->event_type;
+
+		/* Device Health Info */
+		__entry->health_status = rec->info.health_status;
+		__entry->media_status = rec->info.media_status;
+		__entry->life_used = rec->info.life_used;
+		__entry->dirty_shutdown_cnt = get_unaligned_le32(rec->info.dirty_shutdown_cnt);
+		__entry->cor_vol_err_cnt = get_unaligned_le32(rec->info.cor_vol_err_cnt);
+		__entry->cor_per_err_cnt = get_unaligned_le32(rec->info.cor_per_err_cnt);
+		__entry->device_temp = get_unaligned_le16(rec->info.device_temp);
+		__entry->add_status = rec->info.add_status;
+		memcpy(__entry->reserved, &rec->reserved,
+			CXL_EVENT_MEM_MOD_RES_SIZE);
+	),
+
+	CXL_EVT_TP_printk("evt_type='%s' health_status='%s' media_status='%s' " \
+		"as_life_used=%s as_dev_temp=%s as_cor_vol_err_cnt=%s " \
+		"as_cor_per_err_cnt=%s life_used=%u dev_temp=%d " \
+		"dirty_shutdown_cnt=%u cor_vol_err_cnt=%u cor_per_err_cnt=%u " \
+		"reserved=%s",
+		show_dev_evt_type(__entry->event_type),
+		show_health_status_flags(__entry->health_status),
+		show_media_status(__entry->media_status),
+		show_add_status(CXL_DHI_AS_LIFE_USED(__entry->add_status)),
+		show_add_status(CXL_DHI_AS_DEV_TEMP(__entry->add_status)),
+		show_add_status(CXL_DHI_AS_COR_VOL_ERR_CNT(__entry->add_status)),
+		show_add_status(CXL_DHI_AS_COR_PER_ERR_CNT(__entry->add_status)),
+		__entry->life_used, __entry->device_temp,
+		__entry->dirty_shutdown_cnt, __entry->cor_vol_err_cnt,
+		__entry->cor_per_err_cnt,
+		__print_hex(__entry->reserved, CXL_EVENT_MEM_MOD_RES_SIZE)
+		)
+);
+
+
 #endif /* _CXL_TRACE_EVENTS_H */
 
 /* This part must be outside protection */
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC V2 PATCH 08/11] cxl/test: Add generic mock events
  2022-10-10 22:41 [RFC V2 PATCH 00/11] CXL: Process event logs ira.weiny
                   ` (6 preceding siblings ...)
  2022-10-10 22:41 ` [RFC V2 PATCH 07/11] cxl/mem: Trace Memory Module " ira.weiny
@ 2022-10-10 22:41 ` ira.weiny
  2022-10-10 22:41 ` [RFC V2 PATCH 09/11] cxl/test: Add specific events ira.weiny
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 36+ messages in thread
From: ira.weiny @ 2022-10-10 22:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, linux-kernel,
	linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Facilitate testing basic Get/Clear Event functionality by creating
multiple logs and generic events with made up UUID's.

Data is completely made up with data patterns which should be easy to
spot in trace output.

A single sysfs entry resets the event data and triggers collecting the
events for testing.

Test traces are easy to obtain with a small script such as this:

	#!/bin/bash -x

	devices=`find /sys/devices/platform -name cxl_mem*`

	# Turn on tracing
	echo "" > /sys/kernel/tracing/trace
	echo 1 > /sys/kernel/tracing/events/cxl/enable
	echo 1 > /sys/kernel/tracing/tracing_on

	# Generate fake interrupt
	for device in $devices; do
	        echo 1 > $device/event_trigger
	done

	# Turn off tracing and report events
	echo 0 > /sys/kernel/tracing/tracing_on
	cat /sys/kernel/tracing/trace

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC:
	Separate out the event code
	Adjust for struct changes.
	Clean up devm_cxl_mock_event_logs()
	Clean up naming and comments
	Jonathan
		Remove dynamic allocation of event logs
		Clean up comment
		Remove unneeded xarray
		Ensure event_trigger sysfs is valid prior to the driver
		going active.
	Dan
		Remove the fill/reset event sysfs as these operations
		can be done together
---
 tools/testing/cxl/test/Kbuild   |   2 +-
 tools/testing/cxl/test/events.c | 223 ++++++++++++++++++++++++++++++++
 tools/testing/cxl/test/events.h |   9 ++
 tools/testing/cxl/test/mem.c    |  32 +++++
 4 files changed, 265 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/cxl/test/events.c
 create mode 100644 tools/testing/cxl/test/events.h

diff --git a/tools/testing/cxl/test/Kbuild b/tools/testing/cxl/test/Kbuild
index 4e59e2c911f6..64b14b83d8d9 100644
--- a/tools/testing/cxl/test/Kbuild
+++ b/tools/testing/cxl/test/Kbuild
@@ -7,4 +7,4 @@ obj-m += cxl_mock_mem.o
 
 cxl_test-y := cxl.o
 cxl_mock-y := mock.o
-cxl_mock_mem-y := mem.o
+cxl_mock_mem-y := mem.o events.o
diff --git a/tools/testing/cxl/test/events.c b/tools/testing/cxl/test/events.c
new file mode 100644
index 000000000000..1913c321d16c
--- /dev/null
+++ b/tools/testing/cxl/test/events.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2022 Intel Corporation. All rights reserved.
+
+#include <cxlmem.h>
+#include <trace/events/cxl.h>
+
+#include "events.h"
+
+#define CXL_TEST_EVENT_CNT_MAX 15
+
+struct mock_event_log {
+	int cur_event;
+	int nr_events;
+	struct cxl_event_record_raw *events[CXL_TEST_EVENT_CNT_MAX];
+};
+
+struct mock_event_store {
+	struct cxl_dev_state *cxlds;
+	struct mock_event_log mock_logs[CXL_EVENT_TYPE_MAX];
+};
+
+DEFINE_XARRAY(mock_dev_event_store);
+
+struct mock_event_log *find_event_log(struct device *dev, int log_type)
+{
+	struct mock_event_store *mes = xa_load(&mock_dev_event_store,
+					       (unsigned long)dev);
+
+	if (!mes || log_type >= CXL_EVENT_TYPE_MAX)
+		return NULL;
+	return &mes->mock_logs[log_type];
+}
+
+struct cxl_event_record_raw *get_cur_event(struct mock_event_log *log)
+{
+	return log->events[log->cur_event];
+}
+
+__le16 get_cur_event_handle(struct mock_event_log *log)
+{
+	return cpu_to_le16(log->cur_event);
+}
+
+static bool log_empty(struct mock_event_log *log)
+{
+	return log->cur_event == log->nr_events;
+}
+
+static int log_rec_left(struct mock_event_log *log)
+{
+	return log->nr_events - log->cur_event;
+}
+
+static void event_store_add_event(struct mock_event_store *mes,
+				  enum cxl_event_log_type log_type,
+				  struct cxl_event_record_raw *event)
+{
+	struct mock_event_log *log;
+
+	if (WARN_ON(log_type >= CXL_EVENT_TYPE_MAX))
+		return;
+
+	log = &mes->mock_logs[log_type];
+	if (WARN_ON(log->nr_events >= CXL_TEST_EVENT_CNT_MAX))
+		return;
+
+	log->events[log->nr_events] = event;
+	log->nr_events++;
+}
+
+int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_get_event_payload *pl;
+	struct mock_event_log *log;
+	u8 log_type;
+
+	/* Valid request? */
+	if (cmd->size_in != sizeof(log_type))
+		return -EINVAL;
+
+	log_type = *((u8 *)cmd->payload_in);
+	if (log_type >= CXL_EVENT_TYPE_MAX)
+		return -EINVAL;
+
+	log = find_event_log(cxlds->dev, log_type);
+	if (!log || log_empty(log))
+		goto no_data;
+
+	/* Don't handle more than 1 record at a time */
+	if (cmd->size_out < sizeof(*pl))
+		return -EINVAL;
+
+	pl = cmd->payload_out;
+	memset(pl, 0, sizeof(*pl));
+
+	pl->record_count = cpu_to_le16(1);
+
+	if (log_rec_left(log) > 1)
+		pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
+
+	memcpy(&pl->record, get_cur_event(log), sizeof(pl->record));
+	pl->record.hdr.handle = get_cur_event_handle(log);
+	return 0;
+
+no_data:
+	/* Room for header? */
+	if (cmd->size_out < (sizeof(*pl) - sizeof(pl->record)))
+		return -EINVAL;
+
+	memset(cmd->payload_out, 0, cmd->size_out);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mock_get_event);
+
+/*
+ * Get and clear event only handle 1 record at a time as this is what is
+ * currently implemented in the main code.
+ */
+int mock_clear_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mbox_clear_event_payload *pl = cmd->payload_in;
+	struct mock_event_log *log;
+	u8 log_type = pl->event_log;
+
+	/* Don't handle more than 1 record at a time */
+	if (pl->nr_recs != 1)
+		return -EINVAL;
+
+	if (log_type >= CXL_EVENT_TYPE_MAX)
+		return -EINVAL;
+
+	log = find_event_log(cxlds->dev, log_type);
+	if (!log)
+		return 0; /* No mock data in this log */
+
+	/*
+	 * The current code clears events as they are read.  Test that behavior
+	 * only; don't support clearning from the middle of the log
+	 */
+	if (log->cur_event != le16_to_cpu(pl->handle)) {
+		dev_err(cxlds->dev, "Clearing events out of order\n");
+		return -EINVAL;
+	}
+
+	log->cur_event++;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mock_clear_event);
+
+void cxl_mock_event_trigger(struct device *dev)
+{
+	struct mock_event_store *mes = xa_load(&mock_dev_event_store,
+					       (unsigned long)dev);
+	int i;
+
+	for (i = CXL_EVENT_TYPE_INFO; i < CXL_EVENT_TYPE_MAX; i++) {
+		struct mock_event_log *log;
+
+		log = find_event_log(dev, i);
+		if (log)
+			log->cur_event = 0;
+	}
+
+	cxl_mem_get_event_records(mes->cxlds);
+}
+EXPORT_SYMBOL_GPL(cxl_mock_event_trigger);
+
+struct cxl_event_record_raw maint_needed = {
+	.hdr = {
+		.id = UUID_INIT(0xDEADBEEF, 0xCAFE, 0xBABE,
+				0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
+		.length = sizeof(struct cxl_event_record_raw),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0xa5b6),
+	},
+	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
+};
+
+struct cxl_event_record_raw hardware_replace = {
+	.hdr = {
+		.id = UUID_INIT(0xBABECAFE, 0xBEEF, 0xDEAD,
+				0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
+		.length = sizeof(struct cxl_event_record_raw),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_HW_REPLACE,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0xb6a5),
+	},
+	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
+};
+
+void cxl_mock_add_event_logs(struct cxl_dev_state *cxlds)
+{
+	struct device *dev = cxlds->dev;
+	struct mock_event_store *mes;
+
+	mes = kzalloc(sizeof(*mes), GFP_KERNEL);
+	if (WARN_ON(!mes))
+		return;
+	mes->cxlds = cxlds;
+
+	if (xa_insert(&mock_dev_event_store, (unsigned long)dev, mes,
+		      GFP_KERNEL)) {
+		dev_err(dev, "Event store not available for %s\n",
+			dev_name(dev));
+		return;
+	}
+
+	event_store_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
+
+	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
+
+}
+EXPORT_SYMBOL_GPL(cxl_mock_add_event_logs);
+
+void cxl_mock_remove_event_logs(struct device *dev)
+{
+	struct mock_event_store *mes;
+
+	mes = xa_erase(&mock_dev_event_store, (unsigned long)dev);
+	kfree(mes);
+}
+EXPORT_SYMBOL_GPL(cxl_mock_remove_event_logs);
diff --git a/tools/testing/cxl/test/events.h b/tools/testing/cxl/test/events.h
new file mode 100644
index 000000000000..b3149b38d6df
--- /dev/null
+++ b/tools/testing/cxl/test/events.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <cxlmem.h>
+
+int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
+int mock_clear_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
+void cxl_mock_add_event_logs(struct cxl_dev_state *cxlds);
+void cxl_mock_remove_event_logs(struct device *dev);
+void cxl_mock_event_trigger(struct device *dev);
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index e2f5445d24ff..9768f5132a5c 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -8,6 +8,7 @@
 #include <linux/sizes.h>
 #include <linux/bits.h>
 #include <cxlmem.h>
+#include "events.h"
 
 #define LSA_SIZE SZ_128K
 #define DEV_SIZE SZ_2G
@@ -224,6 +225,12 @@ static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *
 	case CXL_MBOX_OP_GET_PARTITION_INFO:
 		rc = mock_partition_info(cxlds, cmd);
 		break;
+	case CXL_MBOX_OP_GET_EVENT_RECORD:
+		rc = mock_get_event(cxlds, cmd);
+		break;
+	case CXL_MBOX_OP_CLEAR_EVENT_RECORD:
+		rc = mock_clear_event(cxlds, cmd);
+		break;
 	case CXL_MBOX_OP_SET_LSA:
 		rc = mock_set_lsa(cxlds, cmd);
 		break;
@@ -245,6 +252,21 @@ static void label_area_release(void *lsa)
 	vfree(lsa);
 }
 
+static ssize_t event_trigger_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	cxl_mock_event_trigger(dev);
+	return count;
+}
+static DEVICE_ATTR_WO(event_trigger);
+
+static struct attribute *cxl_mock_event_attrs[] = {
+	&dev_attr_event_trigger.attr,
+	NULL
+};
+ATTRIBUTE_GROUPS(cxl_mock_event);
+
 static int cxl_mock_mem_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
@@ -281,6 +303,8 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 	if (rc)
 		return rc;
 
+	cxl_mock_add_event_logs(cxlds);
+
 	cxlmd = devm_cxl_add_memdev(cxlds);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
@@ -293,6 +317,12 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 	return 0;
 }
 
+static int cxl_mock_mem_remove(struct platform_device *pdev)
+{
+	cxl_mock_remove_event_logs(&pdev->dev);
+	return 0;
+}
+
 static const struct platform_device_id cxl_mock_mem_ids[] = {
 	{ .name = "cxl_mem", },
 	{ },
@@ -301,9 +331,11 @@ MODULE_DEVICE_TABLE(platform, cxl_mock_mem_ids);
 
 static struct platform_driver cxl_mock_mem_driver = {
 	.probe = cxl_mock_mem_probe,
+	.remove = cxl_mock_mem_remove,
 	.id_table = cxl_mock_mem_ids,
 	.driver = {
 		.name = KBUILD_MODNAME,
+		.dev_groups = cxl_mock_event_groups,
 	},
 };
 
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC V2 PATCH 09/11] cxl/test: Add specific events
  2022-10-10 22:41 [RFC V2 PATCH 00/11] CXL: Process event logs ira.weiny
                   ` (7 preceding siblings ...)
  2022-10-10 22:41 ` [RFC V2 PATCH 08/11] cxl/test: Add generic mock events ira.weiny
@ 2022-10-10 22:41 ` ira.weiny
  2022-10-10 22:41 ` [RFC V2 PATCH 10/11] cxl/test: Simulate event log overflow ira.weiny
  2022-10-10 22:41 ` [RFC V2 PATCH 11/11] cxl/mem: Wire up event interrupts ira.weiny
  10 siblings, 0 replies; 36+ messages in thread
From: ira.weiny @ 2022-10-10 22:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, linux-kernel,
	linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Each type of event has different trace point outputs.

Add mock General Media Event, DRAM event, and Memory Module Event
records to the mock list of events returned.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC:
	Adjust for struct changes
	adjust for unaligned fields
---
 tools/testing/cxl/test/events.c | 70 +++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/tools/testing/cxl/test/events.c b/tools/testing/cxl/test/events.c
index 1913c321d16c..1d3dbeaf7794 100644
--- a/tools/testing/cxl/test/events.c
+++ b/tools/testing/cxl/test/events.c
@@ -189,6 +189,70 @@ struct cxl_event_record_raw hardware_replace = {
 	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
 };
 
+struct cxl_event_gen_media gen_media = {
+	.hdr = {
+		.id = UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
+				0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6),
+		.length = sizeof(struct cxl_event_gen_media),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_PERMANENT,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0),
+	},
+	.phys_addr = cpu_to_le64(0x2000),
+	.descriptor = CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,
+	.type = CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,
+	.transaction_type = CXL_GMER_TRANS_HOST_WRITE,
+	.validity_flags = { CXL_GMER_VALID_CHANNEL |
+			    CXL_GMER_VALID_RANK, 0 },
+	.channel = 1,
+	.rank = 30
+};
+
+struct cxl_event_dram dram = {
+	.hdr = {
+		.id = UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
+				0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24),
+		.length = sizeof(struct cxl_event_dram),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0),
+	},
+	.phys_addr = cpu_to_le64(0x8000),
+	.descriptor = CXL_GMER_EVT_DESC_THRESHOLD_EVENT,
+	.type = CXL_GMER_MEM_EVT_TYPE_INV_ADDR,
+	.transaction_type = CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,
+	.validity_flags = { CXL_DER_VALID_CHANNEL |
+			    CXL_DER_VALID_BANK_GROUP |
+			    CXL_DER_VALID_BANK |
+			    CXL_DER_VALID_COLUMN, 0 },
+	.channel = 1,
+	.bank_group = 5,
+	.bank = 2,
+	.column = { 0xDE, 0xAD},
+};
+
+struct cxl_event_mem_module mem_module = {
+	.hdr = {
+		.id = UUID_INIT(0xfe927475, 0xdd59, 0x4339,
+				0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74),
+		.length = sizeof(struct cxl_event_mem_module),
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0),
+	},
+	.event_type = CXL_MMER_TEMP_CHANGE,
+	.info = {
+		.health_status = CXL_DHI_HS_PERFORMANCE_DEGRADED,
+		.media_status = CXL_DHI_MS_ALL_DATA_LOST,
+		.add_status = (CXL_DHI_AS_CRITICAL << 2) |
+			      (CXL_DHI_AS_WARNING << 4) |
+			      (CXL_DHI_AS_WARNING << 5),
+		.device_temp = { 0xDE, 0xAD},
+		.dirty_shutdown_cnt = { 0xde, 0xad, 0xbe, 0xef },
+		.cor_vol_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
+		.cor_per_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
+	}
+};
+
 void cxl_mock_add_event_logs(struct cxl_dev_state *cxlds)
 {
 	struct device *dev = cxlds->dev;
@@ -207,8 +271,14 @@ void cxl_mock_add_event_logs(struct cxl_dev_state *cxlds)
 	}
 
 	event_store_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
+	event_store_add_event(mes, CXL_EVENT_TYPE_INFO,
+			      (struct cxl_event_record_raw *)&gen_media);
+	event_store_add_event(mes, CXL_EVENT_TYPE_INFO,
+			      (struct cxl_event_record_raw *)&mem_module);
 
 	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL,
+			      (struct cxl_event_record_raw *)&dram);
 
 }
 EXPORT_SYMBOL_GPL(cxl_mock_add_event_logs);
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC V2 PATCH 10/11] cxl/test: Simulate event log overflow
  2022-10-10 22:41 [RFC V2 PATCH 00/11] CXL: Process event logs ira.weiny
                   ` (8 preceding siblings ...)
  2022-10-10 22:41 ` [RFC V2 PATCH 09/11] cxl/test: Add specific events ira.weiny
@ 2022-10-10 22:41 ` ira.weiny
  2022-10-10 22:41 ` [RFC V2 PATCH 11/11] cxl/mem: Wire up event interrupts ira.weiny
  10 siblings, 0 replies; 36+ messages in thread
From: ira.weiny @ 2022-10-10 22:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, linux-kernel,
	linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Log overflow is marked by a separate trace message.

Simulate a log with lots of messages and flag overflow until it is
drained a bit.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC
	Adjust for new struct changes
---
 tools/testing/cxl/test/events.c | 36 +++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/tools/testing/cxl/test/events.c b/tools/testing/cxl/test/events.c
index 1d3dbeaf7794..21cbd2990bca 100644
--- a/tools/testing/cxl/test/events.c
+++ b/tools/testing/cxl/test/events.c
@@ -68,11 +68,21 @@ static void event_store_add_event(struct mock_event_store *mes,
 	log->nr_events++;
 }
 
+static u16 log_overflow(struct mock_event_log *log)
+{
+	int cnt = log_rec_left(log) - 5;
+
+	if (cnt < 0)
+		return 0;
+	return cnt;
+}
+
 int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 {
 	struct cxl_get_event_payload *pl;
 	struct mock_event_log *log;
 	u8 log_type;
+	u16 nr_overflow;
 
 	/* Valid request? */
 	if (cmd->size_in != sizeof(log_type))
@@ -98,6 +108,20 @@ int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 	if (log_rec_left(log) > 1)
 		pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
 
+	nr_overflow = log_overflow(log);
+	if (nr_overflow) {
+		u64 ns;
+
+		pl->flags |= CXL_GET_EVENT_FLAG_OVERFLOW;
+		pl->overflow_err_count = cpu_to_le16(nr_overflow);
+		ns = ktime_get_real_ns();
+		ns -= 5000000000; /* 5s ago */
+		pl->first_overflow_timestamp = cpu_to_le64(ns);
+		ns = ktime_get_real_ns();
+		ns -= 1000000000; /* 1s ago */
+		pl->last_overflow_timestamp = cpu_to_le64(ns);
+	}
+
 	memcpy(&pl->record, get_cur_event(log), sizeof(pl->record));
 	pl->record.hdr.handle = get_cur_event_handle(log);
 	return 0;
@@ -276,6 +300,18 @@ void cxl_mock_add_event_logs(struct cxl_dev_state *cxlds)
 	event_store_add_event(mes, CXL_EVENT_TYPE_INFO,
 			      (struct cxl_event_record_raw *)&mem_module);
 
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &maint_needed);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
+			      (struct cxl_event_record_raw *)&dram);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
+			      (struct cxl_event_record_raw *)&gen_media);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
+			      (struct cxl_event_record_raw *)&mem_module);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
+			      (struct cxl_event_record_raw *)&dram);
+
 	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
 	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL,
 			      (struct cxl_event_record_raw *)&dram);
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC V2 PATCH 11/11] cxl/mem: Wire up event interrupts
  2022-10-10 22:41 [RFC V2 PATCH 00/11] CXL: Process event logs ira.weiny
                   ` (9 preceding siblings ...)
  2022-10-10 22:41 ` [RFC V2 PATCH 10/11] cxl/test: Simulate event log overflow ira.weiny
@ 2022-10-10 22:41 ` ira.weiny
  2022-10-12 18:01   ` Davidlohr Bueso
  10 siblings, 1 reply; 36+ messages in thread
From: ira.weiny @ 2022-10-10 22:41 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Davidlohr Bueso, Alison Schofield, Vishal Verma,
	Ben Widawsky, Steven Rostedt, Jonathan Cameron, linux-kernel,
	linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL device events are signaled via interrupts.  Each event log may have
a different interrupt message number.  These message numbers are
reported in the Get Event Interrupt Policy mailbox command.

Create an infrastructure to query the max vectors required for the CXL
device.  Add event interrupt information that infrastructure.  Set up a
handler for each event log.

Davidlohr suggested the generic vector code.

Suggested-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/linux-cxl/20220822161802.h47v7yfrqufeltqt@offworld/
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 drivers/cxl/core/mbox.c      |  53 +++++++++++-
 drivers/cxl/cxlmem.h         |  32 ++++++++
 drivers/cxl/pci.c            | 152 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/cxl_mem.h |   2 +
 4 files changed, 237 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 6b3119bc83d2..ffd58da95df3 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -53,6 +53,8 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
 	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
 	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
 	CXL_CMD(CLEAR_EVENT_RECORD, CXL_VARIABLE_PAYLOAD, 0, 0),
+	CXL_CMD(GET_EVT_INT_POLICY, 0, 0x5, 0),
+	CXL_CMD(SET_EVT_INT_POLICY, 0x5, 0, 0),
 	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
 	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
 	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
@@ -780,8 +782,8 @@ static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
 				 &payload, sizeof(payload), NULL, 0);
 }
 
-static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
-				    enum cxl_event_log_type type)
+void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
+			     enum cxl_event_log_type type)
 {
 	struct cxl_get_event_payload payload;
 
@@ -816,6 +818,7 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
 
 	} while (payload.flags & CXL_GET_EVENT_FLAG_MORE_RECORDS);
 }
+EXPORT_SYMBOL_NS_GPL(cxl_mem_get_records_log, CXL);
 
 /**
  * cxl_mem_get_event_records - Get Event Records from the device
@@ -839,6 +842,52 @@ void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
 
+static int cxl_event_msgnum(u8 setting)
+{
+	if (!cxl_evt_int_is_msi(setting))
+		return -1;
+
+	return CXL_EVENT_INT_MSGNUM(setting);
+}
+
+int cxl_event_get_max_msgnum(struct cxl_dev_state *cxlds)
+{
+	struct cxl_event_interrupt_policy *policy = &cxlds->evt_int_policy;
+	int rc;
+
+	policy->info_settings = CXL_INT_MSI_MSIX;
+	policy->warn_settings = CXL_INT_MSI_MSIX;
+	policy->failure_settings = CXL_INT_MSI_MSIX;
+	policy->fatal_settings = CXL_INT_MSI_MSIX;
+	policy->dyn_cap_settings = CXL_INT_MSI_MSIX;
+
+	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_SET_EVT_INT_POLICY,
+			       policy, sizeof(*policy),
+			       NULL, 0);
+	if (rc) {
+		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
+			rc);
+		memset(policy, CXL_INT_NONE, sizeof(*policy));
+		return -1;
+	}
+
+	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVT_INT_POLICY, NULL, 0,
+			       policy, sizeof(*policy));
+	if (rc) {
+		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
+			rc);
+		return -1;
+	}
+
+	rc = max_t(int, rc, cxl_event_msgnum(policy->info_settings));
+	rc = max_t(int, rc, cxl_event_msgnum(policy->warn_settings));
+	rc = max_t(int, rc, cxl_event_msgnum(policy->failure_settings));
+	rc = max_t(int, rc, cxl_event_msgnum(policy->fatal_settings));
+
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_event_get_max_msgnum, CXL);
+
 /**
  * cxl_mem_get_partition_info - Get partition info
  * @cxlds: The device data for the operation
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 79b3fac6d9ef..27132ed2bdd3 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -179,6 +179,32 @@ struct cxl_endpoint_dvsec_info {
 	struct range dvsec_range[2];
 };
 
+/**
+ * Event Interrupt Policy
+ *
+ * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
+ */
+enum cxl_event_int_mode {
+	CXL_INT_NONE		= 0x00,
+	CXL_INT_MSI_MSIX	= 0x01,
+	CXL_INT_FW		= 0x02,
+	CXL_INT_RES		= 0x03,
+};
+#define CXL_EVENT_INT_MODE_MASK 0x3
+#define CXL_EVENT_INT_MSGNUM(setting) (((setting) & 0xf0) >> 4)
+struct cxl_event_interrupt_policy {
+	u8 info_settings;
+	u8 warn_settings;
+	u8 failure_settings;
+	u8 fatal_settings;
+	u8 dyn_cap_settings;
+} __packed;
+
+static inline bool cxl_evt_int_is_msi(u8 setting)
+{
+	return CXL_INT_MSI_MSIX == (setting & CXL_EVENT_INT_MODE_MASK);
+}
+
 /**
  * struct cxl_dev_state - The driver device state
  *
@@ -245,6 +271,7 @@ struct cxl_dev_state {
 
 	resource_size_t component_reg_phys;
 	u64 serial;
+	struct cxl_event_interrupt_policy evt_int_policy;
 
 	struct xarray doe_mbs;
 
@@ -256,6 +283,8 @@ enum cxl_opcode {
 	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
 	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
 	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
+	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
+	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
 	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
 	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
 	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
@@ -541,7 +570,10 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
 struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
 void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
 void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
+void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
+			     enum cxl_event_log_type type);
 void cxl_mem_get_event_records(struct cxl_dev_state *cxlds);
+int cxl_event_get_max_msgnum(struct cxl_dev_state *cxlds);
 #ifdef CONFIG_CXL_SUSPEND
 void cxl_mem_active_inc(void);
 void cxl_mem_active_dec(void);
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 5f1b492bd388..a0d2615d5b6b 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -428,6 +428,156 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
 	}
 }
 
+/**
+ * struct cxl_irq_cap - CXL feature that is capable of receiving MSI/MSI-X irqs.
+ *
+ * @name: Name of the device generating this interrupt.
+ * @get_max_msgnum: Get the feature's largest interrupt message number.  If the
+ *		    feature does not have the Interrupt Supported bit set, then
+ *		    return -1.
+ */
+struct cxl_irq_cap {
+	const char *name;
+	int (*get_max_msgnum)(struct cxl_dev_state *cxlds);
+};
+
+struct cxl_irq_cap cxl_irq_cap_table[] = {
+	{ "event", cxl_event_get_max_msgnum }
+};
+
+static void cxl_pci_free_irq_vectors(void *data)
+{
+	pci_free_irq_vectors(data);
+}
+
+static int cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
+{
+	struct device *dev = cxlds->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	int rc, i, vectors = -1;
+
+	for (i = 0; i < ARRAY_SIZE(cxl_irq_cap_table); i++) {
+		int irq;
+
+		if (!cxl_irq_cap_table[i].get_max_msgnum)
+			continue;
+
+		irq = cxl_irq_cap_table[i].get_max_msgnum(cxlds);
+		vectors = max_t(int, irq, vectors);
+	}
+
+	if (vectors == -1)
+		return -EINVAL; /* no irq support whatsoever */
+
+	vectors++;
+	rc = pci_alloc_irq_vectors(pdev, vectors, vectors,
+				   PCI_IRQ_MSIX | PCI_IRQ_MSI);
+	if (rc < 0)
+		return rc;
+
+	if (rc != vectors) {
+		dev_err(dev, "Not enough interrupts; use polling where supported\n");
+		/* Some got allocated; clean them up */
+		cxl_pci_free_irq_vectors(pdev);
+		return -ENOSPC;
+	}
+
+	return devm_add_action_or_reset(dev, cxl_pci_free_irq_vectors, pdev);
+}
+
+struct cxl_event_irq_id {
+	struct cxl_dev_state *cxlds;
+	enum cxl_event_log_type log_type;
+	unsigned int msgnum;
+};
+
+static irqreturn_t cxl_event_int_handler(int irq, void *id)
+{
+	struct cxl_event_irq_id *cxlid = id;
+
+	cxl_mem_get_records_log(cxlid->cxlds, cxlid->log_type);
+	return IRQ_HANDLED;
+}
+
+static void cxl_free_irq(void *id)
+{
+	struct cxl_event_irq_id *cxlid = id;
+	struct pci_dev *pdev = to_pci_dev(cxlid->cxlds->dev);
+
+	pci_free_irq(pdev, cxlid->msgnum, id);
+}
+
+static int cxl_request_event_irq(struct cxl_dev_state *cxlds,
+				 enum cxl_event_log_type log_type,
+				 u8 msgnum)
+{
+	struct device *dev = cxlds->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct cxl_event_irq_id *id;
+	int irq;
+
+	id = devm_kzalloc(dev, sizeof(*id), GFP_KERNEL);
+	if (!id)
+		return -ENOMEM;
+
+	id->cxlds = cxlds;
+	id->msgnum = msgnum;
+	id->log_type = log_type;
+
+	irq = pci_request_irq(pdev, msgnum, cxl_event_int_handler, NULL, id,
+			      "%s:event-log-%s", dev_name(dev),
+			      cxl_event_log_type_str(id->log_type));
+	if (irq)
+		return irq;
+
+	devm_add_action_or_reset(dev, cxl_free_irq, id);
+	return 0;
+}
+
+static void cxl_config_log_irq(struct cxl_dev_state *cxlds,
+			       enum cxl_event_log_type log_type,
+			       u8 setting, u8 msgnum)
+{
+	struct device *dev = cxlds->dev;
+
+	if (!cxl_evt_int_is_msi(setting)) {
+		dev_dbg(dev, "IRQ not enabled for %s event log\n",
+			cxl_event_log_type_str(log_type));
+		return;
+	}
+
+	if (cxl_request_event_irq(cxlds, log_type, msgnum))
+		dev_err(dev, "Failed to get interrupt for %s Event Log\n",
+			cxl_event_log_type_str(log_type));
+}
+
+static void cxl_configure_event_irq(struct cxl_dev_state *cxlds)
+{
+	u8 info_msg_num;
+	u8 setting;
+
+	setting = cxlds->evt_int_policy.info_settings;
+	info_msg_num = CXL_EVENT_INT_MSGNUM(setting);
+	cxl_config_log_irq(cxlds, CXL_EVENT_TYPE_INFO, setting, info_msg_num);
+
+	setting = cxlds->evt_int_policy.warn_settings;
+	cxl_config_log_irq(cxlds, CXL_EVENT_TYPE_WARN, setting,
+			   CXL_EVENT_INT_MSGNUM(setting));
+
+	setting = cxlds->evt_int_policy.failure_settings;
+	cxl_config_log_irq(cxlds, CXL_EVENT_TYPE_FAIL, setting,
+			   CXL_EVENT_INT_MSGNUM(setting));
+
+	setting = cxlds->evt_int_policy.fatal_settings;
+	cxl_config_log_irq(cxlds, CXL_EVENT_TYPE_FATAL, setting,
+			   CXL_EVENT_INT_MSGNUM(setting));
+
+	setting = cxlds->evt_int_policy.fatal_settings;
+	/* Dynamic Capacity shares the info message number */
+	cxl_config_log_irq(cxlds, CXL_EVENT_TYPE_DYNAMIC_CAP, setting,
+			   info_msg_num);
+}
+
 static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct cxl_register_map map;
@@ -498,6 +648,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
+	if (!cxl_pci_alloc_irq_vectors(cxlds))
+		cxl_configure_event_irq(cxlds);
 	cxl_mem_get_event_records(cxlds);
 
 	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
diff --git a/include/uapi/linux/cxl_mem.h b/include/uapi/linux/cxl_mem.h
index 7c1ad8062792..a8204802fcca 100644
--- a/include/uapi/linux/cxl_mem.h
+++ b/include/uapi/linux/cxl_mem.h
@@ -26,6 +26,8 @@
 	___C(GET_SUPPORTED_LOGS, "Get Supported Logs"),                   \
 	___C(GET_EVENT_RECORD, "Get Event Record"),                       \
 	___C(CLEAR_EVENT_RECORD, "Clear Event Record"),                   \
+	___C(GET_EVT_INT_POLICY, "Get Event Interrupt Policy"),           \
+	___C(SET_EVT_INT_POLICY, "Set Event Interrupt Policy"),           \
 	___C(GET_FW_INFO, "Get FW Info"),                                 \
 	___C(GET_PARTITION_INFO, "Get Partition Information"),            \
 	___C(GET_LSA, "Get Label Storage Area"),                          \
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code
  2022-10-10 22:41 ` [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code ira.weiny
@ 2022-10-11 10:41   ` Jonathan Cameron
  2022-10-14 16:29   ` Davidlohr Bueso
  1 sibling, 0 replies; 36+ messages in thread
From: Jonathan Cameron @ 2022-10-11 10:41 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, linux-kernel, linux-cxl

On Mon, 10 Oct 2022 15:41:21 -0700
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> If a mailbox command fails the driver always reports ENXIO.  But this
> may not be enough information to understand why the hardware, or in my
> case Qemu, was failing.
> 
> Add a debug print of the error code returned from the hardware.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Seems very sensible to me.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
>  drivers/cxl/core/mbox.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 16176b9278b4..6c4d024ad0e8 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -181,8 +181,11 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
>  	if (rc)
>  		return rc;
>  
> -	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
> +	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
> +		dev_dbg(cxlds->dev, "MB error : %s\n",
> +			cxl_mbox_cmd_rc2str(&mbox_cmd));
>  		return cxl_mbox_cmd_rc2errno(&mbox_cmd);
> +	}
>  
>  	/*
>  	 * Variable sized commands can't be validated and so it's up to the


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 02/11] cxl/mem: Implement Get Event Records command
  2022-10-10 22:41 ` [RFC V2 PATCH 02/11] cxl/mem: Implement Get Event Records command ira.weiny
@ 2022-10-11 12:39   ` Jonathan Cameron
  2022-10-14 19:21     ` Ira Weiny
  2022-10-15 11:28   ` Steven Rostedt
  2022-10-20 21:50   ` Smita Koralahalli
  2 siblings, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2022-10-11 12:39 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Steven Rostedt, Alison Schofield, Vishal Verma,
	Ben Widawsky, Davidlohr Bueso, linux-kernel, linux-cxl,
	Mauro Carvalho Chehab

On Mon, 10 Oct 2022 15:41:22 -0700
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> Event records are defined for CXL devices.  Each record is reported in
> one event log.  Devices are required to support the storage of at least
> one event record in each event log type.
> 
> Devices track event log overflow by incrementing a counter and tracking
> the time of the first and last overflow event seen.
> 
> Software queries events via the Get Event Record mailbox command; CXL
> rev 3.0 section 8.2.9.2.2.
> 
> Issue the Get Event Record mailbox command on driver load.  Trace each
> record found, as well as any overflow conditions.  Only 1 event is
> requested for each query.  Optimization of multiple record queries is
> deferred.
> 
> This patch traces a raw event record only and leaves the specific event
> record types to subsequent patches.
> 
> Macros are created to use for the common CXL Event header fields.
> 
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Hi Ira,

This looks basically fine, but I'm not convinced that it is a good
or sustainable idea to report reserved fields from the underlying
events in the trace events.
If they are defined to have meaning later, we can't remove them from
the 'reserved' field with out breaking backwards compatibility.
So we end up with a weird mixture of some fields in both reserved and
new entries in the TP and some not.

I've cc'd Mauro as he's probably more experienced in how to handle this
sort of stuff than anyone else I know.

Jonathan

> 
> ---
> Change from RFC:
> 	Remove redundant error message in get event records loop
> 	s/EVENT_RECORD_DATA_LENGTH/CXL_EVENT_RECORD_DATA_LENGTH
> 	Use hdr_uuid for the header UUID field
> 	Use cxl_event_log_type_str() for the trace events
> 	Create macros for the header fields and common entries of each event
> 	Add reserved buffer output dump
> 	Report error if event query fails
> 	Remove unused record_cnt variable
> 	Steven - reorder overflow record
> 		Remove NOTE about checkpatch
> 	Jonathan
> 		check for exactly 1 record
> 		s/v3.0/rev 3.0
> 		Use 3 byte fields for 24bit fields
> 		Add 3.0 Maintenance Operation Class
> 		Add Dynamic Capacity log type
> 		Fix spelling
> 	Dave Jiang/Dan/Alison
> 		s/cxl-event/cxl
> 		trace/events/cxl-events => trace/events/cxl.h
> 		s/cxl_event_overflow/overflow
> 		s/cxl_event/generic_event
> ---

... 

> +/**
> + * cxl_mem_get_event_records - Get Event Records from the device
> + * @cxlds: The device data for the operation
> + *
> + * Retrieve all event records available on the device and report them as trace
> + * events.
> + *
> + * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> + */
> +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> +{
> +	enum cxl_event_log_type log_type;
> +
> +	dev_dbg(cxlds->dev, "Reading event logs\n");
> +
> +	for (log_type = CXL_EVENT_TYPE_INFO;

I'd start at 0.  To my mind that's clearer than this which
basically says there is a contiguous range that may or may
not be 0 based.

> +	     log_type < CXL_EVENT_TYPE_MAX; log_type++)
> +		cxl_mem_get_records_log(cxlds, log_type);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> +
>  /**
>   * cxl_mem_get_partition_info - Get partition info
>   * @cxlds: The device data for the operation
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 88e3a8e54b6a..fa8d248fb299 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -4,6 +4,7 @@
>  #define __CXL_MEM_H__
>  #include <uapi/linux/cxl_mem.h>
>  #include <linux/cdev.h>
> +#include <linux/uuid.h>
>  #include "cxl.h"
>  
>  /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
> @@ -253,6 +254,7 @@ struct cxl_dev_state {
>  enum cxl_opcode {
>  	CXL_MBOX_OP_INVALID		= 0x0000,
>  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> +	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
>  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
>  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
>  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> @@ -322,6 +324,78 @@ struct cxl_mbox_identify {
>  	u8 qos_telemetry_caps;
>  } __packed;
>  
> +/*
> + * Common Event Record Format
> + * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
> + */
> +#define CXL_EVENT_REC_HDR_RES_LEN 0xf
> +struct cxl_event_record_hdr {
> +	uuid_t id;
> +	u8 length;
> +	u8 flags[3];
> +	__le16 handle;
> +	__le16 related_handle;
> +	__le64 timestamp;
> +	u8 maint_op_class;
> +	u8 reserved[CXL_EVENT_REC_HDR_RES_LEN];
> +} __packed;
> +
> +#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
> +struct cxl_event_record_raw {

I'd like some comments here on 'why' this makes sense.
Is expectation that it's here for future CXL spec definitions
or is it intended to allow some reporting of vendor defined
records?  Actually maybe a comment down at the TP would make more
sense than here.  Either way comment somewhere :)

> +	struct cxl_event_record_hdr hdr;
> +	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
> +} __packed;
> +


> +static inline const char *cxl_event_log_type_str(enum cxl_event_log_type type)
> +{
> +	switch (type) {
> +	case CXL_EVENT_TYPE_INFO:
> +		return "Informational";
> +	case CXL_EVENT_TYPE_WARN:
> +		return "Warning";
> +	case CXL_EVENT_TYPE_FAIL:
> +		return "Failure";
> +	case CXL_EVENT_TYPE_FATAL:
> +		return "Fatal";
> +	case CXL_EVENT_TYPE_DYNAMIC_CAP:
> +		return "Dynamic Capacity";
Array of const char * that you pick from (with limit check) maybe rather than a switch?
Doesn't matter that much though.

> +	default:
> +		break;
> +	}
> +	return "<unknown>";
> +}
> +
>  struct cxl_mbox_get_partition_info {
>  	__le64 active_volatile_cap;
>  	__le64 active_persistent_cap;
> @@ -381,6 +455,7 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
>  struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
>  void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
>  void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds);
>  #ifdef CONFIG_CXL_SUSPEND
>  void cxl_mem_active_inc(void);
>  void cxl_mem_active_dec(void);
> diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> new file mode 100644
> index 000000000000..318ba5fe046e
> --- /dev/null
> +++ b/include/trace/events/cxl.h
> @@ -0,0 +1,130 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM cxl
> +
> +#if !defined(_CXL_TRACE_EVENTS_H) ||  defined(TRACE_HEADER_MULTI_READ)
> +#define _CXL_TRACE_EVENTS_H
> +
> +#include <asm-generic/unaligned.h>
> +#include <linux/tracepoint.h>
> +
> +TRACE_EVENT(overflow,
> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_get_event_payload *payload),
> +
> +	TP_ARGS(dev_name, log, payload),
> +
> +	TP_STRUCT__entry(
> +		__string(dev_name, dev_name)
> +		__field(int, log)
> +		__field(u64, first)

first_ts maybe?  first alone is a bit vague.

> +		__field(u64, last)
> +		__field(u16, count)
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(dev_name, dev_name);
> +		__entry->log = log;
> +		__entry->count = le16_to_cpu(payload->overflow_err_count);
> +		__entry->first = le64_to_cpu(payload->first_overflow_timestamp);
> +		__entry->last = le64_to_cpu(payload->last_overflow_timestamp);
> +	),
> +
> +	TP_printk("%s: EVENT LOG OVERFLOW log=%s : %u records from %llu to %llu",
> +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),

So we have a potential header include ordering issue given this header doesn't
include the definition of cxl_event_log_type_str()?

> +		__entry->count, __entry->first, __entry->last)
> +
> +);
> +
> +/*
> + * Common Event Record Format
> + * CXL 3.0 section 8.2.9.2.1; Table 8-42
> + */
> +#define CXL_EVENT_RECORD_FLAG_PERMANENT		BIT(2)
> +#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED	BIT(3)
> +#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED	BIT(4)
> +#define CXL_EVENT_RECORD_FLAG_HW_REPLACE	BIT(5)
> +#define show_hdr_flags(flags)	__print_flags(flags, " | ",			   \
> +	{ CXL_EVENT_RECORD_FLAG_PERMANENT,	"Permanent Condition"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,	"Maintenance Needed"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,	"Performance Degraded"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_HW_REPLACE,	"Hardware Replacement Needed"	}  \
> +)
> +
> +/*
> + * Define macros for the common header of each CXL event.
> + *
> + * Tracepoints using these macros must do 3 things:
> + *
> + *	1) Add CXL_EVT_TP_entry to TP_STRUCT__entry
> + *	2) Use CXL_EVT_TP_fast_assign within TP_fast_assign;
> + *	   pass the dev_name, log, and CXL event header
> + *	3) Use CXL_EVT_TP_printk() instead of TP_printk()
> + *
> + * See the generic_event tracepoint as an example.
> + */
> +#define CXL_EVT_TP_entry					\
> +	__string(dev_name, dev_name)				\
> +	__field(int, log)					\
> +	__array(u8, hdr_uuid, UUID_SIZE)			\
> +	__field(u32, hdr_flags)					\
> +	__field(u16, hdr_handle)				\
> +	__field(u16, hdr_related_handle)			\
> +	__field(u64, hdr_timestamp)				\
> +	__array(u8, hdr_res, CXL_EVENT_REC_HDR_RES_LEN)		\
> +	__field(u8, hdr_length)					\
> +	__field(u8, hdr_maint_op_class)
> +
> +#define CXL_EVT_TP_fast_assign(dname, l, hdr)					\
> +	__assign_str(dev_name, (dname));					\
> +	__entry->log = (l);							\
> +	memcpy(__entry->hdr_uuid, &(hdr).id, UUID_SIZE);			\
> +	__entry->hdr_length = (hdr).length;					\
> +	__entry->hdr_flags = get_unaligned_le24((hdr).flags);			\
> +	__entry->hdr_handle = le16_to_cpu((hdr).handle);			\
> +	__entry->hdr_related_handle = le16_to_cpu((hdr).related_handle);	\
> +	__entry->hdr_timestamp = le64_to_cpu((hdr).timestamp);			\
> +	__entry->hdr_maint_op_class = (hdr).maint_op_class;			\
> +	memcpy(__entry->hdr_res, &(hdr).reserved,				\
> +		CXL_EVENT_REC_HDR_RES_LEN)

What's the logic behind having the reserved fields here?
How does that change when a future spec defines them as used? Do we have
to keep whatever was in the previously reserved fields for ever in order to
maintain backwards compatibility even though we've added the same data to the end
of the trace point?

I don't think they should be here at all. Given a userspace stack has to cope with
out knowing the contents of those fields as userspace might not be up to date, 
I see no problem with requiring a newer kernel to support stuff added in future
specs.

Steven, Mauro, you probably have a better idea of history of similar cases.  How
have other people handled reserved fields in underlying hardware reports that
may become used in later revisions?

Also, probably makes sense to cc linux-edac on v3 of this series as the experts
in these flows tend to read that list.  Obviously intent so far is not to pass
these into the edac subsystem, but I'd still let them know this work is going on.


> +
> +
> +#define CXL_EVT_TP_printk(fmt, ...) \
> +	TP_printk("%s log=%s : time=%llu uuid=%pUl len=%d flags='%s' "		\
> +		"handle=%x related_handle=%x maint_op_class=%u res='%s' "	\
> +		" : " fmt,							\
> +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),	\
> +		__entry->hdr_timestamp, __entry->hdr_uuid, __entry->hdr_length,	\
> +		show_hdr_flags(__entry->hdr_flags), __entry->hdr_handle,	\
> +		__entry->hdr_related_handle, __entry->hdr_maint_op_class,	\
> +		__print_hex(__entry->hdr_res, CXL_EVENT_REC_HDR_RES_LEN),	\
> +		##__VA_ARGS__)
> +
> +TRACE_EVENT(generic_event,
> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_event_record_raw *rec),
> +
> +	TP_ARGS(dev_name, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +		__array(u8, data, CXL_EVENT_RECORD_DATA_LENGTH)
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +		memcpy(__entry->data, &rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
> +	),
> +
> +	CXL_EVT_TP_printk("%s",
> +		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
> +);
> +
> +#endif /* _CXL_TRACE_EVENTS_H */

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 04/11] cxl/mem: Clear events on driver load
  2022-10-10 22:41 ` [RFC V2 PATCH 04/11] cxl/mem: Clear events on driver load ira.weiny
@ 2022-10-11 12:42   ` Jonathan Cameron
  0 siblings, 0 replies; 36+ messages in thread
From: Jonathan Cameron @ 2022-10-11 12:42 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, linux-kernel, linux-cxl

On Mon, 10 Oct 2022 15:41:24 -0700
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> The information contained in the events prior to the driver loading can
> be queried at any time through other mailbox commands.
> 
> Ensure a clean slate of events by reading and clearing the events.  The
> events are sent to the trace buffer but it is not anticipated to have
> anyone listening to it at driver load time.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Makes sense I think.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  drivers/cxl/pci.c            | 2 ++
>  tools/testing/cxl/test/mem.c | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index faeb5d9d7a7a..5f1b492bd388 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -498,6 +498,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> +	cxl_mem_get_event_records(cxlds);
> +
>  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
>  		rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
>  
> diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> index aa2df3a15051..e2f5445d24ff 100644
> --- a/tools/testing/cxl/test/mem.c
> +++ b/tools/testing/cxl/test/mem.c
> @@ -285,6 +285,8 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> +	cxl_mem_get_event_records(cxlds);
> +
>  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
>  		rc = devm_cxl_add_nvdimm(dev, cxlmd);
>  


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 05/11] cxl/mem: Trace General Media Event Record
  2022-10-10 22:41 ` [RFC V2 PATCH 05/11] cxl/mem: Trace General Media Event Record ira.weiny
@ 2022-10-11 12:57   ` Jonathan Cameron
  2022-10-14 23:33     ` Ira Weiny
  2022-10-15 11:30   ` Steven Rostedt
  1 sibling, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2022-10-11 12:57 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, linux-kernel, linux-cxl

On Mon, 10 Oct 2022 15:41:25 -0700
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record.
> 
> Determine if the event read is a general media record and if so trace
> the record.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

I'll review the rest of these with the assumption that the question
of reserved bytes in tracepoints will get resolved before these go in.

One minor comment on a comment inline.  Other than those lgtm

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> ---
> Changes from RFC:
> 	Add reserved byte array
> 	Use common CXL event header record macros
> 	Jonathan
> 		Use unaligned_le{24,16} for unaligned fields
> 		Don't use the inverse of phy addr mask
> 	Dave Jiang
> 		s/cxl_gen_media_event/general_media
> 		s/cxl_evt_gen_media/cxl_event_gen_media
> ---
>  drivers/cxl/core/mbox.c    |  30 ++++++++++-
>  drivers/cxl/cxlmem.h       |  20 +++++++
>  include/trace/events/cxl.h | 108 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 156 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index bc4e42b3e01b..1097250c115a 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -712,6 +712,32 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>  
> +/*
> + * General Media Event Record
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + */
> +static const uuid_t gen_media_event_uuid =
> +	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
> +		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
> +
> +static void cxl_trace_event_record(const char *dev_name,
> +				   enum cxl_event_log_type type,
> +				   struct cxl_get_event_payload *payload)
> +{
> +	uuid_t *id = &payload->record.hdr.id;
> +
> +	if (uuid_equal(id, &gen_media_event_uuid)) {
> +		struct cxl_event_gen_media *rec =
> +				(struct cxl_event_gen_media *)&payload->record;
> +
> +		trace_general_media(dev_name, type, rec);
> +		return;
> +	}
> +
> +	/* For unknown record types print just the header */
> +	trace_generic_event(dev_name, type, &payload->record);

Looks like it prints the whole thing now...


> +}
> +


> diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> index 318ba5fe046e..82a8d3b750a2 100644
> --- a/include/trace/events/cxl.h
> +++ b/include/trace/events/cxl.h
> @@ -122,6 +122,114 @@ TRACE_EVENT(generic_event,


> +#define CXL_GMER_VALID_CHANNEL				BIT(0)
> +#define CXL_GMER_VALID_RANK				BIT(1)
> +#define CXL_GMER_VALID_DEVICE				BIT(2)
> +#define CXL_GMER_VALID_COMPONENT			BIT(3)
> +#define show_valid_flags(flags)	__print_flags(flags, "|",		   \
> +	{ CXL_GMER_VALID_CHANNEL,			"CHANNEL"	}, \
> +	{ CXL_GMER_VALID_RANK,				"RANK"		}, \
> +	{ CXL_GMER_VALID_DEVICE,			"DEVICE"	}, \
> +	{ CXL_GMER_VALID_COMPONENT,			"COMPONENT"	}  \
> +)

I'd still rather we only had the TP_printk only print those parts that are valid
rather than the mess of having to go check the validity bit before deciding whether
to take notice of the field.  But meh, not that important given thats
not the main intended way to consume this data.


> +
> +TRACE_EVENT(general_media,
> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_event_gen_media *rec),
> +
> +	TP_ARGS(dev_name, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +		/* General Media */
> +		__field(u64, phys_addr)
> +		__field(u8, descriptor)
> +		__field(u8, type)
> +		__field(u8, transaction_type)
> +		__field(u8, channel)
> +		__field(u32, device)
> +		__array(u8, comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE)
> +		__array(u8, reserved, CXL_EVENT_GEN_MED_RES_SIZE)
> +		__field(u16, validity_flags)
> +		__field(u8, rank) /* Out of order to pack trace record */
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +
> +		/* General Media */
> +		__entry->phys_addr = le64_to_cpu(rec->phys_addr);
> +		__entry->descriptor = rec->descriptor;
> +		__entry->type = rec->type;
> +		__entry->transaction_type = rec->transaction_type;
> +		__entry->channel = rec->channel;
> +		__entry->rank = rec->rank;
> +		__entry->device = get_unaligned_le24(rec->device);
> +		memcpy(__entry->comp_id, &rec->component_id,
> +			CXL_EVENT_GEN_MED_COMP_ID_SIZE);
> +		memcpy(__entry->reserved, &rec->reserved,
> +			CXL_EVENT_GEN_MED_RES_SIZE);
> +		__entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
> +	),


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 06/11] cxl/mem: Trace DRAM Event Record
  2022-10-10 22:41 ` [RFC V2 PATCH 06/11] cxl/mem: Trace DRAM " ira.weiny
@ 2022-10-11 13:47   ` Jonathan Cameron
  2022-10-14 23:45     ` Ira Weiny
  2022-10-15 11:31   ` Steven Rostedt
  1 sibling, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2022-10-11 13:47 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, linux-kernel, linux-cxl

On Mon, 10 Oct 2022 15:41:26 -0700
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.
> 
> Determine if the event read is a DRAM event record and if so trace the
> record.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 

Trivial comments inline

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> index 82a8d3b750a2..7a90cfea348b 100644
> --- a/include/trace/events/cxl.h
> +++ b/include/trace/events/cxl.h
> @@ -230,6 +230,100 @@ TRACE_EVENT(general_media,
>  		)
>  );
>  

> +
> +TRACE_EVENT(dram,
> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_event_dram *rec),
> +
> +	TP_ARGS(dev_name, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +		/* DRAM */
> +		__field(u64, phys_addr)
> +		__field(u8, descriptor)
> +		__field(u8, type)
> +		__field(u8, transaction_type)
> +		__field(u8, channel)
> +		__field(u16, validity_flags)
> +		__field(u16, column)	/* Out of order to pack trace record */
> +		__field(u32, nibble_mask)
> +		__field(u32, row)
> +		__array(u8, cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE)
> +		__array(u8, reserved, CXL_EVENT_DER_RES_SIZE)

If we are going to have this, why not put it at the end?  Will that affect the
packing badly?

> +		__field(u8, rank)	/* Out of order to pack trace record */
> +		__field(u8, bank_group)	/* Out of order to pack trace record */
> +		__field(u8, bank)	/* Out of order to pack trace record */
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +
> +		/* DRAM */
> +		__entry->phys_addr = le64_to_cpu(rec->phys_addr);
> +		__entry->descriptor = rec->descriptor;
> +		__entry->type = rec->type;
> +		__entry->transaction_type = rec->transaction_type;
> +		__entry->validity_flags = get_unaligned_le16(rec->validity_flags);
> +		__entry->channel = rec->channel;
> +		__entry->rank = rec->rank;
> +		__entry->nibble_mask = get_unaligned_le24(rec->nibble_mask);
> +		__entry->bank_group = rec->bank_group;
> +		__entry->bank = rec->bank;
> +		__entry->row = get_unaligned_le24(rec->row);
> +		__entry->column = get_unaligned_le16(rec->column);
> +		memcpy(__entry->cor_mask, &rec->correction_mask,
> +			CXL_EVENT_DER_CORRECTION_MASK_SIZE);
> +		memcpy(__entry->reserved, &rec->reserved,
> +			CXL_EVENT_DER_RES_SIZE);
> +	),
> +
> +	CXL_EVT_TP_printk("phys_addr=%llx volatile=%s desc='%s' type='%s' " \
> +		"trans_type='%s' channel=%u rank=%u nibble_mask=%x " \
> +		"bank_group=%u bank=%u row=%u column=%u cor_mask=%s " \
> +		"valid_flags='%s' reserved=%s",
> +		__entry->phys_addr & CXL_GMER_PHYS_ADDR_MASK,
> +		(__entry->phys_addr & CXL_GMER_PHYS_ADDR_VOLATILE) ? "TRUE" : "FALSE",
> +		show_event_desc_flags(__entry->descriptor),
> +		show_mem_event_type(__entry->type),
> +		show_trans_type(__entry->transaction_type),
> +		__entry->channel, __entry->rank, __entry->nibble_mask,
> +		__entry->bank_group, __entry->bank,
> +		__entry->row, __entry->column,
> +		__print_hex(__entry->cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE),
> +		show_dram_valid_flags(__entry->validity_flags),
> +		__print_hex(__entry->reserved, CXL_EVENT_DER_RES_SIZE)
> +		)
Probably one less tab on that trailing )?

> +);
> +
>  #endif /* _CXL_TRACE_EVENTS_H */
>  
>  /* This part must be outside protection */


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 07/11] cxl/mem: Trace Memory Module Event Record
  2022-10-10 22:41 ` [RFC V2 PATCH 07/11] cxl/mem: Trace Memory Module " ira.weiny
@ 2022-10-11 13:54   ` Jonathan Cameron
  2022-10-15 11:32   ` Steven Rostedt
  1 sibling, 0 replies; 36+ messages in thread
From: Jonathan Cameron @ 2022-10-11 13:54 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, linux-kernel, linux-cxl

On Mon, 10 Oct 2022 15:41:27 -0700
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.
> 
> Determine if the event read is memory module record and if so trace the
> record.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
A few trivial comments inline. I'm happy either way
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>


> +#define show_add_status(as) __print_symbolic(as,	   \
> +	{ CXL_DHI_AS_NORMAL,		"Normal"	}, \
> +	{ CXL_DHI_AS_WARNING,		"Warning"	}, \
> +	{ CXL_DHI_AS_CRITICAL,		"Critical"	}  \
> +)
> +
> +#define CXL_DHI_AS_LIFE_USED(as)			(as & 0x3)
> +#define CXL_DHI_AS_DEV_TEMP(as)				((as & 0xC) >> 2)
> +#define CXL_DHI_AS_COR_VOL_ERR_CNT(as)			((as & 0x10) >> 4)
> +#define CXL_DHI_AS_COR_PER_ERR_CNT(as)			((as & 0x20) >> 5)
>

> +	),
> +
> +	CXL_EVT_TP_printk("evt_type='%s' health_status='%s' media_status='%s' " \
> +		"as_life_used=%s as_dev_temp=%s as_cor_vol_err_cnt=%s " \
> +		"as_cor_per_err_cnt=%s life_used=%u dev_temp=%d " \
> +		"dirty_shutdown_cnt=%u cor_vol_err_cnt=%u cor_per_err_cnt=%u " \
> +		"reserved=%s",
> +		show_dev_evt_type(__entry->event_type),
> +		show_health_status_flags(__entry->health_status),
> +		show_media_status(__entry->media_status),
> +		show_add_status(CXL_DHI_AS_LIFE_USED(__entry->add_status)),
> +		show_add_status(CXL_DHI_AS_DEV_TEMP(__entry->add_status)),
> +		show_add_status(CXL_DHI_AS_COR_VOL_ERR_CNT(__entry->add_status)),
> +		show_add_status(CXL_DHI_AS_COR_PER_ERR_CNT(__entry->add_status)),
A little nasty to use same show_add_status() for the 2 bit and 1 bit versions.
Obviously it works, but maybe it's worth two macros for the 1 and 2 bit version?

> +		__entry->life_used, __entry->device_temp,
> +		__entry->dirty_shutdown_cnt, __entry->cor_vol_err_cnt,
> +		__entry->cor_per_err_cnt,
> +		__print_hex(__entry->reserved, CXL_EVENT_MEM_MOD_RES_SIZE)
> +		)
Aligned one tab too far?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 11/11] cxl/mem: Wire up event interrupts
  2022-10-10 22:41 ` [RFC V2 PATCH 11/11] cxl/mem: Wire up event interrupts ira.weiny
@ 2022-10-12 18:01   ` Davidlohr Bueso
  0 siblings, 0 replies; 36+ messages in thread
From: Davidlohr Bueso @ 2022-10-12 18:01 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, linux-kernel, linux-cxl

On Mon, 10 Oct 2022, ira.weiny@intel.com wrote:

>From: Ira Weiny <ira.weiny@intel.com>
>
>CXL device events are signaled via interrupts.  Each event log may have
>a different interrupt message number.  These message numbers are
>reported in the Get Event Interrupt Policy mailbox command.
>
>Create an infrastructure to query the max vectors required for the CXL
>device.  Add event interrupt information that infrastructure.  Set up a
>handler for each event log.
>
>Davidlohr suggested the generic vector code.

So this should be a separate patch, and out of the event series altogether.
There are a number of interested parties for irq support, and imo the
generic vector stuff should be a first dependency to all of them.

I've sent out a patch with some updates you did. I also kept the whole
table populated with nil callbacks as I believe this documents the TODO
nicely.

Thanks,
Davidlohr

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code
  2022-10-10 22:41 ` [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code ira.weiny
  2022-10-11 10:41   ` Jonathan Cameron
@ 2022-10-14 16:29   ` Davidlohr Bueso
  2022-10-14 16:31     ` Davidlohr Bueso
  1 sibling, 1 reply; 36+ messages in thread
From: Davidlohr Bueso @ 2022-10-14 16:29 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, linux-kernel, linux-cxl

On Mon, 10 Oct 2022, ira.weiny@intel.com wrote:

>From: Ira Weiny <ira.weiny@intel.com>
>
>If a mailbox command fails the driver always reports ENXIO.  But this
>may not be enough information to understand why the hardware, or in my
>case Qemu, was failing.
>
>Add a debug print of the error code returned from the hardware.

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>

with a nit below.

>
>Signed-off-by: Ira Weiny <ira.weiny@intel.com>
>---
> drivers/cxl/core/mbox.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
>index 16176b9278b4..6c4d024ad0e8 100644
>--- a/drivers/cxl/core/mbox.c
>+++ b/drivers/cxl/core/mbox.c
>@@ -181,8 +181,11 @@ int cxl_mbox_send_cmd(struct cxl_dev_state *cxlds, u16 opcode, void *in,
>	if (rc)
>		return rc;
>
>-	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
>+	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
>+		dev_dbg(cxlds->dev, "MB error : %s\n",

Maybe s/MB/mbox?

>+			cxl_mbox_cmd_rc2str(&mbox_cmd));
>		return cxl_mbox_cmd_rc2errno(&mbox_cmd);
>+	}
>
>	/*
>	 * Variable sized commands can't be validated and so it's up to the
>--
>2.37.2
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code
  2022-10-14 16:29   ` Davidlohr Bueso
@ 2022-10-14 16:31     ` Davidlohr Bueso
  2022-10-14 17:00       ` Ira Weiny
  0 siblings, 1 reply; 36+ messages in thread
From: Davidlohr Bueso @ 2022-10-14 16:31 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, linux-kernel, linux-cxl

On Fri, 14 Oct 2022, Davidlohr Bueso wrote:
>>-	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
>>+	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
>>+		dev_dbg(cxlds->dev, "MB error : %s\n",
>
>Maybe s/MB/mbox?

Actually 'Mailbox' seems to be the standard:

core/regs.c:			dev_dbg(dev, "found Mailbox capability (0x%x)\n", offset);
core/regs.c:			dev_dbg(dev, "found Secondary Mailbox capability (0x%x)\n", offset);
pci.c:		dev_dbg(dev, "Mailbox operation had an error\n");
pci.c:		dev_err(cxlds->dev, "Mailbox is too small (%zub)",
pci.c:	dev_dbg(cxlds->dev, "Mailbox payload sized %zu",

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code
  2022-10-14 16:31     ` Davidlohr Bueso
@ 2022-10-14 17:00       ` Ira Weiny
  0 siblings, 0 replies; 36+ messages in thread
From: Ira Weiny @ 2022-10-14 17:00 UTC (permalink / raw)
  To: Davidlohr Bueso
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, linux-kernel, linux-cxl

On Fri, Oct 14, 2022 at 09:31:49AM -0700, Davidlohr Bueso wrote:
> On Fri, 14 Oct 2022, Davidlohr Bueso wrote:
> > > -	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS)
> > > +	if (mbox_cmd.return_code != CXL_MBOX_CMD_RC_SUCCESS) {
> > > +		dev_dbg(cxlds->dev, "MB error : %s\n",
> > 
> > Maybe s/MB/mbox?
> 
> Actually 'Mailbox' seems to be the standard:

Good point!  Changed.

Thanks!
Ira

> 
> core/regs.c:			dev_dbg(dev, "found Mailbox capability (0x%x)\n", offset);
> core/regs.c:			dev_dbg(dev, "found Secondary Mailbox capability (0x%x)\n", offset);
> pci.c:		dev_dbg(dev, "Mailbox operation had an error\n");
> pci.c:		dev_err(cxlds->dev, "Mailbox is too small (%zub)",
> pci.c:	dev_dbg(cxlds->dev, "Mailbox payload sized %zu",

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 02/11] cxl/mem: Implement Get Event Records command
  2022-10-11 12:39   ` Jonathan Cameron
@ 2022-10-14 19:21     ` Ira Weiny
  0 siblings, 0 replies; 36+ messages in thread
From: Ira Weiny @ 2022-10-14 19:21 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Dan Williams, Steven Rostedt, Alison Schofield, Vishal Verma,
	Ben Widawsky, Davidlohr Bueso, linux-kernel, linux-cxl,
	Mauro Carvalho Chehab

On Tue, Oct 11, 2022 at 01:39:20PM +0100, Jonathan Cameron wrote:
> On Mon, 10 Oct 2022 15:41:22 -0700
> ira.weiny@intel.com wrote:
> 
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Event records are defined for CXL devices.  Each record is reported in
> > one event log.  Devices are required to support the storage of at least
> > one event record in each event log type.
> > 
> > Devices track event log overflow by incrementing a counter and tracking
> > the time of the first and last overflow event seen.
> > 
> > Software queries events via the Get Event Record mailbox command; CXL
> > rev 3.0 section 8.2.9.2.2.
> > 
> > Issue the Get Event Record mailbox command on driver load.  Trace each
> > record found, as well as any overflow conditions.  Only 1 event is
> > requested for each query.  Optimization of multiple record queries is
> > deferred.
> > 
> > This patch traces a raw event record only and leaves the specific event
> > record types to subsequent patches.
> > 
> > Macros are created to use for the common CXL Event header fields.
> > 
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> Hi Ira,
> 
> This looks basically fine, but I'm not convinced that it is a good
> or sustainable idea to report reserved fields from the underlying
> events in the trace events.
> If they are defined to have meaning later, we can't remove them from
> the 'reserved' field with out breaking backwards compatibility.
> So we end up with a weird mixture of some fields in both reserved and
> new entries in the TP and some not.

I removed the reserved fields from the data but I forgot the header.  I think I
had a reason they would be useful in the header but I forget now.  Which means
I was probably wrong.  ;-)

> 
> I've cc'd Mauro as he's probably more experienced in how to handle this
> sort of stuff than anyone else I know.
>

Fair enough but I agree the reserved fields should be removed.

> 
> Jonathan
> 
> > 
> > ---
> > Change from RFC:
> > 	Remove redundant error message in get event records loop
> > 	s/EVENT_RECORD_DATA_LENGTH/CXL_EVENT_RECORD_DATA_LENGTH
> > 	Use hdr_uuid for the header UUID field
> > 	Use cxl_event_log_type_str() for the trace events
> > 	Create macros for the header fields and common entries of each event
> > 	Add reserved buffer output dump
> > 	Report error if event query fails
> > 	Remove unused record_cnt variable
> > 	Steven - reorder overflow record
> > 		Remove NOTE about checkpatch
> > 	Jonathan
> > 		check for exactly 1 record
> > 		s/v3.0/rev 3.0
> > 		Use 3 byte fields for 24bit fields
> > 		Add 3.0 Maintenance Operation Class
> > 		Add Dynamic Capacity log type
> > 		Fix spelling
> > 	Dave Jiang/Dan/Alison
> > 		s/cxl-event/cxl
> > 		trace/events/cxl-events => trace/events/cxl.h
> > 		s/cxl_event_overflow/overflow
> > 		s/cxl_event/generic_event
> > ---
> 
> ... 
> 
> > +/**
> > + * cxl_mem_get_event_records - Get Event Records from the device
> > + * @cxlds: The device data for the operation
> > + *
> > + * Retrieve all event records available on the device and report them as trace
> > + * events.
> > + *
> > + * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> > + */
> > +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> > +{
> > +	enum cxl_event_log_type log_type;
> > +
> > +	dev_dbg(cxlds->dev, "Reading event logs\n");
> > +
> > +	for (log_type = CXL_EVENT_TYPE_INFO;
> 
> I'd start at 0.  To my mind that's clearer than this which
> basically says there is a contiguous range that may or may
> not be 0 based.

Ok.  But weather or not it is 0 based does not matter.

I'll change it.  It takes out a line of code!  ;-)

> 
> > +	     log_type < CXL_EVENT_TYPE_MAX; log_type++)
> > +		cxl_mem_get_records_log(cxlds, log_type);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> > +
> >  /**
> >   * cxl_mem_get_partition_info - Get partition info
> >   * @cxlds: The device data for the operation
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index 88e3a8e54b6a..fa8d248fb299 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -4,6 +4,7 @@
> >  #define __CXL_MEM_H__
> >  #include <uapi/linux/cxl_mem.h>
> >  #include <linux/cdev.h>
> > +#include <linux/uuid.h>
> >  #include "cxl.h"
> >  
> >  /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
> > @@ -253,6 +254,7 @@ struct cxl_dev_state {
> >  enum cxl_opcode {
> >  	CXL_MBOX_OP_INVALID		= 0x0000,
> >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> > +	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> > @@ -322,6 +324,78 @@ struct cxl_mbox_identify {
> >  	u8 qos_telemetry_caps;
> >  } __packed;
> >  
> > +/*
> > + * Common Event Record Format
> > + * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
> > + */
> > +#define CXL_EVENT_REC_HDR_RES_LEN 0xf
> > +struct cxl_event_record_hdr {
> > +	uuid_t id;
> > +	u8 length;
> > +	u8 flags[3];
> > +	__le16 handle;
> > +	__le16 related_handle;
> > +	__le64 timestamp;
> > +	u8 maint_op_class;
> > +	u8 reserved[CXL_EVENT_REC_HDR_RES_LEN];
> > +} __packed;
> > +
> > +#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
> > +struct cxl_event_record_raw {
> 
> I'd like some comments here on 'why' this makes sense.
> Is expectation that it's here for future CXL spec definitions
> or is it intended to allow some reporting of vendor defined
> records?  Actually maybe a comment down at the TP would make more
> sense than here.  Either way comment somewhere :)

This comes from a discussion I had with Dan a while back where we decided that
any unknown record would just be dumped to user space as data.  The idea was
based on the kernel ignoring vendor specified events but does allow for some
future CXL spec definitios to get through.

This may cause similar confusion to the reserved fields but I don't think it is
quite the same.

I'll add a comment.

> 
> > +	struct cxl_event_record_hdr hdr;
> > +	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
> > +} __packed;
> > +
> 
> 
> > +static inline const char *cxl_event_log_type_str(enum cxl_event_log_type type)
> > +{
> > +	switch (type) {
> > +	case CXL_EVENT_TYPE_INFO:
> > +		return "Informational";
> > +	case CXL_EVENT_TYPE_WARN:
> > +		return "Warning";
> > +	case CXL_EVENT_TYPE_FAIL:
> > +		return "Failure";
> > +	case CXL_EVENT_TYPE_FATAL:
> > +		return "Fatal";
> > +	case CXL_EVENT_TYPE_DYNAMIC_CAP:
> > +		return "Dynamic Capacity";
> Array of const char * that you pick from (with limit check) maybe rather than a switch?
> Doesn't matter that much though.

I've seen it both ways.  This forces users to use the helper.  And I don't
think there is much concern about the speed of the lookup.

> 
> > +	default:
> > +		break;
> > +	}
> > +	return "<unknown>";
> > +}
> > +
> >  struct cxl_mbox_get_partition_info {
> >  	__le64 active_volatile_cap;
> >  	__le64 active_persistent_cap;
> > @@ -381,6 +455,7 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
> >  struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
> >  void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> >  void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> > +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds);
> >  #ifdef CONFIG_CXL_SUSPEND
> >  void cxl_mem_active_inc(void);
> >  void cxl_mem_active_dec(void);
> > diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> > new file mode 100644
> > index 000000000000..318ba5fe046e
> > --- /dev/null
> > +++ b/include/trace/events/cxl.h
> > @@ -0,0 +1,130 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#undef TRACE_SYSTEM
> > +#define TRACE_SYSTEM cxl
> > +
> > +#if !defined(_CXL_TRACE_EVENTS_H) ||  defined(TRACE_HEADER_MULTI_READ)
> > +#define _CXL_TRACE_EVENTS_H
> > +
> > +#include <asm-generic/unaligned.h>
> > +#include <linux/tracepoint.h>
> > +
> > +TRACE_EVENT(overflow,
> > +
> > +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> > +		 struct cxl_get_event_payload *payload),
> > +
> > +	TP_ARGS(dev_name, log, payload),
> > +
> > +	TP_STRUCT__entry(
> > +		__string(dev_name, dev_name)
> > +		__field(int, log)
> > +		__field(u64, first)
> 
> first_ts maybe?  first alone is a bit vague.

Agreed.  Changed to first_ts and last_ts.

> 
> > +		__field(u64, last)
> > +		__field(u16, count)
> > +	),
> > +
> > +	TP_fast_assign(
> > +		__assign_str(dev_name, dev_name);
> > +		__entry->log = log;
> > +		__entry->count = le16_to_cpu(payload->overflow_err_count);
> > +		__entry->first = le64_to_cpu(payload->first_overflow_timestamp);
> > +		__entry->last = le64_to_cpu(payload->last_overflow_timestamp);
> > +	),
> > +
> > +	TP_printk("%s: EVENT LOG OVERFLOW log=%s : %u records from %llu to %llu",
> > +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),
> 
> So we have a potential header include ordering issue given this header doesn't
> include the definition of cxl_event_log_type_str()?

The RFC had a __print_symbolic() conversion which was redundant.  I really
don't like having 2 conversion routines for this string.  I think it will
result in maintenance issues down the line.

I'll try and resolve it.  Nothing I have tried this morning seems to work.  I
may just go back to what I had.  I need to double check the user space code
too.

> 
> > +		__entry->count, __entry->first, __entry->last)
> > +
> > +);
> > +
> > +/*
> > + * Common Event Record Format
> > + * CXL 3.0 section 8.2.9.2.1; Table 8-42
> > + */
> > +#define CXL_EVENT_RECORD_FLAG_PERMANENT		BIT(2)
> > +#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED	BIT(3)
> > +#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED	BIT(4)
> > +#define CXL_EVENT_RECORD_FLAG_HW_REPLACE	BIT(5)
> > +#define show_hdr_flags(flags)	__print_flags(flags, " | ",			   \
> > +	{ CXL_EVENT_RECORD_FLAG_PERMANENT,	"Permanent Condition"		}, \
> > +	{ CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,	"Maintenance Needed"		}, \
> > +	{ CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,	"Performance Degraded"		}, \
> > +	{ CXL_EVENT_RECORD_FLAG_HW_REPLACE,	"Hardware Replacement Needed"	}  \
> > +)
> > +
> > +/*
> > + * Define macros for the common header of each CXL event.
> > + *
> > + * Tracepoints using these macros must do 3 things:
> > + *
> > + *	1) Add CXL_EVT_TP_entry to TP_STRUCT__entry
> > + *	2) Use CXL_EVT_TP_fast_assign within TP_fast_assign;
> > + *	   pass the dev_name, log, and CXL event header
> > + *	3) Use CXL_EVT_TP_printk() instead of TP_printk()
> > + *
> > + * See the generic_event tracepoint as an example.
> > + */
> > +#define CXL_EVT_TP_entry					\
> > +	__string(dev_name, dev_name)				\
> > +	__field(int, log)					\
> > +	__array(u8, hdr_uuid, UUID_SIZE)			\
> > +	__field(u32, hdr_flags)					\
> > +	__field(u16, hdr_handle)				\
> > +	__field(u16, hdr_related_handle)			\
> > +	__field(u64, hdr_timestamp)				\
> > +	__array(u8, hdr_res, CXL_EVENT_REC_HDR_RES_LEN)		\
> > +	__field(u8, hdr_length)					\
> > +	__field(u8, hdr_maint_op_class)
> > +
> > +#define CXL_EVT_TP_fast_assign(dname, l, hdr)					\
> > +	__assign_str(dev_name, (dname));					\
> > +	__entry->log = (l);							\
> > +	memcpy(__entry->hdr_uuid, &(hdr).id, UUID_SIZE);			\
> > +	__entry->hdr_length = (hdr).length;					\
> > +	__entry->hdr_flags = get_unaligned_le24((hdr).flags);			\
> > +	__entry->hdr_handle = le16_to_cpu((hdr).handle);			\
> > +	__entry->hdr_related_handle = le16_to_cpu((hdr).related_handle);	\
> > +	__entry->hdr_timestamp = le64_to_cpu((hdr).timestamp);			\
> > +	__entry->hdr_maint_op_class = (hdr).maint_op_class;			\
> > +	memcpy(__entry->hdr_res, &(hdr).reserved,				\
> > +		CXL_EVENT_REC_HDR_RES_LEN)
> 
> What's the logic behind having the reserved fields here?

I forget.

> How does that change when a future spec defines them as used? Do we have
> to keep whatever was in the previously reserved fields for ever in order to
> maintain backwards compatibility even though we've added the same data to the end
> of the trace point?

I don't expect the header to change as much as the records themselves.  So I'm
ok removing the reserved fields.

> 
> I don't think they should be here at all. Given a userspace stack has to cope with
> out knowing the contents of those fields as userspace might not be up to date, 
> I see no problem with requiring a newer kernel to support stuff added in future
> specs.
> 
> Steven, Mauro, you probably have a better idea of history of similar cases.  How
> have other people handled reserved fields in underlying hardware reports that
> may become used in later revisions?
> 
> Also, probably makes sense to cc linux-edac on v3 of this series as the experts
> in these flows tend to read that list.  Obviously intent so far is not to pass
> these into the edac subsystem, but I'd still let them know this work is going on.
>

Sounds good.

The struggle I'm having is in how flexible these records are compared to the
other RAS trace events.

I'll drop the reserved fields.
Ira

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 05/11] cxl/mem: Trace General Media Event Record
  2022-10-11 12:57   ` Jonathan Cameron
@ 2022-10-14 23:33     ` Ira Weiny
  2022-10-17 16:37       ` Jonathan Cameron
  0 siblings, 1 reply; 36+ messages in thread
From: Ira Weiny @ 2022-10-14 23:33 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, linux-kernel, linux-cxl

On Tue, Oct 11, 2022 at 01:57:02PM +0100, Jonathan Cameron wrote:
> On Mon, 10 Oct 2022 15:41:25 -0700
> ira.weiny@intel.com wrote:
> 
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record.
> > 
> > Determine if the event read is a general media record and if so trace
> > the record.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> I'll review the rest of these with the assumption that the question
> of reserved bytes in tracepoints will get resolved before these go in.

Yea I'm removing them.  I think I messed up somehow because I thought I had
removed the reserved fields from the records.  But perhaps that was only in my
dreams...  :-/  Sorry.  :-)

> 
> One minor comment on a comment inline.  Other than those lgtm
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 

Thanks!

[snip]

> >  
> > +/*
> > + * General Media Event Record
> > + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> > + */
> > +static const uuid_t gen_media_event_uuid =
> > +	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
> > +		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
> > +
> > +static void cxl_trace_event_record(const char *dev_name,
> > +				   enum cxl_event_log_type type,
> > +				   struct cxl_get_event_payload *payload)
> > +{
> > +	uuid_t *id = &payload->record.hdr.id;
> > +
> > +	if (uuid_equal(id, &gen_media_event_uuid)) {
> > +		struct cxl_event_gen_media *rec =
> > +				(struct cxl_event_gen_media *)&payload->record;
> > +
> > +		trace_general_media(dev_name, type, rec);
> > +		return;
> > +	}
> > +
> > +	/* For unknown record types print just the header */
> > +	trace_generic_event(dev_name, type, &payload->record);
> 
> Looks like it prints the whole thing now...

An unknown record is dumped yes.  I'm ok with skipping this but it was
discussed early on that any unknown record would be passed through.

> 
> 
> > +}
> > +
> 
> 
> > diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> > index 318ba5fe046e..82a8d3b750a2 100644
> > --- a/include/trace/events/cxl.h
> > +++ b/include/trace/events/cxl.h
> > @@ -122,6 +122,114 @@ TRACE_EVENT(generic_event,
> 
> 
> > +#define CXL_GMER_VALID_CHANNEL				BIT(0)
> > +#define CXL_GMER_VALID_RANK				BIT(1)
> > +#define CXL_GMER_VALID_DEVICE				BIT(2)
> > +#define CXL_GMER_VALID_COMPONENT			BIT(3)
> > +#define show_valid_flags(flags)	__print_flags(flags, "|",		   \
> > +	{ CXL_GMER_VALID_CHANNEL,			"CHANNEL"	}, \
> > +	{ CXL_GMER_VALID_RANK,				"RANK"		}, \
> > +	{ CXL_GMER_VALID_DEVICE,			"DEVICE"	}, \
> > +	{ CXL_GMER_VALID_COMPONENT,			"COMPONENT"	}  \
> > +)
> 
> I'd still rather we only had the TP_printk only print those parts that are valid
> rather than the mess of having to go check the validity bit before deciding whether
> to take notice of the field.  But meh, not that important given thats
> not the main intended way to consume this data.
> 

Ok I spent some time really thinking about this and below is an attempt at
that.

However, I must be missing something in what you are proposing because I don't
like having extra space in the trace buffer to print into like this and I
can't figure out where else to put a print buffer.

Also note that this will have no impact on most of the user space tools which
use libtracefs as they will see all the fields and will need to do a similar
decode.

Do you really think this is worth the effort?

Ira


commit 54c4ee67bcac6a38cbc9b0ea2ef952e197356dee
Author: Ira Weiny <ira.weiny@intel.com>
Date:   Fri Oct 14 16:15:27 2022 -0700

    squash: Attempt to print only valid fields per Jonathan suggestion

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 2add473fc168..9e15028af79c 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -720,6 +720,28 @@ static const uuid_t gen_media_event_uuid =
        UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
                  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
 
+const char *cxl_gem_print_valid(u8 *buf, int nr, u16 valid_flags, u8 channel,
+                               u8 rank, u32 device, u8 *comp_id)
+{
+       char *str = buf;
+       int rc = 0;
+
+       if (valid_flags & CXL_GMER_VALID_CHANNEL)
+               rc = snprintf(str, nr, "channel=%u ", channel);
+
+       if (valid_flags & CXL_GMER_VALID_RANK)
+               rc = snprintf(str + rc, nr - rc, "rank=%u ", rank);
+
+       if (valid_flags & CXL_GMER_VALID_DEVICE)
+               rc = snprintf(str + rc, nr - rc, "device=%u ", device);
+
+       if (valid_flags & CXL_GMER_VALID_COMPONENT)
+               rc = snprintf(str + rc, nr - rc, "comp_id=%*ph ",
+                             CXL_EVENT_GEN_MED_COMP_ID_SIZE, comp_id);
+
+       return str;
+}
+
 static void cxl_trace_event_record(const char *dev_name,
                                   enum cxl_event_log_type type,
                                   struct cxl_get_event_payload *payload)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 8fbd20d8a0b2..3d3bfef69d32 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -429,6 +429,13 @@ struct cxl_event_gen_media {
        u8 reserved[0x2e];
 } __packed;
 
+#define CXL_GMER_VALID_CHANNEL                         BIT(0)
+#define CXL_GMER_VALID_RANK                            BIT(1)
+#define CXL_GMER_VALID_DEVICE                          BIT(2)
+#define CXL_GMER_VALID_COMPONENT                       BIT(3)
+const char *cxl_gem_print_valid(u8 *buf, int nr, u16 valid_flags, u8 channel,
+                               u8 rank, u32 device, u8 *comp_id);
+
 struct cxl_mbox_get_partition_info {
        __le64 active_volatile_cap;
        __le64 active_persistent_cap;
diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
index e3e11c9055ba..27bb790cb685 100644
--- a/include/trace/events/cxl.h
+++ b/include/trace/events/cxl.h
@@ -161,16 +161,7 @@ TRACE_EVENT(generic_event,
        { CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT,     "Internal Media Management" }   \
 )
 
-#define CXL_GMER_VALID_CHANNEL                         BIT(0)
-#define CXL_GMER_VALID_RANK                            BIT(1)
-#define CXL_GMER_VALID_DEVICE                          BIT(2)
-#define CXL_GMER_VALID_COMPONENT                       BIT(3)
-#define show_valid_flags(flags)        __print_flags(flags, "|",                  \
-       { CXL_GMER_VALID_CHANNEL,                       "CHANNEL"       }, \
-       { CXL_GMER_VALID_RANK,                          "RANK"          }, \
-       { CXL_GMER_VALID_DEVICE,                        "DEVICE"        }, \
-       { CXL_GMER_VALID_COMPONENT,                     "COMPONENT"     }  \
-)
+#define CXL_VALID_PRINT_STR_LEN 512
 
 TRACE_EVENT(general_media,
 
@@ -191,6 +182,7 @@ TRACE_EVENT(general_media,
                __array(u8, comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE)
                __field(u16, validity_flags)
                __field(u8, rank) /* Out of order to pack trace record */
+               __array(u8, str, CXL_VALID_PRINT_STR_LEN)
        ),
 
        TP_fast_assign(
@@ -209,17 +201,17 @@ TRACE_EVENT(general_media,
                __entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
        ),
 
-       CXL_EVT_TP_printk("phys_addr=%llx volatile=%s desc='%s' type='%s' " \
-               "trans_type='%s' channel=%u rank=%u device=%x comp_id=%s " \
-               "valid_flags='%s'",
+       CXL_EVT_TP_printk("phys_addr=%llx volatile=%s desc='%s' type='%s' "     \
+               "trans_type='%s' %s",
                __entry->phys_addr & CXL_GMER_PHYS_ADDR_MASK,
                (__entry->phys_addr & CXL_GMER_PHYS_ADDR_VOLATILE) ? "TRUE" : "FALSE",
                show_event_desc_flags(__entry->descriptor),
                show_mem_event_type(__entry->type),
                show_trans_type(__entry->transaction_type),
-               __entry->channel, __entry->rank, __entry->device,
-               __print_hex(__entry->comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE),
-               show_valid_flags(__entry->validity_flags)
+               cxl_gem_print_valid(__entry->str, CXL_VALID_PRINT_STR_LEN,
+                                   __entry->validity_flags, __entry->channel,
+                                   __entry->rank, __entry->device,
+                                   __entry->comp_id)
        )
 );


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 06/11] cxl/mem: Trace DRAM Event Record
  2022-10-11 13:47   ` Jonathan Cameron
@ 2022-10-14 23:45     ` Ira Weiny
  0 siblings, 0 replies; 36+ messages in thread
From: Ira Weiny @ 2022-10-14 23:45 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, linux-kernel, linux-cxl

On Tue, Oct 11, 2022 at 02:47:12PM +0100, Jonathan Cameron wrote:
> On Mon, 10 Oct 2022 15:41:26 -0700
> ira.weiny@intel.com wrote:
> 
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.
> > 
> > Determine if the event read is a DRAM event record and if so trace the
> > record.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> 
> Trivial comments inline
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> > diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> > index 82a8d3b750a2..7a90cfea348b 100644
> > --- a/include/trace/events/cxl.h
> > +++ b/include/trace/events/cxl.h
> > @@ -230,6 +230,100 @@ TRACE_EVENT(general_media,
> >  		)
> >  );
> >  
> 
> > +
> > +TRACE_EVENT(dram,
> > +
> > +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> > +		 struct cxl_event_dram *rec),
> > +
> > +	TP_ARGS(dev_name, log, rec),
> > +
> > +	TP_STRUCT__entry(
> > +		CXL_EVT_TP_entry
> > +		/* DRAM */
> > +		__field(u64, phys_addr)
> > +		__field(u8, descriptor)
> > +		__field(u8, type)
> > +		__field(u8, transaction_type)
> > +		__field(u8, channel)
> > +		__field(u16, validity_flags)
> > +		__field(u16, column)	/* Out of order to pack trace record */
> > +		__field(u32, nibble_mask)
> > +		__field(u32, row)
> > +		__array(u8, cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE)
> > +		__array(u8, reserved, CXL_EVENT_DER_RES_SIZE)
> 
> If we are going to have this, why not put it at the end?  Will that affect the
> packing badly?

I removed it.

[snip]

> > +
> > +	CXL_EVT_TP_printk("phys_addr=%llx volatile=%s desc='%s' type='%s' " \
> > +		"trans_type='%s' channel=%u rank=%u nibble_mask=%x " \
> > +		"bank_group=%u bank=%u row=%u column=%u cor_mask=%s " \
> > +		"valid_flags='%s' reserved=%s",
> > +		__entry->phys_addr & CXL_GMER_PHYS_ADDR_MASK,
> > +		(__entry->phys_addr & CXL_GMER_PHYS_ADDR_VOLATILE) ? "TRUE" : "FALSE",
> > +		show_event_desc_flags(__entry->descriptor),
> > +		show_mem_event_type(__entry->type),
> > +		show_trans_type(__entry->transaction_type),
> > +		__entry->channel, __entry->rank, __entry->nibble_mask,
> > +		__entry->bank_group, __entry->bank,
> > +		__entry->row, __entry->column,
> > +		__print_hex(__entry->cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE),
> > +		show_dram_valid_flags(__entry->validity_flags),
> > +		__print_hex(__entry->reserved, CXL_EVENT_DER_RES_SIZE)
> > +		)
> Probably one less tab on that trailing )?

Done.

Thanks!
Ira

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 02/11] cxl/mem: Implement Get Event Records command
  2022-10-10 22:41 ` [RFC V2 PATCH 02/11] cxl/mem: Implement Get Event Records command ira.weiny
  2022-10-11 12:39   ` Jonathan Cameron
@ 2022-10-15 11:28   ` Steven Rostedt
  2022-10-16 21:43     ` Ira Weiny
  2022-10-20 21:50   ` Smita Koralahalli
  2 siblings, 1 reply; 36+ messages in thread
From: Steven Rostedt @ 2022-10-15 11:28 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, linux-kernel, linux-cxl

On Mon, 10 Oct 2022 15:41:22 -0700
ira.weiny@intel.com wrote:

> new file mode 100644
> index 000000000000..318ba5fe046e
> --- /dev/null
> +++ b/include/trace/events/cxl.h
> @@ -0,0 +1,130 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM cxl
> +
> +#if !defined(_CXL_TRACE_EVENTS_H) ||  defined(TRACE_HEADER_MULTI_READ)
> +#define _CXL_TRACE_EVENTS_H
> +
> +#include <asm-generic/unaligned.h>
> +#include <linux/tracepoint.h>
> +
> +TRACE_EVENT(overflow,

Please do not use generic names for grouped events. As most tooling can
use the name and not associate it with the group, it makes it ambiguous
for what event it wants to enable.

That is, call it "cxl_overflow" and not "overflow".

> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_get_event_payload *payload),
> +
> +	TP_ARGS(dev_name, log, payload),
> +
> +	TP_STRUCT__entry(
> +		__string(dev_name, dev_name)
> +		__field(int, log)
> +		__field(u64, first)
> +		__field(u64, last)
> +		__field(u16, count)
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(dev_name, dev_name);
> +		__entry->log = log;
> +		__entry->count = le16_to_cpu(payload->overflow_err_count);
> +		__entry->first = le64_to_cpu(payload->first_overflow_timestamp);
> +		__entry->last = le64_to_cpu(payload->last_overflow_timestamp);
> +	),
> +
> +	TP_printk("%s: EVENT LOG OVERFLOW log=%s : %u records from %llu to %llu",
> +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),
> +		__entry->count, __entry->first, __entry->last)
> +
> +);
> +

> +TRACE_EVENT(generic_event,

Same here. It's a "cxl_generic_event" not a "generic_event" that will
clash with any other subsystem that would (but shouldn't) use the same
name.

-- Steve

> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_event_record_raw *rec),
> +
> +	TP_ARGS(dev_name, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +		__array(u8, data, CXL_EVENT_RECORD_DATA_LENGTH)
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +		memcpy(__entry->data, &rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
> +	),
> +
> +	CXL_EVT_TP_printk("%s",
> +		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
> +);
> +
> +#endif /* _CXL_TRACE_EVENTS_H */
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 05/11] cxl/mem: Trace General Media Event Record
  2022-10-10 22:41 ` [RFC V2 PATCH 05/11] cxl/mem: Trace General Media Event Record ira.weiny
  2022-10-11 12:57   ` Jonathan Cameron
@ 2022-10-15 11:30   ` Steven Rostedt
  1 sibling, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-10-15 11:30 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, linux-kernel, linux-cxl

On Mon, 10 Oct 2022 15:41:25 -0700
ira.weiny@intel.com wrote:

> +static void cxl_trace_event_record(const char *dev_name,
> +				   enum cxl_event_log_type type,
> +				   struct cxl_get_event_payload *payload)
> +{
> +	uuid_t *id = &payload->record.hdr.id;
> +

Perhaps you want to add here:

	if (!trace_cxl_general_media_enabled() && !trace_clx_generic_event_enabled())
		return;

This way the below logic is only executed if either event is enabled.
The above uses static_branches, so if the architecture supports them,
they are not compare and branch, but jumps and/or nops.

-- Steve


> +	if (uuid_equal(id, &gen_media_event_uuid)) {
> +		struct cxl_event_gen_media *rec =
> +				(struct cxl_event_gen_media *)&payload->record;
> +
> +		trace_general_media(dev_name, type, rec);
> +		return;
> +	}
> +
> +	/* For unknown record types print just the header */
> +	trace_generic_event(dev_name, type, &payload->record);
> +}

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 06/11] cxl/mem: Trace DRAM Event Record
  2022-10-10 22:41 ` [RFC V2 PATCH 06/11] cxl/mem: Trace DRAM " ira.weiny
  2022-10-11 13:47   ` Jonathan Cameron
@ 2022-10-15 11:31   ` Steven Rostedt
  1 sibling, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-10-15 11:31 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, linux-kernel, linux-cxl

On Mon, 10 Oct 2022 15:41:26 -0700
ira.weiny@intel.com wrote:

> +TRACE_EVENT(dram,

Call this "cxl_dram"

-- Steve

> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_event_dram *rec),
> +
> +	TP_ARGS(dev_name, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +		/* DRAM */
> +		__field(u64, phys_addr)
> +		__field(u8, descriptor)
> +		__field(u8, type)
> +		__field(u8, transaction_type)
> +		__field(u8, channel)
> +		__field(u16, validity_flags)
> +		__field(u16, column)	/* Out of order to pack trace record */
> +		__field(u32, nibble_mask)
> +		__field(u32, row)
> +		__array(u8, cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE)
> +		__array(u8, reserved, CXL_EVENT_DER_RES_SIZE)
> +		__field(u8, rank)	/* Out of order to pack trace record */
> +		__field(u8, bank_group)	/* Out of order to pack trace record */
> +		__field(u8, bank)	/* Out of order to pack trace record */
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +
> +		/* DRAM */
> +		__entry->phys_addr = le64_to_cpu(rec->phys_addr);
> +		__entry->descriptor = rec->descriptor;
> +		__entry->type = rec->type;
> +		__entry->transaction_type = rec->transaction_type;
> +		__entry->validity_flags = get_unaligned_le16(rec->validity_flags);
> +		__entry->channel = rec->channel;
> +		__entry->rank = rec->rank;
> +		__entry->nibble_mask = get_unaligned_le24(rec->nibble_mask);
> +		__entry->bank_group = rec->bank_group;
> +		__entry->bank = rec->bank;
> +		__entry->row = get_unaligned_le24(rec->row);
> +		__entry->column = get_unaligned_le16(rec->column);
> +		memcpy(__entry->cor_mask, &rec->correction_mask,
> +			CXL_EVENT_DER_CORRECTION_MASK_SIZE);
> +		memcpy(__entry->reserved, &rec->reserved,
> +			CXL_EVENT_DER_RES_SIZE);
> +	),
> +
> +	CXL_EVT_TP_printk("phys_addr=%llx volatile=%s desc='%s' type='%s' " \
> +		"trans_type='%s' channel=%u rank=%u nibble_mask=%x " \
> +		"bank_group=%u bank=%u row=%u column=%u cor_mask=%s " \
> +		"valid_flags='%s' reserved=%s",
> +		__entry->phys_addr & CXL_GMER_PHYS_ADDR_MASK,
> +		(__entry->phys_addr & CXL_GMER_PHYS_ADDR_VOLATILE) ? "TRUE" : "FALSE",
> +		show_event_desc_flags(__entry->descriptor),
> +		show_mem_event_type(__entry->type),
> +		show_trans_type(__entry->transaction_type),
> +		__entry->channel, __entry->rank, __entry->nibble_mask,
> +		__entry->bank_group, __entry->bank,
> +		__entry->row, __entry->column,
> +		__print_hex(__entry->cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE),
> +		show_dram_valid_flags(__entry->validity_flags),
> +		__print_hex(__entry->reserved, CXL_EVENT_DER_RES_SIZE)
> +		)
> +);
> +

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 07/11] cxl/mem: Trace Memory Module Event Record
  2022-10-10 22:41 ` [RFC V2 PATCH 07/11] cxl/mem: Trace Memory Module " ira.weiny
  2022-10-11 13:54   ` Jonathan Cameron
@ 2022-10-15 11:32   ` Steven Rostedt
  1 sibling, 0 replies; 36+ messages in thread
From: Steven Rostedt @ 2022-10-15 11:32 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, linux-kernel, linux-cxl

On Mon, 10 Oct 2022 15:41:27 -0700
ira.weiny@intel.com wrote:

> +TRACE_EVENT(memory_module,

Make sure all your new events have the "cxl_" prefix. "cxl_memory_module".

This goes for all events in this series.

Thanks,

-- Steve

> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_event_mem_module *rec),
> +
> +	TP_ARGS(dev_name, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +
> +		/* Memory Module Event */
> +		__field(u8, event_type)
> +
> +		/* Device Health Info */
> +		__field(u8, health_status)
> +		__field(u8, media_status)
> +		__field(u8, life_used)
> +		__field(u32, dirty_shutdown_cnt)
> +		__field(u32, cor_vol_err_cnt)
> +		__field(u32, cor_per_err_cnt)
> +		__field(s16, device_temp)
> +		__field(u8, add_status)
> +
> +		__array(u8, reserved, CXL_EVENT_MEM_MOD_RES_SIZE)
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +
> +		/* Memory Module Event */
> +		__entry->event_type = rec->event_type;
> +
> +		/* Device Health Info */
> +		__entry->health_status = rec->info.health_status;
> +		__entry->media_status = rec->info.media_status;
> +		__entry->life_used = rec->info.life_used;
> +		__entry->dirty_shutdown_cnt = get_unaligned_le32(rec->info.dirty_shutdown_cnt);
> +		__entry->cor_vol_err_cnt = get_unaligned_le32(rec->info.cor_vol_err_cnt);
> +		__entry->cor_per_err_cnt = get_unaligned_le32(rec->info.cor_per_err_cnt);
> +		__entry->device_temp = get_unaligned_le16(rec->info.device_temp);
> +		__entry->add_status = rec->info.add_status;
> +		memcpy(__entry->reserved, &rec->reserved,
> +			CXL_EVENT_MEM_MOD_RES_SIZE);
> +	),
> +
> +	CXL_EVT_TP_printk("evt_type='%s' health_status='%s' media_status='%s' " \
> +		"as_life_used=%s as_dev_temp=%s as_cor_vol_err_cnt=%s " \
> +		"as_cor_per_err_cnt=%s life_used=%u dev_temp=%d " \
> +		"dirty_shutdown_cnt=%u cor_vol_err_cnt=%u cor_per_err_cnt=%u " \
> +		"reserved=%s",
> +		show_dev_evt_type(__entry->event_type),
> +		show_health_status_flags(__entry->health_status),
> +		show_media_status(__entry->media_status),
> +		show_add_status(CXL_DHI_AS_LIFE_USED(__entry->add_status)),
> +		show_add_status(CXL_DHI_AS_DEV_TEMP(__entry->add_status)),
> +		show_add_status(CXL_DHI_AS_COR_VOL_ERR_CNT(__entry->add_status)),
> +		show_add_status(CXL_DHI_AS_COR_PER_ERR_CNT(__entry->add_status)),
> +		__entry->life_used, __entry->device_temp,
> +		__entry->dirty_shutdown_cnt, __entry->cor_vol_err_cnt,
> +		__entry->cor_per_err_cnt,
> +		__print_hex(__entry->reserved, CXL_EVENT_MEM_MOD_RES_SIZE)
> +		)
> +);
> +
> +

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 02/11] cxl/mem: Implement Get Event Records command
  2022-10-15 11:28   ` Steven Rostedt
@ 2022-10-16 21:43     ` Ira Weiny
  0 siblings, 0 replies; 36+ messages in thread
From: Ira Weiny @ 2022-10-16 21:43 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, linux-kernel, linux-cxl

On Sat, Oct 15, 2022 at 07:28:40AM -0400, Steven Rostedt wrote:
> On Mon, 10 Oct 2022 15:41:22 -0700
> ira.weiny@intel.com wrote:
> 
> > new file mode 100644
> > index 000000000000..318ba5fe046e
> > --- /dev/null
> > +++ b/include/trace/events/cxl.h
> > @@ -0,0 +1,130 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#undef TRACE_SYSTEM
> > +#define TRACE_SYSTEM cxl
> > +
> > +#if !defined(_CXL_TRACE_EVENTS_H) ||  defined(TRACE_HEADER_MULTI_READ)
> > +#define _CXL_TRACE_EVENTS_H
> > +
> > +#include <asm-generic/unaligned.h>
> > +#include <linux/tracepoint.h>
> > +
> > +TRACE_EVENT(overflow,
> 
> Please do not use generic names for grouped events. As most tooling can
> use the name and not associate it with the group, it makes it ambiguous
> for what event it wants to enable.
> 
> That is, call it "cxl_overflow" and not "overflow".

I did not realize that.[1]

[1] https://lore.kernel.org/linux-cxl/67c4e0da-87e3-44a1-1902-220dbc312700@intel.com/

> 
> > +
> > +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> > +		 struct cxl_get_event_payload *payload),
> > +
> > +	TP_ARGS(dev_name, log, payload),
> > +
> > +	TP_STRUCT__entry(
> > +		__string(dev_name, dev_name)
> > +		__field(int, log)
> > +		__field(u64, first)
> > +		__field(u64, last)
> > +		__field(u16, count)
> > +	),
> > +
> > +	TP_fast_assign(
> > +		__assign_str(dev_name, dev_name);
> > +		__entry->log = log;
> > +		__entry->count = le16_to_cpu(payload->overflow_err_count);
> > +		__entry->first = le64_to_cpu(payload->first_overflow_timestamp);
> > +		__entry->last = le64_to_cpu(payload->last_overflow_timestamp);
> > +	),
> > +
> > +	TP_printk("%s: EVENT LOG OVERFLOW log=%s : %u records from %llu to %llu",
> > +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),
> > +		__entry->count, __entry->first, __entry->last)
> > +
> > +);
> > +
> 
> > +TRACE_EVENT(generic_event,
> 
> Same here. It's a "cxl_generic_event" not a "generic_event" that will
> clash with any other subsystem that would (but shouldn't) use the same
> name.

Sure I'll change it back.

Thanks for setting us straight,
Ira

> 
> -- Steve
> 
> > +
> > +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> > +		 struct cxl_event_record_raw *rec),
> > +
> > +	TP_ARGS(dev_name, log, rec),
> > +
> > +	TP_STRUCT__entry(
> > +		CXL_EVT_TP_entry
> > +		__array(u8, data, CXL_EVENT_RECORD_DATA_LENGTH)
> > +	),
> > +
> > +	TP_fast_assign(
> > +		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> > +		memcpy(__entry->data, &rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
> > +	),
> > +
> > +	CXL_EVT_TP_printk("%s",
> > +		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
> > +);
> > +
> > +#endif /* _CXL_TRACE_EVENTS_H */
> > 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 05/11] cxl/mem: Trace General Media Event Record
  2022-10-14 23:33     ` Ira Weiny
@ 2022-10-17 16:37       ` Jonathan Cameron
  2022-10-17 17:21         ` Steven Rostedt
  0 siblings, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2022-10-17 16:37 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, linux-kernel, linux-cxl

> >   
> > > diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> > > index 318ba5fe046e..82a8d3b750a2 100644
> > > --- a/include/trace/events/cxl.h
> > > +++ b/include/trace/events/cxl.h
> > > @@ -122,6 +122,114 @@ TRACE_EVENT(generic_event,  
> > 
> >   
> > > +#define CXL_GMER_VALID_CHANNEL				BIT(0)
> > > +#define CXL_GMER_VALID_RANK				BIT(1)
> > > +#define CXL_GMER_VALID_DEVICE				BIT(2)
> > > +#define CXL_GMER_VALID_COMPONENT			BIT(3)
> > > +#define show_valid_flags(flags)	__print_flags(flags, "|",		   \
> > > +	{ CXL_GMER_VALID_CHANNEL,			"CHANNEL"	}, \
> > > +	{ CXL_GMER_VALID_RANK,				"RANK"		}, \
> > > +	{ CXL_GMER_VALID_DEVICE,			"DEVICE"	}, \
> > > +	{ CXL_GMER_VALID_COMPONENT,			"COMPONENT"	}  \
> > > +)  
> > 
> > I'd still rather we only had the TP_printk only print those parts that are valid
> > rather than the mess of having to go check the validity bit before deciding whether
> > to take notice of the field.  But meh, not that important given thats
> > not the main intended way to consume this data.
> >   
> 
> Ok I spent some time really thinking about this and below is an attempt at
> that.
> 
> However, I must be missing something in what you are proposing because I don't
> like having extra space in the trace buffer to print into like this and I
> can't figure out where else to put a print buffer.

Putting the space in the trace buffer definitely doesn't makes sense.
Ah. Looking back the CCIX code I assumed it was serialized so could use a
static buffer.

Looking at other similar cases though and we have a lot of use
of trace_seq_printf() e.g. libata_trace_parse_status() though note
there is some magic macro stuff in include/trace/events/libata.h 
to tie that together.
https://elixir.bootlin.com/linux/latest/source/drivers/ata/libata-trace.c#L14

That seems to get you access to the actual buffer we are printing into
in similar cases.

> 
> Also note that this will have no impact on most of the user space tools which
> use libtracefs as they will see all the fields and will need to do a similar
> decode.
> 
> Do you really think this is worth the effort?

With the allocation problem, definitely not. With that solved, it's not a huge amount
of extra code.  Also rather nicely 'documents' meaning of the valid bits.

I'm a kernel hacker. I like to not need much beyond echo and cat :)

Jonathan

> 
> Ira
> 
> 
> commit 54c4ee67bcac6a38cbc9b0ea2ef952e197356dee
> Author: Ira Weiny <ira.weiny@intel.com>
> Date:   Fri Oct 14 16:15:27 2022 -0700
> 
>     squash: Attempt to print only valid fields per Jonathan suggestion
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 2add473fc168..9e15028af79c 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -720,6 +720,28 @@ static const uuid_t gen_media_event_uuid =
>         UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
>                   0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
>  
> +const char *cxl_gem_print_valid(u8 *buf, int nr, u16 valid_flags, u8 channel,
> +                               u8 rank, u32 device, u8 *comp_id)
> +{
> +       char *str = buf;
> +       int rc = 0;
> +
> +       if (valid_flags & CXL_GMER_VALID_CHANNEL)
> +               rc = snprintf(str, nr, "channel=%u ", channel);
> +
> +       if (valid_flags & CXL_GMER_VALID_RANK)
> +               rc = snprintf(str + rc, nr - rc, "rank=%u ", rank);
> +
> +       if (valid_flags & CXL_GMER_VALID_DEVICE)
> +               rc = snprintf(str + rc, nr - rc, "device=%u ", device);
> +
> +       if (valid_flags & CXL_GMER_VALID_COMPONENT)
> +               rc = snprintf(str + rc, nr - rc, "comp_id=%*ph ",
> +                             CXL_EVENT_GEN_MED_COMP_ID_SIZE, comp_id);
> +
> +       return str;
> +}
> +
>  static void cxl_trace_event_record(const char *dev_name,
>                                    enum cxl_event_log_type type,
>                                    struct cxl_get_event_payload *payload)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 8fbd20d8a0b2..3d3bfef69d32 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -429,6 +429,13 @@ struct cxl_event_gen_media {
>         u8 reserved[0x2e];
>  } __packed;
>  
> +#define CXL_GMER_VALID_CHANNEL                         BIT(0)
> +#define CXL_GMER_VALID_RANK                            BIT(1)
> +#define CXL_GMER_VALID_DEVICE                          BIT(2)
> +#define CXL_GMER_VALID_COMPONENT                       BIT(3)
> +const char *cxl_gem_print_valid(u8 *buf, int nr, u16 valid_flags, u8 channel,
> +                               u8 rank, u32 device, u8 *comp_id);
> +
>  struct cxl_mbox_get_partition_info {
>         __le64 active_volatile_cap;
>         __le64 active_persistent_cap;
> diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> index e3e11c9055ba..27bb790cb685 100644
> --- a/include/trace/events/cxl.h
> +++ b/include/trace/events/cxl.h
> @@ -161,16 +161,7 @@ TRACE_EVENT(generic_event,
>         { CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT,     "Internal Media Management" }   \
>  )
>  
> -#define CXL_GMER_VALID_CHANNEL                         BIT(0)
> -#define CXL_GMER_VALID_RANK                            BIT(1)
> -#define CXL_GMER_VALID_DEVICE                          BIT(2)
> -#define CXL_GMER_VALID_COMPONENT                       BIT(3)
> -#define show_valid_flags(flags)        __print_flags(flags, "|",                  \
> -       { CXL_GMER_VALID_CHANNEL,                       "CHANNEL"       }, \
> -       { CXL_GMER_VALID_RANK,                          "RANK"          }, \
> -       { CXL_GMER_VALID_DEVICE,                        "DEVICE"        }, \
> -       { CXL_GMER_VALID_COMPONENT,                     "COMPONENT"     }  \
> -)
> +#define CXL_VALID_PRINT_STR_LEN 512
>  
>  TRACE_EVENT(general_media,
>  
> @@ -191,6 +182,7 @@ TRACE_EVENT(general_media,
>                 __array(u8, comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE)
>                 __field(u16, validity_flags)
>                 __field(u8, rank) /* Out of order to pack trace record */
> +               __array(u8, str, CXL_VALID_PRINT_STR_LEN)
>         ),
>  
>         TP_fast_assign(
> @@ -209,17 +201,17 @@ TRACE_EVENT(general_media,
>                 __entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
>         ),
>  
> -       CXL_EVT_TP_printk("phys_addr=%llx volatile=%s desc='%s' type='%s' " \
> -               "trans_type='%s' channel=%u rank=%u device=%x comp_id=%s " \
> -               "valid_flags='%s'",
> +       CXL_EVT_TP_printk("phys_addr=%llx volatile=%s desc='%s' type='%s' "     \
> +               "trans_type='%s' %s",
>                 __entry->phys_addr & CXL_GMER_PHYS_ADDR_MASK,
>                 (__entry->phys_addr & CXL_GMER_PHYS_ADDR_VOLATILE) ? "TRUE" : "FALSE",
>                 show_event_desc_flags(__entry->descriptor),
>                 show_mem_event_type(__entry->type),
>                 show_trans_type(__entry->transaction_type),
> -               __entry->channel, __entry->rank, __entry->device,
> -               __print_hex(__entry->comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE),
> -               show_valid_flags(__entry->validity_flags)
> +               cxl_gem_print_valid(__entry->str, CXL_VALID_PRINT_STR_LEN,
> +                                   __entry->validity_flags, __entry->channel,
> +                                   __entry->rank, __entry->device,
> +                                   __entry->comp_id)
>         )
>  );
> 
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 05/11] cxl/mem: Trace General Media Event Record
  2022-10-17 16:37       ` Jonathan Cameron
@ 2022-10-17 17:21         ` Steven Rostedt
  2022-10-18  9:46           ` Jonathan Cameron
  0 siblings, 1 reply; 36+ messages in thread
From: Steven Rostedt @ 2022-10-17 17:21 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Ira Weiny, Dan Williams, Alison Schofield, Vishal Verma,
	Ben Widawsky, Davidlohr Bueso, linux-kernel, linux-cxl

On Mon, 17 Oct 2022 17:37:17 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> Looking at other similar cases though and we have a lot of use
> of trace_seq_printf() e.g. libata_trace_parse_status() though note
> there is some magic macro stuff in include/trace/events/libata.h 
> to tie that together.
> https://elixir.bootlin.com/linux/latest/source/drivers/ata/libata-trace.c#L14
> 
> That seems to get you access to the actual buffer we are printing into
> in similar cases.

Looking at the code you linked to, I wonder why __print_flags() wasn't used?

For instance, you have:

const char *
libata_trace_parse_status(struct trace_seq *p, unsigned char status)
{
        const char *ret = trace_seq_buffer_ptr(p);

        trace_seq_printf(p, "{ ");
        if (status & ATA_BUSY)
                trace_seq_printf(p, "BUSY ");
        if (status & ATA_DRDY)
                trace_seq_printf(p, "DRDY ");
        if (status & ATA_DF)
                trace_seq_printf(p, "DF ");
        if (status & ATA_DSC)
                trace_seq_printf(p, "DSC ");
        if (status & ATA_DRQ)
                trace_seq_printf(p, "DRQ ");
        if (status & ATA_CORR)
                trace_seq_printf(p, "CORR ");
        if (status & ATA_SENSE)
                trace_seq_printf(p, "SENSE ");
        if (status & ATA_ERR)
                trace_seq_printf(p, "ERR ");
        trace_seq_putc(p, '}');
        trace_seq_putc(p, 0);

        return ret;
}


Which is just a re-implementation of:

__print_flags(status, " ", 
	{ ATA_BUSY, "BUSY" },
	{ ATA_DRDY, "DRDY" },
	{ ATA_DF, "DF" },
	{ ATA_DSC, "DSC" },
	{ ATA_DRQ, "DRQ" },
	{ ATA_CORR, "CORR" },
	{ ATA_SENSE, "SENSE" },
	{ ATA_ERR, "ERR" })


The major difference between the two, is that libtraceevent will be able to
parse the above and convert the status bits into strings, whereas using
libata_trace_parse_status() will just give you a parsing error.

That is, perf and trace-cmd will not be able to parse it unless you write a
separate plugin for libtraceevent to do it but that means you'll have
duplicate code.

I know you just want echo and cat, but that will still work, and this will
make it work for the tooling as well.

-- Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 05/11] cxl/mem: Trace General Media Event Record
  2022-10-17 17:21         ` Steven Rostedt
@ 2022-10-18  9:46           ` Jonathan Cameron
  2022-10-21  5:13             ` Ira Weiny
  0 siblings, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2022-10-18  9:46 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ira Weiny, Dan Williams, Alison Schofield, Vishal Verma,
	Ben Widawsky, Davidlohr Bueso, linux-kernel, linux-cxl

On Mon, 17 Oct 2022 13:21:43 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Mon, 17 Oct 2022 17:37:17 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > Looking at other similar cases though and we have a lot of use
> > of trace_seq_printf() e.g. libata_trace_parse_status() though note
> > there is some magic macro stuff in include/trace/events/libata.h 
> > to tie that together.
> > https://elixir.bootlin.com/linux/latest/source/drivers/ata/libata-trace.c#L14
> > 
> > That seems to get you access to the actual buffer we are printing into
> > in similar cases.  
> 
> Looking at the code you linked to, I wonder why __print_flags() wasn't used?
> 
> For instance, you have:
> 
> const char *
> libata_trace_parse_status(struct trace_seq *p, unsigned char status)
> {
>         const char *ret = trace_seq_buffer_ptr(p);
> 
>         trace_seq_printf(p, "{ ");
>         if (status & ATA_BUSY)
>                 trace_seq_printf(p, "BUSY ");
>         if (status & ATA_DRDY)
>                 trace_seq_printf(p, "DRDY ");
>         if (status & ATA_DF)
>                 trace_seq_printf(p, "DF ");
>         if (status & ATA_DSC)
>                 trace_seq_printf(p, "DSC ");
>         if (status & ATA_DRQ)
>                 trace_seq_printf(p, "DRQ ");
>         if (status & ATA_CORR)
>                 trace_seq_printf(p, "CORR ");
>         if (status & ATA_SENSE)
>                 trace_seq_printf(p, "SENSE ");
>         if (status & ATA_ERR)
>                 trace_seq_printf(p, "ERR ");
>         trace_seq_putc(p, '}');
>         trace_seq_putc(p, 0);
> 
>         return ret;
> }
> 
> 
> Which is just a re-implementation of:
> 
> __print_flags(status, " ", 
> 	{ ATA_BUSY, "BUSY" },
> 	{ ATA_DRDY, "DRDY" },
> 	{ ATA_DF, "DF" },
> 	{ ATA_DSC, "DSC" },
> 	{ ATA_DRQ, "DRQ" },
> 	{ ATA_CORR, "CORR" },
> 	{ ATA_SENSE, "SENSE" },
> 	{ ATA_ERR, "ERR" })
> 
> 
> The major difference between the two, is that libtraceevent will be able to
> parse the above and convert the status bits into strings, whereas using
> libata_trace_parse_status() will just give you a parsing error.
> 
> That is, perf and trace-cmd will not be able to parse it unless you write a
> separate plugin for libtraceevent to do it but that means you'll have
> duplicate code.
> 
> I know you just want echo and cat, but that will still work, and this will
> make it work for the tooling as well.

Excellent point, though in the case we are interested in for CXL,
__print_flags() is not enough.

We have a mass of fields that only contain something useful to print if
the valid bits in a mask are set. I just pulled that example to
show how trace_seq_printf() could be used to achieve optional printing
as opposed to current situation where the reader of the print has
to interpret the mask to work out if fields contain anything useful.

To do something nice with them in perf (well probably ras-daemon in
this case) we'll have to parse the valid bits anyway so effectively
write such a plugin.  There we need to do a bunch of mangling to get
the events stored in a DB anyway, so this isn't a huge overhead.

Jonathan


> 
> -- Steve
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 02/11] cxl/mem: Implement Get Event Records command
  2022-10-10 22:41 ` [RFC V2 PATCH 02/11] cxl/mem: Implement Get Event Records command ira.weiny
  2022-10-11 12:39   ` Jonathan Cameron
  2022-10-15 11:28   ` Steven Rostedt
@ 2022-10-20 21:50   ` Smita Koralahalli
  2022-10-21  5:11     ` Ira Weiny
  2 siblings, 1 reply; 36+ messages in thread
From: Smita Koralahalli @ 2022-10-20 21:50 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Steven Rostedt, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, linux-kernel, linux-cxl,
	yazen.ghannam

Hi Ira,

On 10/10/22 5:41 PM, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
>
>
>   
> +static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> +				    enum cxl_event_log_type type)
> +{
> +	struct cxl_get_event_payload payload;
> +
> +	do {
> +		u8 log_type = type;
> +		int rc;
> +
> +		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVENT_RECORD,
> +				       &log_type, sizeof(log_type),
> +				       &payload, sizeof(payload));
> +		if (rc) {
> +			dev_err(cxlds->dev, "Event log '%s': Failed to query event records : %d",
> +				cxl_event_log_type_str(type), rc);
> +			return;
> +		}
> +
> +		if (le16_to_cpu(payload.record_count) == 1)
> +			trace_generic_event(dev_name(cxlds->dev), type,
> +					    &payload.record);
> +
> +		if (payload.flags & CXL_GET_EVENT_FLAG_OVERFLOW)
> +			trace_overflow(dev_name(cxlds->dev), type, &payload);
> +
> +	} while (payload.flags & CXL_GET_EVENT_FLAG_MORE_RECORDS);
> +}
> +
> +/**
> + * cxl_mem_get_event_records - Get Event Records from the device
> + * @cxlds: The device data for the operation
> + *
> + * Retrieve all event records available on the device and report them as trace
> + * events.
> + *
> + * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> + */
> +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> +{
> +	enum cxl_event_log_type log_type;
> +
> +	dev_dbg(cxlds->dev, "Reading event logs\n");
> +
> +	for (log_type = CXL_EVENT_TYPE_INFO;
> +	     log_type < CXL_EVENT_TYPE_MAX; log_type++)

Why should we loop through each event log here? What if the event
record doesn't exist in the event log?

I got some Mailbox error messages like this while bootup..
[  346.387010] cxl_pci 0000:7f:00.0: Sending command
[  346.387181] cxl_pci 0000:7f:00.0: Doorbell wait took 0ms
[  346.387197] cxl_pci 0000:7f:00.0: Mailbox operation had an error: cmd 
input was invalid
[  346.387205] cxl_pci 0000:7f:00.0: Event log 'Warning': Failed to 
query event records : -6
..

Can we just read the "Event Status" field from Event Status Register
(Device Status Registers Capability Offset + 00h) 8.2.8.3.1 in CXL Spec,
determine if the records exist and just query those event logs?

Thanks,
Smita

> +		cxl_mem_get_records_log(cxlds, log_type);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> +
>   /**
>    * cxl_mem_get_partition_info - Get partition info
>    * @cxlds: The device data for the operation

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 02/11] cxl/mem: Implement Get Event Records command
  2022-10-20 21:50   ` Smita Koralahalli
@ 2022-10-21  5:11     ` Ira Weiny
  0 siblings, 0 replies; 36+ messages in thread
From: Ira Weiny @ 2022-10-21  5:11 UTC (permalink / raw)
  To: Smita Koralahalli
  Cc: Dan Williams, Steven Rostedt, Alison Schofield, Vishal Verma,
	Ben Widawsky, Jonathan Cameron, Davidlohr Bueso, linux-kernel,
	linux-cxl, yazen.ghannam

On Thu, Oct 20, 2022 at 04:50:30PM -0500, Smita Koralahalli wrote:
> Hi Ira,
> 
> On 10/10/22 5:41 PM, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > 

[snip]

> > +
> > +/**
> > + * cxl_mem_get_event_records - Get Event Records from the device
> > + * @cxlds: The device data for the operation
> > + *
> > + * Retrieve all event records available on the device and report them as trace
> > + * events.
> > + *
> > + * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> > + */
> > +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> > +{
> > +	enum cxl_event_log_type log_type;
> > +
> > +	dev_dbg(cxlds->dev, "Reading event logs\n");
> > +
> > +	for (log_type = CXL_EVENT_TYPE_INFO;
> > +	     log_type < CXL_EVENT_TYPE_MAX; log_type++)
> 
> Why should we loop through each event log here?

The idea was to clear all the event logs.  I think this made more sense before
the addition of the optional dynamic capacity log.

>
> What if the event
> record doesn't exist in the event log?

An empty log does not cause any issue.  The query will simply return 0 records
which is valid.

> 
> I got some Mailbox error messages like this while bootup..
> [  346.387010] cxl_pci 0000:7f:00.0: Sending command
> [  346.387181] cxl_pci 0000:7f:00.0: Doorbell wait took 0ms
> [  346.387197] cxl_pci 0000:7f:00.0: Mailbox operation had an error: cmd
> input was invalid
> [  346.387205] cxl_pci 0000:7f:00.0: Event log 'Warning': Failed to query
> event records : -6
> ..
> 
> Can we just read the "Event Status" field from Event Status Register
> (Device Status Registers Capability Offset + 00h) 8.2.8.3.1 in CXL Spec,
> determine if the records exist and just query those event logs?

Likely the hardware does not have the dynamic capacity log and so the code is
asking for something invalid.  I did not think of that when I added that new
log.  Checking status register looks to be the proper solution.

I'll throw in some testing in QEMU for this.  I'll also have to implement the
status register in QEMU to fully test.

Thanks for the testing!  :-D

Ira

> 
> Thanks,
> Smita
> 
> > +		cxl_mem_get_records_log(cxlds, log_type);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> > +
> >   /**
> >    * cxl_mem_get_partition_info - Get partition info
> >    * @cxlds: The device data for the operation

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC V2 PATCH 05/11] cxl/mem: Trace General Media Event Record
  2022-10-18  9:46           ` Jonathan Cameron
@ 2022-10-21  5:13             ` Ira Weiny
  0 siblings, 0 replies; 36+ messages in thread
From: Ira Weiny @ 2022-10-21  5:13 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Steven Rostedt, Dan Williams, Alison Schofield, Vishal Verma,
	Ben Widawsky, Davidlohr Bueso, linux-kernel, linux-cxl

On Tue, Oct 18, 2022 at 10:46:36AM +0100, Jonathan Cameron wrote:
> On Mon, 17 Oct 2022 13:21:43 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Mon, 17 Oct 2022 17:37:17 +0100
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> > 
> > > Looking at other similar cases though and we have a lot of use
> > > of trace_seq_printf() e.g. libata_trace_parse_status() though note
> > > there is some magic macro stuff in include/trace/events/libata.h 
> > > to tie that together.
> > > https://elixir.bootlin.com/linux/latest/source/drivers/ata/libata-trace.c#L14
> > > 
> > > That seems to get you access to the actual buffer we are printing into
> > > in similar cases.  
> > 
> > Looking at the code you linked to, I wonder why __print_flags() wasn't used?
> > 
> > For instance, you have:
> > 
> > const char *
> > libata_trace_parse_status(struct trace_seq *p, unsigned char status)
> > {
> >         const char *ret = trace_seq_buffer_ptr(p);
> > 
> >         trace_seq_printf(p, "{ ");
> >         if (status & ATA_BUSY)
> >                 trace_seq_printf(p, "BUSY ");
> >         if (status & ATA_DRDY)
> >                 trace_seq_printf(p, "DRDY ");
> >         if (status & ATA_DF)
> >                 trace_seq_printf(p, "DF ");
> >         if (status & ATA_DSC)
> >                 trace_seq_printf(p, "DSC ");
> >         if (status & ATA_DRQ)
> >                 trace_seq_printf(p, "DRQ ");
> >         if (status & ATA_CORR)
> >                 trace_seq_printf(p, "CORR ");
> >         if (status & ATA_SENSE)
> >                 trace_seq_printf(p, "SENSE ");
> >         if (status & ATA_ERR)
> >                 trace_seq_printf(p, "ERR ");
> >         trace_seq_putc(p, '}');
> >         trace_seq_putc(p, 0);
> > 
> >         return ret;
> > }
> > 
> > 
> > Which is just a re-implementation of:
> > 
> > __print_flags(status, " ", 
> > 	{ ATA_BUSY, "BUSY" },
> > 	{ ATA_DRDY, "DRDY" },
> > 	{ ATA_DF, "DF" },
> > 	{ ATA_DSC, "DSC" },
> > 	{ ATA_DRQ, "DRQ" },
> > 	{ ATA_CORR, "CORR" },
> > 	{ ATA_SENSE, "SENSE" },
> > 	{ ATA_ERR, "ERR" })
> > 
> > 
> > The major difference between the two, is that libtraceevent will be able to
> > parse the above and convert the status bits into strings, whereas using
> > libata_trace_parse_status() will just give you a parsing error.
> > 
> > That is, perf and trace-cmd will not be able to parse it unless you write a
> > separate plugin for libtraceevent to do it but that means you'll have
> > duplicate code.
> > 
> > I know you just want echo and cat, but that will still work, and this will
> > make it work for the tooling as well.
> 
> Excellent point, though in the case we are interested in for CXL,
> __print_flags() is not enough.
> 
> We have a mass of fields that only contain something useful to print if
> the valid bits in a mask are set. I just pulled that example to
> show how trace_seq_printf() could be used to achieve optional printing
> as opposed to current situation where the reader of the print has
> to interpret the mask to work out if fields contain anything useful.
> 
> To do something nice with them in perf (well probably ras-daemon in
> this case) we'll have to parse the valid bits anyway so effectively
> write such a plugin.  There we need to do a bunch of mangling to get
> the events stored in a DB anyway, so this isn't a huge overhead.

Given this information I think I'm going to punt on this and take your reviewed
by on the code as it is.

We can certainly try to change it later but for now I think it serves our
purpose.  Better to focus on getting the code working with irq's.

Ira

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2022-10-21  5:14 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-10 22:41 [RFC V2 PATCH 00/11] CXL: Process event logs ira.weiny
2022-10-10 22:41 ` [RFC V2 PATCH 01/11] cxl/mbox: Add debug of hardware error code ira.weiny
2022-10-11 10:41   ` Jonathan Cameron
2022-10-14 16:29   ` Davidlohr Bueso
2022-10-14 16:31     ` Davidlohr Bueso
2022-10-14 17:00       ` Ira Weiny
2022-10-10 22:41 ` [RFC V2 PATCH 02/11] cxl/mem: Implement Get Event Records command ira.weiny
2022-10-11 12:39   ` Jonathan Cameron
2022-10-14 19:21     ` Ira Weiny
2022-10-15 11:28   ` Steven Rostedt
2022-10-16 21:43     ` Ira Weiny
2022-10-20 21:50   ` Smita Koralahalli
2022-10-21  5:11     ` Ira Weiny
2022-10-10 22:41 ` [RFC V2 PATCH 03/11] cxl/mem: Implement Clear " ira.weiny
2022-10-10 22:41 ` [RFC V2 PATCH 04/11] cxl/mem: Clear events on driver load ira.weiny
2022-10-11 12:42   ` Jonathan Cameron
2022-10-10 22:41 ` [RFC V2 PATCH 05/11] cxl/mem: Trace General Media Event Record ira.weiny
2022-10-11 12:57   ` Jonathan Cameron
2022-10-14 23:33     ` Ira Weiny
2022-10-17 16:37       ` Jonathan Cameron
2022-10-17 17:21         ` Steven Rostedt
2022-10-18  9:46           ` Jonathan Cameron
2022-10-21  5:13             ` Ira Weiny
2022-10-15 11:30   ` Steven Rostedt
2022-10-10 22:41 ` [RFC V2 PATCH 06/11] cxl/mem: Trace DRAM " ira.weiny
2022-10-11 13:47   ` Jonathan Cameron
2022-10-14 23:45     ` Ira Weiny
2022-10-15 11:31   ` Steven Rostedt
2022-10-10 22:41 ` [RFC V2 PATCH 07/11] cxl/mem: Trace Memory Module " ira.weiny
2022-10-11 13:54   ` Jonathan Cameron
2022-10-15 11:32   ` Steven Rostedt
2022-10-10 22:41 ` [RFC V2 PATCH 08/11] cxl/test: Add generic mock events ira.weiny
2022-10-10 22:41 ` [RFC V2 PATCH 09/11] cxl/test: Add specific events ira.weiny
2022-10-10 22:41 ` [RFC V2 PATCH 10/11] cxl/test: Simulate event log overflow ira.weiny
2022-10-10 22:41 ` [RFC V2 PATCH 11/11] cxl/mem: Wire up event interrupts ira.weiny
2022-10-12 18:01   ` Davidlohr Bueso

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.