linux-kernel.vger.kernel.org archive mirror
* [PATCH V3 0/8] CXL: Process event logs
@ 2022-12-08  5:21 ira.weiny
  2022-12-08  5:21 ` [PATCH V3 1/8] cxl/mem: Read, trace, and clear events on driver load ira.weiny
                   ` (7 more replies)
  0 siblings, 8 replies; 25+ messages in thread
From: ira.weiny @ 2022-12-08  5:21 UTC
  To: Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

This code has been tested with a newer qemu which allows for more events to be
returned at a time as well as an additional QMP event and interrupt injection.
Those patches will follow once they have been cleaned up.

The series is now in 3 parts:

	1) Base functionality including interrupts
	2) Tracing specific events (Dynamic Capacity Event Record is deferred)
	3) cxl-test infrastructure for basic tests

Changes from V2
	Rebased on pending 6.3 changes
		CXL security patches from Dave J.
		Moving tracing to cxl core from Dan
	Feedback from Dan, Steven, Jonathan, and Dave.
	The series looks very different now with a lot of the patches squashed
	per Dan's feedback.

- Link to v2: https://lore.kernel.org/r/20221201002719.2596558-1-ira.weiny@intel.com

Changes from V1
	Address comments from Jonathan, Dave, and Alison
		Main comment was to allow for a full payload size number of
		event records to be processed on each Get event cycle.
	Pick up tags

Changes from RFC v2
	Integrated Davidlohr's irq patch, allocate up to 16 vectors, and base
		my irq support on modifications to that patch.
	Smita
		Check event status before reading each log.
	Jonathan
		Process more than 1 record at a time
		Remove reserved fields
	Steven
		Prefix trace points with 'cxl_'
	Davidlohr
		Pull in his patch

Changes from RFC v1
	Add event irqs
	General simplification of the code.
	Resolve field alignment questions
	Update to rev 3.0 for comments and structures
	Add reserved fields and output them

Davidlohr Bueso (1):
  cxl/mem: Wire up event interrupts

Ira Weiny (7):
  cxl/mem: Read, trace, and clear events on driver load
  cxl/mem: Trace General Media Event Record
  cxl/mem: Trace DRAM Event Record
  cxl/mem: Trace Memory Module Event Record
  cxl/test: Add generic mock events
  cxl/test: Add specific events
  cxl/test: Simulate event log overflow

 drivers/acpi/pci_root.c         |   3 +
 drivers/cxl/core/mbox.c         | 233 ++++++++++++++++
 drivers/cxl/core/trace.h        | 479 ++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h               |  12 +
 drivers/cxl/cxlmem.h            | 180 ++++++++++++
 drivers/cxl/cxlpci.h            |   6 +
 drivers/cxl/pci.c               | 130 +++++++++
 include/linux/pci.h             |   1 +
 tools/testing/cxl/test/Kbuild   |   4 +-
 tools/testing/cxl/test/events.c | 314 +++++++++++++++++++++
 tools/testing/cxl/test/events.h |  34 +++
 tools/testing/cxl/test/mem.c    |  33 ++-
 tools/testing/cxl/test/mock.h   |  12 +
 13 files changed, 1429 insertions(+), 12 deletions(-)
 create mode 100644 tools/testing/cxl/test/events.c
 create mode 100644 tools/testing/cxl/test/events.h


base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e
-- 
2.37.2



* [PATCH V3 1/8] cxl/mem: Read, trace, and clear events on driver load
  2022-12-08  5:21 [PATCH V3 0/8] CXL: Process event logs ira.weiny
@ 2022-12-08  5:21 ` ira.weiny
  2022-12-09 17:56   ` Dan Williams
  2022-12-08  5:21 ` [PATCH V3 2/8] cxl/mem: Wire up event interrupts ira.weiny
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: ira.weiny @ 2022-12-08  5:21 UTC
  To: Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL devices have multiple event logs which can be queried for CXL event
records.  Devices are required to support the storage of at least one
event record in each event log type.

Devices track event log overflow by incrementing a counter and tracking
the time of the first and last overflow event seen.

Software queries events via the Get Event Records mailbox command (CXL
rev 3.0 section 8.2.9.2.2) and clears them via the Clear Event Records
mailbox command (CXL rev 3.0 section 8.2.9.2.3).

CXL _OSC Error Reporting Control is used by the OS to determine if
Firmware has control of various error reporting capabilities including
the event logs.

Expose the result of negotiating CXL Error Reporting Control in struct
pci_host_bridge for consumption by the CXL drivers.  If support is
controlled by the OS, read and clear all event logs on driver load.

Ensure a clean slate of events by reading and clearing the events on
driver load.  The operation is performed twice to ensure that any prior
partial readings are completed and a fresh read from the start is done
at least one time.  This is done even if rogue reads cause clear errors.

The status register is not used because a device may continue to trigger
events and the only requirement is to empty the log at least once.  This
allows for the required transition from empty to non-empty for interrupt
generation.  Handling of interrupts is in a follow on patch.

The device can return up to 1MB worth of event records per query.
Allocate a shared large buffer to handle the max number of records based
on the mailbox payload size.
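
As a rough illustration of that sizing (a userspace sketch, not code from
this patch), the record count per query follows from the structure sizes
defined in cxlmem.h below: a 32 byte Get Event Records header followed by
128 byte records.  The helper name max_records_per_query() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes taken from the packed structures this patch adds to cxlmem.h. */
#define EVENT_RECORD_HDR_SIZE	48u	/* struct cxl_event_record_hdr */
#define EVENT_RECORD_DATA_SIZE	0x50u	/* CXL_EVENT_RECORD_DATA_LENGTH */
#define EVENT_RECORD_SIZE	(EVENT_RECORD_HDR_SIZE + EVENT_RECORD_DATA_SIZE)
#define GET_EVENT_HDR_SIZE	32u	/* cxl_get_event_payload before records[] */

static_assert(EVENT_RECORD_SIZE == 128, "raw event record is 128 bytes");

/*
 * Hypothetical helper: number of whole event records that fit in one
 * Get Event Records response of payload_size bytes.
 */
static size_t max_records_per_query(size_t payload_size)
{
	if (payload_size < GET_EVENT_HDR_SIZE)
		return 0;
	return (payload_size - GET_EVENT_HDR_SIZE) / EVENT_RECORD_SIZE;
}
```

Under these assumptions a 1MB mailbox payload holds 8191 records per query,
while a 256 byte payload holds only one.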

This patch traces a raw event record and leaves specific event record
type tracing to subsequent patches.  Macros are created to aid in
tracing the common CXL Event header fields.

Each record is cleared explicitly.  A clear all bit is specified but is
only valid when the log overflows.
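
Because records are cleared by handle and Clear Event Records counts
handles in a u8, one Get Event Records response may need several clear
commands.  The following userspace sketch (hypothetical helper names, not
the driver code) shows the arithmetic, assuming the 6 byte clear payload
header and 2 byte handles defined in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLEAR_MAX_HANDLES	255u	/* nr_recs is a u8 */
#define CLEAR_HDR_SIZE		6u	/* event_log, clear_flags, nr_recs, reserved[3] */

/* Hypothetical: handles that fit in one clear command of payload_size bytes. */
static unsigned int clear_handles_per_cmd(size_t payload_size)
{
	size_t fit;

	if (payload_size <= CLEAR_HDR_SIZE)
		return 0;
	fit = (payload_size - CLEAR_HDR_SIZE) / sizeof(uint16_t);
	return fit < CLEAR_MAX_HANDLES ? (unsigned int)fit : CLEAR_MAX_HANDLES;
}

/* Hypothetical: clear commands needed to clear total records. */
static unsigned int clear_cmds_needed(unsigned int total, size_t payload_size)
{
	unsigned int per_cmd = clear_handles_per_cmd(payload_size);

	/* A payload too small for even one handle cannot clear anything. */
	assert(per_cmd > 0);
	return (total + per_cmd - 1) / per_cmd;
}
```

With a 256 byte mailbox payload at most 125 handles fit per command; with a
1MB payload the u8 count caps each command at 255 handles, so a full 8191
record response would take 33 clear commands.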

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V2:
	Rebased on 6.3 pending changes
	Move cxl_mem_alloc_event_buf() to pci.c
	Define and use CXLDEV_EVENT_STATUS_ALL
	Fix error flow on clear failure
	Remove tags
	Jonathan/Dan
		Add in OSC Error Reporting Control check
	Dan (Jonathan in previous version)
		Squash Clear events and the driver load patch into this one.
	Dan
		Make event device status a separate structure
		Move tracing to within cxl core
		Reduce clear double loop to a single loop
		Pass struct device to trace points
		Adjust to new cxl_internal_send_cmd()
		Query error logs in order of severity fatal -> Info
		Remove uapi defines entirely
		pass total via get_pl
		fix 'Clearning' spelling
		Better clarify event_buf singular allocation
		Use decimal for command payload array sizes
		Remove trace_*_enabled() optimization
		Put GET/CLEAR macros at the end of the user enum to
		preserve compatibility
		Add Get/Clear Events to kernel exclusive commands
		Remove cxl_event_log_type_str() outside of tracing
		Add cond_resched() to event log processing
	Jonathan
		s/event_buf_lock/event_log_lock
		Read through all logs two times to ensure partial reads are
			covered.
		Pass buffer to cxl_mem_free_event_buffer()
		kdoc for event buf
		Account for cxlds->payload_size limiting the max handles
		Don't use min_t as it was used incorrectly

Changes from V1:
	Clear Event Record allows for u8 handles while Get Event Record
	allows for u16 records to be returned.  Based on Jonathan's
	feedback; allow for all event records to be handled in this
	clear.  Which means a double loop with potentially multiple
	Clear Event payloads being sent to clear all events sent.

Changes from RFC:
	Jonathan
		Clean up init of payload and use return code.
		Also report any error to clear the event.
		s/v3.0/rev 3.0

squash: make event device state a separate structure.
---
 drivers/acpi/pci_root.c  |   3 +
 drivers/cxl/core/mbox.c  | 138 +++++++++++++++++++++++++++++++++++++++
 drivers/cxl/core/trace.h | 120 ++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h        |  12 ++++
 drivers/cxl/cxlmem.h     |  84 ++++++++++++++++++++++++
 drivers/cxl/pci.c        |  42 ++++++++++++
 include/linux/pci.h      |   1 +
 7 files changed, 400 insertions(+)

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index b3c202d2a433..cee8f56fb63a 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -1047,6 +1047,9 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
 	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
 		host_bridge->native_dpc = 0;
 
+	if (root->osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL)
+		host_bridge->native_cxl_error = 1;
+
 	/*
 	 * Evaluate the "PCI Boot Configuration" _DSM Function.  If it
 	 * exists and returns 0, we must preserve any PCI resource
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index b03fba212799..815da3aac081 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -8,6 +8,7 @@
 #include <cxl.h>
 
 #include "core.h"
+#include "trace.h"
 
 static bool cxl_raw_allow_all;
 
@@ -717,6 +718,142 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
+static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
+				  enum cxl_event_log_type log,
+				  struct cxl_get_event_payload *get_pl)
+{
+	struct cxl_mbox_clear_event_payload payload = {
+		.event_log = log,
+	};
+	u16 total = le16_to_cpu(get_pl->record_count);
+	u8 max_handles = CXL_CLEAR_EVENT_MAX_HANDLES;
+	size_t pl_size = sizeof(payload);
+	struct cxl_mbox_cmd mbox_cmd;
+	u16 cnt;
+	int rc;
+	int i;
+
+	/* Payload size may limit the max handles */
+	if (pl_size > cxlds->payload_size) {
+		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
+		pl_size = cxlds->payload_size;
+	}
+
+	mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
+		.payload_in = &payload,
+		.size_in = pl_size,
+	};
+
+	/*
+	 * Clear Event Records uses u8 for the handle cnt while Get Event
+	 * Record can return up to 0xffff records.
+	 */
+	i = 0;
+	for (cnt = 0; cnt < total; cnt++) {
+		payload.handle[i++] = get_pl->records[cnt].hdr.handle;
+		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
+			log, le16_to_cpu(payload.handle[i - 1]));
+
+		if (i == max_handles) {
+			payload.nr_recs = i;
+			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+			if (rc)
+				return rc;
+			i = 0;
+		}
+	}
+
+	/* Clear what is left if any */
+	if (i) {
+		payload.nr_recs = i;
+		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+		if (rc)
+			return rc;
+	}
+
+	return 0;
+}
+
+static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
+				    enum cxl_event_log_type type)
+{
+	struct cxl_get_event_payload *payload;
+	struct cxl_mbox_cmd mbox_cmd;
+	u8 log_type = type;
+	u16 nr_rec;
+
+	mutex_lock(&cxlds->event.log_lock);
+	payload = cxlds->event.buf;
+
+	mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_GET_EVENT_RECORD,
+		.payload_in = &log_type,
+		.size_in = sizeof(log_type),
+		.payload_out = payload,
+		.size_out = cxlds->payload_size,
+		.min_out = struct_size(payload, records, 0),
+	};
+
+	do {
+		int rc;
+
+		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+		if (rc) {
+			dev_err(cxlds->dev, "Event log '%d': Failed to query event records : %d",
+				type, rc);
+			goto unlock_buffer;
+		}
+
+		nr_rec = le16_to_cpu(payload->record_count);
+		if (nr_rec > 0) {
+			int i;
+
+			for (i = 0; i < nr_rec; i++)
+				trace_cxl_generic_event(cxlds->dev, type,
+							&payload->records[i]);
+
+			rc = cxl_clear_event_record(cxlds, type, payload);
+			if (rc) {
+				dev_err(cxlds->dev, "Event log '%d': Failed to clear events : %d",
+					type, rc);
+				goto unlock_buffer;
+			}
+		}
+
+		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
+			trace_cxl_overflow(cxlds->dev, type, payload);
+	} while (nr_rec);
+
+unlock_buffer:
+	mutex_unlock(&cxlds->event.log_lock);
+}
+
+/**
+ * cxl_mem_get_event_records - Get Event Records from the device
+ * @cxlds: The device data for the operation
+ *
+ * Retrieve all event records available on the device, report them as trace
+ * events, and clear them.
+ *
+ * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
+ * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
+ */
+void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status)
+{
+	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
+
+	if (status & CXLDEV_EVENT_STATUS_FATAL)
+		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
+	if (status & CXLDEV_EVENT_STATUS_FAIL)
+		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
+	if (status & CXLDEV_EVENT_STATUS_WARN)
+		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
+	if (status & CXLDEV_EVENT_STATUS_INFO)
+		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
+
 /**
  * cxl_mem_get_partition_info - Get partition info
  * @cxlds: The device data for the operation
@@ -868,6 +1005,7 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
 	}
 
 	mutex_init(&cxlds->mbox_mutex);
+	mutex_init(&cxlds->event.log_lock);
 	cxlds->dev = dev;
 
 	return cxlds;
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index 20ca2fe2ca8e..24eef6909f13 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -6,7 +6,9 @@
 #if !defined(_CXL_EVENTS_H) || defined(TRACE_HEADER_MULTI_READ)
 #define _CXL_EVENTS_H
 
+#include <asm-generic/unaligned.h>
 #include <cxl.h>
+#include <cxlmem.h>
 #include <linux/tracepoint.h>
 
 #define CXL_RAS_UC_CACHE_DATA_PARITY	BIT(0)
@@ -103,6 +105,124 @@ TRACE_EVENT(cxl_aer_correctable_error,
 	)
 );
 
+#include <linux/tracepoint.h>
+
+#define cxl_event_log_type_str(type)				\
+	__print_symbolic(type,					\
+		{ CXL_EVENT_TYPE_INFO, "Informational" },	\
+		{ CXL_EVENT_TYPE_WARN, "Warning" },		\
+		{ CXL_EVENT_TYPE_FAIL, "Failure" },		\
+		{ CXL_EVENT_TYPE_FATAL, "Fatal" })
+
+TRACE_EVENT(cxl_overflow,
+
+	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
+		 struct cxl_get_event_payload *payload),
+
+	TP_ARGS(dev, log, payload),
+
+	TP_STRUCT__entry(
+		__string(dev_name, dev_name(dev))
+		__field(int, log)
+		__field(u64, first_ts)
+		__field(u64, last_ts)
+		__field(u16, count)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev_name, dev_name(dev));
+		__entry->log = log;
+		__entry->count = le16_to_cpu(payload->overflow_err_count);
+		__entry->first_ts = le64_to_cpu(payload->first_overflow_timestamp);
+		__entry->last_ts = le64_to_cpu(payload->last_overflow_timestamp);
+	),
+
+	TP_printk("%s: EVENT LOG OVERFLOW log=%s : %u records from %llu to %llu",
+		__get_str(dev_name), cxl_event_log_type_str(__entry->log),
+		__entry->count, __entry->first_ts, __entry->last_ts)
+
+);
+
+/*
+ * Common Event Record Format
+ * CXL 3.0 section 8.2.9.2.1; Table 8-42
+ */
+#define CXL_EVENT_RECORD_FLAG_PERMANENT		BIT(2)
+#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED	BIT(3)
+#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED	BIT(4)
+#define CXL_EVENT_RECORD_FLAG_HW_REPLACE	BIT(5)
+#define show_hdr_flags(flags)	__print_flags(flags, " | ",			   \
+	{ CXL_EVENT_RECORD_FLAG_PERMANENT,	"PERMANENT_CONDITION"		}, \
+	{ CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,	"MAINTENANCE_NEEDED"		}, \
+	{ CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,	"PERFORMANCE_DEGRADED"		}, \
+	{ CXL_EVENT_RECORD_FLAG_HW_REPLACE,	"HARDWARE_REPLACEMENT_NEEDED"	}  \
+)
+
+/*
+ * Define macros for the common header of each CXL event.
+ *
+ * Tracepoints using these macros must do 3 things:
+ *
+ *	1) Add CXL_EVT_TP_entry to TP_STRUCT__entry
+ *	2) Use CXL_EVT_TP_fast_assign within TP_fast_assign;
+ *	   pass the dev, log, and CXL event header
+ *	3) Use CXL_EVT_TP_printk() instead of TP_printk()
+ *
+ * See the generic_event tracepoint as an example.
+ */
+#define CXL_EVT_TP_entry					\
+	__string(dev_name, dev_name(dev))			\
+	__field(int, log)					\
+	__field_struct(uuid_t, hdr_uuid)			\
+	__field(u32, hdr_flags)					\
+	__field(u16, hdr_handle)				\
+	__field(u16, hdr_related_handle)			\
+	__field(u64, hdr_timestamp)				\
+	__field(u8, hdr_length)					\
+	__field(u8, hdr_maint_op_class)
+
+#define CXL_EVT_TP_fast_assign(dev, l, hdr)					\
+	__assign_str(dev_name, dev_name(dev));					\
+	__entry->log = (l);							\
+	memcpy(&__entry->hdr_uuid, &(hdr).id, sizeof(uuid_t));			\
+	__entry->hdr_length = (hdr).length;					\
+	__entry->hdr_flags = get_unaligned_le24((hdr).flags);			\
+	__entry->hdr_handle = le16_to_cpu((hdr).handle);			\
+	__entry->hdr_related_handle = le16_to_cpu((hdr).related_handle);	\
+	__entry->hdr_timestamp = le64_to_cpu((hdr).timestamp);			\
+	__entry->hdr_maint_op_class = (hdr).maint_op_class
+
+#define CXL_EVT_TP_printk(fmt, ...) \
+	TP_printk("%s log=%s : time=%llu uuid=%pUb len=%d flags='%s' "		\
+		"handle=%x related_handle=%x maint_op_class=%u"			\
+		" : " fmt,							\
+		__get_str(dev_name), cxl_event_log_type_str(__entry->log),	\
+		__entry->hdr_timestamp, &__entry->hdr_uuid, __entry->hdr_length,\
+		show_hdr_flags(__entry->hdr_flags), __entry->hdr_handle,	\
+		__entry->hdr_related_handle, __entry->hdr_maint_op_class,	\
+		##__VA_ARGS__)
+
+TRACE_EVENT(cxl_generic_event,
+
+	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
+		 struct cxl_event_record_raw *rec),
+
+	TP_ARGS(dev, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+		__array(u8, data, CXL_EVENT_RECORD_DATA_LENGTH)
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
+		memcpy(__entry->data, &rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
+	),
+
+	CXL_EVT_TP_printk("%s",
+		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
+);
+
 #endif /* _CXL_EVENTS_H */
 
 #define TRACE_INCLUDE_FILE trace
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index aa3af3bb73b2..5974d1082210 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -156,6 +156,18 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
 #define CXLDEV_CAP_CAP_ID_SECONDARY_MAILBOX 0x3
 #define CXLDEV_CAP_CAP_ID_MEMDEV 0x4000
 
+/* CXL 3.0 8.2.8.3.1 Event Status Register */
+#define CXLDEV_DEV_EVENT_STATUS_OFFSET		0x00
+#define CXLDEV_EVENT_STATUS_INFO		BIT(0)
+#define CXLDEV_EVENT_STATUS_WARN		BIT(1)
+#define CXLDEV_EVENT_STATUS_FAIL		BIT(2)
+#define CXLDEV_EVENT_STATUS_FATAL		BIT(3)
+
+#define CXLDEV_EVENT_STATUS_ALL (CXLDEV_EVENT_STATUS_INFO |	\
+				 CXLDEV_EVENT_STATUS_WARN |	\
+				 CXLDEV_EVENT_STATUS_FAIL |	\
+				 CXLDEV_EVENT_STATUS_FATAL)
+
 /* CXL 2.0 8.2.8.4 Mailbox Registers */
 #define CXLDEV_MBOX_CAPS_OFFSET 0x00
 #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index ab138004f644..dd9aa3dd738e 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -4,6 +4,7 @@
 #define __CXL_MEM_H__
 #include <uapi/linux/cxl_mem.h>
 #include <linux/cdev.h>
+#include <linux/uuid.h>
 #include "cxl.h"
 
 /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
@@ -193,6 +194,17 @@ struct cxl_endpoint_dvsec_info {
 	struct range dvsec_range[2];
 };
 
+/**
+ * struct cxl_event_state - Event log driver state
+ *
+ * @buf: Buffer to receive event data
+ * @log_lock: Serialize buf and log use
+ */
+struct cxl_event_state {
+	struct cxl_get_event_payload *buf;
+	struct mutex log_lock;
+};
+
 /**
  * struct cxl_dev_state - The driver device state
  *
@@ -266,12 +278,16 @@ struct cxl_dev_state {
 
 	struct xarray doe_mbs;
 
+	struct cxl_event_state event;
+
 	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
 };
 
 enum cxl_opcode {
 	CXL_MBOX_OP_INVALID		= 0x0000,
 	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
+	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
+	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
 	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
 	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
 	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
@@ -347,6 +363,73 @@ struct cxl_mbox_identify {
 	u8 qos_telemetry_caps;
 } __packed;
 
+/*
+ * Common Event Record Format
+ * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
+ */
+struct cxl_event_record_hdr {
+	uuid_t id;
+	u8 length;
+	u8 flags[3];
+	__le16 handle;
+	__le16 related_handle;
+	__le64 timestamp;
+	u8 maint_op_class;
+	u8 reserved[15];
+} __packed;
+
+#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
+struct cxl_event_record_raw {
+	struct cxl_event_record_hdr hdr;
+	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
+} __packed;
+
+/*
+ * Get Event Records output payload
+ * CXL rev 3.0 section 8.2.9.2.2; Table 8-50
+ */
+#define CXL_GET_EVENT_FLAG_OVERFLOW		BIT(0)
+#define CXL_GET_EVENT_FLAG_MORE_RECORDS		BIT(1)
+struct cxl_get_event_payload {
+	u8 flags;
+	u8 reserved1;
+	__le16 overflow_err_count;
+	__le64 first_overflow_timestamp;
+	__le64 last_overflow_timestamp;
+	__le16 record_count;
+	u8 reserved2[10];
+	struct cxl_event_record_raw records[];
+} __packed;
+
+/*
+ * CXL rev 3.0 section 8.2.9.2.2; Table 8-49
+ */
+enum cxl_event_log_type {
+	CXL_EVENT_TYPE_INFO = 0x00,
+	CXL_EVENT_TYPE_WARN,
+	CXL_EVENT_TYPE_FAIL,
+	CXL_EVENT_TYPE_FATAL,
+	CXL_EVENT_TYPE_MAX
+};
+
+/*
+ * Clear Event Records input payload
+ * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
+ */
+#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
+struct cxl_mbox_clear_event_payload {
+	u8 event_log;		/* enum cxl_event_log_type */
+	u8 clear_flags;
+	u8 nr_recs;
+	u8 reserved[3];
+	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];
+} __packed;
+#define CXL_CLEAR_EVENT_LIMIT_HANDLES(payload_size)			\
+	(((payload_size) -						\
+		(sizeof(struct cxl_mbox_clear_event_payload) -		\
+		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
+		sizeof(__le16))
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
@@ -441,6 +524,7 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
 struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
 void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
 void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
+void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status);
 #ifdef CONFIG_CXL_SUSPEND
 void cxl_mem_active_inc(void);
 void cxl_mem_active_dec(void);
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 3a66aadb4df0..86c84611a168 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -417,8 +417,44 @@ static void disable_aer(void *pdev)
 	pci_disable_pcie_error_reporting(pdev);
 }
 
+static void cxl_mem_free_event_buffer(void *buf)
+{
+	kvfree(buf);
+}
+
+/*
+ * There is a single buffer for reading event logs from the mailbox.  All logs
+ * share this buffer protected by the cxlds->event.log_lock.
+ */
+static void cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
+{
+	struct cxl_get_event_payload *buf;
+
+	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
+		cxlds->payload_size);
+
+	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
+	if (WARN_ON_ONCE(!buf))
+		return;
+
+	if (WARN_ON_ONCE(devm_add_action_or_reset(cxlds->dev,
+			 cxl_mem_free_event_buffer, buf)))
+		return;
+
+	cxlds->event.buf = buf;
+}
+
+static void cxl_clear_event_logs(struct cxl_dev_state *cxlds)
+{
+	/* Force read and clear of all logs */
+	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
+	/* Ensure prior partial reads are handled, by starting over again */
+	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
+}
+
 static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
+	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
 	struct cxl_register_map map;
 	struct cxl_memdev *cxlmd;
 	struct cxl_dev_state *cxlds;
@@ -494,6 +530,12 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
+	if (host_bridge->native_cxl_error) {
+		cxl_mem_alloc_event_buf(cxlds);
+		if (cxlds->event.buf)
+			cxl_clear_event_logs(cxlds);
+	}
+
 	if (cxlds->regs.ras) {
 		pci_enable_pcie_error_reporting(pdev);
 		rc = devm_add_action_or_reset(&pdev->dev, disable_aer, pdev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1f81807492ef..7fe3752a204e 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -580,6 +580,7 @@ struct pci_host_bridge {
 	unsigned int	preserve_config:1;	/* Preserve FW resource setup */
 	unsigned int	size_windows:1;		/* Enable root bus sizing */
 	unsigned int	msi_domain:1;		/* Bridge wants MSI domain */
+	unsigned int	native_cxl_error:1;	/* OS CXL Error reporting */
 
 	/* Resource alignment requirements */
 	resource_size_t (*align_resource)(struct pci_dev *dev,
-- 
2.37.2



* [PATCH V3 2/8] cxl/mem: Wire up event interrupts
  2022-12-08  5:21 [PATCH V3 0/8] CXL: Process event logs ira.weiny
  2022-12-08  5:21 ` [PATCH V3 1/8] cxl/mem: Read, trace, and clear events on driver load ira.weiny
@ 2022-12-08  5:21 ` ira.weiny
  2022-12-09 21:49   ` Dan Williams
  2022-12-08  5:21 ` [PATCH V3 3/8] cxl/mem: Trace General Media Event Record ira.weiny
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: ira.weiny @ 2022-12-08  5:21 UTC
  To: Dan Williams
  Cc: Davidlohr Bueso, Bjorn Helgaas, Jonathan Cameron, Ira Weiny,
	Bjorn Helgaas, Alison Schofield, Vishal Verma, Dave Jiang,
	linux-kernel, linux-pci, linux-acpi, linux-cxl

From: Davidlohr Bueso <dave@stgolabs.net>

Currently the only CXL features targeted for irq support require their
message numbers to be within the first 16 entries.  The device may
however support fewer than 16 entries depending on the support it
provides.

Attempt to allocate these 16 irq vectors.  If the device supports fewer,
the PCI infrastructure will allocate that number.  Upon successful
allocation, users can plug in their respective isr at any point
thereafter.

CXL device events are signaled via interrupts.  Each event log may have
a different interrupt message number.  These message numbers are
reported in the Get Event Interrupt Policy mailbox command.
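
For illustration only, each per-log settings byte can be decoded as in the
sketch below.  The bit layout (mode in bits 1:0, message number in bits
7:4) mirrors the CXL_EVENT_INT_MODE_MASK and CXL_EVENT_INT_MSGNUM()
definitions in this patch; the evt_int_*() names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EVT_INT_MODE_MASK	0x03u	/* bits 1:0: interrupt mode */
#define EVT_INT_MSGNUM_MASK	0xf0u	/* bits 7:4: MSI/MSI-X message number */

static_assert((EVT_INT_MODE_MASK & EVT_INT_MSGNUM_MASK) == 0,
	      "mode and message number fields must not overlap");

enum evt_int_mode {
	EVT_INT_NONE	 = 0x00,
	EVT_INT_MSI_MSIX = 0x01,
	EVT_INT_FW	 = 0x02,
};

/* Hypothetical: extract the interrupt mode from a settings byte. */
static uint8_t evt_int_mode(uint8_t setting)
{
	return setting & EVT_INT_MODE_MASK;
}

/* Hypothetical: extract the message number from a settings byte. */
static uint8_t evt_int_msgnum(uint8_t setting)
{
	return (setting & EVT_INT_MSGNUM_MASK) >> 4;
}

/* Hypothetical: true when the log signals via MSI/MSI-X. */
static bool evt_int_is_msi(uint8_t setting)
{
	return evt_int_mode(setting) == EVT_INT_MSI_MSIX;
}
```

A settings byte of 0x51 would therefore mean MSI/MSI-X delivery on message
number 5, while 0x52 would mean the interrupt is routed to firmware.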

Add interrupt support for event logs.  Interrupts are allocated as
shared interrupts.  Therefore, all or some event logs can share the same
message number.

In addition, all logs are queried on any interrupt, in order from the most
to the least severe, based on the status register.
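
The ordering can be sketched in userspace as follows.  This assumes, as in
the definitions from patch 1, that status bit N corresponds to log type N
(Info = 0 through Fatal = 3); most_severe_log() is a hypothetical helper
and not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Log types numbered as in enum cxl_event_log_type from patch 1. */
enum log_type { LOG_INFO, LOG_WARN, LOG_FAIL, LOG_FATAL, LOG_MAX };

static_assert(LOG_FATAL == 3, "status bit 3 is assumed to be the Fatal log");

/*
 * Hypothetical: return the most severe log whose status bit is set,
 * or -1 when no log has pending events.
 */
static int most_severe_log(uint32_t status)
{
	for (int log = LOG_FATAL; log >= LOG_INFO; log--)
		if (status & (UINT32_C(1) << log))
			return log;
	return -1;
}
```

A caller would read the log returned here, mask off its status bit, and
repeat until -1 is returned, which visits the pending logs from Fatal down
to Informational.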

Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-developed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>

---
Changes from V2:
	General clean up
	Use cxl_log_id to ensure each irq is unique even if the message numbers are not
	Jonathan/Dan
		Only set up irq vector when OSC indicates OS control
	Dan
		Loop reading while status indicates there are more
			events.
		Use new cxl_internal_send_cmd()
		Squash MSI/MSIx base patch from Davidlohr
		Remove uapi defines altogether
		Remove use of msi_enabled
	Remove the use of cxl_event_log_type_str()
	Pick up tag

Changes from V1:
	Remove unneeded evt_int_policy from struct cxl_dev_state
	defer Dynamic Capacity support
	Dave Jiang
		s/irq/rc
		use IRQ_NONE to signal the irq was not for us.
	Jonathan
		use msi_enabled rather than nr_irq_vec
		On failure explicitly set CXL_INT_NONE
		Add comment for Get Event Interrupt Policy
		use devm_request_threaded_irq()
		Use individual handler/thread functions for each of the
		logs rather than struct cxl_event_irq_id.

Changes from RFC v2
	Adjust to new irq 16 vector allocation
	Jonathan
		Remove CXL_INT_RES
	Use irq threads to ensure mailbox commands are executed outside irq context
	Adjust for optional Dynamic Capacity log
---
 drivers/cxl/core/mbox.c | 42 +++++++++++++++++++
 drivers/cxl/cxlmem.h    | 28 +++++++++++++
 drivers/cxl/cxlpci.h    |  6 +++
 drivers/cxl/pci.c       | 90 ++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 165 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 815da3aac081..2b25691a9b09 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -854,6 +854,48 @@ void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
 
+int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
+			     struct cxl_event_interrupt_policy *policy)
+{
+	struct cxl_mbox_cmd mbox_cmd;
+	int rc;
+
+	policy->info_settings = CXL_INT_MSI_MSIX;
+	policy->warn_settings = CXL_INT_MSI_MSIX;
+	policy->failure_settings = CXL_INT_MSI_MSIX;
+	policy->fatal_settings = CXL_INT_MSI_MSIX;
+
+	mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
+		.payload_in = policy,
+		.size_in = sizeof(*policy),
+	};
+
+	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	if (rc < 0) {
+		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
+			rc);
+		return rc;
+	}
+
+	mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
+		.payload_out = policy,
+		.size_out = sizeof(*policy),
+	};
+
+	/* Retrieve interrupt message numbers */
+	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	if (rc < 0) {
+		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
+			rc);
+		return rc;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_event_config_msgnums, CXL);
+
 /**
  * cxl_mem_get_partition_info - Get partition info
  * @cxlds: The device data for the operation
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index dd9aa3dd738e..350cb460e7fc 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -194,6 +194,30 @@ struct cxl_endpoint_dvsec_info {
 	struct range dvsec_range[2];
 };
 
+/**
+ * Event Interrupt Policy
+ *
+ * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
+ */
+enum cxl_event_int_mode {
+	CXL_INT_NONE		= 0x00,
+	CXL_INT_MSI_MSIX	= 0x01,
+	CXL_INT_FW		= 0x02
+};
+#define CXL_EVENT_INT_MODE_MASK 0x3
+#define CXL_EVENT_INT_MSGNUM(setting) (((setting) & 0xf0) >> 4)
+struct cxl_event_interrupt_policy {
+	u8 info_settings;
+	u8 warn_settings;
+	u8 failure_settings;
+	u8 fatal_settings;
+} __packed;
+
+static inline bool cxl_evt_int_is_msi(u8 setting)
+{
+	return CXL_INT_MSI_MSIX == (setting & CXL_EVENT_INT_MODE_MASK);
+}
+
 /**
  * struct cxl_event_state - Event log driver state
  *
@@ -288,6 +312,8 @@ enum cxl_opcode {
 	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
 	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
 	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
+	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
+	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
 	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
 	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
 	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
@@ -525,6 +551,8 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
 void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
 void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
 void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status);
+int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
+			     struct cxl_event_interrupt_policy *policy);
 #ifdef CONFIG_CXL_SUSPEND
 void cxl_mem_active_inc(void);
 void cxl_mem_active_dec(void);
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 77dbdb980b12..4aaadf17a985 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -53,6 +53,12 @@
 #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
 #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
 
+/*
+ * NOTE: Currently all the functions which are enabled for CXL require their
+ * vectors to be in the first 16.  Use this as the max.
+ */
+#define CXL_PCI_REQUIRED_VECTORS 16
+
 /* Register Block Identifier (RBI) */
 enum cxl_regloc_type {
 	CXL_REGLOC_RBI_EMPTY = 0,
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 86c84611a168..c84922a287ec 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -452,6 +452,90 @@ static void cxl_clear_event_logs(struct cxl_dev_state *cxlds)
 	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
 }
 
+static void cxl_alloc_irq_vectors(struct cxl_dev_state *cxlds)
+{
+	struct device *dev = cxlds->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	int nvecs;
+
+	/*
+	 * pci_alloc_irq_vectors() handles calling pci_free_irq_vectors()
+	 * automatically despite not being called pcim_*.  See
+	 * pci_setup_msi_context().
+	 */
+	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_REQUIRED_VECTORS,
+				      PCI_IRQ_MSIX | PCI_IRQ_MSI);
+	if (nvecs < 1)
+		dev_dbg(dev, "Failed to alloc irq vectors: %d\n", nvecs);
+}
+
+struct cxl_dev_id {
+	struct cxl_dev_state *cxlds;
+};
+
+static irqreturn_t cxl_event_thread(int irq, void *id)
+{
+	struct cxl_dev_id *dev_id = id;
+	struct cxl_dev_state *cxlds = dev_id->cxlds;
+	u32 status;
+
+	/*
+	 * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
+	 * ignore the reserved upper 32 bits
+	 */
+	status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
+	while (status) {
+		cxl_mem_get_event_records(cxlds, status);
+		cond_resched();
+		status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
+	}
+	return IRQ_HANDLED;
+}
+
+static int cxl_req_event_irq(struct cxl_dev_state *cxlds, u8 setting)
+{
+	struct device *dev = cxlds->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct cxl_dev_id *dev_id;
+	int irq;
+
+	if (!cxl_evt_int_is_msi(setting))
+		return -ENXIO;
+
+	/* dev_id must be globally unique and must contain the cxlds */
+	dev_id = devm_kmalloc(dev, sizeof(*dev_id), GFP_KERNEL);
+	if (!dev_id)
+		return -ENOMEM;
+	dev_id->cxlds = cxlds;
+
+	irq = pci_irq_vector(pdev, CXL_EVENT_INT_MSGNUM(setting));
+	if (irq < 0)
+		return irq;
+
+	return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
+					 IRQF_SHARED, NULL, dev_id);
+}
+
+static void cxl_event_irqsetup(struct cxl_dev_state *cxlds)
+{
+	struct cxl_event_interrupt_policy policy;
+
+	if (cxl_event_config_msgnums(cxlds, &policy))
+		return;
+
+	if (cxl_req_event_irq(cxlds, policy.info_settings))
+		dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
+
+	if (cxl_req_event_irq(cxlds, policy.warn_settings))
+		dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
+
+	if (cxl_req_event_irq(cxlds, policy.failure_settings))
+		dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
+
+	if (cxl_req_event_irq(cxlds, policy.fatal_settings))
+		dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
+}
+
 static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
@@ -526,14 +610,18 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
+	cxl_alloc_irq_vectors(cxlds);
+
 	cxlmd = devm_cxl_add_memdev(cxlds);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
 	if (host_bridge->native_cxl_error) {
 		cxl_mem_alloc_event_buf(cxlds);
-		if (cxlds->event.buf)
+		if (cxlds->event.buf) {
+			cxl_event_irqsetup(cxlds);
 			cxl_clear_event_logs(cxlds);
+		}
 	}
 
 	if (cxlds->regs.ras) {
-- 
2.37.2



* [PATCH V3 3/8] cxl/mem: Trace General Media Event Record
  2022-12-08  5:21 [PATCH V3 0/8] CXL: Process event logs ira.weiny
  2022-12-08  5:21 ` [PATCH V3 1/8] cxl/mem: Read, trace, and clear events on driver load ira.weiny
  2022-12-08  5:21 ` [PATCH V3 2/8] cxl/mem: Wire up event interrupts ira.weiny
@ 2022-12-08  5:21 ` ira.weiny
  2022-12-09 22:04   ` Dan Williams
  2022-12-08  5:21 ` [PATCH V3 4/8] cxl/mem: Trace DRAM " ira.weiny
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: ira.weiny @ 2022-12-08  5:21 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record.

Determine if the event read is a general media record and if so trace
the record as a General Media Event Record.
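
For illustration only, the dispatch described above can be sketched in user
space.  This is a minimal sketch, not the kernel code: every sketch_* name is
hypothetical, and it assumes the 16-byte record identifier sits at the very
start of the raw record, stored in the byte order UUID_INIT() produces.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space stand-ins; none of these names exist in the kernel. */
enum sketch_record_kind {
	SKETCH_RECORD_GENERIC,
	SKETCH_RECORD_GEN_MEDIA,
};

/*
 * Bytes of fbcd0a77-c260-417f-85a9-088b1621eba6 as UUID_INIT() lays them
 * out: the first three groups are stored most-significant byte first.
 */
static const uint8_t sketch_gen_media_uuid[16] = {
	0xfb, 0xcd, 0x0a, 0x77, 0xc2, 0x60, 0x41, 0x7f,
	0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6,
};

/* Classify a raw record by the identifier assumed to lead its header. */
static enum sketch_record_kind sketch_classify(const uint8_t *record)
{
	if (memcmp(record, sketch_gen_media_uuid,
		   sizeof(sketch_gen_media_uuid)) == 0)
		return SKETCH_RECORD_GEN_MEDIA;

	/* Anything unrecognized falls back to the generic (header-only) path */
	return SKETCH_RECORD_GENERIC;
}
```

Unknown identifiers fall through to the generic case, mirroring how the patch
keeps tracing unrecognized records with trace_cxl_generic_event().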

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V2:
	Dan
		Remove trace_*_enabled() calls
		Pass struct device to trace points

Changes from V1:
	Jonathan
		fix spec references for CXL rev 3.0
		Make flags all caps

Changes from RFC v2:
	Output DPA flags as a single field
	Ensure names of fields match what TP_printk outputs
	Steven
		prefix TRACE_EVENT with 'cxl_'
	Jonathan
		Remove Reserved field

Changes from RFC:
	Add reserved byte array
	Use common CXL event header record macros
	Jonathan
		Use get_unaligned_le{24,16}() for unaligned fields
		Don't use the inverse of phy addr mask
	Dave Jiang
		s/cxl_gen_media_event/general_media
		s/cxl_evt_gen_media/cxl_event_gen_media
---
 drivers/cxl/core/mbox.c  |  30 +++++++++-
 drivers/cxl/core/trace.h | 124 +++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h     |  19 ++++++
 3 files changed, 171 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 2b25691a9b09..0d8c66f1cdc5 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -718,6 +718,32 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
+/*
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+static const uuid_t gen_media_event_uuid =
+	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
+		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
+
+static void cxl_trace_event_record(const struct device *dev,
+				   enum cxl_event_log_type type,
+				   struct cxl_event_record_raw *record)
+{
+	uuid_t *id = &record->hdr.id;
+
+	if (uuid_equal(id, &gen_media_event_uuid)) {
+		struct cxl_event_gen_media *rec =
+				(struct cxl_event_gen_media *)record;
+
+		trace_cxl_general_media(dev, type, rec);
+		return;
+	}
+
+	/* For unknown record types print just the header */
+	trace_cxl_generic_event(dev, type, record);
+}
+
 static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
 				  enum cxl_event_log_type log,
 				  struct cxl_get_event_payload *get_pl)
@@ -810,8 +836,8 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
 			int i;
 
 			for (i = 0; i < nr_rec; i++)
-				trace_cxl_generic_event(cxlds->dev, type,
-							&payload->records[i]);
+				cxl_trace_event_record(cxlds->dev, type,
+						       &payload->records[i]);
 
 			rc = cxl_clear_event_record(cxlds, type, payload);
 			if (rc) {
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index 24eef6909f13..82462942590b 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -223,6 +223,130 @@ TRACE_EVENT(cxl_generic_event,
 		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
 );
 
+/*
+ * Physical Address field masks
+ *
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ *
+ * DRAM Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+#define CXL_DPA_FLAGS_MASK			0x3F
+#define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
+
+#define CXL_DPA_VOLATILE			BIT(0)
+#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
+#define show_dpa_flags(flags)	__print_flags(flags, "|",		   \
+	{ CXL_DPA_VOLATILE,			"VOLATILE"		}, \
+	{ CXL_DPA_NOT_REPAIRABLE,		"NOT_REPAIRABLE"	}  \
+)
+
+/*
+ * General Media Event Record - GMER
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+#define CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT		BIT(0)
+#define CXL_GMER_EVT_DESC_THRESHOLD_EVENT		BIT(1)
+#define CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW		BIT(2)
+#define show_event_desc_flags(flags)	__print_flags(flags, "|",		   \
+	{ CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,		"UNCORRECTABLE_EVENT"	}, \
+	{ CXL_GMER_EVT_DESC_THRESHOLD_EVENT,		"THRESHOLD_EVENT"	}, \
+	{ CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW,	"POISON_LIST_OVERFLOW"	}  \
+)
+
+#define CXL_GMER_MEM_EVT_TYPE_ECC_ERROR			0x00
+#define CXL_GMER_MEM_EVT_TYPE_INV_ADDR			0x01
+#define CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR		0x02
+#define show_mem_event_type(type)	__print_symbolic(type,			\
+	{ CXL_GMER_MEM_EVT_TYPE_ECC_ERROR,		"ECC Error" },		\
+	{ CXL_GMER_MEM_EVT_TYPE_INV_ADDR,		"Invalid Address" },	\
+	{ CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,	"Data Path Error" }	\
+)
+
+#define CXL_GMER_TRANS_UNKNOWN				0x00
+#define CXL_GMER_TRANS_HOST_READ			0x01
+#define CXL_GMER_TRANS_HOST_WRITE			0x02
+#define CXL_GMER_TRANS_HOST_SCAN_MEDIA			0x03
+#define CXL_GMER_TRANS_HOST_INJECT_POISON		0x04
+#define CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB		0x05
+#define CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT	0x06
+#define show_trans_type(type)	__print_symbolic(type,					\
+	{ CXL_GMER_TRANS_UNKNOWN,			"Unknown" },			\
+	{ CXL_GMER_TRANS_HOST_READ,			"Host Read" },			\
+	{ CXL_GMER_TRANS_HOST_WRITE,			"Host Write" },			\
+	{ CXL_GMER_TRANS_HOST_SCAN_MEDIA,		"Host Scan Media" },		\
+	{ CXL_GMER_TRANS_HOST_INJECT_POISON,		"Host Inject Poison" },		\
+	{ CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,		"Internal Media Scrub" },	\
+	{ CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT,	"Internal Media Management" }	\
+)
+
+#define CXL_GMER_VALID_CHANNEL				BIT(0)
+#define CXL_GMER_VALID_RANK				BIT(1)
+#define CXL_GMER_VALID_DEVICE				BIT(2)
+#define CXL_GMER_VALID_COMPONENT			BIT(3)
+#define show_valid_flags(flags)	__print_flags(flags, "|",		   \
+	{ CXL_GMER_VALID_CHANNEL,			"CHANNEL"	}, \
+	{ CXL_GMER_VALID_RANK,				"RANK"		}, \
+	{ CXL_GMER_VALID_DEVICE,			"DEVICE"	}, \
+	{ CXL_GMER_VALID_COMPONENT,			"COMPONENT"	}  \
+)
+
+TRACE_EVENT(cxl_general_media,
+
+	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
+		 struct cxl_event_gen_media *rec),
+
+	TP_ARGS(dev, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+		/* General Media */
+		__field(u64, dpa)
+		__field(u8, descriptor)
+		__field(u8, type)
+		__field(u8, transaction_type)
+		__field(u8, channel)
+		__field(u32, device)
+		__array(u8, comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE)
+		__field(u16, validity_flags)
+		/* Following are out of order to pack trace record */
+		__field(u8, rank)
+		__field(u8, dpa_flags)
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
+
+		/* General Media */
+		__entry->dpa = le64_to_cpu(rec->phys_addr);
+		__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
+		/* Mask after flags have been parsed */
+		__entry->dpa &= CXL_DPA_MASK;
+		__entry->descriptor = rec->descriptor;
+		__entry->type = rec->type;
+		__entry->transaction_type = rec->transaction_type;
+		__entry->channel = rec->channel;
+		__entry->rank = rec->rank;
+		__entry->device = get_unaligned_le24(rec->device);
+		memcpy(__entry->comp_id, &rec->component_id,
+			CXL_EVENT_GEN_MED_COMP_ID_SIZE);
+		__entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
+	),
+
+	CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' " \
+		"descriptor='%s' type='%s' transaction_type='%s' channel=%u rank=%u " \
+		"device=%x comp_id=%s validity_flags='%s'",
+		__entry->dpa, show_dpa_flags(__entry->dpa_flags),
+		show_event_desc_flags(__entry->descriptor),
+		show_mem_event_type(__entry->type),
+		show_trans_type(__entry->transaction_type),
+		__entry->channel, __entry->rank, __entry->device,
+		__print_hex(__entry->comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE),
+		show_valid_flags(__entry->validity_flags)
+	)
+);
+
 #endif /* _CXL_EVENTS_H */
 
 #define TRACE_INCLUDE_FILE trace
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 350cb460e7fc..a5f5d4a380af 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -456,6 +456,25 @@ struct cxl_mbox_clear_event_payload {
 		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
 		sizeof(__le16))
 
+/*
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+#define CXL_EVENT_GEN_MED_COMP_ID_SIZE	0x10
+struct cxl_event_gen_media {
+	struct cxl_event_record_hdr hdr;
+	__le64 phys_addr;
+	u8 descriptor;
+	u8 type;
+	u8 transaction_type;
+	u8 validity_flags[2];
+	u8 channel;
+	u8 rank;
+	u8 device[3];
+	u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
+	u8 reserved[0x2e];
+} __packed;
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
-- 
2.37.2



* [PATCH V3 4/8] cxl/mem: Trace DRAM Event Record
  2022-12-08  5:21 [PATCH V3 0/8] CXL: Process event logs ira.weiny
                   ` (2 preceding siblings ...)
  2022-12-08  5:21 ` [PATCH V3 3/8] cxl/mem: Trace General Media Event Record ira.weiny
@ 2022-12-08  5:21 ` ira.weiny
  2022-12-09 22:14   ` Dan Williams
  2022-12-08  5:21 ` [PATCH V3 5/8] cxl/mem: Trace Memory Module " ira.weiny
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: ira.weiny @ 2022-12-08  5:21 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.

Determine if the event read is a DRAM event record and if so trace the
record.
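
The DRAM record carries several fields as unaligned little-endian byte arrays
(nibble_mask[3], row[3], column[2]).  As a rough user-space illustration of
what the get_unaligned_le24()/get_unaligned_le16() calls in this patch compute,
assuming nothing beyond plain byte arrays (the sketch_* helpers are
hypothetical and are not the kernel implementations):

```c
#include <stdint.h>

/* Hypothetical helper: assemble a 16-bit value from 2 little-endian bytes. */
static uint16_t sketch_get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Hypothetical helper: assemble a 24-bit value from 3 little-endian bytes. */
static uint32_t sketch_get_le24(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}
```

Reading the bytes one at a time avoids any alignment requirement, which is why
the packed record can declare these fields as u8 arrays.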

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from v2:
	Dan
		Move tracing to cxl core
		Remove trace_*_enabled() calls
		Pass struct device to trace points

Changes from RFC v2:
	Output DPA flags as a separate field.
	Ensure field names match TP_printk output
	Steven
		prefix TRACE_EVENT with 'cxl_'
	Jonathan
		Formatting fix
		Remove reserved field

Changes from RFC:
	Add reserved byte data
	Use new CXL header macros
	Jonathan
		Use get_unaligned_le{24,16}() for unaligned fields
		Use 'else if'
	Dave Jiang
		s/cxl_dram_event/dram
		s/cxl_evt_dram_rec/cxl_event_dram
	Adjust for new phys addr mask
---
 drivers/cxl/core/mbox.c  | 13 ++++++
 drivers/cxl/core/trace.h | 92 ++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h     | 23 ++++++++++
 3 files changed, 128 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 0d8c66f1cdc5..2fa4645f0ed9 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -726,6 +726,14 @@ static const uuid_t gen_media_event_uuid =
 	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
 		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
 
+/*
+ * DRAM Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+static const uuid_t dram_event_uuid =
+	UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
+		  0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
+
 static void cxl_trace_event_record(const struct device *dev,
 				   enum cxl_event_log_type type,
 				   struct cxl_event_record_raw *record)
@@ -738,6 +746,11 @@ static void cxl_trace_event_record(const struct device *dev,
 
 		trace_cxl_general_media(dev, type, rec);
 		return;
+	} else if (uuid_equal(id, &dram_event_uuid)) {
+		struct cxl_event_dram *rec = (struct cxl_event_dram *)record;
+
+		trace_cxl_dram(dev, type, rec);
+		return;
 	}
 
 	/* For unknown record types print just the header */
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index 82462942590b..5c6cd9aa9450 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -347,6 +347,98 @@ TRACE_EVENT(cxl_general_media,
 	)
 );
 
+/*
+ * DRAM Event Record - DER
+ *
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+/*
+ * DRAM Event Record defines many fields the same as the General Media Event
+ * Record.  Reuse those definitions as appropriate.
+ */
+#define CXL_DER_VALID_CHANNEL				BIT(0)
+#define CXL_DER_VALID_RANK				BIT(1)
+#define CXL_DER_VALID_NIBBLE				BIT(2)
+#define CXL_DER_VALID_BANK_GROUP			BIT(3)
+#define CXL_DER_VALID_BANK				BIT(4)
+#define CXL_DER_VALID_ROW				BIT(5)
+#define CXL_DER_VALID_COLUMN				BIT(6)
+#define CXL_DER_VALID_CORRECTION_MASK			BIT(7)
+#define show_dram_valid_flags(flags)	__print_flags(flags, "|",			   \
+	{ CXL_DER_VALID_CHANNEL,			"CHANNEL"		}, \
+	{ CXL_DER_VALID_RANK,				"RANK"			}, \
+	{ CXL_DER_VALID_NIBBLE,				"NIBBLE"		}, \
+	{ CXL_DER_VALID_BANK_GROUP,			"BANK GROUP"		}, \
+	{ CXL_DER_VALID_BANK,				"BANK"			}, \
+	{ CXL_DER_VALID_ROW,				"ROW"			}, \
+	{ CXL_DER_VALID_COLUMN,				"COLUMN"		}, \
+	{ CXL_DER_VALID_CORRECTION_MASK,		"CORRECTION MASK"	}  \
+)
+
+TRACE_EVENT(cxl_dram,
+
+	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
+		 struct cxl_event_dram *rec),
+
+	TP_ARGS(dev, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+		/* DRAM */
+		__field(u64, dpa)
+		__field(u8, descriptor)
+		__field(u8, type)
+		__field(u8, transaction_type)
+		__field(u8, channel)
+		__field(u16, validity_flags)
+		__field(u16, column)	/* Out of order to pack trace record */
+		__field(u32, nibble_mask)
+		__field(u32, row)
+		__array(u8, cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE)
+		__field(u8, rank)	/* Out of order to pack trace record */
+		__field(u8, bank_group)	/* Out of order to pack trace record */
+		__field(u8, bank)	/* Out of order to pack trace record */
+		__field(u8, dpa_flags)	/* Out of order to pack trace record */
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
+
+		/* DRAM */
+		__entry->dpa = le64_to_cpu(rec->phys_addr);
+		__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
+		__entry->dpa &= CXL_DPA_MASK;
+		__entry->descriptor = rec->descriptor;
+		__entry->type = rec->type;
+		__entry->transaction_type = rec->transaction_type;
+		__entry->validity_flags = get_unaligned_le16(rec->validity_flags);
+		__entry->channel = rec->channel;
+		__entry->rank = rec->rank;
+		__entry->nibble_mask = get_unaligned_le24(rec->nibble_mask);
+		__entry->bank_group = rec->bank_group;
+		__entry->bank = rec->bank;
+		__entry->row = get_unaligned_le24(rec->row);
+		__entry->column = get_unaligned_le16(rec->column);
+		memcpy(__entry->cor_mask, &rec->correction_mask,
+			CXL_EVENT_DER_CORRECTION_MASK_SIZE);
+	),
+
+	CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' descriptor='%s' type='%s' " \
+		"transaction_type='%s' channel=%u rank=%u nibble_mask=%x " \
+		"bank_group=%u bank=%u row=%u column=%u cor_mask=%s " \
+		"validity_flags='%s'",
+		__entry->dpa, show_dpa_flags(__entry->dpa_flags),
+		show_event_desc_flags(__entry->descriptor),
+		show_mem_event_type(__entry->type),
+		show_trans_type(__entry->transaction_type),
+		__entry->channel, __entry->rank, __entry->nibble_mask,
+		__entry->bank_group, __entry->bank,
+		__entry->row, __entry->column,
+		__print_hex(__entry->cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE),
+		show_dram_valid_flags(__entry->validity_flags)
+	)
+);
+
 #endif /* _CXL_EVENTS_H */
 
 #define TRACE_INCLUDE_FILE trace
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index a5f5d4a380af..19c9cb6d6ccd 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -475,6 +475,29 @@ struct cxl_event_gen_media {
 	u8 reserved[0x2e];
 } __packed;
 
+/*
+ * DRAM Event Record - DER
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+#define CXL_EVENT_DER_CORRECTION_MASK_SIZE	0x20
+struct cxl_event_dram {
+	struct cxl_event_record_hdr hdr;
+	__le64 phys_addr;
+	u8 descriptor;
+	u8 type;
+	u8 transaction_type;
+	u8 validity_flags[2];
+	u8 channel;
+	u8 rank;
+	u8 nibble_mask[3];
+	u8 bank_group;
+	u8 bank;
+	u8 row[3];
+	u8 column[2];
+	u8 correction_mask[CXL_EVENT_DER_CORRECTION_MASK_SIZE];
+	u8 reserved[0x17];
+} __packed;
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
-- 
2.37.2



* [PATCH V3 5/8] cxl/mem: Trace Memory Module Event Record
  2022-12-08  5:21 [PATCH V3 0/8] CXL: Process event logs ira.weiny
                   ` (3 preceding siblings ...)
  2022-12-08  5:21 ` [PATCH V3 4/8] cxl/mem: Trace DRAM " ira.weiny
@ 2022-12-08  5:21 ` ira.weiny
  2022-12-09 22:18   ` Dan Williams
  2022-12-08  5:21 ` [PATCH V3 6/8] cxl/test: Add generic mock events ira.weiny
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: ira.weiny @ 2022-12-08  5:21 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.

Determine if the event read is a memory module record and if so trace the
record.
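
The Additional Status byte of the Device Health Information packs two 2-bit
and two 1-bit fields.  A minimal user-space sketch of the decode the trace
point performs, assuming the bit positions used by the CXL_DHI_AS_* macros in
this patch (the sketch_* names are hypothetical):

```c
#include <stdint.h>
#include <string.h>

/* Field extraction, mirroring the CXL_DHI_AS_* macros in this patch */
static unsigned int sketch_as_life_used(uint8_t as)
{
	return as & 0x3;		/* bits 1:0 */
}

static unsigned int sketch_as_dev_temp(uint8_t as)
{
	return (as & 0xC) >> 2;		/* bits 3:2 */
}

static unsigned int sketch_as_cor_vol_err_cnt(uint8_t as)
{
	return (as & 0x10) >> 4;	/* bit 4 */
}

static unsigned int sketch_as_cor_per_err_cnt(uint8_t as)
{
	return (as & 0x20) >> 5;	/* bit 5 */
}

/*
 * Name a 2-bit status value.  The "Unknown" fallback is a choice made for
 * this sketch; __print_symbolic() prints the raw value instead.
 */
static const char *sketch_two_bit_status(unsigned int v)
{
	switch (v) {
	case 0:
		return "Normal";
	case 1:
		return "Warning";
	case 2:
		return "Critical";
	default:
		return "Unknown";
	}
}
```

For example, an Additional Status byte of 0x19 decodes to a life used status
of Warning, a device temperature status of Critical, and a set corrected
volatile error count status bit.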

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V2:
	Dan
		Move tracing to cxl core
		Remove trace_*_enabled() calls
		Pass struct device to trace points

Changes from V1:
	Use all caps for flag fields

Changes from RFC v2:
	Ensure field names match TP_printk output
	Steven
		prefix TRACE_EVENT with 'cxl_'
	Jonathan
		Remove reserved field
		Define 1-bit and 2-bit status decoders
		Fix paren alignment

Changes from RFC:
	Clean up spec reference
	Add reserved data
	Use new CXL header macros
	Jonathan
		Use else if
		Use get_unaligned_le*() for unaligned fields
	Dave Jiang
		s/cxl_mem_mod_event/memory_module
		s/cxl_evt_mem_mod_rec/cxl_event_mem_module
---
 drivers/cxl/core/mbox.c  |  14 ++++
 drivers/cxl/core/trace.h | 143 +++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h     |  26 +++++++
 3 files changed, 183 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 2fa4645f0ed9..a5a259b5d038 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -734,6 +734,14 @@ static const uuid_t dram_event_uuid =
 	UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
 		  0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
 
+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+static const uuid_t mem_mod_event_uuid =
+	UUID_INIT(0xfe927475, 0xdd59, 0x4339,
+		  0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74);
+
 static void cxl_trace_event_record(const struct device *dev,
 				   enum cxl_event_log_type type,
 				   struct cxl_event_record_raw *record)
@@ -751,6 +759,12 @@ static void cxl_trace_event_record(const struct device *dev,
 
 		trace_cxl_dram(dev, type, rec);
 		return;
+	} else if (uuid_equal(id, &mem_mod_event_uuid)) {
+		struct cxl_event_mem_module *rec =
+				(struct cxl_event_mem_module *)record;
+
+		trace_cxl_memory_module(dev, type, rec);
+		return;
 	}
 
 	/* For unknown record types print just the header */
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index 5c6cd9aa9450..236919f5368d 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -439,6 +439,149 @@ TRACE_EVENT(cxl_dram,
 	)
 );
 
+/*
+ * Memory Module Event Record - MMER
+ *
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+#define CXL_MMER_HEALTH_STATUS_CHANGE		0x00
+#define CXL_MMER_MEDIA_STATUS_CHANGE		0x01
+#define CXL_MMER_LIFE_USED_CHANGE		0x02
+#define CXL_MMER_TEMP_CHANGE			0x03
+#define CXL_MMER_DATA_PATH_ERROR		0x04
+#define CXL_MMER_LAS_ERROR			0x05
+#define show_dev_evt_type(type)	__print_symbolic(type,			   \
+	{ CXL_MMER_HEALTH_STATUS_CHANGE,	"Health Status Change"	}, \
+	{ CXL_MMER_MEDIA_STATUS_CHANGE,		"Media Status Change"	}, \
+	{ CXL_MMER_LIFE_USED_CHANGE,		"Life Used Change"	}, \
+	{ CXL_MMER_TEMP_CHANGE,			"Temperature Change"	}, \
+	{ CXL_MMER_DATA_PATH_ERROR,		"Data Path Error"	}, \
+	{ CXL_MMER_LAS_ERROR,			"LSA Error"		}  \
+)
+
+/*
+ * Device Health Information - DHI
+ *
+ * CXL rev 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+#define CXL_DHI_HS_MAINTENANCE_NEEDED				BIT(0)
+#define CXL_DHI_HS_PERFORMANCE_DEGRADED				BIT(1)
+#define CXL_DHI_HS_HW_REPLACEMENT_NEEDED			BIT(2)
+#define show_health_status_flags(flags)	__print_flags(flags, "|",	   \
+	{ CXL_DHI_HS_MAINTENANCE_NEEDED,	"MAINTENANCE_NEEDED"	}, \
+	{ CXL_DHI_HS_PERFORMANCE_DEGRADED,	"PERFORMANCE_DEGRADED"	}, \
+	{ CXL_DHI_HS_HW_REPLACEMENT_NEEDED,	"REPLACEMENT_NEEDED"	}  \
+)
+
+#define CXL_DHI_MS_NORMAL							0x00
+#define CXL_DHI_MS_NOT_READY							0x01
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOST					0x02
+#define CXL_DHI_MS_ALL_DATA_LOST						0x03
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS			0x04
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN			0x05
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT				0x06
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS				0x07
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN				0x08
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT					0x09
+#define show_media_status(ms)	__print_symbolic(ms,			   \
+	{ CXL_DHI_MS_NORMAL,						   \
+		"Normal"						}, \
+	{ CXL_DHI_MS_NOT_READY,						   \
+		"Not Ready"						}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOST,				   \
+		"Write Persistency Lost"				}, \
+	{ CXL_DHI_MS_ALL_DATA_LOST,					   \
+		"All Data Lost"						}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS,		   \
+		"Write Persistency Loss in the Event of Power Loss"	}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN,		   \
+		"Write Persistency Loss in Event of Shutdown"		}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT,			   \
+		"Write Persistency Loss Imminent"			}, \
+	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS,		   \
+		"All Data Loss in Event of Power Loss"			}, \
+	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN,		   \
+		"All Data Loss in the Event of Shutdown"		}, \
+	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT,			   \
+		"All Data Loss Imminent"				}  \
+)
+
+#define CXL_DHI_AS_NORMAL		0x0
+#define CXL_DHI_AS_WARNING		0x1
+#define CXL_DHI_AS_CRITICAL		0x2
+#define show_two_bit_status(as) __print_symbolic(as,	   \
+	{ CXL_DHI_AS_NORMAL,		"Normal"	}, \
+	{ CXL_DHI_AS_WARNING,		"Warning"	}, \
+	{ CXL_DHI_AS_CRITICAL,		"Critical"	}  \
+)
+#define show_one_bit_status(as) __print_symbolic(as,	   \
+	{ CXL_DHI_AS_NORMAL,		"Normal"	}, \
+	{ CXL_DHI_AS_WARNING,		"Warning"	}  \
+)
+
+#define CXL_DHI_AS_LIFE_USED(as)			((as) & 0x3)
+#define CXL_DHI_AS_DEV_TEMP(as)				(((as) & 0xC) >> 2)
+#define CXL_DHI_AS_COR_VOL_ERR_CNT(as)			(((as) & 0x10) >> 4)
+#define CXL_DHI_AS_COR_PER_ERR_CNT(as)			(((as) & 0x20) >> 5)
+
+TRACE_EVENT(cxl_memory_module,
+
+	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
+		 struct cxl_event_mem_module *rec),
+
+	TP_ARGS(dev, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+
+		/* Memory Module Event */
+		__field(u8, event_type)
+
+		/* Device Health Info */
+		__field(u8, health_status)
+		__field(u8, media_status)
+		__field(u8, life_used)
+		__field(u32, dirty_shutdown_cnt)
+		__field(u32, cor_vol_err_cnt)
+		__field(u32, cor_per_err_cnt)
+		__field(s16, device_temp)
+		__field(u8, add_status)
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
+
+		/* Memory Module Event */
+		__entry->event_type = rec->event_type;
+
+		/* Device Health Info */
+		__entry->health_status = rec->info.health_status;
+		__entry->media_status = rec->info.media_status;
+		__entry->life_used = rec->info.life_used;
+		__entry->dirty_shutdown_cnt = get_unaligned_le32(rec->info.dirty_shutdown_cnt);
+		__entry->cor_vol_err_cnt = get_unaligned_le32(rec->info.cor_vol_err_cnt);
+		__entry->cor_per_err_cnt = get_unaligned_le32(rec->info.cor_per_err_cnt);
+		__entry->device_temp = get_unaligned_le16(rec->info.device_temp);
+		__entry->add_status = rec->info.add_status;
+	),
+
+	CXL_EVT_TP_printk("event_type='%s' health_status='%s' media_status='%s' " \
+		"as_life_used=%s as_dev_temp=%s as_cor_vol_err_cnt=%s " \
+		"as_cor_per_err_cnt=%s life_used=%u device_temp=%d " \
+		"dirty_shutdown_cnt=%u cor_vol_err_cnt=%u cor_per_err_cnt=%u",
+		show_dev_evt_type(__entry->event_type),
+		show_health_status_flags(__entry->health_status),
+		show_media_status(__entry->media_status),
+		show_two_bit_status(CXL_DHI_AS_LIFE_USED(__entry->add_status)),
+		show_two_bit_status(CXL_DHI_AS_DEV_TEMP(__entry->add_status)),
+		show_one_bit_status(CXL_DHI_AS_COR_VOL_ERR_CNT(__entry->add_status)),
+		show_one_bit_status(CXL_DHI_AS_COR_PER_ERR_CNT(__entry->add_status)),
+		__entry->life_used, __entry->device_temp,
+		__entry->dirty_shutdown_cnt, __entry->cor_vol_err_cnt,
+		__entry->cor_per_err_cnt
+	)
+);
+
 #endif /* _CXL_EVENTS_H */
 
 #define TRACE_INCLUDE_FILE trace
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 19c9cb6d6ccd..3031e9d420c7 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -498,6 +498,32 @@ struct cxl_event_dram {
 	u8 reserved[0x17];
 } __packed;
 
+/*
+ * Get Health Info Record
+ * CXL rev 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+struct cxl_get_health_info {
+	u8 health_status;
+	u8 media_status;
+	u8 add_status;
+	u8 life_used;
+	u8 device_temp[2];
+	u8 dirty_shutdown_cnt[4];
+	u8 cor_vol_err_cnt[4];
+	u8 cor_per_err_cnt[4];
+} __packed;
+
+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+struct cxl_event_mem_module {
+	struct cxl_event_record_hdr hdr;
+	u8 event_type;
+	struct cxl_get_health_info info;
+	u8 reserved[0x3d];
+} __packed;
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
-- 
2.37.2



* [PATCH V3 6/8] cxl/test: Add generic mock events
  2022-12-08  5:21 [PATCH V3 0/8] CXL: Process event logs ira.weiny
                   ` (4 preceding siblings ...)
  2022-12-08  5:21 ` [PATCH V3 5/8] cxl/mem: Trace Memory Module " ira.weiny
@ 2022-12-08  5:21 ` ira.weiny
  2022-12-09 22:48   ` Dan Williams
  2022-12-08  5:21 ` [PATCH V3 7/8] cxl/test: Add specific events ira.weiny
  2022-12-08  5:21 ` [PATCH V3 8/8] cxl/test: Simulate event log overflow ira.weiny
  7 siblings, 1 reply; 25+ messages in thread
From: ira.weiny @ 2022-12-08  5:21 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Facilitate testing basic Get/Clear Event functionality by creating
multiple logs and generic events with made-up UUIDs.

The event data is entirely synthetic and uses data patterns that should be
easy to spot in trace output.

A single sysfs entry resets the event data and triggers collecting the
events for testing.

Test traces are easy to obtain with a small script such as this:

	#!/bin/bash -x

	devices=$(find /sys/devices/platform -name 'cxl_mem*')

	# Turn on tracing
	echo "" > /sys/kernel/tracing/trace
	echo 1 > /sys/kernel/tracing/events/cxl/enable
	echo 1 > /sys/kernel/tracing/tracing_on

	# Generate fake interrupt
	for device in $devices; do
	        echo 1 > "$device/event_trigger"
	done

	# Turn off tracing and report events
	echo 0 > /sys/kernel/tracing/tracing_on
	cat /sys/kernel/tracing/trace

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from v2:
	Adjust for tracing being part of cxl core
	Dan/Dave J.
		Adjust to Dave J.'s mock data structure
	Dan
		Remove mock specific functionality in main code

Changes from v1:
	Fix up for new structures
	Jonathan
		Update based on specification discussion

Changes from RFC v2:
	Adjust to simulate the event status register

Changes from RFC:
	Separate out the event code
	Adjust for struct changes.
	Clean up devm_cxl_mock_event_logs()
	Clean up naming and comments
	Jonathan
		Remove dynamic allocation of event logs
		Clean up comment
		Remove unneeded xarray
		Ensure event_trigger sysfs is valid prior to the driver
		going active.
	Dan
		Remove the fill/reset event sysfs as these operations
		can be done together
---
 tools/testing/cxl/test/Kbuild   |   4 +-
 tools/testing/cxl/test/events.c | 195 ++++++++++++++++++++++++++++++++
 tools/testing/cxl/test/events.h |  32 ++++++
 tools/testing/cxl/test/mem.c    |  33 ++++--
 tools/testing/cxl/test/mock.h   |  12 ++
 5 files changed, 264 insertions(+), 12 deletions(-)
 create mode 100644 tools/testing/cxl/test/events.c
 create mode 100644 tools/testing/cxl/test/events.h

diff --git a/tools/testing/cxl/test/Kbuild b/tools/testing/cxl/test/Kbuild
index 4e59e2c911f6..c48d912e3781 100644
--- a/tools/testing/cxl/test/Kbuild
+++ b/tools/testing/cxl/test/Kbuild
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-ccflags-y := -I$(srctree)/drivers/cxl/
+ccflags-y := -I$(srctree)/drivers/cxl/ -I$(srctree)/drivers/cxl/core
 
 obj-m += cxl_test.o
 obj-m += cxl_mock.o
@@ -7,4 +7,4 @@ obj-m += cxl_mock_mem.o
 
 cxl_test-y := cxl.o
 cxl_mock-y := mock.o
-cxl_mock_mem-y := mem.o
+cxl_mock_mem-y := mem.o events.o
diff --git a/tools/testing/cxl/test/events.c b/tools/testing/cxl/test/events.c
new file mode 100644
index 000000000000..1346c38dce1d
--- /dev/null
+++ b/tools/testing/cxl/test/events.c
@@ -0,0 +1,195 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2022 Intel Corporation. All rights reserved.
+
+#include "mock.h"
+#include "events.h"
+#include "trace.h"
+
+struct mock_event_log *find_event_log(struct device *dev, int log_type)
+{
+	struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
+
+	if (log_type >= CXL_EVENT_TYPE_MAX)
+		return NULL;
+	return &mdata->mes.mock_logs[log_type];
+}
+
+struct cxl_event_record_raw *get_cur_event(struct mock_event_log *log)
+{
+	return log->events[log->cur_idx];
+}
+
+void reset_event_log(struct mock_event_log *log)
+{
+	log->cur_idx = 0;
+	log->clear_idx = 0;
+}
+
+/* Handle can never be 0 use 1 based indexing for handle */
+u16 get_clear_handle(struct mock_event_log *log)
+{
+	return log->clear_idx + 1;
+}
+
+/* Handle can never be 0 use 1 based indexing for handle */
+__le16 get_cur_event_handle(struct mock_event_log *log)
+{
+	u16 cur_handle = log->cur_idx + 1;
+
+	return cpu_to_le16(cur_handle);
+}
+
+static bool log_empty(struct mock_event_log *log)
+{
+	return log->cur_idx == log->nr_events;
+}
+
+static void event_store_add_event(struct mock_event_store *mes,
+				  enum cxl_event_log_type log_type,
+				  struct cxl_event_record_raw *event)
+{
+	struct mock_event_log *log;
+
+	if (WARN_ON(log_type >= CXL_EVENT_TYPE_MAX))
+		return;
+
+	log = &mes->mock_logs[log_type];
+	if (WARN_ON(log->nr_events >= CXL_TEST_EVENT_CNT_MAX))
+		return;
+
+	log->events[log->nr_events] = event;
+	log->nr_events++;
+}
+
+int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_get_event_payload *pl;
+	struct mock_event_log *log;
+	u8 log_type;
+	int i;
+
+	if (cmd->size_in != sizeof(log_type))
+		return -EINVAL;
+
+	if (cmd->size_out < struct_size(pl, records, CXL_TEST_EVENT_CNT))
+		return -EINVAL;
+
+	log_type = *((u8 *)cmd->payload_in);
+	if (log_type >= CXL_EVENT_TYPE_MAX)
+		return -EINVAL;
+
+	memset(cmd->payload_out, 0, cmd->size_out);
+
+	log = find_event_log(cxlds->dev, log_type);
+	if (!log || log_empty(log))
+		return 0;
+
+	pl = cmd->payload_out;
+
+	for (i = 0; i < CXL_TEST_EVENT_CNT && !log_empty(log); i++) {
+		memcpy(&pl->records[i], get_cur_event(log), sizeof(pl->records[i]));
+		pl->records[i].hdr.handle = get_cur_event_handle(log);
+		log->cur_idx++;
+	}
+
+	pl->record_count = cpu_to_le16(i);
+	if (!log_empty(log))
+		pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mock_get_event);
+
+int mock_clear_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mbox_clear_event_payload *pl = cmd->payload_in;
+	struct mock_event_log *log;
+	u8 log_type = pl->event_log;
+	u16 handle;
+	int nr;
+
+	if (log_type >= CXL_EVENT_TYPE_MAX)
+		return -EINVAL;
+
+	log = find_event_log(cxlds->dev, log_type);
+	if (!log)
+		return 0; /* No mock data in this log */
+
+	/*
+	 * This check is technically not invalid per the specification AFAICS.
+	 * (The host could 'guess' handles and clear them in order).
+	 * However, this is not good behavior for the host so test it.
+	 */
+	if (log->clear_idx + pl->nr_recs > log->cur_idx) {
+		dev_err(cxlds->dev,
+			"Attempting to clear more events than returned!\n");
+		return -EINVAL;
+	}
+
+	/* Check handle order prior to clearing events */
+	for (nr = 0, handle = get_clear_handle(log);
+	     nr < pl->nr_recs;
+	     nr++, handle++) {
+		if (handle != le16_to_cpu(pl->handle[nr])) {
+			dev_err(cxlds->dev, "Clearing events out of order\n");
+			return -EINVAL;
+		}
+	}
+
+	/* Clear events */
+	log->clear_idx += pl->nr_recs;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mock_clear_event);
+
+void cxl_mock_event_trigger(struct device *dev)
+{
+	struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
+	struct mock_event_store *mes = &mdata->mes;
+	int i;
+
+	for (i = CXL_EVENT_TYPE_INFO; i < CXL_EVENT_TYPE_MAX; i++) {
+		struct mock_event_log *log;
+
+		log = find_event_log(dev, i);
+		if (log)
+			reset_event_log(log);
+	}
+
+	cxl_mem_get_event_records(mes->cxlds, mes->ev_status);
+}
+EXPORT_SYMBOL_GPL(cxl_mock_event_trigger);
+
+struct cxl_event_record_raw maint_needed = {
+	.hdr = {
+		.id = UUID_INIT(0xBA5EBA11, 0xABCD, 0xEFEB,
+				0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
+		.length = sizeof(struct cxl_event_record_raw),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0xa5b6),
+	},
+	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
+};
+
+struct cxl_event_record_raw hardware_replace = {
+	.hdr = {
+		.id = UUID_INIT(0xABCDEFEB, 0xBA11, 0xBA5E,
+				0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
+		.length = sizeof(struct cxl_event_record_raw),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_HW_REPLACE,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0xb6a5),
+	},
+	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
+};
+
+void cxl_mock_add_event_logs(struct mock_event_store *mes)
+{
+	event_store_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
+	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
+
+	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
+	mes->ev_status |= CXLDEV_EVENT_STATUS_FATAL;
+}
+EXPORT_SYMBOL_GPL(cxl_mock_add_event_logs);
diff --git a/tools/testing/cxl/test/events.h b/tools/testing/cxl/test/events.h
new file mode 100644
index 000000000000..626cd79f1871
--- /dev/null
+++ b/tools/testing/cxl/test/events.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef CXL_TEST_EVENTS_H
+#define CXL_TEST_EVENTS_H
+
+#include <cxlmem.h>
+
+#define CXL_TEST_EVENT_CNT_MAX 15
+
+/* Set a number of events to return at a time for simulation.  */
+#define CXL_TEST_EVENT_CNT 3
+
+struct mock_event_log {
+	u16 clear_idx;
+	u16 cur_idx;
+	u16 nr_events;
+	struct cxl_event_record_raw *events[CXL_TEST_EVENT_CNT_MAX];
+};
+
+struct mock_event_store {
+	struct cxl_dev_state *cxlds;
+	struct mock_event_log mock_logs[CXL_EVENT_TYPE_MAX];
+	u32 ev_status;
+};
+
+int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
+int mock_clear_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
+void cxl_mock_add_event_logs(struct mock_event_store *mes);
+void cxl_mock_remove_event_logs(struct device *dev);
+void cxl_mock_event_trigger(struct device *dev);
+
+#endif /* CXL_TEST_EVENTS_H */
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index 5e4ecd93f1d2..7674d6305d28 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -8,6 +8,7 @@
 #include <linux/sizes.h>
 #include <linux/bits.h>
 #include <cxlmem.h>
+#include "mock.h"
 
 #define LSA_SIZE SZ_128K
 #define DEV_SIZE SZ_2G
@@ -67,16 +68,6 @@ static struct {
 
 #define PASS_TRY_LIMIT 3
 
-struct cxl_mockmem_data {
-	void *lsa;
-	u32 security_state;
-	u8 user_pass[NVDIMM_PASSPHRASE_LEN];
-	u8 master_pass[NVDIMM_PASSPHRASE_LEN];
-	int user_limit;
-	int master_limit;
-
-};
-
 static int mock_gsl(struct cxl_mbox_cmd *cmd)
 {
 	if (cmd->size_out < sizeof(mock_gsl_payload))
@@ -582,6 +573,12 @@ static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *
 	case CXL_MBOX_OP_GET_PARTITION_INFO:
 		rc = mock_partition_info(cxlds, cmd);
 		break;
+	case CXL_MBOX_OP_GET_EVENT_RECORD:
+		rc = mock_get_event(cxlds, cmd);
+		break;
+	case CXL_MBOX_OP_CLEAR_EVENT_RECORD:
+		rc = mock_clear_event(cxlds, cmd);
+		break;
 	case CXL_MBOX_OP_SET_LSA:
 		rc = mock_set_lsa(cxlds, cmd);
 		break;
@@ -628,6 +625,15 @@ static bool is_rcd(struct platform_device *pdev)
 	return !!id->driver_data;
 }
 
+static ssize_t event_trigger_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	cxl_mock_event_trigger(dev);
+	return count;
+}
+static DEVICE_ATTR_WO(event_trigger);
+
 static int cxl_mock_mem_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
@@ -655,6 +661,7 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 	cxlds->serial = pdev->id;
 	cxlds->mbox_send = cxl_mock_mbox_send;
 	cxlds->payload_size = SZ_4K;
+	cxlds->event.buf = (struct cxl_get_event_payload *) mdata->event_buf;
 	if (is_rcd(pdev)) {
 		cxlds->rcd = true;
 		cxlds->component_reg_phys = CXL_RESOURCE_NONE;
@@ -672,10 +679,15 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 	if (rc)
 		return rc;
 
+	mdata->mes.cxlds = cxlds;
+	cxl_mock_add_event_logs(&mdata->mes);
+
 	cxlmd = devm_cxl_add_memdev(cxlds);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
+	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
+
 	return 0;
 }
 
@@ -714,6 +726,7 @@ static DEVICE_ATTR_RW(security_lock);
 
 static struct attribute *cxl_mock_mem_attrs[] = {
 	&dev_attr_security_lock.attr,
+	&dev_attr_event_trigger.attr,
 	NULL
 };
 ATTRIBUTE_GROUPS(cxl_mock_mem);
diff --git a/tools/testing/cxl/test/mock.h b/tools/testing/cxl/test/mock.h
index ef33f159375e..e7827ddedb06 100644
--- a/tools/testing/cxl/test/mock.h
+++ b/tools/testing/cxl/test/mock.h
@@ -3,6 +3,18 @@
 #include <linux/list.h>
 #include <linux/acpi.h>
 #include <cxl.h>
+#include "events.h"
+
+struct cxl_mockmem_data {
+	void *lsa;
+	u32 security_state;
+	u8 user_pass[NVDIMM_PASSPHRASE_LEN];
+	u8 master_pass[NVDIMM_PASSPHRASE_LEN];
+	int user_limit;
+	int master_limit;
+	struct mock_event_store mes;
+	u8 event_buf[SZ_4K];
+};
 
 struct cxl_mock_ops {
 	struct list_head list;
-- 
2.37.2


* [PATCH V3 7/8] cxl/test: Add specific events
  2022-12-08  5:21 [PATCH V3 0/8] CXL: Process event logs ira.weiny
                   ` (5 preceding siblings ...)
  2022-12-08  5:21 ` [PATCH V3 6/8] cxl/test: Add generic mock events ira.weiny
@ 2022-12-08  5:21 ` ira.weiny
  2022-12-08  5:21 ` [PATCH V3 8/8] cxl/test: Simulate event log overflow ira.weiny
  7 siblings, 0 replies; 25+ messages in thread
From: ira.weiny @ 2022-12-08  5:21 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Jonathan Cameron, Bjorn Helgaas, Alison Schofield,
	Vishal Verma, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Each type of event has different trace point outputs.

Add mock General Media Event, DRAM event, and Memory Module Event
records to the mock list of events returned.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V2:
	Rebased on pending cxl-security branch

Changes from V1:
	Jonathan
		use put_unaligned_le16()
		fix spacing

Changes from RFC:
	Adjust for struct changes
	adjust for unaligned fields
---
 tools/testing/cxl/test/events.c | 73 +++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/tools/testing/cxl/test/events.c b/tools/testing/cxl/test/events.c
index 1346c38dce1d..5214826d264f 100644
--- a/tools/testing/cxl/test/events.c
+++ b/tools/testing/cxl/test/events.c
@@ -184,12 +184,85 @@ struct cxl_event_record_raw hardware_replace = {
 	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
 };
 
+struct cxl_event_gen_media gen_media = {
+	.hdr = {
+		.id = UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
+				0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6),
+		.length = sizeof(struct cxl_event_gen_media),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_PERMANENT,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0),
+	},
+	.phys_addr = cpu_to_le64(0x2000),
+	.descriptor = CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,
+	.type = CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,
+	.transaction_type = CXL_GMER_TRANS_HOST_WRITE,
+	/* .validity_flags = <set below> */
+	.channel = 1,
+	.rank = 30
+};
+
+struct cxl_event_dram dram = {
+	.hdr = {
+		.id = UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
+				0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24),
+		.length = sizeof(struct cxl_event_dram),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0),
+	},
+	.phys_addr = cpu_to_le64(0x8000),
+	.descriptor = CXL_GMER_EVT_DESC_THRESHOLD_EVENT,
+	.type = CXL_GMER_MEM_EVT_TYPE_INV_ADDR,
+	.transaction_type = CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,
+	/* .validity_flags = <set below> */
+	.channel = 1,
+	.bank_group = 5,
+	.bank = 2,
+	.column = {0xDE, 0xAD},
+};
+
+struct cxl_event_mem_module mem_module = {
+	.hdr = {
+		.id = UUID_INIT(0xfe927475, 0xdd59, 0x4339,
+				0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74),
+		.length = sizeof(struct cxl_event_mem_module),
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0),
+	},
+	.event_type = CXL_MMER_TEMP_CHANGE,
+	.info = {
+		.health_status = CXL_DHI_HS_PERFORMANCE_DEGRADED,
+		.media_status = CXL_DHI_MS_ALL_DATA_LOST,
+		.add_status = (CXL_DHI_AS_CRITICAL << 2) |
+			      (CXL_DHI_AS_WARNING << 4) |
+			      (CXL_DHI_AS_WARNING << 5),
+		.device_temp = { 0xDE, 0xAD},
+		.dirty_shutdown_cnt = { 0xde, 0xad, 0xbe, 0xef },
+		.cor_vol_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
+		.cor_per_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
+	}
+};
+
 void cxl_mock_add_event_logs(struct mock_event_store *mes)
 {
+	put_unaligned_le16(CXL_GMER_VALID_CHANNEL | CXL_GMER_VALID_RANK,
+			   &gen_media.validity_flags);
+
+	put_unaligned_le16(CXL_DER_VALID_CHANNEL | CXL_DER_VALID_BANK_GROUP |
+			   CXL_DER_VALID_BANK | CXL_DER_VALID_COLUMN,
+			   &dram.validity_flags);
+
 	event_store_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
+	event_store_add_event(mes, CXL_EVENT_TYPE_INFO,
+			      (struct cxl_event_record_raw *)&gen_media);
+	event_store_add_event(mes, CXL_EVENT_TYPE_INFO,
+			      (struct cxl_event_record_raw *)&mem_module);
 	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
 
 	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL,
+			      (struct cxl_event_record_raw *)&dram);
 	mes->ev_status |= CXLDEV_EVENT_STATUS_FATAL;
 }
 EXPORT_SYMBOL_GPL(cxl_mock_add_event_logs);
-- 
2.37.2


* [PATCH V3 8/8] cxl/test: Simulate event log overflow
  2022-12-08  5:21 [PATCH V3 0/8] CXL: Process event logs ira.weiny
                   ` (6 preceding siblings ...)
  2022-12-08  5:21 ` [PATCH V3 7/8] cxl/test: Add specific events ira.weiny
@ 2022-12-08  5:21 ` ira.weiny
  2022-12-09 22:52   ` Dan Williams
  7 siblings, 1 reply; 25+ messages in thread
From: ira.weiny @ 2022-12-08  5:21 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Jonathan Cameron, Bjorn Helgaas, Alison Schofield,
	Vishal Verma, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Log overflow is marked by a separate trace message.

Simulate a log with lots of messages and flag overflow until space is
cleared.
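
The overflow accounting can be pictured with a small userspace sketch.
This is only an illustration under stated assumptions: the names
(sketch_log, sketch_add_event, LOG_CAPACITY) are hypothetical and are not
part of cxl_test, and clearing is simplified to emptying the whole log
rather than advancing a clear index.

```c
#include <assert.h>
#include <stdint.h>

#define LOG_CAPACITY 4	/* hypothetical capacity, not CXL_TEST_EVENT_CNT_MAX */

struct sketch_log {
	uint16_t nr_events;	/* records actually stored */
	uint16_t nr_overflow;	/* records dropped since the last clear */
	int events[LOG_CAPACITY];
};

/* Store a record, or count it as an overflow when the log is full. */
static void sketch_add_event(struct sketch_log *log, int event)
{
	if (log->nr_events >= LOG_CAPACITY) {
		log->nr_overflow++;
		return;
	}
	log->events[log->nr_events++] = event;
}

/* Overflow is flagged for as long as dropped records are outstanding. */
static int sketch_overflow_flagged(const struct sketch_log *log)
{
	return log->nr_overflow != 0;
}

/* Simplification: a clear empties the log and ends the overflow condition. */
static void sketch_clear(struct sketch_log *log)
{
	log->nr_events = 0;
	log->nr_overflow = 0;
}
```

Adding six records to a log of capacity four leaves four stored and two
counted as overflow; the flag stays set until sketch_clear() runs.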

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V2:
	Rebased on pending cxl-security

Changes from RFC
	Adjust for new struct changes
---
 tools/testing/cxl/test/events.c | 48 ++++++++++++++++++++++++++++++++-
 tools/testing/cxl/test/events.h |  2 ++
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/tools/testing/cxl/test/events.c b/tools/testing/cxl/test/events.c
index 5214826d264f..f389e1ba2ab7 100644
--- a/tools/testing/cxl/test/events.c
+++ b/tools/testing/cxl/test/events.c
@@ -23,6 +23,7 @@ void reset_event_log(struct mock_event_log *log)
 {
 	log->cur_idx = 0;
 	log->clear_idx = 0;
+	log->nr_overflow = log->overflow_reset;
 }
 
 /* Handle can never be 0 use 1 based indexing for handle */
@@ -54,8 +55,12 @@ static void event_store_add_event(struct mock_event_store *mes,
 		return;
 
 	log = &mes->mock_logs[log_type];
-	if (WARN_ON(log->nr_events >= CXL_TEST_EVENT_CNT_MAX))
+
+	if ((log->nr_events + 1) > CXL_TEST_EVENT_CNT_MAX) {
+		log->nr_overflow++;
+		log->overflow_reset = log->nr_overflow;
 		return;
+	}
 
 	log->events[log->nr_events] = event;
 	log->nr_events++;
@@ -65,6 +70,7 @@ int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 {
 	struct cxl_get_event_payload *pl;
 	struct mock_event_log *log;
+	u16 nr_overflow;
 	u8 log_type;
 	int i;
 
@@ -96,6 +102,19 @@ int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 	if (!log_empty(log))
 		pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
 
+	if (log->nr_overflow) {
+		u64 ns;
+
+		pl->flags |= CXL_GET_EVENT_FLAG_OVERFLOW;
+		pl->overflow_err_count = cpu_to_le16(log->nr_overflow);
+		ns = ktime_get_real_ns();
+		ns -= 5000000000; /* 5s ago */
+		pl->first_overflow_timestamp = cpu_to_le64(ns);
+		ns = ktime_get_real_ns();
+		ns -= 1000000000; /* 1s ago */
+		pl->last_overflow_timestamp = cpu_to_le64(ns);
+	}
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(mock_get_event);
@@ -136,6 +155,9 @@ int mock_clear_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 		}
 	}
 
+	if (log->nr_overflow)
+		log->nr_overflow = 0;
+
 	/* Clear events */
 	log->clear_idx += pl->nr_recs;
 	return 0;
@@ -260,6 +282,30 @@ void cxl_mock_add_event_logs(struct mock_event_store *mes)
 			      (struct cxl_event_record_raw *)&mem_module);
 	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
 
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &maint_needed);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
+			      (struct cxl_event_record_raw *)&dram);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
+			      (struct cxl_event_record_raw *)&gen_media);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
+			      (struct cxl_event_record_raw *)&mem_module);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
+			      (struct cxl_event_record_raw *)&dram);
+	/* Overflow this log */
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	mes->ev_status |= CXLDEV_EVENT_STATUS_FAIL;
+
 	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
 	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL,
 			      (struct cxl_event_record_raw *)&dram);
diff --git a/tools/testing/cxl/test/events.h b/tools/testing/cxl/test/events.h
index 626cd79f1871..80a74568f455 100644
--- a/tools/testing/cxl/test/events.h
+++ b/tools/testing/cxl/test/events.h
@@ -14,6 +14,8 @@ struct mock_event_log {
 	u16 clear_idx;
 	u16 cur_idx;
 	u16 nr_events;
+	u16 nr_overflow;
+	u16 overflow_reset;
 	struct cxl_event_record_raw *events[CXL_TEST_EVENT_CNT_MAX];
 };
 
-- 
2.37.2


* RE: [PATCH V3 1/8] cxl/mem: Read, trace, and clear events on driver load
  2022-12-08  5:21 ` [PATCH V3 1/8] cxl/mem: Read, trace, and clear events on driver load ira.weiny
@ 2022-12-09 17:56   ` Dan Williams
  2022-12-09 21:00     ` Ira Weiny
  0 siblings, 1 reply; 25+ messages in thread
From: Dan Williams @ 2022-12-09 17:56 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

ira.weiny@ wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL devices have multiple event logs which can be queried for CXL event
> records.  Devices are required to support the storage of at least one
> event record in each event log type.
> 
> Devices track event log overflow by incrementing a counter and tracking
> the time of the first and last overflow event seen.
> 
> Software queries events via the Get Event Record mailbox command; CXL
> rev 3.0 section 8.2.9.2.2 and clears events via CXL rev 3.0 section
> 8.2.9.2.3 Clear Event Records mailbox command.
> 
> CXL _OSC Error Reporting Control is used by the OS to determine if
> Firmware has control of various error reporting capabilities including
> the event logs.
> 
> Expose the result of negotiating CXL Error Reporting Control in struct
> pci_host_bridge for consumption by the CXL drivers.  If support is
> controlled by the OS read and clear all event logs on driver load.
> 
> Ensure a clean slate of events by reading and clearing the events on
> driver load.  The operation is performed twice to ensure that any prior
> partial readings are completed and a fresh read from the start is done
> at least one time.  This is done even if rogue reads cause clear errors.
> 
> The status register is not used because a device may continue to trigger
> events and the only requirement is to empty the log at least once.  This
> allows for the required transition from empty to non-empty for interrupt
> generation.  Handling of interrupts is in a follow on patch.
> 
> The device can return up to 1MB worth of event records per query.
> Allocate a shared large buffer to handle the max number of records based
> on the mailbox payload size.
> 
> This patch traces a raw event record and leaves specific event record
> type tracing to subsequent patches.  Macros are created to aid in
> tracing the common CXL Event header fields.
> 
> Each record is cleared explicitly.  A clear all bit is specified but is
> only valid when the log overflows.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes from V2:
> 	Rebased on 6.3 pending changes
> 	Move cxl_mem_alloc_event_buf() to pci.c
> 	Define and use CXLDEV_EVENT_STATUS_ALL
> 	Fix error flow on clear failure
> 	Remove tags
> 	Jonathan/Dan
> 		Add in OSC Error Reporting Control check
> 	Dan (Jonathan in previous version)
> 		Squash Clear events and the driver load patch into this one.
> 	Dan
> 		Make event device status a separate structure
> 		Move tracing to within cxl core
> 		Reduce clear double loop to a single loop
> 		Pass struct device to trace points
> 		Adjust to new cxl_internal_send_cmd()
> 		Query error logs in order of severity fatal -> Info
> 		Remove uapi defines entirely
> 		pass total via get_pl
> 		fix 'Clearning' spelling
> 		Better clarify event_buf singular allocation
> 		Use decimal for command payload array sizes
> 		Remove trace_*_enabled() optimization
> 		Put GET/CLEAR macros at the end of the user enum to
> 		preserve compatibility
> 		Add Get/Clear Events to kernel exclusive commands
> 		Remove cxl_event_log_type_str() outside of tracing
> 		Add cond_resched() to event log processing
> 	Jonathan
> 		s/event_buf_lock/event_log_lock
> 		Read through all logs two times to ensure partial reads are
> 			covered.
> 		Pass buffer to cxl_mem_free_event_buffer()
> 		kdoc for event buf
> 		Account for cxlds->payload_size limiting the max handles
> 		Don't use min_t as it was used incorrectly
> 
> Changes from V1:
> 	Clear Event Record allows for u8 handles while Get Event Record
> 	allows for u16 records to be returned.  Based on Jonathan's
> 	feedback; allow for all event records to be handled in this
> 	clear.  Which means a double loop with potentially multiple
> 	Clear Event payloads being sent to clear all events sent.
> 
> Changes from RFC:
> 	Jonathan
> 		Clean up init of payload and use return code.
> 		Also report any error to clear the event.
> 		s/v3.0/rev 3.0

This is a lot of text to skip over. The cover letter has the summary of
the changes and a link to v2 and that has the link to v1. Why does each
patch need the full history of changes all over again?

> 
> squash: make event device state a separate structure.
> ---
>  drivers/acpi/pci_root.c  |   3 +
>  drivers/cxl/core/mbox.c  | 138 +++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/core/trace.h | 120 ++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h        |  12 ++++
>  drivers/cxl/cxlmem.h     |  84 ++++++++++++++++++++++++
>  drivers/cxl/pci.c        |  42 ++++++++++++
>  include/linux/pci.h      |   1 +
>  7 files changed, 400 insertions(+)
> 
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index b3c202d2a433..cee8f56fb63a 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c

I do appreciate that this patch is a complete thought about error
handling, implementing the full cycle of retrieve and clear, but I think
these drivers/acpi/ changes can stand on their own as a lead-in patch,
if only so that Bjorn and Rafael do not need to dig through the CXL
details.


> @@ -1047,6 +1047,9 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
>  	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
>  		host_bridge->native_dpc = 0;
>  
> +	if (root->osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL)
> +		host_bridge->native_cxl_error = 1;
> +

Copy the existing style and init the flag to 1 in pci_init_host_bridge()
and clear it in acpi_pci_root_create() upon failure to take control.
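
A rough userspace sketch of that default-on, clear-on-failure pattern,
mirroring the native_dpc handling quoted just above. Everything here is
a stand-in: sketch_host_bridge, sketch_init_host_bridge(),
sketch_root_create(), and the bit value are hypothetical and only model
the shape of the change, not the real PCI or ACPI structures.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_OSC_CXL_ERROR_REPORTING	(1u << 0)	/* hypothetical bit value */

struct sketch_host_bridge {
	unsigned int native_cxl_error:1;
};

/* Default to native (OS) control, as a generic init helper would. */
static void sketch_init_host_bridge(struct sketch_host_bridge *bridge)
{
	bridge->native_cxl_error = 1;
}

/* After _OSC negotiation: drop the flag only if control was not granted. */
static void sketch_root_create(struct sketch_host_bridge *bridge,
			       uint32_t osc_ext_control_set)
{
	if (!(osc_ext_control_set & SKETCH_OSC_CXL_ERROR_REPORTING))
		bridge->native_cxl_error = 0;
}
```

With this shape, a host bridge that never goes through the ACPI path
keeps the flag set, which is what the other native_* flags do.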

>  	/*
>  	 * Evaluate the "PCI Boot Configuration" _DSM Function.  If it
>  	 * exists and returns 0, we must preserve any PCI resource
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index b03fba212799..815da3aac081 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -8,6 +8,7 @@
>  #include <cxl.h>
>  
>  #include "core.h"
> +#include "trace.h"
>  
>  static bool cxl_raw_allow_all;
>  
> @@ -717,6 +718,142 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>  
> +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> +				  enum cxl_event_log_type log,
> +				  struct cxl_get_event_payload *get_pl)
> +{
> +	struct cxl_mbox_clear_event_payload payload = {
> +		.event_log = log,
> +	};
> +	u16 total = le16_to_cpu(get_pl->record_count);
> +	u8 max_handles = CXL_CLEAR_EVENT_MAX_HANDLES;
> +	size_t pl_size = sizeof(payload);
> +	struct cxl_mbox_cmd mbox_cmd;
> +	u16 cnt;
> +	int rc;
> +	int i;
> +
> +	/* Payload size may limit the max handles */
> +	if (pl_size > cxlds->payload_size) {
> +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
> +		pl_size = cxlds->payload_size;
> +	}
> +
> +	mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> +		.payload_in = &payload,
> +		.size_in = pl_size,
> +	};
> +
> +	/*
> +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> +	 * Record can return up to 0xffff records.
> +	 */
> +	i = 0;
> +	for (cnt = 0; cnt < total; cnt++) {
> +		payload.handle[i++] = get_pl->records[cnt].hdr.handle;
> +		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
> +			log, le16_to_cpu(payload.handle[i]));
> +
> +		if (i == max_handles) {
> +			payload.nr_recs = i;
> +			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +			if (rc)
> +				return rc;
> +			i = 0;
> +		}
> +	}
> +
> +	/* Clear what is left if any */
> +	if (i) {
> +		payload.nr_recs = i;
> +		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +		if (rc)
> +			return rc;
> +	}
> +
> +	return 0;
> +}
> +
> +static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> +				    enum cxl_event_log_type type)
> +{
> +	struct cxl_get_event_payload *payload;
> +	struct cxl_mbox_cmd mbox_cmd;
> +	u8 log_type = type;
> +	u16 nr_rec;
> +
> +	mutex_lock(&cxlds->event.log_lock);
> +	payload = cxlds->event.buf;
> +
> +	mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_GET_EVENT_RECORD,
> +		.payload_in = &log_type,
> +		.size_in = sizeof(log_type),
> +		.payload_out = payload,
> +		.size_out = cxlds->payload_size,
> +		.min_out = struct_size(payload, records, 0),
> +	};
> +
> +	do {
> +		int rc;
> +
> +		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +		if (rc) {
> +			dev_err(cxlds->dev, "Event log '%d': Failed to query event records : %d",
> +				type, rc);
> +			goto unlock_buffer;
> +		}
> +
> +		nr_rec = le16_to_cpu(payload->record_count);
> +		if (nr_rec > 0) {

Can save some indentation here and just do:

	if (!nr_rec)
		break;

...and then out-indent this below:

> +			int i;
> +
> +			for (i = 0; i < nr_rec; i++)
> +				trace_cxl_generic_event(cxlds->dev, type,
> +							&payload->records[i]);
> +
> +			rc = cxl_clear_event_record(cxlds, type, payload);
> +			if (rc) {
> +				dev_err(cxlds->dev, "Event log '%d': Failed to clear events : %d",
> +					type, rc);

This and the other dev_err() above should be dev_err_ratelimited() because if
these ever fire they'll probably start firing in bunches.

> +				goto unlock_buffer;

Nit, but how about just "break;" here? No need for a label.
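
Putting the early-exit and "break" suggestions together, the loop could
take roughly the following shape. This is a standalone sketch with a
made-up device model (sketch_dev, sketch_get(), sketch_clear(),
SKETCH_BATCH are all hypothetical); it leaves out the mailbox command,
the locking, and the overflow trace.

```c
#include <assert.h>

#define SKETCH_BATCH 3	/* hypothetical number of records returned per query */

struct sketch_dev {
	int pending;	/* records still in the device log */
	int fail_clear;	/* non-zero: the clear step reports an error */
	int traced;	/* records handed to the trace step */
};

/* Model of a Get: report how many records this query returns. */
static int sketch_get(struct sketch_dev *dev)
{
	return dev->pending < SKETCH_BATCH ? dev->pending : SKETCH_BATCH;
}

/* Model of a Clear: remove the records just read, or fail. */
static int sketch_clear(struct sketch_dev *dev, int nr)
{
	if (dev->fail_clear)
		return -1;
	dev->pending -= nr;
	return 0;
}

/* Drain the log; every exit is a plain break, so no label is needed. */
static void sketch_drain(struct sketch_dev *dev)
{
	int nr_rec;

	do {
		nr_rec = sketch_get(dev);
		if (!nr_rec)
			break;

		dev->traced += nr_rec;

		if (sketch_clear(dev, nr_rec))
			break;
	} while (nr_rec);
	/* a single unlock would sit here, reached by every path */
}
```

A clear failure stops after the first batch has been traced, and the
normal case loops until a query returns zero records.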

> +			}
> +		}
> +
> +		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
> +			trace_cxl_overflow(cxlds->dev, type, payload);

I'm worried about this potentially looping and causing softlockups
without a cond_resched(), but let's wait and see if more complexity is
needed.

> +	} while (nr_rec);
> +
> +unlock_buffer:
> +	mutex_unlock(&cxlds->event.log_lock);
> +}
> +
> +/**
> + * cxl_mem_get_event_records - Get Event Records from the device
> + * @cxlds: The device data for the operation
> + *
> + * Retrieve all event records available on the device, report them as trace
> + * events, and clear them.
> + *
> + * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> + * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
> + */
> +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status)
> +{
> +	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
> +
> +	if (status & CXLDEV_EVENT_STATUS_FATAL)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
> +	if (status & CXLDEV_EVENT_STATUS_FAIL)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> +	if (status & CXLDEV_EVENT_STATUS_WARN)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> +	if (status & CXLDEV_EVENT_STATUS_INFO)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> +
>  /**
>   * cxl_mem_get_partition_info - Get partition info
>   * @cxlds: The device data for the operation
> @@ -868,6 +1005,7 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
>  	}
>  
>  	mutex_init(&cxlds->mbox_mutex);
> +	mutex_init(&cxlds->event.log_lock);
>  	cxlds->dev = dev;
>  
>  	return cxlds;
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index 20ca2fe2ca8e..24eef6909f13 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -6,7 +6,9 @@
>  #if !defined(_CXL_EVENTS_H) || defined(TRACE_HEADER_MULTI_READ)
>  #define _CXL_EVENTS_H
>  
> +#include <asm-generic/unaligned.h>
>  #include <cxl.h>
> +#include <cxlmem.h>
>  #include <linux/tracepoint.h>
>  
>  #define CXL_RAS_UC_CACHE_DATA_PARITY	BIT(0)
> @@ -103,6 +105,124 @@ TRACE_EVENT(cxl_aer_correctable_error,
>  	)
>  );
>  
> +#include <linux/tracepoint.h>
> +
> +#define cxl_event_log_type_str(type)				\
> +	__print_symbolic(type,					\
> +		{ CXL_EVENT_TYPE_INFO, "Informational" },	\
> +		{ CXL_EVENT_TYPE_WARN, "Warning" },		\
> +		{ CXL_EVENT_TYPE_FAIL, "Failure" },		\
> +		{ CXL_EVENT_TYPE_FATAL, "Fatal" })
> +
> +TRACE_EVENT(cxl_overflow,
> +
> +	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
> +		 struct cxl_get_event_payload *payload),
> +
> +	TP_ARGS(dev, log, payload),
> +
> +	TP_STRUCT__entry(
> +		__string(dev_name, dev_name(dev))
> +		__field(int, log)
> +		__field(u64, first_ts)
> +		__field(u64, last_ts)
> +		__field(u16, count)
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(dev_name, dev_name(dev));
> +		__entry->log = log;
> +		__entry->count = le16_to_cpu(payload->overflow_err_count);
> +		__entry->first_ts = le64_to_cpu(payload->first_overflow_timestamp);
> +		__entry->last_ts = le64_to_cpu(payload->last_overflow_timestamp);
> +	),
> +
> +	TP_printk("%s: EVENT LOG OVERFLOW log=%s : %u records from %llu to %llu",

The tracepoint is already called "cxl_overflow", seems redundant to also
print "EVENT LOG OVERFLOW".

> +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),
> +		__entry->count, __entry->first_ts, __entry->last_ts)
> +
> +);
> +
> +/*
> + * Common Event Record Format
> + * CXL 3.0 section 8.2.9.2.1; Table 8-42
> + */
> +#define CXL_EVENT_RECORD_FLAG_PERMANENT		BIT(2)
> +#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED	BIT(3)
> +#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED	BIT(4)
> +#define CXL_EVENT_RECORD_FLAG_HW_REPLACE	BIT(5)
> +#define show_hdr_flags(flags)	__print_flags(flags, " | ",			   \
> +	{ CXL_EVENT_RECORD_FLAG_PERMANENT,	"PERMANENT_CONDITION"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,	"MAINTENANCE_NEEDED"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,	"PERFORMANCE_DEGRADED"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_HW_REPLACE,	"HARDWARE_REPLACEMENT_NEEDED"	}  \

Nit, I know this does not matter for parsing since tooling will use the
raw TP_STRUCT__entry fields, but why are these "CAPITAL_UNDERSCORE" while
other symbols are just "Capitalized term"?

> +)
> +
> +/*
> + * Define macros for the common header of each CXL event.
> + *
> + * Tracepoints using these macros must do 3 things:
> + *
> + *	1) Add CXL_EVT_TP_entry to TP_STRUCT__entry
> + *	2) Use CXL_EVT_TP_fast_assign within TP_fast_assign;
> + *	   pass the dev, log, and CXL event header
> + *	3) Use CXL_EVT_TP_printk() instead of TP_printk()
> + *
> + * See the generic_event tracepoint as an example.
> + */
> +#define CXL_EVT_TP_entry					\
> +	__string(dev_name, dev_name(dev))			\
> +	__field(int, log)					\
> +	__field_struct(uuid_t, hdr_uuid)			\
> +	__field(u32, hdr_flags)					\
> +	__field(u16, hdr_handle)				\
> +	__field(u16, hdr_related_handle)			\
> +	__field(u64, hdr_timestamp)				\
> +	__field(u8, hdr_length)					\
> +	__field(u8, hdr_maint_op_class)
> +
> +#define CXL_EVT_TP_fast_assign(dev, l, hdr)					\
> +	__assign_str(dev_name, dev_name(dev));					\
> +	__entry->log = (l);							\
> +	memcpy(&__entry->hdr_uuid, &(hdr).id, sizeof(uuid_t));			\
> +	__entry->hdr_length = (hdr).length;					\
> +	__entry->hdr_flags = get_unaligned_le24((hdr).flags);			\
> +	__entry->hdr_handle = le16_to_cpu((hdr).handle);			\
> +	__entry->hdr_related_handle = le16_to_cpu((hdr).related_handle);	\
> +	__entry->hdr_timestamp = le64_to_cpu((hdr).timestamp);			\
> +	__entry->hdr_maint_op_class = (hdr).maint_op_class
> +
> +#define CXL_EVT_TP_printk(fmt, ...) \
> +	TP_printk("%s log=%s : time=%llu uuid=%pUb len=%d flags='%s' "		\
> +		"handle=%x related_handle=%x maint_op_class=%u"			\
> +		" : " fmt,							\
> +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),	\
> +		__entry->hdr_timestamp, &__entry->hdr_uuid, __entry->hdr_length,\
> +		show_hdr_flags(__entry->hdr_flags), __entry->hdr_handle,	\
> +		__entry->hdr_related_handle, __entry->hdr_maint_op_class,	\
> +		##__VA_ARGS__)
> +
> +TRACE_EVENT(cxl_generic_event,
> +
> +	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
> +		 struct cxl_event_record_raw *rec),
> +
> +	TP_ARGS(dev, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +		__array(u8, data, CXL_EVENT_RECORD_DATA_LENGTH)
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
> +		memcpy(__entry->data, &rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
> +	),
> +
> +	CXL_EVT_TP_printk("%s",
> +		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
> +);
> +
>  #endif /* _CXL_EVENTS_H */
>  
>  #define TRACE_INCLUDE_FILE trace
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index aa3af3bb73b2..5974d1082210 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -156,6 +156,18 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
>  #define CXLDEV_CAP_CAP_ID_SECONDARY_MAILBOX 0x3
>  #define CXLDEV_CAP_CAP_ID_MEMDEV 0x4000
>  
> +/* CXL 3.0 8.2.8.3.1 Event Status Register */
> +#define CXLDEV_DEV_EVENT_STATUS_OFFSET		0x00
> +#define CXLDEV_EVENT_STATUS_INFO		BIT(0)
> +#define CXLDEV_EVENT_STATUS_WARN		BIT(1)
> +#define CXLDEV_EVENT_STATUS_FAIL		BIT(2)
> +#define CXLDEV_EVENT_STATUS_FATAL		BIT(3)
> +
> +#define CXLDEV_EVENT_STATUS_ALL (CXLDEV_EVENT_STATUS_INFO |	\
> +				 CXLDEV_EVENT_STATUS_WARN |	\
> +				 CXLDEV_EVENT_STATUS_FAIL |	\
> +				 CXLDEV_EVENT_STATUS_FATAL)
> +
>  /* CXL 2.0 8.2.8.4 Mailbox Registers */
>  #define CXLDEV_MBOX_CAPS_OFFSET 0x00
>  #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index ab138004f644..dd9aa3dd738e 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -4,6 +4,7 @@
>  #define __CXL_MEM_H__
>  #include <uapi/linux/cxl_mem.h>
>  #include <linux/cdev.h>
> +#include <linux/uuid.h>
>  #include "cxl.h"
>  
>  /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
> @@ -193,6 +194,17 @@ struct cxl_endpoint_dvsec_info {
>  	struct range dvsec_range[2];
>  };
>  
> +/**
> + * struct cxl_event_state - Event log driver state
> + *
> + * @event_buf: Buffer to receive event data
> + * @event_log_lock: Serialize event_buf and log use
> + */
> +struct cxl_event_state {
> +	struct cxl_get_event_payload *buf;
> +	struct mutex log_lock;
> +};
> +
>  /**
>   * struct cxl_dev_state - The driver device state
>   *
> @@ -266,12 +278,16 @@ struct cxl_dev_state {
>  
>  	struct xarray doe_mbs;
>  
> +	struct cxl_event_state event;
> +
>  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
>  };
>  
>  enum cxl_opcode {
>  	CXL_MBOX_OP_INVALID		= 0x0000,
>  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> +	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> +	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
>  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
>  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
>  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> @@ -347,6 +363,73 @@ struct cxl_mbox_identify {
>  	u8 qos_telemetry_caps;
>  } __packed;
>  
> +/*
> + * Common Event Record Format
> + * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
> + */
> +struct cxl_event_record_hdr {
> +	uuid_t id;
> +	u8 length;
> +	u8 flags[3];
> +	__le16 handle;
> +	__le16 related_handle;
> +	__le64 timestamp;
> +	u8 maint_op_class;
> +	u8 reserved[15];
> +} __packed;
> +
> +#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
> +struct cxl_event_record_raw {
> +	struct cxl_event_record_hdr hdr;
> +	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
> +} __packed;
> +
> +/*
> + * Get Event Records output payload
> + * CXL rev 3.0 section 8.2.9.2.2; Table 8-50
> + */
> +#define CXL_GET_EVENT_FLAG_OVERFLOW		BIT(0)
> +#define CXL_GET_EVENT_FLAG_MORE_RECORDS		BIT(1)
> +struct cxl_get_event_payload {
> +	u8 flags;
> +	u8 reserved1;
> +	__le16 overflow_err_count;
> +	__le64 first_overflow_timestamp;
> +	__le64 last_overflow_timestamp;
> +	__le16 record_count;
> +	u8 reserved2[10];
> +	struct cxl_event_record_raw records[];
> +} __packed;
> +
> +/*
> + * CXL rev 3.0 section 8.2.9.2.2; Table 8-49
> + */
> +enum cxl_event_log_type {
> +	CXL_EVENT_TYPE_INFO = 0x00,
> +	CXL_EVENT_TYPE_WARN,
> +	CXL_EVENT_TYPE_FAIL,
> +	CXL_EVENT_TYPE_FATAL,
> +	CXL_EVENT_TYPE_MAX
> +};
> +
> +/*
> + * Clear Event Records input payload
> + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> + */
> +#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
> +struct cxl_mbox_clear_event_payload {
> +	u8 event_log;		/* enum cxl_event_log_type */
> +	u8 clear_flags;
> +	u8 nr_recs;
> +	u8 reserved[3];
> +	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];
> +} __packed;
> +#define CXL_CLEAR_EVENT_LIMIT_HANDLES(payload_size)			\
> +	(((payload_size) -						\
> +		(sizeof(struct cxl_mbox_clear_event_payload) -		\
> +		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
> +		sizeof(__le16))
> +
>  struct cxl_mbox_get_partition_info {
>  	__le64 active_volatile_cap;
>  	__le64 active_persistent_cap;
> @@ -441,6 +524,7 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
>  struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
>  void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
>  void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status);
>  #ifdef CONFIG_CXL_SUSPEND
>  void cxl_mem_active_inc(void);
>  void cxl_mem_active_dec(void);
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 3a66aadb4df0..86c84611a168 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -417,8 +417,44 @@ static void disable_aer(void *pdev)
>  	pci_disable_pcie_error_reporting(pdev);
>  }
>  
> +static void cxl_mem_free_event_buffer(void *buf)
> +{
> +	kvfree(buf);
> +}
> +
> +/*
> + * There is a single buffer for reading event logs from the mailbox.  All logs
> + * share this buffer protected by the cxlds->event_log_lock.
> + */
> +static void cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_get_event_payload *buf;
> +
> +	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
> +		cxlds->payload_size);
> +
> +	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> +	if (WARN_ON_ONCE(!buf))

No, why is event init so special that it behaves differently than all
the other init-time allocations this driver does?

> +		return;

return -ENOMEM;

> +
> +	if (WARN_ON_ONCE(devm_add_action_or_reset(cxlds->dev,
> +			 cxl_mem_free_event_buffer, buf)))
> +		return;

ditto.

> +
> +	cxlds->event.buf = buf;
> +}
> +
> +static void cxl_clear_event_logs(struct cxl_dev_state *cxlds)
> +{
> +	/* Force read and clear of all logs */
> +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +	/* Ensure prior partial reads are handled, by starting over again */

What partial reads? cxl_mem_get_event_records() reads every log until
each returns an empty result. Any remaining events after this returns
are events that fired during the retrieval.

So I do not think cxl_clear_event_logs() needs to exist, just call
cxl_mem_get_event_records(CXLDEV_EVENT_STATUS_ALL) once and that's it.


> +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +}
> +
>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
> +	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
>  	struct cxl_register_map map;
>  	struct cxl_memdev *cxlmd;
>  	struct cxl_dev_state *cxlds;
> @@ -494,6 +530,12 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> +	if (host_bridge->native_cxl_error) {

Probably deserves a small comment about why this flag matters for event
init. Something like: "When BIOS maintains CXL error reporting control,
it will also reap event records. Only one agent can interface with the
event mechanism."

> +		cxl_mem_alloc_event_buf(cxlds);
> +		if (cxlds->event.buf)

No need for this conditional here since this whole block is skipped if
!native_cxl_error and cxl_mem_alloc_event_buf() will fail driver load if
it fails to init.
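
As an illustration of the suggested flow, a minimal user-space sketch follows.
It assumes malloc() in place of kvmalloc(), and struct fake_dev_state,
alloc_event_buf() and fake_probe() are hypothetical names invented for this
example, not driver code.  The point is only that the allocator reports
-ENOMEM and the caller propagates it, so no NULL check on the buffer is
needed afterwards.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the parts of cxl_dev_state used here. */
struct fake_dev_state {
	void *event_buf;
	size_t payload_size;
};

/* Allocate the shared event buffer; fail with -ENOMEM instead of warning. */
static int alloc_event_buf(struct fake_dev_state *ds)
{
	void *buf = malloc(ds->payload_size);

	if (!buf)
		return -ENOMEM;

	ds->event_buf = buf;
	return 0;
}

/* Probe-like caller: event init only happens under native control. */
static int fake_probe(struct fake_dev_state *ds, int native_cxl_error)
{
	int rc;

	if (native_cxl_error) {
		rc = alloc_event_buf(ds);
		if (rc)
			return rc;	/* allocation failure fails the load */
		/* a single read-and-clear pass over all logs would go here */
	}

	return 0;
}
```

With this shape a caller that sees fake_probe() return 0 under native control
can rely on event_buf being valid; in the driver the buffer would be released
through the devm action rather than by hand.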

> +			cxl_clear_event_logs(cxlds);
> +	}
> +
>  	if (cxlds->regs.ras) {
>  		pci_enable_pcie_error_reporting(pdev);
>  		rc = devm_add_action_or_reset(&pdev->dev, disable_aer, pdev);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 1f81807492ef..7fe3752a204e 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -580,6 +580,7 @@ struct pci_host_bridge {
>  	unsigned int	preserve_config:1;	/* Preserve FW resource setup */
>  	unsigned int	size_windows:1;		/* Enable root bus sizing */
>  	unsigned int	msi_domain:1;		/* Bridge wants MSI domain */
> +	unsigned int	native_cxl_error:1;	/* OS CXL Error reporting */

I would group this with the other native_ flags and copy the comment
style "OS may use CXL RAS + Events", or somesuch.


* Re: [PATCH V3 1/8] cxl/mem: Read, trace, and clear events on driver load
  2022-12-09 17:56   ` Dan Williams
@ 2022-12-09 21:00     ` Ira Weiny
  2022-12-09 22:33       ` Dan Williams
  0 siblings, 1 reply; 25+ messages in thread
From: Ira Weiny @ 2022-12-09 21:00 UTC (permalink / raw)
  To: Dan Williams
  Cc: Bjorn Helgaas, Alison Schofield, Vishal Verma, Davidlohr Bueso,
	Jonathan Cameron, Dave Jiang, linux-kernel, linux-pci,
	linux-acpi, linux-cxl

On Fri, Dec 09, 2022 at 09:56:49AM -0800, Dan Williams wrote:
> ira.weiny@ wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > CXL devices have multiple event logs which can be queried for CXL event
> > records.  Devices are required to support the storage of at least one
> > event record in each event log type.
> > 
> > Devices track event log overflow by incrementing a counter and tracking
> > the time of the first and last overflow event seen.
> > 
> > Software queries events via the Get Event Record mailbox command; CXL
> > rev 3.0 section 8.2.9.2.2 and clears events via CXL rev 3.0 section
> > 8.2.9.2.3 Clear Event Records mailbox command.
> > 
> > CXL _OSC Error Reporting Control is used by the OS to determine if
> > Firmware has control of various error reporting capabilities including
> > the event logs.
> > 
> > Expose the result of negotiating CXL Error Reporting Control in struct
> > pci_host_bridge for consumption by the CXL drivers.  If support is
> > controlled by the OS read and clear all event logs on driver load.
> > 
> > Ensure a clean slate of events by reading and clearing the events on
> > driver load.  The operation is performed twice to ensure that any prior
> > partial readings are completed and a fresh read from the start is done
> > at least one time.  This is done even if rogue reads cause clear errors.
> > 
> > The status register is not used because a device may continue to trigger
> > events and the only requirement is to empty the log at least once.  This
> > allows for the required transition from empty to non-empty for interrupt
> > generation.  Handling of interrupts is in a follow on patch.
> > 
> > The device can return up to 1MB worth of event records per query.
> > Allocate a shared large buffer to handle the max number of records based
> > on the mailbox payload size.
> > 
> > This patch traces a raw event record and leaves specific event record
> > type tracing to subsequent patches.  Macros are created to aid in
> > tracing the common CXL Event header fields.
> > 
> > Each record is cleared explicitly.  A clear all bit is specified but is
> > only valid when the log overflows.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> > ---
> > Changes from V2:
> > 	Rebased on 6.3 pending changes
> > 	Move cxl_mem_alloc_event_buf() to pci.c
> > 	Define and use CXLDEV_EVENT_STATUS_ALL
> > 	Fix error flow on clear failure
> > 	Remove tags
> > 	Jonathan/Dan
> > 		Add in OSC Error Reporting Control check
> > 	Dan (Jonathan in previous version)
> > 		Squash Clear events and the driver load patch into this one.
> > 	Dan
> > 		Make event device status a separate structure
> > 		Move tracing to within cxl core
> > 		Reduce clear double loop to a single loop
> > 		Pass struct device to trace points
> > 		Adjust to new cxl_internal_send_cmd()
> > 		Query error logs in order of severity fatal -> Info
> > 		Remove uapi defines entirely
> > 		pass total via get_pl
> > 		fix 'Clearning' spelling
> > 		Better clarify event_buf singular allocation
> > 		Use decimal for command payload array sizes
> > 		Remove trace_*_enabled() optimization
> > 		Put GET/CLEAR macros at the end of the user enum to
> > 		preserve compatibility
> > 		Add Get/Clear Events to kernel exclusive commands
> > 		Remove cxl_event_log_type_str() outside of tracing
> > 		Add cond_resched() to event log processing
> > 	Jonathan
> > 		s/event_buf_lock/event_log_lock
> > 		Read through all logs two times to ensure partial reads are
> > 			covered.
> > 		Pass buffer to cxl_mem_free_event_buffer()
> > 		kdoc for event buf
> > 		Account for cxlds->payload_size limiting the max handles
> > 		Don't use min_t as it was used incorrectly
> > 
> > Changes from V1:
> > 	Clear Event Record allows for u8 handles while Get Event Record
> > 	allows for u16 records to be returned.  Based on Jonathan's
> > 	feedback; allow for all event records to be handled in this
> > 	clear.  Which means a double loop with potentially multiple
> > 	Clear Event payloads being sent to clear all events sent.
> > 
> > Changes from RFC:
> > 	Jonathan
> > 		Clean up init of payload and use return code.
> > 		Also report any error to clear the event.
> > 		s/v3.0/rev 3.0
> 
> This is a lot of text to skip over. The cover letter has the summary of
> the changes and a link to v2 and that has the link to v1. Why does each
> patch need the full history of changes all over again?

I've seen this done elsewhere and just copied it.  But I think the use of b4
send will help me in future.  I agree it is unneeded.  Even in the cover
letter.

> 
> > 
> > squash: make event device state a separate structure.
> > ---
> >  drivers/acpi/pci_root.c  |   3 +
> >  drivers/cxl/core/mbox.c  | 138 +++++++++++++++++++++++++++++++++++++++
> >  drivers/cxl/core/trace.h | 120 ++++++++++++++++++++++++++++++++++
> >  drivers/cxl/cxl.h        |  12 ++++
> >  drivers/cxl/cxlmem.h     |  84 ++++++++++++++++++++++++
> >  drivers/cxl/pci.c        |  42 ++++++++++++
> >  include/linux/pci.h      |   1 +
> >  7 files changed, 400 insertions(+)
> > 
> > diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> > index b3c202d2a433..cee8f56fb63a 100644
> > --- a/drivers/acpi/pci_root.c
> > +++ b/drivers/acpi/pci_root.c
> 
> I do appreciate that this patch has a full thought about error handling
> implementing the full cycle of retrieve and clear, but I think these
> drivers/acpi/ changes can stand on their own as a lead-in patch if only
> so that Bjorn and Rafael do not need to dig through the CXL details.

Split it out.  That was my inclination anyway.  ;-)

> 
> 
> > @@ -1047,6 +1047,9 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
> >  	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
> >  		host_bridge->native_dpc = 0;
> >  
> > +	if (root->osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL)
> > +		host_bridge->native_cxl_error = 1;
> > +
> 
> Copy the existing style and init the flag to 1 in pci_init_host_bridge()
> and clear it in acpi_pci_root_create() upon fail to take control.

Ah good point.  Done.
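
For reference, the "default to 1, clear on failure" pattern can be sketched in
user space as below.  This is only an illustration: struct fake_host_bridge,
the two fake_* helpers, and the FAKE_OSC_CXL_ERROR_REPORTING_CONTROL bit value
are hypothetical stand-ins, not the PCI/ACPI code itself.

```c
/* Hypothetical stand-in for the one flag of interest in struct pci_host_bridge. */
struct fake_host_bridge {
	unsigned int native_cxl_error:1;
};

/* Hypothetical bit value, chosen only for this illustration. */
#define FAKE_OSC_CXL_ERROR_REPORTING_CONTROL 0x1u

/* Mirrors the idea of pci_init_host_bridge(): assume native control by default. */
static void fake_init_host_bridge(struct fake_host_bridge *bridge)
{
	bridge->native_cxl_error = 1;
}

/*
 * Mirrors the idea of acpi_pci_root_create(): clear the flag when the
 * negotiated control set does not include CXL error reporting.
 */
static void fake_root_create(struct fake_host_bridge *bridge,
			     unsigned int ext_control_set)
{
	if (!(ext_control_set & FAKE_OSC_CXL_ERROR_REPORTING_CONTROL))
		bridge->native_cxl_error = 0;
}
```

This matches the style of the native_dpc handling quoted above: bridges that
never go through the ACPI path keep the default of 1.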

> 
> >  	/*
> >  	 * Evaluate the "PCI Boot Configuration" _DSM Function.  If it
> >  	 * exists and returns 0, we must preserve any PCI resource
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index b03fba212799..815da3aac081 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> > @@ -8,6 +8,7 @@
> >  #include <cxl.h>
> >  
> >  #include "core.h"
> > +#include "trace.h"
> >  
> >  static bool cxl_raw_allow_all;
> >  
> > @@ -717,6 +718,142 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
> >  }
> >  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
> >  
> > +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> > +				  enum cxl_event_log_type log,
> > +				  struct cxl_get_event_payload *get_pl)
> > +{
> > +	struct cxl_mbox_clear_event_payload payload = {
> > +		.event_log = log,
> > +	};
> > +	u16 total = le16_to_cpu(get_pl->record_count);
> > +	u8 max_handles = CXL_CLEAR_EVENT_MAX_HANDLES;
> > +	size_t pl_size = sizeof(payload);
> > +	struct cxl_mbox_cmd mbox_cmd;
> > +	u16 cnt;
> > +	int rc;
> > +	int i;
> > +
> > +	/* Payload size may limit the max handles */
> > +	if (pl_size > cxlds->payload_size) {
> > +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
> > +		pl_size = cxlds->payload_size;
> > +	}
> > +
> > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > +		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> > +		.payload_in = &payload,
> > +		.size_in = pl_size,
> > +	};
> > +
> > +	/*
> > +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> > +	 * Record can return up to 0xffff records.
> > +	 */
> > +	i = 0;
> > +	for (cnt = 0; cnt < total; cnt++) {
> > +		payload.handle[i++] = get_pl->records[cnt].hdr.handle;
> > +		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
> > +			log, le16_to_cpu(payload.handle[i]));
> > +
> > +		if (i == max_handles) {
> > +			payload.nr_recs = i;
> > +			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > +			if (rc)
> > +				return rc;
> > +			i = 0;
> > +		}
> > +	}
> > +
> > +	/* Clear what is left if any */
> > +	if (i) {
> > +		payload.nr_recs = i;
> > +		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > +		if (rc)
> > +			return rc;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> > +				    enum cxl_event_log_type type)
> > +{
> > +	struct cxl_get_event_payload *payload;
> > +	struct cxl_mbox_cmd mbox_cmd;
> > +	u8 log_type = type;
> > +	u16 nr_rec;
> > +
> > +	mutex_lock(&cxlds->event.log_lock);
> > +	payload = cxlds->event.buf;
> > +
> > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > +		.opcode = CXL_MBOX_OP_GET_EVENT_RECORD,
> > +		.payload_in = &log_type,
> > +		.size_in = sizeof(log_type),
> > +		.payload_out = payload,
> > +		.size_out = cxlds->payload_size,
> > +		.min_out = struct_size(payload, records, 0),
> > +	};
> > +
> > +	do {
> > +		int rc;
> > +
> > +		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > +		if (rc) {
> > +			dev_err(cxlds->dev, "Event log '%d': Failed to query event records : %d",
> > +				type, rc);
> > +			goto unlock_buffer;
> > +		}
> > +
> > +		nr_rec = le16_to_cpu(payload->record_count);
> > +		if (nr_rec > 0) {
> 
> Can save some indentation here and just do:
> 
> 	if (!nr_rec)
> 		break;
> 
> ...and then out-indent this below:

:-/

I suppose that does make sense.  Somehow I thought the overflow flag might be
set here.  But that should not be the case if nr_rec == 0.
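
A rough user-space sketch of the loop with the early break, just to show the
shape.  struct fake_log and the fake_* helpers are hypothetical and only
simulate a device log; the mailbox commands, tracing and locking are left out.

```c
/* Hypothetical stand-in for a device event log; not a kernel structure. */
struct fake_log {
	unsigned int pending;	/* records still held by the device */
	unsigned int batch;	/* max records returned per query */
};

/* Simulated Get Event Records: how many records this query returns. */
static unsigned int fake_get_records(const struct fake_log *log)
{
	return log->pending < log->batch ? log->pending : log->batch;
}

/* Simulated Clear Event Records: drop nr records from the log. */
static void fake_clear_records(struct fake_log *log, unsigned int nr)
{
	log->pending -= nr;
}

/* Read and clear until a query comes back empty; returns records handled. */
static unsigned int drain_log(struct fake_log *log)
{
	unsigned int total = 0;

	for (;;) {
		unsigned int nr_rec = fake_get_records(log);

		if (!nr_rec)
			break;	/* empty result, nothing left to trace or clear */

		total += nr_rec;	/* tracing each record would happen here */
		fake_clear_records(log, nr_rec);
	}

	return total;
}
```

With 10 pending records and a batch of 3, drain_log() handles all 10 over four
queries and the fifth query returns zero records, which ends the loop.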

> 
> > +			int i;
> > +
> > +			for (i = 0; i < nr_rec; i++)
> > +				trace_cxl_generic_event(cxlds->dev, type,
> > +							&payload->records[i]);
> > +
> > +			rc = cxl_clear_event_record(cxlds, type, payload);
> > +			if (rc) {
> > +				dev_err(cxlds->dev, "Event log '%d': Failed to clear events : %d",
> > +					type, rc);
> 
> This and the other dev_err() above should be dev_err_ratelimited() because if
> these ever fire they'll probably start firing in bunches.

Yea good catch.  Previously, this exited the loop and would effectively stop.
But now there is a loop on the status and this will keep printing.  Effectively
this may be forever because if something is wrong and the log is not clearing
out the status won't clear...  :-/

Thinking about this I was tempted to make this dev_err_once().  But there are
some errors which would correct themselves and could make progress.

FWIW I know you mentioned that this error could be a trace point as well.  But
I opted not to do that because if no one is listening to the trace points and
there is an error the admin will have no idea this device is going haywire for
apparently no reason.  This is especially true on start up.

> 
> > +				goto unlock_buffer;
> 
> Nit, but how about just "break;" here? No need for a label.

Just my style.  The goto is more explicit.  Especially with this much logic in
the while loop.  Also an 'unlock_buffer' label is pretty standard as the
'exception' case.

> 
> > +			}
> > +		}
> > +
> > +		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
> > +			trace_cxl_overflow(cxlds->dev, type, payload);
> 
> I'm worried about this potentially looping and causing softlockups
> without a cond_resched(), but lets wait and see if more complexity is
> needed.

I did have a cond_resched() here but opted to have it at the status level
rather than here.  So reading of _all_ the logs will not loop without a pause.
This was partly because I want this loop to drop out to read the other logs and
I'm worried that if we delay here we may end up never empting the individual
logs.  Of course then we have other problems too.  So... yea I'd like to wait
and see.  I feel like the OS should be able to keep up with normal errors.

> 
> > +	} while (nr_rec);
> > +
> > +unlock_buffer:
> > +	mutex_unlock(&cxlds->event.log_lock);
> > +}
> > +
> > +/**
> > + * cxl_mem_get_event_records - Get Event Records from the device
> > + * @cxlds: The device data for the operation
> > + *
> > + * Retrieve all event records available on the device, report them as trace
> > + * events, and clear them.
> > + *
> > + * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> > + * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
> > + */
> > +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status)
> > +{
> > +	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
> > +
> > +	if (status & CXLDEV_EVENT_STATUS_FATAL)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
> > +	if (status & CXLDEV_EVENT_STATUS_FAIL)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> > +	if (status & CXLDEV_EVENT_STATUS_WARN)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> > +	if (status & CXLDEV_EVENT_STATUS_INFO)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> > +
> >  /**
> >   * cxl_mem_get_partition_info - Get partition info
> >   * @cxlds: The device data for the operation
> > @@ -868,6 +1005,7 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
> >  	}
> >  
> >  	mutex_init(&cxlds->mbox_mutex);
> > +	mutex_init(&cxlds->event.log_lock);
> >  	cxlds->dev = dev;
> >  
> >  	return cxlds;
> > diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> > index 20ca2fe2ca8e..24eef6909f13 100644
> > --- a/drivers/cxl/core/trace.h
> > +++ b/drivers/cxl/core/trace.h
> > @@ -6,7 +6,9 @@
> >  #if !defined(_CXL_EVENTS_H) || defined(TRACE_HEADER_MULTI_READ)
> >  #define _CXL_EVENTS_H
> >  
> > +#include <asm-generic/unaligned.h>
> >  #include <cxl.h>
> > +#include <cxlmem.h>
> >  #include <linux/tracepoint.h>
> >  
> >  #define CXL_RAS_UC_CACHE_DATA_PARITY	BIT(0)
> > @@ -103,6 +105,124 @@ TRACE_EVENT(cxl_aer_correctable_error,
> >  	)
> >  );
> >  
> > +#include <linux/tracepoint.h>
> > +
> > +#define cxl_event_log_type_str(type)				\
> > +	__print_symbolic(type,					\
> > +		{ CXL_EVENT_TYPE_INFO, "Informational" },	\
> > +		{ CXL_EVENT_TYPE_WARN, "Warning" },		\
> > +		{ CXL_EVENT_TYPE_FAIL, "Failure" },		\
> > +		{ CXL_EVENT_TYPE_FATAL, "Fatal" })
> > +
> > +TRACE_EVENT(cxl_overflow,
> > +
> > +	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
> > +		 struct cxl_get_event_payload *payload),
> > +
> > +	TP_ARGS(dev, log, payload),
> > +
> > +	TP_STRUCT__entry(
> > +		__string(dev_name, dev_name(dev))
> > +		__field(int, log)
> > +		__field(u64, first_ts)
> > +		__field(u64, last_ts)
> > +		__field(u16, count)
> > +	),
> > +
> > +	TP_fast_assign(
> > +		__assign_str(dev_name, dev_name(dev));
> > +		__entry->log = log;
> > +		__entry->count = le16_to_cpu(payload->overflow_err_count);
> > +		__entry->first_ts = le64_to_cpu(payload->first_overflow_timestamp);
> > +		__entry->last_ts = le64_to_cpu(payload->last_overflow_timestamp);
> > +	),
> > +
> > +	TP_printk("%s: EVENT LOG OVERFLOW log=%s : %u records from %llu to %llu",
> 
> The tracepoint is already called "cxl_overflow", seems redundant to also
> print "EVENT LOG OVERFLOW".

Fair enough.  Deleted.

> 
> > +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),
> > +		__entry->count, __entry->first_ts, __entry->last_ts)
> > +
> > +);
> > +
> > +/*
> > + * Common Event Record Format
> > + * CXL 3.0 section 8.2.9.2.1; Table 8-42
> > + */
> > +#define CXL_EVENT_RECORD_FLAG_PERMANENT		BIT(2)
> > +#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED	BIT(3)
> > +#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED	BIT(4)
> > +#define CXL_EVENT_RECORD_FLAG_HW_REPLACE	BIT(5)
> > +#define show_hdr_flags(flags)	__print_flags(flags, " | ",			   \
> > +	{ CXL_EVENT_RECORD_FLAG_PERMANENT,	"PERMANENT_CONDITION"		}, \
> > +	{ CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,	"MAINTENANCE_NEEDED"		}, \
> > +	{ CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,	"PERFORMANCE_DEGRADED"		}, \
> > +	{ CXL_EVENT_RECORD_FLAG_HW_REPLACE,	"HARDWARE_REPLACEMENT_NEEDED"	}  \
> 
> > Nit, I know this does not matter for parsing since tooling will use the
> > raw TP_STRUCT__entry fields, but why are these "CAPITAL_UNDERSCORE" while
> > other symbols are just "Capitalized term"?

The conclusion before was that all 'flags' values should be all caps.  In
addition, because they are flags (vs the other symbols being enums) I used
underscores to allow for clear parsing of each flag within the field output.
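
To make the output shape concrete, here is a small user-space approximation of
how the flags field ends up rendered.  It is not the kernel's __print_flags()
implementation; format_hdr_flags(), struct flag_name and FAKE_BIT() are
hypothetical names for this sketch, and only the bit positions and strings are
taken from the patch.

```c
#include <stdio.h>
#include <string.h>

#define FAKE_BIT(n) (1u << (n))

struct flag_name {
	unsigned int mask;
	const char *name;
};

/* Bit positions and names copied from the patch's show_hdr_flags() table. */
static const struct flag_name hdr_flag_names[] = {
	{ FAKE_BIT(2), "PERMANENT_CONDITION" },
	{ FAKE_BIT(3), "MAINTENANCE_NEEDED" },
	{ FAKE_BIT(4), "PERFORMANCE_DEGRADED" },
	{ FAKE_BIT(5), "HARDWARE_REPLACEMENT_NEEDED" },
};

/*
 * Join the names of all set flags with " | ".  Bits without a name are
 * ignored in this sketch, and output stops at the last name that fits.
 */
static void format_hdr_flags(unsigned int flags, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	if (!len)
		return;
	buf[0] = '\0';

	for (i = 0; i < sizeof(hdr_flag_names) / sizeof(hdr_flag_names[0]); i++) {
		int n;

		if (!(flags & hdr_flag_names[i].mask))
			continue;

		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " | " : "", hdr_flag_names[i].name);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the partial name */
			return;
		}
		used += (size_t)n;
	}
}
```

With bits 2 and 3 set the buffer holds "PERMANENT_CONDITION | MAINTENANCE_NEEDED",
so each flag is a single whitespace-free token between the " | " delimiters.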

> 
> > +)
> > +
> > +/*
> > + * Define macros for the common header of each CXL event.
> > + *
> > + * Tracepoints using these macros must do 3 things:
> > + *
> > + *	1) Add CXL_EVT_TP_entry to TP_STRUCT__entry
> > + *	2) Use CXL_EVT_TP_fast_assign within TP_fast_assign;
> > + *	   pass the dev, log, and CXL event header
> > + *	3) Use CXL_EVT_TP_printk() instead of TP_printk()
> > + *
> > + * See the generic_event tracepoint as an example.
> > + */
> > +#define CXL_EVT_TP_entry					\
> > +	__string(dev_name, dev_name(dev))			\
> > +	__field(int, log)					\
> > +	__field_struct(uuid_t, hdr_uuid)			\
> > +	__field(u32, hdr_flags)					\
> > +	__field(u16, hdr_handle)				\
> > +	__field(u16, hdr_related_handle)			\
> > +	__field(u64, hdr_timestamp)				\
> > +	__field(u8, hdr_length)					\
> > +	__field(u8, hdr_maint_op_class)
> > +
> > +#define CXL_EVT_TP_fast_assign(dev, l, hdr)					\
> > +	__assign_str(dev_name, dev_name(dev));					\
> > +	__entry->log = (l);							\
> > +	memcpy(&__entry->hdr_uuid, &(hdr).id, sizeof(uuid_t));			\
> > +	__entry->hdr_length = (hdr).length;					\
> > +	__entry->hdr_flags = get_unaligned_le24((hdr).flags);			\
> > +	__entry->hdr_handle = le16_to_cpu((hdr).handle);			\
> > +	__entry->hdr_related_handle = le16_to_cpu((hdr).related_handle);	\
> > +	__entry->hdr_timestamp = le64_to_cpu((hdr).timestamp);			\
> > +	__entry->hdr_maint_op_class = (hdr).maint_op_class
> > +
> > +#define CXL_EVT_TP_printk(fmt, ...) \
> > +	TP_printk("%s log=%s : time=%llu uuid=%pUb len=%d flags='%s' "		\
> > +		"handle=%x related_handle=%x maint_op_class=%u"			\
> > +		" : " fmt,							\
> > +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),	\
> > +		__entry->hdr_timestamp, &__entry->hdr_uuid, __entry->hdr_length,\
> > +		show_hdr_flags(__entry->hdr_flags), __entry->hdr_handle,	\
> > +		__entry->hdr_related_handle, __entry->hdr_maint_op_class,	\
> > +		##__VA_ARGS__)
> > +
> > +TRACE_EVENT(cxl_generic_event,
> > +
> > +	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
> > +		 struct cxl_event_record_raw *rec),
> > +
> > +	TP_ARGS(dev, log, rec),
> > +
> > +	TP_STRUCT__entry(
> > +		CXL_EVT_TP_entry
> > +		__array(u8, data, CXL_EVENT_RECORD_DATA_LENGTH)
> > +	),
> > +
> > +	TP_fast_assign(
> > +		CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
> > +		memcpy(__entry->data, &rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
> > +	),
> > +
> > +	CXL_EVT_TP_printk("%s",
> > +		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
> > +);
> > +
> >  #endif /* _CXL_EVENTS_H */
> >  
> >  #define TRACE_INCLUDE_FILE trace
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index aa3af3bb73b2..5974d1082210 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -156,6 +156,18 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
> >  #define CXLDEV_CAP_CAP_ID_SECONDARY_MAILBOX 0x3
> >  #define CXLDEV_CAP_CAP_ID_MEMDEV 0x4000
> >  
> > +/* CXL 3.0 8.2.8.3.1 Event Status Register */
> > +#define CXLDEV_DEV_EVENT_STATUS_OFFSET		0x00
> > +#define CXLDEV_EVENT_STATUS_INFO		BIT(0)
> > +#define CXLDEV_EVENT_STATUS_WARN		BIT(1)
> > +#define CXLDEV_EVENT_STATUS_FAIL		BIT(2)
> > +#define CXLDEV_EVENT_STATUS_FATAL		BIT(3)
> > +
> > +#define CXLDEV_EVENT_STATUS_ALL (CXLDEV_EVENT_STATUS_INFO |	\
> > +				 CXLDEV_EVENT_STATUS_WARN |	\
> > +				 CXLDEV_EVENT_STATUS_FAIL |	\
> > +				 CXLDEV_EVENT_STATUS_FATAL)
> > +
> >  /* CXL 2.0 8.2.8.4 Mailbox Registers */
> >  #define CXLDEV_MBOX_CAPS_OFFSET 0x00
> >  #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index ab138004f644..dd9aa3dd738e 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -4,6 +4,7 @@
> >  #define __CXL_MEM_H__
> >  #include <uapi/linux/cxl_mem.h>
> >  #include <linux/cdev.h>
> > +#include <linux/uuid.h>
> >  #include "cxl.h"
> >  
> >  /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
> > @@ -193,6 +194,17 @@ struct cxl_endpoint_dvsec_info {
> >  	struct range dvsec_range[2];
> >  };
> >  
> > +/**
> > + * struct cxl_event_state - Event log driver state
> > + *
> > + * @event_buf: Buffer to receive event data
> > + * @event_log_lock: Serialize event_buf and log use
> > + */
> > +struct cxl_event_state {
> > +	struct cxl_get_event_payload *buf;
> > +	struct mutex log_lock;
> > +};
> > +
> >  /**
> >   * struct cxl_dev_state - The driver device state
> >   *
> > @@ -266,12 +278,16 @@ struct cxl_dev_state {
> >  
> >  	struct xarray doe_mbs;
> >  
> > +	struct cxl_event_state event;
> > +
> >  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> >  };
> >  
> >  enum cxl_opcode {
> >  	CXL_MBOX_OP_INVALID		= 0x0000,
> >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> > +	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> > +	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> > @@ -347,6 +363,73 @@ struct cxl_mbox_identify {
> >  	u8 qos_telemetry_caps;
> >  } __packed;
> >  
> > +/*
> > + * Common Event Record Format
> > + * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
> > + */
> > +struct cxl_event_record_hdr {
> > +	uuid_t id;
> > +	u8 length;
> > +	u8 flags[3];
> > +	__le16 handle;
> > +	__le16 related_handle;
> > +	__le64 timestamp;
> > +	u8 maint_op_class;
> > +	u8 reserved[15];
> > +} __packed;
> > +
> > +#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
> > +struct cxl_event_record_raw {
> > +	struct cxl_event_record_hdr hdr;
> > +	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
> > +} __packed;
> > +
> > +/*
> > + * Get Event Records output payload
> > + * CXL rev 3.0 section 8.2.9.2.2; Table 8-50
> > + */
> > +#define CXL_GET_EVENT_FLAG_OVERFLOW		BIT(0)
> > +#define CXL_GET_EVENT_FLAG_MORE_RECORDS		BIT(1)
> > +struct cxl_get_event_payload {
> > +	u8 flags;
> > +	u8 reserved1;
> > +	__le16 overflow_err_count;
> > +	__le64 first_overflow_timestamp;
> > +	__le64 last_overflow_timestamp;
> > +	__le16 record_count;
> > +	u8 reserved2[10];
> > +	struct cxl_event_record_raw records[];
> > +} __packed;
> > +
> > +/*
> > + * CXL rev 3.0 section 8.2.9.2.2; Table 8-49
> > + */
> > +enum cxl_event_log_type {
> > +	CXL_EVENT_TYPE_INFO = 0x00,
> > +	CXL_EVENT_TYPE_WARN,
> > +	CXL_EVENT_TYPE_FAIL,
> > +	CXL_EVENT_TYPE_FATAL,
> > +	CXL_EVENT_TYPE_MAX
> > +};
> > +
> > +/*
> > + * Clear Event Records input payload
> > + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> > + */
> > +#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
> > +struct cxl_mbox_clear_event_payload {
> > +	u8 event_log;		/* enum cxl_event_log_type */
> > +	u8 clear_flags;
> > +	u8 nr_recs;
> > +	u8 reserved[3];
> > +	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];
> > +} __packed;
> > +#define CXL_CLEAR_EVENT_LIMIT_HANDLES(payload_size)			\
> > +	(((payload_size) -						\
> > +		(sizeof(struct cxl_mbox_clear_event_payload) -		\
> > +		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
> > +		sizeof(__le16))
> > +
> >  struct cxl_mbox_get_partition_info {
> >  	__le64 active_volatile_cap;
> >  	__le64 active_persistent_cap;
> > @@ -441,6 +524,7 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
> >  struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
> >  void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> >  void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> > +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status);
> >  #ifdef CONFIG_CXL_SUSPEND
> >  void cxl_mem_active_inc(void);
> >  void cxl_mem_active_dec(void);
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 3a66aadb4df0..86c84611a168 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -417,8 +417,44 @@ static void disable_aer(void *pdev)
> >  	pci_disable_pcie_error_reporting(pdev);
> >  }
> >  
> > +static void cxl_mem_free_event_buffer(void *buf)
> > +{
> > +	kvfree(buf);
> > +}
> > +
> > +/*
> > + * There is a single buffer for reading event logs from the mailbox.  All logs
> > + * share this buffer protected by the cxlds->event_log_lock.
> > + */
> > +static void cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> > +{
> > +	struct cxl_get_event_payload *buf;
> > +
> > +	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
> > +		cxlds->payload_size);
> > +
> > +	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> > +	if (WARN_ON_ONCE(!buf))
> 
> No, why is event init so special that it behaves differently than all
> the other init-time allocations this driver does?

Previous review agreed that a warn on once would be printed if this universal
buffer was not allocated.

> 
> > +		return;
> 
> return -ENOMEM;
> 
> > +
> > +	if (WARN_ON_ONCE(devm_add_action_or_reset(cxlds->dev,
> > +			 cxl_mem_free_event_buffer, buf)))
> > +		return;
> 
> ditto.

I'll change both of these to a dev_err() and bail during init.

> 
> > +
> > +	cxlds->event.buf = buf;
> > +}
> > +
> > +static void cxl_clear_event_logs(struct cxl_dev_state *cxlds)
> > +{
> > +	/* Force read and clear of all logs */
> > +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > +	/* Ensure prior partial reads are handled, by starting over again */
> 
> What partial reads? cxl_mem_get_event_records() reads every log until
> each returns an empty result. Any remaining events after this returns
> are events that fired during the retrieval.

Jonathan was concerned that something could read part of the log and, because
of the statefulness of the log processing, this reading of the log could start
at the beginning.  Perhaps from a previous driver unload while reading?

I guess I was also thinking the BIOS could leave things this way?  But I think
we should not be here if the BIOS was ever involved right?

> 
> So I do not think cxl_clear_event_logs() needs to exist, just call
> cxl_mem_get_event_records(CXLDEV_EVENT_STATUS_ALL) once and that's it.

That was my inclination but Jonathan's comments got me thinking I was wrong.

I'll change it back.
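
For reference, a minimal userspace sketch of the point above: if each log is
read until a Get returns no records, a single pass leaves nothing behind to
re-read.  The names mock_log, mock_get_and_clear(), mock_drain_log() and
MOCK_MAX_PER_GET are hypothetical stand-ins, not the driver's API.

```c
/* Hypothetical stand-in for one device event log: a count of pending records */
struct mock_log {
	unsigned int pending;
};

/* Assumed number of records one Get Event Records call can return */
#define MOCK_MAX_PER_GET 3u

/* Model one Get + Clear cycle; returns how many records were consumed */
static unsigned int mock_get_and_clear(struct mock_log *log)
{
	unsigned int n = log->pending < MOCK_MAX_PER_GET ?
			 log->pending : MOCK_MAX_PER_GET;

	log->pending -= n;
	return n;
}

/* Keep reading until a Get returns an empty result; returns total consumed */
static unsigned int mock_drain_log(struct mock_log *log)
{
	unsigned int total = 0;
	unsigned int n;

	do {
		n = mock_get_and_clear(log);
		total += n;
	} while (n);

	return total;
}
```

In this model a second mock_drain_log() call only finds records that were
added after the first one returned.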

> 
> 
> > +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > +}
> > +
> >  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  {
> > +	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> >  	struct cxl_register_map map;
> >  	struct cxl_memdev *cxlmd;
> >  	struct cxl_dev_state *cxlds;
> > @@ -494,6 +530,12 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (IS_ERR(cxlmd))
> >  		return PTR_ERR(cxlmd);
> >  
> > +	if (host_bridge->native_cxl_error) {
> 
> Probably deserves a small comment about why this flag matters for event
> init. Something like: "When BIOS maintains CXL error reporting control,
> it will also reap event records. Only one agent can interface with the
> event mechanism."

Sure. Done.

> 
> > +		cxl_mem_alloc_event_buf(cxlds);
> > +		if (cxlds->event.buf)
> 
> No need for this conditional here since this whole block is skipped if
> !native_cxl_error and cxl_mem_alloc_event_buf() will fail driver load if
> it fails to init.

Done.

> 
> > +			cxl_clear_event_logs(cxlds);
> > +	}
> > +
> >  	if (cxlds->regs.ras) {
> >  		pci_enable_pcie_error_reporting(pdev);
> >  		rc = devm_add_action_or_reset(&pdev->dev, disable_aer, pdev);
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 1f81807492ef..7fe3752a204e 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -580,6 +580,7 @@ struct pci_host_bridge {
> >  	unsigned int	preserve_config:1;	/* Preserve FW resource setup */
> >  	unsigned int	size_windows:1;		/* Enable root bus sizing */
> >  	unsigned int	msi_domain:1;		/* Bridge wants MSI domain */
> > +	unsigned int	native_cxl_error:1;	/* OS CXL Error reporting */
> 
> I would group this with the other native_ flags and copy the comment
> style "OS may use CXL RAS + Events", or somesuch.

Done.

Ira


* RE: [PATCH V3 2/8] cxl/mem: Wire up event interrupts
  2022-12-08  5:21 ` [PATCH V3 2/8] cxl/mem: Wire up event interrupts ira.weiny
@ 2022-12-09 21:49   ` Dan Williams
  2022-12-10  1:44     ` Ira Weiny
  0 siblings, 1 reply; 25+ messages in thread
From: Dan Williams @ 2022-12-09 21:49 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Davidlohr Bueso, Bjorn Helgaas, Jonathan Cameron, Ira Weiny,
	Bjorn Helgaas, Alison Schofield, Vishal Verma, Dave Jiang,
	linux-kernel, linux-pci, linux-acpi, linux-cxl

ira.weiny@ wrote:
> From: Davidlohr Bueso <dave@stgolabs.net>
> 
> Currently the only CXL features targeted for irq support require their
> message numbers to be within the first 16 entries.  The device may
> however support less than 16 entries depending on the support it
> provides.
> 
> Attempt to allocate these 16 irq vectors.  If the device supports less
> then the PCI infrastructure will allocate that number.  Upon successful
> allocation, users can plug in their respective isr at any point
> thereafter.
> 
> CXL device events are signaled via interrupts.  Each event log may have
> a different interrupt message number.  These message numbers are
> reported in the Get Event Interrupt Policy mailbox command.
> 
> Add interrupt support for event logs.  Interrupts are allocated as
> shared interrupts.  Therefore, all or some event logs can share the same
> message number.
> 
> In addition all logs are queried on any interrupt in order of the most
> to least severe based on the status register.
> 
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> 
> ---
> Changes from V2:
> 	General clean up
> 	Use cxl_log_id to ensure each irq is unique even if the message numbers are not
> 	Jonathan/Dan
> 		Only set up irq vector when OSC indicates OS control
> 	Dan
> 		Loop reading while status indicates there are more
> 			events.
> 		Use new cxl_internal_send_cmd()
> 		Squash MSI/MSIx base patch from Davidlohr
> 		Remove uapi defines altogether
> 		Remove use of msi_enabled
> 	Remove the use of cxl_event_log_type_str()
> 	Pick up tag
> 
> Changes from V1:
> 	Remove unneeded evt_int_policy from struct cxl_dev_state
> 	defer Dynamic Capacity support
> 	Dave Jiang
> 		s/irq/rc
> 		use IRQ_NONE to signal the irq was not for us.
> 	Jonathan
> 		use msi_enabled rather than nr_irq_vec
> 		On failure explicitly set CXL_INT_NONE
> 		Add comment for Get Event Interrupt Policy
> 		use devm_request_threaded_irq()
> 		Use individual handler/thread functions for each of the
> 		logs rather than struct cxl_event_irq_id.
> 
> Changes from RFC v2
> 	Adjust to new irq 16 vector allocation
> 	Jonathan
> 		Remove CXL_INT_RES
> 	Use irq threads to ensure mailbox commands are executed outside irq context
> 	Adjust for optional Dynamic Capacity log
> ---
>  drivers/cxl/core/mbox.c | 42 +++++++++++++++++++
>  drivers/cxl/cxlmem.h    | 28 +++++++++++++
>  drivers/cxl/cxlpci.h    |  6 +++
>  drivers/cxl/pci.c       | 90 ++++++++++++++++++++++++++++++++++++++++-
>  4 files changed, 165 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 815da3aac081..2b25691a9b09 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -854,6 +854,48 @@ void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
>  
> +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> +			     struct cxl_event_interrupt_policy *policy)
> +{
> +	struct cxl_mbox_cmd mbox_cmd;
> +	int rc;
> +
> +	policy->info_settings = CXL_INT_MSI_MSIX;
> +	policy->warn_settings = CXL_INT_MSI_MSIX;
> +	policy->failure_settings = CXL_INT_MSI_MSIX;
> +	policy->fatal_settings = CXL_INT_MSI_MSIX;

For Robustness Principle "be conservative in what is sent" purposes I
would do the Get Events first to make sure that nothing is steered to
the Firmware VDM, and warn the user that their BIOS gave the OS CXL
Error Control, but did not shut down event interrupts.

I.e. if the event interrupts are still steered to BIOS then BIOS may
think it still has control of the event logs and trouble ensues.
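
A minimal sketch of that pre-check, written as userspace C so it compiles
standalone.  The enum, mask, and policy struct mirror the definitions in this
patch; policy_steers_to_fw() is a hypothetical helper name, and what to do on
a hit (warn, skip event setup) is left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors enum cxl_event_int_mode from the patch */
enum cxl_event_int_mode {
	CXL_INT_NONE		= 0x00,
	CXL_INT_MSI_MSIX	= 0x01,
	CXL_INT_FW		= 0x02
};
#define CXL_EVENT_INT_MODE_MASK 0x3

/* Mirrors struct cxl_event_interrupt_policy, with uint8_t in place of u8 */
struct cxl_event_interrupt_policy {
	uint8_t info_settings;
	uint8_t warn_settings;
	uint8_t failure_settings;
	uint8_t fatal_settings;
};

/* Hypothetical helper: true if any log is still steered to firmware */
static bool policy_steers_to_fw(const struct cxl_event_interrupt_policy *p)
{
	const uint8_t settings[] = {
		p->info_settings, p->warn_settings,
		p->failure_settings, p->fatal_settings,
	};
	size_t i;

	for (i = 0; i < sizeof(settings) / sizeof(settings[0]); i++)
		if ((settings[i] & CXL_EVENT_INT_MODE_MASK) == CXL_INT_FW)
			return true;

	return false;
}
```

The idea would be to run this on the result of Get Event Interrupt Policy
before issuing the Set.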

> +
> +	mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
> +		.payload_in = policy,
> +		.size_in = sizeof(*policy),
> +	};
> +
> +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	if (rc < 0) {
> +		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> +			rc);
> +		return rc;
> +	}
> +
> +	mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
> +		.payload_out = policy,
> +		.size_out = sizeof(*policy),
> +	};
> +
> +	/* Retrieve interrupt message numbers */
> +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	if (rc < 0) {
> +		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> +			rc);
> +		return rc;
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_event_config_msgnums, CXL);

A question, why is this function in the core and not in cxl_pci? For
cxl_test mocking purposes? Otherwise seems ok to keep this in the same
file as its only caller.

> +
>  /**
>   * cxl_mem_get_partition_info - Get partition info
>   * @cxlds: The device data for the operation
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index dd9aa3dd738e..350cb460e7fc 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -194,6 +194,30 @@ struct cxl_endpoint_dvsec_info {
>  	struct range dvsec_range[2];
>  };
>  
> +/**
> + * Event Interrupt Policy
> + *
> + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> + */
> +enum cxl_event_int_mode {
> +	CXL_INT_NONE		= 0x00,
> +	CXL_INT_MSI_MSIX	= 0x01,
> +	CXL_INT_FW		= 0x02
> +};
> +#define CXL_EVENT_INT_MODE_MASK 0x3
> +#define CXL_EVENT_INT_MSGNUM(setting) (((setting) & 0xf0) >> 4)
> +struct cxl_event_interrupt_policy {
> +	u8 info_settings;
> +	u8 warn_settings;
> +	u8 failure_settings;
> +	u8 fatal_settings;
> +} __packed;
> +
> +static inline bool cxl_evt_int_is_msi(u8 setting)
> +{
> +	return CXL_INT_MSI_MSIX == (setting & CXL_EVENT_INT_MODE_MASK);
> +}
> +
>  /**
>   * struct cxl_event_state - Event log driver state
>   *
> @@ -288,6 +312,8 @@ enum cxl_opcode {
>  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
>  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
>  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
>  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
>  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
>  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> @@ -525,6 +551,8 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
>  void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
>  void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
>  void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status);
> +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> +			     struct cxl_event_interrupt_policy *policy);
>  #ifdef CONFIG_CXL_SUSPEND
>  void cxl_mem_active_inc(void);
>  void cxl_mem_active_dec(void);
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 77dbdb980b12..4aaadf17a985 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -53,6 +53,12 @@
>  #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
>  #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
>  
> +/*
> + * NOTE: Currently all the functions which are enabled for CXL require their
> + * vectors to be in the first 16.  Use this as the max.
> + */
> +#define CXL_PCI_REQUIRED_VECTORS 16
> +
>  /* Register Block Identifier (RBI) */
>  enum cxl_regloc_type {
>  	CXL_REGLOC_RBI_EMPTY = 0,
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 86c84611a168..c84922a287ec 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -452,6 +452,90 @@ static void cxl_clear_event_logs(struct cxl_dev_state *cxlds)
>  	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
>  }
>  
> +static void cxl_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> +{
> +	struct device *dev = cxlds->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	int nvecs;
> +
> +	/*
> +	 * pci_alloc_irq_vectors() handles calling pci_free_irq_vectors()
> +	 * automatically despite not being called pcim_*.  See
> +	 * pci_setup_msi_context().
> +	 */

I think a more important comment is why the flags are limited to MSIX
and MSI, that's a non-obvious CXL spec constraint.

> +	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_REQUIRED_VECTORS,

Since I have some other fixups below I'll go ahead and quibble with the
name. The 'requirement' is 1 vector, so
s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_VECTORS/ or something like
that. As it stands today there are diminishing returns to ask for more
than that amount.

In the future, if the code knows better that a specific device could
benefit from more than the default, then it can arrange to override
this. Absent that, today there is no reason to try to ask for more.

> +				      PCI_IRQ_MSIX | PCI_IRQ_MSI);
> +	if (nvecs < 1)
> +		dev_dbg(dev, "Failed to alloc irq vectors: %d\n", nvecs);

Just fail the driver load if this happens. There is something wrong if a
PCI driver cannot even allocate 1 vector.

> +}
> +
> +struct cxl_dev_id {
> +	struct cxl_dev_state *cxlds;
> +};
> +
> +static irqreturn_t cxl_event_thread(int irq, void *id)
> +{
> +	struct cxl_dev_id *dev_id = id;
> +	struct cxl_dev_state *cxlds = dev_id->cxlds;
> +	u32 status;
> +
> +	/*
> +	 * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
> +	 * ignore the reserved upper 32 bits
> +	 */
> +	status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> +	while (status) {
> +		cxl_mem_get_event_records(cxlds, status);
> +		cond_resched();
> +		status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> +	}
> +	return IRQ_HANDLED;
> +}
> +
> +static int cxl_req_event_irq(struct cxl_dev_state *cxlds, u8 setting)
> +{
> +	struct device *dev = cxlds->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct cxl_dev_id *dev_id;
> +	int irq;
> +
> +	if (!cxl_evt_int_is_msi(setting))
> +		return -ENXIO;
> +
> +	/* dev_id must be globally unique and must contain the cxlds */
> +	dev_id = devm_kmalloc(dev, sizeof(*dev_id), GFP_KERNEL);

Yes, the id is simple and fully initialized below, but this is not a
fast path and the rest of the driver uses devm_kzalloc() even if it
fully inits the result. So it's a consistency thing and maybe a "save the
future person who adds another field without initializing it some
hassle" thing.

> +	if (!dev_id)
> +		return -ENOMEM;
> +	dev_id->cxlds = cxlds;
> +
> +	irq =  pci_irq_vector(pdev, CXL_EVENT_INT_MSGNUM(setting));
> +	if (irq < 0)
> +		return irq;
> +
> +	return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> +					 IRQF_SHARED, NULL, dev_id);
> +}
> +
> +static void cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_event_interrupt_policy policy;
> +
> +	if (cxl_event_config_msgnums(cxlds, &policy))
> +		return;

If not all interrupts can be steered to the OS probably best to abort
the entire event setup.

Otherwise if you can steer all to the OS, if any of the below fails that
should be a driver load failure. I certainly do not want to debug
someone's system that randomly failed alternating log type interrupts.
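
One possible shape for this all-or-nothing behavior, sketched in userspace C.
event_irqsetup_strict(), req_irq_fn, NR_EVENT_LOGS and the mock_req_*()
functions are hypothetical; in the driver the callback role would be played by
cxl_req_event_irq() and the return value would propagate out of probe.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define CXL_INT_MSI_MSIX	0x01
#define CXL_EVENT_INT_MODE_MASK	0x3
#define NR_EVENT_LOGS		4	/* info, warn, failure, fatal */

/* Hypothetical callback standing in for cxl_req_event_irq() */
typedef int (*req_irq_fn)(uint8_t setting);

static int event_irqsetup_strict(const uint8_t settings[NR_EVENT_LOGS],
				 req_irq_fn req_irq)
{
	size_t i;
	int rc;

	/* Pass 1: refuse event setup unless every log is steered to the OS */
	for (i = 0; i < NR_EVENT_LOGS; i++)
		if ((settings[i] & CXL_EVENT_INT_MODE_MASK) != CXL_INT_MSI_MSIX)
			return -ENXIO;

	/* Pass 2: the first failed request is reported to the caller */
	for (i = 0; i < NR_EVENT_LOGS; i++) {
		rc = req_irq(settings[i]);
		if (rc)
			return rc;
	}

	return 0;
}

/* Mock requesters for exercising the sketch */
static int mock_req_ok(uint8_t setting)
{
	(void)setting;
	return 0;
}

static int mock_req_busy(uint8_t setting)
{
	(void)setting;
	return -EBUSY;
}
```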

> +
> +	if (cxl_req_event_irq(cxlds, policy.info_settings))
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
> +
> +	if (cxl_req_event_irq(cxlds, policy.warn_settings))
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
> +
> +	if (cxl_req_event_irq(cxlds, policy.failure_settings))
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
> +
> +	if (cxl_req_event_irq(cxlds, policy.fatal_settings))
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
> +}
> +
>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> @@ -526,14 +610,18 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> +	cxl_alloc_irq_vectors(cxlds);

Just pass the pdev directly here, no other part of cxlds is needed.

> +
>  	cxlmd = devm_cxl_add_memdev(cxlds);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
>  	if (host_bridge->native_cxl_error) {
>  		cxl_mem_alloc_event_buf(cxlds);
> -		if (cxlds->event.buf)
> +		if (cxlds->event.buf) {
> +			cxl_event_irqsetup(cxlds);
>  			cxl_clear_event_logs(cxlds);
> +		}
>  	}
>  
>  	if (cxlds->regs.ras) {
> -- 
> 2.37.2
> 




* RE: [PATCH V3 3/8] cxl/mem: Trace General Media Event Record
  2022-12-08  5:21 ` [PATCH V3 3/8] cxl/mem: Trace General Media Event Record ira.weiny
@ 2022-12-09 22:04   ` Dan Williams
  2022-12-11 16:08     ` Ira Weiny
  0 siblings, 1 reply; 25+ messages in thread
From: Dan Williams @ 2022-12-09 22:04 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

ira.weiny@ wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record.
> 
> Determine if the event read is a general media record and if so trace
> the record as a General Media Event Record.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes from V2:
> 	Dan
> 		Remove trace_*_enabled() calls
> 		Pass struct device to trace points
> 
> Changes from V1:
> 	Jonathan
> 		fix spec references for CXL rev 3.0
> 		Make flags all caps
> 
> Changes from RFC v2:
> 	Output DPA flags as a single field
> 	Ensure names of fields match what TP_print outputs
> 	Steven
> 		prefix TRACE_EVENT with 'cxl_'
> 	Jonathan
> 		Remove Reserved field
> 
> Changes from RFC:
> 	Add reserved byte array
> 	Use common CXL event header record macros
> 	Jonathan
> 		Use unaligned_le{24,16} for unaligned fields
> 		Don't use the inverse of phy addr mask
> 	Dave Jiang
> 		s/cxl_gen_media_event/general_media
> 		s/cxl_evt_gen_media/cxl_event_gen_media
> ---
>  drivers/cxl/core/mbox.c  |  30 +++++++++-
>  drivers/cxl/core/trace.h | 124 +++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxlmem.h     |  19 ++++++
>  3 files changed, 171 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 2b25691a9b09..0d8c66f1cdc5 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -718,6 +718,32 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>  
> +/*
> + * General Media Event Record
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + */
> +static const uuid_t gen_media_event_uuid =
> +	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
> +		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
> +
> +static void cxl_trace_event_record(const struct device *dev,
> +				   enum cxl_event_log_type type,
> +				   struct cxl_event_record_raw *record)
> +{
> +	uuid_t *id = &record->hdr.id;
> +
> +	if (uuid_equal(id, &gen_media_event_uuid)) {
> +		struct cxl_event_gen_media *rec =
> +				(struct cxl_event_gen_media *)record;
> +
> +		trace_cxl_general_media(dev, type, rec);
> +		return;
> +	}
> +
> +	/* For unknown record types print just the header */
> +	trace_cxl_generic_event(dev, type, record);
> +}
> +
>  static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
>  				  enum cxl_event_log_type log,
>  				  struct cxl_get_event_payload *get_pl)
> @@ -810,8 +836,8 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
>  			int i;
>  
>  			for (i = 0; i < nr_rec; i++)
> -				trace_cxl_generic_event(cxlds->dev, type,
> -							&payload->records[i]);
> +				cxl_trace_event_record(cxlds->dev, type,
> +						       &payload->records[i]);
>  
>  			rc = cxl_clear_event_record(cxlds, type, payload);
>  			if (rc) {
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index 24eef6909f13..82462942590b 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -223,6 +223,130 @@ TRACE_EVENT(cxl_generic_event,
>  		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
>  );
>  
> +/*
> + * Physical Address field masks
> + *
> + * General Media Event Record
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + *
> + * DRAM Event Record
> + * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> + */
> +#define CXL_DPA_FLAGS_MASK			0x3F
> +#define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
> +
> +#define CXL_DPA_VOLATILE			BIT(0)
> +#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
> +#define show_dpa_flags(flags)	__print_flags(flags, "|",		   \
> +	{ CXL_DPA_VOLATILE,			"VOLATILE"		}, \
> +	{ CXL_DPA_NOT_REPAIRABLE,		"NOT_REPAIRABLE"	}  \
> +)
> +
> +/*
> + * General Media Event Record - GMER
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + */
> +#define CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT		BIT(0)
> +#define CXL_GMER_EVT_DESC_THRESHOLD_EVENT		BIT(1)
> +#define CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW		BIT(2)
> +#define show_event_desc_flags(flags)	__print_flags(flags, "|",		   \
> +	{ CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,		"UNCORRECTABLE_EVENT"	}, \
> +	{ CXL_GMER_EVT_DESC_THRESHOLD_EVENT,		"THRESHOLD_EVENT"	}, \
> +	{ CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW,	"POISON_LIST_OVERFLOW"	}  \
> +)
> +
> +#define CXL_GMER_MEM_EVT_TYPE_ECC_ERROR			0x00
> +#define CXL_GMER_MEM_EVT_TYPE_INV_ADDR			0x01
> +#define CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR		0x02
> +#define show_mem_event_type(type)	__print_symbolic(type,			\
> +	{ CXL_GMER_MEM_EVT_TYPE_ECC_ERROR,		"ECC Error" },		\
> +	{ CXL_GMER_MEM_EVT_TYPE_INV_ADDR,		"Invalid Address" },	\
> +	{ CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,	"Data Path Error" }	\
> +)
> +
> +#define CXL_GMER_TRANS_UNKNOWN				0x00
> +#define CXL_GMER_TRANS_HOST_READ			0x01
> +#define CXL_GMER_TRANS_HOST_WRITE			0x02
> +#define CXL_GMER_TRANS_HOST_SCAN_MEDIA			0x03
> +#define CXL_GMER_TRANS_HOST_INJECT_POISON		0x04
> +#define CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB		0x05
> +#define CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT	0x06
> +#define show_trans_type(type)	__print_symbolic(type,					\
> +	{ CXL_GMER_TRANS_UNKNOWN,			"Unknown" },			\
> +	{ CXL_GMER_TRANS_HOST_READ,			"Host Read" },			\
> +	{ CXL_GMER_TRANS_HOST_WRITE,			"Host Write" },			\
> +	{ CXL_GMER_TRANS_HOST_SCAN_MEDIA,		"Host Scan Media" },		\
> +	{ CXL_GMER_TRANS_HOST_INJECT_POISON,		"Host Inject Poison" },		\
> +	{ CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,		"Internal Media Scrub" },	\
> +	{ CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT,	"Internal Media Management" }	\
> +)
> +
> +#define CXL_GMER_VALID_CHANNEL				BIT(0)
> +#define CXL_GMER_VALID_RANK				BIT(1)
> +#define CXL_GMER_VALID_DEVICE				BIT(2)
> +#define CXL_GMER_VALID_COMPONENT			BIT(3)
> +#define show_valid_flags(flags)	__print_flags(flags, "|",		   \
> +	{ CXL_GMER_VALID_CHANNEL,			"CHANNEL"	}, \
> +	{ CXL_GMER_VALID_RANK,				"RANK"		}, \
> +	{ CXL_GMER_VALID_DEVICE,			"DEVICE"	}, \
> +	{ CXL_GMER_VALID_COMPONENT,			"COMPONENT"	}  \
> +)
> +
> +TRACE_EVENT(cxl_general_media,
> +
> +	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
> +		 struct cxl_event_gen_media *rec),
> +
> +	TP_ARGS(dev, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +		/* General Media */
> +		__field(u64, dpa)
> +		__field(u8, descriptor)
> +		__field(u8, type)
> +		__field(u8, transaction_type)
> +		__field(u8, channel)
> +		__field(u32, device)
> +		__array(u8, comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE)
> +		__field(u16, validity_flags)
> +		/* Following are out of order to pack trace record */
> +		__field(u8, rank)
> +		__field(u8, dpa_flags)
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
> +
> +		/* General Media */
> +		__entry->dpa = le64_to_cpu(rec->phys_addr);
> +		__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
> +		/* Mask after flags have been parsed */
> +		__entry->dpa &= CXL_DPA_MASK;
> +		__entry->descriptor = rec->descriptor;
> +		__entry->type = rec->type;
> +		__entry->transaction_type = rec->transaction_type;
> +		__entry->channel = rec->channel;
> +		__entry->rank = rec->rank;
> +		__entry->device = get_unaligned_le24(rec->device);
> +		memcpy(__entry->comp_id, &rec->component_id,
> +			CXL_EVENT_GEN_MED_COMP_ID_SIZE);
> +		__entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
> +	),
> +
> +	CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' " \
> +		"descriptor='%s' type='%s' transaction_type='%s' channel=%u rank=%u " \
> +		"device=%x comp_id=%s validity_flags='%s'",
> +		__entry->dpa, show_dpa_flags(__entry->dpa_flags),
> +		show_event_desc_flags(__entry->descriptor),
> +		show_mem_event_type(__entry->type),
> +		show_trans_type(__entry->transaction_type),
> +		__entry->channel, __entry->rank, __entry->device,
> +		__print_hex(__entry->comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE),
> +		show_valid_flags(__entry->validity_flags)
> +	)
> +);
> +
>  #endif /* _CXL_EVENTS_H */
>  
>  #define TRACE_INCLUDE_FILE trace
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 350cb460e7fc..a5f5d4a380af 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -456,6 +456,25 @@ struct cxl_mbox_clear_event_payload {
>  		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
>  		sizeof(__le16))
>  
> +/*
> + * General Media Event Record
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + */
> +#define CXL_EVENT_GEN_MED_COMP_ID_SIZE	0x10
> +struct cxl_event_gen_media {
> +	struct cxl_event_record_hdr hdr;
> +	__le64 phys_addr;
> +	u8 descriptor;
> +	u8 type;
> +	u8 transaction_type;
> +	u8 validity_flags[2];
> +	u8 channel;
> +	u8 rank;
> +	u8 device[3];
> +	u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
> +	u8 reserved[0x2e];

If you reflow this one again to make capitalization of symbols
consistent in the trace prints perhaps change that to decimal, but
that's not a blocker.
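
For reference, 0x2e is 46 decimal, and with that padding the record comes out
the same size as struct cxl_event_record_raw.  A compile-time sketch of that
arithmetic in userspace C, assuming a GCC/Clang-style packed attribute, with
uint8_t id[16] standing in for the kernel's uuid_t and plain integer types
standing in for __le16/__le64:

```c
#include <stdint.h>

#define CXL_EVENT_RECORD_DATA_LENGTH	0x50
#define CXL_EVENT_GEN_MED_COMP_ID_SIZE	0x10

struct cxl_event_record_hdr {
	uint8_t id[16];			/* uuid_t in the kernel */
	uint8_t length;
	uint8_t flags[3];
	uint16_t handle;		/* __le16 in the kernel */
	uint16_t related_handle;	/* __le16 in the kernel */
	uint64_t timestamp;		/* __le64 in the kernel */
	uint8_t maint_op_class;
	uint8_t reserved[15];
} __attribute__((packed));

struct cxl_event_record_raw {
	struct cxl_event_record_hdr hdr;
	uint8_t data[CXL_EVENT_RECORD_DATA_LENGTH];
} __attribute__((packed));

struct cxl_event_gen_media {
	struct cxl_event_record_hdr hdr;
	uint64_t phys_addr;		/* __le64 in the kernel */
	uint8_t descriptor;
	uint8_t type;
	uint8_t transaction_type;
	uint8_t validity_flags[2];
	uint8_t channel;
	uint8_t rank;
	uint8_t device[3];
	uint8_t component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
	uint8_t reserved[46];		/* 0x2e written in decimal */
} __attribute__((packed));

/* 16 + 1 + 3 + 2 + 2 + 8 + 1 + 15 = 48 bytes of header */
_Static_assert(sizeof(struct cxl_event_record_hdr) == 48,
	       "common header size changed");

/* 8 + 1 + 1 + 1 + 2 + 1 + 1 + 3 + 16 + 46 = 80 = 0x50 bytes of body */
_Static_assert(sizeof(struct cxl_event_gen_media) ==
	       sizeof(struct cxl_event_record_raw),
	       "typed record must overlay the raw record exactly");
```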

Reviewed-by: Dan Williams <dan.j.williams@intel.com>


* RE: [PATCH V3 4/8] cxl/mem: Trace DRAM Event Record
  2022-12-08  5:21 ` [PATCH V3 4/8] cxl/mem: Trace DRAM " ira.weiny
@ 2022-12-09 22:14   ` Dan Williams
  2022-12-11 16:21     ` Ira Weiny
  0 siblings, 1 reply; 25+ messages in thread
From: Dan Williams @ 2022-12-09 22:14 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

ira.weiny@ wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.
> 
> Determine if the event read is a DRAM event record and if so trace the
> record.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes from v2:
> 	Dan
> 		Move tracing to cxl core
> 		Remove trace_*_enabled() calls
> 		Pass struct device to trace points
> 
> Changes from RFC v2:
> 	Output DPA flags as a separate field.
> 	Ensure field names match TP_print output
> 	Steven
> 		prefix TRACE_EVENT with 'cxl_'
> 	Jonathan
> 		Formatting fix
> 		Remove reserved field
> 
> Changes from RFC:
> 	Add reserved byte data
> 	Use new CXL header macros
> 	Jonathan
> 		Use get_unaligned_le{24,16}() for unaligned fields
> 		Use 'else if'
> 	Dave Jiang
> 		s/cxl_dram_event/dram
> 		s/cxl_evt_dram_rec/cxl_event_dram
> 	Adjust for new phys addr mask
> ---
>  drivers/cxl/core/mbox.c  | 13 ++++++
>  drivers/cxl/core/trace.h | 92 ++++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxlmem.h     | 23 ++++++++++
>  3 files changed, 128 insertions(+)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 0d8c66f1cdc5..2fa4645f0ed9 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -726,6 +726,14 @@ static const uuid_t gen_media_event_uuid =
>  	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
>  		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
>  
> +/*
> + * DRAM Event Record
> + * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> + */
> +static const uuid_t dram_event_uuid =
> +	UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
> +		  0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
> +
>  static void cxl_trace_event_record(const struct device *dev,
>  				   enum cxl_event_log_type type,
>  				   struct cxl_event_record_raw *record)
> @@ -738,6 +746,11 @@ static void cxl_trace_event_record(const struct device *dev,
>  
>  		trace_cxl_general_media(dev, type, rec);
>  		return;
> +	} else if (uuid_equal(id, &dram_event_uuid)) {
> +		struct cxl_event_dram *rec = (struct cxl_event_dram *)record;
> +
> +		trace_cxl_dram(dev, type, rec);
> +		return;

I think I mentioned this before, but rather than a "return" in every
branch just make the 'unknown' case the final else in this if block.

With that feel free to add:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
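
The suggested shape, with the unknown case as the final else, might look roughly like the following userspace sketch. The `struct uuid` type, the placeholder UUID values, and `classify_record()` are hypothetical stand-ins for illustration only; they are not the kernel's `uuid_t` or the driver's trace calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's uuid_t. */
struct uuid {
	unsigned char b[16];
};

/* Which trace point a record would be routed to (illustrative only). */
enum traced_as {
	TRACED_GEN_MEDIA,
	TRACED_DRAM,
	TRACED_GENERIC,
};

/* Placeholder UUID values, not the ones from the CXL specification. */
static const struct uuid gen_media_uuid = { { 0x01 } };
static const struct uuid dram_uuid = { { 0x02 } };

static int uuid_eq(const struct uuid *a, const struct uuid *b)
{
	return memcmp(a->b, b->b, sizeof(a->b)) == 0;
}

/*
 * One if / else if / else chain: every known record type gets a branch
 * and the unknown case is the final else, so no branch needs a return.
 */
static enum traced_as classify_record(const struct uuid *id)
{
	enum traced_as result;

	assert(id != NULL);

	if (uuid_eq(id, &gen_media_uuid))
		result = TRACED_GEN_MEDIA;
	else if (uuid_eq(id, &dram_uuid))
		result = TRACED_DRAM;
	else
		result = TRACED_GENERIC;

	return result;
}
```

Adding a new record type then means adding one more `else if` branch, with the generic fallback staying in one place.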

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH V3 5/8] cxl/mem: Trace Memory Module Event Record
  2022-12-08  5:21 ` [PATCH V3 5/8] cxl/mem: Trace Memory Module " ira.weiny
@ 2022-12-09 22:18   ` Dan Williams
  0 siblings, 0 replies; 25+ messages in thread
From: Dan Williams @ 2022-12-09 22:18 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

ira.weiny@ wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.
> 
> Determine if the event read is a memory module record and if so trace the
> record.
> 

Modulo carrying forward the review comments from previous patches, feel
free to add:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V3 1/8] cxl/mem: Read, trace, and clear events on driver load
  2022-12-09 21:00     ` Ira Weiny
@ 2022-12-09 22:33       ` Dan Williams
  2022-12-09 23:34         ` Ira Weiny
  0 siblings, 1 reply; 25+ messages in thread
From: Dan Williams @ 2022-12-09 22:33 UTC (permalink / raw)
  To: Ira Weiny, Dan Williams
  Cc: Bjorn Helgaas, Alison Schofield, Vishal Verma, Davidlohr Bueso,
	Jonathan Cameron, Dave Jiang, linux-kernel, linux-pci,
	linux-acpi, linux-cxl

Ira Weiny wrote:
[..]
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index 3a66aadb4df0..86c84611a168 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -417,8 +417,44 @@ static void disable_aer(void *pdev)
> > >  	pci_disable_pcie_error_reporting(pdev);
> > >  }
> > >  
> > > +static void cxl_mem_free_event_buffer(void *buf)
> > > +{
> > > +	kvfree(buf);
> > > +}
> > > +
> > > +/*
> > > + * There is a single buffer for reading event logs from the mailbox.  All logs
> > > + * share this buffer protected by the cxlds->event_log_lock.
> > > + */
> > > +static void cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> > > +{
> > > +	struct cxl_get_event_payload *buf;
> > > +
> > > +	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
> > > +		cxlds->payload_size);
> > > +
> > > +	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> > > +	if (WARN_ON_ONCE(!buf))
> > 
> > No, why is event init so special that it behaves differently than all
> > the other init-time allocations this driver does?
> 
> Previous review agreed that a warn on once would be printed if this universal
> buffer was not allocated.
> 
> > 
> > > +		return;
> > 
> > return -ENOMEM;
> > 
> > > +
> > > +	if (WARN_ON_ONCE(devm_add_action_or_reset(cxlds->dev,
> > > +			 cxl_mem_free_event_buffer, buf)))
> > > +		return;
> > 
> > ditto.
> 
> I'll change both of these with a dev_err() and bail during init.

No real need to dev_err() for a simple memory allocation failure, but
at least it is better than a WARN.
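
A minimal userspace sketch of the pattern being asked for: the allocation helper returns -ENOMEM silently and the caller bails out of init. The `struct dev_state` type, the `alloc_fn` hook, and `probe()` are assumptions made for illustration, not the driver's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for struct cxl_dev_state. */
struct dev_state {
	size_t payload_size;
	void *event_buf;
};

/* Allocator hook so the failure path can be exercised (sketch-only). */
typedef void *(*alloc_fn)(size_t size);

static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/* Return 0 on success or -ENOMEM on failure; no WARN, no message. */
static int alloc_event_buf(struct dev_state *ds, alloc_fn alloc)
{
	void *buf;

	assert(ds != NULL && alloc != NULL);

	buf = alloc(ds->payload_size);
	if (!buf)
		return -ENOMEM;

	ds->event_buf = buf;
	return 0;
}

/* The probe-like caller propagates the error and stops initializing. */
static int probe(struct dev_state *ds, alloc_fn alloc)
{
	int rc = alloc_event_buf(ds, alloc);

	if (rc)
		return rc;

	/* ... remaining init steps would follow here ... */
	return 0;
}
```

Passing `failing_alloc` to `probe()` exercises the error path; passing `malloc` exercises the success path.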

> 
> > 
> > > +
> > > +	cxlds->event.buf = buf;
> > > +}
> > > +
> > > +static void cxl_clear_event_logs(struct cxl_dev_state *cxlds)
> > > +{
> > > +	/* Force read and clear of all logs */
> > > +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > > +	/* Ensure prior partial reads are handled, by starting over again */
> > 
> > What partial reads? cxl_mem_get_event_records() reads every log until
> > each returns an empty result. Any remaining events after this returns
> > are events that fired during the retrieval.
> 
> Jonathan was concerned that something could read part of the log and because of
> the statefulness of the log processing this reading of the log could start in
> the beginning.  Perhaps from a previous driver unload while reading?

The driver will not unload without completing any current executions of
the event retrieval thread otherwise that's an irq shutdown bug.

> I guess I was also thinking the BIOS could leave things this way?  But I think
> we should not be here if the BIOS was ever involved right?

If the OS has CXL Error control and all Event irqs are steered to the OS
then the driver must be allowed to assume that it has exclusive control
over event retrieval and clearing.

> > So I do not think cxl_clear_event_logs() needs to exist, just call
> > cxl_mem_get_event_records(CXLDEV_EVENT_STATUS_ALL) once and that's it.
> 
> That was my inclination but Jonathan's comments got me thinking I was wrong.

Perhaps that was before we realized the recent CXL _OSC entanglement.
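
The argument that a single pass suffices rests on the retrieval loop draining each log until it reports empty. A rough userspace model of that behaviour is sketched below; `struct mock_log`, `get_records()`, `NR_LOGS`, and `BATCH` are hypothetical names chosen for illustration and do not reflect the real mailbox interface.

```c
#include <assert.h>
#include <stddef.h>

#define NR_LOGS	4	/* assumed: info, warn, failure, fatal */
#define BATCH	3	/* assumed maximum records returned per read */

/* Hypothetical device-side log: only tracks how many records remain. */
struct mock_log {
	unsigned int pending;
};

/* One "Get Event Records" call: hand back at most BATCH records. */
static unsigned int get_records(struct mock_log *log)
{
	unsigned int n = log->pending < BATCH ? log->pending : BATCH;

	log->pending -= n;
	return n;
}

/*
 * Read every log until it returns an empty result.  When this returns,
 * anything still pending must have been logged after its log was drained.
 */
static unsigned int drain_all_logs(struct mock_log *logs)
{
	unsigned int total = 0;
	size_t i;

	assert(logs != NULL);

	for (i = 0; i < NR_LOGS; i++) {
		unsigned int n;

		while ((n = get_records(&logs[i])) != 0)
			total += n;
	}

	return total;
}
```

In this model a second call to `drain_all_logs()` immediately after the first finds nothing, which is the reason a separate "clear logs" wrapper adds no value when the driver has exclusive control.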

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH V3 6/8] cxl/test: Add generic mock events
  2022-12-08  5:21 ` [PATCH V3 6/8] cxl/test: Add generic mock events ira.weiny
@ 2022-12-09 22:48   ` Dan Williams
  2022-12-11 17:26     ` Ira Weiny
  0 siblings, 1 reply; 25+ messages in thread
From: Dan Williams @ 2022-12-09 22:48 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

ira.weiny@ wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Facilitate testing basic Get/Clear Event functionality by creating
> multiple logs and generic events with made-up UUIDs.
> 
> Data is completely made up with data patterns which should be easy to
> spot in trace output.
> 
> A single sysfs entry resets the event data and triggers collecting the
> events for testing.
> 
> Test traces are easy to obtain with a small script such as this:
> 
> 	#!/bin/bash -x
> 
> 	devices=`find /sys/devices/platform -name 'cxl_mem*'`
> 
> 	# Turn on tracing
> 	echo "" > /sys/kernel/tracing/trace
> 	echo 1 > /sys/kernel/tracing/events/cxl/enable
> 	echo 1 > /sys/kernel/tracing/tracing_on
> 
> 	# Generate fake interrupt
> 	for device in $devices; do
> 	        echo 1 > $device/event_trigger
> 	done
> 
> 	# Turn off tracing and report events
> 	echo 0 > /sys/kernel/tracing/tracing_on
> 	cat /sys/kernel/tracing/trace
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes from v2:
> 	Adjust for tracing being part of cxl core
> 	Dan/Dave J.
> 		Adjust to Dave J.'s mock data structure
> 	Dan
> 		Remove mock specific functionality in main code
> 
> Changes from v1:
> 	Fix up for new structures
> 	Jonathan
> 		Update based on specification discussion
> 
> Changes from RFC v2:
> 	Adjust to simulate the event status register
> 
> Changes from RFC:
> 	Separate out the event code
> 	Adjust for struct changes.
> 	Clean up devm_cxl_mock_event_logs()
> 	Clean up naming and comments
> 	Jonathan
> 		Remove dynamic allocation of event logs
> 		Clean up comment
> 		Remove unneeded xarray
> 		Ensure event_trigger sysfs is valid prior to the driver
> 		going active.
> 	Dan
> 		Remove the fill/reset event sysfs as these operations
> 		can be done together
> ---
>  tools/testing/cxl/test/Kbuild   |   4 +-
>  tools/testing/cxl/test/events.c | 195 ++++++++++++++++++++++++++++++++
>  tools/testing/cxl/test/events.h |  32 ++++++
>  tools/testing/cxl/test/mem.c    |  33 ++++--
>  tools/testing/cxl/test/mock.h   |  12 ++
>  5 files changed, 264 insertions(+), 12 deletions(-)
>  create mode 100644 tools/testing/cxl/test/events.c
>  create mode 100644 tools/testing/cxl/test/events.h
> 
> diff --git a/tools/testing/cxl/test/Kbuild b/tools/testing/cxl/test/Kbuild
> index 4e59e2c911f6..c48d912e3781 100644
> --- a/tools/testing/cxl/test/Kbuild
> +++ b/tools/testing/cxl/test/Kbuild
> @@ -1,5 +1,5 @@
>  # SPDX-License-Identifier: GPL-2.0
> -ccflags-y := -I$(srctree)/drivers/cxl/
> +ccflags-y := -I$(srctree)/drivers/cxl/ -I$(srctree)/drivers/cxl/core
>  
>  obj-m += cxl_test.o
>  obj-m += cxl_mock.o
> @@ -7,4 +7,4 @@ obj-m += cxl_mock_mem.o
>  
>  cxl_test-y := cxl.o
>  cxl_mock-y := mock.o
> -cxl_mock_mem-y := mem.o
> +cxl_mock_mem-y := mem.o events.o
> diff --git a/tools/testing/cxl/test/events.c b/tools/testing/cxl/test/events.c
> new file mode 100644
> index 000000000000..1346c38dce1d
> --- /dev/null
> +++ b/tools/testing/cxl/test/events.c
> @@ -0,0 +1,195 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright(c) 2022 Intel Corporation. All rights reserved.
> +
> +#include "mock.h"
> +#include "events.h"
> +#include "trace.h"
> +
> +struct mock_event_log *find_event_log(struct device *dev, int log_type)
> +{
> +	struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
> +
> +	if (log_type >= CXL_EVENT_TYPE_MAX)
> +		return NULL;
> +	return &mdata->mes.mock_logs[log_type];
> +}
> +
> +struct cxl_event_record_raw *get_cur_event(struct mock_event_log *log)
> +{
> +	return log->events[log->cur_idx];
> +}
> +
> +void reset_event_log(struct mock_event_log *log)
> +{
> +	log->cur_idx = 0;
> +	log->clear_idx = 0;
> +}
> +
> +/* Handle can never be 0; use 1-based indexing for handle */
> +u16 get_clear_handle(struct mock_event_log *log)
> +{
> +	return log->clear_idx + 1;
> +}
> +
> +/* Handle can never be 0; use 1-based indexing for handle */
> +__le16 get_cur_event_handle(struct mock_event_log *log)
> +{
> +	u16 cur_handle = log->cur_idx + 1;
> +
> +	return cpu_to_le16(cur_handle);
> +}
> +
> +static bool log_empty(struct mock_event_log *log)
> +{
> +	return log->cur_idx == log->nr_events;
> +}
> +
> +static void event_store_add_event(struct mock_event_store *mes,
> +				  enum cxl_event_log_type log_type,
> +				  struct cxl_event_record_raw *event)
> +{
> +	struct mock_event_log *log;
> +
> +	if (WARN_ON(log_type >= CXL_EVENT_TYPE_MAX))
> +		return;
> +
> +	log = &mes->mock_logs[log_type];
> +	if (WARN_ON(log->nr_events >= CXL_TEST_EVENT_CNT_MAX))
> +		return;
> +
> +	log->events[log->nr_events] = event;
> +	log->nr_events++;
> +}
> +
> +int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +{
> +	struct cxl_get_event_payload *pl;
> +	struct mock_event_log *log;
> +	u8 log_type;
> +	int i;
> +
> +	if (cmd->size_in != sizeof(log_type))
> +		return -EINVAL;
> +
> +	if (cmd->size_out < struct_size(pl, records, CXL_TEST_EVENT_CNT))
> +		return -EINVAL;
> +
> +	log_type = *((u8 *)cmd->payload_in);
> +	if (log_type >= CXL_EVENT_TYPE_MAX)
> +		return -EINVAL;
> +
> +	memset(cmd->payload_out, 0, cmd->size_out);
> +
> +	log = find_event_log(cxlds->dev, log_type);
> +	if (!log || log_empty(log))
> +		return 0;
> +
> +	pl = cmd->payload_out;
> +
> +	for (i = 0; i < CXL_TEST_EVENT_CNT && !log_empty(log); i++) {
> +		memcpy(&pl->records[i], get_cur_event(log), sizeof(pl->records[i]));
> +		pl->records[i].hdr.handle = get_cur_event_handle(log);
> +		log->cur_idx++;
> +	}
> +
> +	pl->record_count = cpu_to_le16(i);
> +	if (!log_empty(log))
> +		pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(mock_get_event);

I believe I asked this before on the last review. Why is this exported?
The caller is within the same module.

I also notice now that the event support code is even smaller than the
security code that is already in that file, just go ahead and move this
infrastructure in there as well. Then this does not need to move the
cxl_mockmem_data definition around or create other new headers.

Other than that minor detail the implementation looks good. I appreciate
the effort to get cxl_test to light up this interface!

After consolidating in mem.c you can add:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

> +
> +int mock_clear_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +{
> +	struct cxl_mbox_clear_event_payload *pl = cmd->payload_in;
> +	struct mock_event_log *log;
> +	u8 log_type = pl->event_log;
> +	u16 handle;
> +	int nr;
> +
> +	if (log_type >= CXL_EVENT_TYPE_MAX)
> +		return -EINVAL;
> +
> +	log = find_event_log(cxlds->dev, log_type);
> +	if (!log)
> +		return 0; /* No mock data in this log */
> +
> +	/*
> +	 * This check is technically not invalid per the specification AFAICS.
> +	 * (The host could 'guess' handles and clear them in order).
> +	 * However, this is not good behavior for the host so test it.
> +	 */
> +	if (log->clear_idx + pl->nr_recs > log->cur_idx) {
> +		dev_err(cxlds->dev,
> +			"Attempting to clear more events than returned!\n");
> +		return -EINVAL;
> +	}
> +
> +	/* Check handle order prior to clearing events */
> +	for (nr = 0, handle = get_clear_handle(log);
> +	     nr < pl->nr_recs;
> +	     nr++, handle++) {
> +		if (handle != le16_to_cpu(pl->handle[nr])) {
> +			dev_err(cxlds->dev, "Clearing events out of order\n");
> +			return -EINVAL;
> +		}
> +	}
> +
> +	/* Clear events */
> +	log->clear_idx += pl->nr_recs;
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(mock_clear_event);
> +
> +void cxl_mock_event_trigger(struct device *dev)
> +{
> +	struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
> +	struct mock_event_store *mes = &mdata->mes;
> +	int i;
> +
> +	for (i = CXL_EVENT_TYPE_INFO; i < CXL_EVENT_TYPE_MAX; i++) {
> +		struct mock_event_log *log;
> +
> +		log = find_event_log(dev, i);
> +		if (log)
> +			reset_event_log(log);
> +	}
> +
> +	cxl_mem_get_event_records(mes->cxlds, mes->ev_status);
> +}
> +EXPORT_SYMBOL_GPL(cxl_mock_event_trigger);
> +
> +struct cxl_event_record_raw maint_needed = {
> +	.hdr = {
> +		.id = UUID_INIT(0xBA5EBA11, 0xABCD, 0xEFEB,
> +				0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
> +		.length = sizeof(struct cxl_event_record_raw),
> +		.flags[0] = CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,
> +		/* .handle = Set dynamically */
> +		.related_handle = cpu_to_le16(0xa5b6),
> +	},
> +	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
> +};
> +
> +struct cxl_event_record_raw hardware_replace = {
> +	.hdr = {
> +		.id = UUID_INIT(0xABCDEFEB, 0xBA11, 0xBA5E,
> +				0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
> +		.length = sizeof(struct cxl_event_record_raw),
> +		.flags[0] = CXL_EVENT_RECORD_FLAG_HW_REPLACE,
> +		/* .handle = Set dynamically */
> +		.related_handle = cpu_to_le16(0xb6a5),
> +	},
> +	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
> +};
> +
> +void cxl_mock_add_event_logs(struct mock_event_store *mes)
> +{
> +	event_store_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
> +	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
> +
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
> +	mes->ev_status |= CXLDEV_EVENT_STATUS_FATAL;
> +}
> +EXPORT_SYMBOL_GPL(cxl_mock_add_event_logs);
> diff --git a/tools/testing/cxl/test/events.h b/tools/testing/cxl/test/events.h
> new file mode 100644
> index 000000000000..626cd79f1871
> --- /dev/null
> +++ b/tools/testing/cxl/test/events.h
> @@ -0,0 +1,32 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef CXL_TEST_EVENTS_H
> +#define CXL_TEST_EVENTS_H
> +
> +#include <cxlmem.h>
> +
> +#define CXL_TEST_EVENT_CNT_MAX 15
> +
> +/* Set a number of events to return at a time for simulation.  */
> +#define CXL_TEST_EVENT_CNT 3
> +
> +struct mock_event_log {
> +	u16 clear_idx;
> +	u16 cur_idx;
> +	u16 nr_events;
> +	struct cxl_event_record_raw *events[CXL_TEST_EVENT_CNT_MAX];
> +};
> +
> +struct mock_event_store {
> +	struct cxl_dev_state *cxlds;
> +	struct mock_event_log mock_logs[CXL_EVENT_TYPE_MAX];
> +	u32 ev_status;
> +};
> +
> +int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> +int mock_clear_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> +void cxl_mock_add_event_logs(struct mock_event_store *mes);
> +void cxl_mock_remove_event_logs(struct device *dev);
> +void cxl_mock_event_trigger(struct device *dev);
> +
> +#endif /* CXL_TEST_EVENTS_H */
> diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> index 5e4ecd93f1d2..7674d6305d28 100644
> --- a/tools/testing/cxl/test/mem.c
> +++ b/tools/testing/cxl/test/mem.c
> @@ -8,6 +8,7 @@
>  #include <linux/sizes.h>
>  #include <linux/bits.h>
>  #include <cxlmem.h>
> +#include "mock.h"
>  
>  #define LSA_SIZE SZ_128K
>  #define DEV_SIZE SZ_2G
> @@ -67,16 +68,6 @@ static struct {
>  
>  #define PASS_TRY_LIMIT 3
>  
> -struct cxl_mockmem_data {
> -	void *lsa;
> -	u32 security_state;
> -	u8 user_pass[NVDIMM_PASSPHRASE_LEN];
> -	u8 master_pass[NVDIMM_PASSPHRASE_LEN];
> -	int user_limit;
> -	int master_limit;
> -
> -};
> -
>  static int mock_gsl(struct cxl_mbox_cmd *cmd)
>  {
>  	if (cmd->size_out < sizeof(mock_gsl_payload))
> @@ -582,6 +573,12 @@ static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *
>  	case CXL_MBOX_OP_GET_PARTITION_INFO:
>  		rc = mock_partition_info(cxlds, cmd);
>  		break;
> +	case CXL_MBOX_OP_GET_EVENT_RECORD:
> +		rc = mock_get_event(cxlds, cmd);
> +		break;
> +	case CXL_MBOX_OP_CLEAR_EVENT_RECORD:
> +		rc = mock_clear_event(cxlds, cmd);
> +		break;
>  	case CXL_MBOX_OP_SET_LSA:
>  		rc = mock_set_lsa(cxlds, cmd);
>  		break;
> @@ -628,6 +625,15 @@ static bool is_rcd(struct platform_device *pdev)
>  	return !!id->driver_data;
>  }
>  
> +static ssize_t event_trigger_store(struct device *dev,
> +				   struct device_attribute *attr,
> +				   const char *buf, size_t count)
> +{
> +	cxl_mock_event_trigger(dev);
> +	return count;
> +}
> +static DEVICE_ATTR_WO(event_trigger);
> +
>  static int cxl_mock_mem_probe(struct platform_device *pdev)
>  {
>  	struct device *dev = &pdev->dev;
> @@ -655,6 +661,7 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
>  	cxlds->serial = pdev->id;
>  	cxlds->mbox_send = cxl_mock_mbox_send;
>  	cxlds->payload_size = SZ_4K;
> +	cxlds->event.buf = (struct cxl_get_event_payload *) mdata->event_buf;
>  	if (is_rcd(pdev)) {
>  		cxlds->rcd = true;
>  		cxlds->component_reg_phys = CXL_RESOURCE_NONE;
> @@ -672,10 +679,15 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
>  	if (rc)
>  		return rc;
>  
> +	mdata->mes.cxlds = cxlds;
> +	cxl_mock_add_event_logs(&mdata->mes);
> +
>  	cxlmd = devm_cxl_add_memdev(cxlds);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +
>  	return 0;
>  }
>  
> @@ -714,6 +726,7 @@ static DEVICE_ATTR_RW(security_lock);
>  
>  static struct attribute *cxl_mock_mem_attrs[] = {
>  	&dev_attr_security_lock.attr,
> +	&dev_attr_event_trigger.attr,
>  	NULL
>  };
>  ATTRIBUTE_GROUPS(cxl_mock_mem);
> diff --git a/tools/testing/cxl/test/mock.h b/tools/testing/cxl/test/mock.h
> index ef33f159375e..e7827ddedb06 100644
> --- a/tools/testing/cxl/test/mock.h
> +++ b/tools/testing/cxl/test/mock.h
> @@ -3,6 +3,18 @@
>  #include <linux/list.h>
>  #include <linux/acpi.h>
>  #include <cxl.h>
> +#include "events.h"
> +
> +struct cxl_mockmem_data {
> +	void *lsa;
> +	u32 security_state;
> +	u8 user_pass[NVDIMM_PASSPHRASE_LEN];
> +	u8 master_pass[NVDIMM_PASSPHRASE_LEN];
> +	int user_limit;
> +	int master_limit;
> +	struct mock_event_store mes;
> +	u8 event_buf[SZ_4K];
> +};

Related to the above this does not belong here. test/mock.h is dedicated
to supporting the symbols that are rerouted between the CXL modules,
cxl_mock, and cxl_test.
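
For readers following the mock logic quoted above, the handle bookkeeping (1-based handles, a fixed batch per Get, and in-order Clear) can be modelled in userspace roughly as follows. This is a simplified sketch: `mock_get()` and `mock_clear()` are hypothetical reductions of the quoted `mock_get_event()` and `mock_clear_event()`, with the mailbox payloads replaced by plain arrays.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EVENT_CNT 3	/* records returned per Get, as in the quoted patch */

/* Reduced version of the quoted struct mock_event_log (indices only). */
struct mock_log {
	uint16_t clear_idx;
	uint16_t cur_idx;
	uint16_t nr_events;
};

static bool log_empty(const struct mock_log *log)
{
	return log->cur_idx == log->nr_events;
}

/* Return up to EVENT_CNT handles; *more is set if records remain. */
static int mock_get(struct mock_log *log, uint16_t *handles, bool *more)
{
	int i;

	assert(log != NULL && handles != NULL && more != NULL);

	for (i = 0; i < EVENT_CNT && !log_empty(log); i++) {
		/* A handle can never be 0, so handles are index + 1. */
		handles[i] = (uint16_t)(log->cur_idx + 1);
		log->cur_idx++;
	}

	*more = !log_empty(log);
	return i;
}

/* Accept a clear only for already-returned records, in handle order. */
static int mock_clear(struct mock_log *log, const uint16_t *handles, int nr)
{
	uint16_t expect;
	int i;

	assert(log != NULL && handles != NULL);

	if (log->clear_idx + nr > log->cur_idx)
		return -EINVAL;	/* clearing more than was returned */

	expect = (uint16_t)(log->clear_idx + 1);
	for (i = 0; i < nr; i++, expect++) {
		if (handles[i] != expect)
			return -EINVAL;	/* out of order */
	}

	log->clear_idx = (uint16_t)(log->clear_idx + nr);
	return 0;
}
```

With five records in a log, the first Get yields handles 1 through 3 with the more-records indication set, and the second yields 4 and 5 with it cleared.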

^ permalink raw reply	[flat|nested] 25+ messages in thread

* RE: [PATCH V3 8/8] cxl/test: Simulate event log overflow
  2022-12-08  5:21 ` [PATCH V3 8/8] cxl/test: Simulate event log overflow ira.weiny
@ 2022-12-09 22:52   ` Dan Williams
  2022-12-12  4:21     ` Ira Weiny
  0 siblings, 1 reply; 25+ messages in thread
From: Dan Williams @ 2022-12-09 22:52 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Ira Weiny, Jonathan Cameron, Bjorn Helgaas, Alison Schofield,
	Vishal Verma, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

ira.weiny@ wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Log overflow is marked by a separate trace message.
> 
> Simulate a log with lots of messages and flag overflow until space is
> cleared.

This and patch 7 look good to me after addressing the move to mem.c.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V3 1/8] cxl/mem: Read, trace, and clear events on driver load
  2022-12-09 22:33       ` Dan Williams
@ 2022-12-09 23:34         ` Ira Weiny
  2022-12-12 17:58           ` Jonathan Cameron
  0 siblings, 1 reply; 25+ messages in thread
From: Ira Weiny @ 2022-12-09 23:34 UTC (permalink / raw)
  To: Dan Williams
  Cc: Bjorn Helgaas, Alison Schofield, Vishal Verma, Davidlohr Bueso,
	Jonathan Cameron, Dave Jiang, linux-kernel, linux-pci,
	linux-acpi, linux-cxl

On Fri, Dec 09, 2022 at 02:33:20PM -0800, Dan Williams wrote:
> Ira Weiny wrote:
> [..]
> > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > index 3a66aadb4df0..86c84611a168 100644
> > > > --- a/drivers/cxl/pci.c
> > > > +++ b/drivers/cxl/pci.c
> > > > @@ -417,8 +417,44 @@ static void disable_aer(void *pdev)
> > > >  	pci_disable_pcie_error_reporting(pdev);
> > > >  }
> > > >  
> > > > +static void cxl_mem_free_event_buffer(void *buf)
> > > > +{
> > > > +	kvfree(buf);
> > > > +}
> > > > +
> > > > +/*
> > > > + * There is a single buffer for reading event logs from the mailbox.  All logs
> > > > + * share this buffer protected by the cxlds->event_log_lock.
> > > > + */
> > > > +static void cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> > > > +{
> > > > +	struct cxl_get_event_payload *buf;
> > > > +
> > > > +	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
> > > > +		cxlds->payload_size);
> > > > +
> > > > +	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> > > > +	if (WARN_ON_ONCE(!buf))
> > > 
> > > No, why is event init so special that it behaves differently than all
> > > the other init-time allocations this driver does?
> > 
> > Previous review agreed that a warn on once would be printed if this universal
> > buffer was not allocated.
> > 
> > > 
> > > > +		return;
> > > 
> > > return -ENOMEM;
> > > 
> > > > +
> > > > +	if (WARN_ON_ONCE(devm_add_action_or_reset(cxlds->dev,
> > > > +			 cxl_mem_free_event_buffer, buf)))
> > > > +		return;
> > > 
> > > ditto.
> > 
> > I'll change both of these with a dev_err() and bail during init.
> 
> No real need to dev_err() for a simple memory allocation failure, but
> at least it is better than a WARN.

Ok no error then.

> 
> > 
> > > 
> > > > +
> > > > +	cxlds->event.buf = buf;
> > > > +}
> > > > +
> > > > +static void cxl_clear_event_logs(struct cxl_dev_state *cxlds)
> > > > +{
> > > > +	/* Force read and clear of all logs */
> > > > +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > > > +	/* Ensure prior partial reads are handled, by starting over again */
> > > 
> > > What partial reads? cxl_mem_get_event_records() reads every log until
> > > each returns an empty result. Any remaining events after this returns
> > > are events that fired during the retrieval.
> > 
> > Jonathan was concerned that something could read part of the log and because of
> > the statefulness of the log processing this reading of the log could start in
> > the beginning.  Perhaps from a previous driver unload while reading?
> 
> The driver will not unload without completing any current executions of
> the event retrieval thread otherwise that's an irq shutdown bug.
> 
> > I guess I was also thinking the BIOS could leave things this way?  But I think
> > we should not be here if the BIOS was ever involved right?
> 
> If the OS has CXL Error control and all Event irqs are steered to the OS
> then the driver must be allowed to assume that it has exclusive control
> over event retrieval and clearing.
> 
> > > So I do not think cxl_clear_event_logs() needs to exist, just call
> > > cxl_mem_get_event_records(CXLDEV_EVENT_STATUS_ALL) once and that's it.
> > 
> > That was my inclination but Jonathan's comments got me thinking I was wrong.
> 
> Perhaps that was before we realized the recent CXL _OSC entanglement.

Yea that could have been.  I'm not clear on the order of the comments.

Ok this should be good to go.  Reworking the rest of the series.

Thanks for the review!
Ira

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH V3 2/8] cxl/mem: Wire up event interrupts
  2022-12-09 21:49   ` Dan Williams
@ 2022-12-10  1:44     ` Ira Weiny
  0 siblings, 0 replies; 25+ messages in thread
From: Ira Weiny @ 2022-12-10  1:44 UTC (permalink / raw)
  To: Dan Williams
  Cc: Davidlohr Bueso, Bjorn Helgaas, Jonathan Cameron, Bjorn Helgaas,
	Alison Schofield, Vishal Verma, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

On Fri, Dec 09, 2022 at 01:49:40PM -0800, Dan Williams wrote:
> ira.weiny@ wrote:
> > From: Davidlohr Bueso <dave@stgolabs.net>
> > 
> > Currently the only CXL features targeted for irq support require their
> > message numbers to be within the first 16 entries.  The device may
> > however support fewer than 16 entries depending on the support it
> > provides.
> > 
> > Attempt to allocate these 16 irq vectors.  If the device supports fewer
> > then the PCI infrastructure will allocate that number.  Upon successful
> > allocation, users can plug in their respective isr at any point
> > thereafter.
> > 
> > CXL device events are signaled via interrupts.  Each event log may have
> > a different interrupt message number.  These message numbers are
> > reported in the Get Event Interrupt Policy mailbox command.
> > 
> > Add interrupt support for event logs.  Interrupts are allocated as
> > shared interrupts.  Therefore, all or some event logs can share the same
> > message number.
> > 
> > In addition all logs are queried on any interrupt in order of the most
> > to least severe based on the status register.
> > 
> > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> > 
> > ---
> > Changes from V2:
> > 	General clean up
> > 	Use cxl_log_id to ensure each irq is unique even if the message numbers are not
> > 	Jonathan/Dan
> > 		Only set up irq vector when OSC indicates OS control
> > 	Dan
> > 		Loop reading while status indicates there are more
> > 			events.
> > 		Use new cxl_internal_send_cmd()
> > 		Squash MSI/MSIx base patch from Davidlohr
> > 		Remove uapi defines altogether
> > 		Remove use of msi_enabled
> > 	Remove the use of cxl_event_log_type_str()
> > 	Pick up tag
> > 
> > Changes from V1:
> > 	Remove unneeded evt_int_policy from struct cxl_dev_state
> > 	defer Dynamic Capacity support
> > 	Dave Jiang
> > 		s/irq/rc
> > 		use IRQ_NONE to signal the irq was not for us.
> > 	Jonathan
> > 		use msi_enabled rather than nr_irq_vec
> > 		On failure explicitly set CXL_INT_NONE
> > 		Add comment for Get Event Interrupt Policy
> > 		use devm_request_threaded_irq()
> > 		Use individual handler/thread functions for each of the
> > 		logs rather than struct cxl_event_irq_id.
> > 
> > Changes from RFC v2
> > 	Adjust to new irq 16 vector allocation
> > 	Jonathan
> > 		Remove CXL_INT_RES
> > 	Use irq threads to ensure mailbox commands are executed outside irq context
> > 	Adjust for optional Dynamic Capacity log
> > ---
> >  drivers/cxl/core/mbox.c | 42 +++++++++++++++++++
> >  drivers/cxl/cxlmem.h    | 28 +++++++++++++
> >  drivers/cxl/cxlpci.h    |  6 +++
> >  drivers/cxl/pci.c       | 90 ++++++++++++++++++++++++++++++++++++++++-
> >  4 files changed, 165 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index 815da3aac081..2b25691a9b09 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> > @@ -854,6 +854,48 @@ void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status)
> >  }
> >  EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> >  
> > +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> > +			     struct cxl_event_interrupt_policy *policy)
> > +{
> > +	struct cxl_mbox_cmd mbox_cmd;
> > +	int rc;
> > +
> > +	policy->info_settings = CXL_INT_MSI_MSIX;
> > +	policy->warn_settings = CXL_INT_MSI_MSIX;
> > +	policy->failure_settings = CXL_INT_MSI_MSIX;
> > +	policy->fatal_settings = CXL_INT_MSI_MSIX;
> 
> For Robustness Principle "be conservative in what is sent" purposes I
> would do the Get Events first to make sure that nothing is steered to
> the Firmware VDM, and warn the user that their BIOS gave the OS CXL
> Error Control, but did not shutdown event interrupts.
> 
> I.e. if the event interrupts are still steered to BIOS then BIOS may
> think it still has control of the event logs and trouble ensues.

Easy enough to do.
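
A sketch of what that pre-check could look like, assuming the policy has already been fetched with Get Event Interrupt Policy. The constants mirror the values in the quoted cxlmem.h hunk below; the `INT_*` names, `struct int_policy`, and `policy_steered_to_fw()` are hypothetical and only illustrate the decode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirror enum cxl_event_int_mode in the quoted patch. */
enum int_mode {
	INT_NONE	= 0x00,
	INT_MSI_MSIX	= 0x01,
	INT_FW		= 0x02,
};
#define INT_MODE_MASK		0x3
#define INT_MSGNUM(setting)	(((setting) & 0xf0) >> 4)

/* One settings byte per log, as in struct cxl_event_interrupt_policy. */
struct int_policy {
	uint8_t info;
	uint8_t warn;
	uint8_t failure;
	uint8_t fatal;
};

static bool setting_is_fw(uint8_t setting)
{
	return (setting & INT_MODE_MASK) == INT_FW;
}

/*
 * True if any log is still steered to firmware.  A caller would check
 * this on the policy read back from the device before overwriting it,
 * and warn if the BIOS handed over control without disabling these.
 */
static bool policy_steered_to_fw(const struct int_policy *p)
{
	assert(p != NULL);

	return setting_is_fw(p->info) || setting_is_fw(p->warn) ||
	       setting_is_fw(p->failure) || setting_is_fw(p->fatal);
}
```

The low two bits of each byte carry the mode and the high nibble carries the message number, so a byte of 0x32 decodes to firmware mode with message number 3.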

> 
> > +
> > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > +		.opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
> > +		.payload_in = policy,
> > +		.size_in = sizeof(*policy),
> > +	};
> > +
> > +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > +	if (rc < 0) {
> > > +		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d\n",
> > +			rc);
> > +		return rc;
> > +	}
> > +
> > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > +		.opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
> > +		.payload_out = policy,
> > +		.size_out = sizeof(*policy),
> > +	};
> > +
> > +	/* Retrieve interrupt message numbers */
> > +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > +	if (rc < 0) {
> > > +		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d\n",
> > +			rc);
> > +		return rc;
> > +	}
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_event_config_msgnums, CXL);
> 
> A question, why is this function in the core and not in cxl_pci? For
> cxl_test mocking purposes? Otherwise seems ok to keep this in the same
> file as its only caller.

Just following the pattern that functions issuing mailbox commands were in the
core/mbox.c...  I did not realize that was so that they could be in the mock
module.

I'll move it.

> 
> > +
> >  /**
> >   * cxl_mem_get_partition_info - Get partition info
> >   * @cxlds: The device data for the operation
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index dd9aa3dd738e..350cb460e7fc 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -194,6 +194,30 @@ struct cxl_endpoint_dvsec_info {
> >  	struct range dvsec_range[2];
> >  };
> >  
> > +/**
> > + * Event Interrupt Policy
> > + *
> > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > + */
> > +enum cxl_event_int_mode {
> > +	CXL_INT_NONE		= 0x00,
> > +	CXL_INT_MSI_MSIX	= 0x01,
> > +	CXL_INT_FW		= 0x02
> > +};
> > +#define CXL_EVENT_INT_MODE_MASK 0x3
> > +#define CXL_EVENT_INT_MSGNUM(setting) (((setting) & 0xf0) >> 4)
> > +struct cxl_event_interrupt_policy {
> > +	u8 info_settings;
> > +	u8 warn_settings;
> > +	u8 failure_settings;
> > +	u8 fatal_settings;
> > +} __packed;
> > +
> > +static inline bool cxl_evt_int_is_msi(u8 setting)
> > +{
> > +	return CXL_INT_MSI_MSIX == (setting & CXL_EVENT_INT_MODE_MASK);
> > +}
> > +
> >  /**
> >   * struct cxl_event_state - Event log driver state
> >   *
> > @@ -288,6 +312,8 @@ enum cxl_opcode {
> >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> >  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> >  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> > +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> > +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
> >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> > @@ -525,6 +551,8 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
> >  void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> >  void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> >  void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status);
> > +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> > +			     struct cxl_event_interrupt_policy *policy);
> >  #ifdef CONFIG_CXL_SUSPEND
> >  void cxl_mem_active_inc(void);
> >  void cxl_mem_active_dec(void);
> > diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> > index 77dbdb980b12..4aaadf17a985 100644
> > --- a/drivers/cxl/cxlpci.h
> > +++ b/drivers/cxl/cxlpci.h
> > @@ -53,6 +53,12 @@
> >  #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
> >  #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
> >  
> > +/*
> > + * NOTE: Currently all the functions which are enabled for CXL require their
> > + * vectors to be in the first 16.  Use this as the max.
> > + */
> > +#define CXL_PCI_REQUIRED_VECTORS 16
> > +
> >  /* Register Block Identifier (RBI) */
> >  enum cxl_regloc_type {
> >  	CXL_REGLOC_RBI_EMPTY = 0,
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 86c84611a168..c84922a287ec 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -452,6 +452,90 @@ static void cxl_clear_event_logs(struct cxl_dev_state *cxlds)
> >  	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> >  }
> >  
> > +static void cxl_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> > +{
> > +	struct device *dev = cxlds->dev;
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	int nvecs;
> > +
> > +	/*
> > +	 * pci_alloc_irq_vectors() handles calling pci_free_irq_vectors()
> > +	 * automatically despite not being called pcim_*.  See
> > +	 * pci_setup_msi_context().
> > +	 */
> 
> I think a more important comment is why the flags are limited to MSIX
> and MSI, that's a non-obvious CXL spec constraint.

Ok yea I'll add that.

But I think the above is important as I missed that detail and went off the
rails.  I would not want someone trying to 'fix' this by adding a devres action
later.

> 
> > +	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_REQUIRED_VECTORS,
> 
> Since I have some other fixups below I'll go ahead and quibble with the
> name. The 'requirement' is 1 vector, so
> s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_VECTORS/ or something like
> that. As it stands today there are diminishing returns in asking for more
> than that amount.

ok.

> 
> In the future, if the code knows better that a specific device could
> benefit from more than the default, then it can arrange to override
> this. Absent that, today there is no reason to try to ask for more.

Yes

> 
> > +				      PCI_IRQ_MSIX | PCI_IRQ_MSI);
> > +	if (nvecs < 1)
> > +		dev_dbg(dev, "Failed to alloc irq vectors: %d\n", nvecs);
> 
> Just fail the driver load if this happens. There is something wrong if a
> PCI driver cannot even allocate 1 vector.

Ok

> 
> > +}
> > +
> > +struct cxl_dev_id {
> > +	struct cxl_dev_state *cxlds;
> > +};
> > +
> > +static irqreturn_t cxl_event_thread(int irq, void *id)
> > +{
> > +	struct cxl_dev_id *dev_id = id;
> > +	struct cxl_dev_state *cxlds = dev_id->cxlds;
> > +	u32 status;
> > +
> > +	/*
> > +	 * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
> > +	 * ignore the reserved upper 32 bits
> > +	 */
> > +	status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > +	while (status) {
> > +		cxl_mem_get_event_records(cxlds, status);
> > +		cond_resched();
> > +		status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > +	}
> > +	return IRQ_HANDLED;
> > +}
> > +
> > +static int cxl_req_event_irq(struct cxl_dev_state *cxlds, u8 setting)
> > +{
> > +	struct device *dev = cxlds->dev;
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	struct cxl_dev_id *dev_id;
> > +	int irq;
> > +
> > +	if (!cxl_evt_int_is_msi(setting))
> > +		return -ENXIO;
> > +
> > +	/* dev_id must be globally unique and must contain the cxlds */
> > +	dev_id = devm_kmalloc(dev, sizeof(*dev_id), GFP_KERNEL);
> 
> Yes, the id is simple and fully initialized below, but this is not a
> fast path and the rest of the driver uses devm_kzalloc() even if it
> fully inits the result. So it's a consistency thing and maybe a "save the
> future person who adds another field without initializing it some
> hassle" thing.

Ah yea, changed.

> 
> > +	if (!dev_id)
> > +		return -ENOMEM;
> > +	dev_id->cxlds = cxlds;
> > +
> > +	irq =  pci_irq_vector(pdev, CXL_EVENT_INT_MSGNUM(setting));
> > +	if (irq < 0)
> > +		return irq;
> > +
> > +	return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> > +					 IRQF_SHARED, NULL, dev_id);
> > +}
> > +
> > +static void cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> > +{
> > +	struct cxl_event_interrupt_policy policy;
> > +
> > +	if (cxl_event_config_msgnums(cxlds, &policy))
> > +		return;
> 
> If not all interrupts can be steered to the OS probably best to abort
> the entire event setup.

It seems like if native_cxl_error is true and the irq policy is FW then the
device and/or BIOS have misconfigured something and this should be a driver
load failure, not just aborting the event setup.

Right?  Based on all the other things which are causing driver load failures it
seems like this should follow that same pattern.

> 
> Otherwise if you can steer all to the OS, if any of the below fails that
> should be a driver load failure. I certainly do not want to debug
> someone's system that randomly failed alternating log type interrupts.

ok.
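
Roughly, the all-or-nothing behavior could look like the user-space sketch
below.  It is only a model under stated assumptions: the callback type and the
two stubs are hypothetical stand-ins for the per-log IRQ request, and the mode
values are taken from the quoted enum.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define INT_MODE_MASK     0x3u
#define INT_MODE_MSI_MSIX 0x1u

/* Hypothetical callback standing in for the per-log IRQ request. */
typedef int (*req_irq_fn)(uint8_t setting);

/* Stubs used only to exercise the sketch. */
static int stub_req_ok(uint8_t setting)    { (void)setting; return 0; }
static int stub_req_nomem(uint8_t setting) { (void)setting; return -ENOMEM; }

/*
 * Request an interrupt for every log; stop at the first failure and
 * hand the error back so the caller can fail the probe.
 */
static int event_irqsetup(const uint8_t settings[4], req_irq_fn req)
{
	for (size_t i = 0; i < 4; i++) {
		int rc;

		/* A log not steered to MSI/MSI-X is treated as an error. */
		if ((settings[i] & INT_MODE_MASK) != INT_MODE_MSI_MSIX)
			return -ENXIO;

		rc = req(settings[i]);
		if (rc)
			return rc;
	}
	return 0;
}
```

Returning the first error, instead of logging and continuing, avoids a system
where only some of the four logs raise interrupts.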

> 
> > +
> > +	if (cxl_req_event_irq(cxlds, policy.info_settings))
> > +		dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
> > +
> > +	if (cxl_req_event_irq(cxlds, policy.warn_settings))
> > +		dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
> > +
> > +	if (cxl_req_event_irq(cxlds, policy.failure_settings))
> > +		dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
> > +
> > +	if (cxl_req_event_irq(cxlds, policy.fatal_settings))
> > +		dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
> > +}
> > +
> >  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  {
> >  	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> > @@ -526,14 +610,18 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (rc)
> >  		return rc;
> >  
> > +	cxl_alloc_irq_vectors(cxlds);
> 
> Just pass the pdev directly here, no other part of cxlds is needed.

Ok yea.

Ira

> 
> > +
> >  	cxlmd = devm_cxl_add_memdev(cxlds);
> >  	if (IS_ERR(cxlmd))
> >  		return PTR_ERR(cxlmd);
> >  
> >  	if (host_bridge->native_cxl_error) {
> >  		cxl_mem_alloc_event_buf(cxlds);
> > -		if (cxlds->event.buf)
> > +		if (cxlds->event.buf) {
> > +			cxl_event_irqsetup(cxlds);
> >  			cxl_clear_event_logs(cxlds);
> > +		}
> >  	}
> >  
> >  	if (cxlds->regs.ras) {
> > -- 
> > 2.37.2
> > 
> 
> 

* Re: [PATCH V3 3/8] cxl/mem: Trace General Media Event Record
  2022-12-09 22:04   ` Dan Williams
@ 2022-12-11 16:08     ` Ira Weiny
From: Ira Weiny @ 2022-12-11 16:08 UTC (permalink / raw)
  To: Dan Williams
  Cc: Bjorn Helgaas, Alison Schofield, Vishal Verma, Davidlohr Bueso,
	Jonathan Cameron, Dave Jiang, linux-kernel, linux-pci,
	linux-acpi, linux-cxl

On Fri, Dec 09, 2022 at 02:04:23PM -0800, Dan Williams wrote:
> ira.weiny@ wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record.
> > 
> > Determine if the event read is a general media record and if so trace
> > the record as a General Media Event Record.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> > ---
> > Changes from V2:
> > 	Dan
> > 		Remove trace_*_enabled() calls
> > 		Pass struct device to trace points
> > 
> > Changes from V1:
> > 	Jonathan
> > 		fix spec references for CXL rev 3.0
> > 		Make flags all caps
> > 
> > Changes from RFC v2:
> > 	Output DPA flags as a single field
> > 	Ensure names of fields match what TP_print outputs
> > 	Steven
> > 		prefix TRACE_EVENT with 'cxl_'
> > 	Jonathan
> > 		Remove Reserved field
> > 
> > Changes from RFC:
> > 	Add reserved byte array
> > 	Use common CXL event header record macros
> > 	Jonathan
> > 		Use unaligned_le{24,16} for unaligned fields
> > 		Don't use the inverse of phy addr mask
> > 	Dave Jiang
> > 		s/cxl_gen_media_event/general_media
> > 		s/cxl_evt_gen_media/cxl_event_gen_media
> > ---
> >  drivers/cxl/core/mbox.c  |  30 +++++++++-
> >  drivers/cxl/core/trace.h | 124 +++++++++++++++++++++++++++++++++++++++
> >  drivers/cxl/cxlmem.h     |  19 ++++++
> >  3 files changed, 171 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index 2b25691a9b09..0d8c66f1cdc5 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> > @@ -718,6 +718,32 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
> >  }
> >  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
> >  
> > +/*
> > + * General Media Event Record
> > + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> > + */
> > +static const uuid_t gen_media_event_uuid =
> > +	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
> > +		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
> > +
> > +static void cxl_trace_event_record(const struct device *dev,
> > +				   enum cxl_event_log_type type,
> > +				   struct cxl_event_record_raw *record)
> > +{
> > +	uuid_t *id = &record->hdr.id;
> > +
> > +	if (uuid_equal(id, &gen_media_event_uuid)) {
> > +		struct cxl_event_gen_media *rec =
> > +				(struct cxl_event_gen_media *)record;
> > +
> > +		trace_cxl_general_media(dev, type, rec);
> > +		return;
> > +	}
> > +
> > +	/* For unknown record types print just the header */
> > +	trace_cxl_generic_event(dev, type, record);
> > +}
> > +
> >  static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> >  				  enum cxl_event_log_type log,
> >  				  struct cxl_get_event_payload *get_pl)
> > @@ -810,8 +836,8 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> >  			int i;
> >  
> >  			for (i = 0; i < nr_rec; i++)
> > -				trace_cxl_generic_event(cxlds->dev, type,
> > -							&payload->records[i]);
> > +				cxl_trace_event_record(cxlds->dev, type,
> > +						       &payload->records[i]);
> >  
> >  			rc = cxl_clear_event_record(cxlds, type, payload);
> >  			if (rc) {
> > diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> > index 24eef6909f13..82462942590b 100644
> > --- a/drivers/cxl/core/trace.h
> > +++ b/drivers/cxl/core/trace.h
> > @@ -223,6 +223,130 @@ TRACE_EVENT(cxl_generic_event,
> >  		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
> >  );
> >  
> > +/*
> > + * Physical Address field masks
> > + *
> > + * General Media Event Record
> > + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> > + *
> > + * DRAM Event Record
> > + * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> > + */
> > +#define CXL_DPA_FLAGS_MASK			0x3F
> > +#define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
> > +
> > +#define CXL_DPA_VOLATILE			BIT(0)
> > +#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
> > +#define show_dpa_flags(flags)	__print_flags(flags, "|",		   \
> > +	{ CXL_DPA_VOLATILE,			"VOLATILE"		}, \
> > +	{ CXL_DPA_NOT_REPAIRABLE,		"NOT_REPAIRABLE"	}  \
> > +)
> > +
> > +/*
> > + * General Media Event Record - GMER
> > + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> > + */
> > +#define CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT		BIT(0)
> > +#define CXL_GMER_EVT_DESC_THRESHOLD_EVENT		BIT(1)
> > +#define CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW		BIT(2)
> > +#define show_event_desc_flags(flags)	__print_flags(flags, "|",		   \
> > +	{ CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,		"UNCORRECTABLE_EVENT"	}, \
> > +	{ CXL_GMER_EVT_DESC_THRESHOLD_EVENT,		"THRESHOLD_EVENT"	}, \
> > +	{ CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW,	"POISON_LIST_OVERFLOW"	}  \
> > +)
> > +
> > +#define CXL_GMER_MEM_EVT_TYPE_ECC_ERROR			0x00
> > +#define CXL_GMER_MEM_EVT_TYPE_INV_ADDR			0x01
> > +#define CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR		0x02
> > +#define show_mem_event_type(type)	__print_symbolic(type,			\
> > +	{ CXL_GMER_MEM_EVT_TYPE_ECC_ERROR,		"ECC Error" },		\
> > +	{ CXL_GMER_MEM_EVT_TYPE_INV_ADDR,		"Invalid Address" },	\
> > +	{ CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,	"Data Path Error" }	\
> > +)
> > +
> > +#define CXL_GMER_TRANS_UNKNOWN				0x00
> > +#define CXL_GMER_TRANS_HOST_READ			0x01
> > +#define CXL_GMER_TRANS_HOST_WRITE			0x02
> > +#define CXL_GMER_TRANS_HOST_SCAN_MEDIA			0x03
> > +#define CXL_GMER_TRANS_HOST_INJECT_POISON		0x04
> > +#define CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB		0x05
> > +#define CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT	0x06
> > +#define show_trans_type(type)	__print_symbolic(type,					\
> > +	{ CXL_GMER_TRANS_UNKNOWN,			"Unknown" },			\
> > +	{ CXL_GMER_TRANS_HOST_READ,			"Host Read" },			\
> > +	{ CXL_GMER_TRANS_HOST_WRITE,			"Host Write" },			\
> > +	{ CXL_GMER_TRANS_HOST_SCAN_MEDIA,		"Host Scan Media" },		\
> > +	{ CXL_GMER_TRANS_HOST_INJECT_POISON,		"Host Inject Poison" },		\
> > +	{ CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,		"Internal Media Scrub" },	\
> > +	{ CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT,	"Internal Media Management" }	\
> > +)
> > +
> > +#define CXL_GMER_VALID_CHANNEL				BIT(0)
> > +#define CXL_GMER_VALID_RANK				BIT(1)
> > +#define CXL_GMER_VALID_DEVICE				BIT(2)
> > +#define CXL_GMER_VALID_COMPONENT			BIT(3)
> > +#define show_valid_flags(flags)	__print_flags(flags, "|",		   \
> > +	{ CXL_GMER_VALID_CHANNEL,			"CHANNEL"	}, \
> > +	{ CXL_GMER_VALID_RANK,				"RANK"		}, \
> > +	{ CXL_GMER_VALID_DEVICE,			"DEVICE"	}, \
> > +	{ CXL_GMER_VALID_COMPONENT,			"COMPONENT"	}  \
> > +)
> > +
> > +TRACE_EVENT(cxl_general_media,
> > +
> > +	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
> > +		 struct cxl_event_gen_media *rec),
> > +
> > +	TP_ARGS(dev, log, rec),
> > +
> > +	TP_STRUCT__entry(
> > +		CXL_EVT_TP_entry
> > +		/* General Media */
> > +		__field(u64, dpa)
> > +		__field(u8, descriptor)
> > +		__field(u8, type)
> > +		__field(u8, transaction_type)
> > +		__field(u8, channel)
> > +		__field(u32, device)
> > +		__array(u8, comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE)
> > +		__field(u16, validity_flags)
> > +		/* Following are out of order to pack trace record */
> > +		__field(u8, rank)
> > +		__field(u8, dpa_flags)
> > +	),
> > +
> > +	TP_fast_assign(
> > +		CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
> > +
> > +		/* General Media */
> > +		__entry->dpa = le64_to_cpu(rec->phys_addr);
> > +		__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
> > +		/* Mask after flags have been parsed */
> > +		__entry->dpa &= CXL_DPA_MASK;
> > +		__entry->descriptor = rec->descriptor;
> > +		__entry->type = rec->type;
> > +		__entry->transaction_type = rec->transaction_type;
> > +		__entry->channel = rec->channel;
> > +		__entry->rank = rec->rank;
> > +		__entry->device = get_unaligned_le24(rec->device);
> > +		memcpy(__entry->comp_id, &rec->component_id,
> > +			CXL_EVENT_GEN_MED_COMP_ID_SIZE);
> > +		__entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
> > +	),
> > +
> > +	CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' " \
> > +		"descriptor='%s' type='%s' transaction_type='%s' channel=%u rank=%u " \
> > +		"device=%x comp_id=%s validity_flags='%s'",
> > +		__entry->dpa, show_dpa_flags(__entry->dpa_flags),
> > +		show_event_desc_flags(__entry->descriptor),
> > +		show_mem_event_type(__entry->type),
> > +		show_trans_type(__entry->transaction_type),
> > +		__entry->channel, __entry->rank, __entry->device,
> > +		__print_hex(__entry->comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE),
> > +		show_valid_flags(__entry->validity_flags)
> > +	)
> > +);
> > +
> >  #endif /* _CXL_EVENTS_H */
> >  
> >  #define TRACE_INCLUDE_FILE trace
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index 350cb460e7fc..a5f5d4a380af 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -456,6 +456,25 @@ struct cxl_mbox_clear_event_payload {
> >  		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
> >  		sizeof(__le16))
> >  
> > +/*
> > + * General Media Event Record
> > + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> > + */
> > +#define CXL_EVENT_GEN_MED_COMP_ID_SIZE	0x10
> > +struct cxl_event_gen_media {
> > +	struct cxl_event_record_hdr hdr;
> > +	__le64 phys_addr;
> > +	u8 descriptor;
> > +	u8 type;
> > +	u8 transaction_type;
> > +	u8 validity_flags[2];
> > +	u8 channel;
> > +	u8 rank;
> > +	u8 device[3];
> > +	u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
> > +	u8 reserved[0x2e];
> 
> If you reflow this one again to make capitalization of symbols
> consistent in the trace prints perhaps change that to decimal, but
> that's not a blocker.

Done.
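
For reference, a user-space sketch of the record body with the sizes written
in decimal.  It assumes the data area that follows the common event header is
80 bytes; the struct and macro names are illustrative, and plain byte arrays
stand in for the driver's __le64 and u8 fields.

```c
#include <stdint.h>

#define GEN_MED_COMP_ID_SIZE     16	/* 0x10 in the quoted patch */
#define GEN_MED_RESERVED_SIZE    46	/* 0x2e in the quoted patch */
#define EVENT_RECORD_DATA_LENGTH 80	/* assumed size of the data area */

/* Body of the record only; the common header is left out here. */
struct gen_media_body {
	uint8_t phys_addr[8];		/* __le64 in the driver */
	uint8_t descriptor;
	uint8_t type;
	uint8_t transaction_type;
	uint8_t validity_flags[2];
	uint8_t channel;
	uint8_t rank;
	uint8_t device[3];
	uint8_t component_id[GEN_MED_COMP_ID_SIZE];
	uint8_t reserved[GEN_MED_RESERVED_SIZE];
};

/* Compile-time check that the fields add up to the assumed data length. */
_Static_assert(sizeof(struct gen_media_body) == EVENT_RECORD_DATA_LENGTH,
	       "general media body must fill the record data area");
```

With decimal sizes the sum 8 + 3 + 2 + 2 + 3 + 16 + 46 = 80 is easy to check
by eye.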

> 
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>

Thanks!
Ira

* Re: [PATCH V3 4/8] cxl/mem: Trace DRAM Event Record
  2022-12-09 22:14   ` Dan Williams
@ 2022-12-11 16:21     ` Ira Weiny
From: Ira Weiny @ 2022-12-11 16:21 UTC (permalink / raw)
  To: Dan Williams
  Cc: Bjorn Helgaas, Alison Schofield, Vishal Verma, Davidlohr Bueso,
	Jonathan Cameron, Dave Jiang, linux-kernel, linux-pci,
	linux-acpi, linux-cxl

On Fri, Dec 09, 2022 at 02:14:41PM -0800, Dan Williams wrote:
> ira.weiny@ wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.
> > 
> > Determine if the event read is a DRAM event record and if so trace the
> > record.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> > ---
> > Changes from v2:
> > 	Dan
> > 		Move tracing to cxl core
> > 		Remove trace_*_enabled() calls
> > 		Pass struct device to trace points
> > 
> > Changes from RFC v2:
> > 	Output DPA flags as a separate field.
> > 	Ensure field names match TP_print output
> > 	Steven
> > 		prefix TRACE_EVENT with 'cxl_'
> > 	Jonathan
> > 		Formatting fix
> > 		Remove reserved field
> > 
> > Changes from RFC:
> > 	Add reserved byte data
> > 	Use new CXL header macros
> > 	Jonathan
> > 		Use get_unaligned_le{24,16}() for unaligned fields
> > 		Use 'else if'
> > 	Dave Jiang
> > 		s/cxl_dram_event/dram
> > 		s/cxl_evt_dram_rec/cxl_event_dram
> > 	Adjust for new phys addr mask
> > ---
> >  drivers/cxl/core/mbox.c  | 13 ++++++
> >  drivers/cxl/core/trace.h | 92 ++++++++++++++++++++++++++++++++++++++++
> >  drivers/cxl/cxlmem.h     | 23 ++++++++++
> >  3 files changed, 128 insertions(+)
> > 
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index 0d8c66f1cdc5..2fa4645f0ed9 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> > @@ -726,6 +726,14 @@ static const uuid_t gen_media_event_uuid =
> >  	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
> >  		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
> >  
> > +/*
> > + * DRAM Event Record
> > + * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> > + */
> > +static const uuid_t dram_event_uuid =
> > +	UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
> > +		  0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
> > +
> >  static void cxl_trace_event_record(const struct device *dev,
> >  				   enum cxl_event_log_type type,
> >  				   struct cxl_event_record_raw *record)
> > @@ -738,6 +746,11 @@ static void cxl_trace_event_record(const struct device *dev,
> >  
> >  		trace_cxl_general_media(dev, type, rec);
> >  		return;
> > +	} else if (uuid_equal(id, &dram_event_uuid)) {
> > +		struct cxl_event_dram *rec = (struct cxl_event_dram *)record;
> > +
> > +		trace_cxl_dram(dev, type, rec);
> > +		return;
> 
> I think I mentioned this before,

Sorry, I don't remember seeing that.

>
> but rather than a "return" in every
> branch just make the 'unknown' case the final else in this if block.

Sounds good.  However the previous patch started this pattern so I fixed it
there and continued through these.
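
The resulting shape, sketched in user space: one if / else if chain with the
unknown case as the final else and a single exit.  The enum and function names
are hypothetical, and the byte arrays spell out the two quoted UUIDs in the
order UUID_INIT() is expected to store them.

```c
#include <stdint.h>
#include <string.h>

enum record_kind { RECORD_GENERIC, RECORD_GEN_MEDIA, RECORD_DRAM };

/* General Media Event Record UUID from the quoted patch. */
static const uint8_t gen_media_id[16] = {
	0xfb, 0xcd, 0x0a, 0x77, 0xc2, 0x60, 0x41, 0x7f,
	0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6,
};

/* DRAM Event Record UUID from the quoted patch. */
static const uint8_t dram_id[16] = {
	0x60, 0x1d, 0xcb, 0xb3, 0x9c, 0x06, 0x4e, 0xab,
	0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24,
};

/* Pick the trace point for a record; no early returns needed. */
static enum record_kind classify_record(const uint8_t id[16])
{
	enum record_kind kind;

	if (memcmp(id, gen_media_id, 16) == 0)
		kind = RECORD_GEN_MEDIA;
	else if (memcmp(id, dram_id, 16) == 0)
		kind = RECORD_DRAM;
	else
		kind = RECORD_GENERIC;	/* unknown: header-only trace */

	return kind;
}
```

Adding another record type then means one more else if branch ahead of the
final else.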

> 
> With that feel free to add:
> 
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>

Thanks!
Ira

* Re: [PATCH V3 6/8] cxl/test: Add generic mock events
  2022-12-09 22:48   ` Dan Williams
@ 2022-12-11 17:26     ` Ira Weiny
From: Ira Weiny @ 2022-12-11 17:26 UTC (permalink / raw)
  To: Dan Williams
  Cc: Bjorn Helgaas, Alison Schofield, Vishal Verma, Davidlohr Bueso,
	Jonathan Cameron, Dave Jiang, linux-kernel, linux-pci,
	linux-acpi, linux-cxl

On Fri, Dec 09, 2022 at 02:48:29PM -0800, Dan Williams wrote:
> ira.weiny@ wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Facilitate testing basic Get/Clear Event functionality by creating
> > > multiple logs and generic events with made-up UUIDs.
> > 
> > Data is completely made up with data patterns which should be easy to
> > spot in trace output.
> > 
> > A single sysfs entry resets the event data and triggers collecting the
> > events for testing.
> > 
> > Test traces are easy to obtain with a small script such as this:
> > 
> > 	#!/bin/bash -x
> > 
> > > 	devices=$(find /sys/devices/platform -name 'cxl_mem*')
> > 
> > 	# Turn on tracing
> > 	echo "" > /sys/kernel/tracing/trace
> > 	echo 1 > /sys/kernel/tracing/events/cxl/enable
> > 	echo 1 > /sys/kernel/tracing/tracing_on
> > 
> > 	# Generate fake interrupt
> > 	for device in $devices; do
> > 	        echo 1 > $device/event_trigger
> > 	done
> > 
> > 	# Turn off tracing and report events
> > 	echo 0 > /sys/kernel/tracing/tracing_on
> > 	cat /sys/kernel/tracing/trace
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> > ---
> > Changes from v2:
> > 	Adjust for tracing being part of cxl core
> > 	Dan/Dave J.
> > 		Adjust to Dave J.s mock data structure
> > 	Dan
> > 		Remove mock specific functionality in main code
> > 
> > Changes from v1:
> > 	Fix up for new structures
> > 	Jonathan
> > 		Update based on specification discussion
> > 
> > Changes from RFC v2:
> > 	Adjust to simulate the event status register
> > 
> > Changes from RFC:
> > 	Separate out the event code
> > 	Adjust for struct changes.
> > 	Clean up devm_cxl_mock_event_logs()
> > 	Clean up naming and comments
> > 	Jonathan
> > 		Remove dynamic allocation of event logs
> > 		Clean up comment
> > 		Remove unneeded xarray
> > 		Ensure event_trigger sysfs is valid prior to the driver
> > 		going active.
> > 	Dan
> > 		Remove the fill/reset event sysfs as these operations
> > 		can be done together
> > ---
> >  tools/testing/cxl/test/Kbuild   |   4 +-
> >  tools/testing/cxl/test/events.c | 195 ++++++++++++++++++++++++++++++++
> >  tools/testing/cxl/test/events.h |  32 ++++++
> >  tools/testing/cxl/test/mem.c    |  33 ++++--
> >  tools/testing/cxl/test/mock.h   |  12 ++
> >  5 files changed, 264 insertions(+), 12 deletions(-)
> >  create mode 100644 tools/testing/cxl/test/events.c
> >  create mode 100644 tools/testing/cxl/test/events.h
> > 
> > diff --git a/tools/testing/cxl/test/Kbuild b/tools/testing/cxl/test/Kbuild
> > index 4e59e2c911f6..c48d912e3781 100644
> > --- a/tools/testing/cxl/test/Kbuild
> > +++ b/tools/testing/cxl/test/Kbuild
> > @@ -1,5 +1,5 @@
> >  # SPDX-License-Identifier: GPL-2.0
> > -ccflags-y := -I$(srctree)/drivers/cxl/
> > +ccflags-y := -I$(srctree)/drivers/cxl/ -I$(srctree)/drivers/cxl/core
> >  
> >  obj-m += cxl_test.o
> >  obj-m += cxl_mock.o
> > @@ -7,4 +7,4 @@ obj-m += cxl_mock_mem.o
> >  
> >  cxl_test-y := cxl.o
> >  cxl_mock-y := mock.o
> > -cxl_mock_mem-y := mem.o
> > +cxl_mock_mem-y := mem.o events.o
> > diff --git a/tools/testing/cxl/test/events.c b/tools/testing/cxl/test/events.c
> > new file mode 100644
> > index 000000000000..1346c38dce1d
> > --- /dev/null
> > +++ b/tools/testing/cxl/test/events.c
> > @@ -0,0 +1,195 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +// Copyright(c) 2022 Intel Corporation. All rights reserved.
> > +
> > +#include "mock.h"
> > +#include "events.h"
> > +#include "trace.h"
> > +
> > +struct mock_event_log *find_event_log(struct device *dev, int log_type)
> > +{
> > +	struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
> > +
> > +	if (log_type >= CXL_EVENT_TYPE_MAX)
> > +		return NULL;
> > +	return &mdata->mes.mock_logs[log_type];
> > +}
> > +
> > +struct cxl_event_record_raw *get_cur_event(struct mock_event_log *log)
> > +{
> > +	return log->events[log->cur_idx];
> > +}
> > +
> > +void reset_event_log(struct mock_event_log *log)
> > +{
> > +	log->cur_idx = 0;
> > +	log->clear_idx = 0;
> > +}
> > +
> > > +/* Handle can never be 0; use 1-based indexing for the handle */
> > +u16 get_clear_handle(struct mock_event_log *log)
> > +{
> > +	return log->clear_idx + 1;
> > +}
> > +
> > > +/* Handle can never be 0; use 1-based indexing for the handle */
> > +__le16 get_cur_event_handle(struct mock_event_log *log)
> > +{
> > +	u16 cur_handle = log->cur_idx + 1;
> > +
> > +	return cpu_to_le16(cur_handle);
> > +}
> > +
> > +static bool log_empty(struct mock_event_log *log)
> > +{
> > +	return log->cur_idx == log->nr_events;
> > +}
> > +
> > +static void event_store_add_event(struct mock_event_store *mes,
> > +				  enum cxl_event_log_type log_type,
> > +				  struct cxl_event_record_raw *event)
> > +{
> > +	struct mock_event_log *log;
> > +
> > +	if (WARN_ON(log_type >= CXL_EVENT_TYPE_MAX))
> > +		return;
> > +
> > +	log = &mes->mock_logs[log_type];
> > +	if (WARN_ON(log->nr_events >= CXL_TEST_EVENT_CNT_MAX))
> > +		return;
> > +
> > +	log->events[log->nr_events] = event;
> > +	log->nr_events++;
> > +}
> > +
> > +int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> > +{
> > +	struct cxl_get_event_payload *pl;
> > +	struct mock_event_log *log;
> > +	u8 log_type;
> > +	int i;
> > +
> > +	if (cmd->size_in != sizeof(log_type))
> > +		return -EINVAL;
> > +
> > +	if (cmd->size_out < struct_size(pl, records, CXL_TEST_EVENT_CNT))
> > +		return -EINVAL;
> > +
> > +	log_type = *((u8 *)cmd->payload_in);
> > +	if (log_type >= CXL_EVENT_TYPE_MAX)
> > +		return -EINVAL;
> > +
> > +	memset(cmd->payload_out, 0, cmd->size_out);
> > +
> > +	log = find_event_log(cxlds->dev, log_type);
> > +	if (!log || log_empty(log))
> > +		return 0;
> > +
> > +	pl = cmd->payload_out;
> > +
> > +	for (i = 0; i < CXL_TEST_EVENT_CNT && !log_empty(log); i++) {
> > +		memcpy(&pl->records[i], get_cur_event(log), sizeof(pl->records[i]));
> > +		pl->records[i].hdr.handle = get_cur_event_handle(log);
> > +		log->cur_idx++;
> > +	}
> > +
> > +	pl->record_count = cpu_to_le16(i);
> > +	if (!log_empty(log))
> > +		pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(mock_get_event);
> 
> I believe I asked this before on the last review. Why is this exported?
> The caller is within the same module.

I don't recall the comment, sorry.  I don't recall why I felt the need to add
the export.  This is now obviously not needed.

> 
> I also notice now that the event support code is even smaller than the
> security code that is already in that file, just go ahead and move this
> infrastructure in there as well. Then this does not need to move the
> cxl_mockmem_data definition around or create other new headers.
> 
> Other than that minor detail the implementation looks good. I appreciate
> the effort to get cxl_test to light up this interface!
> 
> After consolidating in mem.c you can add:

After all the changes the code has shrunk quite a bit.

Moved to mem.c.

> 
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>

Thanks!
Ira

> 
> > +
> > +int mock_clear_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> > +{
> > +	struct cxl_mbox_clear_event_payload *pl = cmd->payload_in;
> > +	struct mock_event_log *log;
> > +	u8 log_type = pl->event_log;
> > +	u16 handle;
> > +	int nr;
> > +
> > +	if (log_type >= CXL_EVENT_TYPE_MAX)
> > +		return -EINVAL;
> > +
> > +	log = find_event_log(cxlds->dev, log_type);
> > +	if (!log)
> > +		return 0; /* No mock data in this log */
> > +
> > +	/*
> > +	 * This check is technically not invalid per the specification AFAICS.
> > +	 * (The host could 'guess' handles and clear them in order).
> > +	 * However, this is not good behavior for the host so test it.
> > +	 */
> > +	if (log->clear_idx + pl->nr_recs > log->cur_idx) {
> > +		dev_err(cxlds->dev,
> > +			"Attempting to clear more events than returned!\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	/* Check handle order prior to clearing events */
> > +	for (nr = 0, handle = get_clear_handle(log);
> > +	     nr < pl->nr_recs;
> > +	     nr++, handle++) {
> > +		if (handle != le16_to_cpu(pl->handle[nr])) {
> > +			dev_err(cxlds->dev, "Clearing events out of order\n");
> > +			return -EINVAL;
> > +		}
> > +	}
> > +
> > +	/* Clear events */
> > +	log->clear_idx += pl->nr_recs;
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(mock_clear_event);
> > +
> > +void cxl_mock_event_trigger(struct device *dev)
> > +{
> > +	struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
> > +	struct mock_event_store *mes = &mdata->mes;
> > +	int i;
> > +
> > +	for (i = CXL_EVENT_TYPE_INFO; i < CXL_EVENT_TYPE_MAX; i++) {
> > +		struct mock_event_log *log;
> > +
> > +		log = find_event_log(dev, i);
> > +		if (log)
> > +			reset_event_log(log);
> > +	}
> > +
> > +	cxl_mem_get_event_records(mes->cxlds, mes->ev_status);
> > +}
> > +EXPORT_SYMBOL_GPL(cxl_mock_event_trigger);
> > +
> > +struct cxl_event_record_raw maint_needed = {
> > +	.hdr = {
> > +		.id = UUID_INIT(0xBA5EBA11, 0xABCD, 0xEFEB,
> > +				0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
> > +		.length = sizeof(struct cxl_event_record_raw),
> > +		.flags[0] = CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,
> > +		/* .handle = Set dynamically */
> > +		.related_handle = cpu_to_le16(0xa5b6),
> > +	},
> > +	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
> > +};
> > +
> > +struct cxl_event_record_raw hardware_replace = {
> > +	.hdr = {
> > +		.id = UUID_INIT(0xABCDEFEB, 0xBA11, 0xBA5E,
> > +				0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
> > +		.length = sizeof(struct cxl_event_record_raw),
> > +		.flags[0] = CXL_EVENT_RECORD_FLAG_HW_REPLACE,
> > +		/* .handle = Set dynamically */
> > +		.related_handle = cpu_to_le16(0xb6a5),
> > +	},
> > +	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
> > +};
> > +
> > +void cxl_mock_add_event_logs(struct mock_event_store *mes)
> > +{
> > +	event_store_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
> > +	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
> > +
> > +	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
> > +	mes->ev_status |= CXLDEV_EVENT_STATUS_FATAL;
> > +}
> > +EXPORT_SYMBOL_GPL(cxl_mock_add_event_logs);
> > diff --git a/tools/testing/cxl/test/events.h b/tools/testing/cxl/test/events.h
> > new file mode 100644
> > index 000000000000..626cd79f1871
> > --- /dev/null
> > +++ b/tools/testing/cxl/test/events.h
> > @@ -0,0 +1,32 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +#ifndef CXL_TEST_EVENTS_H
> > +#define CXL_TEST_EVENTS_H
> > +
> > +#include <cxlmem.h>
> > +
> > +#define CXL_TEST_EVENT_CNT_MAX 15
> > +
> > +/* Set a number of events to return at a time for simulation.  */
> > +#define CXL_TEST_EVENT_CNT 3
> > +
> > +struct mock_event_log {
> > +	u16 clear_idx;
> > +	u16 cur_idx;
> > +	u16 nr_events;
> > +	struct cxl_event_record_raw *events[CXL_TEST_EVENT_CNT_MAX];
> > +};
> > +
> > +struct mock_event_store {
> > +	struct cxl_dev_state *cxlds;
> > +	struct mock_event_log mock_logs[CXL_EVENT_TYPE_MAX];
> > +	u32 ev_status;
> > +};
> > +
> > +int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> > +int mock_clear_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> > +void cxl_mock_add_event_logs(struct mock_event_store *mes);
> > +void cxl_mock_remove_event_logs(struct device *dev);
> > +void cxl_mock_event_trigger(struct device *dev);
> > +
> > +#endif /* CXL_TEST_EVENTS_H */
> > diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> > index 5e4ecd93f1d2..7674d6305d28 100644
> > --- a/tools/testing/cxl/test/mem.c
> > +++ b/tools/testing/cxl/test/mem.c
> > @@ -8,6 +8,7 @@
> >  #include <linux/sizes.h>
> >  #include <linux/bits.h>
> >  #include <cxlmem.h>
> > +#include "mock.h"
> >  
> >  #define LSA_SIZE SZ_128K
> >  #define DEV_SIZE SZ_2G
> > @@ -67,16 +68,6 @@ static struct {
> >  
> >  #define PASS_TRY_LIMIT 3
> >  
> > -struct cxl_mockmem_data {
> > -	void *lsa;
> > -	u32 security_state;
> > -	u8 user_pass[NVDIMM_PASSPHRASE_LEN];
> > -	u8 master_pass[NVDIMM_PASSPHRASE_LEN];
> > -	int user_limit;
> > -	int master_limit;
> > -
> > -};
> > -
> >  static int mock_gsl(struct cxl_mbox_cmd *cmd)
> >  {
> >  	if (cmd->size_out < sizeof(mock_gsl_payload))
> > @@ -582,6 +573,12 @@ static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *
> >  	case CXL_MBOX_OP_GET_PARTITION_INFO:
> >  		rc = mock_partition_info(cxlds, cmd);
> >  		break;
> > +	case CXL_MBOX_OP_GET_EVENT_RECORD:
> > +		rc = mock_get_event(cxlds, cmd);
> > +		break;
> > +	case CXL_MBOX_OP_CLEAR_EVENT_RECORD:
> > +		rc = mock_clear_event(cxlds, cmd);
> > +		break;
> >  	case CXL_MBOX_OP_SET_LSA:
> >  		rc = mock_set_lsa(cxlds, cmd);
> >  		break;
> > @@ -628,6 +625,15 @@ static bool is_rcd(struct platform_device *pdev)
> >  	return !!id->driver_data;
> >  }
> >  
> > +static ssize_t event_trigger_store(struct device *dev,
> > +				   struct device_attribute *attr,
> > +				   const char *buf, size_t count)
> > +{
> > +	cxl_mock_event_trigger(dev);
> > +	return count;
> > +}
> > +static DEVICE_ATTR_WO(event_trigger);
> > +
> >  static int cxl_mock_mem_probe(struct platform_device *pdev)
> >  {
> >  	struct device *dev = &pdev->dev;
> > @@ -655,6 +661,7 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
> >  	cxlds->serial = pdev->id;
> >  	cxlds->mbox_send = cxl_mock_mbox_send;
> >  	cxlds->payload_size = SZ_4K;
> > +	cxlds->event.buf = (struct cxl_get_event_payload *) mdata->event_buf;
> >  	if (is_rcd(pdev)) {
> >  		cxlds->rcd = true;
> >  		cxlds->component_reg_phys = CXL_RESOURCE_NONE;
> > @@ -672,10 +679,15 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
> >  	if (rc)
> >  		return rc;
> >  
> > +	mdata->mes.cxlds = cxlds;
> > +	cxl_mock_add_event_logs(&mdata->mes);
> > +
> >  	cxlmd = devm_cxl_add_memdev(cxlds);
> >  	if (IS_ERR(cxlmd))
> >  		return PTR_ERR(cxlmd);
> >  
> > +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > +
> >  	return 0;
> >  }
> >  
> > @@ -714,6 +726,7 @@ static DEVICE_ATTR_RW(security_lock);
> >  
> >  static struct attribute *cxl_mock_mem_attrs[] = {
> >  	&dev_attr_security_lock.attr,
> > +	&dev_attr_event_trigger.attr,
> >  	NULL
> >  };
> >  ATTRIBUTE_GROUPS(cxl_mock_mem);
> > diff --git a/tools/testing/cxl/test/mock.h b/tools/testing/cxl/test/mock.h
> > index ef33f159375e..e7827ddedb06 100644
> > --- a/tools/testing/cxl/test/mock.h
> > +++ b/tools/testing/cxl/test/mock.h
> > @@ -3,6 +3,18 @@
> >  #include <linux/list.h>
> >  #include <linux/acpi.h>
> >  #include <cxl.h>
> > +#include "events.h"
> > +
> > +struct cxl_mockmem_data {
> > +	void *lsa;
> > +	u32 security_state;
> > +	u8 user_pass[NVDIMM_PASSPHRASE_LEN];
> > +	u8 master_pass[NVDIMM_PASSPHRASE_LEN];
> > +	int user_limit;
> > +	int master_limit;
> > +	struct mock_event_store mes;
> > +	u8 event_buf[SZ_4K];
> > +};
> 
> Related to the above this does not belong here. test/mock.h is dedicated
> to supporting the symbols that are rerouted between the CXL modules,
> cxl_mock, and cxl_test.
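
[Editor's sketch] The index bookkeeping in the quoted struct mock_event_log (clear_idx, cur_idx, nr_events) and the in-order check in mock_clear_event() can be modelled in userspace roughly as follows. This is a minimal sketch, not the patch's code: the sketch_* names are hypothetical, records are reduced to bare handles, and the "cannot clear what was never read" check is an assumption added for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TEST_EVENT_CNT_MAX 15	/* mirrors CXL_TEST_EVENT_CNT_MAX */
#define TEST_EVENT_CNT      3	/* mirrors CXL_TEST_EVENT_CNT */

/* Simplified stand-in for struct mock_event_log; a record is just a handle. */
struct sketch_event_log {
	uint16_t clear_idx;	/* next record the host is expected to clear */
	uint16_t cur_idx;	/* next record a Get Event Records call returns */
	uint16_t nr_events;	/* number of records stored in the log */
	uint16_t handles[TEST_EVENT_CNT_MAX];
};

/* Copy out up to TEST_EVENT_CNT handles; returns how many were copied. */
static int sketch_get_events(struct sketch_event_log *log, uint16_t *out)
{
	int n = 0;

	assert(log->nr_events <= TEST_EVENT_CNT_MAX);
	while (n < TEST_EVENT_CNT && log->cur_idx < log->nr_events)
		out[n++] = log->handles[log->cur_idx++];
	return n;
}

/* Clear nr handles; any request that is not in log order is rejected whole. */
static int sketch_clear_events(struct sketch_event_log *log,
			       const uint16_t *handles, int nr)
{
	int i;

	/* Assumption: records that were never read cannot be cleared. */
	if (log->clear_idx + nr > log->cur_idx)
		return -EINVAL;

	for (i = 0; i < nr; i++) {
		if (handles[i] != log->handles[log->clear_idx + i])
			return -EINVAL;	/* out of order: nothing is cleared */
	}

	log->clear_idx += nr;
	return 0;
}
```

As in the quoted mock, the clear index only advances after every handle in the request has been validated, so a bad request leaves the log untouched.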

* Re: [PATCH V3 8/8] cxl/test: Simulate event log overflow
  2022-12-09 22:52   ` Dan Williams
@ 2022-12-12  4:21     ` Ira Weiny
  0 siblings, 0 replies; 25+ messages in thread
From: Ira Weiny @ 2022-12-12  4:21 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jonathan Cameron, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Dave Jiang, linux-kernel, linux-pci, linux-acpi,
	linux-cxl

On Fri, Dec 09, 2022 at 02:52:55PM -0800, Dan Williams wrote:
> ira.weiny@ wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Log overflow is marked by a separate trace message.
> > 
> > Simulate a log with lots of messages and flag overflow until space is
> > cleared.
> 
> This and patch 7 look good to me after addressing the move to mem.c.
> 
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>

Thanks!
Ira

* Re: [PATCH V3 1/8] cxl/mem: Read, trace, and clear events on driver load
  2022-12-09 23:34         ` Ira Weiny
@ 2022-12-12 17:58           ` Jonathan Cameron
  0 siblings, 0 replies; 25+ messages in thread
From: Jonathan Cameron @ 2022-12-12 17:58 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Dan Williams, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Dave Jiang, linux-kernel, linux-pci, linux-acpi,
	linux-cxl

On Fri, 9 Dec 2022 15:34:42 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> On Fri, Dec 09, 2022 at 02:33:20PM -0800, Dan Williams wrote:
> > Ira Weiny wrote:
> > [..]  
> > > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > > index 3a66aadb4df0..86c84611a168 100644
> > > > > --- a/drivers/cxl/pci.c
> > > > > +++ b/drivers/cxl/pci.c
> > > > > @@ -417,8 +417,44 @@ static void disable_aer(void *pdev)
> > > > >  	pci_disable_pcie_error_reporting(pdev);
> > > > >  }
> > > > >  
> > > > > +static void cxl_mem_free_event_buffer(void *buf)
> > > > > +{
> > > > > +	kvfree(buf);
> > > > > +}
> > > > > +
> > > > > +/*
> > > > > + * There is a single buffer for reading event logs from the mailbox.  All logs
> > > > > + * share this buffer protected by the cxlds->event_log_lock.
> > > > > + */
> > > > > +static void cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> > > > > +{
> > > > > +	struct cxl_get_event_payload *buf;
> > > > > +
> > > > > +	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
> > > > > +		cxlds->payload_size);
> > > > > +
> > > > > +	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> > > > > +	if (WARN_ON_ONCE(!buf))  
> > > > 
> > > > No, why is event init so special that it behaves differently than all
> > > > the other init-time allocations this driver does?  
> > > 
> > > Previous review agreed that a warn on once would be printed if this universal
> > > buffer was not allocated.
> > >   
> > > >   
> > > > > +		return;  
> > > > 
> > > > return -ENOMEM;
> > > >   
> > > > > +
> > > > > +	if (WARN_ON_ONCE(devm_add_action_or_reset(cxlds->dev,
> > > > > +			 cxl_mem_free_event_buffer, buf)))
> > > > > +		return;  
> > > > 
> > > > ditto.  
> > > 
> > > I'll change both of these with a dev_err() and bail during init.  
> > 
> > No real need to dev_err() for a simple memory allocation failure, but
> > at least it is better than a WARN  
> 
> Ok no error then.
> 
> >   
> > >   
> > > >   
> > > > > +
> > > > > +	cxlds->event.buf = buf;
> > > > > +}
> > > > > +
> > > > > +static void cxl_clear_event_logs(struct cxl_dev_state *cxlds)
> > > > > +{
> > > > > +	/* Force read and clear of all logs */
> > > > > +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > > > > +	/* Ensure prior partial reads are handled, by starting over again */  
> > > > 
> > > > What partial reads? cxl_mem_get_event_records() reads every log until
> > > > each returns an empty result. Any remaining events after this returns
> > > > are events that fired during the retrieval.  
> > > 
> > > Jonathan was concerned that something could read part of the log and because of
> > > the statefulness of the log processing this reading of the log could start in
> > > the beginning.  Perhaps from a previous driver unload while reading?  
> > 
> > The driver will not unload without completing any current executions of
> > the event retrieval thread otherwise that's an irq shutdown bug.
> >   
> > > I guess I was also thinking the BIOS could leave things this way?  But I think
> > > we should not be here if the BIOS was ever involved right?  
> > 
> > If the OS has CXL Error control and all Event irqs are steered to the OS
> > then the driver must be allowed to assume that it has exclusive control
> > over event retrieval and clearing.

The OS has control once it's asked for it ;)  We have no idea what the firmware
did before that.

> >   
> > > > So I do not think cxl_clear_event_logs() needs to exist, just call
> > > > cxl_mem_get_event_records(CXLDEV_EVENT_STATUS_ALL) once and that's it.  
> > > 
> > > That was my inclination but Jonathan's comments got me thinking I was wrong.  
> > 
> > Perhaps that was before we realized the recent CXL _OSC entanglement.  
> 
> Yea that could have been.  I'm not clear on the order of the comments.

I'm just paranoid - particularly when my excellent firmware writing colleagues
are involved (or our test teams who like to simulate weird sequences of events).

I'm fine crossing fingers until we know there is someone doing this sort of
crazy in the wild.

Jonathan


> 
> Ok this should be good to go.  Reworking the rest of the series.
> 
> Thanks for the review!
> Ira

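
[Editor's sketch] Dan's point above is that a single cxl_mem_get_event_records(CXLDEV_EVENT_STATUS_ALL) call already reads every log until it comes back empty, so a second pass adds nothing. The loop shape can be sketched in userspace as below. All sketch_* names, the batch size, and the device model are hypothetical; the real driver issues mailbox commands and traces each record rather than counting them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical log identifiers standing in for the CXL event log types. */
enum sketch_log_type { LOG_INFO, LOG_WARN, LOG_FAIL, LOG_FATAL, LOG_MAX };

#define BATCH 3	/* records returned per simulated Get Event Records call */

/* Hypothetical device model: each log only tracks how many records remain. */
struct sketch_dev {
	unsigned int pending[LOG_MAX];
};

/* Simulated Get + Clear: hand back up to BATCH records and drop them. */
static unsigned int sketch_get_and_clear(struct sketch_dev *dev,
					 enum sketch_log_type type)
{
	unsigned int n;

	assert(type < LOG_MAX);
	n = dev->pending[type] < BATCH ? dev->pending[type] : BATCH;
	dev->pending[type] -= n;
	return n;
}

/* Drain one log: keep reading until the device reports no records. */
static unsigned int sketch_drain_log(struct sketch_dev *dev,
				     enum sketch_log_type type)
{
	unsigned int total = 0, n;

	while ((n = sketch_get_and_clear(dev, type)) != 0)
		total += n;
	return total;
}

/* Drain every log whose bit is set in status; returns records handled. */
static unsigned int sketch_get_event_records(struct sketch_dev *dev,
					     uint32_t status)
{
	unsigned int total = 0;
	int type;

	for (type = LOG_INFO; type < LOG_MAX; type++) {
		if (status & (1u << type))
			total += sketch_drain_log(dev, (enum sketch_log_type)type);
	}
	return total;
}
```

In this model a second call with the same status mask finds nothing left to read, which is the behaviour Dan relies on. Anything that does show up after the first call returns would be an event that fired during retrieval.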

Thread overview: 25+ messages
2022-12-08  5:21 [PATCH V3 0/8] CXL: Process event logs ira.weiny
2022-12-08  5:21 ` [PATCH V3 1/8] cxl/mem: Read, trace, and clear events on driver load ira.weiny
2022-12-09 17:56   ` Dan Williams
2022-12-09 21:00     ` Ira Weiny
2022-12-09 22:33       ` Dan Williams
2022-12-09 23:34         ` Ira Weiny
2022-12-12 17:58           ` Jonathan Cameron
2022-12-08  5:21 ` [PATCH V3 2/8] cxl/mem: Wire up event interrupts ira.weiny
2022-12-09 21:49   ` Dan Williams
2022-12-10  1:44     ` Ira Weiny
2022-12-08  5:21 ` [PATCH V3 3/8] cxl/mem: Trace General Media Event Record ira.weiny
2022-12-09 22:04   ` Dan Williams
2022-12-11 16:08     ` Ira Weiny
2022-12-08  5:21 ` [PATCH V3 4/8] cxl/mem: Trace DRAM " ira.weiny
2022-12-09 22:14   ` Dan Williams
2022-12-11 16:21     ` Ira Weiny
2022-12-08  5:21 ` [PATCH V3 5/8] cxl/mem: Trace Memory Module " ira.weiny
2022-12-09 22:18   ` Dan Williams
2022-12-08  5:21 ` [PATCH V3 6/8] cxl/test: Add generic mock events ira.weiny
2022-12-09 22:48   ` Dan Williams
2022-12-11 17:26     ` Ira Weiny
2022-12-08  5:21 ` [PATCH V3 7/8] cxl/test: Add specific events ira.weiny
2022-12-08  5:21 ` [PATCH V3 8/8] cxl/test: Simulate event log overflow ira.weiny
2022-12-09 22:52   ` Dan Williams
2022-12-12  4:21     ` Ira Weiny
