* [PATCH V4 0/9] CXL: Process event logs
@ 2022-12-12  7:06 ira.weiny
  2022-12-12  7:06 ` [PATCH V4 1/9] PCI/CXL: Export native CXL error reporting control ira.weiny
                   ` (9 more replies)
  0 siblings, 10 replies; 36+ messages in thread
From: ira.weiny @ 2022-12-12  7:06 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

This code has been tested with a newer QEMU which allows more events to be
returned at a time, as well as an additional QMP event and interrupt
injection.  Those QEMU patches will follow once they have been cleaned up.

The series is now in 3 parts:

	1) Base functionality including interrupts
	2) Tracing specific events (Dynamic Capacity Event Record is deferred)
	3) cxl-test infrastructure for basic tests

Changes from V3
	Feedback from Dan
	Split out ACPI changes for Bjorn

- Link to v3: https://lore.kernel.org/all/20221208052115.800170-1-ira.weiny@intel.com/


Davidlohr Bueso (1):
  cxl/mem: Wire up event interrupts

Ira Weiny (8):
  PCI/CXL: Export native CXL error reporting control
  cxl/mem: Read, trace, and clear events on driver load
  cxl/mem: Trace General Media Event Record
  cxl/mem: Trace DRAM Event Record
  cxl/mem: Trace Memory Module Event Record
  cxl/test: Add generic mock events
  cxl/test: Add specific events
  cxl/test: Simulate event log overflow

 drivers/acpi/pci_root.c       |   3 +
 drivers/cxl/core/mbox.c       | 186 +++++++++++++
 drivers/cxl/core/trace.h      | 479 ++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h             |  16 ++
 drivers/cxl/cxlmem.h          | 171 ++++++++++++
 drivers/cxl/cxlpci.h          |   6 +
 drivers/cxl/pci.c             | 236 +++++++++++++++++
 drivers/pci/probe.c           |   1 +
 include/linux/pci.h           |   1 +
 tools/testing/cxl/test/Kbuild |   2 +-
 tools/testing/cxl/test/mem.c  | 352 +++++++++++++++++++++++++
 11 files changed, 1452 insertions(+), 1 deletion(-)


base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e
-- 
2.37.2


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH V4 1/9] PCI/CXL: Export native CXL error reporting control
  2022-12-12  7:06 [PATCH V4 0/9] CXL: Process event logs ira.weiny
@ 2022-12-12  7:06 ` ira.weiny
  2022-12-13 19:12   ` Dan Williams
                     ` (3 more replies)
  2022-12-12  7:06 ` [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load ira.weiny
                   ` (8 subsequent siblings)
  9 siblings, 4 replies; 36+ messages in thread
From: ira.weiny @ 2022-12-12  7:06 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Lukas Wunner, linux-pci, linux-acpi,
	Alison Schofield, Vishal Verma, Davidlohr Bueso,
	Jonathan Cameron, Dave Jiang, linux-kernel, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL _OSC Error Reporting Control is used by the OS to determine if
Firmware has control of various CXL error reporting capabilities
including the event logs.

Expose the result of negotiating CXL Error Reporting Control in struct
pci_host_bridge for consumption by the CXL drivers.
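
For illustration only (not part of this patch): the negotiation result is a
single host bridge bit which defaults to native (OS) control and is cleared
when _OSC does not grant control.  A minimal standalone sketch of that flow,
using a hypothetical model struct and an assumed bit value for
OSC_CXL_ERROR_REPORTING_CONTROL (the real value lives in include/linux/acpi.h):

```c
#include <stdint.h>

/* Assumed bit position for illustration only. */
#define OSC_CXL_ERROR_REPORTING_CONTROL (1u << 1)

struct host_bridge_model {
	unsigned int native_cxl_error : 1;	/* OS may use CXL RAS/Events */
};

/* Mirrors pci_init_host_bridge(): optimistically assume native control. */
static void init_bridge(struct host_bridge_model *b)
{
	b->native_cxl_error = 1;
}

/* Mirrors acpi_pci_root_create(): clear the flag when _OSC withholds it. */
static void apply_osc(struct host_bridge_model *b, uint32_t osc_ext_control_set)
{
	if (!(osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL))
		b->native_cxl_error = 0;
}
```

The CXL drivers then only need to test native_cxl_error rather than
re-evaluating _OSC results themselves.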

Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: linux-pci@vger.kernel.org
Cc: linux-acpi@vger.kernel.org
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V3:
	New patch split out
---
 drivers/acpi/pci_root.c | 3 +++
 drivers/pci/probe.c     | 1 +
 include/linux/pci.h     | 1 +
 3 files changed, 5 insertions(+)

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index b3c202d2a433..84030804a763 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -1047,6 +1047,9 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
 	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
 		host_bridge->native_dpc = 0;
 
+	if (!(root->osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL))
+		host_bridge->native_cxl_error = 0;
+
 	/*
 	 * Evaluate the "PCI Boot Configuration" _DSM Function.  If it
 	 * exists and returns 0, we must preserve any PCI resource
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 2f4e88a44e8b..34c9fd6840c4 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -596,6 +596,7 @@ static void pci_init_host_bridge(struct pci_host_bridge *bridge)
 	bridge->native_ltr = 1;
 	bridge->native_dpc = 1;
 	bridge->domain_nr = PCI_DOMAIN_NR_NOT_SET;
+	bridge->native_cxl_error = 1;
 
 	device_initialize(&bridge->dev);
 }
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 1f81807492ef..08c3ccd2617b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -577,6 +577,7 @@ struct pci_host_bridge {
 	unsigned int	native_pme:1;		/* OS may use PCIe PME */
 	unsigned int	native_ltr:1;		/* OS may use PCIe LTR */
 	unsigned int	native_dpc:1;		/* OS may use PCIe DPC */
+	unsigned int	native_cxl_error:1;	/* OS may use CXL RAS/Events */
 	unsigned int	preserve_config:1;	/* Preserve FW resource setup */
 	unsigned int	size_windows:1;		/* Enable root bus sizing */
 	unsigned int	msi_domain:1;		/* Bridge wants MSI domain */
-- 
2.37.2



* [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load
  2022-12-12  7:06 [PATCH V4 0/9] CXL: Process event logs ira.weiny
  2022-12-12  7:06 ` [PATCH V4 1/9] PCI/CXL: Export native CXL error reporting control ira.weiny
@ 2022-12-12  7:06 ` ira.weiny
  2022-12-13  6:49   ` johnny
  2022-12-16 15:39   ` Jonathan Cameron
  2022-12-12  7:06 ` [PATCH V4 3/9] cxl/mem: Wire up event interrupts ira.weiny
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 36+ messages in thread
From: ira.weiny @ 2022-12-12  7:06 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL devices have multiple event logs which can be queried for CXL event
records.  Devices are required to support the storage of at least one
event record in each event log type.

Devices track event log overflow by incrementing a counter and tracking
the time of the first and last overflow event seen.

Software queries events via the Get Event Records mailbox command (CXL rev
3.0 section 8.2.9.2.2) and clears them via the Clear Event Records mailbox
command (CXL rev 3.0 section 8.2.9.2.3).

If the result of negotiating CXL Error Reporting Control is OS control,
read and clear all event logs on driver load.  This ensures the driver
starts from a clean slate of events.

The status register is not used because a device may continue to trigger
events and the only requirement is to empty the log at least once.  This
allows for the required transition from empty to non-empty for interrupt
generation.  Handling of interrupts is in a follow-on patch.

The device can return up to 1MB worth of event records per query.
Allocate a shared large buffer to handle the max number of records based
on the mailbox payload size.
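
As a rough sketch of that sizing math (illustrative only; the sizes mirror
the structures added below: a raw event record is 0x80 bytes, a 0x30-byte
common header plus 0x50 bytes of data, and the Get Event Records output
payload carries a 0x20-byte fixed header ahead of the record array):

```c
#include <stddef.h>

#define EVENT_RECORD_SIZE	0x80u	/* struct cxl_event_record_raw */
#define GET_PAYLOAD_HDR		0x20u	/* fixed part of the output payload */

/* How many records one Get Event Records response can carry. */
static size_t max_records(size_t payload_size)
{
	if (payload_size <= GET_PAYLOAD_HDR)
		return 0;
	return (payload_size - GET_PAYLOAD_HDR) / EVENT_RECORD_SIZE;
}
```

A device advertising the maximum 1MB payload could therefore return 8191
records per query, while a minimal 256 byte payload holds just one.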

This patch traces a raw event record and leaves specific event record
type tracing to subsequent patches.  Macros are created to aid in
tracing the common CXL Event header fields.

Each record is cleared explicitly.  A clear all bit is specified but is
only valid when the log overflows.
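
Because the clear payload's handle count is a u8 while Get Event Records can
report up to 0xffff records, clearing a full log takes multiple mailbox
commands.  A standalone sketch of the batching walk (assuming the mailbox
payload is large enough for the full 255 handles per command):

```c
/* Count the Clear Event Records commands needed for 'total_records',
 * batching up to 'max_handles' (at most 255, the u8 limit) per command. */
static unsigned int clear_commands_needed(unsigned int total_records,
					  unsigned int max_handles)
{
	unsigned int cmds = 0, batched = 0, i;

	for (i = 0; i < total_records; i++) {
		if (++batched == max_handles) {	/* payload full: send it */
			cmds++;
			batched = 0;
		}
	}
	if (batched)	/* clear what is left if any */
		cmds++;
	return cmds;
}
```

A completely full log (0xffff records) thus needs 257 clear commands at 255
handles apiece.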

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V3:
	Dan
		Split off _OSC pcie bits
			Use existing style for host bridge flag in that
			patch
		Clean up event processing loop
		Use dev_err_ratelimited()
		Clean up version change log
		Delete 'EVENT LOG OVERFLOW'
		Remove cxl_clear_event_logs()
		Add comment for native cxl control
		Fail driver load on event buf allocation failure
		Comment why events are not processed without _OSC flag
---
 drivers/cxl/core/mbox.c  | 136 +++++++++++++++++++++++++++++++++++++++
 drivers/cxl/core/trace.h | 120 ++++++++++++++++++++++++++++++++++
 drivers/cxl/cxl.h        |  12 ++++
 drivers/cxl/cxlmem.h     |  84 ++++++++++++++++++++++++
 drivers/cxl/pci.c        |  40 ++++++++++++
 5 files changed, 392 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index b03fba212799..9fb327370e08 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -8,6 +8,7 @@
 #include <cxl.h>
 
 #include "core.h"
+#include "trace.h"
 
 static bool cxl_raw_allow_all;
 
@@ -717,6 +718,140 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
+static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
+				  enum cxl_event_log_type log,
+				  struct cxl_get_event_payload *get_pl)
+{
+	struct cxl_mbox_clear_event_payload payload = {
+		.event_log = log,
+	};
+	u16 total = le16_to_cpu(get_pl->record_count);
+	u8 max_handles = CXL_CLEAR_EVENT_MAX_HANDLES;
+	size_t pl_size = sizeof(payload);
+	struct cxl_mbox_cmd mbox_cmd;
+	u16 cnt;
+	int rc;
+	int i;
+
+	/* Payload size may limit the max handles */
+	if (pl_size > cxlds->payload_size) {
+		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
+		pl_size = cxlds->payload_size;
+	}
+
+	mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
+		.payload_in = &payload,
+		.size_in = pl_size,
+	};
+
+	/*
+	 * Clear Event Records uses u8 for the handle cnt while Get Event
+	 * Record can return up to 0xffff records.
+	 */
+	i = 0;
+	for (cnt = 0; cnt < total; cnt++) {
+		payload.handle[i++] = get_pl->records[cnt].hdr.handle;
+		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
+			log, le16_to_cpu(payload.handle[i - 1]));
+
+		if (i == max_handles) {
+			payload.nr_recs = i;
+			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+			if (rc)
+				return rc;
+			i = 0;
+		}
+	}
+
+	/* Clear what is left if any */
+	if (i) {
+		payload.nr_recs = i;
+		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+		if (rc)
+			return rc;
+	}
+
+	return 0;
+}
+
+static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
+				    enum cxl_event_log_type type)
+{
+	struct cxl_get_event_payload *payload;
+	struct cxl_mbox_cmd mbox_cmd;
+	u8 log_type = type;
+	u16 nr_rec;
+
+	mutex_lock(&cxlds->event.log_lock);
+	payload = cxlds->event.buf;
+
+	mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_GET_EVENT_RECORD,
+		.payload_in = &log_type,
+		.size_in = sizeof(log_type),
+		.payload_out = payload,
+		.size_out = cxlds->payload_size,
+		.min_out = struct_size(payload, records, 0),
+	};
+
+	do {
+		int rc, i;
+
+		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+		if (rc) {
+			dev_err_ratelimited(cxlds->dev, "Event log '%d': Failed to query event records : %d",
+					    type, rc);
+			break;
+		}
+
+		nr_rec = le16_to_cpu(payload->record_count);
+		if (!nr_rec)
+			break;
+
+		for (i = 0; i < nr_rec; i++)
+			trace_cxl_generic_event(cxlds->dev, type,
+						&payload->records[i]);
+
+		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
+			trace_cxl_overflow(cxlds->dev, type, payload);
+
+		rc = cxl_clear_event_record(cxlds, type, payload);
+		if (rc) {
+			dev_err_ratelimited(cxlds->dev, "Event log '%d': Failed to clear events : %d",
+					    type, rc);
+			break;
+		}
+	} while (nr_rec);
+
+	mutex_unlock(&cxlds->event.log_lock);
+}
+
+/**
+ * cxl_mem_get_event_records - Get Event Records from the device
+ * @cxlds: The device data for the operation
+ *
+ * Retrieve all event records available on the device, report them as trace
+ * events, and clear them.
+ *
+ * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
+ * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
+ */
+void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status)
+{
+	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
+
+	if (status & CXLDEV_EVENT_STATUS_FATAL)
+		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
+	if (status & CXLDEV_EVENT_STATUS_FAIL)
+		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
+	if (status & CXLDEV_EVENT_STATUS_WARN)
+		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
+	if (status & CXLDEV_EVENT_STATUS_INFO)
+		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
+
 /**
  * cxl_mem_get_partition_info - Get partition info
  * @cxlds: The device data for the operation
@@ -868,6 +1003,7 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
 	}
 
 	mutex_init(&cxlds->mbox_mutex);
+	mutex_init(&cxlds->event.log_lock);
 	cxlds->dev = dev;
 
 	return cxlds;
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index 20ca2fe2ca8e..6898212fcb47 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -6,7 +6,9 @@
 #if !defined(_CXL_EVENTS_H) || defined(TRACE_HEADER_MULTI_READ)
 #define _CXL_EVENTS_H
 
+#include <asm-generic/unaligned.h>
 #include <cxl.h>
+#include <cxlmem.h>
 #include <linux/tracepoint.h>
 
 #define CXL_RAS_UC_CACHE_DATA_PARITY	BIT(0)
@@ -103,6 +105,124 @@ TRACE_EVENT(cxl_aer_correctable_error,
 	)
 );
 
+#include <linux/tracepoint.h>
+
+#define cxl_event_log_type_str(type)				\
+	__print_symbolic(type,					\
+		{ CXL_EVENT_TYPE_INFO, "Informational" },	\
+		{ CXL_EVENT_TYPE_WARN, "Warning" },		\
+		{ CXL_EVENT_TYPE_FAIL, "Failure" },		\
+		{ CXL_EVENT_TYPE_FATAL, "Fatal" })
+
+TRACE_EVENT(cxl_overflow,
+
+	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
+		 struct cxl_get_event_payload *payload),
+
+	TP_ARGS(dev, log, payload),
+
+	TP_STRUCT__entry(
+		__string(dev_name, dev_name(dev))
+		__field(int, log)
+		__field(u64, first_ts)
+		__field(u64, last_ts)
+		__field(u16, count)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev_name, dev_name(dev));
+		__entry->log = log;
+		__entry->count = le16_to_cpu(payload->overflow_err_count);
+		__entry->first_ts = le64_to_cpu(payload->first_overflow_timestamp);
+		__entry->last_ts = le64_to_cpu(payload->last_overflow_timestamp);
+	),
+
+	TP_printk("%s: log=%s : %u records from %llu to %llu",
+		__get_str(dev_name), cxl_event_log_type_str(__entry->log),
+		__entry->count, __entry->first_ts, __entry->last_ts)
+
+);
+
+/*
+ * Common Event Record Format
+ * CXL 3.0 section 8.2.9.2.1; Table 8-42
+ */
+#define CXL_EVENT_RECORD_FLAG_PERMANENT		BIT(2)
+#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED	BIT(3)
+#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED	BIT(4)
+#define CXL_EVENT_RECORD_FLAG_HW_REPLACE	BIT(5)
+#define show_hdr_flags(flags)	__print_flags(flags, " | ",			   \
+	{ CXL_EVENT_RECORD_FLAG_PERMANENT,	"PERMANENT_CONDITION"		}, \
+	{ CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,	"MAINTENANCE_NEEDED"		}, \
+	{ CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,	"PERFORMANCE_DEGRADED"		}, \
+	{ CXL_EVENT_RECORD_FLAG_HW_REPLACE,	"HARDWARE_REPLACEMENT_NEEDED"	}  \
+)
+
+/*
+ * Define macros for the common header of each CXL event.
+ *
+ * Tracepoints using these macros must do 3 things:
+ *
+ *	1) Add CXL_EVT_TP_entry to TP_STRUCT__entry
+ *	2) Use CXL_EVT_TP_fast_assign within TP_fast_assign;
+ *	   pass the dev, log, and CXL event header
+ *	3) Use CXL_EVT_TP_printk() instead of TP_printk()
+ *
+ * See the generic_event tracepoint as an example.
+ */
+#define CXL_EVT_TP_entry					\
+	__string(dev_name, dev_name(dev))			\
+	__field(int, log)					\
+	__field_struct(uuid_t, hdr_uuid)			\
+	__field(u32, hdr_flags)					\
+	__field(u16, hdr_handle)				\
+	__field(u16, hdr_related_handle)			\
+	__field(u64, hdr_timestamp)				\
+	__field(u8, hdr_length)					\
+	__field(u8, hdr_maint_op_class)
+
+#define CXL_EVT_TP_fast_assign(dev, l, hdr)					\
+	__assign_str(dev_name, dev_name(dev));					\
+	__entry->log = (l);							\
+	memcpy(&__entry->hdr_uuid, &(hdr).id, sizeof(uuid_t));			\
+	__entry->hdr_length = (hdr).length;					\
+	__entry->hdr_flags = get_unaligned_le24((hdr).flags);			\
+	__entry->hdr_handle = le16_to_cpu((hdr).handle);			\
+	__entry->hdr_related_handle = le16_to_cpu((hdr).related_handle);	\
+	__entry->hdr_timestamp = le64_to_cpu((hdr).timestamp);			\
+	__entry->hdr_maint_op_class = (hdr).maint_op_class
+
+#define CXL_EVT_TP_printk(fmt, ...) \
+	TP_printk("%s log=%s : time=%llu uuid=%pUb len=%d flags='%s' "		\
+		"handle=%x related_handle=%x maint_op_class=%u"			\
+		" : " fmt,							\
+		__get_str(dev_name), cxl_event_log_type_str(__entry->log),	\
+		__entry->hdr_timestamp, &__entry->hdr_uuid, __entry->hdr_length,\
+		show_hdr_flags(__entry->hdr_flags), __entry->hdr_handle,	\
+		__entry->hdr_related_handle, __entry->hdr_maint_op_class,	\
+		##__VA_ARGS__)
+
+TRACE_EVENT(cxl_generic_event,
+
+	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
+		 struct cxl_event_record_raw *rec),
+
+	TP_ARGS(dev, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+		__array(u8, data, CXL_EVENT_RECORD_DATA_LENGTH)
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
+		memcpy(__entry->data, &rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
+	),
+
+	CXL_EVT_TP_printk("%s",
+		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
+);
+
 #endif /* _CXL_EVENTS_H */
 
 #define TRACE_INCLUDE_FILE trace
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index aa3af3bb73b2..5974d1082210 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -156,6 +156,18 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
 #define CXLDEV_CAP_CAP_ID_SECONDARY_MAILBOX 0x3
 #define CXLDEV_CAP_CAP_ID_MEMDEV 0x4000
 
+/* CXL 3.0 8.2.8.3.1 Event Status Register */
+#define CXLDEV_DEV_EVENT_STATUS_OFFSET		0x00
+#define CXLDEV_EVENT_STATUS_INFO		BIT(0)
+#define CXLDEV_EVENT_STATUS_WARN		BIT(1)
+#define CXLDEV_EVENT_STATUS_FAIL		BIT(2)
+#define CXLDEV_EVENT_STATUS_FATAL		BIT(3)
+
+#define CXLDEV_EVENT_STATUS_ALL (CXLDEV_EVENT_STATUS_INFO |	\
+				 CXLDEV_EVENT_STATUS_WARN |	\
+				 CXLDEV_EVENT_STATUS_FAIL |	\
+				 CXLDEV_EVENT_STATUS_FATAL)
+
 /* CXL 2.0 8.2.8.4 Mailbox Registers */
 #define CXLDEV_MBOX_CAPS_OFFSET 0x00
 #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index ab138004f644..dd9aa3dd738e 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -4,6 +4,7 @@
 #define __CXL_MEM_H__
 #include <uapi/linux/cxl_mem.h>
 #include <linux/cdev.h>
+#include <linux/uuid.h>
 #include "cxl.h"
 
 /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
@@ -193,6 +194,17 @@ struct cxl_endpoint_dvsec_info {
 	struct range dvsec_range[2];
 };
 
+/**
+ * struct cxl_event_state - Event log driver state
+ *
+ * @buf: Buffer to receive event data
+ * @log_lock: Serialize buf and log use
+ */
+struct cxl_event_state {
+	struct cxl_get_event_payload *buf;
+	struct mutex log_lock;
+};
+
 /**
  * struct cxl_dev_state - The driver device state
  *
@@ -266,12 +278,16 @@ struct cxl_dev_state {
 
 	struct xarray doe_mbs;
 
+	struct cxl_event_state event;
+
 	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
 };
 
 enum cxl_opcode {
 	CXL_MBOX_OP_INVALID		= 0x0000,
 	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
+	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
+	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
 	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
 	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
 	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
@@ -347,6 +363,73 @@ struct cxl_mbox_identify {
 	u8 qos_telemetry_caps;
 } __packed;
 
+/*
+ * Common Event Record Format
+ * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
+ */
+struct cxl_event_record_hdr {
+	uuid_t id;
+	u8 length;
+	u8 flags[3];
+	__le16 handle;
+	__le16 related_handle;
+	__le64 timestamp;
+	u8 maint_op_class;
+	u8 reserved[15];
+} __packed;
+
+#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
+struct cxl_event_record_raw {
+	struct cxl_event_record_hdr hdr;
+	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
+} __packed;
+
+/*
+ * Get Event Records output payload
+ * CXL rev 3.0 section 8.2.9.2.2; Table 8-50
+ */
+#define CXL_GET_EVENT_FLAG_OVERFLOW		BIT(0)
+#define CXL_GET_EVENT_FLAG_MORE_RECORDS		BIT(1)
+struct cxl_get_event_payload {
+	u8 flags;
+	u8 reserved1;
+	__le16 overflow_err_count;
+	__le64 first_overflow_timestamp;
+	__le64 last_overflow_timestamp;
+	__le16 record_count;
+	u8 reserved2[10];
+	struct cxl_event_record_raw records[];
+} __packed;
+
+/*
+ * CXL rev 3.0 section 8.2.9.2.2; Table 8-49
+ */
+enum cxl_event_log_type {
+	CXL_EVENT_TYPE_INFO = 0x00,
+	CXL_EVENT_TYPE_WARN,
+	CXL_EVENT_TYPE_FAIL,
+	CXL_EVENT_TYPE_FATAL,
+	CXL_EVENT_TYPE_MAX
+};
+
+/*
+ * Clear Event Records input payload
+ * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
+ */
+#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
+struct cxl_mbox_clear_event_payload {
+	u8 event_log;		/* enum cxl_event_log_type */
+	u8 clear_flags;
+	u8 nr_recs;
+	u8 reserved[3];
+	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];
+} __packed;
+#define CXL_CLEAR_EVENT_LIMIT_HANDLES(payload_size)			\
+	(((payload_size) -						\
+		(sizeof(struct cxl_mbox_clear_event_payload) -		\
+		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
+		sizeof(__le16))
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
@@ -441,6 +524,7 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
 struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
 void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
 void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
+void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status);
 #ifdef CONFIG_CXL_SUSPEND
 void cxl_mem_active_inc(void);
 void cxl_mem_active_dec(void);
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 3a66aadb4df0..a2d8382bc593 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -417,8 +417,37 @@ static void disable_aer(void *pdev)
 	pci_disable_pcie_error_reporting(pdev);
 }
 
+static void cxl_mem_free_event_buffer(void *buf)
+{
+	kvfree(buf);
+}
+
+/*
+ * There is a single buffer for reading event logs from the mailbox.  All logs
+ * share this buffer protected by the cxlds->event.log_lock.
+ */
+static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
+{
+	struct cxl_get_event_payload *buf;
+
+	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
+		cxlds->payload_size);
+
+	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	if (devm_add_action_or_reset(cxlds->dev, cxl_mem_free_event_buffer,
+				     buf))
+		return -ENOMEM;
+
+	cxlds->event.buf = buf;
+	return 0;
+}
+
 static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
+	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
 	struct cxl_register_map map;
 	struct cxl_memdev *cxlmd;
 	struct cxl_dev_state *cxlds;
@@ -494,6 +523,17 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
+	rc = cxl_mem_alloc_event_buf(cxlds);
+	if (rc)
+		return rc;
+
+	/*
+	 * When BIOS maintains CXL error reporting control, it will process
+	 * event records.  Only one agent can do so.
+	 */
+	if (host_bridge->native_cxl_error)
+		cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
+
 	if (cxlds->regs.ras) {
 		pci_enable_pcie_error_reporting(pdev);
 		rc = devm_add_action_or_reset(&pdev->dev, disable_aer, pdev);
-- 
2.37.2



* [PATCH V4 3/9] cxl/mem: Wire up event interrupts
  2022-12-12  7:06 [PATCH V4 0/9] CXL: Process event logs ira.weiny
  2022-12-12  7:06 ` [PATCH V4 1/9] PCI/CXL: Export native CXL error reporting control ira.weiny
  2022-12-12  7:06 ` [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load ira.weiny
@ 2022-12-12  7:06 ` ira.weiny
  2022-12-13 20:15   ` Dan Williams
                     ` (2 more replies)
  2022-12-12  7:06 ` [PATCH V4 4/9] cxl/mem: Trace General Media Event Record ira.weiny
                   ` (6 subsequent siblings)
  9 siblings, 3 replies; 36+ messages in thread
From: ira.weiny @ 2022-12-12  7:06 UTC (permalink / raw)
  To: Dan Williams
  Cc: Davidlohr Bueso, Bjorn Helgaas, Jonathan Cameron, Ira Weiny,
	Bjorn Helgaas, Alison Schofield, Vishal Verma, Dave Jiang,
	linux-kernel, linux-pci, linux-acpi, linux-cxl

From: Davidlohr Bueso <dave@stgolabs.net>

Currently the only CXL features targeted for irq support require their
message numbers to be within the first 16 entries.  The device may,
however, support fewer than 16 entries depending on the support it
provides.

Attempt to allocate these 16 irq vectors.  If the device supports fewer,
the PCI infrastructure will allocate that number.  Upon successful
allocation, users can plug in their respective ISRs at any point
thereafter.
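
The allocation bound can be sketched as follows (illustrative model only,
not the pci_alloc_irq_vectors() API itself, which takes min/max vector
counts and a flags argument):

```c
/* Model of the vector allocation bound: request up to 'requested_max'
 * vectors but accept however many (>= 1) the device supports. */
static int vectors_granted(int device_supported, int requested_max)
{
	if (device_supported < 1)
		return -1;	/* allocation fails below the minimum of 1 */
	return device_supported < requested_max ?
		device_supported : requested_max;
}
```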

CXL device events are signaled via interrupts.  Each event log may have
a different interrupt message number.  These message numbers are
reported in the Get Event Interrupt Policy mailbox command.

Add interrupt support for event logs.  Interrupts are allocated as
shared interrupts.  Therefore, all or some event logs can share the same
message number.

In addition, all logs are queried on any interrupt, in order from most to
least severe, based on the status register.
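
That severity ordering can be sketched as follows (illustrative; the bit
positions match the Event Status Register definitions added earlier in the
series):

```c
#include <stdint.h>

#define EVT_STATUS_INFO		(1u << 0)
#define EVT_STATUS_WARN		(1u << 1)
#define EVT_STATUS_FAIL		(1u << 2)
#define EVT_STATUS_FATAL	(1u << 3)

/* Record, most severe first, which logs a given status value selects.
 * Returns the number of logs that would be queried. */
static int logs_to_query(uint32_t status, uint32_t order[4])
{
	static const uint32_t severity[4] = {
		EVT_STATUS_FATAL, EVT_STATUS_FAIL,
		EVT_STATUS_WARN, EVT_STATUS_INFO,
	};
	int n = 0;
	unsigned int i;

	for (i = 0; i < 4; i++)
		if (status & severity[i])
			order[n++] = severity[i];
	return n;
}
```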

Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-developed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>

---
Changes from V3:
	Adjust based on changes in patch 1
	Consolidate event setup into cxl_event_config()
	Consistently use cxl_event_* for function names
	Remove cxl_event_int_is_msi()
	Ensure DCD log is ignored in status
	Simplify event status loop logic
	Dan
		Fail driver load if the irq's are not allocated
		move cxl_event_config_msgnums() to pci.c
		s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_MAX_VECTORS
		s/devm_kmalloc/devm_kzalloc
		Fix up pci_alloc_irq_vectors() comment
		Pass pdev to cxl_alloc_irq_vectors()
		Check FW irq policy prior to configuration
	Jonathan
		Use FIELD_GET instead of manual masking
---
 drivers/cxl/cxl.h    |   4 +
 drivers/cxl/cxlmem.h |  19 ++++
 drivers/cxl/cxlpci.h |   6 ++
 drivers/cxl/pci.c    | 208 +++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 231 insertions(+), 6 deletions(-)

diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 5974d1082210..b3964149c77b 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -168,6 +168,10 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
 				 CXLDEV_EVENT_STATUS_FAIL |	\
 				 CXLDEV_EVENT_STATUS_FATAL)
 
+/* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
+#define CXLDEV_EVENT_INT_MODE_MASK	GENMASK(1, 0)
+#define CXLDEV_EVENT_INT_MSGNUM_MASK	GENMASK(7, 4)
+
 /* CXL 2.0 8.2.8.4 Mailbox Registers */
 #define CXLDEV_MBOX_CAPS_OFFSET 0x00
 #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index dd9aa3dd738e..bd8bfbe61ec8 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -194,6 +194,23 @@ struct cxl_endpoint_dvsec_info {
 	struct range dvsec_range[2];
 };
 
+/**
+ * Event Interrupt Policy
+ *
+ * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
+ */
+enum cxl_event_int_mode {
+	CXL_INT_NONE		= 0x00,
+	CXL_INT_MSI_MSIX	= 0x01,
+	CXL_INT_FW		= 0x02
+};
+struct cxl_event_interrupt_policy {
+	u8 info_settings;
+	u8 warn_settings;
+	u8 failure_settings;
+	u8 fatal_settings;
+} __packed;
+
 /**
  * struct cxl_event_state - Event log driver state
  *
@@ -288,6 +305,8 @@ enum cxl_opcode {
 	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
 	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
 	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
+	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
+	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
 	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
 	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
 	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index 77dbdb980b12..a8ea04f536ab 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -53,6 +53,12 @@
 #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
 #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
 
+/*
+ * NOTE: Currently all the functions which are enabled for CXL require their
+ * vectors to be in the first 16.  Use this as the default max.
+ */
+#define CXL_PCI_DEFAULT_MAX_VECTORS 16
+
 /* Register Block Identifier (RBI) */
 enum cxl_regloc_type {
 	CXL_REGLOC_RBI_EMPTY = 0,
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index a2d8382bc593..d42d87faddb8 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -445,6 +445,201 @@ static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
 	return 0;
 }
 
+static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
+{
+	int nvecs;
+
+	/*
+	 * CXL requires MSI/MSIX support.
+	 *
+	 * Additionally pci_alloc_irq_vectors() handles calling
+	 * pci_free_irq_vectors() automatically despite not being a
+	 * pcim_* helper.  See pci_setup_msi_context().
+	 */
+	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_DEFAULT_MAX_VECTORS,
+				      PCI_IRQ_MSIX | PCI_IRQ_MSI);
+	if (nvecs < 1) {
+		dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
+		return -ENXIO;
+	}
+	return 0;
+}
+
+struct cxl_dev_id {
+	struct cxl_dev_state *cxlds;
+};
+
+static irqreturn_t cxl_event_thread(int irq, void *id)
+{
+	struct cxl_dev_id *dev_id = id;
+	struct cxl_dev_state *cxlds = dev_id->cxlds;
+	u32 status;
+
+	do {
+		/*
+		 * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
+		 * ignore the reserved upper 32 bits
+		 */
+		status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
+		/* Ignore logs unknown to the driver */
+		status &= CXLDEV_EVENT_STATUS_ALL;
+		if (!status)
+			break;
+		cxl_mem_get_event_records(cxlds, status);
+		cond_resched();
+	} while (status);
+
+	return IRQ_HANDLED;
+}
+
+static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
+{
+	struct device *dev = cxlds->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	struct cxl_dev_id *dev_id;
+	int irq;
+
+	if (FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting) != CXL_INT_MSI_MSIX)
+		return -ENXIO;
+
+	/* dev_id must be globally unique and must contain the cxlds */
+	dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
+	if (!dev_id)
+		return -ENOMEM;
+	dev_id->cxlds = cxlds;
+
+	irq =  pci_irq_vector(pdev,
+			      FIELD_GET(CXLDEV_EVENT_INT_MSGNUM_MASK, setting));
+	if (irq < 0)
+		return irq;
+
+	return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
+					 IRQF_SHARED, NULL, dev_id);
+}
+
+static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
+				    struct cxl_event_interrupt_policy *policy)
+{
+	struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
+		.payload_out = policy,
+		.size_out = sizeof(*policy),
+	};
+	int rc;
+
+	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	if (rc < 0)
+		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
+			rc);
+
+	return rc;
+}
+
+static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
+				    struct cxl_event_interrupt_policy *policy)
+{
+	struct cxl_mbox_cmd mbox_cmd;
+	int rc;
+
+	policy->info_settings = CXL_INT_MSI_MSIX;
+	policy->warn_settings = CXL_INT_MSI_MSIX;
+	policy->failure_settings = CXL_INT_MSI_MSIX;
+	policy->fatal_settings = CXL_INT_MSI_MSIX;
+
+	mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
+		.payload_in = policy,
+		.size_in = sizeof(*policy),
+	};
+
+	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	if (rc < 0) {
+		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
+			rc);
+		return rc;
+	}
+
+	/* Retrieve final interrupt settings */
+	return cxl_event_get_int_policy(cxlds, policy);
+}
+
+static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
+{
+	struct cxl_event_interrupt_policy policy;
+	int rc;
+
+	rc = cxl_event_config_msgnums(cxlds, &policy);
+	if (rc)
+		return rc;
+
+	rc = cxl_event_req_irq(cxlds, policy.info_settings);
+	if (rc) {
+		dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
+		return rc;
+	}
+
+	rc = cxl_event_req_irq(cxlds, policy.warn_settings);
+	if (rc) {
+		dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
+		return rc;
+	}
+
+	rc = cxl_event_req_irq(cxlds, policy.failure_settings);
+	if (rc) {
+		dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
+		return rc;
+	}
+
+	rc = cxl_event_req_irq(cxlds, policy.fatal_settings);
+	if (rc) {
+		dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
+		return rc;
+	}
+
+	return 0;
+}
+
+static bool cxl_event_int_is_fw(u8 setting)
+{
+	u8 mode = FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting);
+
+	return mode == CXL_INT_FW;
+}
+
+static int cxl_event_config(struct pci_host_bridge *host_bridge,
+			    struct cxl_dev_state *cxlds)
+{
+	struct cxl_event_interrupt_policy policy;
+	int rc;
+
+	/*
+	 * When BIOS maintains CXL error reporting control, it will process
+	 * event records.  Only one agent can do so.
+	 */
+	if (!host_bridge->native_cxl_error)
+		return 0;
+
+	rc = cxl_event_get_int_policy(cxlds, &policy);
+	if (rc)
+		return rc;
+
+	if (cxl_event_int_is_fw(policy.info_settings) ||
+	    cxl_event_int_is_fw(policy.warn_settings) ||
+	    cxl_event_int_is_fw(policy.failure_settings) ||
+	    cxl_event_int_is_fw(policy.fatal_settings)) {
+		dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
+		return -EBUSY;
+	}
+
+	rc = cxl_event_irqsetup(cxlds);
+	if (rc)
+		return rc;
+
+	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
+
+	return 0;
+}
+
 static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
@@ -519,6 +714,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
+	rc = cxl_alloc_irq_vectors(pdev);
+	if (rc)
+		return rc;
+
 	cxlmd = devm_cxl_add_memdev(cxlds);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
@@ -527,12 +726,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
-	/*
-	 * When BIOS maintains CXL error reporting control, it will process
-	 * event records.  Only one agent can do so.
-	 */
-	if (host_bridge->native_cxl_error)
-		cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
+	rc = cxl_event_config(host_bridge, cxlds);
+	if (rc)
+		return rc;
 
 	if (cxlds->regs.ras) {
 		pci_enable_pcie_error_reporting(pdev);
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH V4 4/9] cxl/mem: Trace General Media Event Record
  2022-12-12  7:06 [PATCH V4 0/9] CXL: Process event logs ira.weiny
                   ` (2 preceding siblings ...)
  2022-12-12  7:06 ` [PATCH V4 3/9] cxl/mem: Wire up event interrupts ira.weiny
@ 2022-12-12  7:06 ` ira.weiny
  2022-12-12  7:06 ` [PATCH V4 5/9] cxl/mem: Trace DRAM " ira.weiny
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 36+ messages in thread
From: ira.weiny @ 2022-12-12  7:06 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record.

Determine if the event read is a general media record and, if so, trace
the record as a General Media Event Record.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V3:
	Adjust based on changes in patch 1
	s/cxl_trace_event_record/cxl_event_trace_record
	Dan
		Use decimal for reserved field length
		Use 'else' instead of return in cxl_event_trace_record()
---
 drivers/cxl/core/mbox.c  |  29 ++++++++-
 drivers/cxl/core/trace.h | 124 +++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h     |  19 ++++++
 3 files changed, 170 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 9fb327370e08..ffa311f94baa 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -718,6 +718,31 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
+/*
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+static const uuid_t gen_media_event_uuid =
+	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
+		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
+
+static void cxl_event_trace_record(const struct device *dev,
+				   enum cxl_event_log_type type,
+				   struct cxl_event_record_raw *record)
+{
+	uuid_t *id = &record->hdr.id;
+
+	if (uuid_equal(id, &gen_media_event_uuid)) {
+		struct cxl_event_gen_media *rec =
+				(struct cxl_event_gen_media *)record;
+
+		trace_cxl_general_media(dev, type, rec);
+	} else {
+		/* For unknown record types print just the header */
+		trace_cxl_generic_event(dev, type, record);
+	}
+}
+
 static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
 				  enum cxl_event_log_type log,
 				  struct cxl_get_event_payload *get_pl)
@@ -810,8 +835,8 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
 			break;
 
 		for (i = 0; i < nr_rec; i++)
-			trace_cxl_generic_event(cxlds->dev, type,
-						&payload->records[i]);
+			cxl_event_trace_record(cxlds->dev, type,
+					       &payload->records[i]);
 
 		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
 			trace_cxl_overflow(cxlds->dev, type, payload);
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index 6898212fcb47..d85f0481661d 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -223,6 +223,130 @@ TRACE_EVENT(cxl_generic_event,
 		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
 );
 
+/*
+ * Physical Address field masks
+ *
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ *
+ * DRAM Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+#define CXL_DPA_FLAGS_MASK			0x3F
+#define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
+
+#define CXL_DPA_VOLATILE			BIT(0)
+#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
+#define show_dpa_flags(flags)	__print_flags(flags, "|",		   \
+	{ CXL_DPA_VOLATILE,			"VOLATILE"		}, \
+	{ CXL_DPA_NOT_REPAIRABLE,		"NOT_REPAIRABLE"	}  \
+)
+
+/*
+ * General Media Event Record - GMER
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+#define CXL_GMER_EVT_DESC_UNCORRECTABLE_EVENT		BIT(0)
+#define CXL_GMER_EVT_DESC_THRESHOLD_EVENT		BIT(1)
+#define CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW		BIT(2)
+#define show_event_desc_flags(flags)	__print_flags(flags, "|",		   \
+	{ CXL_GMER_EVT_DESC_UNCORRECTABLE_EVENT,	"UNCORRECTABLE_EVENT"	}, \
+	{ CXL_GMER_EVT_DESC_THRESHOLD_EVENT,		"THRESHOLD_EVENT"	}, \
+	{ CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW,	"POISON_LIST_OVERFLOW"	}  \
+)
+
+#define CXL_GMER_MEM_EVT_TYPE_ECC_ERROR			0x00
+#define CXL_GMER_MEM_EVT_TYPE_INV_ADDR			0x01
+#define CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR		0x02
+#define show_mem_event_type(type)	__print_symbolic(type,			\
+	{ CXL_GMER_MEM_EVT_TYPE_ECC_ERROR,		"ECC Error" },		\
+	{ CXL_GMER_MEM_EVT_TYPE_INV_ADDR,		"Invalid Address" },	\
+	{ CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,	"Data Path Error" }	\
+)
+
+#define CXL_GMER_TRANS_UNKNOWN				0x00
+#define CXL_GMER_TRANS_HOST_READ			0x01
+#define CXL_GMER_TRANS_HOST_WRITE			0x02
+#define CXL_GMER_TRANS_HOST_SCAN_MEDIA			0x03
+#define CXL_GMER_TRANS_HOST_INJECT_POISON		0x04
+#define CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB		0x05
+#define CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT	0x06
+#define show_trans_type(type)	__print_symbolic(type,					\
+	{ CXL_GMER_TRANS_UNKNOWN,			"Unknown" },			\
+	{ CXL_GMER_TRANS_HOST_READ,			"Host Read" },			\
+	{ CXL_GMER_TRANS_HOST_WRITE,			"Host Write" },			\
+	{ CXL_GMER_TRANS_HOST_SCAN_MEDIA,		"Host Scan Media" },		\
+	{ CXL_GMER_TRANS_HOST_INJECT_POISON,		"Host Inject Poison" },		\
+	{ CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,		"Internal Media Scrub" },	\
+	{ CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT,	"Internal Media Management" }	\
+)
+
+#define CXL_GMER_VALID_CHANNEL				BIT(0)
+#define CXL_GMER_VALID_RANK				BIT(1)
+#define CXL_GMER_VALID_DEVICE				BIT(2)
+#define CXL_GMER_VALID_COMPONENT			BIT(3)
+#define show_valid_flags(flags)	__print_flags(flags, "|",		   \
+	{ CXL_GMER_VALID_CHANNEL,			"CHANNEL"	}, \
+	{ CXL_GMER_VALID_RANK,				"RANK"		}, \
+	{ CXL_GMER_VALID_DEVICE,			"DEVICE"	}, \
+	{ CXL_GMER_VALID_COMPONENT,			"COMPONENT"	}  \
+)
+
+TRACE_EVENT(cxl_general_media,
+
+	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
+		 struct cxl_event_gen_media *rec),
+
+	TP_ARGS(dev, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+		/* General Media */
+		__field(u64, dpa)
+		__field(u8, descriptor)
+		__field(u8, type)
+		__field(u8, transaction_type)
+		__field(u8, channel)
+		__field(u32, device)
+		__array(u8, comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE)
+		__field(u16, validity_flags)
+		/* Following are out of order to pack trace record */
+		__field(u8, rank)
+		__field(u8, dpa_flags)
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
+
+		/* General Media */
+		__entry->dpa = le64_to_cpu(rec->phys_addr);
+		__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
+		/* Mask after flags have been parsed */
+		__entry->dpa &= CXL_DPA_MASK;
+		__entry->descriptor = rec->descriptor;
+		__entry->type = rec->type;
+		__entry->transaction_type = rec->transaction_type;
+		__entry->channel = rec->channel;
+		__entry->rank = rec->rank;
+		__entry->device = get_unaligned_le24(rec->device);
+		memcpy(__entry->comp_id, &rec->component_id,
+			CXL_EVENT_GEN_MED_COMP_ID_SIZE);
+		__entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
+	),
+
+	CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' " \
+		"descriptor='%s' type='%s' transaction_type='%s' channel=%u rank=%u " \
+		"device=%x comp_id=%s validity_flags='%s'",
+		__entry->dpa, show_dpa_flags(__entry->dpa_flags),
+		show_event_desc_flags(__entry->descriptor),
+		show_mem_event_type(__entry->type),
+		show_trans_type(__entry->transaction_type),
+		__entry->channel, __entry->rank, __entry->device,
+		__print_hex(__entry->comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE),
+		show_valid_flags(__entry->validity_flags)
+	)
+);
+
 #endif /* _CXL_EVENTS_H */
 
 #define TRACE_INCLUDE_FILE trace
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index bd8bfbe61ec8..9cfd20abc3a1 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -449,6 +449,25 @@ struct cxl_mbox_clear_event_payload {
 		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
 		sizeof(__le16))
 
+/*
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+#define CXL_EVENT_GEN_MED_COMP_ID_SIZE	0x10
+struct cxl_event_gen_media {
+	struct cxl_event_record_hdr hdr;
+	__le64 phys_addr;
+	u8 descriptor;
+	u8 type;
+	u8 transaction_type;
+	u8 validity_flags[2];
+	u8 channel;
+	u8 rank;
+	u8 device[3];
+	u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
+	u8 reserved[46];
+} __packed;
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH V4 5/9] cxl/mem: Trace DRAM Event Record
  2022-12-12  7:06 [PATCH V4 0/9] CXL: Process event logs ira.weiny
                   ` (3 preceding siblings ...)
  2022-12-12  7:06 ` [PATCH V4 4/9] cxl/mem: Trace General Media Event Record ira.weiny
@ 2022-12-12  7:06 ` ira.weiny
  2022-12-12  7:06 ` [PATCH V4 6/9] cxl/mem: Trace Memory Module " ira.weiny
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 36+ messages in thread
From: ira.weiny @ 2022-12-12  7:06 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.

Determine if the event read is a DRAM event record and, if so, trace the
record.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from v3:
	Dan
		Use if/else if/else instead of returns
---
 drivers/cxl/core/mbox.c  | 12 ++++++
 drivers/cxl/core/trace.h | 92 ++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h     | 23 ++++++++++
 3 files changed, 127 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index ffa311f94baa..be66cb5816e5 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -726,6 +726,14 @@ static const uuid_t gen_media_event_uuid =
 	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
 		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
 
+/*
+ * DRAM Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+static const uuid_t dram_event_uuid =
+	UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
+		  0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
+
 static void cxl_event_trace_record(const struct device *dev,
 				   enum cxl_event_log_type type,
 				   struct cxl_event_record_raw *record)
@@ -737,6 +745,10 @@ static void cxl_event_trace_record(const struct device *dev,
 				(struct cxl_event_gen_media *)record;
 
 		trace_cxl_general_media(dev, type, rec);
+	} else if (uuid_equal(id, &dram_event_uuid)) {
+		struct cxl_event_dram *rec = (struct cxl_event_dram *)record;
+
+		trace_cxl_dram(dev, type, rec);
 	} else {
 		/* For unknown record types print just the header */
 		trace_cxl_generic_event(dev, type, record);
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index d85f0481661d..b6321cfb1d9f 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -347,6 +347,98 @@ TRACE_EVENT(cxl_general_media,
 	)
 );
 
+/*
+ * DRAM Event Record - DER
+ *
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+/*
+ * DRAM Event Record defines many fields the same as the General Media Event
+ * Record.  Reuse those definitions as appropriate.
+ */
+#define CXL_DER_VALID_CHANNEL				BIT(0)
+#define CXL_DER_VALID_RANK				BIT(1)
+#define CXL_DER_VALID_NIBBLE				BIT(2)
+#define CXL_DER_VALID_BANK_GROUP			BIT(3)
+#define CXL_DER_VALID_BANK				BIT(4)
+#define CXL_DER_VALID_ROW				BIT(5)
+#define CXL_DER_VALID_COLUMN				BIT(6)
+#define CXL_DER_VALID_CORRECTION_MASK			BIT(7)
+#define show_dram_valid_flags(flags)	__print_flags(flags, "|",			   \
+	{ CXL_DER_VALID_CHANNEL,			"CHANNEL"		}, \
+	{ CXL_DER_VALID_RANK,				"RANK"			}, \
+	{ CXL_DER_VALID_NIBBLE,				"NIBBLE"		}, \
+	{ CXL_DER_VALID_BANK_GROUP,			"BANK GROUP"		}, \
+	{ CXL_DER_VALID_BANK,				"BANK"			}, \
+	{ CXL_DER_VALID_ROW,				"ROW"			}, \
+	{ CXL_DER_VALID_COLUMN,				"COLUMN"		}, \
+	{ CXL_DER_VALID_CORRECTION_MASK,		"CORRECTION MASK"	}  \
+)
+
+TRACE_EVENT(cxl_dram,
+
+	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
+		 struct cxl_event_dram *rec),
+
+	TP_ARGS(dev, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+		/* DRAM */
+		__field(u64, dpa)
+		__field(u8, descriptor)
+		__field(u8, type)
+		__field(u8, transaction_type)
+		__field(u8, channel)
+		__field(u16, validity_flags)
+		__field(u16, column)	/* Out of order to pack trace record */
+		__field(u32, nibble_mask)
+		__field(u32, row)
+		__array(u8, cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE)
+		__field(u8, rank)	/* Out of order to pack trace record */
+		__field(u8, bank_group)	/* Out of order to pack trace record */
+		__field(u8, bank)	/* Out of order to pack trace record */
+		__field(u8, dpa_flags)	/* Out of order to pack trace record */
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
+
+		/* DRAM */
+		__entry->dpa = le64_to_cpu(rec->phys_addr);
+		__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
+		__entry->dpa &= CXL_DPA_MASK;
+		__entry->descriptor = rec->descriptor;
+		__entry->type = rec->type;
+		__entry->transaction_type = rec->transaction_type;
+		__entry->validity_flags = get_unaligned_le16(rec->validity_flags);
+		__entry->channel = rec->channel;
+		__entry->rank = rec->rank;
+		__entry->nibble_mask = get_unaligned_le24(rec->nibble_mask);
+		__entry->bank_group = rec->bank_group;
+		__entry->bank = rec->bank;
+		__entry->row = get_unaligned_le24(rec->row);
+		__entry->column = get_unaligned_le16(rec->column);
+		memcpy(__entry->cor_mask, &rec->correction_mask,
+			CXL_EVENT_DER_CORRECTION_MASK_SIZE);
+	),
+
+	CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' descriptor='%s' type='%s' " \
+		"transaction_type='%s' channel=%u rank=%u nibble_mask=%x " \
+		"bank_group=%u bank=%u row=%u column=%u cor_mask=%s " \
+		"validity_flags='%s'",
+		__entry->dpa, show_dpa_flags(__entry->dpa_flags),
+		show_event_desc_flags(__entry->descriptor),
+		show_mem_event_type(__entry->type),
+		show_trans_type(__entry->transaction_type),
+		__entry->channel, __entry->rank, __entry->nibble_mask,
+		__entry->bank_group, __entry->bank,
+		__entry->row, __entry->column,
+		__print_hex(__entry->cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE),
+		show_dram_valid_flags(__entry->validity_flags)
+	)
+);
+
 #endif /* _CXL_EVENTS_H */
 
 #define TRACE_INCLUDE_FILE trace
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 9cfd20abc3a1..78c453278910 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -468,6 +468,29 @@ struct cxl_event_gen_media {
 	u8 reserved[46];
 } __packed;
 
+/*
+ * DRAM Event Record - DER
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+#define CXL_EVENT_DER_CORRECTION_MASK_SIZE	0x20
+struct cxl_event_dram {
+	struct cxl_event_record_hdr hdr;
+	__le64 phys_addr;
+	u8 descriptor;
+	u8 type;
+	u8 transaction_type;
+	u8 validity_flags[2];
+	u8 channel;
+	u8 rank;
+	u8 nibble_mask[3];
+	u8 bank_group;
+	u8 bank;
+	u8 row[3];
+	u8 column[2];
+	u8 correction_mask[CXL_EVENT_DER_CORRECTION_MASK_SIZE];
+	u8 reserved[0x17];
+} __packed;
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH V4 6/9] cxl/mem: Trace Memory Module Event Record
  2022-12-12  7:06 [PATCH V4 0/9] CXL: Process event logs ira.weiny
                   ` (4 preceding siblings ...)
  2022-12-12  7:06 ` [PATCH V4 5/9] cxl/mem: Trace DRAM " ira.weiny
@ 2022-12-12  7:06 ` ira.weiny
  2022-12-12  7:06 ` [PATCH V4 7/9] cxl/test: Add generic mock events ira.weiny
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 36+ messages in thread
From: ira.weiny @ 2022-12-12  7:06 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.

Determine if the event read is a memory module record and, if so, trace
the record.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V3:
	Dan
		Use if/else if/else instead of return
---
 drivers/cxl/core/mbox.c  |  13 ++++
 drivers/cxl/core/trace.h | 143 +++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h     |  26 +++++++
 3 files changed, 182 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index be66cb5816e5..b011804ed994 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -734,6 +734,14 @@ static const uuid_t dram_event_uuid =
 	UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
 		  0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
 
+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+static const uuid_t mem_mod_event_uuid =
+	UUID_INIT(0xfe927475, 0xdd59, 0x4339,
+		  0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74);
+
 static void cxl_event_trace_record(const struct device *dev,
 				   enum cxl_event_log_type type,
 				   struct cxl_event_record_raw *record)
@@ -749,6 +757,11 @@ static void cxl_event_trace_record(const struct device *dev,
 		struct cxl_event_dram *rec = (struct cxl_event_dram *)record;
 
 		trace_cxl_dram(dev, type, rec);
+	} else if (uuid_equal(id, &mem_mod_event_uuid)) {
+		struct cxl_event_mem_module *rec =
+				(struct cxl_event_mem_module *)record;
+
+		trace_cxl_memory_module(dev, type, rec);
 	} else {
 		/* For unknown record types print just the header */
 		trace_cxl_generic_event(dev, type, record);
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index b6321cfb1d9f..ebb4c8ce8587 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -439,6 +439,149 @@ TRACE_EVENT(cxl_dram,
 	)
 );
 
+/*
+ * Memory Module Event Record - MMER
+ *
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+#define CXL_MMER_HEALTH_STATUS_CHANGE		0x00
+#define CXL_MMER_MEDIA_STATUS_CHANGE		0x01
+#define CXL_MMER_LIFE_USED_CHANGE		0x02
+#define CXL_MMER_TEMP_CHANGE			0x03
+#define CXL_MMER_DATA_PATH_ERROR		0x04
+#define CXL_MMER_LSA_ERROR			0x05
+#define show_dev_evt_type(type)	__print_symbolic(type,			   \
+	{ CXL_MMER_HEALTH_STATUS_CHANGE,	"Health Status Change"	}, \
+	{ CXL_MMER_MEDIA_STATUS_CHANGE,		"Media Status Change"	}, \
+	{ CXL_MMER_LIFE_USED_CHANGE,		"Life Used Change"	}, \
+	{ CXL_MMER_TEMP_CHANGE,			"Temperature Change"	}, \
+	{ CXL_MMER_DATA_PATH_ERROR,		"Data Path Error"	}, \
+	{ CXL_MMER_LSA_ERROR,			"LSA Error"		}  \
+)
+
+/*
+ * Device Health Information - DHI
+ *
+ * CXL rev 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+#define CXL_DHI_HS_MAINTENANCE_NEEDED				BIT(0)
+#define CXL_DHI_HS_PERFORMANCE_DEGRADED				BIT(1)
+#define CXL_DHI_HS_HW_REPLACEMENT_NEEDED			BIT(2)
+#define show_health_status_flags(flags)	__print_flags(flags, "|",	   \
+	{ CXL_DHI_HS_MAINTENANCE_NEEDED,	"MAINTENANCE_NEEDED"	}, \
+	{ CXL_DHI_HS_PERFORMANCE_DEGRADED,	"PERFORMANCE_DEGRADED"	}, \
+	{ CXL_DHI_HS_HW_REPLACEMENT_NEEDED,	"REPLACEMENT_NEEDED"	}  \
+)
+
+#define CXL_DHI_MS_NORMAL							0x00
+#define CXL_DHI_MS_NOT_READY							0x01
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOST					0x02
+#define CXL_DHI_MS_ALL_DATA_LOST						0x03
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS			0x04
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN			0x05
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT				0x06
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS				0x07
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN				0x08
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT					0x09
+#define show_media_status(ms)	__print_symbolic(ms,			   \
+	{ CXL_DHI_MS_NORMAL,						   \
+		"Normal"						}, \
+	{ CXL_DHI_MS_NOT_READY,						   \
+		"Not Ready"						}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOST,				   \
+		"Write Persistency Lost"				}, \
+	{ CXL_DHI_MS_ALL_DATA_LOST,					   \
+		"All Data Lost"						}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS,		   \
+		"Write Persistency Loss in the Event of Power Loss"	}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN,		   \
+		"Write Persistency Loss in Event of Shutdown"		}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT,			   \
+		"Write Persistency Loss Imminent"			}, \
+	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS,		   \
+		"All Data Loss in Event of Power Loss"			}, \
+	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN,		   \
+		"All Data loss in the Event of Shutdown"		}, \
+	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT,			   \
+		"All Data Loss Imminent"				}  \
+)
+
+#define CXL_DHI_AS_NORMAL		0x0
+#define CXL_DHI_AS_WARNING		0x1
+#define CXL_DHI_AS_CRITICAL		0x2
+#define show_two_bit_status(as) __print_symbolic(as,	   \
+	{ CXL_DHI_AS_NORMAL,		"Normal"	}, \
+	{ CXL_DHI_AS_WARNING,		"Warning"	}, \
+	{ CXL_DHI_AS_CRITICAL,		"Critical"	}  \
+)
+#define show_one_bit_status(as) __print_symbolic(as,	   \
+	{ CXL_DHI_AS_NORMAL,		"Normal"	}, \
+	{ CXL_DHI_AS_WARNING,		"Warning"	}  \
+)
+
+#define CXL_DHI_AS_LIFE_USED(as)			(as & 0x3)
+#define CXL_DHI_AS_DEV_TEMP(as)				((as & 0xC) >> 2)
+#define CXL_DHI_AS_COR_VOL_ERR_CNT(as)			((as & 0x10) >> 4)
+#define CXL_DHI_AS_COR_PER_ERR_CNT(as)			((as & 0x20) >> 5)
+
+TRACE_EVENT(cxl_memory_module,
+
+	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
+		 struct cxl_event_mem_module *rec),
+
+	TP_ARGS(dev, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+
+		/* Memory Module Event */
+		__field(u8, event_type)
+
+		/* Device Health Info */
+		__field(u8, health_status)
+		__field(u8, media_status)
+		__field(u8, life_used)
+		__field(u32, dirty_shutdown_cnt)
+		__field(u32, cor_vol_err_cnt)
+		__field(u32, cor_per_err_cnt)
+		__field(s16, device_temp)
+		__field(u8, add_status)
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
+
+		/* Memory Module Event */
+		__entry->event_type = rec->event_type;
+
+		/* Device Health Info */
+		__entry->health_status = rec->info.health_status;
+		__entry->media_status = rec->info.media_status;
+		__entry->life_used = rec->info.life_used;
+		__entry->dirty_shutdown_cnt = get_unaligned_le32(rec->info.dirty_shutdown_cnt);
+		__entry->cor_vol_err_cnt = get_unaligned_le32(rec->info.cor_vol_err_cnt);
+		__entry->cor_per_err_cnt = get_unaligned_le32(rec->info.cor_per_err_cnt);
+		__entry->device_temp = get_unaligned_le16(rec->info.device_temp);
+		__entry->add_status = rec->info.add_status;
+	),
+
+	CXL_EVT_TP_printk("event_type='%s' health_status='%s' media_status='%s' " \
+		"as_life_used=%s as_dev_temp=%s as_cor_vol_err_cnt=%s " \
+		"as_cor_per_err_cnt=%s life_used=%u device_temp=%d " \
+		"dirty_shutdown_cnt=%u cor_vol_err_cnt=%u cor_per_err_cnt=%u",
+		show_dev_evt_type(__entry->event_type),
+		show_health_status_flags(__entry->health_status),
+		show_media_status(__entry->media_status),
+		show_two_bit_status(CXL_DHI_AS_LIFE_USED(__entry->add_status)),
+		show_two_bit_status(CXL_DHI_AS_DEV_TEMP(__entry->add_status)),
+		show_one_bit_status(CXL_DHI_AS_COR_VOL_ERR_CNT(__entry->add_status)),
+		show_one_bit_status(CXL_DHI_AS_COR_PER_ERR_CNT(__entry->add_status)),
+		__entry->life_used, __entry->device_temp,
+		__entry->dirty_shutdown_cnt, __entry->cor_vol_err_cnt,
+		__entry->cor_per_err_cnt
+	)
+);
+
 #endif /* _CXL_EVENTS_H */
 
 #define TRACE_INCLUDE_FILE trace
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 78c453278910..459f161c7c86 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -491,6 +491,32 @@ struct cxl_event_dram {
 	u8 reserved[0x17];
 } __packed;
 
+/*
+ * Get Health Info Record
+ * CXL rev 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+struct cxl_get_health_info {
+	u8 health_status;
+	u8 media_status;
+	u8 add_status;
+	u8 life_used;
+	u8 device_temp[2];
+	u8 dirty_shutdown_cnt[4];
+	u8 cor_vol_err_cnt[4];
+	u8 cor_per_err_cnt[4];
+} __packed;
+
+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+struct cxl_event_mem_module {
+	struct cxl_event_record_hdr hdr;
+	u8 event_type;
+	struct cxl_get_health_info info;
+	u8 reserved[0x3d];
+} __packed;
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH V4 7/9] cxl/test: Add generic mock events
  2022-12-12  7:06 [PATCH V4 0/9] CXL: Process event logs ira.weiny
                   ` (5 preceding siblings ...)
  2022-12-12  7:06 ` [PATCH V4 6/9] cxl/mem: Trace Memory Module " ira.weiny
@ 2022-12-12  7:06 ` ira.weiny
  2022-12-12  7:06 ` [PATCH V4 8/9] cxl/test: Add specific events ira.weiny
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 36+ messages in thread
From: ira.weiny @ 2022-12-12  7:06 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Facilitate testing basic Get/Clear Event functionality by creating
multiple logs and generic events with made-up UUIDs.

The event data is completely made up, with data patterns that should be
easy to spot in trace output.

A single sysfs entry resets the event data and triggers collecting the
events for testing.

Test traces are easy to obtain with a small script such as this:

	#!/bin/bash -x

	devices=`find /sys/devices/platform -name cxl_mem*`

	# Turn on tracing
	echo "" > /sys/kernel/tracing/trace
	echo 1 > /sys/kernel/tracing/events/cxl/enable
	echo 1 > /sys/kernel/tracing/tracing_on

	# Generate fake interrupt
	for device in $devices; do
	        echo 1 > $device/event_trigger
	done

	# Turn off tracing and report events
	echo 0 > /sys/kernel/tracing/tracing_on
	cat /sys/kernel/tracing/trace

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from v3:
	Dan
		Move code to mem.c and remove event.c and overhead
	Clean up function names
---
 tools/testing/cxl/test/Kbuild |   2 +-
 tools/testing/cxl/test/mem.c  | 231 ++++++++++++++++++++++++++++++++++
 2 files changed, 232 insertions(+), 1 deletion(-)

diff --git a/tools/testing/cxl/test/Kbuild b/tools/testing/cxl/test/Kbuild
index 4e59e2c911f6..61d5f7bcddf9 100644
--- a/tools/testing/cxl/test/Kbuild
+++ b/tools/testing/cxl/test/Kbuild
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-ccflags-y := -I$(srctree)/drivers/cxl/
+ccflags-y := -I$(srctree)/drivers/cxl/ -I$(srctree)/drivers/cxl/core
 
 obj-m += cxl_test.o
 obj-m += cxl_mock.o
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index 5e4ecd93f1d2..036f27f9c18e 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -9,6 +9,8 @@
 #include <linux/bits.h>
 #include <cxlmem.h>
 
+#include "trace.h"
+
 #define LSA_SIZE SZ_128K
 #define DEV_SIZE SZ_2G
 #define EFFECT(x) (1U << x)
@@ -67,6 +69,24 @@ static struct {
 
 #define PASS_TRY_LIMIT 3
 
+#define CXL_TEST_EVENT_CNT_MAX 15
+
+/* Set a number of events to return at a time for simulation.  */
+#define CXL_TEST_EVENT_CNT 3
+
+struct mock_event_log {
+	u16 clear_idx;
+	u16 cur_idx;
+	u16 nr_events;
+	struct cxl_event_record_raw *events[CXL_TEST_EVENT_CNT_MAX];
+};
+
+struct mock_event_store {
+	struct cxl_dev_state *cxlds;
+	struct mock_event_log mock_logs[CXL_EVENT_TYPE_MAX];
+	u32 ev_status;
+};
+
 struct cxl_mockmem_data {
 	void *lsa;
 	u32 security_state;
@@ -74,9 +94,198 @@ struct cxl_mockmem_data {
 	u8 master_pass[NVDIMM_PASSPHRASE_LEN];
 	int user_limit;
 	int master_limit;
+	struct mock_event_store mes;
+	u8 event_buf[SZ_4K];
+};
+
+static struct mock_event_log *event_find_log(struct device *dev, int log_type)
+{
+	struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
+
+	if (log_type >= CXL_EVENT_TYPE_MAX)
+		return NULL;
+	return &mdata->mes.mock_logs[log_type];
+}
+
+static struct cxl_event_record_raw *event_get_current(struct mock_event_log *log)
+{
+	return log->events[log->cur_idx];
+}
+
+static void event_reset_log(struct mock_event_log *log)
+{
+	log->cur_idx = 0;
+	log->clear_idx = 0;
+}
+
+/* Handles can never be 0; use 1-based indexing for handles */
+static u16 event_get_clear_handle(struct mock_event_log *log)
+{
+	return log->clear_idx + 1;
+}
+
+/* Handles can never be 0; use 1-based indexing for handles */
+static __le16 event_get_cur_event_handle(struct mock_event_log *log)
+{
+	u16 cur_handle = log->cur_idx + 1;
+
+	return cpu_to_le16(cur_handle);
+}
+
+static bool event_log_empty(struct mock_event_log *log)
+{
+	return log->cur_idx == log->nr_events;
+}
+
+static void mes_add_event(struct mock_event_store *mes,
+			  enum cxl_event_log_type log_type,
+			  struct cxl_event_record_raw *event)
+{
+	struct mock_event_log *log;
+
+	if (WARN_ON(log_type >= CXL_EVENT_TYPE_MAX))
+		return;
+
+	log = &mes->mock_logs[log_type];
+	if (WARN_ON(log->nr_events >= CXL_TEST_EVENT_CNT_MAX))
+		return;
+
+	log->events[log->nr_events] = event;
+	log->nr_events++;
+}
+
+static int mock_get_event(struct cxl_dev_state *cxlds,
+			  struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_get_event_payload *pl;
+	struct mock_event_log *log;
+	u8 log_type;
+	int i;
+
+	if (cmd->size_in != sizeof(log_type))
+		return -EINVAL;
+
+	if (cmd->size_out < struct_size(pl, records, CXL_TEST_EVENT_CNT))
+		return -EINVAL;
+
+	log_type = *((u8 *)cmd->payload_in);
+	if (log_type >= CXL_EVENT_TYPE_MAX)
+		return -EINVAL;
+
+	memset(cmd->payload_out, 0, cmd->size_out);
+
+	log = event_find_log(cxlds->dev, log_type);
+	if (!log || event_log_empty(log))
+		return 0;
+
+	pl = cmd->payload_out;
+
+	for (i = 0; i < CXL_TEST_EVENT_CNT && !event_log_empty(log); i++) {
+		memcpy(&pl->records[i], event_get_current(log),
+		       sizeof(pl->records[i]));
+		pl->records[i].hdr.handle = event_get_cur_event_handle(log);
+		log->cur_idx++;
+	}
+
+	pl->record_count = cpu_to_le16(i);
+	if (!event_log_empty(log))
+		pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
+
+	return 0;
+}
+
+static int mock_clear_event(struct cxl_dev_state *cxlds,
+			    struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mbox_clear_event_payload *pl = cmd->payload_in;
+	struct mock_event_log *log;
+	u8 log_type = pl->event_log;
+	u16 handle;
+	int nr;
+
+	if (log_type >= CXL_EVENT_TYPE_MAX)
+		return -EINVAL;
+
+	log = event_find_log(cxlds->dev, log_type);
+	if (!log)
+		return 0; /* No mock data in this log */
+
+	/*
+	 * The specification does not appear to forbid this (the host could
+	 * 'guess' handles and clear them in order), but it is not good host
+	 * behavior, so test for it here.
+	 */
+	if (log->clear_idx + pl->nr_recs > log->cur_idx) {
+		dev_err(cxlds->dev,
+			"Attempting to clear more events than returned!\n");
+		return -EINVAL;
+	}
+
+	/* Check handle order prior to clearing events */
+	for (nr = 0, handle = event_get_clear_handle(log);
+	     nr < pl->nr_recs;
+	     nr++, handle++) {
+		if (handle != le16_to_cpu(pl->handle[nr])) {
+			dev_err(cxlds->dev, "Clearing events out of order\n");
+			return -EINVAL;
+		}
+	}
+
+	/* Clear events */
+	log->clear_idx += pl->nr_recs;
+	return 0;
+}
+
+static void cxl_mock_event_trigger(struct device *dev)
+{
+	struct cxl_mockmem_data *mdata = dev_get_drvdata(dev);
+	struct mock_event_store *mes = &mdata->mes;
+	int i;
+
+	for (i = CXL_EVENT_TYPE_INFO; i < CXL_EVENT_TYPE_MAX; i++) {
+		struct mock_event_log *log;
+
+		log = event_find_log(dev, i);
+		if (log)
+			event_reset_log(log);
+	}
+
+	cxl_mem_get_event_records(mes->cxlds, mes->ev_status);
+}
+
+struct cxl_event_record_raw maint_needed = {
+	.hdr = {
+		.id = UUID_INIT(0xBA5EBA11, 0xABCD, 0xEFEB,
+				0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
+		.length = sizeof(struct cxl_event_record_raw),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0xa5b6),
+	},
+	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
+};
 
+struct cxl_event_record_raw hardware_replace = {
+	.hdr = {
+		.id = UUID_INIT(0xABCDEFEB, 0xBA11, 0xBA5E,
+				0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
+		.length = sizeof(struct cxl_event_record_raw),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_HW_REPLACE,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0xb6a5),
+	},
+	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
 };
 
+static void cxl_mock_add_event_logs(struct mock_event_store *mes)
+{
+	mes_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
+	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
+
+	mes_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
+	mes->ev_status |= CXLDEV_EVENT_STATUS_FATAL;
+}
+
 static int mock_gsl(struct cxl_mbox_cmd *cmd)
 {
 	if (cmd->size_out < sizeof(mock_gsl_payload))
@@ -582,6 +791,12 @@ static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *
 	case CXL_MBOX_OP_GET_PARTITION_INFO:
 		rc = mock_partition_info(cxlds, cmd);
 		break;
+	case CXL_MBOX_OP_GET_EVENT_RECORD:
+		rc = mock_get_event(cxlds, cmd);
+		break;
+	case CXL_MBOX_OP_CLEAR_EVENT_RECORD:
+		rc = mock_clear_event(cxlds, cmd);
+		break;
 	case CXL_MBOX_OP_SET_LSA:
 		rc = mock_set_lsa(cxlds, cmd);
 		break;
@@ -628,6 +843,15 @@ static bool is_rcd(struct platform_device *pdev)
 	return !!id->driver_data;
 }
 
+static ssize_t event_trigger_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	cxl_mock_event_trigger(dev);
+	return count;
+}
+static DEVICE_ATTR_WO(event_trigger);
+
 static int cxl_mock_mem_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
@@ -655,6 +879,7 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 	cxlds->serial = pdev->id;
 	cxlds->mbox_send = cxl_mock_mbox_send;
 	cxlds->payload_size = SZ_4K;
+	cxlds->event.buf = (struct cxl_get_event_payload *)mdata->event_buf;
 	if (is_rcd(pdev)) {
 		cxlds->rcd = true;
 		cxlds->component_reg_phys = CXL_RESOURCE_NONE;
@@ -672,10 +897,15 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 	if (rc)
 		return rc;
 
+	mdata->mes.cxlds = cxlds;
+	cxl_mock_add_event_logs(&mdata->mes);
+
 	cxlmd = devm_cxl_add_memdev(cxlds);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
+	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
+
 	return 0;
 }
 
@@ -714,6 +944,7 @@ static DEVICE_ATTR_RW(security_lock);
 
 static struct attribute *cxl_mock_mem_attrs[] = {
 	&dev_attr_security_lock.attr,
+	&dev_attr_event_trigger.attr,
 	NULL
 };
 ATTRIBUTE_GROUPS(cxl_mock_mem);
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH V4 8/9] cxl/test: Add specific events
  2022-12-12  7:06 [PATCH V4 0/9] CXL: Process event logs ira.weiny
                   ` (6 preceding siblings ...)
  2022-12-12  7:06 ` [PATCH V4 7/9] cxl/test: Add generic mock events ira.weiny
@ 2022-12-12  7:06 ` ira.weiny
  2022-12-12  7:06 ` [PATCH V4 9/9] cxl/test: Simulate event log overflow ira.weiny
  2022-12-16 12:25 ` [PATCH V4 0/9] CXL: Process event logs Jonathan Cameron
  9 siblings, 0 replies; 36+ messages in thread
From: ira.weiny @ 2022-12-12  7:06 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Jonathan Cameron, Bjorn Helgaas, Alison Schofield,
	Vishal Verma, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Each event type has a different trace point output.

Add mock General Media Event, DRAM Event, and Memory Module Event
records to the list of mock events returned.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V3:
	Dan
		Move code to mem.c
---
 tools/testing/cxl/test/mem.c | 73 ++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index 036f27f9c18e..73db722a8879 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -277,12 +277,85 @@ struct cxl_event_record_raw hardware_replace = {
 	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
 };
 
+struct cxl_event_gen_media gen_media = {
+	.hdr = {
+		.id = UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
+				0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6),
+		.length = sizeof(struct cxl_event_gen_media),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_PERMANENT,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0),
+	},
+	.phys_addr = cpu_to_le64(0x2000),
+	.descriptor = CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,
+	.type = CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,
+	.transaction_type = CXL_GMER_TRANS_HOST_WRITE,
+	/* .validity_flags = <set below> */
+	.channel = 1,
+	.rank = 30
+};
+
+struct cxl_event_dram dram = {
+	.hdr = {
+		.id = UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
+				0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24),
+		.length = sizeof(struct cxl_event_dram),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0),
+	},
+	.phys_addr = cpu_to_le64(0x8000),
+	.descriptor = CXL_GMER_EVT_DESC_THRESHOLD_EVENT,
+	.type = CXL_GMER_MEM_EVT_TYPE_INV_ADDR,
+	.transaction_type = CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,
+	/* .validity_flags = <set below> */
+	.channel = 1,
+	.bank_group = 5,
+	.bank = 2,
+	.column = {0xDE, 0xAD},
+};
+
+struct cxl_event_mem_module mem_module = {
+	.hdr = {
+		.id = UUID_INIT(0xfe927475, 0xdd59, 0x4339,
+				0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74),
+		.length = sizeof(struct cxl_event_mem_module),
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0),
+	},
+	.event_type = CXL_MMER_TEMP_CHANGE,
+	.info = {
+		.health_status = CXL_DHI_HS_PERFORMANCE_DEGRADED,
+		.media_status = CXL_DHI_MS_ALL_DATA_LOST,
+		.add_status = (CXL_DHI_AS_CRITICAL << 2) |
+			      (CXL_DHI_AS_WARNING << 4) |
+			      (CXL_DHI_AS_WARNING << 5),
+		.device_temp = { 0xDE, 0xAD},
+		.dirty_shutdown_cnt = { 0xde, 0xad, 0xbe, 0xef },
+		.cor_vol_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
+		.cor_per_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
+	}
+};
+
 static void cxl_mock_add_event_logs(struct mock_event_store *mes)
 {
+	put_unaligned_le16(CXL_GMER_VALID_CHANNEL | CXL_GMER_VALID_RANK,
+			   &gen_media.validity_flags);
+
+	put_unaligned_le16(CXL_DER_VALID_CHANNEL | CXL_DER_VALID_BANK_GROUP |
+			   CXL_DER_VALID_BANK | CXL_DER_VALID_COLUMN,
+			   &dram.validity_flags);
+
 	mes_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
+	mes_add_event(mes, CXL_EVENT_TYPE_INFO,
+		      (struct cxl_event_record_raw *)&gen_media);
+	mes_add_event(mes, CXL_EVENT_TYPE_INFO,
+		      (struct cxl_event_record_raw *)&mem_module);
 	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
 
 	mes_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
+	mes_add_event(mes, CXL_EVENT_TYPE_FATAL,
+		      (struct cxl_event_record_raw *)&dram);
 	mes->ev_status |= CXLDEV_EVENT_STATUS_FATAL;
 }
 
-- 
2.37.2



* [PATCH V4 9/9] cxl/test: Simulate event log overflow
  2022-12-12  7:06 [PATCH V4 0/9] CXL: Process event logs ira.weiny
                   ` (7 preceding siblings ...)
  2022-12-12  7:06 ` [PATCH V4 8/9] cxl/test: Add specific events ira.weiny
@ 2022-12-12  7:06 ` ira.weiny
  2022-12-16 12:25 ` [PATCH V4 0/9] CXL: Process event logs Jonathan Cameron
  9 siblings, 0 replies; 36+ messages in thread
From: ira.weiny @ 2022-12-12  7:06 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Jonathan Cameron, Bjorn Helgaas, Alison Schofield,
	Vishal Verma, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Log overflow is reported via a separate trace message.

Simulate a log holding more events than it has room for, and keep the
overflow flagged until space is cleared.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V3:
	Dan
		Move code to mem.c
---
 tools/testing/cxl/test/mem.c | 50 +++++++++++++++++++++++++++++++++++-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index 73db722a8879..b2327a8de4a6 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -78,6 +78,8 @@ struct mock_event_log {
 	u16 clear_idx;
 	u16 cur_idx;
 	u16 nr_events;
+	u16 nr_overflow;
+	u16 overflow_reset;
 	struct cxl_event_record_raw *events[CXL_TEST_EVENT_CNT_MAX];
 };
 
@@ -116,6 +118,7 @@ static void event_reset_log(struct mock_event_log *log)
 {
 	log->cur_idx = 0;
 	log->clear_idx = 0;
+	log->nr_overflow = log->overflow_reset;
 }
 
 /* Handle can never be 0 use 1 based indexing for handle */
@@ -147,8 +150,12 @@ static void mes_add_event(struct mock_event_store *mes,
 		return;
 
 	log = &mes->mock_logs[log_type];
-	if (WARN_ON(log->nr_events >= CXL_TEST_EVENT_CNT_MAX))
+
+	if ((log->nr_events + 1) > CXL_TEST_EVENT_CNT_MAX) {
+		log->nr_overflow++;
+		log->overflow_reset = log->nr_overflow;
 		return;
+	}
 
 	log->events[log->nr_events] = event;
 	log->nr_events++;
@@ -191,6 +198,19 @@ static int mock_get_event(struct cxl_dev_state *cxlds,
 	if (!event_log_empty(log))
 		pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
 
+	if (log->nr_overflow) {
+		u64 ns;
+
+		pl->flags |= CXL_GET_EVENT_FLAG_OVERFLOW;
+		pl->overflow_err_count = cpu_to_le16(log->nr_overflow);
+		ns = ktime_get_real_ns();
+		ns -= 5000000000; /* 5s ago */
+		pl->first_overflow_timestamp = cpu_to_le64(ns);
+		ns = ktime_get_real_ns();
+		ns -= 1000000000; /* 1s ago */
+		pl->last_overflow_timestamp = cpu_to_le64(ns);
+	}
+
 	return 0;
 }
 
@@ -231,6 +251,9 @@ static int mock_clear_event(struct cxl_dev_state *cxlds,
 		}
 	}
 
+	if (log->nr_overflow)
+		log->nr_overflow = 0;
+
 	/* Clear events */
 	log->clear_idx += pl->nr_recs;
 	return 0;
@@ -353,6 +376,30 @@ static void cxl_mock_add_event_logs(struct mock_event_store *mes)
 		      (struct cxl_event_record_raw *)&mem_module);
 	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
 
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &maint_needed);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL,
+		      (struct cxl_event_record_raw *)&dram);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL,
+		      (struct cxl_event_record_raw *)&gen_media);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL,
+		      (struct cxl_event_record_raw *)&mem_module);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL,
+		      (struct cxl_event_record_raw *)&dram);
+	/* Overflow this log */
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	mes_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	mes->ev_status |= CXLDEV_EVENT_STATUS_FAIL;
+
 	mes_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
 	mes_add_event(mes, CXL_EVENT_TYPE_FATAL,
 		      (struct cxl_event_record_raw *)&dram);
-- 
2.37.2



* Re: [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load
  2022-12-12  7:06 ` [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load ira.weiny
@ 2022-12-13  6:49   ` johnny
  2022-12-13 18:56     ` Ira Weiny
  2022-12-16 15:39   ` Jonathan Cameron
  1 sibling, 1 reply; 36+ messages in thread
From: johnny @ 2022-12-13  6:49 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

On Sun, Dec 11, 2022 at 11:06:20PM -0800, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL devices have multiple event logs which can be queried for CXL event
> records.  Devices are required to support the storage of at least one
> event record in each event log type.
> 
> Devices track event log overflow by incrementing a counter and tracking
> the time of the first and last overflow event seen.
> 
> Software queries events via the Get Event Record mailbox command; CXL
> rev 3.0 section 8.2.9.2.2 and clears events via CXL rev 3.0 section
> 8.2.9.2.3 Clear Event Records mailbox command.
> 
> If the result of negotiating CXL Error Reporting Control is OS control,
> read and clear all event logs on driver load.
> 
> Ensure a clean slate of events by reading and clearing the events on
> driver load.
> 
> The status register is not used because a device may continue to trigger
> events and the only requirement is to empty the log at least once.  This
> allows for the required transition from empty to non-empty for interrupt
> generation.  Handling of interrupts is in a follow on patch.
> 
> The device can return up to 1MB worth of event records per query.
> Allocate a shared large buffer to handle the max number of records based
> on the mailbox payload size.
> 
> This patch traces a raw event record and leaves specific event record
> type tracing to subsequent patches.  Macros are created to aid in
> tracing the common CXL Event header fields.
> 
> Each record is cleared explicitly.  A clear all bit is specified but is
> only valid when the log overflows.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes from V3:
> 	Dan
> 		Split off _OSC pcie bits
> 			Use existing style for host bridge flag in that
> 			patch
> 		Clean up event processing loop
> 		Use dev_err_ratelimited()
> 		Clean up version change log
> 		Delete 'EVENT LOG OVERFLOW'
> 		Remove cxl_clear_event_logs()
> 		Add comment for native cxl control
> 		Fail driver load on event buf allocation failure
> 		Comment why events are not processed without _OSC flag
> ---
>  drivers/cxl/core/mbox.c  | 136 +++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/core/trace.h | 120 ++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h        |  12 ++++
>  drivers/cxl/cxlmem.h     |  84 ++++++++++++++++++++++++
>  drivers/cxl/pci.c        |  40 ++++++++++++
>  5 files changed, 392 insertions(+)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index b03fba212799..9fb327370e08 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -8,6 +8,7 @@
>  #include <cxl.h>
>  
>  #include "core.h"
> +#include "trace.h"
>  
>  static bool cxl_raw_allow_all;
>  
> @@ -717,6 +718,140 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>  
> +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> +				  enum cxl_event_log_type log,
> +				  struct cxl_get_event_payload *get_pl)
> +{
> +	struct cxl_mbox_clear_event_payload payload = {
> +		.event_log = log,
> +	};
> +	u16 total = le16_to_cpu(get_pl->record_count);
> +	u8 max_handles = CXL_CLEAR_EVENT_MAX_HANDLES;
> +	size_t pl_size = sizeof(payload);
> +	struct cxl_mbox_cmd mbox_cmd;
> +	u16 cnt;
> +	int rc;
> +	int i;
> +
> +	/* Payload size may limit the max handles */
> +	if (pl_size > cxlds->payload_size) {
> +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
> +		pl_size = cxlds->payload_size;
> +	}
> +
> +	mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> +		.payload_in = &payload,
> +		.size_in = pl_size,
> +	};
> +
> +	/*
> +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> +	 * Record can return up to 0xffff records.
> +	 */
> +	i = 0;
> +	for (cnt = 0; cnt < total; cnt++) {
> +		payload.handle[i++] = get_pl->records[cnt].hdr.handle;
> +		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
> +			log, le16_to_cpu(payload.handle[i - 1]));
> +
> +		if (i == max_handles) {
> +			payload.nr_recs = i;
> +			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +			if (rc)
> +				return rc;
> +			i = 0;
> +		}
> +	}
> +
> +	/* Clear what is left if any */
> +	if (i) {
> +		payload.nr_recs = i;
> +		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +		if (rc)
> +			return rc;
> +	}
> +
> +	return 0;
> +}
> +
> +static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> +				    enum cxl_event_log_type type)
> +{
> +	struct cxl_get_event_payload *payload;
> +	struct cxl_mbox_cmd mbox_cmd;
> +	u8 log_type = type;
> +	u16 nr_rec;
> +
> +	mutex_lock(&cxlds->event.log_lock);
> +	payload = cxlds->event.buf;
> +
> +	mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_GET_EVENT_RECORD,
> +		.payload_in = &log_type,
> +		.size_in = sizeof(log_type),
> +		.payload_out = payload,
> +		.size_out = cxlds->payload_size,
> +		.min_out = struct_size(payload, records, 0),
> +	};
> +
> +	do {
> +		int rc, i;
> +
> +		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +		if (rc) {
> +			dev_err_ratelimited(cxlds->dev, "Event log '%d': Failed to query event records : %d",
> +					    type, rc);
> +			break;
> +		}
> +
> +		nr_rec = le16_to_cpu(payload->record_count);
> +		if (!nr_rec)
> +			break;
> +
> +		for (i = 0; i < nr_rec; i++)
> +			trace_cxl_generic_event(cxlds->dev, type,
> +						&payload->records[i]);
> +
> +		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
> +			trace_cxl_overflow(cxlds->dev, type, payload);
> +
> +		rc = cxl_clear_event_record(cxlds, type, payload);
> +		if (rc) {
> +			dev_err_ratelimited(cxlds->dev, "Event log '%d': Failed to clear events : %d",
> +					    type, rc);
> +			break;
> +		}
> +	} while (nr_rec);
> +
> +	mutex_unlock(&cxlds->event.log_lock);
> +}
> +
> +/**
> + * cxl_mem_get_event_records - Get Event Records from the device
> + * @cxlds: The device data for the operation
> + *
> + * Retrieve all event records available on the device, report them as trace
> + * events, and clear them.
> + *
> + * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> + * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
> + */
> +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status)
> +{
> +	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
> +
> +	if (status & CXLDEV_EVENT_STATUS_FATAL)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
> +	if (status & CXLDEV_EVENT_STATUS_FAIL)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> +	if (status & CXLDEV_EVENT_STATUS_WARN)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> +	if (status & CXLDEV_EVENT_STATUS_INFO)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> +
>  /**
>   * cxl_mem_get_partition_info - Get partition info
>   * @cxlds: The device data for the operation
> @@ -868,6 +1003,7 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
>  	}
>  
>  	mutex_init(&cxlds->mbox_mutex);
> +	mutex_init(&cxlds->event.log_lock);
>  	cxlds->dev = dev;
>  
>  	return cxlds;
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index 20ca2fe2ca8e..6898212fcb47 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -6,7 +6,9 @@
>  #if !defined(_CXL_EVENTS_H) || defined(TRACE_HEADER_MULTI_READ)
>  #define _CXL_EVENTS_H
>  
> +#include <asm-generic/unaligned.h>
>  #include <cxl.h>
> +#include <cxlmem.h>
>  #include <linux/tracepoint.h>
>  
>  #define CXL_RAS_UC_CACHE_DATA_PARITY	BIT(0)
> @@ -103,6 +105,124 @@ TRACE_EVENT(cxl_aer_correctable_error,
>  	)
>  );
>  
> +#include <linux/tracepoint.h>
> +
> +#define cxl_event_log_type_str(type)				\
> +	__print_symbolic(type,					\
> +		{ CXL_EVENT_TYPE_INFO, "Informational" },	\
> +		{ CXL_EVENT_TYPE_WARN, "Warning" },		\
> +		{ CXL_EVENT_TYPE_FAIL, "Failure" },		\
> +		{ CXL_EVENT_TYPE_FATAL, "Fatal" })
> +
> +TRACE_EVENT(cxl_overflow,
> +
> +	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
> +		 struct cxl_get_event_payload *payload),
> +
> +	TP_ARGS(dev, log, payload),
> +
> +	TP_STRUCT__entry(
> +		__string(dev_name, dev_name(dev))
> +		__field(int, log)
> +		__field(u64, first_ts)
> +		__field(u64, last_ts)
> +		__field(u16, count)
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(dev_name, dev_name(dev));
> +		__entry->log = log;
> +		__entry->count = le16_to_cpu(payload->overflow_err_count);
> +		__entry->first_ts = le64_to_cpu(payload->first_overflow_timestamp);
> +		__entry->last_ts = le64_to_cpu(payload->last_overflow_timestamp);
> +	),
> +
> +	TP_printk("%s: log=%s : %u records from %llu to %llu",
> +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),
> +		__entry->count, __entry->first_ts, __entry->last_ts)
> +
> +);
> +
> +/*
> + * Common Event Record Format
> + * CXL 3.0 section 8.2.9.2.1; Table 8-42
> + */
> +#define CXL_EVENT_RECORD_FLAG_PERMANENT		BIT(2)
> +#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED	BIT(3)
> +#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED	BIT(4)
> +#define CXL_EVENT_RECORD_FLAG_HW_REPLACE	BIT(5)
> +#define show_hdr_flags(flags)	__print_flags(flags, " | ",			   \
> +	{ CXL_EVENT_RECORD_FLAG_PERMANENT,	"PERMANENT_CONDITION"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,	"MAINTENANCE_NEEDED"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,	"PERFORMANCE_DEGRADED"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_HW_REPLACE,	"HARDWARE_REPLACEMENT_NEEDED"	}  \
> +)
> +
> +/*
> + * Define macros for the common header of each CXL event.
> + *
> + * Tracepoints using these macros must do 3 things:
> + *
> + *	1) Add CXL_EVT_TP_entry to TP_STRUCT__entry
> + *	2) Use CXL_EVT_TP_fast_assign within TP_fast_assign;
> + *	   pass the dev, log, and CXL event header
> + *	3) Use CXL_EVT_TP_printk() instead of TP_printk()
> + *
> + * See the generic_event tracepoint as an example.
> + */
> +#define CXL_EVT_TP_entry					\
> +	__string(dev_name, dev_name(dev))			\
> +	__field(int, log)					\
> +	__field_struct(uuid_t, hdr_uuid)			\
> +	__field(u32, hdr_flags)					\
> +	__field(u16, hdr_handle)				\
> +	__field(u16, hdr_related_handle)			\
> +	__field(u64, hdr_timestamp)				\
> +	__field(u8, hdr_length)					\
> +	__field(u8, hdr_maint_op_class)
> +
> +#define CXL_EVT_TP_fast_assign(dev, l, hdr)					\
> +	__assign_str(dev_name, dev_name(dev));					\
> +	__entry->log = (l);							\
> +	memcpy(&__entry->hdr_uuid, &(hdr).id, sizeof(uuid_t));			\
> +	__entry->hdr_length = (hdr).length;					\
> +	__entry->hdr_flags = get_unaligned_le24((hdr).flags);			\
> +	__entry->hdr_handle = le16_to_cpu((hdr).handle);			\
> +	__entry->hdr_related_handle = le16_to_cpu((hdr).related_handle);	\
> +	__entry->hdr_timestamp = le64_to_cpu((hdr).timestamp);			\
> +	__entry->hdr_maint_op_class = (hdr).maint_op_class
> +
> +#define CXL_EVT_TP_printk(fmt, ...) \
> +	TP_printk("%s log=%s : time=%llu uuid=%pUb len=%d flags='%s' "		\
> +		"handle=%x related_handle=%x maint_op_class=%u"			\
> +		" : " fmt,							\
> +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),	\
> +		__entry->hdr_timestamp, &__entry->hdr_uuid, __entry->hdr_length,\
> +		show_hdr_flags(__entry->hdr_flags), __entry->hdr_handle,	\
> +		__entry->hdr_related_handle, __entry->hdr_maint_op_class,	\
> +		##__VA_ARGS__)
> +
> +TRACE_EVENT(cxl_generic_event,
> +
> +	TP_PROTO(const struct device *dev, enum cxl_event_log_type log,
> +		 struct cxl_event_record_raw *rec),
> +
> +	TP_ARGS(dev, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +		__array(u8, data, CXL_EVENT_RECORD_DATA_LENGTH)
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev, log, rec->hdr);
> +		memcpy(__entry->data, &rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
> +	),
> +
> +	CXL_EVT_TP_printk("%s",
> +		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
> +);
> +
>  #endif /* _CXL_EVENTS_H */
>  
>  #define TRACE_INCLUDE_FILE trace
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index aa3af3bb73b2..5974d1082210 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -156,6 +156,18 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
>  #define CXLDEV_CAP_CAP_ID_SECONDARY_MAILBOX 0x3
>  #define CXLDEV_CAP_CAP_ID_MEMDEV 0x4000
>  
> +/* CXL 3.0 8.2.8.3.1 Event Status Register */
> +#define CXLDEV_DEV_EVENT_STATUS_OFFSET		0x00
> +#define CXLDEV_EVENT_STATUS_INFO		BIT(0)
> +#define CXLDEV_EVENT_STATUS_WARN		BIT(1)
> +#define CXLDEV_EVENT_STATUS_FAIL		BIT(2)
> +#define CXLDEV_EVENT_STATUS_FATAL		BIT(3)
> +
> +#define CXLDEV_EVENT_STATUS_ALL (CXLDEV_EVENT_STATUS_INFO |	\
> +				 CXLDEV_EVENT_STATUS_WARN |	\
> +				 CXLDEV_EVENT_STATUS_FAIL |	\
> +				 CXLDEV_EVENT_STATUS_FATAL)
> +
>  /* CXL 2.0 8.2.8.4 Mailbox Registers */
>  #define CXLDEV_MBOX_CAPS_OFFSET 0x00
>  #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index ab138004f644..dd9aa3dd738e 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -4,6 +4,7 @@
>  #define __CXL_MEM_H__
>  #include <uapi/linux/cxl_mem.h>
>  #include <linux/cdev.h>
> +#include <linux/uuid.h>
>  #include "cxl.h"
>  
>  /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
> @@ -193,6 +194,17 @@ struct cxl_endpoint_dvsec_info {
>  	struct range dvsec_range[2];
>  };
>  
> +/**
> + * struct cxl_event_state - Event log driver state
> + *
> + * @buf: Buffer to receive event data
> + * @log_lock: Serialize event buffer and log use
> + */
> +struct cxl_event_state {
> +	struct cxl_get_event_payload *buf;
> +	struct mutex log_lock;
> +};
> +
>  /**
>   * struct cxl_dev_state - The driver device state
>   *
> @@ -266,12 +278,16 @@ struct cxl_dev_state {
>  
>  	struct xarray doe_mbs;
>  
> +	struct cxl_event_state event;
> +
>  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
>  };
>  
>  enum cxl_opcode {
>  	CXL_MBOX_OP_INVALID		= 0x0000,
>  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> +	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> +	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
>  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
>  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
>  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> @@ -347,6 +363,73 @@ struct cxl_mbox_identify {
>  	u8 qos_telemetry_caps;
>  } __packed;
>  
> +/*
> + * Common Event Record Format
> + * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
> + */
> +struct cxl_event_record_hdr {
> +	uuid_t id;
> +	u8 length;
> +	u8 flags[3];
> +	__le16 handle;
> +	__le16 related_handle;
> +	__le64 timestamp;
> +	u8 maint_op_class;
> +	u8 reserved[15];
> +} __packed;
> +
> +#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
> +struct cxl_event_record_raw {
> +	struct cxl_event_record_hdr hdr;
> +	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
> +} __packed;
> +
> +/*
> + * Get Event Records output payload
> + * CXL rev 3.0 section 8.2.9.2.2; Table 8-50
> + */
> +#define CXL_GET_EVENT_FLAG_OVERFLOW		BIT(0)
> +#define CXL_GET_EVENT_FLAG_MORE_RECORDS		BIT(1)
I don't see any code that consumes this MORE flag; am I missing something?
The device shall set this flag when all the records cannot fit in a single
output payload.
> +struct cxl_get_event_payload {
> +	u8 flags;
> +	u8 reserved1;
> +	__le16 overflow_err_count;
> +	__le64 first_overflow_timestamp;
> +	__le64 last_overflow_timestamp;
> +	__le16 record_count;
> +	u8 reserved2[10];
> +	struct cxl_event_record_raw records[];
> +} __packed;
> +
> +/*
> + * CXL rev 3.0 section 8.2.9.2.2; Table 8-49
> + */
> +enum cxl_event_log_type {
> +	CXL_EVENT_TYPE_INFO = 0x00,
> +	CXL_EVENT_TYPE_WARN,
> +	CXL_EVENT_TYPE_FAIL,
> +	CXL_EVENT_TYPE_FATAL,
> +	CXL_EVENT_TYPE_MAX
> +};
> +
> +/*
> + * Clear Event Records input payload
> + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> + */
> +#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
> +struct cxl_mbox_clear_event_payload {
> +	u8 event_log;		/* enum cxl_event_log_type */
> +	u8 clear_flags;
> +	u8 nr_recs;
> +	u8 reserved[3];
> +	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];
> +} __packed;
> +#define CXL_CLEAR_EVENT_LIMIT_HANDLES(payload_size)			\
> +	(((payload_size) -						\
> +		(sizeof(struct cxl_mbox_clear_event_payload) -		\
> +		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
> +		sizeof(__le16))
> +
>  struct cxl_mbox_get_partition_info {
>  	__le64 active_volatile_cap;
>  	__le64 active_persistent_cap;
> @@ -441,6 +524,7 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
>  struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
>  void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
>  void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status);
>  #ifdef CONFIG_CXL_SUSPEND
>  void cxl_mem_active_inc(void);
>  void cxl_mem_active_dec(void);
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 3a66aadb4df0..a2d8382bc593 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -417,8 +417,37 @@ static void disable_aer(void *pdev)
>  	pci_disable_pcie_error_reporting(pdev);
>  }
>  
> +static void cxl_mem_free_event_buffer(void *buf)
> +{
> +	kvfree(buf);
> +}
> +
> +/*
> + * There is a single buffer for reading event logs from the mailbox.  All logs
> + * share this buffer protected by the cxlds->event_log_lock.
> + */
> +static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_get_event_payload *buf;
> +
> +	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
> +		cxlds->payload_size);
> +
> +	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> +	if (!buf)
> +		return -ENOMEM;
> +
> +	if (devm_add_action_or_reset(cxlds->dev, cxl_mem_free_event_buffer,
> +				     buf))
> +		return -ENOMEM;
> +
> +	cxlds->event.buf = buf;
> +	return 0;
> +}
> +
>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
> +	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
>  	struct cxl_register_map map;
>  	struct cxl_memdev *cxlmd;
>  	struct cxl_dev_state *cxlds;
> @@ -494,6 +523,17 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> +	rc = cxl_mem_alloc_event_buf(cxlds);
> +	if (rc)
> +		return rc;
> +
> +	/*
> +	 * When BIOS maintains CXL error reporting control, it will process
> +	 * event records.  Only one agent can do so.
> +	 */
> +	if (host_bridge->native_cxl_error)
> +		cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +
>  	if (cxlds->regs.ras) {
>  		pci_enable_pcie_error_reporting(pdev);
>  		rc = devm_add_action_or_reset(&pdev->dev, disable_aer, pdev);
> -- 
> 2.37.2
> 
> 



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load
  2022-12-13  6:49   ` johnny
@ 2022-12-13 18:56     ` Ira Weiny
  0 siblings, 0 replies; 36+ messages in thread
From: Ira Weiny @ 2022-12-13 18:56 UTC (permalink / raw)
  To: johnny
  Cc: Dan Williams, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

On Tue, Dec 13, 2022 at 02:49:02PM +0800, johnny wrote:
> On Sun, Dec 11, 2022 at 11:06:20PM -0800, ira.weiny@intel.com wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 

[snip]

> > +
> > +#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
> > +struct cxl_event_record_raw {
> > +	struct cxl_event_record_hdr hdr;
> > +	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
> > +} __packed;
> > +
> > +/*
> > + * Get Event Records output payload
> > + * CXL rev 3.0 section 8.2.9.2.2; Table 8-50
> > + */
> > +#define CXL_GET_EVENT_FLAG_OVERFLOW		BIT(0)
> > +#define CXL_GET_EVENT_FLAG_MORE_RECORDS		BIT(1)
> I don't see any code that consumes this More Event Records flag; am I missing something?
> The device sets this flag when a single output payload cannot fit all of the records.

I should have removed this flag and noted that in the cover letter.  I left
it in for completeness, but you are correct that it is unused.

We determined back in V1 that the More Event Records bit was useless in this
particular looping of Get Event Records.[1]

The net-net is that if the driver does not see the number of records go to 0,
it can't be sure it will get an interrupt for the next set of events.  Therefore
it loops until it sees the number of records go to 0.

Ira

[1] https://lore.kernel.org/all/Y4blpk%2FesXJMe79Y@iweiny-desk3/

> > +struct cxl_get_event_payload {
> > +	u8 flags;
> > +	u8 reserved1;
> > +	__le16 overflow_err_count;
> > +	__le64 first_overflow_timestamp;
> > +	__le64 last_overflow_timestamp;
> > +	__le16 record_count;
> > +	u8 reserved2[10];
> > +	struct cxl_event_record_raw records[];
> > +} __packed;
> > +
> > +/*
> > + * CXL rev 3.0 section 8.2.9.2.2; Table 8-49
> > + */
> > +enum cxl_event_log_type {
> > +	CXL_EVENT_TYPE_INFO = 0x00,
> > +	CXL_EVENT_TYPE_WARN,
> > +	CXL_EVENT_TYPE_FAIL,
> > +	CXL_EVENT_TYPE_FATAL,
> > +	CXL_EVENT_TYPE_MAX
> > +};
> > +
> > +/*
> > + * Clear Event Records input payload
> > + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> > + */
> > +#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
> > +struct cxl_mbox_clear_event_payload {
> > +	u8 event_log;		/* enum cxl_event_log_type */
> > +	u8 clear_flags;
> > +	u8 nr_recs;
> > +	u8 reserved[3];
> > +	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];
> > +} __packed;
> > +#define CXL_CLEAR_EVENT_LIMIT_HANDLES(payload_size)			\
> > +	(((payload_size) -						\
> > +		(sizeof(struct cxl_mbox_clear_event_payload) -		\
> > +		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
> > +		sizeof(__le16))
> > +
> >  struct cxl_mbox_get_partition_info {
> >  	__le64 active_volatile_cap;
> >  	__le64 active_persistent_cap;
> > @@ -441,6 +524,7 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
> >  struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
> >  void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> >  void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> > +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status);
> >  #ifdef CONFIG_CXL_SUSPEND
> >  void cxl_mem_active_inc(void);
> >  void cxl_mem_active_dec(void);
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 3a66aadb4df0..a2d8382bc593 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -417,8 +417,37 @@ static void disable_aer(void *pdev)
> >  	pci_disable_pcie_error_reporting(pdev);
> >  }
> >  
> > +static void cxl_mem_free_event_buffer(void *buf)
> > +{
> > +	kvfree(buf);
> > +}
> > +
> > +/*
> > + * There is a single buffer for reading event logs from the mailbox.  All logs
> > + * share this buffer protected by the cxlds->event_log_lock.
> > + */
> > +static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> > +{
> > +	struct cxl_get_event_payload *buf;
> > +
> > +	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
> > +		cxlds->payload_size);
> > +
> > +	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> > +	if (!buf)
> > +		return -ENOMEM;
> > +
> > +	if (devm_add_action_or_reset(cxlds->dev, cxl_mem_free_event_buffer,
> > +				     buf))
> > +		return -ENOMEM;
> > +
> > +	cxlds->event.buf = buf;
> > +	return 0;
> > +}
> > +
> >  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  {
> > +	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> >  	struct cxl_register_map map;
> >  	struct cxl_memdev *cxlmd;
> >  	struct cxl_dev_state *cxlds;
> > @@ -494,6 +523,17 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (IS_ERR(cxlmd))
> >  		return PTR_ERR(cxlmd);
> >  
> > +	rc = cxl_mem_alloc_event_buf(cxlds);
> > +	if (rc)
> > +		return rc;
> > +
> > +	/*
> > +	 * When BIOS maintains CXL error reporting control, it will process
> > +	 * event records.  Only one agent can do so.
> > +	 */
> > +	if (host_bridge->native_cxl_error)
> > +		cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > +
> >  	if (cxlds->regs.ras) {
> >  		pci_enable_pcie_error_reporting(pdev);
> >  		rc = devm_add_action_or_reset(&pdev->dev, disable_aer, pdev);
> > -- 
> > 2.37.2
> > 
> > 
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH V4 1/9] PCI/CXL: Export native CXL error reporting control
  2022-12-12  7:06 ` [PATCH V4 1/9] PCI/CXL: Export native CXL error reporting control ira.weiny
@ 2022-12-13 19:12   ` Dan Williams
  2022-12-16 14:09   ` Jonathan Cameron
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 36+ messages in thread
From: Dan Williams @ 2022-12-13 19:12 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Ira Weiny, Bjorn Helgaas, Lukas Wunner, linux-pci, linux-acpi,
	Alison Schofield, Vishal Verma, Davidlohr Bueso,
	Jonathan Cameron, Dave Jiang, linux-kernel, linux-cxl

ira.weiny@ wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL _OSC Error Reporting Control is used by the OS to determine if
> Firmware has control of various CXL error reporting capabilities
> including the event logs.
> 
> Expose the result of negotiating CXL Error Reporting Control in struct
> pci_host_bridge for consumption by the CXL drivers.
> 
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Lukas Wunner <lukas@wunner.de>
> Cc: linux-pci@vger.kernel.org
> Cc: linux-acpi@vger.kernel.org
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Looks good to me,

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

> 
> ---
> Changes from V3:
> 	New patch split out
> ---
>  drivers/acpi/pci_root.c | 3 +++
>  drivers/pci/probe.c     | 1 +
>  include/linux/pci.h     | 1 +
>  3 files changed, 5 insertions(+)
> 
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index b3c202d2a433..84030804a763 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -1047,6 +1047,9 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
>  	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
>  		host_bridge->native_dpc = 0;
>  
> +	if (!(root->osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL))
> +		host_bridge->native_cxl_error = 0;
> +
>  	/*
>  	 * Evaluate the "PCI Boot Configuration" _DSM Function.  If it
>  	 * exists and returns 0, we must preserve any PCI resource
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 2f4e88a44e8b..34c9fd6840c4 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -596,6 +596,7 @@ static void pci_init_host_bridge(struct pci_host_bridge *bridge)
>  	bridge->native_ltr = 1;
>  	bridge->native_dpc = 1;
>  	bridge->domain_nr = PCI_DOMAIN_NR_NOT_SET;
> +	bridge->native_cxl_error = 1;
>  
>  	device_initialize(&bridge->dev);
>  }
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 1f81807492ef..08c3ccd2617b 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -577,6 +577,7 @@ struct pci_host_bridge {
>  	unsigned int	native_pme:1;		/* OS may use PCIe PME */
>  	unsigned int	native_ltr:1;		/* OS may use PCIe LTR */
>  	unsigned int	native_dpc:1;		/* OS may use PCIe DPC */
> +	unsigned int	native_cxl_error:1;	/* OS may use CXL RAS/Events */
>  	unsigned int	preserve_config:1;	/* Preserve FW resource setup */
>  	unsigned int	size_windows:1;		/* Enable root bus sizing */
>  	unsigned int	msi_domain:1;		/* Bridge wants MSI domain */
> -- 
> 2.37.2
> 



^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH V4 3/9] cxl/mem: Wire up event interrupts
  2022-12-12  7:06 ` [PATCH V4 3/9] cxl/mem: Wire up event interrupts ira.weiny
@ 2022-12-13 20:15   ` Dan Williams
  2022-12-16 14:24   ` Jonathan Cameron
  2022-12-16 18:21   ` Jonathan Cameron
  2 siblings, 0 replies; 36+ messages in thread
From: Dan Williams @ 2022-12-13 20:15 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Davidlohr Bueso, Bjorn Helgaas, Jonathan Cameron, Ira Weiny,
	Bjorn Helgaas, Alison Schofield, Vishal Verma, Dave Jiang,
	linux-kernel, linux-pci, linux-acpi, linux-cxl

ira.weiny@ wrote:
> From: Davidlohr Bueso <dave@stgolabs.net>
> 
> Currently the only CXL features targeted for irq support require their
> message numbers to be within the first 16 entries.  The device may
> however support less than 16 entries depending on the support it
> provides.
> 
> Attempt to allocate these 16 irq vectors.  If the device supports less
> then the PCI infrastructure will allocate that number.  Upon successful
> allocation, users can plug in their respective isr at any point
> thereafter.
> 
> CXL device events are signaled via interrupts.  Each event log may have
> a different interrupt message number.  These message numbers are
> reported in the Get Event Interrupt Policy mailbox command.
> 
> Add interrupt support for event logs.  Interrupts are allocated as
> shared interrupts.  Therefore, all or some event logs can share the same
> message number.
> 
> In addition all logs are queried on any interrupt in order of the most
> to least severe based on the status register.
> 
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> 
> ---
> Changes from V3:
> 	Adjust based on changes in patch 1
> 	Consolidate event setup into cxl_event_config()
> 	Consistently use cxl_event_* for function names
> 	Remove cxl_event_int_is_msi()
> 	Ensure DCD log is ignored in status
> 	Simplify event status loop logic
> 	Dan
> 		Fail driver load if the irq's are not allocated
> 		move cxl_event_config_msgnums() to pci.c
> 		s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_MAX_VECTORS
> 		s/devm_kmalloc/devm_kzalloc
> 		Fix up pci_alloc_irq_vectors() comment
> 		Pass pdev to cxl_alloc_irq_vectors()
> 		Check FW irq policy prior to configuration
> 	Jonathan
> 		Use FIELD_GET instead of manual masking
> ---
>  drivers/cxl/cxl.h    |   4 +
>  drivers/cxl/cxlmem.h |  19 ++++
>  drivers/cxl/cxlpci.h |   6 ++
>  drivers/cxl/pci.c    | 208 +++++++++++++++++++++++++++++++++++++++++--
>  4 files changed, 231 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 5974d1082210..b3964149c77b 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -168,6 +168,10 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
>  				 CXLDEV_EVENT_STATUS_FAIL |	\
>  				 CXLDEV_EVENT_STATUS_FATAL)
>  
> +/* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
> +#define CXLDEV_EVENT_INT_MODE_MASK	GENMASK(1, 0)
> +#define CXLDEV_EVENT_INT_MSGNUM_MASK	GENMASK(7, 4)
> +
>  /* CXL 2.0 8.2.8.4 Mailbox Registers */
>  #define CXLDEV_MBOX_CAPS_OFFSET 0x00
>  #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index dd9aa3dd738e..bd8bfbe61ec8 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -194,6 +194,23 @@ struct cxl_endpoint_dvsec_info {
>  	struct range dvsec_range[2];
>  };
>  
> +/**
> + * Event Interrupt Policy
> + *
> + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> + */
> +enum cxl_event_int_mode {
> +	CXL_INT_NONE		= 0x00,
> +	CXL_INT_MSI_MSIX	= 0x01,
> +	CXL_INT_FW		= 0x02
> +};
> +struct cxl_event_interrupt_policy {
> +	u8 info_settings;
> +	u8 warn_settings;
> +	u8 failure_settings;
> +	u8 fatal_settings;
> +} __packed;
> +
>  /**
>   * struct cxl_event_state - Event log driver state
>   *
> @@ -288,6 +305,8 @@ enum cxl_opcode {
>  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
>  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
>  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
>  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
>  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
>  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 77dbdb980b12..a8ea04f536ab 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -53,6 +53,12 @@
>  #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
>  #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
>  
> +/*
> + * NOTE: Currently all the functions which are enabled for CXL require their
> + * vectors to be in the first 16.  Use this as the default max.
> + */
> +#define CXL_PCI_DEFAULT_MAX_VECTORS 16
> +
>  /* Register Block Identifier (RBI) */
>  enum cxl_regloc_type {
>  	CXL_REGLOC_RBI_EMPTY = 0,
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index a2d8382bc593..d42d87faddb8 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -445,6 +445,201 @@ static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
>  	return 0;
>  }
>  
> +static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> +{
> +	int nvecs;
> +
> +	/*
> +	 * CXL requires MSI/MSIX support.

Spec reference please.

"Per CXL 3.0 3.1.1 CXL.io Endpoint a function on a CXL device must not
generate INTx messages if that function participates in CXL.cache or
CXL.mem."

> +	 *
> +	 * Additionally pci_alloc_irq_vectors() handles calling
> +	 * pci_free_irq_vectors() automatically despite not being called
> +	 * pcim_*.  See pci_setup_msi_context().
> +	 */
> +	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_DEFAULT_MAX_VECTORS,
> +				      PCI_IRQ_MSIX | PCI_IRQ_MSI);
> +	if (nvecs < 1) {
> +		dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
> +		return -ENXIO;
> +	}
> +	return 0;
> +}
> +
> +struct cxl_dev_id {
> +	struct cxl_dev_state *cxlds;
> +};
> +
> +static irqreturn_t cxl_event_thread(int irq, void *id)
> +{
> +	struct cxl_dev_id *dev_id = id;
> +	struct cxl_dev_state *cxlds = dev_id->cxlds;
> +	u32 status;
> +
> +	do {
> +		/*
> +		 * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
> +		 * ignore the reserved upper 32 bits
> +		 */
> +		status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> +		/* Ignore logs unknown to the driver */
> +		status &= CXLDEV_EVENT_STATUS_ALL;
> +		if (!status)
> +			break;
> +		cxl_mem_get_event_records(cxlds, status);
> +		cond_resched();
> +	} while (status);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
> +{
> +	struct device *dev = cxlds->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct cxl_dev_id *dev_id;
> +	int irq;
> +
> +	if (FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting) != CXL_INT_MSI_MSIX)
> +		return -ENXIO;
> +
> +	/* dev_id must be globally unique and must contain the cxlds */
> +	dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
> +	if (!dev_id)
> +		return -ENOMEM;
> +	dev_id->cxlds = cxlds;
> +
> +	irq =  pci_irq_vector(pdev,
> +			      FIELD_GET(CXLDEV_EVENT_INT_MSGNUM_MASK, setting));
> +	if (irq < 0)
> +		return irq;
> +
> +	return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> +					 IRQF_SHARED, NULL, dev_id);
> +}
> +
> +static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
> +				    struct cxl_event_interrupt_policy *policy)
> +{
> +	struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
> +		.payload_out = policy,
> +		.size_out = sizeof(*policy),
> +	};
> +	int rc;
> +
> +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	if (rc < 0)
> +		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> +			rc);
> +
> +	return rc;
> +}
> +
> +static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> +				    struct cxl_event_interrupt_policy *policy)
> +{
> +	struct cxl_mbox_cmd mbox_cmd;
> +	int rc;
> +
> +	policy->info_settings = CXL_INT_MSI_MSIX;
> +	policy->warn_settings = CXL_INT_MSI_MSIX;
> +	policy->failure_settings = CXL_INT_MSI_MSIX;
> +	policy->fatal_settings = CXL_INT_MSI_MSIX;

I prefer this syntax for command payload initialization if only because
it 0-fills all fields:

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index d42d87faddb8..e8b4bd5e517f 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -541,10 +541,12 @@ static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
        struct cxl_mbox_cmd mbox_cmd;
        int rc;
 
-       policy->info_settings = CXL_INT_MSI_MSIX;
-       policy->warn_settings = CXL_INT_MSI_MSIX;
-       policy->failure_settings = CXL_INT_MSI_MSIX;
-       policy->fatal_settings = CXL_INT_MSI_MSIX;
+       *policy = (struct cxl_event_interrupt_policy) {
+               .info_settings = CXL_INT_MSI_MSIX,
+               .warn_settings = CXL_INT_MSI_MSIX,
+               .failure_settings = CXL_INT_MSI_MSIX,
+               .fatal_settings = CXL_INT_MSI_MSIX,
+       };
 
        mbox_cmd = (struct cxl_mbox_cmd) {
                .opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,

...but that can be a follow-on fixup because I am not seeing much else
that needs a respin of this patch.

> +
> +	mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
> +		.payload_in = policy,
> +		.size_in = sizeof(*policy),
> +	};
> +
> +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	if (rc < 0) {
> +		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> +			rc);
> +		return rc;
> +	}
> +
> +	/* Retrieve final interrupt settings */
> +	return cxl_event_get_int_policy(cxlds, policy);
> +}
> +
> +static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_event_interrupt_policy policy;
> +	int rc;
> +
> +	rc = cxl_event_config_msgnums(cxlds, &policy);
> +	if (rc)
> +		return rc;
> +
> +	rc = cxl_event_req_irq(cxlds, policy.info_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
> +		return rc;
> +	}
> +
> +	rc = cxl_event_req_irq(cxlds, policy.warn_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
> +		return rc;
> +	}
> +
> +	rc = cxl_event_req_irq(cxlds, policy.failure_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
> +		return rc;
> +	}
> +
> +	rc = cxl_event_req_irq(cxlds, policy.fatal_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
> +		return rc;
> +	}
> +
> +	return 0;
> +}
> +
> +static bool cxl_event_int_is_fw(u8 setting)
> +{
> +	u8 mode = FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting);
> +
> +	return mode == CXL_INT_FW;
> +}
> +
> +static int cxl_event_config(struct pci_host_bridge *host_bridge,
> +			    struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_event_interrupt_policy policy;
> +	int rc;
> +
> +	/*
> +	 * When BIOS maintains CXL error reporting control, it will process
> +	 * event records.  Only one agent can do so.
> +	 */
> +	if (!host_bridge->native_cxl_error)
> +		return 0;
> +
> +	rc = cxl_event_get_int_policy(cxlds, &policy);
> +	if (rc)
> +		return rc;
> +
> +	if (cxl_event_int_is_fw(policy.info_settings) ||
> +	    cxl_event_int_is_fw(policy.warn_settings) ||
> +	    cxl_event_int_is_fw(policy.failure_settings) ||
> +	    cxl_event_int_is_fw(policy.fatal_settings)) {
> +		dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
> +		return -EBUSY;
> +	}
> +
> +	rc = cxl_event_irqsetup(cxlds);
> +	if (rc)
> +		return rc;
> +
> +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +
> +	return 0;
> +}
> +
>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> @@ -519,6 +714,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> +	rc = cxl_alloc_irq_vectors(pdev);
> +	if (rc)
> +		return rc;
> +
>  	cxlmd = devm_cxl_add_memdev(cxlds);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
> @@ -527,12 +726,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> -	/*
> -	 * When BIOS maintains CXL error reporting control, it will process
> -	 * event records.  Only one agent can do so.
> -	 */
> -	if (host_bridge->native_cxl_error)
> -		cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +	rc = cxl_event_config(host_bridge, cxlds);
> +	if (rc)
> +		return rc;

In the future, it would help me if these kinds of mid-patch-set policy
changes, when they cannot be avoided, were at least called out in the
changelog. It saps reviewer resources to have to invalidate a previous
understanding within the same review session.

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 0/9] CXL: Process event logs
  2022-12-12  7:06 [PATCH V4 0/9] CXL: Process event logs ira.weiny
                   ` (8 preceding siblings ...)
  2022-12-12  7:06 ` [PATCH V4 9/9] cxl/test: Simulate event log overflow ira.weiny
@ 2022-12-16 12:25 ` Jonathan Cameron
  2022-12-16 17:01   ` Dan Williams
  9 siblings, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2022-12-16 12:25 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Dave Jiang, linux-kernel, linux-pci, linux-acpi,
	linux-cxl

On Sun, 11 Dec 2022 23:06:18 -0800
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> This code has been tested with a newer qemu which allows for more events to be
> returned at a time as well an additional QMP event and interrupt injection.
> Those patches will follow once they have been cleaned up.
> 
> The series is now in 3 parts:
> 
> 	1) Base functionality including interrupts
> 	2) Tracing specific events (Dynamic Capacity Event Record is defered)
> 	3) cxl-test infrastructure for basic tests
> 
> Changes from V3
> 	Feedback from Dan
> 	Spit out ACPI changes for Bjorn
> 
> - Link to v3: https://lore.kernel.org/all/20221208052115.800170-1-ira.weiny@intel.com/

Because I'm in a grumpy mood (as my colleagues will attest!)...
This is dependent on the patch that moves the trace definitions, and
that's not upstream yet except in cxl/preview, which is an optimistic
place to use for a base commit.  The base-commit id below isn't right
either; it isn't in mainline or the current CXL trees.

Not that I actually checked the cover letter until the series failed to
apply (and hence I already knew what was missing), but still, please call
out dependencies unless they are in the branches Dan has queued up to push.

I just want to play with Dave's fix for the RAS errors, so I'm having to
jump through these other sets first.

Thanks,

Jonathan

> 
> 
> Davidlohr Bueso (1):
>   cxl/mem: Wire up event interrupts
> 
> Ira Weiny (8):
>   PCI/CXL: Export native CXL error reporting control
>   cxl/mem: Read, trace, and clear events on driver load
>   cxl/mem: Trace General Media Event Record
>   cxl/mem: Trace DRAM Event Record
>   cxl/mem: Trace Memory Module Event Record
>   cxl/test: Add generic mock events
>   cxl/test: Add specific events
>   cxl/test: Simulate event log overflow
> 
>  drivers/acpi/pci_root.c       |   3 +
>  drivers/cxl/core/mbox.c       | 186 +++++++++++++
>  drivers/cxl/core/trace.h      | 479 ++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h             |  16 ++
>  drivers/cxl/cxlmem.h          | 171 ++++++++++++
>  drivers/cxl/cxlpci.h          |   6 +
>  drivers/cxl/pci.c             | 236 +++++++++++++++++
>  drivers/pci/probe.c           |   1 +
>  include/linux/pci.h           |   1 +
>  tools/testing/cxl/test/Kbuild |   2 +-
>  tools/testing/cxl/test/mem.c  | 352 +++++++++++++++++++++++++
>  11 files changed, 1452 insertions(+), 1 deletion(-)
> 
> 
> base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 1/9] PCI/CXL: Export native CXL error reporting control
  2022-12-12  7:06 ` [PATCH V4 1/9] PCI/CXL: Export native CXL error reporting control ira.weiny
  2022-12-13 19:12   ` Dan Williams
@ 2022-12-16 14:09   ` Jonathan Cameron
  2023-01-05  3:16   ` Ira Weiny
  2023-01-05 16:56   ` Bjorn Helgaas
  3 siblings, 0 replies; 36+ messages in thread
From: Jonathan Cameron @ 2022-12-16 14:09 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Bjorn Helgaas, Lukas Wunner, linux-pci, linux-acpi,
	Alison Schofield, Vishal Verma, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

On Sun, 11 Dec 2022 23:06:19 -0800
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL _OSC Error Reporting Control is used by the OS to determine if
> Firmware has control of various CXL error reporting capabilities
> including the event logs.
> 
> Expose the result of negotiating CXL Error Reporting Control in struct
> pci_host_bridge for consumption by the CXL drivers.
> 
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Lukas Wunner <lukas@wunner.de>
> Cc: linux-pci@vger.kernel.org
> Cc: linux-acpi@vger.kernel.org
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
> Changes from V3:
> 	New patch split out
> ---
>  drivers/acpi/pci_root.c | 3 +++
>  drivers/pci/probe.c     | 1 +
>  include/linux/pci.h     | 1 +
>  3 files changed, 5 insertions(+)
> 
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index b3c202d2a433..84030804a763 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -1047,6 +1047,9 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
>  	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
>  		host_bridge->native_dpc = 0;
>  
> +	if (!(root->osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL))
> +		host_bridge->native_cxl_error = 0;
> +
>  	/*
>  	 * Evaluate the "PCI Boot Configuration" _DSM Function.  If it
>  	 * exists and returns 0, we must preserve any PCI resource
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 2f4e88a44e8b..34c9fd6840c4 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -596,6 +596,7 @@ static void pci_init_host_bridge(struct pci_host_bridge *bridge)
>  	bridge->native_ltr = 1;
>  	bridge->native_dpc = 1;
>  	bridge->domain_nr = PCI_DOMAIN_NR_NOT_SET;
> +	bridge->native_cxl_error = 1;
>  
>  	device_initialize(&bridge->dev);
>  }
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 1f81807492ef..08c3ccd2617b 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -577,6 +577,7 @@ struct pci_host_bridge {
>  	unsigned int	native_pme:1;		/* OS may use PCIe PME */
>  	unsigned int	native_ltr:1;		/* OS may use PCIe LTR */
>  	unsigned int	native_dpc:1;		/* OS may use PCIe DPC */
> +	unsigned int	native_cxl_error:1;	/* OS may use CXL RAS/Events */
>  	unsigned int	preserve_config:1;	/* Preserve FW resource setup */
>  	unsigned int	size_windows:1;		/* Enable root bus sizing */
>  	unsigned int	msi_domain:1;		/* Bridge wants MSI domain */


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 3/9] cxl/mem: Wire up event interrupts
  2022-12-12  7:06 ` [PATCH V4 3/9] cxl/mem: Wire up event interrupts ira.weiny
  2022-12-13 20:15   ` Dan Williams
@ 2022-12-16 14:24   ` Jonathan Cameron
  2022-12-16 18:42     ` Jonathan Cameron
  2022-12-16 18:21   ` Jonathan Cameron
  2 siblings, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2022-12-16 14:24 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Davidlohr Bueso, Bjorn Helgaas, Bjorn Helgaas,
	Alison Schofield, Vishal Verma, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

On Sun, 11 Dec 2022 23:06:21 -0800
ira.weiny@intel.com wrote:

> From: Davidlohr Bueso <dave@stgolabs.net>
> 
> Currently the only CXL features targeted for irq support require their
> message numbers to be within the first 16 entries.  The device may
> however support less than 16 entries depending on the support it
> provides.
> 
> Attempt to allocate these 16 irq vectors.  If the device supports less
> then the PCI infrastructure will allocate that number.  Upon successful
> allocation, users can plug in their respective isr at any point
> thereafter.
> 
> CXL device events are signaled via interrupts.  Each event log may have
> a different interrupt message number.  These message numbers are
> reported in the Get Event Interrupt Policy mailbox command.
> 
> Add interrupt support for event logs.  Interrupts are allocated as
> shared interrupts.  Therefore, all or some event logs can share the same
> message number.
> 
> In addition all logs are queried on any interrupt in order of the most
> to least severe based on the status register.
> 
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> 

> +/**
> + * Event Interrupt Policy
> + *
> + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> + */
> +enum cxl_event_int_mode {
> +	CXL_INT_NONE		= 0x00,
> +	CXL_INT_MSI_MSIX	= 0x01,
> +	CXL_INT_FW		= 0x02
> +};
> +struct cxl_event_interrupt_policy {
> +	u8 info_settings;
> +	u8 warn_settings;
> +	u8 failure_settings;
> +	u8 fatal_settings;

This is an issue for your QEMU code, which has this set at 5 bytes.
I guess our handling of record lengths needs updating now that we have two
different spec versions to support, and hence these can have multiple lengths.

Btw, do you have an updated version of the QEMU patches you can share?
I was planning on just doing the AER-type RAS stuff for the first pull this
cycle, but given this set means we never reach that code, I probably need to
do QEMU support for this and the stuff to support those all in one go -
otherwise no one will be able to test it :)  We rather optimistically have
the _OSC set to say the OS can have control of these, but upstream code
doesn't emulate anything yet. Oops. We should have pretended the hardware
was handling them until we had this support in place in QEMU.

Jonathan

> +} __packed;
> +
>  /**
>   * struct cxl_event_state - Event log driver state
>   *
> @@ -288,6 +305,8 @@ enum cxl_opcode {
>  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
>  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
>  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
>  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
>  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
>  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load
  2022-12-12  7:06 ` [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load ira.weiny
  2022-12-13  6:49   ` johnny
@ 2022-12-16 15:39   ` Jonathan Cameron
  2022-12-16 21:54     ` Ira Weiny
  1 sibling, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2022-12-16 15:39 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Dave Jiang, linux-kernel, linux-pci, linux-acpi,
	linux-cxl

On Sun, 11 Dec 2022 23:06:20 -0800
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL devices have multiple event logs which can be queried for CXL event
> records.  Devices are required to support the storage of at least one
> event record in each event log type.
> 
> Devices track event log overflow by incrementing a counter and tracking
> the time of the first and last overflow event seen.
> 
> Software queries events via the Get Event Records mailbox command (CXL
> rev 3.0 section 8.2.9.2.2) and clears events via the Clear Event Records
> mailbox command (CXL rev 3.0 section 8.2.9.2.3).
> 
> If the result of negotiating CXL Error Reporting Control is OS control,
> read and clear all event logs on driver load.
> 
> Ensure a clean slate of events by reading and clearing the events on
> driver load.
> 
> The status register is not used because a device may continue to trigger
> events and the only requirement is to empty the log at least once.  This
> allows for the required transition from empty to non-empty for interrupt
> generation.  Handling of interrupts is in a follow on patch.
> 
> The device can return up to 1MB worth of event records per query.
> Allocate a large, shared buffer to handle the max number of records,
> based on the mailbox payload size.
> 
> This patch traces a raw event record and leaves specific event record
> type tracing to subsequent patches.  Macros are created to aid in
> tracing the common CXL Event header fields.
> 
> Each record is cleared explicitly.  A clear all bit is specified but is
> only valid when the log overflows.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

A few things noticed inline.  I've tightened the QEMU code to reject the
case where the input payload claims to be bigger than the mailbox size,
and hacked the size down to 256 bytes so it triggers the problem
highlighted below.

> 
> ---
> Changes from V3:
> 	Dan
> 		Split off _OSC pcie bits
> 			Use existing style for host bridge flag in that
> 			patch
> 		Clean up event processing loop
> 		Use dev_err_ratelimited()
> 		Clean up version change log
> 		Delete 'EVENT LOG OVERFLOW'
> 		Remove cxl_clear_event_logs()
> 		Add comment for native cxl control
> 		Fail driver load on event buf allocation failure
> 		Comment why events are not processed without _OSC flag
> ---
>  drivers/cxl/core/mbox.c  | 136 +++++++++++++++++++++++++++++++++++++++
>  drivers/cxl/core/trace.h | 120 ++++++++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h        |  12 ++++
>  drivers/cxl/cxlmem.h     |  84 ++++++++++++++++++++++++
>  drivers/cxl/pci.c        |  40 ++++++++++++
>  5 files changed, 392 insertions(+)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index b03fba212799..9fb327370e08 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c

> +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> +				  enum cxl_event_log_type log,
> +				  struct cxl_get_event_payload *get_pl)
> +{
> +	struct cxl_mbox_clear_event_payload payload = {
> +		.event_log = log,
> +	};
> +	u16 total = le16_to_cpu(get_pl->record_count);
> +	u8 max_handles = CXL_CLEAR_EVENT_MAX_HANDLES;
> +	size_t pl_size = sizeof(payload);
> +	struct cxl_mbox_cmd mbox_cmd;
> +	u16 cnt;
> +	int rc;
> +	int i;
> +
> +	/* Payload size may limit the max handles */
> +	if (pl_size > cxlds->payload_size) {
> +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
> +		pl_size = cxlds->payload_size;
> +	}
> +
> +	mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> +		.payload_in = &payload,
> +		.size_in = pl_size,

This payload size should be whatever we need to store the records,
not the max size possible.  Particularly as that size is currently
bigger than the mailbox might be.

It shouldn't fail (I think), simply because a later version of the spec might
add more to this message and things should still work, but it is definitely
not good practice to tell the hardware this is much longer than it actually is.


> +	};
> +
> +	/*
> +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> +	 * Record can return up to 0xffff records.
> +	 */
> +	i = 0;
> +	for (cnt = 0; cnt < total; cnt++) {
> +		payload.handle[i++] = get_pl->records[cnt].hdr.handle;
> +		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
> +			log, le16_to_cpu(payload.handle[i - 1]));
> +
> +		if (i == max_handles) {
> +			payload.nr_recs = i;
> +			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +			if (rc)
> +				return rc;
> +			i = 0;
> +		}
> +	}
> +
> +	/* Clear what is left if any */
> +	if (i) {
> +		payload.nr_recs = i;
> +		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +		if (rc)
> +			return rc;
> +	}
> +
> +	return 0;
> +}


...

> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index ab138004f644..dd9aa3dd738e 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h

...

> +
> +/*
> + * Clear Event Records input payload
> + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> + */
> +#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
> +struct cxl_mbox_clear_event_payload {
> +	u8 event_log;		/* enum cxl_event_log_type */
> +	u8 clear_flags;
> +	u8 nr_recs;
> +	u8 reserved[3];
> +	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];

Doesn't fit in the smallest possible payload buffer.
It's 516 bytes long, while the payload buffer might be only 256 bytes in
total (8.2.8.4.3 Mailbox Capabilities).

Lazy approach: make this smaller and do more loops when clearing.
If we want to optimize this later, we can expand it back to this size.

> +} __packed;
> +#define CXL_CLEAR_EVENT_LIMIT_HANDLES(payload_size)			\
> +	(((payload_size) -						\
> +		(sizeof(struct cxl_mbox_clear_event_payload) -		\
> +		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
> +		sizeof(__le16))
> +

...


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 0/9] CXL: Process event logs
  2022-12-16 12:25 ` [PATCH V4 0/9] CXL: Process event logs Jonathan Cameron
@ 2022-12-16 17:01   ` Dan Williams
  2022-12-16 18:15     ` Ira Weiny
  2022-12-16 18:39     ` Jonathan Cameron
  0 siblings, 2 replies; 36+ messages in thread
From: Dan Williams @ 2022-12-16 17:01 UTC (permalink / raw)
  To: Jonathan Cameron, ira.weiny
  Cc: Dan Williams, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Dave Jiang, linux-kernel, linux-pci, linux-acpi,
	linux-cxl

Jonathan Cameron wrote:
> On Sun, 11 Dec 2022 23:06:18 -0800
> ira.weiny@intel.com wrote:
> 
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > This code has been tested with a newer qemu which allows for more events to be
> > returned at a time, as well as an additional QMP event and interrupt injection.
> > Those patches will follow once they have been cleaned up.
> > 
> > The series is now in 3 parts:
> > 
> > 	1) Base functionality including interrupts
> > 	2) Tracing specific events (Dynamic Capacity Event Record is deferred)
> > 	3) cxl-test infrastructure for basic tests
> > 
> > Changes from V3
> > 	Feedback from Dan
> > 	Split out ACPI changes for Bjorn
> > 
> > - Link to v3: https://lore.kernel.org/all/20221208052115.800170-1-ira.weiny@intel.com/
> 
> Because I'm in a grumpy mood (as my colleagues will attest!)...
> This is dependent on the patch that moves the trace definitions, and
> that's not upstream yet except in cxl/preview, which is an optimistic
> place to use for a base commit.  The id isn't the one below either, which
> is in neither mainline nor the current CXL trees.

I do not want to commit to a new baseline until after -rc1, so yes this
is in a messy period.

> Not that I actually checked the cover letter until it failed to apply
> (and hence already knew what was missing), but still, please call out
> dependencies unless they are in the branches Dan has queued up to push.
> 
> I just want to play with Dave's fix for the RAS errors, so I'm having to
> jump through these other sets to get there.

Yes, that is annoying, apologies.

> 
> Thanks,
> 
> Jonathan
> 
> > 
> > 
> > Davidlohr Bueso (1):
> >   cxl/mem: Wire up event interrupts
> > 
> > Ira Weiny (8):
> >   PCI/CXL: Export native CXL error reporting control
> >   cxl/mem: Read, trace, and clear events on driver load
> >   cxl/mem: Trace General Media Event Record
> >   cxl/mem: Trace DRAM Event Record
> >   cxl/mem: Trace Memory Module Event Record
> >   cxl/test: Add generic mock events
> >   cxl/test: Add specific events
> >   cxl/test: Simulate event log overflow
> > 
> >  drivers/acpi/pci_root.c       |   3 +
> >  drivers/cxl/core/mbox.c       | 186 +++++++++++++
> >  drivers/cxl/core/trace.h      | 479 ++++++++++++++++++++++++++++++++++
> >  drivers/cxl/cxl.h             |  16 ++
> >  drivers/cxl/cxlmem.h          | 171 ++++++++++++
> >  drivers/cxl/cxlpci.h          |   6 +
> >  drivers/cxl/pci.c             | 236 +++++++++++++++++
> >  drivers/pci/probe.c           |   1 +
> >  include/linux/pci.h           |   1 +
> >  tools/testing/cxl/test/Kbuild |   2 +-
> >  tools/testing/cxl/test/mem.c  | 352 +++++++++++++++++++++++++
> >  11 files changed, 1452 insertions(+), 1 deletion(-)
> > 
> > 
> > base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e
> 

I think going forward these base-commits need to be commits that are
reachable on cxl.git. For now I have pushed out a baseline for both Dave
and Ira's patches to cxl/preview which will rebase after -rc1 comes out.

Just the small matter of needing some acks/reviews on those lead-in
patches so I can move them through cxl/pending to cxl/next:

http://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com
http://lore.kernel.org/r/20221212070627.1372402-2-ira.weiny@intel.com

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 0/9] CXL: Process event logs
  2022-12-16 17:01   ` Dan Williams
@ 2022-12-16 18:15     ` Ira Weiny
  2022-12-16 18:39     ` Jonathan Cameron
  1 sibling, 0 replies; 36+ messages in thread
From: Ira Weiny @ 2022-12-16 18:15 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jonathan Cameron, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Dave Jiang, linux-kernel, linux-pci, linux-acpi,
	linux-cxl

On Fri, Dec 16, 2022 at 09:01:13AM -0800, Dan Williams wrote:
> Jonathan Cameron wrote:
> > On Sun, 11 Dec 2022 23:06:18 -0800
> > ira.weiny@intel.com wrote:
> > 
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > This code has been tested with a newer qemu which allows for more events to be
> > > returned at a time, as well as an additional QMP event and interrupt injection.
> > > Those patches will follow once they have been cleaned up.
> > > 
> > > The series is now in 3 parts:
> > > 
> > > 	1) Base functionality including interrupts
> > > 	2) Tracing specific events (Dynamic Capacity Event Record is deferred)
> > > 	3) cxl-test infrastructure for basic tests
> > > 
> > > Changes from V3
> > > 	Feedback from Dan
> > > 	Split out ACPI changes for Bjorn
> > > 
> > > - Link to v3: https://lore.kernel.org/all/20221208052115.800170-1-ira.weiny@intel.com/
> > 
> > Because I'm in a grumpy mood (as my colleagues will attest!)...
> > This is dependent on the patch that moves the trace definitions, and
> > that's not upstream yet except in cxl/preview, which is an optimistic
> > place to use for a base commit.  The id isn't the one below either, which
> > is in neither mainline nor the current CXL trees.
> 
> I do not want to commit to a new baseline until after -rc1, so yes this
> is in a messy period.
> 
> > Not that I actually checked the cover letter until it failed to apply
> > (and hence already knew what was missing), but still, please call out
> > dependencies unless they are in the branches Dan has queued up to push.
> > 
> > I just want to play with Dave's fix for the RAS errors, so I'm having to
> > jump through these other sets to get there.
> 
> Yes, that is annoying, apologies.
> 
> > 
> > Thanks,
> > 
> > Jonathan
> > 
> > > 
> > > 
> > > Davidlohr Bueso (1):
> > >   cxl/mem: Wire up event interrupts
> > > 
> > > Ira Weiny (8):
> > >   PCI/CXL: Export native CXL error reporting control
> > >   cxl/mem: Read, trace, and clear events on driver load
> > >   cxl/mem: Trace General Media Event Record
> > >   cxl/mem: Trace DRAM Event Record
> > >   cxl/mem: Trace Memory Module Event Record
> > >   cxl/test: Add generic mock events
> > >   cxl/test: Add specific events
> > >   cxl/test: Simulate event log overflow
> > > 
> > >  drivers/acpi/pci_root.c       |   3 +
> > >  drivers/cxl/core/mbox.c       | 186 +++++++++++++
> > >  drivers/cxl/core/trace.h      | 479 ++++++++++++++++++++++++++++++++++
> > >  drivers/cxl/cxl.h             |  16 ++
> > >  drivers/cxl/cxlmem.h          | 171 ++++++++++++
> > >  drivers/cxl/cxlpci.h          |   6 +
> > >  drivers/cxl/pci.c             | 236 +++++++++++++++++
> > >  drivers/pci/probe.c           |   1 +
> > >  include/linux/pci.h           |   1 +
> > >  tools/testing/cxl/test/Kbuild |   2 +-
> > >  tools/testing/cxl/test/mem.c  | 352 +++++++++++++++++++++++++
> > >  11 files changed, 1452 insertions(+), 1 deletion(-)
> > > 
> > > 
> > > base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e
> > 
> 
> I think going forward these base-commits need to be commits that are
> reachable on cxl.git.

Agreed.  I thought this was in preview.  But even preview is not stable and I
should have waited and asked to see this land in next first.

Ira

> For now I have pushed out a baseline for both Dave
> and Ira's patches to cxl/preview which will rebase after -rc1 comes out.
> 
> Just the small matter of needing some acks/reviews on those lead-in
> patches so I can move them through cxl/pending to cxl/next:
> 
> http://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com
> http://lore.kernel.org/r/20221212070627.1372402-2-ira.weiny@intel.com

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 3/9] cxl/mem: Wire up event interrupts
  2022-12-12  7:06 ` [PATCH V4 3/9] cxl/mem: Wire up event interrupts ira.weiny
  2022-12-13 20:15   ` Dan Williams
  2022-12-16 14:24   ` Jonathan Cameron
@ 2022-12-16 18:21   ` Jonathan Cameron
  2022-12-16 21:33     ` Ira Weiny
  2 siblings, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2022-12-16 18:21 UTC (permalink / raw)
  To: ira.weiny, Alison Schofield
  Cc: Dan Williams, Davidlohr Bueso, Bjorn Helgaas, Bjorn Helgaas,
	Vishal Verma, Dave Jiang, linux-kernel, linux-pci, linux-acpi,
	linux-cxl

On Sun, 11 Dec 2022 23:06:21 -0800
ira.weiny@intel.com wrote:

> From: Davidlohr Bueso <dave@stgolabs.net>
> 
> Currently the only CXL features targeted for irq support require their
> message numbers to be within the first 16 entries.  The device may,
> however, support fewer than 16 entries depending on the support it
> provides.
> 
> Attempt to allocate these 16 irq vectors.  If the device supports fewer,
> the PCI infrastructure will allocate that number.  Upon successful
> allocation, users can plug in their respective isr at any point
> thereafter.
> 
> CXL device events are signaled via interrupts.  Each event log may have
> a different interrupt message number.  These message numbers are
> reported in the Get Event Interrupt Policy mailbox command.
> 
> Add interrupt support for event logs.  Interrupts are allocated as
> shared interrupts.  Therefore, all or some event logs can share the same
> message number.
> 
> In addition, all logs are queried on any interrupt, in order from most
> to least severe, based on the status register.
> 
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>

Sometimes it feels like we go around in circles, as I'm sure we've
fixed this at least 3 times now in different sets that got dropped
(it was definitely in the DOE interrupt support set, and was controversial ;)

Nothing in this patch set calls pci_set_master(), so no interrupts
will even be delivered.  I don't think there are any sets in flight
that would fix that.

Jonathan


> 
> ---
> Changes from V3:
> 	Adjust based on changes in patch 1
> 	Consolidate event setup into cxl_event_config()
> 	Consistently use cxl_event_* for function names
> 	Remove cxl_event_int_is_msi()
> 	Ensure DCD log is ignored in status
> 	Simplify event status loop logic
> 	Dan
> 		Fail driver load if the irq's are not allocated
> 		move cxl_event_config_msgnums() to pci.c
> 		s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_MAX_VECTORS
> 		s/devm_kmalloc/devm_kzalloc
> 		Fix up pci_alloc_irq_vectors() comment
> 		Pass pdev to cxl_alloc_irq_vectors()
> 		Check FW irq policy prior to configuration
> 	Jonathan
> 		Use FIELD_GET instead of manual masking
> ---
>  drivers/cxl/cxl.h    |   4 +
>  drivers/cxl/cxlmem.h |  19 ++++
>  drivers/cxl/cxlpci.h |   6 ++
>  drivers/cxl/pci.c    | 208 +++++++++++++++++++++++++++++++++++++++++--
>  4 files changed, 231 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 5974d1082210..b3964149c77b 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -168,6 +168,10 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
>  				 CXLDEV_EVENT_STATUS_FAIL |	\
>  				 CXLDEV_EVENT_STATUS_FATAL)
>  
> +/* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
> +#define CXLDEV_EVENT_INT_MODE_MASK	GENMASK(1, 0)
> +#define CXLDEV_EVENT_INT_MSGNUM_MASK	GENMASK(7, 4)
> +
>  /* CXL 2.0 8.2.8.4 Mailbox Registers */
>  #define CXLDEV_MBOX_CAPS_OFFSET 0x00
>  #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index dd9aa3dd738e..bd8bfbe61ec8 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -194,6 +194,23 @@ struct cxl_endpoint_dvsec_info {
>  	struct range dvsec_range[2];
>  };
>  
> +/**
> + * Event Interrupt Policy
> + *
> + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> + */
> +enum cxl_event_int_mode {
> +	CXL_INT_NONE		= 0x00,
> +	CXL_INT_MSI_MSIX	= 0x01,
> +	CXL_INT_FW		= 0x02
> +};
> +struct cxl_event_interrupt_policy {
> +	u8 info_settings;
> +	u8 warn_settings;
> +	u8 failure_settings;
> +	u8 fatal_settings;
> +} __packed;
> +
>  /**
>   * struct cxl_event_state - Event log driver state
>   *
> @@ -288,6 +305,8 @@ enum cxl_opcode {
>  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
>  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
>  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
>  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
>  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
>  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 77dbdb980b12..a8ea04f536ab 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -53,6 +53,12 @@
>  #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
>  #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
>  
> +/*
> + * NOTE: Currently all the functions which are enabled for CXL require their
> + * vectors to be in the first 16.  Use this as the default max.
> + */
> +#define CXL_PCI_DEFAULT_MAX_VECTORS 16
> +
>  /* Register Block Identifier (RBI) */
>  enum cxl_regloc_type {
>  	CXL_REGLOC_RBI_EMPTY = 0,
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index a2d8382bc593..d42d87faddb8 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -445,6 +445,201 @@ static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
>  	return 0;
>  }
>  
> +static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> +{
> +	int nvecs;
> +
> +	/*
> +	 * CXL requires MSI/MSIX support.
> +	 *
> +	 * Additionally pci_alloc_irq_vectors() handles calling
> +	 * pci_free_irq_vectors() automatically despite not being called
> +	 * pcim_*.  See pci_setup_msi_context().
> +	 */
> +	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_DEFAULT_MAX_VECTORS,
> +				      PCI_IRQ_MSIX | PCI_IRQ_MSI);
> +	if (nvecs < 1) {
> +		dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
> +		return -ENXIO;
> +	}
> +	return 0;
> +}
> +
> +struct cxl_dev_id {
> +	struct cxl_dev_state *cxlds;
> +};
> +
> +static irqreturn_t cxl_event_thread(int irq, void *id)
> +{
> +	struct cxl_dev_id *dev_id = id;
> +	struct cxl_dev_state *cxlds = dev_id->cxlds;
> +	u32 status;
> +
> +	do {
> +		/*
> +		 * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
> +		 * ignore the reserved upper 32 bits
> +		 */
> +		status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> +		/* Ignore logs unknown to the driver */
> +		status &= CXLDEV_EVENT_STATUS_ALL;
> +		if (!status)
> +			break;
> +		cxl_mem_get_event_records(cxlds, status);
> +		cond_resched();
> +	} while (status);
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
> +{
> +	struct device *dev = cxlds->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct cxl_dev_id *dev_id;
> +	int irq;
> +
> +	if (FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting) != CXL_INT_MSI_MSIX)
> +		return -ENXIO;
> +
> +	/* dev_id must be globally unique and must contain the cxlds */
> +	dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
> +	if (!dev_id)
> +		return -ENOMEM;
> +	dev_id->cxlds = cxlds;
> +
> +	irq =  pci_irq_vector(pdev,
> +			      FIELD_GET(CXLDEV_EVENT_INT_MSGNUM_MASK, setting));
> +	if (irq < 0)
> +		return irq;
> +
> +	return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> +					 IRQF_SHARED, NULL, dev_id);
> +}
> +
> +static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
> +				    struct cxl_event_interrupt_policy *policy)
> +{
> +	struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
> +		.payload_out = policy,
> +		.size_out = sizeof(*policy),
> +	};
> +	int rc;
> +
> +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	if (rc < 0)
> +		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> +			rc);
> +
> +	return rc;
> +}
> +
> +static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> +				    struct cxl_event_interrupt_policy *policy)
> +{
> +	struct cxl_mbox_cmd mbox_cmd;
> +	int rc;
> +
> +	policy->info_settings = CXL_INT_MSI_MSIX;
> +	policy->warn_settings = CXL_INT_MSI_MSIX;
> +	policy->failure_settings = CXL_INT_MSI_MSIX;
> +	policy->fatal_settings = CXL_INT_MSI_MSIX;
> +
> +	mbox_cmd = (struct cxl_mbox_cmd) {
> +		.opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
> +		.payload_in = policy,
> +		.size_in = sizeof(*policy),
> +	};
> +
> +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> +	if (rc < 0) {
> +		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> +			rc);
> +		return rc;
> +	}
> +
> +	/* Retrieve final interrupt settings */
> +	return cxl_event_get_int_policy(cxlds, policy);
> +}
> +
> +static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_event_interrupt_policy policy;
> +	int rc;
> +
> +	rc = cxl_event_config_msgnums(cxlds, &policy);
> +	if (rc)
> +		return rc;
> +
> +	rc = cxl_event_req_irq(cxlds, policy.info_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
> +		return rc;
> +	}
> +
> +	rc = cxl_event_req_irq(cxlds, policy.warn_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
> +		return rc;
> +	}
> +
> +	rc = cxl_event_req_irq(cxlds, policy.failure_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
> +		return rc;
> +	}
> +
> +	rc = cxl_event_req_irq(cxlds, policy.fatal_settings);
> +	if (rc) {
> +		dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
> +		return rc;
> +	}
> +
> +	return 0;
> +}
> +
> +static bool cxl_event_int_is_fw(u8 setting)
> +{
> +	u8 mode = FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting);
> +
> +	return mode == CXL_INT_FW;
> +}
> +
> +static int cxl_event_config(struct pci_host_bridge *host_bridge,
> +			    struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_event_interrupt_policy policy;
> +	int rc;
> +
> +	/*
> +	 * When BIOS maintains CXL error reporting control, it will process
> +	 * event records.  Only one agent can do so.
> +	 */
> +	if (!host_bridge->native_cxl_error)
> +		return 0;
> +
> +	rc = cxl_event_get_int_policy(cxlds, &policy);
> +	if (rc)
> +		return rc;
> +
> +	if (cxl_event_int_is_fw(policy.info_settings) ||
> +	    cxl_event_int_is_fw(policy.warn_settings) ||
> +	    cxl_event_int_is_fw(policy.failure_settings) ||
> +	    cxl_event_int_is_fw(policy.fatal_settings)) {
> +		dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
> +		return -EBUSY;
> +	}
> +
> +	rc = cxl_event_irqsetup(cxlds);
> +	if (rc)
> +		return rc;
> +
> +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +
> +	return 0;
> +}
> +
>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> @@ -519,6 +714,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> +	rc = cxl_alloc_irq_vectors(pdev);
> +	if (rc)
> +		return rc;
> +
>  	cxlmd = devm_cxl_add_memdev(cxlds);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
> @@ -527,12 +726,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> -	/*
> -	 * When BIOS maintains CXL error reporting control, it will process
> -	 * event records.  Only one agent can do so.
> -	 */
> -	if (host_bridge->native_cxl_error)
> -		cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> +	rc = cxl_event_config(host_bridge, cxlds);
> +	if (rc)
> +		return rc;
>  
>  	if (cxlds->regs.ras) {
>  		pci_enable_pcie_error_reporting(pdev);


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 0/9] CXL: Process event logs
  2022-12-16 17:01   ` Dan Williams
  2022-12-16 18:15     ` Ira Weiny
@ 2022-12-16 18:39     ` Jonathan Cameron
  1 sibling, 0 replies; 36+ messages in thread
From: Jonathan Cameron @ 2022-12-16 18:39 UTC (permalink / raw)
  To: Dan Williams
  Cc: ira.weiny, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Dave Jiang, linux-kernel, linux-pci, linux-acpi,
	linux-cxl

On Fri, 16 Dec 2022 09:01:13 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> Jonathan Cameron wrote:
> > On Sun, 11 Dec 2022 23:06:18 -0800
> > ira.weiny@intel.com wrote:
> >   
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > This code has been tested with a newer qemu which allows for more events to be
> > > returned at a time, as well as an additional QMP event and interrupt injection.
> > > Those patches will follow once they have been cleaned up.
> > > 
> > > The series is now in 3 parts:
> > > 
> > > 	1) Base functionality including interrupts
> > > 	2) Tracing specific events (Dynamic Capacity Event Record is deferred)
> > > 	3) cxl-test infrastructure for basic tests
> > > 
> > > Changes from V3
> > > 	Feedback from Dan
> > > 	Split out ACPI changes for Bjorn
> > > 
> > > - Link to v3: https://lore.kernel.org/all/20221208052115.800170-1-ira.weiny@intel.com/  
> > 
> > Because I'm in a grumpy mood (as my colleagues will attest!)...
> > This is dependent on the patch that moves the trace definitions, and
> > that's not upstream yet except in cxl/preview, which is an optimistic
> > place to use for a base commit.  The id isn't the one below either, which
> > is in neither mainline nor the current CXL trees.
> 
> I do not want to commit to a new baseline until after -rc1, so yes this
> is in a messy period.

Fully understood. I only push trees out as 'testing' for 0-day to hit
until I can rebase on rc1.

> 
> > Not that I actually checked the cover letter until it failed to apply
> > (and hence already knew what was missing), but still, please call out
> > dependencies unless they are in the branches Dan has queued up to push.
> > 
> > I just want to play with Dave's fix for the RAS errors, so I'm having to
> > jump through these other sets to get there.
> 
> Yes, that is annoying, apologies.
Not really a problem I just felt like grumbling :)

Have a good weekend.

Jonathan

> 
> > 
> > Thanks,
> > 
> > Jonathan
> >   
> > > 
> > > 
> > > Davidlohr Bueso (1):
> > >   cxl/mem: Wire up event interrupts
> > > 
> > > Ira Weiny (8):
> > >   PCI/CXL: Export native CXL error reporting control
> > >   cxl/mem: Read, trace, and clear events on driver load
> > >   cxl/mem: Trace General Media Event Record
> > >   cxl/mem: Trace DRAM Event Record
> > >   cxl/mem: Trace Memory Module Event Record
> > >   cxl/test: Add generic mock events
> > >   cxl/test: Add specific events
> > >   cxl/test: Simulate event log overflow
> > > 
> > >  drivers/acpi/pci_root.c       |   3 +
> > >  drivers/cxl/core/mbox.c       | 186 +++++++++++++
> > >  drivers/cxl/core/trace.h      | 479 ++++++++++++++++++++++++++++++++++
> > >  drivers/cxl/cxl.h             |  16 ++
> > >  drivers/cxl/cxlmem.h          | 171 ++++++++++++
> > >  drivers/cxl/cxlpci.h          |   6 +
> > >  drivers/cxl/pci.c             | 236 +++++++++++++++++
> > >  drivers/pci/probe.c           |   1 +
> > >  include/linux/pci.h           |   1 +
> > >  tools/testing/cxl/test/Kbuild |   2 +-
> > >  tools/testing/cxl/test/mem.c  | 352 +++++++++++++++++++++++++
> > >  11 files changed, 1452 insertions(+), 1 deletion(-)
> > > 
> > > 
> > > base-commit: acb704099642bc822ef2aed223a0b8db1f7ea76e  
> >   
> 
> I think going forward these base-commits need to be commits that are
> reachable on cxl.git. For now I have pushed out a baseline for both Dave
> and Ira's patches to cxl/preview which will rebase after -rc1 comes out.
> 
> Just the small matter of needing some acks/reviews on those lead-in
> patches so I can move them through cxl/pending to cxl/next:


Don't move too fast with Ira's.  Some issues are coming up in testing...
(admittedly half of them were things where QEMU hadn't kept up with
what the kernel code now uses).


> 
> http://lore.kernel.org/r/167051869176.436579.9728373544811641087.stgit@dwillia2-xfh.jf.intel.com
> http://lore.kernel.org/r/20221212070627.1372402-2-ira.weiny@intel.com


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 3/9] cxl/mem: Wire up event interrupts
  2022-12-16 14:24   ` Jonathan Cameron
@ 2022-12-16 18:42     ` Jonathan Cameron
  2022-12-16 21:28       ` Ira Weiny
  0 siblings, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2022-12-16 18:42 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Davidlohr Bueso, Bjorn Helgaas, Bjorn Helgaas,
	Alison Schofield, Vishal Verma, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

On Fri, 16 Dec 2022 14:24:38 +0000
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Sun, 11 Dec 2022 23:06:21 -0800
> ira.weiny@intel.com wrote:
> 
> > From: Davidlohr Bueso <dave@stgolabs.net>
> > 
> > Currently the only CXL features targeted for irq support require their
> > message numbers to be within the first 16 entries.  The device may
> > however support less than 16 entries depending on the support it
> > provides.
> > 
> > Attempt to allocate these 16 irq vectors.  If the device supports less,
> > then the PCI infrastructure will allocate that number.  Upon successful
> > allocation, users can plug in their respective isr at any point
> > thereafter.
> > 
> > CXL device events are signaled via interrupts.  Each event log may have
> > a different interrupt message number.  These message numbers are
> > reported in the Get Event Interrupt Policy mailbox command.
> > 
> > Add interrupt support for event logs.  Interrupts are allocated as
> > shared interrupts.  Therefore, all or some event logs can share the same
> > message number.
> > 
> > In addition all logs are queried on any interrupt in order of the most
> > to least severe based on the status register.
> > 
> > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> >   
> 
> > +/**
> > + * Event Interrupt Policy
> > + *
> > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > + */
> > +enum cxl_event_int_mode {
> > +	CXL_INT_NONE		= 0x00,
> > +	CXL_INT_MSI_MSIX	= 0x01,
> > +	CXL_INT_FW		= 0x02
> > +};
> > +struct cxl_event_interrupt_policy {
> > +	u8 info_settings;
> > +	u8 warn_settings;
> > +	u8 failure_settings;
> > +	u8 fatal_settings;  
> 
> This is an issue for your QEMU code which has this set at 5 bytes.
> Guess our handling of record lengths needs updating now that we have two different
> spec versions to support and hence these can have multiple lengths.
> 
> Btw, do you have an updated version of the QEMU patches you can share?

Note that I'm happy to take your QEMU series forwards, just don't want to duplicate
stuff you have already done!

Jonathan

> I was planning on just doing the AER type RAS stuff for the first pull this cycle
> but given this set means we never reach that code I probably need to do QEMU
> support for this and the stuff to support those all in one go - otherwise
> no one will be able to test it :)  We rather optimistically have the OSC set
> to say the OS can have control of these, but upstream code doesn't emulate
> anything yet. Oops. Should have pretended the hardware was handling them
> until we had this support in place in QEMU.
> 
> Jonathan
> 
> > +} __packed;
> > +
> >  /**
> >   * struct cxl_event_state - Event log driver state
> >   *
> > @@ -288,6 +305,8 @@ enum cxl_opcode {
> >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> >  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> >  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> > +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> > +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
> >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,  


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 3/9] cxl/mem: Wire up event interrupts
  2022-12-16 18:42     ` Jonathan Cameron
@ 2022-12-16 21:28       ` Ira Weiny
  2022-12-17 16:40         ` Jonathan Cameron
  0 siblings, 1 reply; 36+ messages in thread
From: Ira Weiny @ 2022-12-16 21:28 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Dan Williams, Davidlohr Bueso, Bjorn Helgaas, Bjorn Helgaas,
	Alison Schofield, Vishal Verma, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

On Fri, Dec 16, 2022 at 06:42:15PM +0000, Jonathan Cameron wrote:
> On Fri, 16 Dec 2022 14:24:38 +0000
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> 
> > On Sun, 11 Dec 2022 23:06:21 -0800
> > ira.weiny@intel.com wrote:
> > 
> > > From: Davidlohr Bueso <dave@stgolabs.net>
> > > 
> > > Currently the only CXL features targeted for irq support require their
> > > message numbers to be within the first 16 entries.  The device may
> > > however support less than 16 entries depending on the support it
> > > provides.
> > > 
> > > Attempt to allocate these 16 irq vectors.  If the device supports less,
> > > then the PCI infrastructure will allocate that number.  Upon successful
> > > allocation, users can plug in their respective isr at any point
> > > thereafter.
> > > 
> > > CXL device events are signaled via interrupts.  Each event log may have
> > > a different interrupt message number.  These message numbers are
> > > reported in the Get Event Interrupt Policy mailbox command.
> > > 
> > > Add interrupt support for event logs.  Interrupts are allocated as
> > > shared interrupts.  Therefore, all or some event logs can share the same
> > > message number.
> > > 
> > > In addition all logs are queried on any interrupt in order of the most
> > > to least severe based on the status register.
> > > 
> > > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> > >   
> > 
> > > +/**
> > > + * Event Interrupt Policy
> > > + *
> > > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > > + */
> > > +enum cxl_event_int_mode {
> > > +	CXL_INT_NONE		= 0x00,
> > > +	CXL_INT_MSI_MSIX	= 0x01,
> > > +	CXL_INT_FW		= 0x02
> > > +};
> > > +struct cxl_event_interrupt_policy {
> > > +	u8 info_settings;
> > > +	u8 warn_settings;
> > > +	u8 failure_settings;
> > > +	u8 fatal_settings;  
> > 
> > This is an issue for your QEMU code which has this set at 5 bytes.
> > Guess our handling of record lengths needs updating now that we have two different
> > spec versions to support and hence these can have multiple lengths.
> > 
> > Btw, do you have an updated version of the QEMU patches you can share?
> 
> Note that I'm happy to take your QEMU series forwards, just don't want to duplicate
> stuff you have already done!

I do have updates to handle more than 3 records at a time, which I was polishing
until I saw your review.

I was getting all geared up to respin the patches but I'm not sure there is an
issue now that I look at it.

I'm still investigating.

Ira

> 
> Jonathan
> 
> > I was planning on just doing the AER type RAS stuff for the first pull this cycle
> > but given this set means we never reach that code I probably need to do QEMU
> > support for this and the stuff to support those all in one go - otherwise
> > no one will be able to test it :)  We rather optimistically have the OSC set
> > to say the OS can have control of these, but upstream code doesn't emulate
> > anything yet. Oops. Should have pretended the hardware was handling them
> > until we had this support in place in QEMU.
> > 
> > Jonathan
> > 
> > > +} __packed;
> > > +
> > >  /**
> > >   * struct cxl_event_state - Event log driver state
> > >   *
> > > @@ -288,6 +305,8 @@ enum cxl_opcode {
> > >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> > >  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> > >  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> > > +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> > > +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
> > >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> > >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> > >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,  
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 3/9] cxl/mem: Wire up event interrupts
  2022-12-16 18:21   ` Jonathan Cameron
@ 2022-12-16 21:33     ` Ira Weiny
  2022-12-17 16:43       ` Jonathan Cameron
  0 siblings, 1 reply; 36+ messages in thread
From: Ira Weiny @ 2022-12-16 21:33 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Alison Schofield, Dan Williams, Davidlohr Bueso, Bjorn Helgaas,
	Bjorn Helgaas, Vishal Verma, Dave Jiang, linux-kernel, linux-pci,
	linux-acpi, linux-cxl

On Fri, Dec 16, 2022 at 06:21:10PM +0000, Jonathan Cameron wrote:
> On Sun, 11 Dec 2022 23:06:21 -0800
> ira.weiny@intel.com wrote:
> 
> > From: Davidlohr Bueso <dave@stgolabs.net>
> > 
> > Currently the only CXL features targeted for irq support require their
> > message numbers to be within the first 16 entries.  The device may
> > however support less than 16 entries depending on the support it
> > provides.
> > 
> > Attempt to allocate these 16 irq vectors.  If the device supports less,
> > then the PCI infrastructure will allocate that number.  Upon successful
> > allocation, users can plug in their respective isr at any point
> > thereafter.
> > 
> > CXL device events are signaled via interrupts.  Each event log may have
> > a different interrupt message number.  These message numbers are
> > reported in the Get Event Interrupt Policy mailbox command.
> > 
> > Add interrupt support for event logs.  Interrupts are allocated as
> > shared interrupts.  Therefore, all or some event logs can share the same
> > message number.
> > 
> > In addition all logs are queried on any interrupt in order of the most
> > to least severe based on the status register.
> > 
> > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> 
> Sometimes it feels like we go around in circles as I'm sure we've
> fixed this at least 3 times now in different sets that got dropped
> (was definitely in the DOE interrupt support set and was controversial ;)

Well I certainly seem to be spinning ATM.  I probably need more eggnog.  ;-)

> 
> Nothing in this patch set calls pci_set_master() so no interrupts
> will even be delivered.

Is this a bug in QEMU then?  Because this _is_ tested and irqs _were_
delivered.  I thought we determined that pci_set_master() was called in
pcim_enable_device() but I don't see that now.  :-(

What I do know is this works with QEMU.  :-/

Ira

> I don't think there are any sets in flight
> that would fix that.
> 
> Jonathan
> 
> 
> > 
> > ---
> > Changes from V3:
> > 	Adjust based on changes in patch 1
> > 	Consolidate event setup into cxl_event_config()
> > 	Consistently use cxl_event_* for function names
> > 	Remove cxl_event_int_is_msi()
> > 	Ensure DCD log is ignored in status
> > 	Simplify event status loop logic
> > 	Dan
> > 		Fail driver load if the irq's are not allocated
> > 		move cxl_event_config_msgnums() to pci.c
> > 		s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_MAX_VECTORS
> > 		s/devm_kmalloc/devm_kzalloc
> > 		Fix up pci_alloc_irq_vectors() comment
> > 		Pass pdev to cxl_alloc_irq_vectors()
> > 		Check FW irq policy prior to configuration
> > 	Jonathan
> > 		Use FIELD_GET instead of manual masking
> > ---
> >  drivers/cxl/cxl.h    |   4 +
> >  drivers/cxl/cxlmem.h |  19 ++++
> >  drivers/cxl/cxlpci.h |   6 ++
> >  drivers/cxl/pci.c    | 208 +++++++++++++++++++++++++++++++++++++++++--
> >  4 files changed, 231 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 5974d1082210..b3964149c77b 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -168,6 +168,10 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
> >  				 CXLDEV_EVENT_STATUS_FAIL |	\
> >  				 CXLDEV_EVENT_STATUS_FATAL)
> >  
> > +/* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
> > +#define CXLDEV_EVENT_INT_MODE_MASK	GENMASK(1, 0)
> > +#define CXLDEV_EVENT_INT_MSGNUM_MASK	GENMASK(7, 4)
> > +
> >  /* CXL 2.0 8.2.8.4 Mailbox Registers */
> >  #define CXLDEV_MBOX_CAPS_OFFSET 0x00
> >  #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index dd9aa3dd738e..bd8bfbe61ec8 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -194,6 +194,23 @@ struct cxl_endpoint_dvsec_info {
> >  	struct range dvsec_range[2];
> >  };
> >  
> > +/**
> > + * Event Interrupt Policy
> > + *
> > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > + */
> > +enum cxl_event_int_mode {
> > +	CXL_INT_NONE		= 0x00,
> > +	CXL_INT_MSI_MSIX	= 0x01,
> > +	CXL_INT_FW		= 0x02
> > +};
> > +struct cxl_event_interrupt_policy {
> > +	u8 info_settings;
> > +	u8 warn_settings;
> > +	u8 failure_settings;
> > +	u8 fatal_settings;
> > +} __packed;
> > +
> >  /**
> >   * struct cxl_event_state - Event log driver state
> >   *
> > @@ -288,6 +305,8 @@ enum cxl_opcode {
> >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> >  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> >  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> > +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> > +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
> >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> > diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> > index 77dbdb980b12..a8ea04f536ab 100644
> > --- a/drivers/cxl/cxlpci.h
> > +++ b/drivers/cxl/cxlpci.h
> > @@ -53,6 +53,12 @@
> >  #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
> >  #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
> >  
> > +/*
> > + * NOTE: Currently all the functions which are enabled for CXL require their
> > + * vectors to be in the first 16.  Use this as the default max.
> > + */
> > +#define CXL_PCI_DEFAULT_MAX_VECTORS 16
> > +
> >  /* Register Block Identifier (RBI) */
> >  enum cxl_regloc_type {
> >  	CXL_REGLOC_RBI_EMPTY = 0,
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index a2d8382bc593..d42d87faddb8 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -445,6 +445,201 @@ static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> >  	return 0;
> >  }
> >  
> > +static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> > +{
> > +	int nvecs;
> > +
> > +	/*
> > +	 * CXL requires MSI/MSIX support.
> > +	 *
> > +	 * Additionally pci_alloc_irq_vectors() handles calling
> > +	 * pci_free_irq_vectors() automatically despite not being called
> > +	 * pcim_*.  See pci_setup_msi_context().
> > +	 */
> > +	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_DEFAULT_MAX_VECTORS,
> > +				      PCI_IRQ_MSIX | PCI_IRQ_MSI);
> > +	if (nvecs < 1) {
> > +		dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
> > +		return -ENXIO;
> > +	}
> > +	return 0;
> > +}
> > +
> > +struct cxl_dev_id {
> > +	struct cxl_dev_state *cxlds;
> > +};
> > +
> > +static irqreturn_t cxl_event_thread(int irq, void *id)
> > +{
> > +	struct cxl_dev_id *dev_id = id;
> > +	struct cxl_dev_state *cxlds = dev_id->cxlds;
> > +	u32 status;
> > +
> > +	do {
> > +		/*
> > +		 * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
> > +		 * ignore the reserved upper 32 bits
> > +		 */
> > +		status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > +		/* Ignore logs unknown to the driver */
> > +		status &= CXLDEV_EVENT_STATUS_ALL;
> > +		if (!status)
> > +			break;
> > +		cxl_mem_get_event_records(cxlds, status);
> > +		cond_resched();
> > +	} while (status);
> > +
> > +	return IRQ_HANDLED;
> > +}
> > +
> > +static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
> > +{
> > +	struct device *dev = cxlds->dev;
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	struct cxl_dev_id *dev_id;
> > +	int irq;
> > +
> > +	if (FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting) != CXL_INT_MSI_MSIX)
> > +		return -ENXIO;
> > +
> > +	/* dev_id must be globally unique and must contain the cxlds */
> > +	dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
> > +	if (!dev_id)
> > +		return -ENOMEM;
> > +	dev_id->cxlds = cxlds;
> > +
> > +	irq =  pci_irq_vector(pdev,
> > +			      FIELD_GET(CXLDEV_EVENT_INT_MSGNUM_MASK, setting));
> > +	if (irq < 0)
> > +		return irq;
> > +
> > +	return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> > +					 IRQF_SHARED, NULL, dev_id);
> > +}
> > +
> > +static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
> > +				    struct cxl_event_interrupt_policy *policy)
> > +{
> > +	struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
> > +		.opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
> > +		.payload_out = policy,
> > +		.size_out = sizeof(*policy),
> > +	};
> > +	int rc;
> > +
> > +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > +	if (rc < 0)
> > +		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> > +			rc);
> > +
> > +	return rc;
> > +}
> > +
> > +static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> > +				    struct cxl_event_interrupt_policy *policy)
> > +{
> > +	struct cxl_mbox_cmd mbox_cmd;
> > +	int rc;
> > +
> > +	policy->info_settings = CXL_INT_MSI_MSIX;
> > +	policy->warn_settings = CXL_INT_MSI_MSIX;
> > +	policy->failure_settings = CXL_INT_MSI_MSIX;
> > +	policy->fatal_settings = CXL_INT_MSI_MSIX;
> > +
> > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > +		.opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
> > +		.payload_in = policy,
> > +		.size_in = sizeof(*policy),
> > +	};
> > +
> > +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > +	if (rc < 0) {
> > +		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> > +			rc);
> > +		return rc;
> > +	}
> > +
> > +	/* Retrieve final interrupt settings */
> > +	return cxl_event_get_int_policy(cxlds, policy);
> > +}
> > +
> > +static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> > +{
> > +	struct cxl_event_interrupt_policy policy;
> > +	int rc;
> > +
> > +	rc = cxl_event_config_msgnums(cxlds, &policy);
> > +	if (rc)
> > +		return rc;
> > +
> > +	rc = cxl_event_req_irq(cxlds, policy.info_settings);
> > +	if (rc) {
> > +		dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
> > +		return rc;
> > +	}
> > +
> > +	rc = cxl_event_req_irq(cxlds, policy.warn_settings);
> > +	if (rc) {
> > +		dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
> > +		return rc;
> > +	}
> > +
> > +	rc = cxl_event_req_irq(cxlds, policy.failure_settings);
> > +	if (rc) {
> > +		dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
> > +		return rc;
> > +	}
> > +
> > +	rc = cxl_event_req_irq(cxlds, policy.fatal_settings);
> > +	if (rc) {
> > +		dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
> > +		return rc;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static bool cxl_event_int_is_fw(u8 setting)
> > +{
> > +	u8 mode = FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting);
> > +
> > +	return mode == CXL_INT_FW;
> > +}
> > +
> > +static int cxl_event_config(struct pci_host_bridge *host_bridge,
> > +			    struct cxl_dev_state *cxlds)
> > +{
> > +	struct cxl_event_interrupt_policy policy;
> > +	int rc;
> > +
> > +	/*
> > +	 * When BIOS maintains CXL error reporting control, it will process
> > +	 * event records.  Only one agent can do so.
> > +	 */
> > +	if (!host_bridge->native_cxl_error)
> > +		return 0;
> > +
> > +	rc = cxl_event_get_int_policy(cxlds, &policy);
> > +	if (rc)
> > +		return rc;
> > +
> > +	if (cxl_event_int_is_fw(policy.info_settings) ||
> > +	    cxl_event_int_is_fw(policy.warn_settings) ||
> > +	    cxl_event_int_is_fw(policy.failure_settings) ||
> > +	    cxl_event_int_is_fw(policy.fatal_settings)) {
> > +		dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
> > +		return -EBUSY;
> > +	}
> > +
> > +	rc = cxl_event_irqsetup(cxlds);
> > +	if (rc)
> > +		return rc;
> > +
> > +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > +
> > +	return 0;
> > +}
> > +
> >  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  {
> >  	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> > @@ -519,6 +714,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (rc)
> >  		return rc;
> >  
> > +	rc = cxl_alloc_irq_vectors(pdev);
> > +	if (rc)
> > +		return rc;
> > +
> >  	cxlmd = devm_cxl_add_memdev(cxlds);
> >  	if (IS_ERR(cxlmd))
> >  		return PTR_ERR(cxlmd);
> > @@ -527,12 +726,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (rc)
> >  		return rc;
> >  
> > -	/*
> > -	 * When BIOS maintains CXL error reporting control, it will process
> > -	 * event records.  Only one agent can do so.
> > -	 */
> > -	if (host_bridge->native_cxl_error)
> > -		cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > +	rc = cxl_event_config(host_bridge, cxlds);
> > +	if (rc)
> > +		return rc;
> >  
> >  	if (cxlds->regs.ras) {
> >  		pci_enable_pcie_error_reporting(pdev);
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load
  2022-12-16 15:39   ` Jonathan Cameron
@ 2022-12-16 21:54     ` Ira Weiny
  2022-12-17 16:38       ` Jonathan Cameron
  2022-12-18  0:25       ` johnny
  0 siblings, 2 replies; 36+ messages in thread
From: Ira Weiny @ 2022-12-16 21:54 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Dan Williams, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Dave Jiang, linux-kernel, linux-pci, linux-acpi,
	linux-cxl

On Fri, Dec 16, 2022 at 03:39:39PM +0000, Jonathan Cameron wrote:
> On Sun, 11 Dec 2022 23:06:20 -0800
> ira.weiny@intel.com wrote:
> 
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > CXL devices have multiple event logs which can be queried for CXL event
> > records.  Devices are required to support the storage of at least one
> > event record in each event log type.
> > 
> > Devices track event log overflow by incrementing a counter and tracking
> > the time of the first and last overflow event seen.
> > 
> > Software queries events via the Get Event Record mailbox command; CXL
> > rev 3.0 section 8.2.9.2.2 and clears events via CXL rev 3.0 section
> > 8.2.9.2.3 Clear Event Records mailbox command.
> > 
> > If the result of negotiating CXL Error Reporting Control is OS control,
> > read and clear all event logs on driver load.
> > 
> > Ensure a clean slate of events by reading and clearing the events on
> > driver load.
> > 
> > The status register is not used because a device may continue to trigger
> > events and the only requirement is to empty the log at least once.  This
> > allows for the required transition from empty to non-empty for interrupt
> > generation.  Handling of interrupts is in a follow on patch.
> > 
> > The device can return up to 1MB worth of event records per query.
> > Allocate a shared large buffer to handle the max number of records based
> > on the mailbox payload size.
> > 
> > This patch traces a raw event record and leaves specific event record
> > type tracing to subsequent patches.  Macros are created to aid in
> > tracing the common CXL Event header fields.
> > 
> > Each record is cleared explicitly.  A clear all bit is specified but is
> > only valid when the log overflows.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> A few things noticed inline.  I've tightened the QEMU code to reject the
> case where the input payload claims to be bigger than the mailbox size
> and hacked the size down to 256 bytes so it triggers the problem
> highlighted below.

I'm not sure what you did here.

> 
> > 
> > ---
> > Changes from V3:
> > 	Dan
> > 		Split off _OSC pcie bits
> > 			Use existing style for host bridge flag in that
> > 			patch
> > 		Clean up event processing loop
> > 		Use dev_err_ratelimited()
> > 		Clean up version change log
> > 		Delete 'EVENT LOG OVERFLOW'
> > 		Remove cxl_clear_event_logs()
> > 		Add comment for native cxl control
> > 		Fail driver load on event buf allocation failure
> > 		Comment why events are not processed without _OSC flag
> > ---
> >  drivers/cxl/core/mbox.c  | 136 +++++++++++++++++++++++++++++++++++++++
> >  drivers/cxl/core/trace.h | 120 ++++++++++++++++++++++++++++++++++
> >  drivers/cxl/cxl.h        |  12 ++++
> >  drivers/cxl/cxlmem.h     |  84 ++++++++++++++++++++++++
> >  drivers/cxl/pci.c        |  40 ++++++++++++
> >  5 files changed, 392 insertions(+)
> > 
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index b03fba212799..9fb327370e08 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> 
> > +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> > +				  enum cxl_event_log_type log,
> > +				  struct cxl_get_event_payload *get_pl)
> > +{
> > +	struct cxl_mbox_clear_event_payload payload = {
> > +		.event_log = log,
> > +	};
> > +	u16 total = le16_to_cpu(get_pl->record_count);
> > +	u8 max_handles = CXL_CLEAR_EVENT_MAX_HANDLES;
> > +	size_t pl_size = sizeof(payload);
> > +	struct cxl_mbox_cmd mbox_cmd;
> > +	u16 cnt;
> > +	int rc;
> > +	int i;
> > +
> > +	/* Payload size may limit the max handles */
> > +	if (pl_size > cxlds->payload_size) {
> > +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
> > +		pl_size = cxlds->payload_size;

pl_size is only set to the mailbox's max size if that size is smaller than the
size of the record [sizeof(payload) above].

> > +	}
> > +
> > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > +		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> > +		.payload_in = &payload,
> > +		.size_in = pl_size,
> 
> This payload size should be whatever we need to store the records,
> not the max size possible.  Particularly as that size is currently
> bigger than the mailbox might be.

But the above check and set ensures that does not happen.

> 
> It shouldn't fail (I think) simply because a later version of the spec might
> add more to this message and things should still work, but it's definitely
> not good practice to tell the hardware this is much longer than it actually is.

I don't follow.

The full payload is going to be sent even if we are just clearing 1 record,
which is inefficient, but it should never overflow the hardware because it is
limited by the check above.

So why would this be a problem?

> 
> 
> > +	};
> > +
> > +	/*
> > +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> > +	 * Record can return up to 0xffff records.
> > +	 */
> > +	i = 0;
> > +	for (cnt = 0; cnt < total; cnt++) {
> > +		payload.handle[i++] = get_pl->records[cnt].hdr.handle;
> > +		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
> > +			log, le16_to_cpu(payload.handle[i]));
> > +
> > +		if (i == max_handles) {
> > +			payload.nr_recs = i;
> > +			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > +			if (rc)
> > +				return rc;
> > +			i = 0;
> > +		}
> > +	}
> > +
> > +	/* Clear what is left if any */
> > +	if (i) {
> > +		payload.nr_recs = i;
> > +		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > +		if (rc)
> > +			return rc;
> > +	}
> > +
> > +	return 0;
> > +}
> 
> 
> ...
> 
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index ab138004f644..dd9aa3dd738e 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> 
> ...
> 
> > +
> > +/*
> > + * Clear Event Records input payload
> > + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> > + */
> > +#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
> > +struct cxl_mbox_clear_event_payload {
> > +	u8 event_log;		/* enum cxl_event_log_type */
> > +	u8 clear_flags;
> > +	u8 nr_recs;
> > +	u8 reserved[3];
> > +	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];
> 
> Doesn't fit in the smallest possible payload buffer.
> It's 516 bytes long.  Payload buffer might be 256 bytes in total.
> (8.2.8.4.3 Mailbox capabilities)
> 
> Lazy approach, make this smaller and do more loops when clearing.
> If we want to optimize this later, we can expand it to this size.

I agree, but the code already checks for and adjusts this on the fly based on
cxlds->payload_size?

 +	/* Payload size may limit the max handles */
 +	if (pl_size > cxlds->payload_size) {
 +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
 +		pl_size = cxlds->payload_size;
 +	}

Why is this not ok?  [Other than being potentially inefficient.]

Do you have a patch to qemu which causes this?

Ira

> > +} __packed;
> > +#define CXL_CLEAR_EVENT_LIMIT_HANDLES(payload_size)			\
> > +	(((payload_size) -						\
> > +		(sizeof(struct cxl_mbox_clear_event_payload) -		\
> > +		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
> > +		sizeof(__le16))
> > +
> 
> ...
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load
  2022-12-16 21:54     ` Ira Weiny
@ 2022-12-17 16:38       ` Jonathan Cameron
  2022-12-18  0:21         ` Ira Weiny
  2022-12-18  0:25       ` johnny
  1 sibling, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2022-12-17 16:38 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Dan Williams, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Dave Jiang, linux-kernel, linux-pci, linux-acpi,
	linux-cxl

On Fri, 16 Dec 2022 13:54:01 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> On Fri, Dec 16, 2022 at 03:39:39PM +0000, Jonathan Cameron wrote:
> > On Sun, 11 Dec 2022 23:06:20 -0800
> > ira.weiny@intel.com wrote:
> >   
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > CXL devices have multiple event logs which can be queried for CXL event
> > > records.  Devices are required to support the storage of at least one
> > > event record in each event log type.
> > > 
> > > Devices track event log overflow by incrementing a counter and tracking
> > > the time of the first and last overflow event seen.
> > > 
> > > Software queries events via the Get Event Record mailbox command; CXL
> > > rev 3.0 section 8.2.9.2.2 and clears events via CXL rev 3.0 section
> > > 8.2.9.2.3 Clear Event Records mailbox command.
> > > 
> > > If the result of negotiating CXL Error Reporting Control is OS control,
> > > read and clear all event logs on driver load.
> > > 
> > > Ensure a clean slate of events by reading and clearing the events on
> > > driver load.
> > > 
> > > The status register is not used because a device may continue to trigger
> > > events and the only requirement is to empty the log at least once.  This
> > > allows for the required transition from empty to non-empty for interrupt
> > > generation.  Handling of interrupts is in a follow on patch.
> > > 
> > > The device can return up to 1MB worth of event records per query.
> > > Allocate a shared large buffer to handle the max number of records based
> > > on the mailbox payload size.
> > > 
> > > This patch traces a raw event record and leaves specific event record
> > > type tracing to subsequent patches.  Macros are created to aid in
> > > tracing the common CXL Event header fields.
> > > 
> > > Each record is cleared explicitly.  A clear all bit is specified but is
> > > only valid when the log overflows.
> > > 
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>  
> > 
> > A few things noticed inline.  I've tightened the QEMU code to reject the
> > case where the input payload claims to be bigger than the mailbox size
> > and hacked the size down to 256 bytes so it triggers the problem
> > highlighted below.  
> 
> I'm not sure what you did here.

Nor am I. I think this was a case of chasing the undersized length bug
in QEMU, assuming it was the CXL 3.0 issue, and misunderstanding one of
the debug prints I got.

Friday silliness. Sorry about that!

However, the oversized payload communicated to the hardware is still
a potential problem. See below.

> 
> >   
> > > 
> > > ---
> > > Changes from V3:
> > > 	Dan
> > > 		Split off _OSC pcie bits
> > > 			Use existing style for host bridge flag in that
> > > 			patch
> > > 		Clean up event processing loop
> > > 		Use dev_err_ratelimited()
> > > 		Clean up version change log
> > > 		Delete 'EVENT LOG OVERFLOW'
> > > 		Remove cxl_clear_event_logs()
> > > 		Add comment for native cxl control
> > > 		Fail driver load on event buf allocation failure
> > > 		Comment why events are not processed without _OSC flag
> > > ---
> > >  drivers/cxl/core/mbox.c  | 136 +++++++++++++++++++++++++++++++++++++++
> > >  drivers/cxl/core/trace.h | 120 ++++++++++++++++++++++++++++++++++
> > >  drivers/cxl/cxl.h        |  12 ++++
> > >  drivers/cxl/cxlmem.h     |  84 ++++++++++++++++++++++++
> > >  drivers/cxl/pci.c        |  40 ++++++++++++
> > >  5 files changed, 392 insertions(+)
> > > 
> > > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > > index b03fba212799..9fb327370e08 100644
> > > --- a/drivers/cxl/core/mbox.c
> > > +++ b/drivers/cxl/core/mbox.c  
> >   
> > > +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> > > +				  enum cxl_event_log_type log,
> > > +				  struct cxl_get_event_payload *get_pl)
> > > +{
> > > +	struct cxl_mbox_clear_event_payload payload = {
> > > +		.event_log = log,
> > > +	};
> > > +	u16 total = le16_to_cpu(get_pl->record_count);
> > > +	u8 max_handles = CXL_CLEAR_EVENT_MAX_HANDLES;
> > > +	size_t pl_size = sizeof(payload);
> > > +	struct cxl_mbox_cmd mbox_cmd;
> > > +	u16 cnt;
> > > +	int rc;
> > > +	int i;
> > > +
> > > +	/* Payload size may limit the max handles */
> > > +	if (pl_size > cxlds->payload_size) {
> > > +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);

Definition of that is more complex than it needs to be - see below.

> > > +		pl_size = cxlds->payload_size;  
> 
> pl_size is only the max size possible if that size was smaller than the size of
> the record [sizeof(payload) above].

Sorry. For some reason my eyes skipped over this completely,
so we are fine for all my comments on overflowing.  On the plus side,
QEMU will now check if that happens and return an error, which it
wasn't doing before.

> 
> > > +	}
> > > +
> > > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > > +		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> > > +		.payload_in = &payload,
> > > +		.size_in = pl_size,  
> > 
> > This payload size should be whatever we need to store the records,
> > not the max size possible.  Particularly as that size is currently
> > bigger than the mailbox might be.  
> 
> But the above check and set ensures that does not happen.
> 
> > 
> > It shouldn't fail (I think) simply because a later version of the spec might
> > add more to this message and things should still work, but definitely not
> > good practice to tell the hardware this is much longer than it actually is.  
> 
> I don't follow.
> 
> The full payload is going to be sent even if we are just clearing 1 record
> which is inefficient but it should never overflow the hardware because it is
> limited by the check above.
> 
> So why would this be a problem?
I'm struggling to find a clear spec statement on whether this is allowed, so the
following is a thought experiment. There is language in the definition of the
"invalid payload length" error code ("The input payload length is not valid for
the specified command"), but it doesn't go into what counts as valid.

What you have looks fine in the sense that a device can't fail simply because
it's told the payload is longer than it expects - you might be sending a CXL
4.0 spec payload that is backwards compatible with CXL 3.0, so the sizes not
matching expectations can't be considered an error.
So far so good... However, we may have a situation not dissimilar to the
change in record length for the Set Event Interrupt Policy payload between
CXL 2.0 and CXL 3.0. The only way the endpoint knows which version of the
message it got is that the record is 4 bytes or 5 bytes.  If extra fields are
added to the end of this record in future, the endpoint can assume it is a new
version of the spec and interpret what is in that payload space.

Say the future structure looks like

struct cxl_mbox_clear_event_payload_future {
	u8 event_log;		/* enum cxl_event_log_type */
	u8 clear_flags;
	u8 nr_recs;
	u8 reserved[3];
	__le16 handle[nr_recs]; 
	__le16 otherdata[nr_recs];
}

An endpoint receiving your 'overly long payload' will assume all those otherdata
fields are 0, which is not necessarily the same as them being absent.
For Set Event Interrupt Policy, if we sent an overlong payload like this under
the assumption of the CXL 2.0 spec, we would be turning off the DCD interrupt
rather than doing nothing (unlikely to be a problem in that particular case as
that one doesn't have a FW Interrupt option - but that's more luck than design).

I'm not sure why we'd have extra stuff for this payload, but it 'might' happen.

> > 
> 
> > 
> >   
> > > +	};
> > > +
> > > +	/*
> > > +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> > > +	 * Record can return up to 0xffff records.
> > > +	 */
> > > +	i = 0;
> > > +	for (cnt = 0; cnt < total; cnt++) {
> > > +		payload.handle[i++] = get_pl->records[cnt].hdr.handle;
> > > +		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
> > > +			log, le16_to_cpu(payload.handle[i]));
> > > +
> > > +		if (i == max_handles) {
> > > +			payload.nr_recs = i;
> > > +			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > +			if (rc)
> > > +				return rc;
> > > +			i = 0;
> > > +		}
> > > +	}
> > > +
> > > +	/* Clear what is left if any */
> > > +	if (i) {
> > > +		payload.nr_recs = i;
> > > +		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > +		if (rc)
> > > +			return rc;
> > > +	}
> > > +
> > > +	return 0;
> > > +}  
> > 
> > 
> > ...
> >   
> > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > index ab138004f644..dd9aa3dd738e 100644
> > > --- a/drivers/cxl/cxlmem.h
> > > +++ b/drivers/cxl/cxlmem.h  
> > 
> > ...
> >   
> > > +
> > > +/*
> > > + * Clear Event Records input payload
> > > + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> > > + */
> > > +#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
> > > +struct cxl_mbox_clear_event_payload {
> > > +	u8 event_log;		/* enum cxl_event_log_type */
> > > +	u8 clear_flags;
> > > +	u8 nr_recs;
> > > +	u8 reserved[3];
> > > +	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];  
> > 
> > Doesn't fit in the smallest possible payload buffer.
> > It's 516 bytes long.  Payload buffer might be 256 bytes in total.
> > (8.2.8.4.3 Mailbox capabilities)
> > 
> > Lazy approach, make this smaller and do more loops when clearing.
> > If we want to optimize this later can expand it to this size.  
> 
> I agree but the code already checks for and adjusts this on the fly based on
> cxlds->payload_size?
> 
>  +	/* Payload size may limit the max handles */
>  +	if (pl_size > cxlds->payload_size) {
>  +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
>  +		pl_size = cxlds->payload_size;
>  +	}
> 
> Why is this not ok?  [Other than being potentially inefficient.]
> 
> Do you have a patch to qemu which causes this?

Two issues crossed on my side, I think, combined with me assuming this one was
obviously the problem when it wasn't.

> 
> Ira
> 
> > > +} __packed;
> > > +#define CXL_CLEAR_EVENT_LIMIT_HANDLES(payload_size)			\
> > > +	(((payload_size) -						\
> > > +		(sizeof(struct cxl_mbox_clear_event_payload) -		\
> > > +		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\

Could use offsetof() to simplify this

> > > +		sizeof(__le16))
> > > +  
> > 
> > ...
> >   


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 3/9] cxl/mem: Wire up event interrupts
  2022-12-16 21:28       ` Ira Weiny
@ 2022-12-17 16:40         ` Jonathan Cameron
  0 siblings, 0 replies; 36+ messages in thread
From: Jonathan Cameron @ 2022-12-17 16:40 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Dan Williams, Davidlohr Bueso, Bjorn Helgaas, Bjorn Helgaas,
	Alison Schofield, Vishal Verma, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

On Fri, 16 Dec 2022 13:28:36 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> On Fri, Dec 16, 2022 at 06:42:15PM +0000, Jonathan Cameron wrote:
> > On Fri, 16 Dec 2022 14:24:38 +0000
> > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> >   
> > > On Sun, 11 Dec 2022 23:06:21 -0800
> > > ira.weiny@intel.com wrote:
> > >   
> > > > From: Davidlohr Bueso <dave@stgolabs.net>
> > > > 
> > > > Currently the only CXL features targeted for irq support require their
> > > > message numbers to be within the first 16 entries.  The device may
> > > > however support less than 16 entries depending on the support it
> > > > provides.
> > > > 
> > > > Attempt to allocate these 16 irq vectors.  If the device supports less
> > > > then the PCI infrastructure will allocate that number.  Upon successful
> > > > allocation, users can plug in their respective isr at any point
> > > > thereafter.
> > > > 
> > > > CXL device events are signaled via interrupts.  Each event log may have
> > > > a different interrupt message number.  These message numbers are
> > > > reported in the Get Event Interrupt Policy mailbox command.
> > > > 
> > > > Add interrupt support for event logs.  Interrupts are allocated as
> > > > shared interrupts.  Therefore, all or some event logs can share the same
> > > > message number.
> > > > 
> > > > In addition all logs are queried on any interrupt in order of the most
> > > > to least severe based on the status register.
> > > > 
> > > > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > > > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> > > >     
> > >   
> > > > +/**
> > > > + * Event Interrupt Policy
> > > > + *
> > > > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > > > + */
> > > > +enum cxl_event_int_mode {
> > > > +	CXL_INT_NONE		= 0x00,
> > > > +	CXL_INT_MSI_MSIX	= 0x01,
> > > > +	CXL_INT_FW		= 0x02
> > > > +};
> > > > +struct cxl_event_interrupt_policy {
> > > > +	u8 info_settings;
> > > > +	u8 warn_settings;
> > > > +	u8 failure_settings;
> > > > +	u8 fatal_settings;    
> > > 
> > > This is an issue for your QEMU code which has this set at 5 bytes.
> > > Guess our handling of record lengths needs updating now we have two different
> > > spec versions to support and hence these can have multiple lengths.
> > > 
> > > Btw, do you have an updated version of the QEMU patches you can share?  
> > 
> > Note that I'm happy to take your QEMU series forwards, just don't want to duplicate
> > stuff you have already done!  
> 
> I do have updates to handle more than 3 records at a time which I was polishing
> until I saw your review.

Great. I'll work on other stuff in the meantime and hopefully we can get all the
error emulation upstream this QEMU cycle.

> 
> I was getting all geared up to respin the patches but I'm not sure there is an
> issue now that I look at it.
> 
> I'm still investigating.
> 
> Ira
> 
> > 
> > Jonathan
> >   
> > > I was planning on just doing the AER type RAS stuff for the first pull this cycle
> > > but given this set means we never reach that code I probably need to do QEMU
> > > support for this and the stuff to support those all in one go - otherwise
> > > no one will be able to test it :)  We rather optimistically have the OSC set
> > > to say the OS can have control of these, but upstream code doesn't emulate
> > > anything yet. Oops. Should have pretended the hardware was handling them
> > > until we had this support in place in QEMU.
> > > 
> > > Jonathan
> > >   
> > > > +} __packed;
> > > > +
> > > >  /**
> > > >   * struct cxl_event_state - Event log driver state
> > > >   *
> > > > @@ -288,6 +305,8 @@ enum cxl_opcode {
> > > >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> > > >  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> > > >  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> > > > +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> > > > +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
> > > >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> > > >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> > > >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,    
> >   



* Re: [PATCH V4 3/9] cxl/mem: Wire up event interrupts
  2022-12-16 21:33     ` Ira Weiny
@ 2022-12-17 16:43       ` Jonathan Cameron
  0 siblings, 0 replies; 36+ messages in thread
From: Jonathan Cameron @ 2022-12-17 16:43 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Alison Schofield, Dan Williams, Davidlohr Bueso, Bjorn Helgaas,
	Bjorn Helgaas, Vishal Verma, Dave Jiang, linux-kernel, linux-pci,
	linux-acpi, linux-cxl

On Fri, 16 Dec 2022 13:33:43 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> On Fri, Dec 16, 2022 at 06:21:10PM +0000, Jonathan Cameron wrote:
> > On Sun, 11 Dec 2022 23:06:21 -0800
> > ira.weiny@intel.com wrote:
> >   
> > > From: Davidlohr Bueso <dave@stgolabs.net>
> > > 
> > > Currently the only CXL features targeted for irq support require their
> > > message numbers to be within the first 16 entries.  The device may
> > > however support less than 16 entries depending on the support it
> > > provides.
> > > 
> > > Attempt to allocate these 16 irq vectors.  If the device supports less
> > > then the PCI infrastructure will allocate that number.  Upon successful
> > > allocation, users can plug in their respective isr at any point
> > > thereafter.
> > > 
> > > CXL device events are signaled via interrupts.  Each event log may have
> > > a different interrupt message number.  These message numbers are
> > > reported in the Get Event Interrupt Policy mailbox command.
> > > 
> > > Add interrupt support for event logs.  Interrupts are allocated as
> > > shared interrupts.  Therefore, all or some event logs can share the same
> > > message number.
> > > 
> > > In addition all logs are queried on any interrupt in order of the most
> > > to least severe based on the status register.
> > > 
> > > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>  
> > 
> > Sometimes it feels like we go around in circles as I'm sure we've
> > fixed this at least 3 times now in different sets that got dropped
> > (was definitely in the DOE interrupt support set and was controversial ;)  
> 
> Well I certainly seem to be spinning ATM.  I probably need more eggnog.  ;-)
> 
> > 
> > Nothing in this patch set calls pci_set_master() so no interrupts
> > will even be delivered.  
> 
> Is this a bug in Qemu then?  Because this _is_ tested and irqs _were_
> delivered.  I thought we determined that pci_set_master() was called in
> pcim_enable_device() but I don't see that now.  :-(
> 
> What I do know is this works with Qemu.  :-/

Without the addition of a pci_set_master() call to the kernel code,
I don't get any interrupts on my arm64 test setup. That's the only reason I
realized the call was missing.  Possibly there is something different on x86, I guess.
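
For reference, the hack I'm testing with is roughly the following (hypothetical
hunk, not part of the posted series):

```diff
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ static int cxl_pci_probe(struct pci_dev *pdev, ...)
 	rc = pcim_enable_device(pdev);
 	if (rc)
 		return rc;
+	pci_set_master(pdev);
```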

Jonathan

> 
> Ira
> 
> > I don't think there are any sets in flight
> > that would fix that.
> > 
> > Jonathan
> > 
> >   
> > > 
> > > ---
> > > Changes from V3:
> > > 	Adjust based on changes in patch 1
> > > 	Consolidate event setup into cxl_event_config()
> > > 	Consistently use cxl_event_* for function names
> > > 	Remove cxl_event_int_is_msi()
> > > 	Ensure DCD log is ignored in status
> > > 	Simplify event status loop logic
> > > 	Dan
> > > 		Fail driver load if the irq's are not allocated
> > > 		move cxl_event_config_msgnums() to pci.c
> > > 		s/CXL_PCI_REQUIRED_VECTORS/CXL_PCI_DEFAULT_MAX_VECTORS
> > > 		s/devm_kmalloc/devm_kzalloc
> > > 		Fix up pci_alloc_irq_vectors() comment
> > > 		Pass pdev to cxl_alloc_irq_vectors()
> > > 		Check FW irq policy prior to configuration
> > > 	Jonathan
> > > 		Use FIELD_GET instead of manual masking
> > > ---
> > >  drivers/cxl/cxl.h    |   4 +
> > >  drivers/cxl/cxlmem.h |  19 ++++
> > >  drivers/cxl/cxlpci.h |   6 ++
> > >  drivers/cxl/pci.c    | 208 +++++++++++++++++++++++++++++++++++++++++--
> > >  4 files changed, 231 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > > index 5974d1082210..b3964149c77b 100644
> > > --- a/drivers/cxl/cxl.h
> > > +++ b/drivers/cxl/cxl.h
> > > @@ -168,6 +168,10 @@ static inline int ways_to_eiw(unsigned int ways, u8 *eiw)
> > >  				 CXLDEV_EVENT_STATUS_FAIL |	\
> > >  				 CXLDEV_EVENT_STATUS_FATAL)
> > >  
> > > +/* CXL rev 3.0 section 8.2.9.2.4; Table 8-52 */
> > > +#define CXLDEV_EVENT_INT_MODE_MASK	GENMASK(1, 0)
> > > +#define CXLDEV_EVENT_INT_MSGNUM_MASK	GENMASK(7, 4)
> > > +
> > >  /* CXL 2.0 8.2.8.4 Mailbox Registers */
> > >  #define CXLDEV_MBOX_CAPS_OFFSET 0x00
> > >  #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > index dd9aa3dd738e..bd8bfbe61ec8 100644
> > > --- a/drivers/cxl/cxlmem.h
> > > +++ b/drivers/cxl/cxlmem.h
> > > @@ -194,6 +194,23 @@ struct cxl_endpoint_dvsec_info {
> > >  	struct range dvsec_range[2];
> > >  };
> > >  
> > > +/**
> > > + * Event Interrupt Policy
> > > + *
> > > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > > + */
> > > +enum cxl_event_int_mode {
> > > +	CXL_INT_NONE		= 0x00,
> > > +	CXL_INT_MSI_MSIX	= 0x01,
> > > +	CXL_INT_FW		= 0x02
> > > +};
> > > +struct cxl_event_interrupt_policy {
> > > +	u8 info_settings;
> > > +	u8 warn_settings;
> > > +	u8 failure_settings;
> > > +	u8 fatal_settings;
> > > +} __packed;
> > > +
> > >  /**
> > >   * struct cxl_event_state - Event log driver state
> > >   *
> > > @@ -288,6 +305,8 @@ enum cxl_opcode {
> > >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> > >  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> > >  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> > > +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> > > +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
> > >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> > >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> > >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> > > diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> > > index 77dbdb980b12..a8ea04f536ab 100644
> > > --- a/drivers/cxl/cxlpci.h
> > > +++ b/drivers/cxl/cxlpci.h
> > > @@ -53,6 +53,12 @@
> > >  #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
> > >  #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
> > >  
> > > +/*
> > > + * NOTE: Currently all the functions which are enabled for CXL require their
> > > + * vectors to be in the first 16.  Use this as the default max.
> > > + */
> > > +#define CXL_PCI_DEFAULT_MAX_VECTORS 16
> > > +
> > >  /* Register Block Identifier (RBI) */
> > >  enum cxl_regloc_type {
> > >  	CXL_REGLOC_RBI_EMPTY = 0,
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index a2d8382bc593..d42d87faddb8 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -445,6 +445,201 @@ static int cxl_mem_alloc_event_buf(struct cxl_dev_state *cxlds)
> > >  	return 0;
> > >  }
> > >  
> > > +static int cxl_alloc_irq_vectors(struct pci_dev *pdev)
> > > +{
> > > +	int nvecs;
> > > +
> > > +	/*
> > > +	 * CXL requires MSI/MSIX support.
> > > +	 *
> > > +	 * Additionally pci_alloc_irq_vectors() handles calling
> > > +	 * pci_free_irq_vectors() automatically despite not being called
> > > +	 * pcim_*.  See pci_setup_msi_context().
> > > +	 */
> > > +	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_DEFAULT_MAX_VECTORS,
> > > +				      PCI_IRQ_MSIX | PCI_IRQ_MSI);
> > > +	if (nvecs < 1) {
> > > +		dev_dbg(&pdev->dev, "Failed to alloc irq vectors: %d\n", nvecs);
> > > +		return -ENXIO;
> > > +	}
> > > +	return 0;
> > > +}
> > > +
> > > +struct cxl_dev_id {
> > > +	struct cxl_dev_state *cxlds;
> > > +};
> > > +
> > > +static irqreturn_t cxl_event_thread(int irq, void *id)
> > > +{
> > > +	struct cxl_dev_id *dev_id = id;
> > > +	struct cxl_dev_state *cxlds = dev_id->cxlds;
> > > +	u32 status;
> > > +
> > > +	do {
> > > +		/*
> > > +		 * CXL 3.0 8.2.8.3.1: The lower 32 bits are the status;
> > > +		 * ignore the reserved upper 32 bits
> > > +		 */
> > > +		status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > > +		/* Ignore logs unknown to the driver */
> > > +		status &= CXLDEV_EVENT_STATUS_ALL;
> > > +		if (!status)
> > > +			break;
> > > +		cxl_mem_get_event_records(cxlds, status);
> > > +		cond_resched();
> > > +	} while (status);
> > > +
> > > +	return IRQ_HANDLED;
> > > +}
> > > +
> > > +static int cxl_event_req_irq(struct cxl_dev_state *cxlds, u8 setting)
> > > +{
> > > +	struct device *dev = cxlds->dev;
> > > +	struct pci_dev *pdev = to_pci_dev(dev);
> > > +	struct cxl_dev_id *dev_id;
> > > +	int irq;
> > > +
> > > +	if (FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting) != CXL_INT_MSI_MSIX)
> > > +		return -ENXIO;
> > > +
> > > +	/* dev_id must be globally unique and must contain the cxlds */
> > > +	dev_id = devm_kzalloc(dev, sizeof(*dev_id), GFP_KERNEL);
> > > +	if (!dev_id)
> > > +		return -ENOMEM;
> > > +	dev_id->cxlds = cxlds;
> > > +
> > > +	irq =  pci_irq_vector(pdev,
> > > +			      FIELD_GET(CXLDEV_EVENT_INT_MSGNUM_MASK, setting));
> > > +	if (irq < 0)
> > > +		return irq;
> > > +
> > > +	return devm_request_threaded_irq(dev, irq, NULL, cxl_event_thread,
> > > +					 IRQF_SHARED, NULL, dev_id);
> > > +}
> > > +
> > > +static int cxl_event_get_int_policy(struct cxl_dev_state *cxlds,
> > > +				    struct cxl_event_interrupt_policy *policy)
> > > +{
> > > +	struct cxl_mbox_cmd mbox_cmd = (struct cxl_mbox_cmd) {
> > > +		.opcode = CXL_MBOX_OP_GET_EVT_INT_POLICY,
> > > +		.payload_out = policy,
> > > +		.size_out = sizeof(*policy),
> > > +	};
> > > +	int rc;
> > > +
> > > +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > +	if (rc < 0)
> > > +		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> > > +			rc);
> > > +
> > > +	return rc;
> > > +}
> > > +
> > > +static int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> > > +				    struct cxl_event_interrupt_policy *policy)
> > > +{
> > > +	struct cxl_mbox_cmd mbox_cmd;
> > > +	int rc;
> > > +
> > > +	policy->info_settings = CXL_INT_MSI_MSIX;
> > > +	policy->warn_settings = CXL_INT_MSI_MSIX;
> > > +	policy->failure_settings = CXL_INT_MSI_MSIX;
> > > +	policy->fatal_settings = CXL_INT_MSI_MSIX;
> > > +
> > > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > > +		.opcode = CXL_MBOX_OP_SET_EVT_INT_POLICY,
> > > +		.payload_in = policy,
> > > +		.size_in = sizeof(*policy),
> > > +	};
> > > +
> > > +	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > +	if (rc < 0) {
> > > +		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> > > +			rc);
> > > +		return rc;
> > > +	}
> > > +
> > > +	/* Retrieve final interrupt settings */
> > > +	return cxl_event_get_int_policy(cxlds, policy);
> > > +}
> > > +
> > > +static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> > > +{
> > > +	struct cxl_event_interrupt_policy policy;
> > > +	int rc;
> > > +
> > > +	rc = cxl_event_config_msgnums(cxlds, &policy);
> > > +	if (rc)
> > > +		return rc;
> > > +
> > > +	rc = cxl_event_req_irq(cxlds, policy.info_settings);
> > > +	if (rc) {
> > > +		dev_err(cxlds->dev, "Failed to get interrupt for event Info log\n");
> > > +		return rc;
> > > +	}
> > > +
> > > +	rc = cxl_event_req_irq(cxlds, policy.warn_settings);
> > > +	if (rc) {
> > > +		dev_err(cxlds->dev, "Failed to get interrupt for event Warn log\n");
> > > +		return rc;
> > > +	}
> > > +
> > > +	rc = cxl_event_req_irq(cxlds, policy.failure_settings);
> > > +	if (rc) {
> > > +		dev_err(cxlds->dev, "Failed to get interrupt for event Failure log\n");
> > > +		return rc;
> > > +	}
> > > +
> > > +	rc = cxl_event_req_irq(cxlds, policy.fatal_settings);
> > > +	if (rc) {
> > > +		dev_err(cxlds->dev, "Failed to get interrupt for event Fatal log\n");
> > > +		return rc;
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static bool cxl_event_int_is_fw(u8 setting)
> > > +{
> > > +	u8 mode = FIELD_GET(CXLDEV_EVENT_INT_MODE_MASK, setting);
> > > +
> > > +	return mode == CXL_INT_FW;
> > > +}
> > > +
> > > +static int cxl_event_config(struct pci_host_bridge *host_bridge,
> > > +			    struct cxl_dev_state *cxlds)
> > > +{
> > > +	struct cxl_event_interrupt_policy policy;
> > > +	int rc;
> > > +
> > > +	/*
> > > +	 * When BIOS maintains CXL error reporting control, it will process
> > > +	 * event records.  Only one agent can do so.
> > > +	 */
> > > +	if (!host_bridge->native_cxl_error)
> > > +		return 0;
> > > +
> > > +	rc = cxl_event_get_int_policy(cxlds, &policy);
> > > +	if (rc)
> > > +		return rc;
> > > +
> > > +	if (cxl_event_int_is_fw(policy.info_settings) ||
> > > +	    cxl_event_int_is_fw(policy.warn_settings) ||
> > > +	    cxl_event_int_is_fw(policy.failure_settings) ||
> > > +	    cxl_event_int_is_fw(policy.fatal_settings)) {
> > > +		dev_err(cxlds->dev, "FW still in control of Event Logs despite _OSC settings\n");
> > > +		return -EBUSY;
> > > +	}
> > > +
> > > +	rc = cxl_event_irqsetup(cxlds);
> > > +	if (rc)
> > > +		return rc;
> > > +
> > > +	cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  {
> > >  	struct pci_host_bridge *host_bridge = pci_find_host_bridge(pdev->bus);
> > > @@ -519,6 +714,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  	if (rc)
> > >  		return rc;
> > >  
> > > +	rc = cxl_alloc_irq_vectors(pdev);
> > > +	if (rc)
> > > +		return rc;
> > > +
> > >  	cxlmd = devm_cxl_add_memdev(cxlds);
> > >  	if (IS_ERR(cxlmd))
> > >  		return PTR_ERR(cxlmd);
> > > @@ -527,12 +726,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  	if (rc)
> > >  		return rc;
> > >  
> > > -	/*
> > > -	 * When BIOS maintains CXL error reporting control, it will process
> > > -	 * event records.  Only one agent can do so.
> > > -	 */
> > > -	if (host_bridge->native_cxl_error)
> > > -		cxl_mem_get_event_records(cxlds, CXLDEV_EVENT_STATUS_ALL);
> > > +	rc = cxl_event_config(host_bridge, cxlds);
> > > +	if (rc)
> > > +		return rc;
> > >  
> > >  	if (cxlds->regs.ras) {
> > >  		pci_enable_pcie_error_reporting(pdev);  
> >   
> 



* Re: [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load
  2022-12-17 16:38       ` Jonathan Cameron
@ 2022-12-18  0:21         ` Ira Weiny
  2022-12-18 15:52           ` Jonathan Cameron
  0 siblings, 1 reply; 36+ messages in thread
From: Ira Weiny @ 2022-12-18  0:21 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Dan Williams, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Dave Jiang, linux-kernel, linux-pci, linux-acpi,
	linux-cxl

On Sat, Dec 17, 2022 at 04:38:50PM +0000, Jonathan Cameron wrote:
> On Fri, 16 Dec 2022 13:54:01 -0800
> Ira Weiny <ira.weiny@intel.com> wrote:
> 
> > On Fri, Dec 16, 2022 at 03:39:39PM +0000, Jonathan Cameron wrote:
> > > On Sun, 11 Dec 2022 23:06:20 -0800
> > > ira.weiny@intel.com wrote:
> > >   
> > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > 
> > > > CXL devices have multiple event logs which can be queried for CXL event
> > > > records.  Devices are required to support the storage of at least one
> > > > event record in each event log type.
> > > > 
> > > > Devices track event log overflow by incrementing a counter and tracking
> > > > the time of the first and last overflow event seen.
> > > > 
> > > > Software queries events via the Get Event Record mailbox command; CXL
> > > > rev 3.0 section 8.2.9.2.2 and clears events via CXL rev 3.0 section
> > > > 8.2.9.2.3 Clear Event Records mailbox command.
> > > > 
> > > > If the result of negotiating CXL Error Reporting Control is OS control,
> > > > read and clear all event logs on driver load.
> > > > 
> > > > Ensure a clean slate of events by reading and clearing the events on
> > > > driver load.
> > > > 
> > > > The status register is not used because a device may continue to trigger
> > > > events and the only requirement is to empty the log at least once.  This
> > > > allows for the required transition from empty to non-empty for interrupt
> > > > generation.  Handling of interrupts is in a follow on patch.
> > > > 
> > > > The device can return up to 1MB worth of event records per query.
> > > > Allocate a shared large buffer to handle the max number of records based
> > > > on the mailbox payload size.
> > > > 
> > > > This patch traces a raw event record and leaves specific event record
> > > > type tracing to subsequent patches.  Macros are created to aid in
> > > > tracing the common CXL Event header fields.
> > > > 
> > > > Each record is cleared explicitly.  A clear all bit is specified but is
> > > > only valid when the log overflows.
> > > > 
> > > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>  
> > > 
> > > A few things noticed inline.  I've tightened the QEMU code to reject the
> > > case where the input payload claims to be bigger than the mailbox size,
> > > and hacked the size down to 256 bytes so it triggers the problem
> > > highlighted below.  
> > 
> > I'm not sure what you did here.
> 
> Nor am I. I think this was a case of chasing the undersized length bug
> in QEMU, assuming it was the CXL 3.0 issue, and misunderstanding one of
> the debug prints I got.
> 
> Friday silliness. Sorry about that!

NP but you did have me going.  I've vowed to actually understand the spec
better going forward!  :-D

> 
> However, the oversized payload communicated to the hardware is still
> a potential problem. See below.

I don't see where there is an oversized payload used either...

> 
> > 
> > >   
> > > > 
> > > > ---
> > > > Changes from V3:
> > > > 	Dan
> > > > 		Split off _OSC pcie bits
> > > > 			Use existing style for host bridge flag in that
> > > > 			patch
> > > > 		Clean up event processing loop
> > > > 		Use dev_err_ratelimited()
> > > > 		Clean up version change log
> > > > 		Delete 'EVENT LOG OVERFLOW'
> > > > 		Remove cxl_clear_event_logs()
> > > > 		Add comment for native cxl control
> > > > 		Fail driver load on event buf allocation failure
> > > > 		Comment why events are not processed without _OSC flag
> > > > ---
> > > >  drivers/cxl/core/mbox.c  | 136 +++++++++++++++++++++++++++++++++++++++
> > > >  drivers/cxl/core/trace.h | 120 ++++++++++++++++++++++++++++++++++
> > > >  drivers/cxl/cxl.h        |  12 ++++
> > > >  drivers/cxl/cxlmem.h     |  84 ++++++++++++++++++++++++
> > > >  drivers/cxl/pci.c        |  40 ++++++++++++
> > > >  5 files changed, 392 insertions(+)
> > > > 
> > > > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > > > index b03fba212799..9fb327370e08 100644
> > > > --- a/drivers/cxl/core/mbox.c
> > > > +++ b/drivers/cxl/core/mbox.c  
> > >   
> > > > +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> > > > +				  enum cxl_event_log_type log,
> > > > +				  struct cxl_get_event_payload *get_pl)
> > > > +{
> > > > +	struct cxl_mbox_clear_event_payload payload = {
> > > > +		.event_log = log,
> > > > +	};
> > > > +	u16 total = le16_to_cpu(get_pl->record_count);
> > > > +	u8 max_handles = CXL_CLEAR_EVENT_MAX_HANDLES;
> > > > +	size_t pl_size = sizeof(payload);

This line ensures the payload is only ever the size of the definition per the
3.0 spec.

> > > > +	struct cxl_mbox_cmd mbox_cmd;
> > > > +	u16 cnt;
> > > > +	int rc;
> > > > +	int i;
> > > > +
> > > > +	/* Payload size may limit the max handles */
> > > > +	if (pl_size > cxlds->payload_size) {
> > > > +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
> 
> Definition of that is more complex than it needs to be - see below.

Then this ensures it is truncated if needed.

> 
> > > > +		pl_size = cxlds->payload_size;  
> > 
> > pl_size is only the max size possible if that size was smaller than the size of
> > the record [sizeof(payload) above].
> 
> Sorry. For some reason my eyes skipped over this completely.
> So we are fine for all my comments on overflowing.  On plus side
> will now check if that happens in QEMU and return an error which we
> weren't doing before.
> 
> > 
> > > > +	}
> > > > +
> > > > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > > > +		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> > > > +		.payload_in = &payload,
> > > > +		.size_in = pl_size,  
> > > 
> > > This payload size should be whatever we need to store the records,
> > > not the max size possible.  Particularly as that size is currently
> > > bigger than the mailbox might be.  
> > 
> > But the above check and set ensures that does not happen.
> > 
> > > 
> > > It shouldn't fail (I think) simply because a later version of the spec might
> > > add more to this message and things should still work, but definitely not
> > > good practice to tell the hardware this is much longer than it actually is.  
> > 
> > I don't follow.
> > 
> > The full payload is going to be sent even if we are just clearing 1 record
> > which is inefficient but it should never overflow the hardware because it is
> > limited by the check above.
> > 
> > So why would this be a problem?
> I'm struggling to find a clear spec statement on if this allowed, so the following
> is a thought experiment. There is language in definition of the "invalid payload length"
> error code "The input payload length is not valid for the specified command", but it
> doesn't go into what counts as valid.

I think the only case which makes sense to reject is a payload length smaller
than:

	header + nr_recs * 2

Anything up to

	header + (0xff * 2)

should be fine per the 3.0 spec.

> 
> What you have looks fine because a device can't fail on the basis it's told the
> payload is longer than it expects, because you might be sending a CXL 4.0 spec
> payload that is backwards compatible with CXL 3.0 - hence the fact the sizes
> don't match up with that expected can't be considered an error.
> So far so good... However, we may have a situation not dissimilar to the
> change in record length for the set event interrupt policy payload between CXL 2.0
> and CXL 3.0. The only way the endpoint knows what version of message it got is because the
> record is 4 bytes or 5 bytes.  If we have extra stuff on the end of this record
> in future the end point can assume that it is a new version of the spec and interpret
> what is in that payload space.
> 
> Say the future structure looks like
> 
> struct cxl_mbox_clear_event_payload_future {
> 	u8 event_log;		/* enum cxl_event_log_type */
> 	u8 clear_flags;
> 	u8 nr_recs;
> 	u8 reserved[3];
> 	__le16 handle[nr_recs]; 
> 	__le16 otherdata[nr_recs];

otherdata should be ignored by a 3.0 device.

A theoretical 4.0 device should handle otherdata not being there, per some flag
in the flags field I would suppose...  That would have to be determined if this
payload were extended.  Otherwise this software will fail no matter what.

Other mailbox commands do not zero out from the command size up to 1M either.

> }
> 
> Endpoint receiving your 'overly long payload' will assume all those otherdata fields
> are 0, not necessarily the same as non present.

But it is not 'overly long'.  It is only the length of the current spec.  See
above.

> For the set event interrupt policy, if we sent an overlong payload like you've done here
> with assumption of the CXL 2.0 spec we would be turning off the DCD interrupt rather
> that doing nothing (unlikely to be a problem in that particularly case as that one
> doesn't have a FW Interrupt option - but that's more luck than design).
> 
> I'm not sure why we'd have extra stuff for this payload, but it 'might' happen'.

I'll have to check but I don't think I set the payload long in that message.
It too should be sizeof(<set event int policy>)

> 
> > > 
> > 
> > > 
> > >   
> > > > +	};
> > > > +
> > > > +	/*
> > > > +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> > > > +	 * Record can return up to 0xffff records.
> > > > +	 */
> > > > +	i = 0;
> > > > +	for (cnt = 0; cnt < total; cnt++) {
> > > > +		payload.handle[i++] = get_pl->records[cnt].hdr.handle;
> > > > +		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
> > > > +			log, le16_to_cpu(payload.handle[i]));
> > > > +
> > > > +		if (i == max_handles) {
> > > > +			payload.nr_recs = i;
> > > > +			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > > +			if (rc)
> > > > +				return rc;
> > > > +			i = 0;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	/* Clear what is left if any */
> > > > +	if (i) {
> > > > +		payload.nr_recs = i;
> > > > +		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > > +		if (rc)
> > > > +			return rc;
> > > > +	}
> > > > +
> > > > +	return 0;
> > > > +}  
> > > 
> > > 
> > > ...
> > >   
> > > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > > index ab138004f644..dd9aa3dd738e 100644
> > > > --- a/drivers/cxl/cxlmem.h
> > > > +++ b/drivers/cxl/cxlmem.h  
> > > 
> > > ...
> > >   
> > > > +
> > > > +/*
> > > > + * Clear Event Records input payload
> > > > + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> > > > + */
> > > > +#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
> > > > +struct cxl_mbox_clear_event_payload {
> > > > +	u8 event_log;		/* enum cxl_event_log_type */
> > > > +	u8 clear_flags;
> > > > +	u8 nr_recs;
> > > > +	u8 reserved[3];
> > > > +	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];  
> > > 
> > > Doesn't fit in the smallest possible payload buffer.
> > > It's 526 bytes long.  Payload buffer might be 256 bytes in total.
> > > (8.2.8.4.3 Mailbox capabilities)
> > > 
> > > Lazy approach, make this smaller and do more loops when clearing.
> > > If we want to optimize this later can expand it to this size.  
> > 
> > I agree but the code already checks for and adjusts this on the fly based on
> > cxlds->payload_size?
> > 
> >  +	/* Payload size may limit the max handles */
> >  +	if (pl_size > cxlds->payload_size) {
> >  +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
> >  +		pl_size = cxlds->payload_size;
> >  +	}
> > 
> > Why is this not ok?  [Other than being potentially inefficient.]
> > 
> > Do you have a patch to qemu which causes this?
> 
> Two issues crossing I think on my side and me thinking this one was obviously
> the problem when it wasn't.

My fault also for not at least throwing my QEMU test code out there.  I've been
busy with some things today.  I'll try and get those changes cleaned up and at
least another RFC set out ASAP.

> 
> > 
> > Ira
> > 
> > > > +} __packed;
> > > > +#define CXL_CLEAR_EVENT_LIMIT_HANDLES(payload_size)			\
> > > > +	(((payload_size) -						\
> > > > +		(sizeof(struct cxl_mbox_clear_event_payload) -		\
> > > > +		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
> 
> Could use offsetof() to simplify this

True.  How about I submit a clean up patch to follow?  I don't think this is
broken.

Ira

> 
> > > > +		sizeof(__le16))
> > > > +  
> > > 
> > > ...
> > >   
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load
  2022-12-16 21:54     ` Ira Weiny
  2022-12-17 16:38       ` Jonathan Cameron
@ 2022-12-18  0:25       ` johnny
  2022-12-18 15:55         ` Jonathan Cameron
  1 sibling, 1 reply; 36+ messages in thread
From: johnny @ 2022-12-18  0:25 UTC (permalink / raw)
  To: Ira Weiny (ira.weiny@intel.com)
  Cc: Jonathan Cameron, Dan Williams, Bjorn Helgaas, Alison Schofield,
	Vishal Verma, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

On Fri, Dec 16, 2022 at 01:54:01PM -0800, Ira Weiny (ira.weiny@intel.com) wrote:
> On Fri, Dec 16, 2022 at 03:39:39PM +0000, Jonathan Cameron wrote:
> > On Sun, 11 Dec 2022 23:06:20 -0800
> > ira.weiny@intel.com wrote:
> > 
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > CXL devices have multiple event logs which can be queried for CXL event
> > > records.  Devices are required to support the storage of at least one
> > > event record in each event log type.
> > > 
> > > Devices track event log overflow by incrementing a counter and tracking
> > > the time of the first and last overflow event seen.
> > > 
> > > Software queries events via the Get Event Record mailbox command; CXL
> > > rev 3.0 section 8.2.9.2.2 and clears events via CXL rev 3.0 section
> > > 8.2.9.2.3 Clear Event Records mailbox command.
> > > 
> > > If the result of negotiating CXL Error Reporting Control is OS control,
> > > read and clear all event logs on driver load.
> > > 
> > > Ensure a clean slate of events by reading and clearing the events on
> > > driver load.
> > > 
> > > The status register is not used because a device may continue to trigger
> > > events and the only requirement is to empty the log at least once.  This
> > > allows for the required transition from empty to non-empty for interrupt
> > > generation.  Handling of interrupts is in a follow on patch.
> > > 
> > > The device can return up to 1MB worth of event records per query.
> > > Allocate a shared large buffer to handle the max number of records based
> > > on the mailbox payload size.
> > > 
> > > This patch traces a raw event record and leaves specific event record
> > > type tracing to subsequent patches.  Macros are created to aid in
> > > tracing the common CXL Event header fields.
> > > 
> > > Each record is cleared explicitly.  A clear all bit is specified but is
> > > only valid when the log overflows.
> > > 
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> > A few things noticed inline.  I've tightened the QEMU code to reject the
> > case of the input payload claims to be bigger than the mailbox size
> > and hacked the size down to 256 bytes so it triggers the problem
> > highlighted below.
> 
> I'm not sure what you did here.
> 
> > 
> > > 
> > > ---
> > > Changes from V3:
> > > 	Dan
> > > 		Split off _OSC pcie bits
> > > 			Use existing style for host bridge flag in that
> > > 			patch
> > > 		Clean up event processing loop
> > > 		Use dev_err_ratelimited()
> > > 		Clean up version change log
> > > 		Delete 'EVENT LOG OVERFLOW'
> > > 		Remove cxl_clear_event_logs()
> > > 		Add comment for native cxl control
> > > 		Fail driver load on event buf allocation failure
> > > 		Comment why events are not processed without _OSC flag
> > > ---
> > >  drivers/cxl/core/mbox.c  | 136 +++++++++++++++++++++++++++++++++++++++
> > >  drivers/cxl/core/trace.h | 120 ++++++++++++++++++++++++++++++++++
> > >  drivers/cxl/cxl.h        |  12 ++++
> > >  drivers/cxl/cxlmem.h     |  84 ++++++++++++++++++++++++
> > >  drivers/cxl/pci.c        |  40 ++++++++++++
> > >  5 files changed, 392 insertions(+)
> > > 
> > > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > > index b03fba212799..9fb327370e08 100644
> > > --- a/drivers/cxl/core/mbox.c
> > > +++ b/drivers/cxl/core/mbox.c
> > 
> > > +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> > > +				  enum cxl_event_log_type log,
> > > +				  struct cxl_get_event_payload *get_pl)
> > > +{
> > > +	struct cxl_mbox_clear_event_payload payload = {
> > > +		.event_log = log,
> > > +	};
> > > +	u16 total = le16_to_cpu(get_pl->record_count);
> > > +	u8 max_handles = CXL_CLEAR_EVENT_MAX_HANDLES;
> > > +	size_t pl_size = sizeof(payload);
> > > +	struct cxl_mbox_cmd mbox_cmd;
> > > +	u16 cnt;
> > > +	int rc;
> > > +	int i;
> > > +
> > > +	/* Payload size may limit the max handles */
> > > +	if (pl_size > cxlds->payload_size) {
> > > +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
> > > +		pl_size = cxlds->payload_size;
> 
> pl_size is only the max size possible if that size was smaller than the size of
> the record [sizeof(payload) above].
> 
> > > +	}
> > > +
> > > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > > +		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> > > +		.payload_in = &payload,
> > > +		.size_in = pl_size,
> > 
> > This payload size should be whatever we need to store the records,
> > not the max size possible.  Particularly as that size is currently
> > bigger than the mailbox might be.
> 
> But the above check and set ensures that does not happen.
> 
> > 
> > It shouldn't fail (I think) simply because a later version of the spec might
> > add more to this message and things should still work, but definitely not
> > good practice to tell the hardware this is much longer than it actually is.
> 
> I don't follow.
> 
> The full payload is going to be sent even if we are just clearing 1 record
> which is inefficient but it should never overflow the hardware because it is
> limited by the check above.
> 
> So why would this be a problem?
> 

Per the 3.0 spec, the Event Record Handles field is "A list of Event Record
Handles the host has consumed and the device shall now remove from its
internal Event Log store."  An extra list of unused handles does not follow
that description.  The spec also says "All event record handles shall be
nonzero value. A value of 0 shall be treated by the device as an invalid
handle."  So if there is a value of 0 in the extra unused handles, the device
shall return the invalid handle error code.


> > 
> > 
> > > +	};
> > > +
> > > +	/*
> > > +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> > > +	 * Record can return up to 0xffff records.
> > > +	 */
> > > +	i = 0;
> > > +	for (cnt = 0; cnt < total; cnt++) {
> > > +		payload.handle[i++] = get_pl->records[cnt].hdr.handle;
> > > +		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
> > > +			log, le16_to_cpu(payload.handle[i]));
> > > +
> > > +		if (i == max_handles) {
> > > +			payload.nr_recs = i;
> > > +			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > +			if (rc)
> > > +				return rc;
> > > +			i = 0;
> > > +		}
> > > +	}
> > > +
> > > +	/* Clear what is left if any */
> > > +	if (i) {
> > > +		payload.nr_recs = i;
> > > +		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > +		if (rc)
> > > +			return rc;
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > 
> > 
> > ...
> > 
> > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > index ab138004f644..dd9aa3dd738e 100644
> > > --- a/drivers/cxl/cxlmem.h
> > > +++ b/drivers/cxl/cxlmem.h
> > 
> > ...
> > 
> > > +
> > > +/*
> > > + * Clear Event Records input payload
> > > + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> > > + */
> > > +#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
> > > +struct cxl_mbox_clear_event_payload {
> > > +	u8 event_log;		/* enum cxl_event_log_type */
> > > +	u8 clear_flags;
> > > +	u8 nr_recs;
> > > +	u8 reserved[3];
> > > +	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];
> > 
> > Doesn't fit in the smallest possible payload buffer.
> > It's 526 bytes long.  Payload buffer might be 256 bytes in total.
> > (8.2.8.4.3 Mailbox capabilities)
> > 
> > Lazy approach, make this smaller and do more loops when clearing.
> > If we want to optimize this later can expand it to this size.
> 
> I agree but the code already checks for and adjusts this on the fly based on
> cxlds->payload_size?
> 
>  +	/* Payload size may limit the max handles */
>  +	if (pl_size > cxlds->payload_size) {
>  +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
>  +		pl_size = cxlds->payload_size;
>  +	}
> 
> Why is this not ok?  [Other than being potentially inefficient.]
> 
> Do you have a patch to qemu which causes this?
> 
> Ira
> 
> > > +} __packed;
> > > +#define CXL_CLEAR_EVENT_LIMIT_HANDLES(payload_size)			\
> > > +	(((payload_size) -						\
> > > +		(sizeof(struct cxl_mbox_clear_event_payload) -		\
> > > +		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
> > > +		sizeof(__le16))
> > > +
> > 
> > ...
> > 
> 




* Re: [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load
  2022-12-18  0:21         ` Ira Weiny
@ 2022-12-18 15:52           ` Jonathan Cameron
  0 siblings, 0 replies; 36+ messages in thread
From: Jonathan Cameron @ 2022-12-18 15:52 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Dan Williams, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Dave Jiang, linux-kernel, linux-pci, linux-acpi,
	linux-cxl

On Sat, 17 Dec 2022 16:21:05 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> On Sat, Dec 17, 2022 at 04:38:50PM +0000, Jonathan Cameron wrote:
> > On Fri, 16 Dec 2022 13:54:01 -0800
> > Ira Weiny <ira.weiny@intel.com> wrote:
> >   
> > > On Fri, Dec 16, 2022 at 03:39:39PM +0000, Jonathan Cameron wrote:  
> > > > On Sun, 11 Dec 2022 23:06:20 -0800
> > > > ira.weiny@intel.com wrote:
> > > >     
> > > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > > 
> > > > > CXL devices have multiple event logs which can be queried for CXL event
> > > > > records.  Devices are required to support the storage of at least one
> > > > > event record in each event log type.
> > > > > 
> > > > > Devices track event log overflow by incrementing a counter and tracking
> > > > > the time of the first and last overflow event seen.
> > > > > 
> > > > > Software queries events via the Get Event Record mailbox command; CXL
> > > > > rev 3.0 section 8.2.9.2.2 and clears events via CXL rev 3.0 section
> > > > > 8.2.9.2.3 Clear Event Records mailbox command.
> > > > > 
> > > > > If the result of negotiating CXL Error Reporting Control is OS control,
> > > > > read and clear all event logs on driver load.
> > > > > 
> > > > > Ensure a clean slate of events by reading and clearing the events on
> > > > > driver load.
> > > > > 
> > > > > The status register is not used because a device may continue to trigger
> > > > > events and the only requirement is to empty the log at least once.  This
> > > > > allows for the required transition from empty to non-empty for interrupt
> > > > > generation.  Handling of interrupts is in a follow on patch.
> > > > > 
> > > > > The device can return up to 1MB worth of event records per query.
> > > > > Allocate a shared large buffer to handle the max number of records based
> > > > > on the mailbox payload size.
> > > > > 
> > > > > This patch traces a raw event record and leaves specific event record
> > > > > type tracing to subsequent patches.  Macros are created to aid in
> > > > > tracing the common CXL Event header fields.
> > > > > 
> > > > > Each record is cleared explicitly.  A clear all bit is specified but is
> > > > > only valid when the log overflows.
> > > > > 
> > > > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>    
> > > > 
> > > > A few things noticed inline.  I've tightened the QEMU code to reject the
> > > > case of the input payload claims to be bigger than the mailbox size
> > > > and hacked the size down to 256 bytes so it triggers the problem
> > > > highlighted below.    
> > > 
> > > I'm not sure what you did here.  
> > 
> > Nor am I. I think this might have been a case of chasing the undersized
> > length bug in QEMU because it was the CXL 3.0 issue and misunderstanding
> > one of the debug prints I got.
> > 
> > Friday silliness. Sorry about that!  
> 
> NP but you did have me going.  I've vowed to actually understand the spec
> better going forward!  :-D
> 
> > 
> > However, the over sized payload communicated to the hardware is still
> > a potential problem. See below.  
> 
> I don't see where there is an oversized payload used either...
We write a payload size into the mailbox register that includes much more
than the payload we are sending.

> 
> >   
> > >   
> > > >     
> > > > > 
> > > > > ---
> > > > > Changes from V3:
> > > > > 	Dan
> > > > > 		Split off _OSC pcie bits
> > > > > 			Use existing style for host bridge flag in that
> > > > > 			patch
> > > > > 		Clean up event processing loop
> > > > > 		Use dev_err_ratelimited()
> > > > > 		Clean up version change log
> > > > > 		Delete 'EVENT LOG OVERFLOW'
> > > > > 		Remove cxl_clear_event_logs()
> > > > > 		Add comment for native cxl control
> > > > > 		Fail driver load on event buf allocation failure
> > > > > 		Comment why events are not processed without _OSC flag
> > > > > ---
> > > > >  drivers/cxl/core/mbox.c  | 136 +++++++++++++++++++++++++++++++++++++++
> > > > >  drivers/cxl/core/trace.h | 120 ++++++++++++++++++++++++++++++++++
> > > > >  drivers/cxl/cxl.h        |  12 ++++
> > > > >  drivers/cxl/cxlmem.h     |  84 ++++++++++++++++++++++++
> > > > >  drivers/cxl/pci.c        |  40 ++++++++++++
> > > > >  5 files changed, 392 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > > > > index b03fba212799..9fb327370e08 100644
> > > > > --- a/drivers/cxl/core/mbox.c
> > > > > +++ b/drivers/cxl/core/mbox.c    
> > > >     
> > > > > +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> > > > > +				  enum cxl_event_log_type log,
> > > > > +				  struct cxl_get_event_payload *get_pl)
> > > > > +{
> > > > > +	struct cxl_mbox_clear_event_payload payload = {
> > > > > +		.event_log = log,
> > > > > +	};
> > > > > +	u16 total = le16_to_cpu(get_pl->record_count);
> > > > > +	u8 max_handles = CXL_CLEAR_EVENT_MAX_HANDLES;
> > > > > +	size_t pl_size = sizeof(payload);  
> 
> This line ensures the payload is only ever the size of the definition per the
> 3.0 spec.

It doesn't (see later).  It ensures it is the size of the maximum payload.
The size should be 6 + 2 * total (truncated as necessary) as that's the
actual payload size.

> 
> > > > > +	struct cxl_mbox_cmd mbox_cmd;
> > > > > +	u16 cnt;
> > > > > +	int rc;
> > > > > +	int i;
> > > > > +
> > > > > +	/* Payload size may limit the max handles */
> > > > > +	if (pl_size > cxlds->payload_size) {
> > > > > +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);  
> > 
> > Definition of that is more complex than it needs to be - see below.  
> 
> Then this ensures it is truncated if needed.
> 
> >   
> > > > > +		pl_size = cxlds->payload_size;    
> > > 
> > > pl_size is only the max size possible if that size was smaller than the size of
> > > the record [sizeof(payload) above].  
> > 
> > Sorry. For some reason my eyes skipped over this completely.
> > So we are fine for all my comments on overflowing.  On plus side
> > will now check if that happens in QEMU and return an error which we
> > weren't doing before.
> >   
> > >   
> > > > > +	}
> > > > > +
> > > > > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > > > > +		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> > > > > +		.payload_in = &payload,
> > > > > +		.size_in = pl_size,    
> > > > 
> > > > This payload size should be whatever we need to store the records,
> > > > not the max size possible.  Particularly as that size is currently
> > > > bigger than the mailbox might be.    
> > > 
> > > But the above check and set ensures that does not happen.
> > >   
> > > > 
> > > > It shouldn't fail (I think) simply because a later version of the spec might
> > > > add more to this message and things should still work, but definitely not
> > > > good practice to tell the hardware this is much longer than it actually is.    
> > > 
> > > I don't follow.
> > > 
> > > The full payload is going to be sent even if we are just clearing 1 record
> > > which is inefficient but it should never overflow the hardware because it is
> > > limited by the check above.
> > > 
> > > So why would this be a problem?  
> > I'm struggling to find a clear spec statement on if this allowed, so the following
> > is a thought experiment. There is language in definition of the "invalid payload length"
> > error code "The input payload length is not valid for the specified command", but it
> > doesn't go into what counts as valid.  
> 
> I think the only thing which makes sense is if the payload length is smaller
> than:
> 
> 	Header + nr_recs * 2
> 
> Anything up to
> 
> 	header + (0xff * 2) should be fine per the 3.0 spec.

nr_recs is in the structure, so it could check that (the whole argument about
future specs is to say that the device shouldn't enforce that limit even though
it can know the structure is longer than expected).

> 
> > 
> > What you have looks fine because a device can't fail on the basis it's told the
> > payload is longer than it expects, because you might be sending a CXL 4.0 spec
> > payload that is backwards compatible with CXL 3.0 - hence the fact the sizes
> > don't match up with that expected can't be considered an error.
> > So far so good... However, we may have a situation not dissimilar to the
> > change in record length for the set event interrupt policy payload between CXL 2.0
> > and CXL 3.0. The only way the endpoint knows what version of message it got is because the
> > record is 4 bytes or 5 bytes.  If we have extra stuff on the end of this record
> > in future the end point can assume that it is a new version of the spec and interpret
> > what is in that payload space.
> > 
> > Say the future structure looks like
> > 
> > struct cxl_mbox_clear_event_payload_future {
> > 	u8 event_log;		/* enum cxl_event_log_type */
> > 	u8 clear_flags;
> > 	u8 nr_recs;
> > 	u8 reserved[3];
> > 	__le16 handle[nr_recs]; 
> > 	__le16 otherdata[nr_recs];  
> 
> otherdata should be ignored by a 3.0 device.
> 
> a theoretical 4.0 device should handle otherdata not being there per some flag
> in the flags field I would suppose...  That would have to be determined if this
> payload were extended.  Otherwise this software will fail no matter what.

A flag isn't required, though obviously nice to have.  Length should be enough.
We already have a version of this with you sending a CXL 2.0 command to
a potentially CXL 3.0 device and not checking if DCD is supported before
doing so (which I think is fine).

> 
> Other mailbox commands do not 0 out from the command size to 1M either.
> 
> > }
> > 
> > Endpoint receiving your 'overly long payload' will assume all those otherdata fields
> > are 0, not necessarily the same as non present.  
> 
> But it is not 'overly long'.  It is only the length of the current spec.  See
> above.

This is where we disagree.  The current spec says:
Table 8-36: Input payload size for Clear Event Records is 8+
(the size for when you are clearing one record)
- Side note: this is potentially wrong, as a clear-all wouldn't have event
record handles, so I think it should be 6+.

Table 8-51 has
* Number of event record handles
* Event record handles
and no defined reserved space after the Event Record Handles.
Hence the size is precisely Header + nr_records * 2, not more.
Not directly relevant, but any record handles that are 0 are treated as
invalid handles (an error is returned).
(this also aligns with what Jonny raised)

> 
> > For the set event interrupt policy, if we sent an overlong payload like you've done here
> > with assumption of the CXL 2.0 spec we would be turning off the DCD interrupt rather
> > that doing nothing (unlikely to be a problem in that particularly case as that one
> > doesn't have a FW Interrupt option - but that's more luck than design).
> > 
> > I'm not sure why we'd have extra stuff for this payload, but it 'might' happen'.  
> 
> I'll have to check but I don't think I set the payload long in that message.
> It too should be sizeof(<set event int policy>)

You haven't.  The illustration there was about the fact that it is 4 bytes in
your implementation (which is fine, as that is the CXL 2.0 message) and 5 bytes
in CXL 3.0, which you'll upgrade to when you add DCD support.

> 
> >   
> > > >   
> > >   
> > > > 
> > > >     
> > > > > +	};
> > > > > +
> > > > > +	/*
> > > > > +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> > > > > +	 * Record can return up to 0xffff records.
> > > > > +	 */
> > > > > +	i = 0;
> > > > > +	for (cnt = 0; cnt < total; cnt++) {
> > > > > +		payload.handle[i++] = get_pl->records[cnt].hdr.handle;
> > > > > +		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
> > > > > +			log, le16_to_cpu(payload.handle[i]));
> > > > > +
> > > > > +		if (i == max_handles) {
> > > > > +			payload.nr_recs = i;
> > > > > +			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > > > +			if (rc)
> > > > > +				return rc;
> > > > > +			i = 0;
> > > > > +		}
> > > > > +	}
> > > > > +
> > > > > +	/* Clear what is left if any */
> > > > > +	if (i) {
> > > > > +		payload.nr_recs = i;
> > > > > +		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > > > +		if (rc)
> > > > > +			return rc;
> > > > > +	}
> > > > > +
> > > > > +	return 0;
> > > > > +}    
> > > > 
> > > > 
> > > > ...
> > > >     
> > > > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > > > index ab138004f644..dd9aa3dd738e 100644
> > > > > --- a/drivers/cxl/cxlmem.h
> > > > > +++ b/drivers/cxl/cxlmem.h    
> > > > 
> > > > ...
> > > >     
> > > > > +
> > > > > +/*
> > > > > + * Clear Event Records input payload
> > > > > + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> > > > > + */
> > > > > +#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
> > > > > +struct cxl_mbox_clear_event_payload {
> > > > > +	u8 event_log;		/* enum cxl_event_log_type */
> > > > > +	u8 clear_flags;
> > > > > +	u8 nr_recs;
> > > > > +	u8 reserved[3];
> > > > > +	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];    
> > > > 
> > > > Doesn't fit in the smallest possible payload buffer.
> > > > It's 526 bytes long.  Payload buffer might be 256 bytes in total.
> > > > (8.2.8.4.3 Mailbox capabilities)
> > > > 
> > > > Lazy approach, make this smaller and do more loops when clearing.
> > > > If we want to optimize this later can expand it to this size.    
> > > 
> > > I agree but the code already checks for and adjusts this on the fly based on
> > > cxlds->payload_size?
> > > 
> > >  +	/* Payload size may limit the max handles */
> > >  +	if (pl_size > cxlds->payload_size) {
> > >  +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
> > >  +		pl_size = cxlds->payload_size;
> > >  +	}
> > > 
> > > Why is this not ok?  [Other than being potentially inefficient.]
> > > 
> > > Do you have a patch to qemu which causes this?  
> > 
> > Two issues crossing I think on my side and me thinking this one was obviously
> > the problem when it wasn't.  
> 
> My fault also for not at least throwing my Qemu test code out there.  I've been
> busy with some things today.  I'll try and get those changes cleaned up and at
> least another RFC set out ASAP.
> 
> >   
> > > 
> > > Ira
> > >   
> > > > > +} __packed;
> > > > > +#define CXL_CLEAR_EVENT_LIMIT_HANDLES(payload_size)			\
> > > > > +	(((payload_size) -						\
> > > > > +		(sizeof(struct cxl_mbox_clear_event_payload) -		\
> > > > > +		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\  
> > 
> > Could use offsetof() to simplify this  
> 
> True.  How about I submit a clean up patch to follow?  I don't think this is
> broken.

I think you'll be changing the patch anyway - so might as well fix this up too :)

Jonathan

> 
> Ira
> 
> >   
> > > > > +		sizeof(__le16))
> > > > > +    
> > > > 
> > > > ...
> > > >     
> >   



* Re: [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load
  2022-12-18  0:25       ` johnny
@ 2022-12-18 15:55         ` Jonathan Cameron
  2023-01-04 23:53           ` Ira Weiny
  0 siblings, 1 reply; 36+ messages in thread
From: Jonathan Cameron @ 2022-12-18 15:55 UTC (permalink / raw)
  To: johnny
  Cc: Ira Weiny (ira.weiny@intel.com),
	Dan Williams, Bjorn Helgaas, Alison Schofield, Vishal Verma,
	Davidlohr Bueso, Dave Jiang, linux-kernel, linux-pci, linux-acpi,
	linux-cxl

On Sun, 18 Dec 2022 08:25:34 +0800
johnny <johnny.li@montage-tech.com> wrote:

> On Fri, Dec 16, 2022 at 01:54:01PM -0800, Ira Weiny (ira.weiny@intel.com) wrote:
> > On Fri, Dec 16, 2022 at 03:39:39PM +0000, Jonathan Cameron wrote:  
> > > On Sun, 11 Dec 2022 23:06:20 -0800
> > > ira.weiny@intel.com wrote:
> > >   
> > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > 
> > > > CXL devices have multiple event logs which can be queried for CXL event
> > > > records.  Devices are required to support the storage of at least one
> > > > event record in each event log type.
> > > > 
> > > > Devices track event log overflow by incrementing a counter and tracking
> > > > the time of the first and last overflow event seen.
> > > > 
> > > > Software queries events via the Get Event Records mailbox command (CXL
> > > > rev 3.0 section 8.2.9.2.2) and clears events via the Clear Event Records
> > > > mailbox command (CXL rev 3.0 section 8.2.9.2.3).
> > > > 
> > > > If the result of negotiating CXL Error Reporting Control is OS control,
> > > > read and clear all event logs on driver load.
> > > > 
> > > > Ensure a clean slate of events by reading and clearing the events on
> > > > driver load.
> > > > 
> > > > The status register is not used because a device may continue to trigger
> > > > events and the only requirement is to empty the log at least once.  This
> > > > allows for the required transition from empty to non-empty for interrupt
> > > > generation.  Handling of interrupts is in a follow on patch.
> > > > 
> > > > The device can return up to 1MB worth of event records per query.
> > > > Allocate a shared large buffer to handle the max number of records based
> > > > on the mailbox payload size.
> > > > 
> > > > This patch traces a raw event record and leaves specific event record
> > > > type tracing to subsequent patches.  Macros are created to aid in
> > > > tracing the common CXL Event header fields.
> > > > 
> > > > Each record is cleared explicitly.  A clear all bit is specified but is
> > > > only valid when the log overflows.
> > > > 
> > > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>  
> > > 
> > > A few things noticed inline.  I've tightened the QEMU code to reject the
> > > case where the input payload claims to be bigger than the mailbox size,
> > > and hacked the size down to 256 bytes so it triggers the problem
> > > highlighted below.  
> > 
> > I'm not sure what you did here.
> >   
> > >   
> > > > 
> > > > ---
> > > > Changes from V3:
> > > > 	Dan
> > > > 		Split off _OSC pcie bits
> > > > 			Use existing style for host bridge flag in that
> > > > 			patch
> > > > 		Clean up event processing loop
> > > > 		Use dev_err_ratelimited()
> > > > 		Clean up version change log
> > > > 		Delete 'EVENT LOG OVERFLOW'
> > > > 		Remove cxl_clear_event_logs()
> > > > 		Add comment for native cxl control
> > > > 		Fail driver load on event buf allocation failure
> > > > 		Comment why events are not processed without _OSC flag
> > > > ---
> > > >  drivers/cxl/core/mbox.c  | 136 +++++++++++++++++++++++++++++++++++++++
> > > >  drivers/cxl/core/trace.h | 120 ++++++++++++++++++++++++++++++++++
> > > >  drivers/cxl/cxl.h        |  12 ++++
> > > >  drivers/cxl/cxlmem.h     |  84 ++++++++++++++++++++++++
> > > >  drivers/cxl/pci.c        |  40 ++++++++++++
> > > >  5 files changed, 392 insertions(+)
> > > > 
> > > > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > > > index b03fba212799..9fb327370e08 100644
> > > > --- a/drivers/cxl/core/mbox.c
> > > > +++ b/drivers/cxl/core/mbox.c  
> > >   
> > > > +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> > > > +				  enum cxl_event_log_type log,
> > > > +				  struct cxl_get_event_payload *get_pl)
> > > > +{
> > > > +	struct cxl_mbox_clear_event_payload payload = {
> > > > +		.event_log = log,
> > > > +	};
> > > > +	u16 total = le16_to_cpu(get_pl->record_count);
> > > > +	u8 max_handles = CXL_CLEAR_EVENT_MAX_HANDLES;
> > > > +	size_t pl_size = sizeof(payload);
> > > > +	struct cxl_mbox_cmd mbox_cmd;
> > > > +	u16 cnt;
> > > > +	int rc;
> > > > +	int i;
> > > > +
> > > > +	/* Payload size may limit the max handles */
> > > > +	if (pl_size > cxlds->payload_size) {
> > > > +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
> > > > +		pl_size = cxlds->payload_size;  
> > 
> > pl_size is only set to the device's max payload size when that is smaller
> > than the full payload size [sizeof(payload) above].
> >   
> > > > +	}
> > > > +
> > > > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > > > +		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> > > > +		.payload_in = &payload,
> > > > +		.size_in = pl_size,  
> > > 
> > > This payload size should be whatever we need to store the records,
> > > not the max size possible.  Particularly as that size is currently
> > > bigger than the mailbox might be.  
> > 
> > But the above check and set ensures that does not happen.
> >   
> > > 
> > > It shouldn't fail (I think) simply because a later version of the spec might
> > > add more to this message and things should still work, but definitely not
> > > good practice to tell the hardware this is much longer than it actually is.  
> > 
> > I don't follow.
> > 
> > The full payload is going to be sent even if we are just clearing 1 record
> > which is inefficient but it should never overflow the hardware because it is
> > limited by the check above.
> > 
> > So why would this be a problem?
> >   
> 
> Per spec 3.0, the Event Record Handles field is "A list of Event Record
> Handles the host has consumed and the device shall now remove from its
> internal Event Log store."  An extra, unused handle list does not follow that
> description.  The spec also says "All event record handles shall be nonzero
> value. A value of 0 shall be treated by the device as an invalid handle."  So
> if there is a value of 0 in the extra unused handles, the device shall return
> an invalid handle error code.

I don't think we fall into that particular corner case, as the number of event
record handles is set correctly.  Otherwise I agree this isn't following the
spec - though I think the key here is that it won't be broken against CXL 3.0
devices (with the rather roundabout argument that a CXL 3.0 device should
handle later spec messages, as those should be backwards compatible) but it
might be broken against CXL 3.0+ ones that interpret the 0s at the end as
having meaning.

Thanks,

Jonathan

> 
> 
> > > 
> > >   
> > > > +	};
> > > > +
> > > > +	/*
> > > > +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> > > > +	 * Record can return up to 0xffff records.
> > > > +	 */
> > > > +	i = 0;
> > > > +	for (cnt = 0; cnt < total; cnt++) {
> > > > +		payload.handle[i++] = get_pl->records[cnt].hdr.handle;
> > > > +		dev_dbg(cxlds->dev, "Event log '%d': Clearing %u\n",
> > > > +			log, le16_to_cpu(payload.handle[i]));
> > > > +
> > > > +		if (i == max_handles) {
> > > > +			payload.nr_recs = i;
> > > > +			rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > > +			if (rc)
> > > > +				return rc;
> > > > +			i = 0;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	/* Clear what is left if any */
> > > > +	if (i) {
> > > > +		payload.nr_recs = i;
> > > > +		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
> > > > +		if (rc)
> > > > +			return rc;
> > > > +	}
> > > > +
> > > > +	return 0;
> > > > +}  
> > > 
> > > 
> > > ...
> > >   
> > > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > > index ab138004f644..dd9aa3dd738e 100644
> > > > --- a/drivers/cxl/cxlmem.h
> > > > +++ b/drivers/cxl/cxlmem.h  
> > > 
> > > ...
> > >   
> > > > +
> > > > +/*
> > > > + * Clear Event Records input payload
> > > > + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> > > > + */
> > > > +#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
> > > > +struct cxl_mbox_clear_event_payload {
> > > > +	u8 event_log;		/* enum cxl_event_log_type */
> > > > +	u8 clear_flags;
> > > > +	u8 nr_recs;
> > > > +	u8 reserved[3];
> > > > +	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];  
> > > 
> > > Doesn't fit in the smallest possible payload buffer.
> > > It's 516 bytes long.  Payload buffer might be 256 bytes in total.
> > > (8.2.8.4.3 Mailbox capabilities)
> > > 
> > > Lazy approach, make this smaller and do more loops when clearing.
> > > If we want to optimize this later can expand it to this size.  
> > 
> > I agree but the code already checks for and adjusts this on the fly based on
> > cxlds->payload_size?
> > 
> >  +	/* Payload size may limit the max handles */
> >  +	if (pl_size > cxlds->payload_size) {
> >  +		max_handles = CXL_CLEAR_EVENT_LIMIT_HANDLES(cxlds->payload_size);
> >  +		pl_size = cxlds->payload_size;
> >  +	}
> > 
> > Why is this not ok?  [Other than being potentially inefficient.]
> > 
> > Do you have a patch to qemu which causes this?
> > 
> > Ira
> >   
> > > > +} __packed;
> > > > +#define CXL_CLEAR_EVENT_LIMIT_HANDLES(payload_size)			\
> > > > +	(((payload_size) -						\
> > > > +		(sizeof(struct cxl_mbox_clear_event_payload) -		\
> > > > +		 (sizeof(__le16) * CXL_CLEAR_EVENT_MAX_HANDLES))) /	\
> > > > +		sizeof(__le16))
> > > > +  
> > > 
> > > ...
> > >   
> >   
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load
  2022-12-18 15:55         ` Jonathan Cameron
@ 2023-01-04 23:53           ` Ira Weiny
  0 siblings, 0 replies; 36+ messages in thread
From: Ira Weiny @ 2023-01-04 23:53 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: johnny, Dan Williams, Bjorn Helgaas, Alison Schofield,
	Vishal Verma, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-pci, linux-acpi, linux-cxl

On Sun, Dec 18, 2022 at 03:55:53PM +0000, Jonathan Cameron wrote:
> On Sun, 18 Dec 2022 08:25:34 +0800
> johnny <johnny.li@montage-tech.com> wrote:
> 

[snip]

> > >   
> > > > > +	}
> > > > > +
> > > > > +	mbox_cmd = (struct cxl_mbox_cmd) {
> > > > > +		.opcode = CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> > > > > +		.payload_in = &payload,
> > > > > +		.size_in = pl_size,  
> > > > 
> > > > This payload size should be whatever we need to store the records,
> > > > not the max size possible.  Particularly as that size is currently
> > > > bigger than the mailbox might be.  
> > > 
> > > But the above check and set ensures that does not happen.
> > >   
> > > > 
> > > > It shouldn't fail (I think) simply because a later version of the spec might
> > > > add more to this message and things should still work, but definitely not
> > > > good practice to tell the hardware this is much longer than it actually is.  
> > > 
> > > I don't follow.
> > > 
> > > The full payload is going to be sent even if we are just clearing 1 record
> > > which is inefficient but it should never overflow the hardware because it is
> > > limited by the check above.
> > > 
> > > So why would this be a problem?
> > >   
> > 
> > Per spec 3.0, the Event Record Handles field is "A list of Event Record
> > Handles the host has consumed and the device shall now remove from its
> > internal Event Log store."  An extra, unused handle list does not follow
> > that description.  The spec also says "All event record handles shall be
> > nonzero value. A value of 0 shall be treated by the device as an invalid
> > handle."  So if there is a value of 0 in the extra unused handles, the
> > device shall return an invalid handle error code.
> 
> I don't think we fall into that particular corner case, as the number of event
> record handles is set correctly.  Otherwise I agree this isn't following the
> spec - though I think the key here is that it won't be broken against CXL 3.0
> devices (with the rather roundabout argument that a CXL 3.0 device should
> handle later spec messages, as those should be backwards compatible) but it
> might be broken against CXL 3.0+ ones that interpret the 0s at the end as
> having meaning.

I'm respinning this to add the pci_set_master() call anyway, so I'm going to
change this as well.  I really don't see how hardware would go off anything but
the number of records to process the handles.  I could see some overly strict
firmware wanting to validate that the size is exactly equal to the number
specified rather than just less than it (which is where I would anticipate an
issue).
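
The per-batch sizing agreed to here could look roughly like the following
userspace sketch.  The actual respin may well use the kernel's struct_size()
helper instead; the names and plain stdint types here are illustrative only:

```c
#include <stddef.h>
#include <stdint.h>

#define CXL_CLEAR_EVENT_MAX_HANDLES 0xff

/* Userspace mirror of the payload layout from the patch */
struct cxl_mbox_clear_event_payload {
	uint8_t event_log;
	uint8_t clear_flags;
	uint8_t nr_recs;
	uint8_t reserved[3];
	uint16_t handle[CXL_CLEAR_EVENT_MAX_HANDLES];
};

/* Size the input payload to just the handles in this batch */
size_t clear_event_pl_size(size_t nr_recs)
{
	return offsetof(struct cxl_mbox_clear_event_payload, handle) +
	       nr_recs * sizeof(uint16_t);
}
```

Passing this as size_in for each mailbox command avoids sending trailing zero
handles that a strict device could reject as invalid.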

Dan has agreed to land, in cxl/next, the patch I need that moves the trace
point definition to drivers/cxl.  After that I will rebase and send out a new
version.

Ira

> 
> Thanks,
> 
> Jonathan
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 1/9] PCI/CXL: Export native CXL error reporting control
  2022-12-12  7:06 ` [PATCH V4 1/9] PCI/CXL: Export native CXL error reporting control ira.weiny
  2022-12-13 19:12   ` Dan Williams
  2022-12-16 14:09   ` Jonathan Cameron
@ 2023-01-05  3:16   ` Ira Weiny
  2023-01-05 16:56   ` Bjorn Helgaas
  3 siblings, 0 replies; 36+ messages in thread
From: Ira Weiny @ 2023-01-05  3:16 UTC (permalink / raw)
  To: Dan Williams, Rafael J. Wysocki
  Cc: Bjorn Helgaas, Lukas Wunner, linux-pci, linux-acpi,
	Alison Schofield, Vishal Verma, Davidlohr Bueso,
	Jonathan Cameron, Dave Jiang, linux-kernel, linux-cxl

On Sun, Dec 11, 2022 at 11:06:19PM -0800, Ira wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL _OSC Error Reporting Control is used by the OS to determine if
> Firmware has control of various CXL error reporting capabilities
> including the event logs.
> 
> Expose the result of negotiating CXL Error Reporting Control in struct
> pci_host_bridge for consumption by the CXL drivers.
> 

Rafael,

I should have CC'ed you on this patch.  Could I get an ack on it so Dan can
take it through the CXL tree?

Thanks,
Ira

> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Lukas Wunner <lukas@wunner.de>
> Cc: linux-pci@vger.kernel.org
> Cc: linux-acpi@vger.kernel.org
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes from V3:
> 	New patch split out
> ---
>  drivers/acpi/pci_root.c | 3 +++
>  drivers/pci/probe.c     | 1 +
>  include/linux/pci.h     | 1 +
>  3 files changed, 5 insertions(+)
> 
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index b3c202d2a433..84030804a763 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -1047,6 +1047,9 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
>  	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
>  		host_bridge->native_dpc = 0;
>  
> +	if (!(root->osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL))
> +		host_bridge->native_cxl_error = 0;
> +
>  	/*
>  	 * Evaluate the "PCI Boot Configuration" _DSM Function.  If it
>  	 * exists and returns 0, we must preserve any PCI resource
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 2f4e88a44e8b..34c9fd6840c4 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -596,6 +596,7 @@ static void pci_init_host_bridge(struct pci_host_bridge *bridge)
>  	bridge->native_ltr = 1;
>  	bridge->native_dpc = 1;
>  	bridge->domain_nr = PCI_DOMAIN_NR_NOT_SET;
> +	bridge->native_cxl_error = 1;
>  
>  	device_initialize(&bridge->dev);
>  }
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 1f81807492ef..08c3ccd2617b 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -577,6 +577,7 @@ struct pci_host_bridge {
>  	unsigned int	native_pme:1;		/* OS may use PCIe PME */
>  	unsigned int	native_ltr:1;		/* OS may use PCIe LTR */
>  	unsigned int	native_dpc:1;		/* OS may use PCIe DPC */
> +	unsigned int	native_cxl_error:1;	/* OS may use CXL RAS/Events */
>  	unsigned int	preserve_config:1;	/* Preserve FW resource setup */
>  	unsigned int	size_windows:1;		/* Enable root bus sizing */
>  	unsigned int	msi_domain:1;		/* Bridge wants MSI domain */
> -- 
> 2.37.2
> 
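
For context, the negotiation result in the patch above ends up as a simple
default-then-clear: the bridge starts out claiming native control and _OSC can
only take it away.  A userspace model of that logic follows; the
OSC_CXL_ERROR_REPORTING_CONTROL bit value below is an assumption for
illustration (the real definition lives in include/linux/acpi.h):

```c
#include <stdint.h>

/* Assumed bit position, for illustration only; see include/linux/acpi.h */
#define OSC_CXL_ERROR_REPORTING_CONTROL 0x00000002

/* Mirrors the pci_init_host_bridge() default plus the clearing done in
 * acpi_pci_root_create() when _OSC did not grant the OS control. */
int cxl_error_is_native(uint32_t osc_ext_control_set)
{
	int native_cxl_error = 1;	/* default: OS may use CXL RAS/Events */

	if (!(osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL))
		native_cxl_error = 0;	/* firmware retained control */

	return native_cxl_error;
}
```

The CXL drivers then only read and clear the event logs when this flag is set,
leaving firmware-first platforms alone.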

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH V4 1/9] PCI/CXL: Export native CXL error reporting control
  2022-12-12  7:06 ` [PATCH V4 1/9] PCI/CXL: Export native CXL error reporting control ira.weiny
                     ` (2 preceding siblings ...)
  2023-01-05  3:16   ` Ira Weiny
@ 2023-01-05 16:56   ` Bjorn Helgaas
  3 siblings, 0 replies; 36+ messages in thread
From: Bjorn Helgaas @ 2023-01-05 16:56 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Bjorn Helgaas, Lukas Wunner, linux-pci, linux-acpi,
	Alison Schofield, Vishal Verma, Davidlohr Bueso,
	Jonathan Cameron, Dave Jiang, linux-kernel, linux-cxl

On Sun, Dec 11, 2022 at 11:06:19PM -0800, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL _OSC Error Reporting Control is used by the OS to determine if
> Firmware has control of various CXL error reporting capabilities
> including the event logs.
> 
> Expose the result of negotiating CXL Error Reporting Control in struct
> pci_host_bridge for consumption by the CXL drivers.
> 
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Lukas Wunner <lukas@wunner.de>
> Cc: linux-pci@vger.kernel.org
> Cc: linux-acpi@vger.kernel.org
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

FWIW, for probe.c and pci.h:

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
> Changes from V3:
> 	New patch split out
> ---
>  drivers/acpi/pci_root.c | 3 +++
>  drivers/pci/probe.c     | 1 +
>  include/linux/pci.h     | 1 +
>  3 files changed, 5 insertions(+)
> 
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index b3c202d2a433..84030804a763 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -1047,6 +1047,9 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
>  	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
>  		host_bridge->native_dpc = 0;
>  
> +	if (!(root->osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL))
> +		host_bridge->native_cxl_error = 0;
> +
>  	/*
>  	 * Evaluate the "PCI Boot Configuration" _DSM Function.  If it
>  	 * exists and returns 0, we must preserve any PCI resource
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 2f4e88a44e8b..34c9fd6840c4 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -596,6 +596,7 @@ static void pci_init_host_bridge(struct pci_host_bridge *bridge)
>  	bridge->native_ltr = 1;
>  	bridge->native_dpc = 1;
>  	bridge->domain_nr = PCI_DOMAIN_NR_NOT_SET;
> +	bridge->native_cxl_error = 1;
>  
>  	device_initialize(&bridge->dev);
>  }
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 1f81807492ef..08c3ccd2617b 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -577,6 +577,7 @@ struct pci_host_bridge {
>  	unsigned int	native_pme:1;		/* OS may use PCIe PME */
>  	unsigned int	native_ltr:1;		/* OS may use PCIe LTR */
>  	unsigned int	native_dpc:1;		/* OS may use PCIe DPC */
> +	unsigned int	native_cxl_error:1;	/* OS may use CXL RAS/Events */
>  	unsigned int	preserve_config:1;	/* Preserve FW resource setup */
>  	unsigned int	size_windows:1;		/* Enable root bus sizing */
>  	unsigned int	msi_domain:1;		/* Bridge wants MSI domain */
> -- 
> 2.37.2
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2023-01-05 16:56 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-12  7:06 [PATCH V4 0/9] CXL: Process event logs ira.weiny
2022-12-12  7:06 ` [PATCH V4 1/9] PCI/CXL: Export native CXL error reporting control ira.weiny
2022-12-13 19:12   ` Dan Williams
2022-12-16 14:09   ` Jonathan Cameron
2023-01-05  3:16   ` Ira Weiny
2023-01-05 16:56   ` Bjorn Helgaas
2022-12-12  7:06 ` [PATCH V4 2/9] cxl/mem: Read, trace, and clear events on driver load ira.weiny
2022-12-13  6:49   ` johnny
2022-12-13 18:56     ` Ira Weiny
2022-12-16 15:39   ` Jonathan Cameron
2022-12-16 21:54     ` Ira Weiny
2022-12-17 16:38       ` Jonathan Cameron
2022-12-18  0:21         ` Ira Weiny
2022-12-18 15:52           ` Jonathan Cameron
2022-12-18  0:25       ` johnny
2022-12-18 15:55         ` Jonathan Cameron
2023-01-04 23:53           ` Ira Weiny
2022-12-12  7:06 ` [PATCH V4 3/9] cxl/mem: Wire up event interrupts ira.weiny
2022-12-13 20:15   ` Dan Williams
2022-12-16 14:24   ` Jonathan Cameron
2022-12-16 18:42     ` Jonathan Cameron
2022-12-16 21:28       ` Ira Weiny
2022-12-17 16:40         ` Jonathan Cameron
2022-12-16 18:21   ` Jonathan Cameron
2022-12-16 21:33     ` Ira Weiny
2022-12-17 16:43       ` Jonathan Cameron
2022-12-12  7:06 ` [PATCH V4 4/9] cxl/mem: Trace General Media Event Record ira.weiny
2022-12-12  7:06 ` [PATCH V4 5/9] cxl/mem: Trace DRAM " ira.weiny
2022-12-12  7:06 ` [PATCH V4 6/9] cxl/mem: Trace Memory Module " ira.weiny
2022-12-12  7:06 ` [PATCH V4 7/9] cxl/test: Add generic mock events ira.weiny
2022-12-12  7:06 ` [PATCH V4 8/9] cxl/test: Add specific events ira.weiny
2022-12-12  7:06 ` [PATCH V4 9/9] cxl/test: Simulate event log overflow ira.weiny
2022-12-16 12:25 ` [PATCH V4 0/9] CXL: Process event logs Jonathan Cameron
2022-12-16 17:01   ` Dan Williams
2022-12-16 18:15     ` Ira Weiny
2022-12-16 18:39     ` Jonathan Cameron
