linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V2 00/11] CXL: Process event logs
@ 2022-12-01  0:27 ira.weiny
  2022-12-01  0:27 ` [PATCH V2 01/11] cxl/pci: Add generic MSI-X/MSI irq support ira.weiny
                   ` (10 more replies)
  0 siblings, 11 replies; 64+ messages in thread
From: ira.weiny @ 2022-12-01  0:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Changes from V1
	Address comments, from Jonathan, Dave, and Alison
		Main comment was to allow for a full payload size number of
		event records to be processed on each Get event cyle.
	Pick up tags


This code has been tested with a newer qemu which allows for more events to be
returned at a time as well ad additional QMP event and interrupt injection.
Thos patches will follow once they have been cleaned up.

The series is in 5 parts:

	0) Davidlohrs irq patch modified for 16 vectors
	1) Base functionality
	2) Parsing specific events (Dynamic Capacity Event Record is defered)
	3) Event interrupt support
	4) cxl-test infrastructure for basic tests

While I believe this entire series is ready to be merged I realize that the
interrupt support may still have some discussion around it.  Therefor parts 1,
2, and 4 could be merged without irq support as cxl-test provides testing for
that.  Interrupt testing requires Qemu but it too is fully tested and ready to
go.


Changes from RFC v2
	Integrated Davidlohr's irq patch, allocate up to 16 vectors, and base
		my irq support on modifications to that patch.
	Smita
		Check event status before reading each log.
	Jonathan
		Process more than 1 record at a time
		Remove reserved fields
	Steven
		Prefix trace points with 'cxl_'
	Davidlohr
		PUll in his patch

Changes from RFC v1
	Add event irqs
	General simplification of the code.
	Resolve field alignment questions
	Update to rev 3.0 for comments and structures
	Add reserved fields and output them

Event records inform the OS of various device events.  Events are not needed
for any kernel operation but various user level software will want to track
events.

Add event reporting through the trace event mechanism.  On driver load read and
clear all device events.

Enable all event logs for interrupts and process each log on interrupt.


TESTING:

Testing of this was performed with additions to QEMU in the following repo:

	https://github.com/weiny2/qemu/tree/ira-cxl-events-latest

Changes to this repo are not finalized yet so I'm not posting those patches
right away.  But there is enough functionality added to further test this.

	1) event status register
	2) additional event injection capabilities
	3) Process more than 1 record at a time in Get/Clear mailbox commands

Davidlohr Bueso (1):
  cxl/pci: Add generic MSI-X/MSI irq support

Ira Weiny (10):
  cxl/mem: Implement Get Event Records command
  cxl/mem: Implement Clear Event Records command
  cxl/mem: Clear events on driver load
  cxl/mem: Trace General Media Event Record
  cxl/mem: Trace DRAM Event Record
  cxl/mem: Trace Memory Module Event Record
  cxl/mem: Wire up event interrupts
  cxl/test: Add generic mock events
  cxl/test: Add specific events
  cxl/test: Simulate event log overflow

 MAINTAINERS                     |   1 +
 drivers/cxl/core/mbox.c         | 260 +++++++++++++++++
 drivers/cxl/cxl.h               |   7 +
 drivers/cxl/cxlmem.h            | 188 ++++++++++++
 drivers/cxl/cxlpci.h            |   6 +
 drivers/cxl/pci.c               | 155 ++++++++++
 include/trace/events/cxl.h      | 486 ++++++++++++++++++++++++++++++++
 include/uapi/linux/cxl_mem.h    |   4 +
 tools/testing/cxl/test/Kbuild   |   2 +-
 tools/testing/cxl/test/events.c | 362 ++++++++++++++++++++++++
 tools/testing/cxl/test/events.h |   9 +
 tools/testing/cxl/test/mem.c    |  35 +++
 12 files changed, 1514 insertions(+), 1 deletion(-)
 create mode 100644 include/trace/events/cxl.h
 create mode 100644 tools/testing/cxl/test/events.c
 create mode 100644 tools/testing/cxl/test/events.h


base-commit: aae703b02f92bde9264366c545e87cec451de471
-- 
2.37.2


^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH V2 01/11] cxl/pci: Add generic MSI-X/MSI irq support
  2022-12-01  0:27 [PATCH V2 00/11] CXL: Process event logs ira.weiny
@ 2022-12-01  0:27 ` ira.weiny
  2022-12-01 10:18   ` Jonathan Cameron
                     ` (2 more replies)
  2022-12-01  0:27 ` [PATCH V2 02/11] cxl/mem: Implement Get Event Records command ira.weiny
                   ` (9 subsequent siblings)
  10 siblings, 3 replies; 64+ messages in thread
From: ira.weiny @ 2022-12-01  0:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: Davidlohr Bueso, Bjorn Helgaas, Jonathan Cameron, Ira Weiny,
	Alison Schofield, Vishal Verma, Ben Widawsky, Steven Rostedt,
	Dave Jiang, linux-kernel, linux-cxl

From: Davidlohr Bueso <dave@stgolabs.net>

Currently the only CXL features targeted for irq support require their
message numbers to be within the first 16 entries.  The device may
however support less than 16 entries depending on the support it
provides.

Attempt to allocate these 16 irq vectors.  If the device supports less
then the PCI infrastructure will allocate that number.  Store the number
of vectors actually allocated in the device state for later use
by individual functions.

Upon successful allocation, users can plug in their respective isr at
any point thereafter, for example, if the irq setup is not done in the
PCI driver, such as the case of the CXL-PMU.

Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-developed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>

---
Changes from V1:
	Jonathan
		pci_alloc_irq_vectors() cleans up the vectors automatically
		use msi_enabled rather than nr_irq_vecs

Changes from Ira
	Remove reviews
	Allocate up to a static 16 vectors.
	Change cover letter
---
 drivers/cxl/cxlmem.h |  3 +++
 drivers/cxl/cxlpci.h |  6 ++++++
 drivers/cxl/pci.c    | 23 +++++++++++++++++++++++
 3 files changed, 32 insertions(+)

diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 88e3a8e54b6a..cd35f43fedd4 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -211,6 +211,7 @@ struct cxl_endpoint_dvsec_info {
  * @info: Cached DVSEC information about the device.
  * @serial: PCIe Device Serial Number
  * @doe_mbs: PCI DOE mailbox array
+ * @msi_enabled: MSI-X/MSI has been enabled
  * @mbox_send: @dev specific transport for transmitting mailbox commands
  *
  * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
@@ -247,6 +248,8 @@ struct cxl_dev_state {
 
 	struct xarray doe_mbs;
 
+	bool msi_enabled;
+
 	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
 };
 
diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
index eec597dbe763..b7f4e2f417d3 100644
--- a/drivers/cxl/cxlpci.h
+++ b/drivers/cxl/cxlpci.h
@@ -53,6 +53,12 @@
 #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
 #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
 
+/*
+ * NOTE: Currently all the functions which are enabled for CXL require their
+ * vectors to be in the first 16.  Use this as the max.
+ */
+#define CXL_PCI_REQUIRED_VECTORS 16
+
 /* Register Block Identifier (RBI) */
 enum cxl_regloc_type {
 	CXL_REGLOC_RBI_EMPTY = 0,
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index faeb5d9d7a7a..8f86f85d89c7 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -428,6 +428,27 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
 	}
 }
 
+static void cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
+{
+	struct device *dev = cxlds->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	int nvecs;
+
+	/*
+	 * NOTE: pci_alloc_irq_vectors() handles calling pci_free_irq_vectors()
+	 * automatically despite not being called pcim_*.  See
+	 * pci_setup_msi_context().
+	 */
+	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_REQUIRED_VECTORS,
+				   PCI_IRQ_MSIX | PCI_IRQ_MSI);
+	if (nvecs < 0) {
+		dev_dbg(dev, "Failed to alloc irq vectors; use polling instead.\n");
+		return;
+	}
+
+	cxlds->msi_enabled = true;
+}
+
 static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct cxl_register_map map;
@@ -494,6 +515,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
+	cxl_pci_alloc_irq_vectors(cxlds);
+
 	cxlmd = devm_cxl_add_memdev(cxlds);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH V2 02/11] cxl/mem: Implement Get Event Records command
  2022-12-01  0:27 [PATCH V2 00/11] CXL: Process event logs ira.weiny
  2022-12-01  0:27 ` [PATCH V2 01/11] cxl/pci: Add generic MSI-X/MSI irq support ira.weiny
@ 2022-12-01  0:27 ` ira.weiny
  2022-12-01 13:06   ` Jonathan Cameron
                     ` (2 more replies)
  2022-12-01  0:27 ` [PATCH V2 03/11] cxl/mem: Implement Clear " ira.weiny
                   ` (8 subsequent siblings)
  10 siblings, 3 replies; 64+ messages in thread
From: ira.weiny @ 2022-12-01  0:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Steven Rostedt, Alison Schofield, Vishal Verma,
	Ben Widawsky, Jonathan Cameron, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL devices have multiple event logs which can be queried for CXL event
records.  Devices are required to support the storage of at least one
event record in each event log type.

Devices track event log overflow by incrementing a counter and tracking
the time of the first and last overflow event seen.

Software queries events via the Get Event Record mailbox command; CXL
rev 3.0 section 8.2.9.2.2.

Issue the Get Event Record mailbox command on driver load.  Trace each
record found with a generic record trace.  Trace any overflow
conditions.

The device can return up to 1MB worth of event records per query.
Allocate a shared large buffer to handle the max number of records based
on the mailbox payload size.

This patch traces a raw event record only and leaves the specific event
record types to subsequent patches.

Macros are created to use for tracing the common CXL Event header
fields.

Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Change from V1:
	Ignore useless More Event Flag
	defer DCD support
	Jonathan
		delete extra blank line
		Use all caps for flags
	Jonathan/Dan/Ira
		Allocate event MB buffer on start up.
	Alison
		s/pl_nr/nr_pl

Change from RFC v2:
	Support reading 3 events at once.
	Reverse Jonathan's suggestion and check for positive number of
		records.  Because the record count may have been
		returned as something > 3 based on what the device
		thinks it can send back even though the core Linux mbox
		processing truncates the data.
	Alison and Dave Jiang
		Change header uuid type to uuid_t for better user space
		processing
	Smita
		Check status reg before reading log.
	Steven
		Prefix all trace points with 'cxl_'
		Use static branch <trace>_enabled() calls
	Jonathan
		s/CXL_EVENT_TYPE_INFO/0
		s/{first,last}/{first,last}_ts
		Remove Reserved field from header
		Fix header issue for cxl_event_log_type_str()

Change from RFC:
	Remove redundant error message in get event records loop
	s/EVENT_RECORD_DATA_LENGTH/CXL_EVENT_RECORD_DATA_LENGTH
	Use hdr_uuid for the header UUID field
	Use cxl_event_log_type_str() for the trace events
	Create macros for the header fields and common entries of each event
	Add reserved buffer output dump
	Report error if event query fails
	Remove unused record_cnt variable
	Steven - reorder overflow record
		Remove NOTE about checkpatch
	Jonathan
		check for exactly 1 record
		s/v3.0/rev 3.0
		Use 3 byte fields for 24bit fields
		Add 3.0 Maintenance Operation Class
		Add Dynamic Capacity log type
		Fix spelling
	Dave Jiang/Dan/Alison
		s/cxl-event/cxl
		trace/events/cxl-events => trace/events/cxl.h
		s/cxl_event_overflow/overflow
		s/cxl_event/generic_event
---
 MAINTAINERS                  |   1 +
 drivers/cxl/core/mbox.c      | 105 +++++++++++++++++++++++++++++
 drivers/cxl/cxl.h            |   7 ++
 drivers/cxl/cxlmem.h         |  72 ++++++++++++++++++++
 include/trace/events/cxl.h   | 126 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/cxl_mem.h |   1 +
 6 files changed, 312 insertions(+)
 create mode 100644 include/trace/events/cxl.h

diff --git a/MAINTAINERS b/MAINTAINERS
index ca063a504026..4b7c6e3055c6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5223,6 +5223,7 @@ M:	Dan Williams <dan.j.williams@intel.com>
 L:	linux-cxl@vger.kernel.org
 S:	Maintained
 F:	drivers/cxl/
+F:	include/trace/events/cxl.h
 F:	include/uapi/linux/cxl_mem.h
 
 CONEXANT ACCESSRUNNER USB DRIVER
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 16176b9278b4..70b681027a3d 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -7,6 +7,9 @@
 #include <cxlmem.h>
 #include <cxl.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/cxl.h>
+
 #include "core.h"
 
 static bool cxl_raw_allow_all;
@@ -48,6 +51,7 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
 	CXL_CMD(RAW, CXL_VARIABLE_PAYLOAD, CXL_VARIABLE_PAYLOAD, 0),
 #endif
 	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
+	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
 	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
 	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
 	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
@@ -704,6 +708,106 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
+static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
+				    enum cxl_event_log_type type)
+{
+	struct cxl_get_event_payload *payload;
+	u16 nr_rec;
+
+	mutex_lock(&cxlds->event_buf_lock);
+
+	payload = cxlds->event_buf;
+
+	do {
+		u8 log_type = type;
+		int rc;
+
+		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVENT_RECORD,
+				       &log_type, sizeof(log_type),
+				       payload, cxlds->payload_size);
+		if (rc) {
+			dev_err(cxlds->dev, "Event log '%s': Failed to query event records : %d",
+				cxl_event_log_type_str(type), rc);
+			goto unlock_buffer;
+		}
+
+		nr_rec = le16_to_cpu(payload->record_count);
+		if (trace_cxl_generic_event_enabled()) {
+			int i;
+
+			for (i = 0; i < nr_rec; i++)
+				trace_cxl_generic_event(dev_name(cxlds->dev),
+							type,
+							&payload->records[i]);
+		}
+
+		if (trace_cxl_overflow_enabled() &&
+		    (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW))
+			trace_cxl_overflow(dev_name(cxlds->dev), type, payload);
+
+	} while (nr_rec);
+
+unlock_buffer:
+	mutex_unlock(&cxlds->event_buf_lock);
+}
+
+static void cxl_mem_free_event_buffer(void *data)
+{
+	struct cxl_dev_state *cxlds = data;
+
+	kvfree(cxlds->event_buf);
+}
+
+/*
+ * There is a single buffer for reading event logs from the mailbox.  All logs
+ * share this buffer protected by the cxlds->event_buf_lock.
+ */
+static struct cxl_get_event_payload *alloc_event_buf(struct cxl_dev_state *cxlds)
+{
+	struct cxl_get_event_payload *buf;
+
+	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
+		cxlds->payload_size);
+
+	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
+	if (buf && devm_add_action_or_reset(cxlds->dev,
+			cxl_mem_free_event_buffer, cxlds))
+		return NULL;
+	return buf;
+}
+
+/**
+ * cxl_mem_get_event_records - Get Event Records from the device
+ * @cxlds: The device data for the operation
+ *
+ * Retrieve all event records available on the device and report them as trace
+ * events.
+ *
+ * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
+ */
+void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
+{
+	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
+
+	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
+
+	if (!cxlds->event_buf) {
+		cxlds->event_buf = alloc_event_buf(cxlds);
+		if (WARN_ON_ONCE(!cxlds->event_buf))
+			return;
+	}
+
+	if (status & CXLDEV_EVENT_STATUS_INFO)
+		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
+	if (status & CXLDEV_EVENT_STATUS_WARN)
+		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
+	if (status & CXLDEV_EVENT_STATUS_FAIL)
+		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
+	if (status & CXLDEV_EVENT_STATUS_FATAL)
+		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
+}
+EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
+
 /**
  * cxl_mem_get_partition_info - Get partition info
  * @cxlds: The device data for the operation
@@ -846,6 +950,7 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
 	}
 
 	mutex_init(&cxlds->mbox_mutex);
+	mutex_init(&cxlds->event_buf_lock);
 	cxlds->dev = dev;
 
 	return cxlds;
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index f680450f0b16..d4baae74cd97 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -132,6 +132,13 @@ static inline int ways_to_cxl(unsigned int ways, u8 *iw)
 #define CXLDEV_CAP_CAP_ID_SECONDARY_MAILBOX 0x3
 #define CXLDEV_CAP_CAP_ID_MEMDEV 0x4000
 
+/* CXL 3.0 8.2.8.3.1 Event Status Register */
+#define CXLDEV_DEV_EVENT_STATUS_OFFSET		0x00
+#define CXLDEV_EVENT_STATUS_INFO		BIT(0)
+#define CXLDEV_EVENT_STATUS_WARN		BIT(1)
+#define CXLDEV_EVENT_STATUS_FAIL		BIT(2)
+#define CXLDEV_EVENT_STATUS_FATAL		BIT(3)
+
 /* CXL 2.0 8.2.8.4 Mailbox Registers */
 #define CXLDEV_MBOX_CAPS_OFFSET 0x00
 #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index cd35f43fedd4..55d57f5a64bc 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -4,6 +4,7 @@
 #define __CXL_MEM_H__
 #include <uapi/linux/cxl_mem.h>
 #include <linux/cdev.h>
+#include <linux/uuid.h>
 #include "cxl.h"
 
 /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
@@ -250,12 +251,16 @@ struct cxl_dev_state {
 
 	bool msi_enabled;
 
+	struct cxl_get_event_payload *event_buf;
+	struct mutex event_buf_lock;
+
 	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
 };
 
 enum cxl_opcode {
 	CXL_MBOX_OP_INVALID		= 0x0000,
 	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
+	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
 	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
 	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
 	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
@@ -325,6 +330,72 @@ struct cxl_mbox_identify {
 	u8 qos_telemetry_caps;
 } __packed;
 
+/*
+ * Common Event Record Format
+ * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
+ */
+struct cxl_event_record_hdr {
+	uuid_t id;
+	u8 length;
+	u8 flags[3];
+	__le16 handle;
+	__le16 related_handle;
+	__le64 timestamp;
+	u8 maint_op_class;
+	u8 reserved[0xf];
+} __packed;
+
+#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
+struct cxl_event_record_raw {
+	struct cxl_event_record_hdr hdr;
+	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
+} __packed;
+
+/*
+ * Get Event Records output payload
+ * CXL rev 3.0 section 8.2.9.2.2; Table 8-50
+ */
+#define CXL_GET_EVENT_FLAG_OVERFLOW		BIT(0)
+#define CXL_GET_EVENT_FLAG_MORE_RECORDS		BIT(1)
+struct cxl_get_event_payload {
+	u8 flags;
+	u8 reserved1;
+	__le16 overflow_err_count;
+	__le64 first_overflow_timestamp;
+	__le64 last_overflow_timestamp;
+	__le16 record_count;
+	u8 reserved2[0xa];
+	struct cxl_event_record_raw records[];
+} __packed;
+
+/*
+ * CXL rev 3.0 section 8.2.9.2.2; Table 8-49
+ */
+enum cxl_event_log_type {
+	CXL_EVENT_TYPE_INFO = 0x00,
+	CXL_EVENT_TYPE_WARN,
+	CXL_EVENT_TYPE_FAIL,
+	CXL_EVENT_TYPE_FATAL,
+	CXL_EVENT_TYPE_MAX
+};
+
+static inline const char *cxl_event_log_type_str(enum cxl_event_log_type type)
+{
+	switch (type) {
+	case CXL_EVENT_TYPE_INFO:
+		return "Informational";
+	case CXL_EVENT_TYPE_WARN:
+		return "Warning";
+	case CXL_EVENT_TYPE_FAIL:
+		return "Failure";
+	case CXL_EVENT_TYPE_FATAL:
+		return "Fatal";
+	default:
+		break;
+	}
+	return "<unknown>";
+}
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
@@ -384,6 +455,7 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
 struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
 void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
 void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
+void cxl_mem_get_event_records(struct cxl_dev_state *cxlds);
 #ifdef CONFIG_CXL_SUSPEND
 void cxl_mem_active_inc(void);
 void cxl_mem_active_dec(void);
diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
new file mode 100644
index 000000000000..c03a1a894af8
--- /dev/null
+++ b/include/trace/events/cxl.h
@@ -0,0 +1,126 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM cxl
+
+#if !defined(_CXL_TRACE_EVENTS_H) ||  defined(TRACE_HEADER_MULTI_READ)
+#define _CXL_TRACE_EVENTS_H
+
+#include <asm-generic/unaligned.h>
+#include <linux/tracepoint.h>
+#include <cxlmem.h>
+
+TRACE_EVENT(cxl_overflow,
+
+	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
+		 struct cxl_get_event_payload *payload),
+
+	TP_ARGS(dev_name, log, payload),
+
+	TP_STRUCT__entry(
+		__string(dev_name, dev_name)
+		__field(int, log)
+		__field(u64, first_ts)
+		__field(u64, last_ts)
+		__field(u16, count)
+	),
+
+	TP_fast_assign(
+		__assign_str(dev_name, dev_name);
+		__entry->log = log;
+		__entry->count = le16_to_cpu(payload->overflow_err_count);
+		__entry->first_ts = le64_to_cpu(payload->first_overflow_timestamp);
+		__entry->last_ts = le64_to_cpu(payload->last_overflow_timestamp);
+	),
+
+	TP_printk("%s: EVENT LOG OVERFLOW log=%s : %u records from %llu to %llu",
+		__get_str(dev_name), cxl_event_log_type_str(__entry->log),
+		__entry->count, __entry->first_ts, __entry->last_ts)
+
+);
+
+/*
+ * Common Event Record Format
+ * CXL 3.0 section 8.2.9.2.1; Table 8-42
+ */
+#define CXL_EVENT_RECORD_FLAG_PERMANENT		BIT(2)
+#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED	BIT(3)
+#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED	BIT(4)
+#define CXL_EVENT_RECORD_FLAG_HW_REPLACE	BIT(5)
+#define show_hdr_flags(flags)	__print_flags(flags, " | ",			   \
+	{ CXL_EVENT_RECORD_FLAG_PERMANENT,	"PERMANENT_CONDITION"		}, \
+	{ CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,	"MAINTENANCE_NEEDED"		}, \
+	{ CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,	"PERFORMANCE_DEGRADED"		}, \
+	{ CXL_EVENT_RECORD_FLAG_HW_REPLACE,	"HARDWARE_REPLACEMENT_NEEDED"	}  \
+)
+
+/*
+ * Define macros for the common header of each CXL event.
+ *
+ * Tracepoints using these macros must do 3 things:
+ *
+ *	1) Add CXL_EVT_TP_entry to TP_STRUCT__entry
+ *	2) Use CXL_EVT_TP_fast_assign within TP_fast_assign;
+ *	   pass the dev_name, log, and CXL event header
+ *	3) Use CXL_EVT_TP_printk() instead of TP_printk()
+ *
+ * See the generic_event tracepoint as an example.
+ */
+#define CXL_EVT_TP_entry					\
+	__string(dev_name, dev_name)				\
+	__field(int, log)					\
+	__field_struct(uuid_t, hdr_uuid)			\
+	__field(u32, hdr_flags)					\
+	__field(u16, hdr_handle)				\
+	__field(u16, hdr_related_handle)			\
+	__field(u64, hdr_timestamp)				\
+	__field(u8, hdr_length)					\
+	__field(u8, hdr_maint_op_class)
+
+#define CXL_EVT_TP_fast_assign(dname, l, hdr)					\
+	__assign_str(dev_name, (dname));					\
+	__entry->log = (l);							\
+	memcpy(&__entry->hdr_uuid, &(hdr).id, sizeof(uuid_t));			\
+	__entry->hdr_length = (hdr).length;					\
+	__entry->hdr_flags = get_unaligned_le24((hdr).flags);			\
+	__entry->hdr_handle = le16_to_cpu((hdr).handle);			\
+	__entry->hdr_related_handle = le16_to_cpu((hdr).related_handle);	\
+	__entry->hdr_timestamp = le64_to_cpu((hdr).timestamp);			\
+	__entry->hdr_maint_op_class = (hdr).maint_op_class
+
+#define CXL_EVT_TP_printk(fmt, ...) \
+	TP_printk("%s log=%s : time=%llu uuid=%pUb len=%d flags='%s' "		\
+		"handle=%x related_handle=%x maint_op_class=%u"			\
+		" : " fmt,							\
+		__get_str(dev_name), cxl_event_log_type_str(__entry->log),	\
+		__entry->hdr_timestamp, &__entry->hdr_uuid, __entry->hdr_length,\
+		show_hdr_flags(__entry->hdr_flags), __entry->hdr_handle,	\
+		__entry->hdr_related_handle, __entry->hdr_maint_op_class,	\
+		##__VA_ARGS__)
+
+TRACE_EVENT(cxl_generic_event,
+
+	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
+		 struct cxl_event_record_raw *rec),
+
+	TP_ARGS(dev_name, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+		__array(u8, data, CXL_EVENT_RECORD_DATA_LENGTH)
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
+		memcpy(__entry->data, &rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
+	),
+
+	CXL_EVT_TP_printk("%s",
+		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
+);
+
+#endif /* _CXL_TRACE_EVENTS_H */
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE cxl
+#include <trace/define_trace.h>
diff --git a/include/uapi/linux/cxl_mem.h b/include/uapi/linux/cxl_mem.h
index c71021a2a9ed..70459be5bdd4 100644
--- a/include/uapi/linux/cxl_mem.h
+++ b/include/uapi/linux/cxl_mem.h
@@ -24,6 +24,7 @@
 	___C(IDENTIFY, "Identify Command"),                               \
 	___C(RAW, "Raw device command"),                                  \
 	___C(GET_SUPPORTED_LOGS, "Get Supported Logs"),                   \
+	___C(GET_EVENT_RECORD, "Get Event Record"),                       \
 	___C(GET_FW_INFO, "Get FW Info"),                                 \
 	___C(GET_PARTITION_INFO, "Get Partition Information"),            \
 	___C(GET_LSA, "Get Label Storage Area"),                          \
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH V2 03/11] cxl/mem: Implement Clear Event Records command
  2022-12-01  0:27 [PATCH V2 00/11] CXL: Process event logs ira.weiny
  2022-12-01  0:27 ` [PATCH V2 01/11] cxl/pci: Add generic MSI-X/MSI irq support ira.weiny
  2022-12-01  0:27 ` [PATCH V2 02/11] cxl/mem: Implement Get Event Records command ira.weiny
@ 2022-12-01  0:27 ` ira.weiny
  2022-12-01 13:26   ` Jonathan Cameron
  2022-12-02  2:29   ` Dan Williams
  2022-12-01  0:27 ` [PATCH V2 04/11] cxl/mem: Clear events on driver load ira.weiny
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 64+ messages in thread
From: ira.weiny @ 2022-12-01  0:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL rev 3.0 section 8.2.9.2.3 defines the Clear Event Records mailbox
command.  After an event record is read it needs to be cleared from the
event log.

Implement cxl_clear_event_record() to clear all record retrieved from
the device.

Each record is cleared explicitly.  A clear all bit is specified but
events could arrive between a get and any final clear all operation.
This means events would be missed.
Therefore each event is cleared specifically.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V1:
	Clear Event Record allows for u8 handles while Get Event Record
	allows for u16 records to be returned.  Based on Jonathan's
	feedback; allow for all event records to be handled in this
	clear.  Which means a double loop with potentially multiple
	Clear Event payloads being sent to clear all events sent.

Changes from RFC:
	Jonathan
		Clean up init of payload and use return code.
		Also report any error to clear the event.
		s/v3.0/rev 3.0
---
 drivers/cxl/core/mbox.c      | 61 +++++++++++++++++++++++++++++++-----
 drivers/cxl/cxlmem.h         | 14 +++++++++
 include/uapi/linux/cxl_mem.h |  1 +
 3 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 70b681027a3d..076a3df0ba38 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -52,6 +52,7 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
 #endif
 	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
 	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
+	CXL_CMD(CLEAR_EVENT_RECORD, CXL_VARIABLE_PAYLOAD, 0, 0),
 	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
 	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
 	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
@@ -708,6 +709,42 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
+static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
+				  enum cxl_event_log_type log,
+				  struct cxl_get_event_payload *get_pl,
+				  u16 total)
+{
+	struct cxl_mbox_clear_event_payload payload = {
+		.event_log = log,
+	};
+	int cnt;
+
+	/*
+	 * Clear Event Records uses u8 for the handle cnt while Get Event
+	 * Record can return up to 0xffff records.
+	 */
+	for (cnt = 0; cnt < total; /* cnt incremented internally */) {
+		u8 nr_recs = min_t(u8, (total - cnt),
+				   CXL_CLEAR_EVENT_MAX_HANDLES);
+		int i, rc;
+
+		for (i = 0; i < nr_recs; i++, cnt++) {
+			payload.handle[i] = get_pl->records[cnt].hdr.handle;
+			dev_dbg(cxlds->dev, "Event log '%s': Clearning %u\n",
+				cxl_event_log_type_str(log),
+				le16_to_cpu(payload.handle[i]));
+		}
+		payload.nr_recs = nr_recs;
+
+		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_CLEAR_EVENT_RECORD,
+				       &payload, sizeof(payload), NULL, 0);
+		if (rc)
+			return rc;
+	}
+
+	return 0;
+}
+
 static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
 				    enum cxl_event_log_type type)
 {
@@ -732,13 +769,22 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
 		}
 
 		nr_rec = le16_to_cpu(payload->record_count);
-		if (trace_cxl_generic_event_enabled()) {
+		if (nr_rec > 0) {
 			int i;
 
-			for (i = 0; i < nr_rec; i++)
-				trace_cxl_generic_event(dev_name(cxlds->dev),
-							type,
-							&payload->records[i]);
+			if (trace_cxl_generic_event_enabled()) {
+				for (i = 0; i < nr_rec; i++)
+					trace_cxl_generic_event(dev_name(cxlds->dev),
+								type,
+								&payload->records[i]);
+			}
+
+			rc = cxl_clear_event_record(cxlds, type, payload, nr_rec);
+			if (rc) {
+				dev_err(cxlds->dev, "Event log '%s': Failed to clear events : %d",
+					cxl_event_log_type_str(type), rc);
+				return;
+			}
 		}
 
 		if (trace_cxl_overflow_enabled() &&
@@ -780,10 +826,11 @@ static struct cxl_get_event_payload *alloc_event_buf(struct cxl_dev_state *cxlds
  * cxl_mem_get_event_records - Get Event Records from the device
  * @cxlds: The device data for the operation
  *
- * Retrieve all event records available on the device and report them as trace
- * events.
+ * Retrieve all event records available on the device, report them as trace
+ * events, and clear them.
  *
  * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
+ * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
  */
 void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
 {
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 55d57f5a64bc..1ae9962c5a06 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -261,6 +261,7 @@ enum cxl_opcode {
 	CXL_MBOX_OP_INVALID		= 0x0000,
 	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
 	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
+	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
 	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
 	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
 	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
@@ -396,6 +397,19 @@ static inline const char *cxl_event_log_type_str(enum cxl_event_log_type type)
 	return "<unknown>";
 }
 
+/*
+ * Clear Event Records input payload
+ * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
+ */
+#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
+struct cxl_mbox_clear_event_payload {
+	u8 event_log;		/* enum cxl_event_log_type */
+	u8 clear_flags;
+	u8 nr_recs;
+	u8 reserved[3];
+	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];
+};
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
diff --git a/include/uapi/linux/cxl_mem.h b/include/uapi/linux/cxl_mem.h
index 70459be5bdd4..7c1ad8062792 100644
--- a/include/uapi/linux/cxl_mem.h
+++ b/include/uapi/linux/cxl_mem.h
@@ -25,6 +25,7 @@
 	___C(RAW, "Raw device command"),                                  \
 	___C(GET_SUPPORTED_LOGS, "Get Supported Logs"),                   \
 	___C(GET_EVENT_RECORD, "Get Event Record"),                       \
+	___C(CLEAR_EVENT_RECORD, "Clear Event Record"),                   \
 	___C(GET_FW_INFO, "Get FW Info"),                                 \
 	___C(GET_PARTITION_INFO, "Get Partition Information"),            \
 	___C(GET_LSA, "Get Label Storage Area"),                          \
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH V2 04/11] cxl/mem: Clear events on driver load
  2022-12-01  0:27 [PATCH V2 00/11] CXL: Process event logs ira.weiny
                   ` (2 preceding siblings ...)
  2022-12-01  0:27 ` [PATCH V2 03/11] cxl/mem: Implement Clear " ira.weiny
@ 2022-12-01  0:27 ` ira.weiny
  2022-12-01 13:30   ` Jonathan Cameron
  2022-12-02  2:48   ` Dan Williams
  2022-12-01  0:27 ` [PATCH V2 05/11] cxl/mem: Trace General Media Event Record ira.weiny
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 64+ messages in thread
From: ira.weiny @ 2022-12-01  0:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ben Widawsky, Steven Rostedt, Davidlohr Bueso,
	linux-kernel, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

The information contained in the events prior to the driver loading can
be queried at any time through other mailbox commands.

Ensure a clean slate of events by reading and clearing the events.  The
events are sent to the trace buffer but it is not anticipated to have
anyone listening to it at driver load time.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 drivers/cxl/pci.c            | 2 ++
 tools/testing/cxl/test/mem.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 8f86f85d89c7..11e95a95195a 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -521,6 +521,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
+	cxl_mem_get_event_records(cxlds);
+
 	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
 		rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
 
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index aa2df3a15051..e2f5445d24ff 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -285,6 +285,8 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
+	cxl_mem_get_event_records(cxlds);
+
 	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
 		rc = devm_cxl_add_nvdimm(dev, cxlmd);
 
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH V2 05/11] cxl/mem: Trace General Media Event Record
  2022-12-01  0:27 [PATCH V2 00/11] CXL: Process event logs ira.weiny
                   ` (3 preceding siblings ...)
  2022-12-01  0:27 ` [PATCH V2 04/11] cxl/mem: Clear events on driver load ira.weiny
@ 2022-12-01  0:27 ` ira.weiny
  2022-12-01 18:54   ` Dave Jiang
  2022-12-02  6:18   ` Dan Williams
  2022-12-01  0:27 ` [PATCH V2 06/11] cxl/mem: Trace DRAM " ira.weiny
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 64+ messages in thread
From: ira.weiny @ 2022-12-01  0:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Steven Rostedt, Jonathan Cameron, Alison Schofield,
	Vishal Verma, Ben Widawsky, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record.

Determine if the event read is a general media record and if so trace
the record as a General Media Event Record.

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V1:
	Jonathan
		fix spec references for CXL rev 3.0
		Make flags all caps

Changes from RFC v2:
	Output DPA flags as a single field
	Ensure names of fields match what TP_print outputs
	Steven
		prefix TRACE_EVENT with 'cxl_'
	Jonathan
		Remove Reserved field

Changes from RFC:
	Add reserved byte array
	Use common CXL event header record macros
	Jonathan
		Use unaligned_le{24,16} for unaligned fields
		Don't use the inverse of phy addr mask
	Dave Jiang
		s/cxl_gen_media_event/general_media
		s/cxl_evt_gen_media/cxl_event_gen_media
---
 drivers/cxl/core/mbox.c    |  40 ++++++++++--
 drivers/cxl/cxlmem.h       |  19 ++++++
 include/trace/events/cxl.h | 124 +++++++++++++++++++++++++++++++++++++
 3 files changed, 179 insertions(+), 4 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 076a3df0ba38..20191fe55bba 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -709,6 +709,38 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
+/*
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+static const uuid_t gen_media_event_uuid =
+	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
+		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
+
+static bool cxl_event_tracing_enabled(void)
+{
+	return trace_cxl_generic_event_enabled() ||
+	       trace_cxl_general_media_enabled();
+}
+
+static void cxl_trace_event_record(const char *dev_name,
+				   enum cxl_event_log_type type,
+				   struct cxl_event_record_raw *record)
+{
+	uuid_t *id = &record->hdr.id;
+
+	if (uuid_equal(id, &gen_media_event_uuid)) {
+		struct cxl_event_gen_media *rec =
+				(struct cxl_event_gen_media *)record;
+
+		trace_cxl_general_media(dev_name, type, rec);
+		return;
+	}
+
+	/* For unknown record types print just the header */
+	trace_cxl_generic_event(dev_name, type, record);
+}
+
 static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
 				  enum cxl_event_log_type log,
 				  struct cxl_get_event_payload *get_pl,
@@ -772,11 +804,11 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
 		if (nr_rec > 0) {
 			int i;
 
-			if (trace_cxl_generic_event_enabled()) {
+			if (cxl_event_tracing_enabled()) {
 				for (i = 0; i < nr_rec; i++)
-					trace_cxl_generic_event(dev_name(cxlds->dev),
-								type,
-								&payload->records[i]);
+					cxl_trace_event_record(dev_name(cxlds->dev),
+							       type,
+							       &payload->records[i]);
 			}
 
 			rc = cxl_clear_event_record(cxlds, type, payload, nr_rec);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 1ae9962c5a06..10696debefa8 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -410,6 +410,25 @@ struct cxl_mbox_clear_event_payload {
 	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];
 };
 
+/*
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+#define CXL_EVENT_GEN_MED_COMP_ID_SIZE	0x10
+struct cxl_event_gen_media {
+	struct cxl_event_record_hdr hdr;
+	__le64 phys_addr;
+	u8 descriptor;
+	u8 type;
+	u8 transaction_type;
+	u8 validity_flags[2];
+	u8 channel;
+	u8 rank;
+	u8 device[3];
+	u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
+	u8 reserved[0x2e];
+} __packed;
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
index c03a1a894af8..a4d6bd64e9bc 100644
--- a/include/trace/events/cxl.h
+++ b/include/trace/events/cxl.h
@@ -118,6 +118,130 @@ TRACE_EVENT(cxl_generic_event,
 		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
 );
 
+/*
+ * Physical Address field masks
+ *
+ * General Media Event Record
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ *
+ * DRAM Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+#define CXL_DPA_FLAGS_MASK			0x3F
+#define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
+
+#define CXL_DPA_VOLATILE			BIT(0)
+#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
+#define show_dpa_flags(flags)	__print_flags(flags, "|",		   \
+	{ CXL_DPA_VOLATILE,			"VOLATILE"		}, \
+	{ CXL_DPA_NOT_REPAIRABLE,		"NOT_REPAIRABLE"	}  \
+)
+
+/*
+ * General Media Event Record - GMER
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+#define CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT		BIT(0)
+#define CXL_GMER_EVT_DESC_THRESHOLD_EVENT		BIT(1)
+#define CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW		BIT(2)
+#define show_event_desc_flags(flags)	__print_flags(flags, "|",		   \
+	{ CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,		"UNCORRECTABLE_EVENT"	}, \
+	{ CXL_GMER_EVT_DESC_THRESHOLD_EVENT,		"THRESHOLD_EVENT"	}, \
+	{ CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW,	"POISON_LIST_OVERFLOW"	}  \
+)
+
+#define CXL_GMER_MEM_EVT_TYPE_ECC_ERROR			0x00
+#define CXL_GMER_MEM_EVT_TYPE_INV_ADDR			0x01
+#define CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR		0x02
+#define show_mem_event_type(type)	__print_symbolic(type,			\
+	{ CXL_GMER_MEM_EVT_TYPE_ECC_ERROR,		"ECC Error" },		\
+	{ CXL_GMER_MEM_EVT_TYPE_INV_ADDR,		"Invalid Address" },	\
+	{ CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,	"Data Path Error" }	\
+)
+
+#define CXL_GMER_TRANS_UNKNOWN				0x00
+#define CXL_GMER_TRANS_HOST_READ			0x01
+#define CXL_GMER_TRANS_HOST_WRITE			0x02
+#define CXL_GMER_TRANS_HOST_SCAN_MEDIA			0x03
+#define CXL_GMER_TRANS_HOST_INJECT_POISON		0x04
+#define CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB		0x05
+#define CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT	0x06
+#define show_trans_type(type)	__print_symbolic(type,					\
+	{ CXL_GMER_TRANS_UNKNOWN,			"Unknown" },			\
+	{ CXL_GMER_TRANS_HOST_READ,			"Host Read" },			\
+	{ CXL_GMER_TRANS_HOST_WRITE,			"Host Write" },			\
+	{ CXL_GMER_TRANS_HOST_SCAN_MEDIA,		"Host Scan Media" },		\
+	{ CXL_GMER_TRANS_HOST_INJECT_POISON,		"Host Inject Poison" },		\
+	{ CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,		"Internal Media Scrub" },	\
+	{ CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT,	"Internal Media Management" }	\
+)
+
+#define CXL_GMER_VALID_CHANNEL				BIT(0)
+#define CXL_GMER_VALID_RANK				BIT(1)
+#define CXL_GMER_VALID_DEVICE				BIT(2)
+#define CXL_GMER_VALID_COMPONENT			BIT(3)
+#define show_valid_flags(flags)	__print_flags(flags, "|",		   \
+	{ CXL_GMER_VALID_CHANNEL,			"CHANNEL"	}, \
+	{ CXL_GMER_VALID_RANK,				"RANK"		}, \
+	{ CXL_GMER_VALID_DEVICE,			"DEVICE"	}, \
+	{ CXL_GMER_VALID_COMPONENT,			"COMPONENT"	}  \
+)
+
+TRACE_EVENT(cxl_general_media,
+
+	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
+		 struct cxl_event_gen_media *rec),
+
+	TP_ARGS(dev_name, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+		/* General Media */
+		__field(u64, dpa)
+		__field(u8, descriptor)
+		__field(u8, type)
+		__field(u8, transaction_type)
+		__field(u8, channel)
+		__field(u32, device)
+		__array(u8, comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE)
+		__field(u16, validity_flags)
+		/* Following are out of order to pack trace record */
+		__field(u8, rank)
+		__field(u8, dpa_flags)
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
+
+		/* General Media */
+		__entry->dpa = le64_to_cpu(rec->phys_addr);
+		__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
+		/* Mask after flags have been parsed */
+		__entry->dpa &= CXL_DPA_MASK;
+		__entry->descriptor = rec->descriptor;
+		__entry->type = rec->type;
+		__entry->transaction_type = rec->transaction_type;
+		__entry->channel = rec->channel;
+		__entry->rank = rec->rank;
+		__entry->device = get_unaligned_le24(rec->device);
+		memcpy(__entry->comp_id, &rec->component_id,
+			CXL_EVENT_GEN_MED_COMP_ID_SIZE);
+		__entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
+	),
+
+	CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' " \
+		"descriptor='%s' type='%s' transaction_type='%s' channel=%u rank=%u " \
+		"device=%x comp_id=%s validity_flags='%s'",
+		__entry->dpa, show_dpa_flags(__entry->dpa_flags),
+		show_event_desc_flags(__entry->descriptor),
+		show_mem_event_type(__entry->type),
+		show_trans_type(__entry->transaction_type),
+		__entry->channel, __entry->rank, __entry->device,
+		__print_hex(__entry->comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE),
+		show_valid_flags(__entry->validity_flags)
+	)
+);
+
 #endif /* _CXL_TRACE_EVENTS_H */
 
 /* This part must be outside protection */
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH V2 06/11] cxl/mem: Trace DRAM Event Record
  2022-12-01  0:27 [PATCH V2 00/11] CXL: Process event logs ira.weiny
                   ` (4 preceding siblings ...)
  2022-12-01  0:27 ` [PATCH V2 05/11] cxl/mem: Trace General Media Event Record ira.weiny
@ 2022-12-01  0:27 ` ira.weiny
  2022-12-01 18:55   ` Dave Jiang
  2022-12-01  0:27 ` [PATCH V2 07/11] cxl/mem: Trace Memory Module " ira.weiny
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 64+ messages in thread
From: ira.weiny @ 2022-12-01  0:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Steven Rostedt, Jonathan Cameron, Dave Jiang,
	Alison Schofield, Vishal Verma, Ben Widawsky, Davidlohr Bueso,
	linux-kernel, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.

Determine if the event read is a DRAM event record and if so trace the
record.

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC v2:
	Output DPA flags as a separate field.
	Ensure field names match TP_print output
	Steven
		prefix TRACE_EVENT with 'cxl_'
	Jonathan
		Formatting fix
		Remove reserved field

Changes from RFC:
	Add reserved byte data
	Use new CXL header macros
	Jonathan
		Use get_unaligned_le{24,16}() for unaligned fields
		Use 'else if'
	Dave Jiang
		s/cxl_dram_event/dram
		s/cxl_evt_dram_rec/cxl_event_dram
	Adjust for new phys addr mask
---
 drivers/cxl/core/mbox.c    | 16 ++++++-
 drivers/cxl/cxlmem.h       | 23 ++++++++++
 include/trace/events/cxl.h | 92 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 130 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 20191fe55bba..66fc50d89bf4 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -717,10 +717,19 @@ static const uuid_t gen_media_event_uuid =
 	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
 		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
 
+/*
+ * DRAM Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+static const uuid_t dram_event_uuid =
+	UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
+		  0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
+
 static bool cxl_event_tracing_enabled(void)
 {
 	return trace_cxl_generic_event_enabled() ||
-	       trace_cxl_general_media_enabled();
+	       trace_cxl_general_media_enabled() ||
+	       trace_cxl_dram_enabled();
 }
 
 static void cxl_trace_event_record(const char *dev_name,
@@ -735,6 +744,11 @@ static void cxl_trace_event_record(const char *dev_name,
 
 		trace_cxl_general_media(dev_name, type, rec);
 		return;
+	} else if (uuid_equal(id, &dram_event_uuid)) {
+		struct cxl_event_dram *rec = (struct cxl_event_dram *)record;
+
+		trace_cxl_dram(dev_name, type, rec);
+		return;
 	}
 
 	/* For unknown record types print just the header */
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 10696debefa8..f5f63a475478 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -429,6 +429,29 @@ struct cxl_event_gen_media {
 	u8 reserved[0x2e];
 } __packed;
 
+/*
+ * DRAM Event Record - DER
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 3-44
+ */
+#define CXL_EVENT_DER_CORRECTION_MASK_SIZE	0x20
+struct cxl_event_dram {
+	struct cxl_event_record_hdr hdr;
+	__le64 phys_addr;
+	u8 descriptor;
+	u8 type;
+	u8 transaction_type;
+	u8 validity_flags[2];
+	u8 channel;
+	u8 rank;
+	u8 nibble_mask[3];
+	u8 bank_group;
+	u8 bank;
+	u8 row[3];
+	u8 column[2];
+	u8 correction_mask[CXL_EVENT_DER_CORRECTION_MASK_SIZE];
+	u8 reserved[0x17];
+} __packed;
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
index a4d6bd64e9bc..474390f895d9 100644
--- a/include/trace/events/cxl.h
+++ b/include/trace/events/cxl.h
@@ -242,6 +242,98 @@ TRACE_EVENT(cxl_general_media,
 	)
 );
 
+/*
+ * DRAM Event Record - DER
+ *
+ * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
+ */
+/*
+ * DRAM Event Record defines many fields the same as the General Media Event
+ * Record.  Reuse those definitions as appropriate.
+ */
+#define CXL_DER_VALID_CHANNEL				BIT(0)
+#define CXL_DER_VALID_RANK				BIT(1)
+#define CXL_DER_VALID_NIBBLE				BIT(2)
+#define CXL_DER_VALID_BANK_GROUP			BIT(3)
+#define CXL_DER_VALID_BANK				BIT(4)
+#define CXL_DER_VALID_ROW				BIT(5)
+#define CXL_DER_VALID_COLUMN				BIT(6)
+#define CXL_DER_VALID_CORRECTION_MASK			BIT(7)
+#define show_dram_valid_flags(flags)	__print_flags(flags, "|",			   \
+	{ CXL_DER_VALID_CHANNEL,			"CHANNEL"		}, \
+	{ CXL_DER_VALID_RANK,				"RANK"			}, \
+	{ CXL_DER_VALID_NIBBLE,				"NIBBLE"		}, \
+	{ CXL_DER_VALID_BANK_GROUP,			"BANK GROUP"		}, \
+	{ CXL_DER_VALID_BANK,				"BANK"			}, \
+	{ CXL_DER_VALID_ROW,				"ROW"			}, \
+	{ CXL_DER_VALID_COLUMN,				"COLUMN"		}, \
+	{ CXL_DER_VALID_CORRECTION_MASK,		"CORRECTION MASK"	}  \
+)
+
+TRACE_EVENT(cxl_dram,
+
+	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
+		 struct cxl_event_dram *rec),
+
+	TP_ARGS(dev_name, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+		/* DRAM */
+		__field(u64, dpa)
+		__field(u8, descriptor)
+		__field(u8, type)
+		__field(u8, transaction_type)
+		__field(u8, channel)
+		__field(u16, validity_flags)
+		__field(u16, column)	/* Out of order to pack trace record */
+		__field(u32, nibble_mask)
+		__field(u32, row)
+		__array(u8, cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE)
+		__field(u8, rank)	/* Out of order to pack trace record */
+		__field(u8, bank_group)	/* Out of order to pack trace record */
+		__field(u8, bank)	/* Out of order to pack trace record */
+		__field(u8, dpa_flags)	/* Out of order to pack trace record */
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
+
+		/* DRAM */
+		__entry->dpa = le64_to_cpu(rec->phys_addr);
+		__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
+		__entry->dpa &= CXL_DPA_MASK;
+		__entry->descriptor = rec->descriptor;
+		__entry->type = rec->type;
+		__entry->transaction_type = rec->transaction_type;
+		__entry->validity_flags = get_unaligned_le16(rec->validity_flags);
+		__entry->channel = rec->channel;
+		__entry->rank = rec->rank;
+		__entry->nibble_mask = get_unaligned_le24(rec->nibble_mask);
+		__entry->bank_group = rec->bank_group;
+		__entry->bank = rec->bank;
+		__entry->row = get_unaligned_le24(rec->row);
+		__entry->column = get_unaligned_le16(rec->column);
+		memcpy(__entry->cor_mask, &rec->correction_mask,
+			CXL_EVENT_DER_CORRECTION_MASK_SIZE);
+	),
+
+	CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' descriptor='%s' type='%s' " \
+		"transaction_type='%s' channel=%u rank=%u nibble_mask=%x " \
+		"bank_group=%u bank=%u row=%u column=%u cor_mask=%s " \
+		"validity_flags='%s'",
+		__entry->dpa, show_dpa_flags(__entry->dpa_flags),
+		show_event_desc_flags(__entry->descriptor),
+		show_mem_event_type(__entry->type),
+		show_trans_type(__entry->transaction_type),
+		__entry->channel, __entry->rank, __entry->nibble_mask,
+		__entry->bank_group, __entry->bank,
+		__entry->row, __entry->column,
+		__print_hex(__entry->cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE),
+		show_dram_valid_flags(__entry->validity_flags)
+	)
+);
+
 #endif /* _CXL_TRACE_EVENTS_H */
 
 /* This part must be outside protection */
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH V2 07/11] cxl/mem: Trace Memory Module Event Record
  2022-12-01  0:27 [PATCH V2 00/11] CXL: Process event logs ira.weiny
                   ` (5 preceding siblings ...)
  2022-12-01  0:27 ` [PATCH V2 06/11] cxl/mem: Trace DRAM " ira.weiny
@ 2022-12-01  0:27 ` ira.weiny
  2022-12-01 13:31   ` Jonathan Cameron
                     ` (2 more replies)
  2022-12-01  0:27 ` [PATCH V2 08/11] cxl/mem: Wire up event interrupts ira.weiny
                   ` (3 subsequent siblings)
  10 siblings, 3 replies; 64+ messages in thread
From: ira.weiny @ 2022-12-01  0:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Steven Rostedt, Alison Schofield, Vishal Verma,
	Ben Widawsky, Jonathan Cameron, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.

Determine if the event read is memory module record and if so trace the
record.

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V1:
	Use all caps for flag fields

Changes from RFC v2:
	Ensure field names match TP_print output
	Steven
		prefix TRACE_EVENT with 'cxl_'
	Jonathan
		Remove reserved field
		Define a 1bit and 2 bit status decoder
		Fix paren alignment

Changes from RFC:
	Clean up spec reference
	Add reserved data
	Use new CXL header macros
	Jonathan
		Use else if
		Use get_unaligned_le*() for unaligned fields
	Dave Jiang
		s/cxl_mem_mod_event/memory_module
		s/cxl_evt_mem_mod_rec/cxl_event_mem_module
---
 drivers/cxl/core/mbox.c    |  17 ++++-
 drivers/cxl/cxlmem.h       |  26 +++++++
 include/trace/events/cxl.h | 144 +++++++++++++++++++++++++++++++++++++
 3 files changed, 186 insertions(+), 1 deletion(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 66fc50d89bf4..30840b711381 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -725,11 +725,20 @@ static const uuid_t dram_event_uuid =
 	UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
 		  0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
 
+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+static const uuid_t mem_mod_event_uuid =
+	UUID_INIT(0xfe927475, 0xdd59, 0x4339,
+		  0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74);
+
 static bool cxl_event_tracing_enabled(void)
 {
 	return trace_cxl_generic_event_enabled() ||
 	       trace_cxl_general_media_enabled() ||
-	       trace_cxl_dram_enabled();
+	       trace_cxl_dram_enabled() ||
+	       trace_cxl_memory_module_enabled();
 }
 
 static void cxl_trace_event_record(const char *dev_name,
@@ -749,6 +758,12 @@ static void cxl_trace_event_record(const char *dev_name,
 
 		trace_cxl_dram(dev_name, type, rec);
 		return;
+	} else if (uuid_equal(id, &mem_mod_event_uuid)) {
+		struct cxl_event_mem_module *rec =
+				(struct cxl_event_mem_module *)record;
+
+		trace_cxl_memory_module(dev_name, type, rec);
+		return;
 	}
 
 	/* For unknown record types print just the header */
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index f5f63a475478..450b410f29f6 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -452,6 +452,32 @@ struct cxl_event_dram {
 	u8 reserved[0x17];
 } __packed;
 
+/*
+ * Get Health Info Record
+ * CXL rev 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+struct cxl_get_health_info {
+	u8 health_status;
+	u8 media_status;
+	u8 add_status;
+	u8 life_used;
+	u8 device_temp[2];
+	u8 dirty_shutdown_cnt[4];
+	u8 cor_vol_err_cnt[4];
+	u8 cor_per_err_cnt[4];
+} __packed;
+
+/*
+ * Memory Module Event Record
+ * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+struct cxl_event_mem_module {
+	struct cxl_event_record_hdr hdr;
+	u8 event_type;
+	struct cxl_get_health_info info;
+	u8 reserved[0x3d];
+} __packed;
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;
diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
index 474390f895d9..48786d6c9615 100644
--- a/include/trace/events/cxl.h
+++ b/include/trace/events/cxl.h
@@ -334,6 +334,150 @@ TRACE_EVENT(cxl_dram,
 	)
 );
 
+/*
+ * Memory Module Event Record - MMER
+ *
+ * CXL res 3.0 section 8.2.9.2.1.3; Table 8-45
+ */
+#define CXL_MMER_HEALTH_STATUS_CHANGE		0x00
+#define CXL_MMER_MEDIA_STATUS_CHANGE		0x01
+#define CXL_MMER_LIFE_USED_CHANGE		0x02
+#define CXL_MMER_TEMP_CHANGE			0x03
+#define CXL_MMER_DATA_PATH_ERROR		0x04
+#define CXL_MMER_LAS_ERROR			0x05
+#define show_dev_evt_type(type)	__print_symbolic(type,			   \
+	{ CXL_MMER_HEALTH_STATUS_CHANGE,	"Health Status Change"	}, \
+	{ CXL_MMER_MEDIA_STATUS_CHANGE,		"Media Status Change"	}, \
+	{ CXL_MMER_LIFE_USED_CHANGE,		"Life Used Change"	}, \
+	{ CXL_MMER_TEMP_CHANGE,			"Temperature Change"	}, \
+	{ CXL_MMER_DATA_PATH_ERROR,		"Data Path Error"	}, \
+	{ CXL_MMER_LAS_ERROR,			"LSA Error"		}  \
+)
+
+/*
+ * Device Health Information - DHI
+ *
+ * CXL res 3.0 section 8.2.9.8.3.1; Table 8-100
+ */
+#define CXL_DHI_HS_MAINTENANCE_NEEDED				BIT(0)
+#define CXL_DHI_HS_PERFORMANCE_DEGRADED				BIT(1)
+#define CXL_DHI_HS_HW_REPLACEMENT_NEEDED			BIT(2)
+#define show_health_status_flags(flags)	__print_flags(flags, "|",	   \
+	{ CXL_DHI_HS_MAINTENANCE_NEEDED,	"MAINTENANCE_NEEDED"	}, \
+	{ CXL_DHI_HS_PERFORMANCE_DEGRADED,	"PERFORMANCE_DEGRADED"	}, \
+	{ CXL_DHI_HS_HW_REPLACEMENT_NEEDED,	"REPLACEMENT_NEEDED"	}  \
+)
+
+#define CXL_DHI_MS_NORMAL							0x00
+#define CXL_DHI_MS_NOT_READY							0x01
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOST					0x02
+#define CXL_DHI_MS_ALL_DATA_LOST						0x03
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS			0x04
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN			0x05
+#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT				0x06
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS				0x07
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN				0x08
+#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT					0x09
+#define show_media_status(ms)	__print_symbolic(ms,			   \
+	{ CXL_DHI_MS_NORMAL,						   \
+		"Normal"						}, \
+	{ CXL_DHI_MS_NOT_READY,						   \
+		"Not Ready"						}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOST,				   \
+		"Write Persistency Lost"				}, \
+	{ CXL_DHI_MS_ALL_DATA_LOST,					   \
+		"All Data Lost"						}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS,		   \
+		"Write Persistency Loss in the Event of Power Loss"	}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN,		   \
+		"Write Persistency Loss in Event of Shutdown"		}, \
+	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT,			   \
+		"Write Persistency Loss Imminent"			}, \
+	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS,		   \
+		"All Data Loss in Event of Power Loss"			}, \
+	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN,		   \
+		"All Data loss in the Event of Shutdown"		}, \
+	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT,			   \
+		"All Data Loss Imminent"				}  \
+)
+
+#define CXL_DHI_AS_NORMAL		0x0
+#define CXL_DHI_AS_WARNING		0x1
+#define CXL_DHI_AS_CRITICAL		0x2
+#define show_two_bit_status(as) __print_symbolic(as,	   \
+	{ CXL_DHI_AS_NORMAL,		"Normal"	}, \
+	{ CXL_DHI_AS_WARNING,		"Warning"	}, \
+	{ CXL_DHI_AS_CRITICAL,		"Critical"	}  \
+)
+#define show_one_bit_status(as) __print_symbolic(as,	   \
+	{ CXL_DHI_AS_NORMAL,		"Normal"	}, \
+	{ CXL_DHI_AS_WARNING,		"Warning"	}  \
+)
+
+#define CXL_DHI_AS_LIFE_USED(as)			(as & 0x3)
+#define CXL_DHI_AS_DEV_TEMP(as)				((as & 0xC) >> 2)
+#define CXL_DHI_AS_COR_VOL_ERR_CNT(as)			((as & 0x10) >> 4)
+#define CXL_DHI_AS_COR_PER_ERR_CNT(as)			((as & 0x20) >> 5)
+
+TRACE_EVENT(cxl_memory_module,
+
+	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
+		 struct cxl_event_mem_module *rec),
+
+	TP_ARGS(dev_name, log, rec),
+
+	TP_STRUCT__entry(
+		CXL_EVT_TP_entry
+
+		/* Memory Module Event */
+		__field(u8, event_type)
+
+		/* Device Health Info */
+		__field(u8, health_status)
+		__field(u8, media_status)
+		__field(u8, life_used)
+		__field(u32, dirty_shutdown_cnt)
+		__field(u32, cor_vol_err_cnt)
+		__field(u32, cor_per_err_cnt)
+		__field(s16, device_temp)
+		__field(u8, add_status)
+	),
+
+	TP_fast_assign(
+		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
+
+		/* Memory Module Event */
+		__entry->event_type = rec->event_type;
+
+		/* Device Health Info */
+		__entry->health_status = rec->info.health_status;
+		__entry->media_status = rec->info.media_status;
+		__entry->life_used = rec->info.life_used;
+		__entry->dirty_shutdown_cnt = get_unaligned_le32(rec->info.dirty_shutdown_cnt);
+		__entry->cor_vol_err_cnt = get_unaligned_le32(rec->info.cor_vol_err_cnt);
+		__entry->cor_per_err_cnt = get_unaligned_le32(rec->info.cor_per_err_cnt);
+		__entry->device_temp = get_unaligned_le16(rec->info.device_temp);
+		__entry->add_status = rec->info.add_status;
+	),
+
+	CXL_EVT_TP_printk("event_type='%s' health_status='%s' media_status='%s' " \
+		"as_life_used=%s as_dev_temp=%s as_cor_vol_err_cnt=%s " \
+		"as_cor_per_err_cnt=%s life_used=%u device_temp=%d " \
+		"dirty_shutdown_cnt=%u cor_vol_err_cnt=%u cor_per_err_cnt=%u",
+		show_dev_evt_type(__entry->event_type),
+		show_health_status_flags(__entry->health_status),
+		show_media_status(__entry->media_status),
+		show_two_bit_status(CXL_DHI_AS_LIFE_USED(__entry->add_status)),
+		show_two_bit_status(CXL_DHI_AS_DEV_TEMP(__entry->add_status)),
+		show_one_bit_status(CXL_DHI_AS_COR_VOL_ERR_CNT(__entry->add_status)),
+		show_one_bit_status(CXL_DHI_AS_COR_PER_ERR_CNT(__entry->add_status)),
+		__entry->life_used, __entry->device_temp,
+		__entry->dirty_shutdown_cnt, __entry->cor_vol_err_cnt,
+		__entry->cor_per_err_cnt
+	)
+);
+
+
 #endif /* _CXL_TRACE_EVENTS_H */
 
 /* This part must be outside protection */
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH V2 08/11] cxl/mem: Wire up event interrupts
  2022-12-01  0:27 [PATCH V2 00/11] CXL: Process event logs ira.weiny
                   ` (6 preceding siblings ...)
  2022-12-01  0:27 ` [PATCH V2 07/11] cxl/mem: Trace Memory Module " ira.weiny
@ 2022-12-01  0:27 ` ira.weiny
  2022-12-01 14:21   ` Jonathan Cameron
                     ` (2 more replies)
  2022-12-01  0:27 ` [PATCH V2 09/11] cxl/test: Add generic mock events ira.weiny
                   ` (2 subsequent siblings)
  10 siblings, 3 replies; 64+ messages in thread
From: ira.weiny @ 2022-12-01  0:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

CXL device events are signaled via interrupts.  Each event log may have
a different interrupt message number.  These message numbers are
reported in the Get Event Interrupt Policy mailbox command.

Add interrupt support for event logs.  Interrupts are allocated as
shared interrupts.  Therefore, all or some event logs can share the same
message number.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V1:
	Remove unneeded evt_int_policy from struct cxl_dev_state
	defer Dynamic Capacity support
	Dave Jiang
		s/irq/rc
		use IRQ_NONE to signal the irq was not for us.
	Jonathan
		use msi_enabled rather than nr_irq_vec
		On failure explicitly set CXL_INT_NONE
		Add comment for Get Event Interrupt Policy
		use devm_request_threaded_irq()
		Use individual handler/thread functions for each of the
		logs rather than struct cxl_event_irq_id.

Changes from RFC v2
	Adjust to new irq 16 vector allocation
	Jonathan
		Remove CXL_INT_RES
	Use irq threads to ensure mailbox commands are executed outside irq context
	Adjust for optional Dynamic Capacity log
---
 drivers/cxl/core/mbox.c      |  44 +++++++++++-
 drivers/cxl/cxlmem.h         |  30 ++++++++
 drivers/cxl/pci.c            | 130 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/cxl_mem.h |   2 +
 4 files changed, 204 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 30840b711381..1e00b49d8b06 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -53,6 +53,8 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
 	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
 	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
 	CXL_CMD(CLEAR_EVENT_RECORD, CXL_VARIABLE_PAYLOAD, 0, 0),
+	CXL_CMD(GET_EVT_INT_POLICY, 0, 0x5, 0),
+	CXL_CMD(SET_EVT_INT_POLICY, 0x5, 0, 0),
 	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
 	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
 	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
@@ -806,8 +808,8 @@ static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
 	return 0;
 }
 
-static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
-				    enum cxl_event_log_type type)
+void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
+			     enum cxl_event_log_type type)
 {
 	struct cxl_get_event_payload *payload;
 	u16 nr_rec;
@@ -857,6 +859,7 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
 unlock_buffer:
 	mutex_unlock(&cxlds->event_buf_lock);
 }
+EXPORT_SYMBOL_NS_GPL(cxl_mem_get_records_log, CXL);
 
 static void cxl_mem_free_event_buffer(void *data)
 {
@@ -916,6 +919,43 @@ void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
 
+int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
+			     struct cxl_event_interrupt_policy *policy)
+{
+	int rc;
+
+	policy->info_settings = CXL_INT_MSI_MSIX;
+	policy->warn_settings = CXL_INT_MSI_MSIX;
+	policy->failure_settings = CXL_INT_MSI_MSIX;
+	policy->fatal_settings = CXL_INT_MSI_MSIX;
+
+	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_SET_EVT_INT_POLICY,
+			       policy, sizeof(*policy), NULL, 0);
+	if (rc < 0) {
+		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
+			rc);
+
+		policy->info_settings = CXL_INT_NONE;
+		policy->warn_settings = CXL_INT_NONE;
+		policy->failure_settings = CXL_INT_NONE;
+		policy->fatal_settings = CXL_INT_NONE;
+
+		return rc;
+	}
+
+	/* Retrieve interrupt message numbers */
+	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVT_INT_POLICY, NULL, 0,
+			       policy, sizeof(*policy));
+	if (rc < 0) {
+		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
+			rc);
+		return rc;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_event_config_msgnums, CXL);
+
 /**
  * cxl_mem_get_partition_info - Get partition info
  * @cxlds: The device data for the operation
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 450b410f29f6..2d384b0fc2b3 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -179,6 +179,30 @@ struct cxl_endpoint_dvsec_info {
 	struct range dvsec_range[2];
 };
 
+/**
+ * Event Interrupt Policy
+ *
+ * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
+ */
+enum cxl_event_int_mode {
+	CXL_INT_NONE		= 0x00,
+	CXL_INT_MSI_MSIX	= 0x01,
+	CXL_INT_FW		= 0x02
+};
+#define CXL_EVENT_INT_MODE_MASK 0x3
+#define CXL_EVENT_INT_MSGNUM(setting) (((setting) & 0xf0) >> 4)
+struct cxl_event_interrupt_policy {
+	u8 info_settings;
+	u8 warn_settings;
+	u8 failure_settings;
+	u8 fatal_settings;
+} __packed;
+
+static inline bool cxl_evt_int_is_msi(u8 setting)
+{
+	return CXL_INT_MSI_MSIX == (setting & CXL_EVENT_INT_MODE_MASK);
+}
+
 /**
  * struct cxl_dev_state - The driver device state
  *
@@ -262,6 +286,8 @@ enum cxl_opcode {
 	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
 	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
 	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
+	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
+	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
 	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
 	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
 	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
@@ -537,7 +563,11 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
 struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
 void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
 void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
+void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
+			     enum cxl_event_log_type type);
 void cxl_mem_get_event_records(struct cxl_dev_state *cxlds);
+int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
+			     struct cxl_event_interrupt_policy *policy);
 #ifdef CONFIG_CXL_SUSPEND
 void cxl_mem_active_inc(void);
 void cxl_mem_active_dec(void);
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 11e95a95195a..3c0b9199f11a 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -449,6 +449,134 @@ static void cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
 	cxlds->msi_enabled = true;
 }
 
+static irqreturn_t cxl_event_info_thread(int irq, void *id)
+{
+	struct cxl_dev_state *cxlds = id;
+
+	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t cxl_event_info_handler(int irq, void *id)
+{
+	struct cxl_dev_state *cxlds = id;
+	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
+
+	if (CXLDEV_EVENT_STATUS_INFO & status)
+		return IRQ_WAKE_THREAD;
+	return IRQ_NONE;
+}
+
+static irqreturn_t cxl_event_warn_thread(int irq, void *id)
+{
+	struct cxl_dev_state *cxlds = id;
+
+	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t cxl_event_warn_handler(int irq, void *id)
+{
+	struct cxl_dev_state *cxlds = id;
+	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
+
+	if (CXLDEV_EVENT_STATUS_WARN & status)
+		return IRQ_WAKE_THREAD;
+	return IRQ_NONE;
+}
+
+static irqreturn_t cxl_event_failure_thread(int irq, void *id)
+{
+	struct cxl_dev_state *cxlds = id;
+
+	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t cxl_event_failure_handler(int irq, void *id)
+{
+	struct cxl_dev_state *cxlds = id;
+	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
+
+	if (CXLDEV_EVENT_STATUS_FAIL & status)
+		return IRQ_WAKE_THREAD;
+	return IRQ_NONE;
+}
+
+static irqreturn_t cxl_event_fatal_thread(int irq, void *id)
+{
+	struct cxl_dev_state *cxlds = id;
+
+	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t cxl_event_fatal_handler(int irq, void *id)
+{
+	struct cxl_dev_state *cxlds = id;
+	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
+
+	if (CXLDEV_EVENT_STATUS_FATAL & status)
+		return IRQ_WAKE_THREAD;
+	return IRQ_NONE;
+}
+
+static void cxl_event_irqsetup(struct cxl_dev_state *cxlds)
+{
+	struct cxl_event_interrupt_policy policy;
+	struct device *dev = cxlds->dev;
+	struct pci_dev *pdev = to_pci_dev(dev);
+	u8 setting;
+	int rc;
+
+	if (cxl_event_config_msgnums(cxlds, &policy))
+		return;
+
+	setting = policy.info_settings;
+	if (cxl_evt_int_is_msi(setting)) {
+		rc = devm_request_threaded_irq(dev,
+				pci_irq_vector(pdev, CXL_EVENT_INT_MSGNUM(setting)),
+				cxl_event_info_handler, cxl_event_info_thread,
+				IRQF_SHARED, NULL, cxlds);
+		if (rc)
+			dev_err(dev, "Failed to get interrupt for %s event log\n",
+				cxl_event_log_type_str(CXL_EVENT_TYPE_INFO));
+	}
+
+	setting = policy.warn_settings;
+	if (cxl_evt_int_is_msi(setting)) {
+		rc = devm_request_threaded_irq(dev,
+				pci_irq_vector(pdev, CXL_EVENT_INT_MSGNUM(setting)),
+				cxl_event_warn_handler, cxl_event_warn_thread,
+				IRQF_SHARED, NULL, cxlds);
+		if (rc)
+			dev_err(dev, "Failed to get interrupt for %s event log\n",
+				cxl_event_log_type_str(CXL_EVENT_TYPE_WARN));
+	}
+
+	setting = policy.failure_settings;
+	if (cxl_evt_int_is_msi(setting)) {
+		rc = devm_request_threaded_irq(dev,
+				pci_irq_vector(pdev, CXL_EVENT_INT_MSGNUM(setting)),
+				cxl_event_failure_handler, cxl_event_failure_thread,
+				IRQF_SHARED, NULL, cxlds);
+		if (rc)
+			dev_err(dev, "Failed to get interrupt for %s event log\n",
+				cxl_event_log_type_str(CXL_EVENT_TYPE_FAIL));
+	}
+
+	setting = policy.fatal_settings;
+	if (cxl_evt_int_is_msi(setting)) {
+		rc = devm_request_threaded_irq(dev,
+				pci_irq_vector(pdev, CXL_EVENT_INT_MSGNUM(setting)),
+				cxl_event_fatal_handler, cxl_event_fatal_thread,
+				IRQF_SHARED, NULL, cxlds);
+		if (rc)
+			dev_err(dev, "Failed to get interrupt for %s event log\n",
+				cxl_event_log_type_str(CXL_EVENT_TYPE_FATAL));
+	}
+}
+
 static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	struct cxl_register_map map;
@@ -516,6 +644,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		return rc;
 
 	cxl_pci_alloc_irq_vectors(cxlds);
+	if (cxlds->msi_enabled)
+		cxl_event_irqsetup(cxlds);
 
 	cxlmd = devm_cxl_add_memdev(cxlds);
 	if (IS_ERR(cxlmd))
diff --git a/include/uapi/linux/cxl_mem.h b/include/uapi/linux/cxl_mem.h
index 7c1ad8062792..a8204802fcca 100644
--- a/include/uapi/linux/cxl_mem.h
+++ b/include/uapi/linux/cxl_mem.h
@@ -26,6 +26,8 @@
 	___C(GET_SUPPORTED_LOGS, "Get Supported Logs"),                   \
 	___C(GET_EVENT_RECORD, "Get Event Record"),                       \
 	___C(CLEAR_EVENT_RECORD, "Clear Event Record"),                   \
+	___C(GET_EVT_INT_POLICY, "Get Event Interrupt Policy"),           \
+	___C(SET_EVT_INT_POLICY, "Set Event Interrupt Policy"),           \
 	___C(GET_FW_INFO, "Get FW Info"),                                 \
 	___C(GET_PARTITION_INFO, "Get Partition Information"),            \
 	___C(GET_LSA, "Get Label Storage Area"),                          \
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH V2 09/11] cxl/test: Add generic mock events
  2022-12-01  0:27 [PATCH V2 00/11] CXL: Process event logs ira.weiny
                   ` (7 preceding siblings ...)
  2022-12-01  0:27 ` [PATCH V2 08/11] cxl/mem: Wire up event interrupts ira.weiny
@ 2022-12-01  0:27 ` ira.weiny
  2022-12-01 14:37   ` Jonathan Cameron
  2022-12-02  8:07   ` Dan Williams
  2022-12-01  0:27 ` [PATCH V2 10/11] cxl/test: Add specific events ira.weiny
  2022-12-01  0:27 ` [PATCH V2 11/11] cxl/test: Simulate event log overflow ira.weiny
  10 siblings, 2 replies; 64+ messages in thread
From: ira.weiny @ 2022-12-01  0:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Facilitate testing basic Get/Clear Event functionality by creating
multiple logs and generic events with made up UUID's.

Data is completely made up with data patterns which should be easy to
spot in trace output.

A single sysfs entry resets the event data and triggers collecting the
events for testing.

Events are returned one at a time which is within the specification even
though it does not exercise the full capabilities of what a device may
do.

Test traces are easy to obtain with a small script such as this:

	#!/bin/bash -x

	devices=`find /sys/devices/platform -name cxl_mem*`

	# Turn on tracing
	echo "" > /sys/kernel/tracing/trace
	echo 1 > /sys/kernel/tracing/events/cxl/enable
	echo 1 > /sys/kernel/tracing/tracing_on

	# Generate fake interrupt
	for device in $devices; do
	        echo 1 > $device/event_trigger
	done

	# Turn off tracing and report events
	echo 0 > /sys/kernel/tracing/tracing_on
	cat /sys/kernel/tracing/trace

Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from v1:
	Fix up for new structures
	Jonathan
		Update based on specification discussion

Changes from RFC v2:
	Adjust to simulate the event status register

Changes from RFC:
	Separate out the event code
	Adjust for struct changes.
	Clean up devm_cxl_mock_event_logs()
	Clean up naming and comments
	Jonathan
		Remove dynamic allocation of event logs
		Clean up comment
		Remove unneeded xarray
		Ensure event_trigger sysfs is valid prior to the driver
		going active.
	Dan
		Remove the fill/reset event sysfs as these operations
		can be done together
---
 drivers/cxl/core/mbox.c         |  33 +++--
 drivers/cxl/cxlmem.h            |   1 +
 tools/testing/cxl/test/Kbuild   |   2 +-
 tools/testing/cxl/test/events.c | 242 ++++++++++++++++++++++++++++++++
 tools/testing/cxl/test/events.h |   9 ++
 tools/testing/cxl/test/mem.c    |  35 ++++-
 6 files changed, 307 insertions(+), 15 deletions(-)
 create mode 100644 tools/testing/cxl/test/events.c
 create mode 100644 tools/testing/cxl/test/events.h

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 1e00b49d8b06..17659b9a0408 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -886,20 +886,9 @@ static struct cxl_get_event_payload *alloc_event_buf(struct cxl_dev_state *cxlds
 	return buf;
 }
 
-/**
- * cxl_mem_get_event_records - Get Event Records from the device
- * @cxlds: The device data for the operation
- *
- * Retrieve all event records available on the device, report them as trace
- * events, and clear them.
- *
- * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
- * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
- */
-void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
+/* Direct call for mock testing */
+void __cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status)
 {
-	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
-
 	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
 
 	if (!cxlds->event_buf) {
@@ -917,6 +906,24 @@ void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
 	if (status & CXLDEV_EVENT_STATUS_FATAL)
 		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
 }
+EXPORT_SYMBOL_NS_GPL(__cxl_mem_get_event_records, CXL);
+
+/**
+ * cxl_mem_get_event_records - Get Event Records from the device
+ * @cxlds: The device data for the operation
+ *
+ * Retrieve all event records available on the device, report them as trace
+ * events, and clear them.
+ *
+ * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
+ * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
+ */
+void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
+{
+	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
+
+	__cxl_mem_get_event_records(cxlds, status);
+}
 EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
 
 int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 2d384b0fc2b3..10e3c1c893f3 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -565,6 +565,7 @@ void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds
 void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
 void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
 			     enum cxl_event_log_type type);
+void __cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status);
 void cxl_mem_get_event_records(struct cxl_dev_state *cxlds);
 int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
 			     struct cxl_event_interrupt_policy *policy);
diff --git a/tools/testing/cxl/test/Kbuild b/tools/testing/cxl/test/Kbuild
index 4e59e2c911f6..64b14b83d8d9 100644
--- a/tools/testing/cxl/test/Kbuild
+++ b/tools/testing/cxl/test/Kbuild
@@ -7,4 +7,4 @@ obj-m += cxl_mock_mem.o
 
 cxl_test-y := cxl.o
 cxl_mock-y := mock.o
-cxl_mock_mem-y := mem.o
+cxl_mock_mem-y := mem.o events.o
diff --git a/tools/testing/cxl/test/events.c b/tools/testing/cxl/test/events.c
new file mode 100644
index 000000000000..a3d2ec7cc9fe
--- /dev/null
+++ b/tools/testing/cxl/test/events.c
@@ -0,0 +1,242 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright(c) 2022 Intel Corporation. All rights reserved.
+
+#include <cxlmem.h>
+#include <trace/events/cxl.h>
+
+#include "events.h"
+
+#define CXL_TEST_EVENT_CNT_MAX 15
+
+/* Set a number of events to return at a time for simulation.  */
+#define CXL_TEST_EVENT_CNT 3
+
+struct mock_event_log {
+	u16 clear_idx;
+	u16 cur_idx;
+	u16 nr_events;
+	struct cxl_event_record_raw *events[CXL_TEST_EVENT_CNT_MAX];
+};
+
+struct mock_event_store {
+	struct cxl_dev_state *cxlds;
+	struct mock_event_log mock_logs[CXL_EVENT_TYPE_MAX];
+	u32 ev_status;
+};
+
+DEFINE_XARRAY(mock_dev_event_store);
+
+struct mock_event_log *find_event_log(struct device *dev, int log_type)
+{
+	struct mock_event_store *mes = xa_load(&mock_dev_event_store,
+					       (unsigned long)dev);
+
+	if (!mes || log_type >= CXL_EVENT_TYPE_MAX)
+		return NULL;
+	return &mes->mock_logs[log_type];
+}
+
+struct cxl_event_record_raw *get_cur_event(struct mock_event_log *log)
+{
+	return log->events[log->cur_idx];
+}
+
+void reset_event_log(struct mock_event_log *log)
+{
+	log->cur_idx = 0;
+	log->clear_idx = 0;
+}
+
+/* Handle can never be 0 use 1 based indexing for handle */
+u16 get_clear_handle(struct mock_event_log *log)
+{
+	return log->clear_idx + 1;
+}
+
+/* Handle can never be 0 use 1 based indexing for handle */
+__le16 get_cur_event_handle(struct mock_event_log *log)
+{
+	u16 cur_handle = log->cur_idx + 1;
+
+	return cpu_to_le16(cur_handle);
+}
+
+static bool log_empty(struct mock_event_log *log)
+{
+	return log->cur_idx == log->nr_events;
+}
+
+static void event_store_add_event(struct mock_event_store *mes,
+				  enum cxl_event_log_type log_type,
+				  struct cxl_event_record_raw *event)
+{
+	struct mock_event_log *log;
+
+	if (WARN_ON(log_type >= CXL_EVENT_TYPE_MAX))
+		return;
+
+	log = &mes->mock_logs[log_type];
+	if (WARN_ON(log->nr_events >= CXL_TEST_EVENT_CNT_MAX))
+		return;
+
+	log->events[log->nr_events] = event;
+	log->nr_events++;
+}
+
+int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_get_event_payload *pl;
+	struct mock_event_log *log;
+	u8 log_type;
+	int i;
+
+	if (cmd->size_in != sizeof(log_type))
+		return -EINVAL;
+
+	if (cmd->size_out < struct_size(pl, records, CXL_TEST_EVENT_CNT))
+		return -EINVAL;
+
+	log_type = *((u8 *)cmd->payload_in);
+	if (log_type >= CXL_EVENT_TYPE_MAX)
+		return -EINVAL;
+
+	memset(cmd->payload_out, 0, cmd->size_out);
+
+	log = find_event_log(cxlds->dev, log_type);
+	if (!log || log_empty(log))
+		return 0;
+
+	pl = cmd->payload_out;
+
+	for (i = 0; i < CXL_TEST_EVENT_CNT && !log_empty(log); i++) {
+		memcpy(&pl->records[i], get_cur_event(log), sizeof(pl->records[i]));
+		pl->records[i].hdr.handle = get_cur_event_handle(log);
+		log->cur_idx++;
+	}
+
+	pl->record_count = cpu_to_le16(i);
+	if (!log_empty(log))
+		pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mock_get_event);
+
+int mock_clear_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mbox_clear_event_payload *pl = cmd->payload_in;
+	struct mock_event_log *log;
+	u8 log_type = pl->event_log;
+	u16 handle;
+	int nr;
+
+	if (log_type >= CXL_EVENT_TYPE_MAX)
+		return -EINVAL;
+
+	log = find_event_log(cxlds->dev, log_type);
+	if (!log)
+		return 0; /* No mock data in this log */
+
+	/*
+	 * This check is technically not invalid per the specification AFAICS.
+	 * (The host could 'guess' handles and clear them in order).
+	 * However, this is not good behavior for the host so test it.
+	 */
+	if (log->clear_idx + pl->nr_recs > log->cur_idx) {
+		dev_err(cxlds->dev,
+			"Attempting to clear more events than returned!\n");
+		return -EINVAL;
+	}
+
+	/* Check handle order prior to clearing events */
+	for (nr = 0, handle = get_clear_handle(log);
+	     nr < pl->nr_recs;
+	     nr++, handle++) {
+		if (handle != le16_to_cpu(pl->handle[nr])) {
+			dev_err(cxlds->dev, "Clearing events out of order\n");
+			return -EINVAL;
+		}
+	}
+
+	/* Clear events */
+	log->clear_idx += pl->nr_recs;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(mock_clear_event);
+
+void cxl_mock_event_trigger(struct device *dev)
+{
+	struct mock_event_store *mes = xa_load(&mock_dev_event_store,
+					       (unsigned long)dev);
+	int i;
+
+	for (i = CXL_EVENT_TYPE_INFO; i < CXL_EVENT_TYPE_MAX; i++) {
+		struct mock_event_log *log;
+
+		log = find_event_log(dev, i);
+		if (log)
+			reset_event_log(log);
+	}
+
+	__cxl_mem_get_event_records(mes->cxlds, mes->ev_status);
+}
+EXPORT_SYMBOL_GPL(cxl_mock_event_trigger);
+
+struct cxl_event_record_raw maint_needed = {
+	.hdr = {
+		.id = UUID_INIT(0xDEADBEEF, 0xCAFE, 0xBABE,
+				0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
+		.length = sizeof(struct cxl_event_record_raw),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0xa5b6),
+	},
+	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
+};
+
+struct cxl_event_record_raw hardware_replace = {
+	.hdr = {
+		.id = UUID_INIT(0xBABECAFE, 0xBEEF, 0xDEAD,
+				0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
+		.length = sizeof(struct cxl_event_record_raw),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_HW_REPLACE,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0xb6a5),
+	},
+	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
+};
+
+u32 cxl_mock_add_event_logs(struct cxl_dev_state *cxlds)
+{
+	struct device *dev = cxlds->dev;
+	struct mock_event_store *mes;
+
+	mes = devm_kzalloc(dev, sizeof(*mes), GFP_KERNEL);
+	if (WARN_ON(!mes))
+		return 0;
+	mes->cxlds = cxlds;
+
+	if (xa_insert(&mock_dev_event_store, (unsigned long)dev, mes,
+		      GFP_KERNEL)) {
+		dev_err(dev, "Event store not available for %s\n",
+			dev_name(dev));
+		return 0;
+	}
+
+	event_store_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
+	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
+
+	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
+	mes->ev_status |= CXLDEV_EVENT_STATUS_FATAL;
+
+	return mes->ev_status;
+}
+EXPORT_SYMBOL_GPL(cxl_mock_add_event_logs);
+
+void cxl_mock_remove_event_logs(struct device *dev)
+{
+	struct mock_event_store *mes;
+
+	mes = xa_erase(&mock_dev_event_store, (unsigned long)dev);
+}
+EXPORT_SYMBOL_GPL(cxl_mock_remove_event_logs);
diff --git a/tools/testing/cxl/test/events.h b/tools/testing/cxl/test/events.h
new file mode 100644
index 000000000000..5bebc6a0a01b
--- /dev/null
+++ b/tools/testing/cxl/test/events.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <cxlmem.h>
+
+int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
+int mock_clear_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
+u32 cxl_mock_add_event_logs(struct cxl_dev_state *cxlds);
+void cxl_mock_remove_event_logs(struct device *dev);
+void cxl_mock_event_trigger(struct device *dev);
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index e2f5445d24ff..333fa8527a07 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -8,6 +8,7 @@
 #include <linux/sizes.h>
 #include <linux/bits.h>
 #include <cxlmem.h>
+#include "events.h"
 
 #define LSA_SIZE SZ_128K
 #define DEV_SIZE SZ_2G
@@ -224,6 +225,12 @@ static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *
 	case CXL_MBOX_OP_GET_PARTITION_INFO:
 		rc = mock_partition_info(cxlds, cmd);
 		break;
+	case CXL_MBOX_OP_GET_EVENT_RECORD:
+		rc = mock_get_event(cxlds, cmd);
+		break;
+	case CXL_MBOX_OP_CLEAR_EVENT_RECORD:
+		rc = mock_clear_event(cxlds, cmd);
+		break;
 	case CXL_MBOX_OP_SET_LSA:
 		rc = mock_set_lsa(cxlds, cmd);
 		break;
@@ -245,11 +252,27 @@ static void label_area_release(void *lsa)
 	vfree(lsa);
 }
 
+static ssize_t event_trigger_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	cxl_mock_event_trigger(dev);
+	return count;
+}
+static DEVICE_ATTR_WO(event_trigger);
+
+static struct attribute *cxl_mock_event_attrs[] = {
+	&dev_attr_event_trigger.attr,
+	NULL
+};
+ATTRIBUTE_GROUPS(cxl_mock_event);
+
 static int cxl_mock_mem_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
 	struct cxl_memdev *cxlmd;
 	struct cxl_dev_state *cxlds;
+	u32 ev_status;
 	void *lsa;
 	int rc;
 
@@ -281,11 +304,13 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 	if (rc)
 		return rc;
 
+	ev_status = cxl_mock_add_event_logs(cxlds);
+
 	cxlmd = devm_cxl_add_memdev(cxlds);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
-	cxl_mem_get_event_records(cxlds);
+	__cxl_mem_get_event_records(cxlds, ev_status);
 
 	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
 		rc = devm_cxl_add_nvdimm(dev, cxlmd);
@@ -293,6 +318,12 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 	return 0;
 }
 
+static int cxl_mock_mem_remove(struct platform_device *pdev)
+{
+	cxl_mock_remove_event_logs(&pdev->dev);
+	return 0;
+}
+
 static const struct platform_device_id cxl_mock_mem_ids[] = {
 	{ .name = "cxl_mem", },
 	{ },
@@ -301,9 +332,11 @@ MODULE_DEVICE_TABLE(platform, cxl_mock_mem_ids);
 
 static struct platform_driver cxl_mock_mem_driver = {
 	.probe = cxl_mock_mem_probe,
+	.remove = cxl_mock_mem_remove,
 	.id_table = cxl_mock_mem_ids,
 	.driver = {
 		.name = KBUILD_MODNAME,
+		.dev_groups = cxl_mock_event_groups,
 	},
 };
 
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH V2 10/11] cxl/test: Add specific events
  2022-12-01  0:27 [PATCH V2 00/11] CXL: Process event logs ira.weiny
                   ` (8 preceding siblings ...)
  2022-12-01  0:27 ` [PATCH V2 09/11] cxl/test: Add generic mock events ira.weiny
@ 2022-12-01  0:27 ` ira.weiny
  2022-12-01 21:00   ` Dave Jiang
  2022-12-01  0:27 ` [PATCH V2 11/11] cxl/test: Simulate event log overflow ira.weiny
  10 siblings, 1 reply; 64+ messages in thread
From: ira.weiny @ 2022-12-01  0:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ben Widawsky, Steven Rostedt, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Each type of event has different trace point outputs.

Add mock General Media Event, DRAM event, and Memory Module Event
records to the mock list of events returned.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V1:
	Jonathan
		use put_unaligned_le16()
		fix spacing

Changes from RFC:
	Adjust for struct changes
	adjust for unaligned fields
---
 tools/testing/cxl/test/events.c | 73 +++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/tools/testing/cxl/test/events.c b/tools/testing/cxl/test/events.c
index a3d2ec7cc9fe..0bcc485e07da 100644
--- a/tools/testing/cxl/test/events.c
+++ b/tools/testing/cxl/test/events.c
@@ -206,6 +206,66 @@ struct cxl_event_record_raw hardware_replace = {
 	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
 };
 
+struct cxl_event_gen_media gen_media = {
+	.hdr = {
+		.id = UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
+				0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6),
+		.length = sizeof(struct cxl_event_gen_media),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_PERMANENT,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0),
+	},
+	.phys_addr = cpu_to_le64(0x2000),
+	.descriptor = CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,
+	.type = CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,
+	.transaction_type = CXL_GMER_TRANS_HOST_WRITE,
+	/* .validity_flags = <set below> */
+	.channel = 1,
+	.rank = 30
+};
+
+struct cxl_event_dram dram = {
+	.hdr = {
+		.id = UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
+				0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24),
+		.length = sizeof(struct cxl_event_dram),
+		.flags[0] = CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0),
+	},
+	.phys_addr = cpu_to_le64(0x8000),
+	.descriptor = CXL_GMER_EVT_DESC_THRESHOLD_EVENT,
+	.type = CXL_GMER_MEM_EVT_TYPE_INV_ADDR,
+	.transaction_type = CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,
+	/* .validity_flags = <set below> */
+	.channel = 1,
+	.bank_group = 5,
+	.bank = 2,
+	.column = {0xDE, 0xAD},
+};
+
+struct cxl_event_mem_module mem_module = {
+	.hdr = {
+		.id = UUID_INIT(0xfe927475, 0xdd59, 0x4339,
+				0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74),
+		.length = sizeof(struct cxl_event_mem_module),
+		/* .handle = Set dynamically */
+		.related_handle = cpu_to_le16(0),
+	},
+	.event_type = CXL_MMER_TEMP_CHANGE,
+	.info = {
+		.health_status = CXL_DHI_HS_PERFORMANCE_DEGRADED,
+		.media_status = CXL_DHI_MS_ALL_DATA_LOST,
+		.add_status = (CXL_DHI_AS_CRITICAL << 2) |
+			      (CXL_DHI_AS_WARNING << 4) |
+			      (CXL_DHI_AS_WARNING << 5),
+		.device_temp = { 0xDE, 0xAD},
+		.dirty_shutdown_cnt = { 0xde, 0xad, 0xbe, 0xef },
+		.cor_vol_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
+		.cor_per_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
+	}
+};
+
 u32 cxl_mock_add_event_logs(struct cxl_dev_state *cxlds)
 {
 	struct device *dev = cxlds->dev;
@@ -223,10 +283,23 @@ u32 cxl_mock_add_event_logs(struct cxl_dev_state *cxlds)
 		return 0;
 	}
 
+	put_unaligned_le16(CXL_GMER_VALID_CHANNEL | CXL_GMER_VALID_RANK,
+			   &gen_media.validity_flags);
+
+	put_unaligned_le16(CXL_DER_VALID_CHANNEL | CXL_DER_VALID_BANK_GROUP |
+			   CXL_DER_VALID_BANK | CXL_DER_VALID_COLUMN,
+			   &dram.validity_flags);
+
 	event_store_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
+	event_store_add_event(mes, CXL_EVENT_TYPE_INFO,
+			      (struct cxl_event_record_raw *)&gen_media);
+	event_store_add_event(mes, CXL_EVENT_TYPE_INFO,
+			      (struct cxl_event_record_raw *)&mem_module);
 	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
 
 	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL,
+			      (struct cxl_event_record_raw *)&dram);
 	mes->ev_status |= CXLDEV_EVENT_STATUS_FATAL;
 
 	return mes->ev_status;
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH V2 11/11] cxl/test: Simulate event log overflow
  2022-12-01  0:27 [PATCH V2 00/11] CXL: Process event logs ira.weiny
                   ` (9 preceding siblings ...)
  2022-12-01  0:27 ` [PATCH V2 10/11] cxl/test: Add specific events ira.weiny
@ 2022-12-01  0:27 ` ira.weiny
  2022-12-01 21:28   ` Dave Jiang
  10 siblings, 1 reply; 64+ messages in thread
From: ira.weiny @ 2022-12-01  0:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ben Widawsky, Steven Rostedt, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

From: Ira Weiny <ira.weiny@intel.com>

Log overflow is marked by a separate trace message.

Simulate a log with lots of messages and flag overflow until it is
drained a bit.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from RFC
	Adjust for new struct changes
---
 tools/testing/cxl/test/events.c | 49 ++++++++++++++++++++++++++++++++-
 1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/tools/testing/cxl/test/events.c b/tools/testing/cxl/test/events.c
index 0bcc485e07da..ceabefb526c2 100644
--- a/tools/testing/cxl/test/events.c
+++ b/tools/testing/cxl/test/events.c
@@ -15,6 +15,8 @@ struct mock_event_log {
 	u16 clear_idx;
 	u16 cur_idx;
 	u16 nr_events;
+	u16 nr_overflow;
+	u16 overflow_reset;
 	struct cxl_event_record_raw *events[CXL_TEST_EVENT_CNT_MAX];
 };
 
@@ -45,6 +47,7 @@ void reset_event_log(struct mock_event_log *log)
 {
 	log->cur_idx = 0;
 	log->clear_idx = 0;
+	log->nr_overflow = log->overflow_reset;
 }
 
 /* Handle can never be 0 use 1 based indexing for handle */
@@ -76,8 +79,12 @@ static void event_store_add_event(struct mock_event_store *mes,
 		return;
 
 	log = &mes->mock_logs[log_type];
-	if (WARN_ON(log->nr_events >= CXL_TEST_EVENT_CNT_MAX))
+
+	if ((log->nr_events + 1) > CXL_TEST_EVENT_CNT_MAX) {
+		log->nr_overflow++;
+		log->overflow_reset = log->nr_overflow;
 		return;
+	}
 
 	log->events[log->nr_events] = event;
 	log->nr_events++;
@@ -87,6 +94,7 @@ int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 {
 	struct cxl_get_event_payload *pl;
 	struct mock_event_log *log;
+	u16 nr_overflow;
 	u8 log_type;
 	int i;
 
@@ -118,6 +126,21 @@ int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 	if (!log_empty(log))
 		pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
 
+	if (log->nr_overflow) {
+		u64 ns;
+
+		pl->flags |= CXL_GET_EVENT_FLAG_OVERFLOW;
+		pl->overflow_err_count = cpu_to_le16(nr_overflow);
+		ns = ktime_get_real_ns();
+		ns -= 5000000000; /* 5s ago */
+		pl->first_overflow_timestamp = cpu_to_le64(ns);
+		ns = ktime_get_real_ns();
+		ns -= 1000000000; /* 1s ago */
+		pl->last_overflow_timestamp = cpu_to_le64(ns);
+
+		log->nr_overflow = 0;
+	}
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(mock_get_event);
@@ -297,6 +320,30 @@ u32 cxl_mock_add_event_logs(struct cxl_dev_state *cxlds)
 			      (struct cxl_event_record_raw *)&mem_module);
 	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
 
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &maint_needed);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
+			      (struct cxl_event_record_raw *)&dram);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
+			      (struct cxl_event_record_raw *)&gen_media);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
+			      (struct cxl_event_record_raw *)&mem_module);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
+			      (struct cxl_event_record_raw *)&dram);
+	/* Overflow this log */
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
+	mes->ev_status |= CXLDEV_EVENT_STATUS_FAIL;
+
 	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
 	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL,
 			      (struct cxl_event_record_raw *)&dram);
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 01/11] cxl/pci: Add generic MSI-X/MSI irq support
  2022-12-01  0:27 ` [PATCH V2 01/11] cxl/pci: Add generic MSI-X/MSI irq support ira.weiny
@ 2022-12-01 10:18   ` Jonathan Cameron
  2022-12-01 18:37   ` Dave Jiang
  2022-12-02  0:23   ` Dan Williams
  2 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2022-12-01 10:18 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Davidlohr Bueso, Bjorn Helgaas, Alison Schofield,
	Vishal Verma, Ben Widawsky, Steven Rostedt, Dave Jiang,
	linux-kernel, linux-cxl

On Wed, 30 Nov 2022 16:27:09 -0800
ira.weiny@intel.com wrote:

> From: Davidlohr Bueso <dave@stgolabs.net>
> 
> Currently the only CXL features targeted for irq support require their
> message numbers to be within the first 16 entries.  The device may
> however support less than 16 entries depending on the support it
> provides.
> 
> Attempt to allocate these 16 irq vectors.  If the device supports less
> then the PCI infrastructure will allocate that number.  Store the number
> of vectors actually allocated in the device state for later use
> by individual functions.
> 
> Upon successful allocation, users can plug in their respective isr at
> any point thereafter, for example, if the irq setup is not done in the
> PCI driver, such as the case of the CXL-PMU.
> 
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> ---
> Changes from V1:
> 	Jonathan
> 		pci_alloc_irq_vectors() cleans up the vectors automatically
> 		use msi_enabled rather than nr_irq_vecs
> 
> Changes from Ira
> 	Remove reviews
> 	Allocate up to a static 16 vectors.
> 	Change cover letter
> ---
>  drivers/cxl/cxlmem.h |  3 +++
>  drivers/cxl/cxlpci.h |  6 ++++++
>  drivers/cxl/pci.c    | 23 +++++++++++++++++++++++
>  3 files changed, 32 insertions(+)
> 
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 88e3a8e54b6a..cd35f43fedd4 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -211,6 +211,7 @@ struct cxl_endpoint_dvsec_info {
>   * @info: Cached DVSEC information about the device.
>   * @serial: PCIe Device Serial Number
>   * @doe_mbs: PCI DOE mailbox array
> + * @msi_enabled: MSI-X/MSI has been enabled
>   * @mbox_send: @dev specific transport for transmitting mailbox commands
>   *
>   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> @@ -247,6 +248,8 @@ struct cxl_dev_state {
>  
>  	struct xarray doe_mbs;
>  
> +	bool msi_enabled;
> +
>  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
>  };
>  
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index eec597dbe763..b7f4e2f417d3 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -53,6 +53,12 @@
>  #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
>  #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
>  
> +/*
> + * NOTE: Currently all the functions which are enabled for CXL require their
> + * vectors to be in the first 16.  Use this as the max.
> + */
> +#define CXL_PCI_REQUIRED_VECTORS 16
> +
>  /* Register Block Identifier (RBI) */
>  enum cxl_regloc_type {
>  	CXL_REGLOC_RBI_EMPTY = 0,
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index faeb5d9d7a7a..8f86f85d89c7 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -428,6 +428,27 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
>  	}
>  }
>  
> +static void cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> +{
> +	struct device *dev = cxlds->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	int nvecs;
> +
> +	/*
> +	 * NOTE: pci_alloc_irq_vectors() handles calling pci_free_irq_vectors()
> +	 * automatically despite not being called pcim_*.  See
> +	 * pci_setup_msi_context().
> +	 */
> +	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_REQUIRED_VECTORS,
> +				   PCI_IRQ_MSIX | PCI_IRQ_MSI);
> +	if (nvecs < 0) {
> +		dev_dbg(dev, "Failed to alloc irq vectors; use polling instead.\n");
> +		return;
> +	}
> +
> +	cxlds->msi_enabled = true;
> +}
> +
>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct cxl_register_map map;
> @@ -494,6 +515,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> +	cxl_pci_alloc_irq_vectors(cxlds);
> +
>  	cxlmd = devm_cxl_add_memdev(cxlds);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 02/11] cxl/mem: Implement Get Event Records command
  2022-12-01  0:27 ` [PATCH V2 02/11] cxl/mem: Implement Get Event Records command ira.weiny
@ 2022-12-01 13:06   ` Jonathan Cameron
  2022-12-01 15:10     ` Ira Weiny
  2022-12-01 17:38   ` Steven Rostedt
  2022-12-02  1:39   ` Dan Williams
  2 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2022-12-01 13:06 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Steven Rostedt, Alison Schofield, Vishal Verma,
	Ben Widawsky, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Wed, 30 Nov 2022 16:27:10 -0800
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL devices have multiple event logs which can be queried for CXL event
> records.  Devices are required to support the storage of at least one
> event record in each event log type.
> 
> Devices track event log overflow by incrementing a counter and tracking
> the time of the first and last overflow event seen.
> 
> Software queries events via the Get Event Record mailbox command; CXL
> rev 3.0 section 8.2.9.2.2.
> 
> Issue the Get Event Record mailbox command on driver load.  Trace each
> record found with a generic record trace.  Trace any overflow
> conditions.
> 
> The device can return up to 1MB worth of event records per query.
> Allocate a shared large buffer to handle the max number of records based
> on the mailbox payload size.
> 
> This patch traces a raw event record only and leaves the specific event
> record types to subsequent patches.
> 
> Macros are created to use for tracing the common CXL Event header
> fields.
> 
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Hi Ira,

Looks good to me.  A few trivial suggestions inline. Either way,

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>


> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 16176b9278b4..70b681027a3d 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -7,6 +7,9 @@

...

> +
> +static void cxl_mem_free_event_buffer(void *data)
> +{
> +	struct cxl_dev_state *cxlds = data;
> +
> +	kvfree(cxlds->event_buf);

Trivial, but why not just pass in the event_buf?

> +}
> +
> +/*
> + * There is a single buffer for reading event logs from the mailbox.  All logs
> + * share this buffer protected by the cxlds->event_buf_lock.
> + */
> +static struct cxl_get_event_payload *alloc_event_buf(struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_get_event_payload *buf;
> +
> +	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
> +		cxlds->payload_size);
> +
> +	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);

huh. I assumed there would be a devm_kvmalloc() but apparently not..  Ah well
- whilst it might makes sense to add one, let's not tie that up with this series.

> +	if (buf && devm_add_action_or_reset(cxlds->dev,
> +			cxl_mem_free_event_buffer, cxlds))
> +		return NULL;

Trivial, but I'd go for a more wordy but more conventional pattern of
	if (!buf)
		return NULL;

	if (devm_add_action_or_reset())
		return NULL
	
	return buff;
	
> +	return buf;
> +}
> +

...

> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index cd35f43fedd4..55d57f5a64bc 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -4,6 +4,7 @@
>  #define __CXL_MEM_H__
>  #include <uapi/linux/cxl_mem.h>
>  #include <linux/cdev.h>
> +#include <linux/uuid.h>
>  #include "cxl.h"
>  
>  /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
> @@ -250,12 +251,16 @@ struct cxl_dev_state {
>  
>  	bool msi_enabled;
>  
> +	struct cxl_get_event_payload *event_buf;
Whilst it is obvious (and document at point of allocation),
I think one of the static checkers still warns that all locks must
have comments.  Probably easier to add one now than wait for the
inevitable warning report.

> +	struct mutex event_buf_lock;
> +
>  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
>  };
>  



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 03/11] cxl/mem: Implement Clear Event Records command
  2022-12-01  0:27 ` [PATCH V2 03/11] cxl/mem: Implement Clear " ira.weiny
@ 2022-12-01 13:26   ` Jonathan Cameron
  2022-12-01 15:30     ` Ira Weiny
  2022-12-02  2:29   ` Dan Williams
  1 sibling, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2022-12-01 13:26 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Wed, 30 Nov 2022 16:27:11 -0800
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL rev 3.0 section 8.2.9.2.3 defines the Clear Event Records mailbox
> command.  After an event record is read it needs to be cleared from the
> event log.
> 
> Implement cxl_clear_event_record() to clear all record retrieved from
> the device.
> 
> Each record is cleared explicitly.  A clear all bit is specified but
> events could arrive between a get and any final clear all operation.
> This means events would be missed.
> Therefore each event is cleared specifically.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
I think there is a type issue on the min_t() calculation with that addressed
this looks good to me.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> ---
> Changes from V1:
> 	Clear Event Record allows for u8 handles while Get Event Record
> 	allows for u16 records to be returned.  Based on Jonathan's
> 	feedback; allow for all event records to be handled in this
> 	clear.  Which means a double loop with potentially multiple
> 	Clear Event payloads being sent to clear all events sent.
> 
> Changes from RFC:
> 	Jonathan
> 		Clean up init of payload and use return code.
> 		Also report any error to clear the event.
> 		s/v3.0/rev 3.0
> ---
>  drivers/cxl/core/mbox.c      | 61 +++++++++++++++++++++++++++++++-----
>  drivers/cxl/cxlmem.h         | 14 +++++++++
>  include/uapi/linux/cxl_mem.h |  1 +
>  3 files changed, 69 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 70b681027a3d..076a3df0ba38 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -52,6 +52,7 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
>  #endif
>  	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
>  	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
> +	CXL_CMD(CLEAR_EVENT_RECORD, CXL_VARIABLE_PAYLOAD, 0, 0),
>  	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
>  	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
>  	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
> @@ -708,6 +709,42 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>  
> +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> +				  enum cxl_event_log_type log,
> +				  struct cxl_get_event_payload *get_pl,
> +				  u16 total)
> +{
> +	struct cxl_mbox_clear_event_payload payload = {
> +		.event_log = log,
> +	};
> +	int cnt;
> +
> +	/*
> +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> +	 * Record can return up to 0xffff records.
> +	 */
> +	for (cnt = 0; cnt < total; /* cnt incremented internally */) {
> +		u8 nr_recs = min_t(u8, (total - cnt),
> +				   CXL_CLEAR_EVENT_MAX_HANDLES);

I might be half asleep but isn't this assuming that (total - cnt)
fits in an u8?  Shouldn't this be min_t(u16, ..) 
Also, maybe u16 cnt would be simpler.

Hmm.  This is safe but only because of how you call it alongside
handling of a particular Get event records response (which must
have fitted in the mailbox and has a longer header).

Looking at this function in isolation, I think the mailbox could be
small enough that we might not fit 255 records + the header.
Perhaps we need a comment to say that, or at minimum a check and error
return if it won't fit?

> +		int i, rc;
> +
> +		for (i = 0; i < nr_recs; i++, cnt++) {
> +			payload.handle[i] = get_pl->records[cnt].hdr.handle;
> +			dev_dbg(cxlds->dev, "Event log '%s': Clearning %u\n",
> +				cxl_event_log_type_str(log),
> +				le16_to_cpu(payload.handle[i]));
> +		}
> +		payload.nr_recs = nr_recs;
> +
> +		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> +				       &payload, sizeof(payload), NULL, 0);
> +		if (rc)
> +			return rc;
> +	}
> +
> +	return 0;
> +}
> +
>  static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
>  				    enum cxl_event_log_type type)
>  {
> @@ -732,13 +769,22 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
This feels miss named now but I can't immediately think of better naming so on that
basis fine to leave it as is if you don't have a better idea!.

>  		}
>  
>  		nr_rec = le16_to_cpu(payload->record_count);
> -		if (trace_cxl_generic_event_enabled()) {
> +		if (nr_rec > 0) {
>  			int i;
>  
> -			for (i = 0; i < nr_rec; i++)
> -				trace_cxl_generic_event(dev_name(cxlds->dev),
> -							type,
> -							&payload->records[i]);
> +			if (trace_cxl_generic_event_enabled()) {
> +				for (i = 0; i < nr_rec; i++)
> +					trace_cxl_generic_event(dev_name(cxlds->dev),
> +								type,
> +								&payload->records[i]);
> +			}
> +
> +			rc = cxl_clear_event_record(cxlds, type, payload, nr_rec);
> +			if (rc) {
> +				dev_err(cxlds->dev, "Event log '%s': Failed to clear events : %d",
> +					cxl_event_log_type_str(type), rc);
> +				return;
> +			}
>  		}


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 04/11] cxl/mem: Clear events on driver load
  2022-12-01  0:27 ` [PATCH V2 04/11] cxl/mem: Clear events on driver load ira.weiny
@ 2022-12-01 13:30   ` Jonathan Cameron
  2022-12-01 17:02     ` Ira Weiny
  2022-12-02  2:48   ` Dan Williams
  1 sibling, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2022-12-01 13:30 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Dave Jiang, Alison Schofield, Vishal Verma,
	Ben Widawsky, Steven Rostedt, Davidlohr Bueso, linux-kernel,
	linux-cxl

On Wed, 30 Nov 2022 16:27:12 -0800
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> The information contained in the events prior to the driver loading can
> be queried at any time through other mailbox commands.
> 
> Ensure a clean slate of events by reading and clearing the events.  The
> events are sent to the trace buffer but it is not anticipated to have
> anyone listening to it at driver load time.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Probably not worth addressing but there is a corner case where this might fail
if some broken software already messed with reading out the events.

Imagine it read the first mailbox sized chunk, but didn't clear them...

If that happens, then we'd end up seeing the whole list, but in non
temporal order and hence trying to clear them out of order with predictable
fails.

Maybe this is the category of things we 'fix' if we ever hear of it actually
happening.

So with that caveat called out so I can say 'I told you so' :), fine to keep my tag on this.

Thanks,

Jonathan


> ---
>  drivers/cxl/pci.c            | 2 ++
>  tools/testing/cxl/test/mem.c | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 8f86f85d89c7..11e95a95195a 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -521,6 +521,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> +	cxl_mem_get_event_records(cxlds);
> +
>  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
>  		rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
>  
> diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> index aa2df3a15051..e2f5445d24ff 100644
> --- a/tools/testing/cxl/test/mem.c
> +++ b/tools/testing/cxl/test/mem.c
> @@ -285,6 +285,8 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> +	cxl_mem_get_event_records(cxlds);
> +
>  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
>  		rc = devm_cxl_add_nvdimm(dev, cxlmd);
>  


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 07/11] cxl/mem: Trace Memory Module Event Record
  2022-12-01  0:27 ` [PATCH V2 07/11] cxl/mem: Trace Memory Module " ira.weiny
@ 2022-12-01 13:31   ` Jonathan Cameron
  2022-12-01 18:57   ` Dave Jiang
  2022-12-02  6:25   ` Dan Williams
  2 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2022-12-01 13:31 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Steven Rostedt, Alison Schofield, Vishal Verma,
	Ben Widawsky, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Wed, 30 Nov 2022 16:27:15 -0800
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.
> 
> Determine if the event read is memory module record and if so trace the
> record.
> 
> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
LGTM

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 08/11] cxl/mem: Wire up event interrupts
  2022-12-01  0:27 ` [PATCH V2 08/11] cxl/mem: Wire up event interrupts ira.weiny
@ 2022-12-01 14:21   ` Jonathan Cameron
  2022-12-01 17:23     ` Ira Weiny
  2022-12-01 18:35   ` Davidlohr Bueso
  2022-12-02  7:37   ` Dan Williams
  2 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2022-12-01 14:21 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Wed, 30 Nov 2022 16:27:16 -0800
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL device events are signaled via interrupts.  Each event log may have
> a different interrupt message number.  These message numbers are
> reported in the Get Event Interrupt Policy mailbox command.
> 
> Add interrupt support for event logs.  Interrupts are allocated as
> shared interrupts.  Therefore, all or some event logs can share the same
> message number.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

A few trivial comments, but only superficially code style stuff which you
can ignore if you feel strongly about current style or it matches existing
file style etc...

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> 
> ---
> Changes from V1:
> 	Remove unneeded evt_int_policy from struct cxl_dev_state
> 	defer Dynamic Capacity support
> 	Dave Jiang
> 		s/irq/rc
> 		use IRQ_NONE to signal the irq was not for us.
> 	Jonathan
> 		use msi_enabled rather than nr_irq_vec
> 		On failure explicitly set CXL_INT_NONE
> 		Add comment for Get Event Interrupt Policy
> 		use devm_request_threaded_irq()
> 		Use individual handler/thread functions for each of the
> 		logs rather than struct cxl_event_irq_id.
> 
> Changes from RFC v2
> 	Adjust to new irq 16 vector allocation
> 	Jonathan
> 		Remove CXL_INT_RES
> 	Use irq threads to ensure mailbox commands are executed outside irq context
> 	Adjust for optional Dynamic Capacity log
> ---
>  drivers/cxl/core/mbox.c      |  44 +++++++++++-
>  drivers/cxl/cxlmem.h         |  30 ++++++++
>  drivers/cxl/pci.c            | 130 +++++++++++++++++++++++++++++++++++
>  include/uapi/linux/cxl_mem.h |   2 +
>  4 files changed, 204 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 450b410f29f6..2d384b0fc2b3 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -179,6 +179,30 @@ struct cxl_endpoint_dvsec_info {
>  	struct range dvsec_range[2];
>  };
>  
> +/**
> + * Event Interrupt Policy
> + *
> + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> + */
> +enum cxl_event_int_mode {
> +	CXL_INT_NONE		= 0x00,
> +	CXL_INT_MSI_MSIX	= 0x01,
> +	CXL_INT_FW		= 0x02
> +};
> +#define CXL_EVENT_INT_MODE_MASK 0x3
> +#define CXL_EVENT_INT_MSGNUM(setting) (((setting) & 0xf0) >> 4)
> +struct cxl_event_interrupt_policy {
> +	u8 info_settings;
> +	u8 warn_settings;
> +	u8 failure_settings;
> +	u8 fatal_settings;
> +} __packed;
> +
> +static inline bool cxl_evt_int_is_msi(u8 setting)
> +{
> +	return CXL_INT_MSI_MSIX == (setting & CXL_EVENT_INT_MODE_MASK);

Maybe a case for FIELD_GET() though given the defines are all local
it is already obvious what this is doing so fine if you prefer to
keep it as is.

> +}
...

> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 11e95a95195a..3c0b9199f11a 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -449,6 +449,134 @@ static void cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
>  	cxlds->msi_enabled = true;
>  }
>  
> +static irqreturn_t cxl_event_info_thread(int irq, void *id)
> +{
> +	struct cxl_dev_state *cxlds = id;
> +
> +	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> +	return IRQ_HANDLED;
> +}

I'm not a great fan of macros, but maybe this is a case for them.

> +
> +static irqreturn_t cxl_event_info_handler(int irq, void *id)
> +{
> +	struct cxl_dev_state *cxlds = id;
> +	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);

Superficial and this is guaranteed to work (8.2.8 allow all sizes of read up
to 64 bytes), but maybe should treat this as a 64 bit register as that aligns
better with spec?

> +
> +	if (CXLDEV_EVENT_STATUS_INFO & status)

Another maybe FIELD_GET() case?

> +		return IRQ_WAKE_THREAD;
> +	return IRQ_NONE;
> +}
> +
> +static irqreturn_t cxl_event_warn_thread(int irq, void *id)
> +{
> +	struct cxl_dev_state *cxlds = id;

Why id?  I'd call it what it is (maybe _cxlsd) and not bother with
the local variable in this case as it is only used once and doesn't
need the type.

static irqreturn_t cxl_event_warn_thread(int irq, void *cxlds)
{
	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);

	return IRQ_HANDLED;
}


> +
> +	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> +	return IRQ_HANDLED;
> +}
> +

...




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 09/11] cxl/test: Add generic mock events
  2022-12-01  0:27 ` [PATCH V2 09/11] cxl/test: Add generic mock events ira.weiny
@ 2022-12-01 14:37   ` Jonathan Cameron
  2022-12-01 17:49     ` Ira Weiny
  2022-12-02  8:07   ` Dan Williams
  1 sibling, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2022-12-01 14:37 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Wed, 30 Nov 2022 16:27:17 -0800
ira.weiny@intel.com wrote:

> From: Ira Weiny <ira.weiny@intel.com>
> 
> Facilitate testing basic Get/Clear Event functionality by creating
> multiple logs and generic events with made up UUID's.
> 
> Data is completely made up with data patterns which should be easy to
> spot in trace output.
> 
> A single sysfs entry resets the event data and triggers collecting the
> events for testing.
> 
> Events are returned one at a time which is within the specification even
> though it does not exercise the full capabilities of what a device may
> do.
> 
> Test traces are easy to obtain with a small script such as this:
> 
> 	#!/bin/bash -x
> 
> 	devices=`find /sys/devices/platform -name cxl_mem*`
> 
> 	# Turn on tracing
> 	echo "" > /sys/kernel/tracing/trace
> 	echo 1 > /sys/kernel/tracing/events/cxl/enable
> 	echo 1 > /sys/kernel/tracing/tracing_on
> 
> 	# Generate fake interrupt
> 	for device in $devices; do
> 	        echo 1 > $device/event_trigger
> 	done
> 
> 	# Turn off tracing and report events
> 	echo 0 > /sys/kernel/tracing/tracing_on
> 	cat /sys/kernel/tracing/trace
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 

A minor comment on xarray cleanup inline

Jonathan


> +void cxl_mock_remove_event_logs(struct device *dev)
> +{
> +	struct mock_event_store *mes;
> +
> +	mes = xa_erase(&mock_dev_event_store, (unsigned long)dev);

As below, I'd move this into a devm_add_action_or_reset() so
that we don't need to deal with doing it manually.

> +}
> +EXPORT_SYMBOL_GPL(cxl_mock_remove_event_logs);

...

>  static int cxl_mock_mem_probe(struct platform_device *pdev)
>  {
>  	struct device *dev = &pdev->dev;
>  	struct cxl_memdev *cxlmd;
>  	struct cxl_dev_state *cxlds;
> +	u32 ev_status;
>  	void *lsa;
>  	int rc;
>  
> @@ -281,11 +304,13 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
>  	if (rc)
>  		return rc;
>  
> +	ev_status = cxl_mock_add_event_logs(cxlds);

On error later in this function these leak.  Just use devm_add_action_or_reset()
inside cxl_mock_add_event_logs() so we don't have to care about that.

> +
>  	cxlmd = devm_cxl_add_memdev(cxlds);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> -	cxl_mem_get_event_records(cxlds);
> +	__cxl_mem_get_event_records(cxlds, ev_status);
>  
>  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
>  		rc = devm_cxl_add_nvdimm(dev, cxlmd);
> @@ -293,6 +318,12 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
>  	return 0;
>  }
>  
> +static int cxl_mock_mem_remove(struct platform_device *pdev)
> +{
> +	cxl_mock_remove_event_logs(&pdev->dev);
Why not use devm_add_action_or_reset()?

> +	return 0;
> +}
> +

>  


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 02/11] cxl/mem: Implement Get Event Records command
  2022-12-01 13:06   ` Jonathan Cameron
@ 2022-12-01 15:10     ` Ira Weiny
  0 siblings, 0 replies; 64+ messages in thread
From: Ira Weiny @ 2022-12-01 15:10 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Dan Williams, Steven Rostedt, Alison Schofield, Vishal Verma,
	Ben Widawsky, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Thu, Dec 01, 2022 at 01:06:50PM +0000, Jonathan Cameron wrote:
> On Wed, 30 Nov 2022 16:27:10 -0800
> ira.weiny@intel.com wrote:
> 
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > CXL devices have multiple event logs which can be queried for CXL event
> > records.  Devices are required to support the storage of at least one
> > event record in each event log type.
> > 
> > Devices track event log overflow by incrementing a counter and tracking
> > the time of the first and last overflow event seen.
> > 
> > Software queries events via the Get Event Record mailbox command; CXL
> > rev 3.0 section 8.2.9.2.2.
> > 
> > Issue the Get Event Record mailbox command on driver load.  Trace each
> > record found with a generic record trace.  Trace any overflow
> > conditions.
> > 
> > The device can return up to 1MB worth of event records per query.
> > Allocate a shared large buffer to handle the max number of records based
> > on the mailbox payload size.
> > 
> > This patch traces a raw event record only and leaves the specific event
> > record types to subsequent patches.
> > 
> > Macros are created to use for tracing the common CXL Event header
> > fields.
> > 
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> Hi Ira,
> 
> Looks good to me.  A few trivial suggestions inline. Either way,
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> 
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index 16176b9278b4..70b681027a3d 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> > @@ -7,6 +7,9 @@
> 
> ...
> 
> > +
> > +static void cxl_mem_free_event_buffer(void *data)
> > +{
> > +	struct cxl_dev_state *cxlds = data;
> > +
> > +	kvfree(cxlds->event_buf);
> 
> Trivial, but why not just pass in the event_buf?

Just following the pattern that 'cxl_mem_*' functions take a cxlds parameter.
<shrug>

I'm going to leave this because it is tested.

> 
> > +}
> > +
> > +/*
> > + * There is a single buffer for reading event logs from the mailbox.  All logs
> > + * share this buffer protected by the cxlds->event_buf_lock.
> > + */
> > +static struct cxl_get_event_payload *alloc_event_buf(struct cxl_dev_state *cxlds)
> > +{
> > +	struct cxl_get_event_payload *buf;
> > +
> > +	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
> > +		cxlds->payload_size);
> > +
> > +	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> 
> huh. I assumed there would be a devm_kvmalloc() but apparently not..  Ah well

Nope I've learned my lesson and checked first!

> - whilst it might makes sense to add one, let's not tie that up with this series.

Yep I did not want to hold this up for something like that.

> 
> > +	if (buf && devm_add_action_or_reset(cxlds->dev,
> > +			cxl_mem_free_event_buffer, cxlds))
> > +		return NULL;
> 
> Trivial, but I'd go for a more wordy but more conventional pattern of
> 	if (!buf)
> 		return NULL;
> 
> 	if (devm_add_action_or_reset())
> 		return NULL

I've been beat up in the past for not combining statements before.  So I've a
bad habit sometimes.

This pattern is a bit more clear.  Since I'm adding the comment below I'll
change it.

> 	
> 	return buff;
> 	
> > +	return buf;
> > +}
> > +
> 
> ...
> 
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index cd35f43fedd4..55d57f5a64bc 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -4,6 +4,7 @@
> >  #define __CXL_MEM_H__
> >  #include <uapi/linux/cxl_mem.h>
> >  #include <linux/cdev.h>
> > +#include <linux/uuid.h>
> >  #include "cxl.h"
> >  
> >  /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
> > @@ -250,12 +251,16 @@ struct cxl_dev_state {
> >  
> >  	bool msi_enabled;
> >  
> > +	struct cxl_get_event_payload *event_buf;
> Whilst it is obvious (and document at point of allocation),
> I think one of the static checkers still warns that all locks must
> have comments.  Probably easier to add one now than wait for the
> inevitable warning report.

Well 0-day did not complain.  :-/  But I know there are other checkers out
there; better to add now, thanks.

Thanks for the review,
Ira

> 
> > +	struct mutex event_buf_lock;
> > +
> >  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> >  };
> >  
> 
> 

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 03/11] cxl/mem: Implement Clear Event Records command
  2022-12-01 13:26   ` Jonathan Cameron
@ 2022-12-01 15:30     ` Ira Weiny
  0 siblings, 0 replies; 64+ messages in thread
From: Ira Weiny @ 2022-12-01 15:30 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Thu, Dec 01, 2022 at 01:26:18PM +0000, Jonathan Cameron wrote:
> On Wed, 30 Nov 2022 16:27:11 -0800
> ira.weiny@intel.com wrote:
> 
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > CXL rev 3.0 section 8.2.9.2.3 defines the Clear Event Records mailbox
> > command.  After an event record is read it needs to be cleared from the
> > event log.
> > 
> > Implement cxl_clear_event_record() to clear all record retrieved from
> > the device.
> > 
> > Each record is cleared explicitly.  A clear all bit is specified but
> > events could arrive between a get and any final clear all operation.
> > This means events would be missed.
> > Therefore each event is cleared specifically.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> I think there is a type issue on the min_t() calculation with that addressed
> this looks good to me.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> > 
> > ---
> > Changes from V1:
> > 	Clear Event Record allows for u8 handles while Get Event Record
> > 	allows for u16 records to be returned.  Based on Jonathan's
> > 	feedback; allow for all event records to be handled in this
> > 	clear.  Which means a double loop with potentially multiple
> > 	Clear Event payloads being sent to clear all events sent.
> > 
> > Changes from RFC:
> > 	Jonathan
> > 		Clean up init of payload and use return code.
> > 		Also report any error to clear the event.
> > 		s/v3.0/rev 3.0
> > ---
> >  drivers/cxl/core/mbox.c      | 61 +++++++++++++++++++++++++++++++-----
> >  drivers/cxl/cxlmem.h         | 14 +++++++++
> >  include/uapi/linux/cxl_mem.h |  1 +
> >  3 files changed, 69 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index 70b681027a3d..076a3df0ba38 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> > @@ -52,6 +52,7 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
> >  #endif
> >  	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
> >  	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
> > +	CXL_CMD(CLEAR_EVENT_RECORD, CXL_VARIABLE_PAYLOAD, 0, 0),
> >  	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
> >  	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
> >  	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
> > @@ -708,6 +709,42 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
> >  }
> >  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
> >  
> > +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> > +				  enum cxl_event_log_type log,
> > +				  struct cxl_get_event_payload *get_pl,
> > +				  u16 total)
> > +{
> > +	struct cxl_mbox_clear_event_payload payload = {
> > +		.event_log = log,
> > +	};
> > +	int cnt;
> > +
> > +	/*
> > +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> > +	 * Record can return up to 0xffff records.
> > +	 */
> > +	for (cnt = 0; cnt < total; /* cnt incremented internally */) {
> > +		u8 nr_recs = min_t(u8, (total - cnt),
> > +				   CXL_CLEAR_EVENT_MAX_HANDLES);
> 
> I might be half asleep but isn't this assuming that (total - cnt)
> fits in an u8?  Shouldn't this be min_t(u16, ..) 

This cast will ensure the value is never out of range for nr_recs which needs
to be u8 and (total - cnt) will never be negative.

But now you have me double thinking myself.

> Also, maybe u16 cnt would be simpler.
> 
> Hmm.  This is safe but only because of how you call it alongside
> handling of a particular Get event records response (which must
> have fitted in the mailbox and has a longer header).
> 
> Looking at this function in isolation, I think the mailbox could be
> small enough that we might not fit 255 records + the header.
> Perhaps we need a comment to say that, or at minimum a check and error
> return if it won't fit?

I did not realize that Payload Size applied to input payloads as well.  :-/
There is no check in the send command for that ATM.  Looking at the spec I
think you are right.

I'll further limit the payload size here too.

And with this I might get rid of the min_t() and just cap based on that value.

> 
> > +		int i, rc;
> > +
> > +		for (i = 0; i < nr_recs; i++, cnt++) {
> > +			payload.handle[i] = get_pl->records[cnt].hdr.handle;
> > +			dev_dbg(cxlds->dev, "Event log '%s': Clearning %u\n",
> > +				cxl_event_log_type_str(log),
> > +				le16_to_cpu(payload.handle[i]));
> > +		}
> > +		payload.nr_recs = nr_recs;
> > +
> > +		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> > +				       &payload, sizeof(payload), NULL, 0);
> > +		if (rc)
> > +			return rc;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >  static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> >  				    enum cxl_event_log_type type)
> >  {
> > @@ -732,13 +769,22 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> This feels miss named now but I can't immediately think of better naming so on that
> basis fine to leave it as is if you don't have a better idea!.

So we leave it.  Naming is hard!  :-D

Thanks for the quick review, V3 coming ASAP.
Ira

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 04/11] cxl/mem: Clear events on driver load
  2022-12-01 13:30   ` Jonathan Cameron
@ 2022-12-01 17:02     ` Ira Weiny
  0 siblings, 0 replies; 64+ messages in thread
From: Ira Weiny @ 2022-12-01 17:02 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Dan Williams, Dave Jiang, Alison Schofield, Vishal Verma,
	Ben Widawsky, Steven Rostedt, Davidlohr Bueso, linux-kernel,
	linux-cxl

On Thu, Dec 01, 2022 at 01:30:33PM +0000, Jonathan Cameron wrote:
> On Wed, 30 Nov 2022 16:27:12 -0800
> ira.weiny@intel.com wrote:
> 
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > The information contained in the events prior to the driver loading can
> > be queried at any time through other mailbox commands.
> > 
> > Ensure a clean slate of events by reading and clearing the events.  The
> > events are sent to the trace buffer but it is not anticipated to have
> > anyone listening to it at driver load time.
> > 
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> Probably not worth addressing but there is a corner case where this might fail
> if some broken software already messed with reading out the events.

Yea they can keep the pieces if they have done that.

> 
> Imagine it read the first mailbox sized chunk, but didn't clear them...
> 
> If that happens, then we'd end up seeing the whole list, but in non
> temporal order and hence trying to clear them out of order with predictable
> fails.
> 
> Maybe this is the category of things we 'fix' if we ever hear of it actually
> happening.
> 
> So with that caveat called out so I can say 'I told you so' :), fine to keep my tag on this.

Sure!  We probably owe you this T-Shirt already!

https://www.amazon.com/Big-Bang-Theory-Informed-Thusly/dp/B06XYCSQRF

:-D

Ira

> 
> Thanks,
> 
> Jonathan
> 
> 
> > ---
> >  drivers/cxl/pci.c            | 2 ++
> >  tools/testing/cxl/test/mem.c | 2 ++
> >  2 files changed, 4 insertions(+)
> > 
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 8f86f85d89c7..11e95a95195a 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -521,6 +521,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (IS_ERR(cxlmd))
> >  		return PTR_ERR(cxlmd);
> >  
> > +	cxl_mem_get_event_records(cxlds);
> > +
> >  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
> >  		rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
> >  
> > diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> > index aa2df3a15051..e2f5445d24ff 100644
> > --- a/tools/testing/cxl/test/mem.c
> > +++ b/tools/testing/cxl/test/mem.c
> > @@ -285,6 +285,8 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
> >  	if (IS_ERR(cxlmd))
> >  		return PTR_ERR(cxlmd);
> >  
> > +	cxl_mem_get_event_records(cxlds);
> > +
> >  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
> >  		rc = devm_cxl_add_nvdimm(dev, cxlmd);
> >  
> 

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 08/11] cxl/mem: Wire up event interrupts
  2022-12-01 14:21   ` Jonathan Cameron
@ 2022-12-01 17:23     ` Ira Weiny
  0 siblings, 0 replies; 64+ messages in thread
From: Ira Weiny @ 2022-12-01 17:23 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Thu, Dec 01, 2022 at 02:21:18PM +0000, Jonathan Cameron wrote:
> On Wed, 30 Nov 2022 16:27:16 -0800
> ira.weiny@intel.com wrote:
> 
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > CXL device events are signaled via interrupts.  Each event log may have
> > a different interrupt message number.  These message numbers are
> > reported in the Get Event Interrupt Policy mailbox command.
> > 
> > Add interrupt support for event logs.  Interrupts are allocated as
> > shared interrupts.  Therefore, all or some event logs can share the same
> > message number.
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> A few trivial comments, but only superficially code style stuff which you
> can ignore if you feel strongly about current style or it matches existing
> file style etc...

Thanks I'm leaving this one I think.

> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> > 
> > ---
> > Changes from V1:
> > 	Remove unneeded evt_int_policy from struct cxl_dev_state
> > 	defer Dynamic Capacity support
> > 	Dave Jiang
> > 		s/irq/rc
> > 		use IRQ_NONE to signal the irq was not for us.
> > 	Jonathan
> > 		use msi_enabled rather than nr_irq_vec
> > 		On failure explicitly set CXL_INT_NONE
> > 		Add comment for Get Event Interrupt Policy
> > 		use devm_request_threaded_irq()
> > 		Use individual handler/thread functions for each of the
> > 		logs rather than struct cxl_event_irq_id.
> > 
> > Changes from RFC v2
> > 	Adjust to new irq 16 vector allocation
> > 	Jonathan
> > 		Remove CXL_INT_RES
> > 	Use irq threads to ensure mailbox commands are executed outside irq context
> > 	Adjust for optional Dynamic Capacity log
> > ---
> >  drivers/cxl/core/mbox.c      |  44 +++++++++++-
> >  drivers/cxl/cxlmem.h         |  30 ++++++++
> >  drivers/cxl/pci.c            | 130 +++++++++++++++++++++++++++++++++++
> >  include/uapi/linux/cxl_mem.h |   2 +
> >  4 files changed, 204 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index 450b410f29f6..2d384b0fc2b3 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -179,6 +179,30 @@ struct cxl_endpoint_dvsec_info {
> >  	struct range dvsec_range[2];
> >  };
> >  
> > +/**
> > + * Event Interrupt Policy
> > + *
> > + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> > + */
> > +enum cxl_event_int_mode {
> > +	CXL_INT_NONE		= 0x00,
> > +	CXL_INT_MSI_MSIX	= 0x01,
> > +	CXL_INT_FW		= 0x02
> > +};
> > +#define CXL_EVENT_INT_MODE_MASK 0x3
> > +#define CXL_EVENT_INT_MSGNUM(setting) (((setting) & 0xf0) >> 4)
> > +struct cxl_event_interrupt_policy {
> > +	u8 info_settings;
> > +	u8 warn_settings;
> > +	u8 failure_settings;
> > +	u8 fatal_settings;
> > +} __packed;
> > +
> > +static inline bool cxl_evt_int_is_msi(u8 setting)
> > +{
> > +	return CXL_INT_MSI_MSIX == (setting & CXL_EVENT_INT_MODE_MASK);
> 
> Maybe a case for FIELD_GET() though given the defines are all local
> it is already obvious what this is doing so fine if you prefer to
> keep it as is.
> 
> > +}
> ...
> 
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 11e95a95195a..3c0b9199f11a 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -449,6 +449,134 @@ static void cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> >  	cxlds->msi_enabled = true;
> >  }
> >  
> > +static irqreturn_t cxl_event_info_thread(int irq, void *id)
> > +{
> > +	struct cxl_dev_state *cxlds = id;
> > +
> > +	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> > +	return IRQ_HANDLED;
> > +}
> 
> I'm not a great fan of macros, but maybe this is a case for them.

Nor am I.  I particularly don't like when functions are defined with macros as
it makes git grep and cscope harder to use on the code.

> 
> > +
> > +static irqreturn_t cxl_event_info_handler(int irq, void *id)
> > +{
> > +	struct cxl_dev_state *cxlds = id;
> > +	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> 
> Superficial and this is guaranteed to work (8.2.8 allow all sizes of read up
> to 64 bytes), but maybe should treat this as a 64 bit register as that aligns
> better with spec?

I suppose it does.  When I looked at that I only noticed the 32bit field and
thought the register was 32bits.  But most of our reads are to u32 fields.

I think it is fine for now.

> 
> > +
> > +	if (CXLDEV_EVENT_STATUS_INFO & status)
> 
> Another maybe FIELD_GET() case?

Are we using field get for individual bits?  I see we are for some things.  :-(

I was thinking the 'field' is 'Event Status' and the 32 bit read of the
register already got that.

> 
> > +		return IRQ_WAKE_THREAD;
> > +	return IRQ_NONE;
> > +}
> > +
> > +static irqreturn_t cxl_event_warn_thread(int irq, void *id)
> > +{
> > +	struct cxl_dev_state *cxlds = id;
> 
> Why id?

Shortened from 'dev_id' from the devm_request_threaded_irq() function
signature.

> I'd call it what it is (maybe _cxlsd) and not bother with
> the local variable in this case as it is only used once and doesn't
> need the type.
> 
> static irqreturn_t cxl_event_warn_thread(int irq, void *cxlds)
> {
> 	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);

Yea in this case it is pretty clear but I hate having to look at function
signatures to figure out what a (void *) callback type is supposed to be.

This ...

> > +static irqreturn_t cxl_event_warn_thread(int irq, void *id)
> > +{
> > +	struct cxl_dev_state *cxlds = id;

... helps to self document cxl_event_warn_thread() IMO, and the compiler will
optimize.

Thanks for looking!
Ira

> 
> 	return IRQ_HANDLED;
> }
> 
> 
> > +
> > +	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> > +	return IRQ_HANDLED;
> > +}
> > +
> 
> ...
> 
> 
> 

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 02/11] cxl/mem: Implement Get Event Records command
  2022-12-01  0:27 ` [PATCH V2 02/11] cxl/mem: Implement Get Event Records command ira.weiny
  2022-12-01 13:06   ` Jonathan Cameron
@ 2022-12-01 17:38   ` Steven Rostedt
  2022-12-02  0:09     ` Ira Weiny
  2022-12-02  1:39   ` Dan Williams
  2 siblings, 1 reply; 64+ messages in thread
From: Steven Rostedt @ 2022-12-01 17:38 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Wed, 30 Nov 2022 16:27:10 -0800
ira.weiny@intel.com wrote:

>  CONEXANT ACCESSRUNNER USB DRIVER
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 16176b9278b4..70b681027a3d 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -7,6 +7,9 @@
>  #include <cxlmem.h>
>  #include <cxl.h>
>  
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/cxl.h>
> +
>  #include "core.h"
>  
>  static bool cxl_raw_allow_all;
> @@ -48,6 +51,7 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
>  	CXL_CMD(RAW, CXL_VARIABLE_PAYLOAD, CXL_VARIABLE_PAYLOAD, 0),
>  #endif
>  	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
> +	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
>  	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
>  	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
>  	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
> @@ -704,6 +708,106 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>  
> +static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> +				    enum cxl_event_log_type type)
> +{
> +	struct cxl_get_event_payload *payload;
> +	u16 nr_rec;
> +
> +	mutex_lock(&cxlds->event_buf_lock);
> +
> +	payload = cxlds->event_buf;
> +
> +	do {
> +		u8 log_type = type;
> +		int rc;
> +
> +		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVENT_RECORD,
> +				       &log_type, sizeof(log_type),
> +				       payload, cxlds->payload_size);
> +		if (rc) {
> +			dev_err(cxlds->dev, "Event log '%s': Failed to query event records : %d",
> +				cxl_event_log_type_str(type), rc);
> +			goto unlock_buffer;
> +		}
> +
> +		nr_rec = le16_to_cpu(payload->record_count);
> +		if (trace_cxl_generic_event_enabled()) {
> +			int i;
> +
> +			for (i = 0; i < nr_rec; i++)
> +				trace_cxl_generic_event(dev_name(cxlds->dev),
> +							type,
> +							&payload->records[i]);
> +		}
> +
> +		if (trace_cxl_overflow_enabled() &&
> +		    (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW))
> +			trace_cxl_overflow(dev_name(cxlds->dev), type, payload);
> +
> +	} while (nr_rec);
> +
> +unlock_buffer:
> +	mutex_unlock(&cxlds->event_buf_lock);
> +}
> +
> +static void cxl_mem_free_event_buffer(void *data)
> +{
> +	struct cxl_dev_state *cxlds = data;
> +
> +	kvfree(cxlds->event_buf);
> +}
> +
> +/*
> + * There is a single buffer for reading event logs from the mailbox.  All logs
> + * share this buffer protected by the cxlds->event_buf_lock.
> + */
> +static struct cxl_get_event_payload *alloc_event_buf(struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_get_event_payload *buf;
> +
> +	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
> +		cxlds->payload_size);
> +
> +	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> +	if (buf && devm_add_action_or_reset(cxlds->dev,
> +			cxl_mem_free_event_buffer, cxlds))
> +		return NULL;
> +	return buf;
> +}
> +
> +/**
> + * cxl_mem_get_event_records - Get Event Records from the device
> + * @cxlds: The device data for the operation
> + *
> + * Retrieve all event records available on the device and report them as trace
> + * events.
> + *
> + * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> + */
> +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> +{
> +	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> +
> +	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
> +
> +	if (!cxlds->event_buf) {
> +		cxlds->event_buf = alloc_event_buf(cxlds);
> +		if (WARN_ON_ONCE(!cxlds->event_buf))
> +			return;
> +	}
> +
> +	if (status & CXLDEV_EVENT_STATUS_INFO)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> +	if (status & CXLDEV_EVENT_STATUS_WARN)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> +	if (status & CXLDEV_EVENT_STATUS_FAIL)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> +	if (status & CXLDEV_EVENT_STATUS_FATAL)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> +
>  /**
>   * cxl_mem_get_partition_info - Get partition info
>   * @cxlds: The device data for the operation
> @@ -846,6 +950,7 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
>  	}
>  
>  	mutex_init(&cxlds->mbox_mutex);
> +	mutex_init(&cxlds->event_buf_lock);
>  	cxlds->dev = dev;
>  
>  	return cxlds;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index f680450f0b16..d4baae74cd97 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -132,6 +132,13 @@ static inline int ways_to_cxl(unsigned int ways, u8 *iw)
>  #define CXLDEV_CAP_CAP_ID_SECONDARY_MAILBOX 0x3
>  #define CXLDEV_CAP_CAP_ID_MEMDEV 0x4000
>  
> +/* CXL 3.0 8.2.8.3.1 Event Status Register */
> +#define CXLDEV_DEV_EVENT_STATUS_OFFSET		0x00
> +#define CXLDEV_EVENT_STATUS_INFO		BIT(0)
> +#define CXLDEV_EVENT_STATUS_WARN		BIT(1)
> +#define CXLDEV_EVENT_STATUS_FAIL		BIT(2)
> +#define CXLDEV_EVENT_STATUS_FATAL		BIT(3)
> +
>  /* CXL 2.0 8.2.8.4 Mailbox Registers */
>  #define CXLDEV_MBOX_CAPS_OFFSET 0x00
>  #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index cd35f43fedd4..55d57f5a64bc 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -4,6 +4,7 @@
>  #define __CXL_MEM_H__
>  #include <uapi/linux/cxl_mem.h>
>  #include <linux/cdev.h>
> +#include <linux/uuid.h>
>  #include "cxl.h"
>  
>  /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
> @@ -250,12 +251,16 @@ struct cxl_dev_state {
>  
>  	bool msi_enabled;
>  
> +	struct cxl_get_event_payload *event_buf;
> +	struct mutex event_buf_lock;
> +
>  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
>  };
>  
>  enum cxl_opcode {
>  	CXL_MBOX_OP_INVALID		= 0x0000,
>  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> +	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
>  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
>  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
>  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> @@ -325,6 +330,72 @@ struct cxl_mbox_identify {
>  	u8 qos_telemetry_caps;
>  } __packed;
>  
> +/*
> + * Common Event Record Format
> + * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
> + */
> +struct cxl_event_record_hdr {
> +	uuid_t id;
> +	u8 length;
> +	u8 flags[3];
> +	__le16 handle;
> +	__le16 related_handle;
> +	__le64 timestamp;
> +	u8 maint_op_class;
> +	u8 reserved[0xf];
> +} __packed;
> +
> +#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
> +struct cxl_event_record_raw {
> +	struct cxl_event_record_hdr hdr;
> +	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
> +} __packed;
> +
> +/*
> + * Get Event Records output payload
> + * CXL rev 3.0 section 8.2.9.2.2; Table 8-50
> + */
> +#define CXL_GET_EVENT_FLAG_OVERFLOW		BIT(0)
> +#define CXL_GET_EVENT_FLAG_MORE_RECORDS		BIT(1)
> +struct cxl_get_event_payload {
> +	u8 flags;
> +	u8 reserved1;
> +	__le16 overflow_err_count;
> +	__le64 first_overflow_timestamp;
> +	__le64 last_overflow_timestamp;
> +	__le16 record_count;
> +	u8 reserved2[0xa];
> +	struct cxl_event_record_raw records[];
> +} __packed;
> +
> +/*
> + * CXL rev 3.0 section 8.2.9.2.2; Table 8-49
> + */
> +enum cxl_event_log_type {
> +	CXL_EVENT_TYPE_INFO = 0x00,
> +	CXL_EVENT_TYPE_WARN,
> +	CXL_EVENT_TYPE_FAIL,
> +	CXL_EVENT_TYPE_FATAL,
> +	CXL_EVENT_TYPE_MAX
> +};
> +
> +static inline const char *cxl_event_log_type_str(enum cxl_event_log_type type)
> +{
> +	switch (type) {
> +	case CXL_EVENT_TYPE_INFO:
> +		return "Informational";
> +	case CXL_EVENT_TYPE_WARN:
> +		return "Warning";
> +	case CXL_EVENT_TYPE_FAIL:
> +		return "Failure";
> +	case CXL_EVENT_TYPE_FATAL:
> +		return "Fatal";
> +	default:
> +		break;
> +	}
> +	return "<unknown>";
> +}

So you are using this in a TP_printk() section, which means perf and
trace-cmd have no idea how to parse it. Can I recommend instead having:

#define cxl_event_log_type_str(type)				\
	__print_symbolic(type,					\
		{ CXL_EVENT_TYPE_INFO, "Informational" },	\
		{ CXL_EVENT_TYPE_WARN, "Warning" },		\
		{ CXL_EVENT_TYPE_FAIL, "Failure" },		\
		{ CXL_EVENT_TYPE_FATAL, "Fatal" })

#ifndef CREATE_TRACE_POINTS
static inline const char *__cxl_event_log_type_str(enum cxl_event_log_type type,
			struct trace_print_flags *symbols)
{
	for (; symbols->mask >= 0; symbols++) {
		if (type == symbols->mask)
			return symbols->name;
	}
	return "<unknown>";
}
#define __print_symbolic(value, symbol_array...)			\
	({								\
		static const struct trace_print_flags symbols[] =	\
			{ symbol_array, { -1, NULL }};			\
		__cxl_event_log_type_str(value, symbols);		\
	})
#endif		

Note, I did not even try to compile the above. But it should be close to
working.

This way, the cxl_event_log_type_str() for trace events will be converted
into the __print_symbolic() which can be parsed by perf and trace-cmd. For
all other use cases, it is converted into the function above to return the
string.

-- Steve

> +
>  struct cxl_mbox_get_partition_info {
>  	__le64 active_volatile_cap;
>  	__le64 active_persistent_cap;
> @@ -384,6 +455,7 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
>  struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
>  void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
>  void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds);
>  #ifdef CONFIG_CXL_SUSPEND
>  void cxl_mem_active_inc(void);
>  void cxl_mem_active_dec(void);
> diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> new file mode 100644
> index 000000000000..c03a1a894af8
> --- /dev/null
> +++ b/include/trace/events/cxl.h
> @@ -0,0 +1,126 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM cxl
> +
> +#if !defined(_CXL_TRACE_EVENTS_H) ||  defined(TRACE_HEADER_MULTI_READ)
> +#define _CXL_TRACE_EVENTS_H
> +
> +#include <asm-generic/unaligned.h>
> +#include <linux/tracepoint.h>
> +#include <cxlmem.h>
> +
> +TRACE_EVENT(cxl_overflow,
> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_get_event_payload *payload),
> +
> +	TP_ARGS(dev_name, log, payload),
> +
> +	TP_STRUCT__entry(
> +		__string(dev_name, dev_name)
> +		__field(int, log)
> +		__field(u64, first_ts)
> +		__field(u64, last_ts)
> +		__field(u16, count)
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(dev_name, dev_name);
> +		__entry->log = log;
> +		__entry->count = le16_to_cpu(payload->overflow_err_count);
> +		__entry->first_ts = le64_to_cpu(payload->first_overflow_timestamp);
> +		__entry->last_ts = le64_to_cpu(payload->last_overflow_timestamp);
> +	),
> +
> +	TP_printk("%s: EVENT LOG OVERFLOW log=%s : %u records from %llu to %llu",
> +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),
> +		__entry->count, __entry->first_ts, __entry->last_ts)
> +
> +);
> +
> +/*
> + * Common Event Record Format
> + * CXL 3.0 section 8.2.9.2.1; Table 8-42
> + */
> +#define CXL_EVENT_RECORD_FLAG_PERMANENT		BIT(2)
> +#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED	BIT(3)
> +#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED	BIT(4)
> +#define CXL_EVENT_RECORD_FLAG_HW_REPLACE	BIT(5)
> +#define show_hdr_flags(flags)	__print_flags(flags, " | ",			   \
> +	{ CXL_EVENT_RECORD_FLAG_PERMANENT,	"PERMANENT_CONDITION"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,	"MAINTENANCE_NEEDED"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,	"PERFORMANCE_DEGRADED"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_HW_REPLACE,	"HARDWARE_REPLACEMENT_NEEDED"	}  \
> +)
> +
> +/*
> + * Define macros for the common header of each CXL event.
> + *
> + * Tracepoints using these macros must do 3 things:
> + *
> + *	1) Add CXL_EVT_TP_entry to TP_STRUCT__entry
> + *	2) Use CXL_EVT_TP_fast_assign within TP_fast_assign;
> + *	   pass the dev_name, log, and CXL event header
> + *	3) Use CXL_EVT_TP_printk() instead of TP_printk()
> + *
> + * See the generic_event tracepoint as an example.
> + */
> +#define CXL_EVT_TP_entry					\
> +	__string(dev_name, dev_name)				\
> +	__field(int, log)					\
> +	__field_struct(uuid_t, hdr_uuid)			\
> +	__field(u32, hdr_flags)					\
> +	__field(u16, hdr_handle)				\
> +	__field(u16, hdr_related_handle)			\
> +	__field(u64, hdr_timestamp)				\
> +	__field(u8, hdr_length)					\
> +	__field(u8, hdr_maint_op_class)
> +
> +#define CXL_EVT_TP_fast_assign(dname, l, hdr)					\
> +	__assign_str(dev_name, (dname));					\
> +	__entry->log = (l);							\
> +	memcpy(&__entry->hdr_uuid, &(hdr).id, sizeof(uuid_t));			\
> +	__entry->hdr_length = (hdr).length;					\
> +	__entry->hdr_flags = get_unaligned_le24((hdr).flags);			\
> +	__entry->hdr_handle = le16_to_cpu((hdr).handle);			\
> +	__entry->hdr_related_handle = le16_to_cpu((hdr).related_handle);	\
> +	__entry->hdr_timestamp = le64_to_cpu((hdr).timestamp);			\
> +	__entry->hdr_maint_op_class = (hdr).maint_op_class
> +
> +#define CXL_EVT_TP_printk(fmt, ...) \
> +	TP_printk("%s log=%s : time=%llu uuid=%pUb len=%d flags='%s' "		\
> +		"handle=%x related_handle=%x maint_op_class=%u"			\
> +		" : " fmt,							\
> +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),	\
> +		__entry->hdr_timestamp, &__entry->hdr_uuid, __entry->hdr_length,\
> +		show_hdr_flags(__entry->hdr_flags), __entry->hdr_handle,	\
> +		__entry->hdr_related_handle, __entry->hdr_maint_op_class,	\
> +		##__VA_ARGS__)
> +
> +TRACE_EVENT(cxl_generic_event,
> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_event_record_raw *rec),
> +
> +	TP_ARGS(dev_name, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +		__array(u8, data, CXL_EVENT_RECORD_DATA_LENGTH)
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +		memcpy(__entry->data, &rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
> +	),
> +
> +	CXL_EVT_TP_printk("%s",
> +		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
> +);
> +
> +#endif /* _CXL_TRACE_EVENTS_H */
> +
> +/* This part must be outside protection */
> +#undef TRACE_INCLUDE_FILE
> +#define TRACE_INCLUDE_FILE cxl
> +#include <trace/define_trace.h>
> diff --git a/include/uapi/linux/cxl_mem.h b/include/uapi/linux/cxl_mem.h
> index c71021a2a9ed..70459be5bdd4 100644
> --- a/include/uapi/linux/cxl_mem.h
> +++ b/include/uapi/linux/cxl_mem.h
> @@ -24,6 +24,7 @@
>  	___C(IDENTIFY, "Identify Command"),                               \
>  	___C(RAW, "Raw device command"),                                  \
>  	___C(GET_SUPPORTED_LOGS, "Get Supported Logs"),                   \
> +	___C(GET_EVENT_RECORD, "Get Event Record"),                       \
>  	___C(GET_FW_INFO, "Get FW Info"),                                 \
>  	___C(GET_PARTITION_INFO, "Get Partition Information"),            \
>  	___C(GET_LSA, "Get Label Storage Area"),                          \


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 09/11] cxl/test: Add generic mock events
  2022-12-01 14:37   ` Jonathan Cameron
@ 2022-12-01 17:49     ` Ira Weiny
  0 siblings, 0 replies; 64+ messages in thread
From: Ira Weiny @ 2022-12-01 17:49 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Thu, Dec 01, 2022 at 02:37:27PM +0000, Jonathan Cameron wrote:
> On Wed, 30 Nov 2022 16:27:17 -0800
> ira.weiny@intel.com wrote:
> 
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > Facilitate testing basic Get/Clear Event functionality by creating
> > multiple logs and generic events with made up UUID's.
> > 
> > Data is completely made up with data patterns which should be easy to
> > spot in trace output.
> > 
> > A single sysfs entry resets the event data and triggers collecting the
> > events for testing.
> > 
> > Events are returned one at a time which is within the specification even
> > though it does not exercise the full capabilities of what a device may
> > do.
> > 
> > Test traces are easy to obtain with a small script such as this:
> > 
> > 	#!/bin/bash -x
> > 
> > 	devices=`find /sys/devices/platform -name cxl_mem*`
> > 
> > 	# Turn on tracing
> > 	echo "" > /sys/kernel/tracing/trace
> > 	echo 1 > /sys/kernel/tracing/events/cxl/enable
> > 	echo 1 > /sys/kernel/tracing/tracing_on
> > 
> > 	# Generate fake interrupt
> > 	for device in $devices; do
> > 	        echo 1 > $device/event_trigger
> > 	done
> > 
> > 	# Turn off tracing and report events
> > 	echo 0 > /sys/kernel/tracing/tracing_on
> > 	cat /sys/kernel/tracing/trace
> > 
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > 
> 
> A minor comment on xarray cleanup inline
> 
> Jonathan
> 
> 
> > +void cxl_mock_remove_event_logs(struct device *dev)
> > +{
> > +	struct mock_event_store *mes;
> > +
> > +	mes = xa_erase(&mock_dev_event_store, (unsigned long)dev);
> 
> As below, I'd move this into a devm_add_action_or_reset() so
> that we don't need to deal with doing it manually.

yea

> 
> > +}
> > +EXPORT_SYMBOL_GPL(cxl_mock_remove_event_logs);
> 
> ...
> 
> >  static int cxl_mock_mem_probe(struct platform_device *pdev)
> >  {
> >  	struct device *dev = &pdev->dev;
> >  	struct cxl_memdev *cxlmd;
> >  	struct cxl_dev_state *cxlds;
> > +	u32 ev_status;
> >  	void *lsa;
> >  	int rc;
> >  
> > @@ -281,11 +304,13 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
> >  	if (rc)
> >  		return rc;
> >  
> > +	ev_status = cxl_mock_add_event_logs(cxlds);
> 
> On error later in this function these leak.  Just use devm_add_action_or_reset()
> inside cxl_mock_add_event_logs() so we don't have to care about that.

I see it now.  There is an entry in the xarray which is leaked.

> 
> > +
> >  	cxlmd = devm_cxl_add_memdev(cxlds);
> >  	if (IS_ERR(cxlmd))
> >  		return PTR_ERR(cxlmd);
> >  
> > -	cxl_mem_get_event_records(cxlds);
> > +	__cxl_mem_get_event_records(cxlds, ev_status);
> >  
> >  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
> >  		rc = devm_cxl_add_nvdimm(dev, cxlmd);
> > @@ -293,6 +318,12 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
> >  	return 0;
> >  }
> >  
> > +static int cxl_mock_mem_remove(struct platform_device *pdev)
> > +{
> > +	cxl_mock_remove_event_logs(&pdev->dev);
> Why not use devm_add_action_or_reset()?

Yea that is a better pattern.

Ira

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 08/11] cxl/mem: Wire up event interrupts
  2022-12-01  0:27 ` [PATCH V2 08/11] cxl/mem: Wire up event interrupts ira.weiny
  2022-12-01 14:21   ` Jonathan Cameron
@ 2022-12-01 18:35   ` Davidlohr Bueso
  2022-12-02  7:37   ` Dan Williams
  2 siblings, 0 replies; 64+ messages in thread
From: Davidlohr Bueso @ 2022-12-01 18:35 UTC (permalink / raw)
  To: ira.weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Dave Jiang, linux-kernel,
	linux-cxl

On Wed, 30 Nov 2022, ira.weiny@intel.com wrote:

>From: Ira Weiny <ira.weiny@intel.com>
>
>CXL device events are signaled via interrupts.  Each event log may have
>a different interrupt message number.  These message numbers are
>reported in the Get Event Interrupt Policy mailbox command.
>
>Add interrupt support for event logs.  Interrupts are allocated as
>shared interrupts.  Therefore, all or some event logs can share the same
>message number.
>
>Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 01/11] cxl/pci: Add generic MSI-X/MSI irq support
  2022-12-01  0:27 ` [PATCH V2 01/11] cxl/pci: Add generic MSI-X/MSI irq support ira.weiny
  2022-12-01 10:18   ` Jonathan Cameron
@ 2022-12-01 18:37   ` Dave Jiang
  2022-12-02  0:23   ` Dan Williams
  2 siblings, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2022-12-01 18:37 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Davidlohr Bueso, Bjorn Helgaas, Jonathan Cameron,
	Alison Schofield, Vishal Verma, Ben Widawsky, Steven Rostedt,
	linux-kernel, linux-cxl



On 11/30/2022 5:27 PM, ira.weiny@intel.com wrote:
> From: Davidlohr Bueso <dave@stgolabs.net>
> 
> Currently the only CXL features targeted for irq support require their
> message numbers to be within the first 16 entries.  The device may
> however support less than 16 entries depending on the support it
> provides.
> 
> Attempt to allocate these 16 irq vectors.  If the device supports less
> then the PCI infrastructure will allocate that number.  Store the number
> of vectors actually allocated in the device state for later use
> by individual functions.
> 
> Upon successful allocation, users can plug in their respective isr at
> any point thereafter, for example, if the irq setup is not done in the
> PCI driver, such as the case of the CXL-PMU.
> 
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

> 
> ---
> Changes from V1:
> 	Jonathan
> 		pci_alloc_irq_vectors() cleans up the vectors automatically
> 		use msi_enabled rather than nr_irq_vecs
> 
> Changes from Ira
> 	Remove reviews
> 	Allocate up to a static 16 vectors.
> 	Change cover letter
> ---
>   drivers/cxl/cxlmem.h |  3 +++
>   drivers/cxl/cxlpci.h |  6 ++++++
>   drivers/cxl/pci.c    | 23 +++++++++++++++++++++++
>   3 files changed, 32 insertions(+)
> 
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 88e3a8e54b6a..cd35f43fedd4 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -211,6 +211,7 @@ struct cxl_endpoint_dvsec_info {
>    * @info: Cached DVSEC information about the device.
>    * @serial: PCIe Device Serial Number
>    * @doe_mbs: PCI DOE mailbox array
> + * @msi_enabled: MSI-X/MSI has been enabled
>    * @mbox_send: @dev specific transport for transmitting mailbox commands
>    *
>    * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> @@ -247,6 +248,8 @@ struct cxl_dev_state {
>   
>   	struct xarray doe_mbs;
>   
> +	bool msi_enabled;
> +
>   	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
>   };
>   
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index eec597dbe763..b7f4e2f417d3 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -53,6 +53,12 @@
>   #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
>   #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
>   
> +/*
> + * NOTE: Currently all the functions which are enabled for CXL require their
> + * vectors to be in the first 16.  Use this as the max.
> + */
> +#define CXL_PCI_REQUIRED_VECTORS 16
> +
>   /* Register Block Identifier (RBI) */
>   enum cxl_regloc_type {
>   	CXL_REGLOC_RBI_EMPTY = 0,
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index faeb5d9d7a7a..8f86f85d89c7 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -428,6 +428,27 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
>   	}
>   }
>   
> +static void cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> +{
> +	struct device *dev = cxlds->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	int nvecs;
> +
> +	/*
> +	 * NOTE: pci_alloc_irq_vectors() handles calling pci_free_irq_vectors()
> +	 * automatically despite not being called pcim_*.  See
> +	 * pci_setup_msi_context().
> +	 */
> +	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_REQUIRED_VECTORS,
> +				   PCI_IRQ_MSIX | PCI_IRQ_MSI);
> +	if (nvecs < 0) {
> +		dev_dbg(dev, "Failed to alloc irq vectors; use polling instead.\n");
> +		return;
> +	}
> +
> +	cxlds->msi_enabled = true;
> +}
> +
>   static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>   {
>   	struct cxl_register_map map;
> @@ -494,6 +515,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>   	if (rc)
>   		return rc;
>   
> +	cxl_pci_alloc_irq_vectors(cxlds);
> +
>   	cxlmd = devm_cxl_add_memdev(cxlds);
>   	if (IS_ERR(cxlmd))
>   		return PTR_ERR(cxlmd);

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 05/11] cxl/mem: Trace General Media Event Record
  2022-12-01  0:27 ` [PATCH V2 05/11] cxl/mem: Trace General Media Event Record ira.weiny
@ 2022-12-01 18:54   ` Dave Jiang
  2022-12-02  6:18   ` Dan Williams
  1 sibling, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2022-12-01 18:54 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Steven Rostedt, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ben Widawsky, Davidlohr Bueso, linux-kernel, linux-cxl



On 11/30/2022 5:27 PM, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record.
> 
> Determine if the event read is a general media record and if so trace
> the record as a General Media Event Record.
> 
> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

> 
> ---
> Changes from V1:
> 	Jonathan
> 		fix spec references for CXL rev 3.0
> 		Make flags all caps
> 
> Changes from RFC v2:
> 	Output DPA flags as a single field
> 	Ensure names of fields match what TP_print outputs
> 	Steven
> 		prefix TRACE_EVENT with 'cxl_'
> 	Jonathan
> 		Remove Reserved field
> 
> Changes from RFC:
> 	Add reserved byte array
> 	Use common CXL event header record macros
> 	Jonathan
> 		Use unaligned_le{24,16} for unaligned fields
> 		Don't use the inverse of phy addr mask
> 	Dave Jiang
> 		s/cxl_gen_media_event/general_media
> 		s/cxl_evt_gen_media/cxl_event_gen_media
> ---
>   drivers/cxl/core/mbox.c    |  40 ++++++++++--
>   drivers/cxl/cxlmem.h       |  19 ++++++
>   include/trace/events/cxl.h | 124 +++++++++++++++++++++++++++++++++++++
>   3 files changed, 179 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 076a3df0ba38..20191fe55bba 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -709,6 +709,38 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>   
> +/*
> + * General Media Event Record
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + */
> +static const uuid_t gen_media_event_uuid =
> +	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
> +		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
> +
> +static bool cxl_event_tracing_enabled(void)
> +{
> +	return trace_cxl_generic_event_enabled() ||
> +	       trace_cxl_general_media_enabled();
> +}
> +
> +static void cxl_trace_event_record(const char *dev_name,
> +				   enum cxl_event_log_type type,
> +				   struct cxl_event_record_raw *record)
> +{
> +	uuid_t *id = &record->hdr.id;
> +
> +	if (uuid_equal(id, &gen_media_event_uuid)) {
> +		struct cxl_event_gen_media *rec =
> +				(struct cxl_event_gen_media *)record;
> +
> +		trace_cxl_general_media(dev_name, type, rec);
> +		return;
> +	}
> +
> +	/* For unknown record types print just the header */
> +	trace_cxl_generic_event(dev_name, type, record);
> +}
> +
>   static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
>   				  enum cxl_event_log_type log,
>   				  struct cxl_get_event_payload *get_pl,
> @@ -772,11 +804,11 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
>   		if (nr_rec > 0) {
>   			int i;
>   
> -			if (trace_cxl_generic_event_enabled()) {
> +			if (cxl_event_tracing_enabled()) {
>   				for (i = 0; i < nr_rec; i++)
> -					trace_cxl_generic_event(dev_name(cxlds->dev),
> -								type,
> -								&payload->records[i]);
> +					cxl_trace_event_record(dev_name(cxlds->dev),
> +							       type,
> +							       &payload->records[i]);
>   			}
>   
>   			rc = cxl_clear_event_record(cxlds, type, payload, nr_rec);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 1ae9962c5a06..10696debefa8 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -410,6 +410,25 @@ struct cxl_mbox_clear_event_payload {
>   	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];
>   };
>   
> +/*
> + * General Media Event Record
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + */
> +#define CXL_EVENT_GEN_MED_COMP_ID_SIZE	0x10
> +struct cxl_event_gen_media {
> +	struct cxl_event_record_hdr hdr;
> +	__le64 phys_addr;
> +	u8 descriptor;
> +	u8 type;
> +	u8 transaction_type;
> +	u8 validity_flags[2];
> +	u8 channel;
> +	u8 rank;
> +	u8 device[3];
> +	u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
> +	u8 reserved[0x2e];
> +} __packed;
> +
>   struct cxl_mbox_get_partition_info {
>   	__le64 active_volatile_cap;
>   	__le64 active_persistent_cap;
> diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> index c03a1a894af8..a4d6bd64e9bc 100644
> --- a/include/trace/events/cxl.h
> +++ b/include/trace/events/cxl.h
> @@ -118,6 +118,130 @@ TRACE_EVENT(cxl_generic_event,
>   		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
>   );
>   
> +/*
> + * Physical Address field masks
> + *
> + * General Media Event Record
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + *
> + * DRAM Event Record
> + * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> + */
> +#define CXL_DPA_FLAGS_MASK			0x3F
> +#define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
> +
> +#define CXL_DPA_VOLATILE			BIT(0)
> +#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
> +#define show_dpa_flags(flags)	__print_flags(flags, "|",		   \
> +	{ CXL_DPA_VOLATILE,			"VOLATILE"		}, \
> +	{ CXL_DPA_NOT_REPAIRABLE,		"NOT_REPAIRABLE"	}  \
> +)
> +
> +/*
> + * General Media Event Record - GMER
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + */
> +#define CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT		BIT(0)
> +#define CXL_GMER_EVT_DESC_THRESHOLD_EVENT		BIT(1)
> +#define CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW		BIT(2)
> +#define show_event_desc_flags(flags)	__print_flags(flags, "|",		   \
> +	{ CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,		"UNCORRECTABLE_EVENT"	}, \
> +	{ CXL_GMER_EVT_DESC_THRESHOLD_EVENT,		"THRESHOLD_EVENT"	}, \
> +	{ CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW,	"POISON_LIST_OVERFLOW"	}  \
> +)
> +
> +#define CXL_GMER_MEM_EVT_TYPE_ECC_ERROR			0x00
> +#define CXL_GMER_MEM_EVT_TYPE_INV_ADDR			0x01
> +#define CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR		0x02
> +#define show_mem_event_type(type)	__print_symbolic(type,			\
> +	{ CXL_GMER_MEM_EVT_TYPE_ECC_ERROR,		"ECC Error" },		\
> +	{ CXL_GMER_MEM_EVT_TYPE_INV_ADDR,		"Invalid Address" },	\
> +	{ CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,	"Data Path Error" }	\
> +)
> +
> +#define CXL_GMER_TRANS_UNKNOWN				0x00
> +#define CXL_GMER_TRANS_HOST_READ			0x01
> +#define CXL_GMER_TRANS_HOST_WRITE			0x02
> +#define CXL_GMER_TRANS_HOST_SCAN_MEDIA			0x03
> +#define CXL_GMER_TRANS_HOST_INJECT_POISON		0x04
> +#define CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB		0x05
> +#define CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT	0x06
> +#define show_trans_type(type)	__print_symbolic(type,					\
> +	{ CXL_GMER_TRANS_UNKNOWN,			"Unknown" },			\
> +	{ CXL_GMER_TRANS_HOST_READ,			"Host Read" },			\
> +	{ CXL_GMER_TRANS_HOST_WRITE,			"Host Write" },			\
> +	{ CXL_GMER_TRANS_HOST_SCAN_MEDIA,		"Host Scan Media" },		\
> +	{ CXL_GMER_TRANS_HOST_INJECT_POISON,		"Host Inject Poison" },		\
> +	{ CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,		"Internal Media Scrub" },	\
> +	{ CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT,	"Internal Media Management" }	\
> +)
> +
> +#define CXL_GMER_VALID_CHANNEL				BIT(0)
> +#define CXL_GMER_VALID_RANK				BIT(1)
> +#define CXL_GMER_VALID_DEVICE				BIT(2)
> +#define CXL_GMER_VALID_COMPONENT			BIT(3)
> +#define show_valid_flags(flags)	__print_flags(flags, "|",		   \
> +	{ CXL_GMER_VALID_CHANNEL,			"CHANNEL"	}, \
> +	{ CXL_GMER_VALID_RANK,				"RANK"		}, \
> +	{ CXL_GMER_VALID_DEVICE,			"DEVICE"	}, \
> +	{ CXL_GMER_VALID_COMPONENT,			"COMPONENT"	}  \
> +)
> +
> +TRACE_EVENT(cxl_general_media,
> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_event_gen_media *rec),
> +
> +	TP_ARGS(dev_name, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +		/* General Media */
> +		__field(u64, dpa)
> +		__field(u8, descriptor)
> +		__field(u8, type)
> +		__field(u8, transaction_type)
> +		__field(u8, channel)
> +		__field(u32, device)
> +		__array(u8, comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE)
> +		__field(u16, validity_flags)
> +		/* Following are out of order to pack trace record */
> +		__field(u8, rank)
> +		__field(u8, dpa_flags)
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +
> +		/* General Media */
> +		__entry->dpa = le64_to_cpu(rec->phys_addr);
> +		__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
> +		/* Mask after flags have been parsed */
> +		__entry->dpa &= CXL_DPA_MASK;
> +		__entry->descriptor = rec->descriptor;
> +		__entry->type = rec->type;
> +		__entry->transaction_type = rec->transaction_type;
> +		__entry->channel = rec->channel;
> +		__entry->rank = rec->rank;
> +		__entry->device = get_unaligned_le24(rec->device);
> +		memcpy(__entry->comp_id, &rec->component_id,
> +			CXL_EVENT_GEN_MED_COMP_ID_SIZE);
> +		__entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
> +	),
> +
> +	CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' " \
> +		"descriptor='%s' type='%s' transaction_type='%s' channel=%u rank=%u " \
> +		"device=%x comp_id=%s validity_flags='%s'",
> +		__entry->dpa, show_dpa_flags(__entry->dpa_flags),
> +		show_event_desc_flags(__entry->descriptor),
> +		show_mem_event_type(__entry->type),
> +		show_trans_type(__entry->transaction_type),
> +		__entry->channel, __entry->rank, __entry->device,
> +		__print_hex(__entry->comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE),
> +		show_valid_flags(__entry->validity_flags)
> +	)
> +);
> +
>   #endif /* _CXL_TRACE_EVENTS_H */
>   
>   /* This part must be outside protection */

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 06/11] cxl/mem: Trace DRAM Event Record
  2022-12-01  0:27 ` [PATCH V2 06/11] cxl/mem: Trace DRAM " ira.weiny
@ 2022-12-01 18:55   ` Dave Jiang
  0 siblings, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2022-12-01 18:55 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Steven Rostedt, Jonathan Cameron, Alison Schofield, Vishal Verma,
	Ben Widawsky, Davidlohr Bueso, linux-kernel, linux-cxl



On 11/30/2022 5:27 PM, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL rev 3.0 section 8.2.9.2.1.2 defines the DRAM Event Record.
> 
> Determine if the event read is a DRAM event record and if so trace the
> record.
> 
> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> 
> ---
> Changes from RFC v2:
> 	Output DPA flags as a separate field.
> 	Ensure field names match TP_print output
> 	Steven
> 		prefix TRACE_EVENT with 'cxl_'
> 	Jonathan
> 		Formatting fix
> 		Remove reserved field
> 
> Changes from RFC:
> 	Add reserved byte data
> 	Use new CXL header macros
> 	Jonathan
> 		Use get_unaligned_le{24,16}() for unaligned fields
> 		Use 'else if'
> 	Dave Jiang
> 		s/cxl_dram_event/dram
> 		s/cxl_evt_dram_rec/cxl_event_dram
> 	Adjust for new phys addr mask
> ---
>   drivers/cxl/core/mbox.c    | 16 ++++++-
>   drivers/cxl/cxlmem.h       | 23 ++++++++++
>   include/trace/events/cxl.h | 92 ++++++++++++++++++++++++++++++++++++++
>   3 files changed, 130 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 20191fe55bba..66fc50d89bf4 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -717,10 +717,19 @@ static const uuid_t gen_media_event_uuid =
>   	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
>   		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
>   
> +/*
> + * DRAM Event Record
> + * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> + */
> +static const uuid_t dram_event_uuid =
> +	UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
> +		  0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
> +
>   static bool cxl_event_tracing_enabled(void)
>   {
>   	return trace_cxl_generic_event_enabled() ||
> -	       trace_cxl_general_media_enabled();
> +	       trace_cxl_general_media_enabled() ||
> +	       trace_cxl_dram_enabled();
>   }
>   
>   static void cxl_trace_event_record(const char *dev_name,
> @@ -735,6 +744,11 @@ static void cxl_trace_event_record(const char *dev_name,
>   
>   		trace_cxl_general_media(dev_name, type, rec);
>   		return;
> +	} else if (uuid_equal(id, &dram_event_uuid)) {
> +		struct cxl_event_dram *rec = (struct cxl_event_dram *)record;
> +
> +		trace_cxl_dram(dev_name, type, rec);
> +		return;
>   	}
>   
>   	/* For unknown record types print just the header */
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 10696debefa8..f5f63a475478 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -429,6 +429,29 @@ struct cxl_event_gen_media {
>   	u8 reserved[0x2e];
>   } __packed;
>   
> +/*
> + * DRAM Event Record - DER
> + * CXL rev 3.0 section 8.2.9.2.1.2; Table 3-44
> + */
> +#define CXL_EVENT_DER_CORRECTION_MASK_SIZE	0x20
> +struct cxl_event_dram {
> +	struct cxl_event_record_hdr hdr;
> +	__le64 phys_addr;
> +	u8 descriptor;
> +	u8 type;
> +	u8 transaction_type;
> +	u8 validity_flags[2];
> +	u8 channel;
> +	u8 rank;
> +	u8 nibble_mask[3];
> +	u8 bank_group;
> +	u8 bank;
> +	u8 row[3];
> +	u8 column[2];
> +	u8 correction_mask[CXL_EVENT_DER_CORRECTION_MASK_SIZE];
> +	u8 reserved[0x17];
> +} __packed;
> +
>   struct cxl_mbox_get_partition_info {
>   	__le64 active_volatile_cap;
>   	__le64 active_persistent_cap;
> diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> index a4d6bd64e9bc..474390f895d9 100644
> --- a/include/trace/events/cxl.h
> +++ b/include/trace/events/cxl.h
> @@ -242,6 +242,98 @@ TRACE_EVENT(cxl_general_media,
>   	)
>   );
>   
> +/*
> + * DRAM Event Record - DER
> + *
> + * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> + */
> +/*
> + * DRAM Event Record defines many fields the same as the General Media Event
> + * Record.  Reuse those definitions as appropriate.
> + */
> +#define CXL_DER_VALID_CHANNEL				BIT(0)
> +#define CXL_DER_VALID_RANK				BIT(1)
> +#define CXL_DER_VALID_NIBBLE				BIT(2)
> +#define CXL_DER_VALID_BANK_GROUP			BIT(3)
> +#define CXL_DER_VALID_BANK				BIT(4)
> +#define CXL_DER_VALID_ROW				BIT(5)
> +#define CXL_DER_VALID_COLUMN				BIT(6)
> +#define CXL_DER_VALID_CORRECTION_MASK			BIT(7)
> +#define show_dram_valid_flags(flags)	__print_flags(flags, "|",			   \
> +	{ CXL_DER_VALID_CHANNEL,			"CHANNEL"		}, \
> +	{ CXL_DER_VALID_RANK,				"RANK"			}, \
> +	{ CXL_DER_VALID_NIBBLE,				"NIBBLE"		}, \
> +	{ CXL_DER_VALID_BANK_GROUP,			"BANK GROUP"		}, \
> +	{ CXL_DER_VALID_BANK,				"BANK"			}, \
> +	{ CXL_DER_VALID_ROW,				"ROW"			}, \
> +	{ CXL_DER_VALID_COLUMN,				"COLUMN"		}, \
> +	{ CXL_DER_VALID_CORRECTION_MASK,		"CORRECTION MASK"	}  \
> +)
> +
> +TRACE_EVENT(cxl_dram,
> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_event_dram *rec),
> +
> +	TP_ARGS(dev_name, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +		/* DRAM */
> +		__field(u64, dpa)
> +		__field(u8, descriptor)
> +		__field(u8, type)
> +		__field(u8, transaction_type)
> +		__field(u8, channel)
> +		__field(u16, validity_flags)
> +		__field(u16, column)	/* Out of order to pack trace record */
> +		__field(u32, nibble_mask)
> +		__field(u32, row)
> +		__array(u8, cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE)
> +		__field(u8, rank)	/* Out of order to pack trace record */
> +		__field(u8, bank_group)	/* Out of order to pack trace record */
> +		__field(u8, bank)	/* Out of order to pack trace record */
> +		__field(u8, dpa_flags)	/* Out of order to pack trace record */
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +
> +		/* DRAM */
> +		__entry->dpa = le64_to_cpu(rec->phys_addr);
> +		__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
> +		__entry->dpa &= CXL_DPA_MASK;
> +		__entry->descriptor = rec->descriptor;
> +		__entry->type = rec->type;
> +		__entry->transaction_type = rec->transaction_type;
> +		__entry->validity_flags = get_unaligned_le16(rec->validity_flags);
> +		__entry->channel = rec->channel;
> +		__entry->rank = rec->rank;
> +		__entry->nibble_mask = get_unaligned_le24(rec->nibble_mask);
> +		__entry->bank_group = rec->bank_group;
> +		__entry->bank = rec->bank;
> +		__entry->row = get_unaligned_le24(rec->row);
> +		__entry->column = get_unaligned_le16(rec->column);
> +		memcpy(__entry->cor_mask, &rec->correction_mask,
> +			CXL_EVENT_DER_CORRECTION_MASK_SIZE);
> +	),
> +
> +	CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' descriptor='%s' type='%s' " \
> +		"transaction_type='%s' channel=%u rank=%u nibble_mask=%x " \
> +		"bank_group=%u bank=%u row=%u column=%u cor_mask=%s " \
> +		"validity_flags='%s'",
> +		__entry->dpa, show_dpa_flags(__entry->dpa_flags),
> +		show_event_desc_flags(__entry->descriptor),
> +		show_mem_event_type(__entry->type),
> +		show_trans_type(__entry->transaction_type),
> +		__entry->channel, __entry->rank, __entry->nibble_mask,
> +		__entry->bank_group, __entry->bank,
> +		__entry->row, __entry->column,
> +		__print_hex(__entry->cor_mask, CXL_EVENT_DER_CORRECTION_MASK_SIZE),
> +		show_dram_valid_flags(__entry->validity_flags)
> +	)
> +);
> +
>   #endif /* _CXL_TRACE_EVENTS_H */
>   
>   /* This part must be outside protection */

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 07/11] cxl/mem: Trace Memory Module Event Record
  2022-12-01  0:27 ` [PATCH V2 07/11] cxl/mem: Trace Memory Module " ira.weiny
  2022-12-01 13:31   ` Jonathan Cameron
@ 2022-12-01 18:57   ` Dave Jiang
  2022-12-02  6:25   ` Dan Williams
  2 siblings, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2022-12-01 18:57 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Steven Rostedt, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, linux-kernel, linux-cxl



On 11/30/2022 5:27 PM, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.
> 
> Determine if the event read is memory module record and if so trace the
> record.
> 
> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

> 
> ---
> Changes from V1:
> 	Use all caps for flag fields
> 
> Changes from RFC v2:
> 	Ensure field names match TP_print output
> 	Steven
> 		prefix TRACE_EVENT with 'cxl_'
> 	Jonathan
> 		Remove reserved field
> 		Define a 1bit and 2 bit status decoder
> 		Fix paren alignment
> 
> Changes from RFC:
> 	Clean up spec reference
> 	Add reserved data
> 	Use new CXL header macros
> 	Jonathan
> 		Use else if
> 		Use get_unaligned_le*() for unaligned fields
> 	Dave Jiang
> 		s/cxl_mem_mod_event/memory_module
> 		s/cxl_evt_mem_mod_rec/cxl_event_mem_module
> ---
>   drivers/cxl/core/mbox.c    |  17 ++++-
>   drivers/cxl/cxlmem.h       |  26 +++++++
>   include/trace/events/cxl.h | 144 +++++++++++++++++++++++++++++++++++++
>   3 files changed, 186 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 66fc50d89bf4..30840b711381 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -725,11 +725,20 @@ static const uuid_t dram_event_uuid =
>   	UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
>   		  0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
>   
> +/*
> + * Memory Module Event Record
> + * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
> + */
> +static const uuid_t mem_mod_event_uuid =
> +	UUID_INIT(0xfe927475, 0xdd59, 0x4339,
> +		  0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74);
> +
>   static bool cxl_event_tracing_enabled(void)
>   {
>   	return trace_cxl_generic_event_enabled() ||
>   	       trace_cxl_general_media_enabled() ||
> -	       trace_cxl_dram_enabled();
> +	       trace_cxl_dram_enabled() ||
> +	       trace_cxl_memory_module_enabled();
>   }
>   
>   static void cxl_trace_event_record(const char *dev_name,
> @@ -749,6 +758,12 @@ static void cxl_trace_event_record(const char *dev_name,
>   
>   		trace_cxl_dram(dev_name, type, rec);
>   		return;
> +	} else if (uuid_equal(id, &mem_mod_event_uuid)) {
> +		struct cxl_event_mem_module *rec =
> +				(struct cxl_event_mem_module *)record;
> +
> +		trace_cxl_memory_module(dev_name, type, rec);
> +		return;
>   	}
>   
>   	/* For unknown record types print just the header */
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index f5f63a475478..450b410f29f6 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -452,6 +452,32 @@ struct cxl_event_dram {
>   	u8 reserved[0x17];
>   } __packed;
>   
> +/*
> + * Get Health Info Record
> + * CXL rev 3.0 section 8.2.9.8.3.1; Table 8-100
> + */
> +struct cxl_get_health_info {
> +	u8 health_status;
> +	u8 media_status;
> +	u8 add_status;
> +	u8 life_used;
> +	u8 device_temp[2];
> +	u8 dirty_shutdown_cnt[4];
> +	u8 cor_vol_err_cnt[4];
> +	u8 cor_per_err_cnt[4];
> +} __packed;
> +
> +/*
> + * Memory Module Event Record
> + * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
> + */
> +struct cxl_event_mem_module {
> +	struct cxl_event_record_hdr hdr;
> +	u8 event_type;
> +	struct cxl_get_health_info info;
> +	u8 reserved[0x3d];
> +} __packed;
> +
>   struct cxl_mbox_get_partition_info {
>   	__le64 active_volatile_cap;
>   	__le64 active_persistent_cap;
> diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> index 474390f895d9..48786d6c9615 100644
> --- a/include/trace/events/cxl.h
> +++ b/include/trace/events/cxl.h
> @@ -334,6 +334,150 @@ TRACE_EVENT(cxl_dram,
>   	)
>   );
>   
> +/*
> + * Memory Module Event Record - MMER
> + *
> + * CXL res 3.0 section 8.2.9.2.1.3; Table 8-45
> + */
> +#define CXL_MMER_HEALTH_STATUS_CHANGE		0x00
> +#define CXL_MMER_MEDIA_STATUS_CHANGE		0x01
> +#define CXL_MMER_LIFE_USED_CHANGE		0x02
> +#define CXL_MMER_TEMP_CHANGE			0x03
> +#define CXL_MMER_DATA_PATH_ERROR		0x04
> +#define CXL_MMER_LAS_ERROR			0x05
> +#define show_dev_evt_type(type)	__print_symbolic(type,			   \
> +	{ CXL_MMER_HEALTH_STATUS_CHANGE,	"Health Status Change"	}, \
> +	{ CXL_MMER_MEDIA_STATUS_CHANGE,		"Media Status Change"	}, \
> +	{ CXL_MMER_LIFE_USED_CHANGE,		"Life Used Change"	}, \
> +	{ CXL_MMER_TEMP_CHANGE,			"Temperature Change"	}, \
> +	{ CXL_MMER_DATA_PATH_ERROR,		"Data Path Error"	}, \
> +	{ CXL_MMER_LAS_ERROR,			"LSA Error"		}  \
> +)
> +
> +/*
> + * Device Health Information - DHI
> + *
> + * CXL res 3.0 section 8.2.9.8.3.1; Table 8-100
> + */
> +#define CXL_DHI_HS_MAINTENANCE_NEEDED				BIT(0)
> +#define CXL_DHI_HS_PERFORMANCE_DEGRADED				BIT(1)
> +#define CXL_DHI_HS_HW_REPLACEMENT_NEEDED			BIT(2)
> +#define show_health_status_flags(flags)	__print_flags(flags, "|",	   \
> +	{ CXL_DHI_HS_MAINTENANCE_NEEDED,	"MAINTENANCE_NEEDED"	}, \
> +	{ CXL_DHI_HS_PERFORMANCE_DEGRADED,	"PERFORMANCE_DEGRADED"	}, \
> +	{ CXL_DHI_HS_HW_REPLACEMENT_NEEDED,	"REPLACEMENT_NEEDED"	}  \
> +)
> +
> +#define CXL_DHI_MS_NORMAL							0x00
> +#define CXL_DHI_MS_NOT_READY							0x01
> +#define CXL_DHI_MS_WRITE_PERSISTENCY_LOST					0x02
> +#define CXL_DHI_MS_ALL_DATA_LOST						0x03
> +#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS			0x04
> +#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN			0x05
> +#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT				0x06
> +#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS				0x07
> +#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN				0x08
> +#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT					0x09
> +#define show_media_status(ms)	__print_symbolic(ms,			   \
> +	{ CXL_DHI_MS_NORMAL,						   \
> +		"Normal"						}, \
> +	{ CXL_DHI_MS_NOT_READY,						   \
> +		"Not Ready"						}, \
> +	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOST,				   \
> +		"Write Persistency Lost"				}, \
> +	{ CXL_DHI_MS_ALL_DATA_LOST,					   \
> +		"All Data Lost"						}, \
> +	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS,		   \
> +		"Write Persistency Loss in the Event of Power Loss"	}, \
> +	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN,		   \
> +		"Write Persistency Loss in Event of Shutdown"		}, \
> +	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT,			   \
> +		"Write Persistency Loss Imminent"			}, \
> +	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS,		   \
> +		"All Data Loss in Event of Power Loss"			}, \
> +	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN,		   \
> +		"All Data loss in the Event of Shutdown"		}, \
> +	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT,			   \
> +		"All Data Loss Imminent"				}  \
> +)
> +
> +#define CXL_DHI_AS_NORMAL		0x0
> +#define CXL_DHI_AS_WARNING		0x1
> +#define CXL_DHI_AS_CRITICAL		0x2
> +#define show_two_bit_status(as) __print_symbolic(as,	   \
> +	{ CXL_DHI_AS_NORMAL,		"Normal"	}, \
> +	{ CXL_DHI_AS_WARNING,		"Warning"	}, \
> +	{ CXL_DHI_AS_CRITICAL,		"Critical"	}  \
> +)
> +#define show_one_bit_status(as) __print_symbolic(as,	   \
> +	{ CXL_DHI_AS_NORMAL,		"Normal"	}, \
> +	{ CXL_DHI_AS_WARNING,		"Warning"	}  \
> +)
> +
> +#define CXL_DHI_AS_LIFE_USED(as)			(as & 0x3)
> +#define CXL_DHI_AS_DEV_TEMP(as)				((as & 0xC) >> 2)
> +#define CXL_DHI_AS_COR_VOL_ERR_CNT(as)			((as & 0x10) >> 4)
> +#define CXL_DHI_AS_COR_PER_ERR_CNT(as)			((as & 0x20) >> 5)
> +
> +TRACE_EVENT(cxl_memory_module,
> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_event_mem_module *rec),
> +
> +	TP_ARGS(dev_name, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +
> +		/* Memory Module Event */
> +		__field(u8, event_type)
> +
> +		/* Device Health Info */
> +		__field(u8, health_status)
> +		__field(u8, media_status)
> +		__field(u8, life_used)
> +		__field(u32, dirty_shutdown_cnt)
> +		__field(u32, cor_vol_err_cnt)
> +		__field(u32, cor_per_err_cnt)
> +		__field(s16, device_temp)
> +		__field(u8, add_status)
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +
> +		/* Memory Module Event */
> +		__entry->event_type = rec->event_type;
> +
> +		/* Device Health Info */
> +		__entry->health_status = rec->info.health_status;
> +		__entry->media_status = rec->info.media_status;
> +		__entry->life_used = rec->info.life_used;
> +		__entry->dirty_shutdown_cnt = get_unaligned_le32(rec->info.dirty_shutdown_cnt);
> +		__entry->cor_vol_err_cnt = get_unaligned_le32(rec->info.cor_vol_err_cnt);
> +		__entry->cor_per_err_cnt = get_unaligned_le32(rec->info.cor_per_err_cnt);
> +		__entry->device_temp = get_unaligned_le16(rec->info.device_temp);
> +		__entry->add_status = rec->info.add_status;
> +	),
> +
> +	CXL_EVT_TP_printk("event_type='%s' health_status='%s' media_status='%s' " \
> +		"as_life_used=%s as_dev_temp=%s as_cor_vol_err_cnt=%s " \
> +		"as_cor_per_err_cnt=%s life_used=%u device_temp=%d " \
> +		"dirty_shutdown_cnt=%u cor_vol_err_cnt=%u cor_per_err_cnt=%u",
> +		show_dev_evt_type(__entry->event_type),
> +		show_health_status_flags(__entry->health_status),
> +		show_media_status(__entry->media_status),
> +		show_two_bit_status(CXL_DHI_AS_LIFE_USED(__entry->add_status)),
> +		show_two_bit_status(CXL_DHI_AS_DEV_TEMP(__entry->add_status)),
> +		show_one_bit_status(CXL_DHI_AS_COR_VOL_ERR_CNT(__entry->add_status)),
> +		show_one_bit_status(CXL_DHI_AS_COR_PER_ERR_CNT(__entry->add_status)),
> +		__entry->life_used, __entry->device_temp,
> +		__entry->dirty_shutdown_cnt, __entry->cor_vol_err_cnt,
> +		__entry->cor_per_err_cnt
> +	)
> +);
> +
> +
>   #endif /* _CXL_TRACE_EVENTS_H */
>   
>   /* This part must be outside protection */

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 10/11] cxl/test: Add specific events
  2022-12-01  0:27 ` [PATCH V2 10/11] cxl/test: Add specific events ira.weiny
@ 2022-12-01 21:00   ` Dave Jiang
  0 siblings, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2022-12-01 21:00 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Jonathan Cameron, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, linux-kernel, linux-cxl



On 11/30/2022 5:27 PM, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Each type of event has different trace point outputs.
> 
> Add mock General Media Event, DRAM event, and Memory Module Event
> records to the mock list of events returned.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

> 
> ---
> Changes from V1:
> 	Jonathan
> 		use put_unaligned_le16()
> 		fix spacing
> 
> Changes from RFC:
> 	Adjust for struct changes
> 	adjust for unaligned fields
> ---
>   tools/testing/cxl/test/events.c | 73 +++++++++++++++++++++++++++++++++
>   1 file changed, 73 insertions(+)
> 
> diff --git a/tools/testing/cxl/test/events.c b/tools/testing/cxl/test/events.c
> index a3d2ec7cc9fe..0bcc485e07da 100644
> --- a/tools/testing/cxl/test/events.c
> +++ b/tools/testing/cxl/test/events.c
> @@ -206,6 +206,66 @@ struct cxl_event_record_raw hardware_replace = {
>   	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
>   };
>   
> +struct cxl_event_gen_media gen_media = {
> +	.hdr = {
> +		.id = UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
> +				0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6),
> +		.length = sizeof(struct cxl_event_gen_media),
> +		.flags[0] = CXL_EVENT_RECORD_FLAG_PERMANENT,
> +		/* .handle = Set dynamically */
> +		.related_handle = cpu_to_le16(0),
> +	},
> +	.phys_addr = cpu_to_le64(0x2000),
> +	.descriptor = CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,
> +	.type = CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,
> +	.transaction_type = CXL_GMER_TRANS_HOST_WRITE,
> +	/* .validity_flags = <set below> */
> +	.channel = 1,
> +	.rank = 30
> +};
> +
> +struct cxl_event_dram dram = {
> +	.hdr = {
> +		.id = UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
> +				0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24),
> +		.length = sizeof(struct cxl_event_dram),
> +		.flags[0] = CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,
> +		/* .handle = Set dynamically */
> +		.related_handle = cpu_to_le16(0),
> +	},
> +	.phys_addr = cpu_to_le64(0x8000),
> +	.descriptor = CXL_GMER_EVT_DESC_THRESHOLD_EVENT,
> +	.type = CXL_GMER_MEM_EVT_TYPE_INV_ADDR,
> +	.transaction_type = CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,
> +	/* .validity_flags = <set below> */
> +	.channel = 1,
> +	.bank_group = 5,
> +	.bank = 2,
> +	.column = {0xDE, 0xAD},
> +};
> +
> +struct cxl_event_mem_module mem_module = {
> +	.hdr = {
> +		.id = UUID_INIT(0xfe927475, 0xdd59, 0x4339,
> +				0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74),
> +		.length = sizeof(struct cxl_event_mem_module),
> +		/* .handle = Set dynamically */
> +		.related_handle = cpu_to_le16(0),
> +	},
> +	.event_type = CXL_MMER_TEMP_CHANGE,
> +	.info = {
> +		.health_status = CXL_DHI_HS_PERFORMANCE_DEGRADED,
> +		.media_status = CXL_DHI_MS_ALL_DATA_LOST,
> +		.add_status = (CXL_DHI_AS_CRITICAL << 2) |
> +			      (CXL_DHI_AS_WARNING << 4) |
> +			      (CXL_DHI_AS_WARNING << 5),
> +		.device_temp = { 0xDE, 0xAD},
> +		.dirty_shutdown_cnt = { 0xde, 0xad, 0xbe, 0xef },
> +		.cor_vol_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
> +		.cor_per_err_cnt = { 0xde, 0xad, 0xbe, 0xef },
> +	}
> +};
> +
>   u32 cxl_mock_add_event_logs(struct cxl_dev_state *cxlds)
>   {
>   	struct device *dev = cxlds->dev;
> @@ -223,10 +283,23 @@ u32 cxl_mock_add_event_logs(struct cxl_dev_state *cxlds)
>   		return 0;
>   	}
>   
> +	put_unaligned_le16(CXL_GMER_VALID_CHANNEL | CXL_GMER_VALID_RANK,
> +			   &gen_media.validity_flags);
> +
> +	put_unaligned_le16(CXL_DER_VALID_CHANNEL | CXL_DER_VALID_BANK_GROUP |
> +			   CXL_DER_VALID_BANK | CXL_DER_VALID_COLUMN,
> +			   &dram.validity_flags);
> +
>   	event_store_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_INFO,
> +			      (struct cxl_event_record_raw *)&gen_media);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_INFO,
> +			      (struct cxl_event_record_raw *)&mem_module);
>   	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
>   
>   	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL,
> +			      (struct cxl_event_record_raw *)&dram);
>   	mes->ev_status |= CXLDEV_EVENT_STATUS_FATAL;
>   
>   	return mes->ev_status;

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 11/11] cxl/test: Simulate event log overflow
  2022-12-01  0:27 ` [PATCH V2 11/11] cxl/test: Simulate event log overflow ira.weiny
@ 2022-12-01 21:28   ` Dave Jiang
  0 siblings, 0 replies; 64+ messages in thread
From: Dave Jiang @ 2022-12-01 21:28 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Jonathan Cameron, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, linux-kernel, linux-cxl



On 11/30/2022 5:27 PM, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Log overflow is marked by a separate trace message.
> 
> Simulate a log with lots of messages and flag overflow until it is
> drained a bit.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>

Reviewed-by: Dave Jiang <dave.jiang@intel.com>

> 
> ---
> Changes from RFC
> 	Adjust for new struct changes
> ---
>   tools/testing/cxl/test/events.c | 49 ++++++++++++++++++++++++++++++++-
>   1 file changed, 48 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/cxl/test/events.c b/tools/testing/cxl/test/events.c
> index 0bcc485e07da..ceabefb526c2 100644
> --- a/tools/testing/cxl/test/events.c
> +++ b/tools/testing/cxl/test/events.c
> @@ -15,6 +15,8 @@ struct mock_event_log {
>   	u16 clear_idx;
>   	u16 cur_idx;
>   	u16 nr_events;
> +	u16 nr_overflow;
> +	u16 overflow_reset;
>   	struct cxl_event_record_raw *events[CXL_TEST_EVENT_CNT_MAX];
>   };
>   
> @@ -45,6 +47,7 @@ void reset_event_log(struct mock_event_log *log)
>   {
>   	log->cur_idx = 0;
>   	log->clear_idx = 0;
> +	log->nr_overflow = log->overflow_reset;
>   }
>   
>   /* Handle can never be 0 use 1 based indexing for handle */
> @@ -76,8 +79,12 @@ static void event_store_add_event(struct mock_event_store *mes,
>   		return;
>   
>   	log = &mes->mock_logs[log_type];
> -	if (WARN_ON(log->nr_events >= CXL_TEST_EVENT_CNT_MAX))
> +
> +	if ((log->nr_events + 1) > CXL_TEST_EVENT_CNT_MAX) {
> +		log->nr_overflow++;
> +		log->overflow_reset = log->nr_overflow;
>   		return;
> +	}
>   
>   	log->events[log->nr_events] = event;
>   	log->nr_events++;
> @@ -87,6 +94,7 @@ int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
>   {
>   	struct cxl_get_event_payload *pl;
>   	struct mock_event_log *log;
> +	u16 nr_overflow;
>   	u8 log_type;
>   	int i;
>   
> @@ -118,6 +126,21 @@ int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
>   	if (!log_empty(log))
>   		pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
>   
> +	if (log->nr_overflow) {
> +		u64 ns;
> +
> +		pl->flags |= CXL_GET_EVENT_FLAG_OVERFLOW;
> +		pl->overflow_err_count = cpu_to_le16(nr_overflow);
> +		ns = ktime_get_real_ns();
> +		ns -= 5000000000; /* 5s ago */
> +		pl->first_overflow_timestamp = cpu_to_le64(ns);
> +		ns = ktime_get_real_ns();
> +		ns -= 1000000000; /* 1s ago */
> +		pl->last_overflow_timestamp = cpu_to_le64(ns);
> +
> +		log->nr_overflow = 0;
> +	}
> +
>   	return 0;
>   }
>   EXPORT_SYMBOL_GPL(mock_get_event);
> @@ -297,6 +320,30 @@ u32 cxl_mock_add_event_logs(struct cxl_dev_state *cxlds)
>   			      (struct cxl_event_record_raw *)&mem_module);
>   	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
>   
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &maint_needed);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
> +			      (struct cxl_event_record_raw *)&dram);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
> +			      (struct cxl_event_record_raw *)&gen_media);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
> +			      (struct cxl_event_record_raw *)&mem_module);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL,
> +			      (struct cxl_event_record_raw *)&dram);
> +	/* Overflow this log */
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FAIL, &hardware_replace);
> +	mes->ev_status |= CXLDEV_EVENT_STATUS_FAIL;
> +
>   	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
>   	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL,
>   			      (struct cxl_event_record_raw *)&dram);

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 02/11] cxl/mem: Implement Get Event Records command
  2022-12-01 17:38   ` Steven Rostedt
@ 2022-12-02  0:09     ` Ira Weiny
  2022-12-02  4:40       ` Steven Rostedt
  0 siblings, 1 reply; 64+ messages in thread
From: Ira Weiny @ 2022-12-02  0:09 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Thu, Dec 01, 2022 at 12:38:49PM -0500, Steven Rostedt wrote:
> On Wed, 30 Nov 2022 16:27:10 -0800
> ira.weiny@intel.com wrote:
> 

[snip]

> > +
> > +/*
> > + * CXL rev 3.0 section 8.2.9.2.2; Table 8-49
> > + */
> > +enum cxl_event_log_type {
> > +	CXL_EVENT_TYPE_INFO = 0x00,
> > +	CXL_EVENT_TYPE_WARN,
> > +	CXL_EVENT_TYPE_FAIL,
> > +	CXL_EVENT_TYPE_FATAL,
> > +	CXL_EVENT_TYPE_MAX
> > +};
> > +
> > +static inline const char *cxl_event_log_type_str(enum cxl_event_log_type type)
> > +{
> > +	switch (type) {
> > +	case CXL_EVENT_TYPE_INFO:
> > +		return "Informational";
> > +	case CXL_EVENT_TYPE_WARN:
> > +		return "Warning";
> > +	case CXL_EVENT_TYPE_FAIL:
> > +		return "Failure";
> > +	case CXL_EVENT_TYPE_FATAL:
> > +		return "Fatal";
> > +	default:
> > +		break;
> > +	}
> > +	return "<unknown>";
> > +}
> 
> So you are using this in a TP_printk() section, which means perf and
> trace-cmd have no idea how to parse it. Can I recommend instead having:
> 
> #define cxl_event_log_type_str(type)				\
> 	__print_symbolic(type,					\
> 		{ CXL_EVENT_TYPE_INFO, "Informational" },	\
> 		{ CXL_EVENT_TYPE_WARN, "Warning" },		\
> 		{ CXL_EVENT_TYPE_FAIL, "Failure" },		\
> 		{ CXL_EVENT_TYPE_FATAL, "Fatal" })
> 
> #ifndef CREATE_TRACE_POINTS
> static inline const char *__cxl_event_log_type_str(enum cxl_event_log_type type,
> 			struct trace_print_flags *symbols)
> {
> 	for (; symbols->mask >= 0; symbols++) {
> 		if (type == symbols->mask)
> 			return symbols->name;
> 	}
> 	return "<unknown>";
> }
> #define __print_symbolic(value, symbol_array...)			\
> 	({								\
> 		static const struct trace_print_flags symbols[] =	\
> 			{ symbol_array, { -1, NULL }};			\
> 		__cxl_event_log_type_str(value, symbols);		\
> 	})
> #endif		
> 
> Note, I did not even try to compile the above. But it should be close to
> working.

Dropping that into cxlmem.h does not compile.  I've given it another go but
because I use cxl_event_log_type_str() in a file where trace points are used
CREATE_TRACE_POINTS is defined and I get the following error.

|| drivers/cxl/core/mbox.c: In function ‘cxl_mem_get_records_log’:
drivers/cxl/cxlmem.h|386 col 7| error: implicit declaration of function ‘__print_symbolic’; did you mean ‘sprint_symbol’?  [-Werror=implicit-function-declaration]                        
||   386 |       __print_symbolic(type,                            \
||       |       ^~~~~~~~~~~~~~~~

I got it to work with the patch below on top of this one.[3]  But it is kind of
ugly.  The only way I could get __print_symbolic() to be defined was to
redefine it in mbox.c.[1]  Then throw it in it's own header as in [3]

NOTE that patch [2] which I think _should_ work on top of patch [1] does not.
I can't understand why.

> 
> This way, the cxl_event_log_type_str() for trace events will be converted
> into the __print_symbolic() which can be parsed by perf and trace-cmd. For
> all other use cases, it is converted into the function above to return the
> string.

I would like to have this support.  I really tried to share this code a while
back.  What you have is seems nicer than what I remember coming up with but it
is still a bit hacky IMO.  And I'm afraid of how fragile this seems right now.

At this point I'm just going to define cxl_event_log_type_str() separate from
the __print_symbolic() in the trace code.  Comment that they should both be
updated if changed and move forward.

Thanks for the suggestion but I think it is going to be more complicated than
it is worth.  At least for mere mortals such as myself.

Ira

[1]

For mbox.c I have to have the special redefinition of __print_symbolic() in the
c file itself.  Code including cxlmem.h without CREATE_TRACE_POINTS defined
(like pci.c) works just fine with what you had.


commit 43a30047962312be2e532dff542d47a132949c08
Author: Ira Weiny <ira.weiny@intel.com>
Date:   Thu Dec 1 13:50:14 2022 -0800

    squash: Redefine __print_symbolic to have a central define of the log string print function.

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 63ee0fd5f4c2..55938d530a21 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -4,12 +4,30 @@
 #include <linux/security.h>
 #include <linux/debugfs.h>
 #include <linux/mutex.h>
-#include <cxlmem.h>
 #include <cxl.h>
 
 #define CREATE_TRACE_POINTS
+/* Must be after CREATE_TRACE_POINTS */
+#include <cxlmem.h>
 #include <trace/events/cxl.h>
 
+static inline const char *__cxl_event_log_type_str(enum cxl_event_log_type type,
+                      const struct trace_print_flags *symbols)
+{
+      for (; symbols->mask >= 0; symbols++) {
+              if (type == symbols->mask)
+                      return symbols->name;
+      }
+      return "<unknown>";
+}
+
+#define __print_symbolic(value, symbol_array...)			\
+      ({								\
+              static const struct trace_print_flags symbols[] =		\
+                      { symbol_array, { -1, NULL }};			\
+              __cxl_event_log_type_str(value, symbols);			\
+      })
+
 #include "core.h"
 
 static bool cxl_raw_allow_all;
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index a701a2e9bcba..34e7d8ae6cfd 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -381,23 +381,32 @@ enum cxl_event_log_type {
 	CXL_EVENT_TYPE_MAX
 };
 
-static inline const char *cxl_event_log_type_str(enum cxl_event_log_type type)
+#define cxl_event_log_type_str(type)			\
+      __print_symbolic(type,				\
+              { CXL_EVENT_TYPE_INFO, "Informational" },	\
+              { CXL_EVENT_TYPE_WARN, "Warning" },	\
+              { CXL_EVENT_TYPE_FAIL, "Failure" },	\
+              { CXL_EVENT_TYPE_FATAL, "Fatal" })
+
+#ifndef CREATE_TRACE_POINTS
+static inline const char *__cxl_event_log_type_str(enum cxl_event_log_type type,
+                      const struct trace_print_flags *symbols)
 {
-	switch (type) {
-	case CXL_EVENT_TYPE_INFO:
-		return "Informational";
-	case CXL_EVENT_TYPE_WARN:
-		return "Warning";
-	case CXL_EVENT_TYPE_FAIL:
-		return "Failure";
-	case CXL_EVENT_TYPE_FATAL:
-		return "Fatal";
-	default:
-		break;
-	}
-	return "<unknown>";
+      for (; symbols->mask >= 0; symbols++) {
+              if (type == symbols->mask)
+                      return symbols->name;
+      }
+      return "<unknown>";
 }
 
+#define __print_symbolic(value, symbol_array...)			\
+      ({								\
+              static const struct trace_print_flags symbols[] =		\
+                      { symbol_array, { -1, NULL }};			\
+              __cxl_event_log_type_str(value, symbols);			\
+      })
+#endif
+
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;
 	__le64 active_persistent_cap;



[2]

Why can't this work?  Why does the undef of CREATE_TRACE_POINTS not work?

I've also tried to include __cxl_event_log_type_str() and this redefintion of
__print_symbolic() in trace/events/cxl.h and that does not work.

I'm also worried this somehow breaks the support you want.  But I'm still not
sure how the trace headers and multiple passes work.

commit ad08110d2432fb24d4513fe9e75bb9be94870e6f
Author: Ira Weiny <ira.weiny@intel.com>
Date:   Thu Dec 1 15:01:05 2022 -0800

    squash: broken!

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 55938d530a21..04117afe9fbf 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -7,26 +7,10 @@
 #include <cxl.h>
 
 #define CREATE_TRACE_POINTS
-/* Must be after CREATE_TRACE_POINTS */
-#include <cxlmem.h>
 #include <trace/events/cxl.h>
 
-static inline const char *__cxl_event_log_type_str(enum cxl_event_log_type type,
-                      const struct trace_print_flags *symbols)
-{
-      for (; symbols->mask >= 0; symbols++) {
-              if (type == symbols->mask)
-                      return symbols->name;
-      }
-      return "<unknown>";
-}
-
-#define __print_symbolic(value, symbol_array...)			\
-      ({								\
-              static const struct trace_print_flags symbols[] =		\
-                      { symbol_array, { -1, NULL }};			\
-              __cxl_event_log_type_str(value, symbols);			\
-      })
+#undef CREATE_TRACE_POINTS
+#include <cxlmem.h>
 
 #include "core.h"
 



[3]

This seems to be the cleanest thing I have gotten to work.  Work == compiles
and trace points still print the strings.


commit c04f87639164737605c9ff503f8060b901c1b83a
Author: Ira Weiny <ira.weiny@intel.com>
Date:   Thu Dec 1 13:50:14 2022 -0800

    squash: Redefine __print_symbolic to have a central define of the log string print function.

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 63ee0fd5f4c2..31a65106e93c 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -4,12 +4,19 @@
 #include <linux/security.h>
 #include <linux/debugfs.h>
 #include <linux/mutex.h>
-#include <cxlmem.h>
 #include <cxl.h>
 
 #define CREATE_TRACE_POINTS
+/* Must be after CREATE_TRACE_POINTS */
+#include <cxlmem.h>
 #include <trace/events/cxl.h>
 
+/*
+ * Must be included explicitly after trace header
+ * because CREATE_TRACE_POINTS can't be undefined for cxlmem.h???
+ */
+#include <cxl_event_log.h>
+
 #include "core.h"
 
 static bool cxl_raw_allow_all;
diff --git a/drivers/cxl/cxl_event_log.h b/drivers/cxl/cxl_event_log.h
new file mode 100644
index 000000000000..e8357bfeecdf
--- /dev/null
+++ b/drivers/cxl/cxl_event_log.h
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2020 Intel Corporation. All rights reserved. */
+
+/*
+ * Redefine __print_symbolic() from the trace code to be used in regular C code
+ * This compliments the define of cxl_event_log_type_str() in cxlmem.h
+ */
+static inline const char *__cxl_event_log_type_str(enum cxl_event_log_type type,
+                      const struct trace_print_flags *symbols)
+{
+      for (; symbols->mask >= 0; symbols++) {
+              if (type == symbols->mask)
+                      return symbols->name;
+      }
+      return "<unknown>";
+}
+
+#define __print_symbolic(value, symbol_array...)			\
+      ({								\
+              static const struct trace_print_flags symbols[] =		\
+                      { symbol_array, { -1, NULL }};			\
+              __cxl_event_log_type_str(value, symbols);			\
+      })
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index a701a2e9bcba..15222a0ceb3f 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -381,22 +381,16 @@ enum cxl_event_log_type {
 	CXL_EVENT_TYPE_MAX
 };
 
-static inline const char *cxl_event_log_type_str(enum cxl_event_log_type type)
-{
-	switch (type) {
-	case CXL_EVENT_TYPE_INFO:
-		return "Informational";
-	case CXL_EVENT_TYPE_WARN:
-		return "Warning";
-	case CXL_EVENT_TYPE_FAIL:
-		return "Failure";
-	case CXL_EVENT_TYPE_FATAL:
-		return "Fatal";
-	default:
-		break;
-	}
-	return "<unknown>";
-}
+#define cxl_event_log_type_str(type)			\
+      __print_symbolic(type,				\
+              { CXL_EVENT_TYPE_INFO, "Informational" },	\
+              { CXL_EVENT_TYPE_WARN, "Warning" },	\
+              { CXL_EVENT_TYPE_FAIL, "Failure" },	\
+              { CXL_EVENT_TYPE_FATAL, "Fatal" })
+
+#ifndef CREATE_TRACE_POINTS
+#include "cxl_event_log.h"
+#endif
 
 struct cxl_mbox_get_partition_info {
 	__le64 active_volatile_cap;

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* RE: [PATCH V2 01/11] cxl/pci: Add generic MSI-X/MSI irq support
  2022-12-01  0:27 ` [PATCH V2 01/11] cxl/pci: Add generic MSI-X/MSI irq support ira.weiny
  2022-12-01 10:18   ` Jonathan Cameron
  2022-12-01 18:37   ` Dave Jiang
@ 2022-12-02  0:23   ` Dan Williams
  2022-12-02  0:34     ` Ira Weiny
  2 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2022-12-02  0:23 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Davidlohr Bueso, Bjorn Helgaas, Jonathan Cameron, Ira Weiny,
	Alison Schofield, Vishal Verma, Ben Widawsky, Steven Rostedt,
	Dave Jiang, linux-kernel, linux-cxl

ira.weiny@ wrote:
> From: Davidlohr Bueso <dave@stgolabs.net>
> 
> Currently the only CXL features targeted for irq support require their
> message numbers to be within the first 16 entries.  The device may
> however support less than 16 entries depending on the support it
> provides.
> 
> Attempt to allocate these 16 irq vectors.  If the device supports less
> then the PCI infrastructure will allocate that number.

What happens if the device supports 16, but irq-core allocates less? I
believe the answer is with the first user, but this patch does not
include a user.

> Store the number of vectors actually allocated in the device state for
> later use by individual functions.

The patch does not do that.

I know this patch has gone through a lot of discussion, but this
mismatch shows it should really be squashed with the first user because
it does not stand on its own anymore.

> Upon successful allocation, users can plug in their respective isr at
> any point thereafter, for example, if the irq setup is not done in the
> PCI driver, such as the case of the CXL-PMU.
> 
> Cc: Bjorn Helgaas <helgaas@kernel.org>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> 
> ---
> Changes from V1:
> 	Jonathan
> 		pci_alloc_irq_vectors() cleans up the vectors automatically
> 		use msi_enabled rather than nr_irq_vecs
> 
> Changes from Ira
> 	Remove reviews
> 	Allocate up to a static 16 vectors.
> 	Change cover letter
> ---
>  drivers/cxl/cxlmem.h |  3 +++
>  drivers/cxl/cxlpci.h |  6 ++++++
>  drivers/cxl/pci.c    | 23 +++++++++++++++++++++++
>  3 files changed, 32 insertions(+)
> 
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 88e3a8e54b6a..cd35f43fedd4 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -211,6 +211,7 @@ struct cxl_endpoint_dvsec_info {
>   * @info: Cached DVSEC information about the device.
>   * @serial: PCIe Device Serial Number
>   * @doe_mbs: PCI DOE mailbox array
> + * @msi_enabled: MSI-X/MSI has been enabled
>   * @mbox_send: @dev specific transport for transmitting mailbox commands
>   *
>   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> @@ -247,6 +248,8 @@ struct cxl_dev_state {
>  
>  	struct xarray doe_mbs;
>  
> +	bool msi_enabled;
> +

This goes unused in this patch and it also duplicates what the core
offers with pdev->{msi,msix}_enabled.

>  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
>  };
>  
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index eec597dbe763..b7f4e2f417d3 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -53,6 +53,12 @@
>  #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
>  #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
>  
> +/*
> + * NOTE: Currently all the functions which are enabled for CXL require their
> + * vectors to be in the first 16.  Use this as the max.
> + */
> +#define CXL_PCI_REQUIRED_VECTORS 16
> +
>  /* Register Block Identifier (RBI) */
>  enum cxl_regloc_type {
>  	CXL_REGLOC_RBI_EMPTY = 0,
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index faeb5d9d7a7a..8f86f85d89c7 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -428,6 +428,27 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
>  	}
>  }
>  
> +static void cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> +{
> +	struct device *dev = cxlds->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	int nvecs;
> +
> +	/*
> +	 * NOTE: pci_alloc_irq_vectors() handles calling pci_free_irq_vectors()
> +	 * automatically despite not being called pcim_*.  See
> +	 * pci_setup_msi_context().
> +	 */
> +	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_REQUIRED_VECTORS,
> +				   PCI_IRQ_MSIX | PCI_IRQ_MSI);

clang-format would scooch that second line in for you.

Might also be worth a comment for the next person that goes looking for
why this isn't PCI_IRQ_ALL_TYPES.

From CXL 3.0 3.1.1 CXL.io Endpoint:
A Function on a CXL device must not generate INTx messages if that
Function participates in CXL.cache protocol or CXL.mem protocols.


> +	if (nvecs < 0) {
> +		dev_dbg(dev, "Failed to alloc irq vectors; use polling instead.\n");
> +		return;
> +	}
> +
> +	cxlds->msi_enabled = true;
> +}
> +
>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct cxl_register_map map;
> @@ -494,6 +515,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		return rc;
>  
> +	cxl_pci_alloc_irq_vectors(cxlds);
> +
>  	cxlmd = devm_cxl_add_memdev(cxlds);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
> -- 
> 2.37.2
> 



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 01/11] cxl/pci: Add generic MSI-X/MSI irq support
  2022-12-02  0:23   ` Dan Williams
@ 2022-12-02  0:34     ` Ira Weiny
  2022-12-02  2:00       ` Dan Williams
  0 siblings, 1 reply; 64+ messages in thread
From: Ira Weiny @ 2022-12-02  0:34 UTC (permalink / raw)
  To: Dan Williams
  Cc: Davidlohr Bueso, Bjorn Helgaas, Jonathan Cameron,
	Alison Schofield, Vishal Verma, Ben Widawsky, Steven Rostedt,
	Dave Jiang, linux-kernel, linux-cxl

On Thu, Dec 01, 2022 at 04:23:21PM -0800, Dan Williams wrote:
> ira.weiny@ wrote:
> > From: Davidlohr Bueso <dave@stgolabs.net>
> > 
> > Currently the only CXL features targeted for irq support require their
> > message numbers to be within the first 16 entries.  The device may
> > however support less than 16 entries depending on the support it
> > provides.
> > 
> > Attempt to allocate these 16 irq vectors.  If the device supports less
> > then the PCI infrastructure will allocate that number.
> 
> What happens if the device supports 16, but irq-core allocates less? I
> believe the answer is with the first user, but this patch does not
> include a user.
> 
> > Store the number of vectors actually allocated in the device state for
> > later use by individual functions.
> 
> The patch does not do that.

Sorry missed updating this message.

> 
> I know this patch has gone through a lot of discussion, but this
> mismatch shows it should really be squashed with the first user because
> it does not stand on its own anymore.

It is separate because it was Davidlohr's to begin with.

I'll squash it back.

> 
> > Upon successful allocation, users can plug in their respective isr at
> > any point thereafter, for example, if the irq setup is not done in the
> > PCI driver, such as the case of the CXL-PMU.
> > 
> > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> > 
> > ---
> > Changes from V1:
> > 	Jonathan
> > 		pci_alloc_irq_vectors() cleans up the vectors automatically
> > 		use msi_enabled rather than nr_irq_vecs
> > 
> > Changes from Ira
> > 	Remove reviews
> > 	Allocate up to a static 16 vectors.
> > 	Change cover letter
> > ---
> >  drivers/cxl/cxlmem.h |  3 +++
> >  drivers/cxl/cxlpci.h |  6 ++++++
> >  drivers/cxl/pci.c    | 23 +++++++++++++++++++++++
> >  3 files changed, 32 insertions(+)
> > 
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index 88e3a8e54b6a..cd35f43fedd4 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -211,6 +211,7 @@ struct cxl_endpoint_dvsec_info {
> >   * @info: Cached DVSEC information about the device.
> >   * @serial: PCIe Device Serial Number
> >   * @doe_mbs: PCI DOE mailbox array
> > + * @msi_enabled: MSI-X/MSI has been enabled
> >   * @mbox_send: @dev specific transport for transmitting mailbox commands
> >   *
> >   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> > @@ -247,6 +248,8 @@ struct cxl_dev_state {
> >  
> >  	struct xarray doe_mbs;
> >  
> > +	bool msi_enabled;
> > +
> 
> This goes unused in this patch and it also duplicates what the core
> offers with pdev->{msi,msix}_enabled.

I tried to argue that with Jonathan and lost.  What I had in V1 was to store
the number actually allocated.  Then if a function reports something higher
later it can't be used.

I admit that at this point I really don't understand PCI interrupts at all.
Every time this patch is discussed I get (what is to me) confusing information.
And I've been unable to discern from the spec how exactly this is all supposed
to work.

> 
> >  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> >  };
> >  
> > diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> > index eec597dbe763..b7f4e2f417d3 100644
> > --- a/drivers/cxl/cxlpci.h
> > +++ b/drivers/cxl/cxlpci.h
> > @@ -53,6 +53,12 @@
> >  #define	    CXL_DVSEC_REG_LOCATOR_BLOCK_ID_MASK			GENMASK(15, 8)
> >  #define     CXL_DVSEC_REG_LOCATOR_BLOCK_OFF_LOW_MASK		GENMASK(31, 16)
> >  
> > +/*
> > + * NOTE: Currently all the functions which are enabled for CXL require their
> > + * vectors to be in the first 16.  Use this as the max.
> > + */
> > +#define CXL_PCI_REQUIRED_VECTORS 16
> > +
> >  /* Register Block Identifier (RBI) */
> >  enum cxl_regloc_type {
> >  	CXL_REGLOC_RBI_EMPTY = 0,
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index faeb5d9d7a7a..8f86f85d89c7 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -428,6 +428,27 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
> >  	}
> >  }
> >  
> > +static void cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> > +{
> > +	struct device *dev = cxlds->dev;
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	int nvecs;
> > +
> > +	/*
> > +	 * NOTE: pci_alloc_irq_vectors() handles calling pci_free_irq_vectors()
> > +	 * automatically despite not being called pcim_*.  See
> > +	 * pci_setup_msi_context().
> > +	 */
> > +	nvecs = pci_alloc_irq_vectors(pdev, 1, CXL_PCI_REQUIRED_VECTORS,
> > +				   PCI_IRQ_MSIX | PCI_IRQ_MSI);
> 
> clang-format would scooch that second line in for you.
> 
> Might also be worth a comment for the next person that goes looking for
> why this isn't PCI_IRQ_ALL_TYPES.
> 
> From CXL 3.0 3.1.1 CXL.io Endpoint:
> A Function on a CXL device must not generate INTx messages if that
> Function participates in CXL.cache protocol or CXL.mem protocols.

Seems reasonable.

Ira

> 
> 
> > +	if (nvecs < 0) {
> > +		dev_dbg(dev, "Failed to alloc irq vectors; use polling instead.\n");
> > +		return;
> > +	}
> > +
> > +	cxlds->msi_enabled = true;
> > +}
> > +
> >  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  {
> >  	struct cxl_register_map map;
> > @@ -494,6 +515,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (rc)
> >  		return rc;
> >  
> > +	cxl_pci_alloc_irq_vectors(cxlds);
> > +
> >  	cxlmd = devm_cxl_add_memdev(cxlds);
> >  	if (IS_ERR(cxlmd))
> >  		return PTR_ERR(cxlmd);
> > -- 
> > 2.37.2
> > 
> 
> 

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH V2 02/11] cxl/mem: Implement Get Event Records command
  2022-12-01  0:27 ` [PATCH V2 02/11] cxl/mem: Implement Get Event Records command ira.weiny
  2022-12-01 13:06   ` Jonathan Cameron
  2022-12-01 17:38   ` Steven Rostedt
@ 2022-12-02  1:39   ` Dan Williams
  2022-12-02 21:47     ` Ira Weiny
  2 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2022-12-02  1:39 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Ira Weiny, Steven Rostedt, Alison Schofield, Vishal Verma,
	Ben Widawsky, Jonathan Cameron, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

ira.weiny@ wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL devices have multiple event logs which can be queried for CXL event
> records.  Devices are required to support the storage of at least one
> event record in each event log type.
> 
> Devices track event log overflow by incrementing a counter and tracking
> the time of the first and last overflow event seen.
> 
> Software queries events via the Get Event Record mailbox command; CXL
> rev 3.0 section 8.2.9.2.2.
> 
> Issue the Get Event Record mailbox command on driver load.  Trace each
> record found with a generic record trace.  Trace any overflow
> conditions.
> 
> The device can return up to 1MB worth of event records per query.
> Allocate a shared large buffer to handle the max number of records based
> on the mailbox payload size.
> 
> This patch traces a raw event record only and leaves the specific event
> record types to subsequent patches.
> 
> Macros are created to use for tracing the common CXL Event header
> fields.
> 
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Change from V1:
> 	Ignore useless More Event Flag
> 	defer DCD support
> 	Jonathan
> 		delete extra blank line
> 		Use all caps for flags
> 	Jonathan/Dan/Ira
> 		Allocate event MB buffer on start up.
> 	Alison
> 		s/pl_nr/nr_pl
> 
> Change from RFC v2:
> 	Support reading 3 events at once.
> 	Reverse Jonathan's suggestion and check for positive number of
> 		records.  Because the record count may have been
> 		returned as something > 3 based on what the device
> 		thinks it can send back even though the core Linux mbox
> 		processing truncates the data.
> 	Alison and Dave Jiang
> 		Change header uuid type to uuid_t for better user space
> 		processing
> 	Smita
> 		Check status reg before reading log.
> 	Steven
> 		Prefix all trace points with 'cxl_'
> 		Use static branch <trace>_enabled() calls
> 	Jonathan
> 		s/CXL_EVENT_TYPE_INFO/0
> 		s/{first,last}/{first,last}_ts
> 		Remove Reserved field from header
> 		Fix header issue for cxl_event_log_type_str()
> 
> Change from RFC:
> 	Remove redundant error message in get event records loop
> 	s/EVENT_RECORD_DATA_LENGTH/CXL_EVENT_RECORD_DATA_LENGTH
> 	Use hdr_uuid for the header UUID field
> 	Use cxl_event_log_type_str() for the trace events
> 	Create macros for the header fields and common entries of each event
> 	Add reserved buffer output dump
> 	Report error if event query fails
> 	Remove unused record_cnt variable
> 	Steven - reorder overflow record
> 		Remove NOTE about checkpatch
> 	Jonathan
> 		check for exactly 1 record
> 		s/v3.0/rev 3.0
> 		Use 3 byte fields for 24bit fields
> 		Add 3.0 Maintenance Operation Class
> 		Add Dynamic Capacity log type
> 		Fix spelling
> 	Dave Jiang/Dan/Alison
> 		s/cxl-event/cxl
> 		trace/events/cxl-events => trace/events/cxl.h
> 		s/cxl_event_overflow/overflow
> 		s/cxl_event/generic_event
> ---
>  MAINTAINERS                  |   1 +
>  drivers/cxl/core/mbox.c      | 105 +++++++++++++++++++++++++++++
>  drivers/cxl/cxl.h            |   7 ++
>  drivers/cxl/cxlmem.h         |  72 ++++++++++++++++++++
>  include/trace/events/cxl.h   | 126 +++++++++++++++++++++++++++++++++++
>  include/uapi/linux/cxl_mem.h |   1 +
>  6 files changed, 312 insertions(+)
>  create mode 100644 include/trace/events/cxl.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index ca063a504026..4b7c6e3055c6 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -5223,6 +5223,7 @@ M:	Dan Williams <dan.j.williams@intel.com>
>  L:	linux-cxl@vger.kernel.org
>  S:	Maintained
>  F:	drivers/cxl/
> +F:	include/trace/events/cxl.h
>  F:	include/uapi/linux/cxl_mem.h
>  
>  CONEXANT ACCESSRUNNER USB DRIVER
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 16176b9278b4..70b681027a3d 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -7,6 +7,9 @@
>  #include <cxlmem.h>
>  #include <cxl.h>
>  
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/cxl.h>
> +
>  #include "core.h"
>  
>  static bool cxl_raw_allow_all;
> @@ -48,6 +51,7 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
>  	CXL_CMD(RAW, CXL_VARIABLE_PAYLOAD, CXL_VARIABLE_PAYLOAD, 0),
>  #endif
>  	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
> +	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
>  	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
>  	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
>  	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),

Similar to this patch:

https://lore.kernel.org/linux-cxl/166993221008.1995348.11651567302609703175.stgit@dwillia2-xfh.jf.intel.com/

CXL_MEM_COMMAND_ID_GET_EVENT_RECORD, should be added to the "always
kernel" / cxlds->exclusive_cmds mask.

> @@ -704,6 +708,106 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>  
> +static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> +				    enum cxl_event_log_type type)
> +{
> +	struct cxl_get_event_payload *payload;
> +	u16 nr_rec;
> +
> +	mutex_lock(&cxlds->event_buf_lock);
> +
> +	payload = cxlds->event_buf;
> +
> +	do {
> +		u8 log_type = type;
> +		int rc;
> +
> +		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVENT_RECORD,
> +				       &log_type, sizeof(log_type),
> +				       payload, cxlds->payload_size);
> +		if (rc) {
> +			dev_err(cxlds->dev, "Event log '%s': Failed to query event records : %d",
> +				cxl_event_log_type_str(type), rc);
> +			goto unlock_buffer;
> +		}
> +
> +		nr_rec = le16_to_cpu(payload->record_count);
> +		if (trace_cxl_generic_event_enabled()) {

This feels like a premature micro-optimization as none of this code is
fast path and it is dwarfed by the cost of executing the mailbox
command. I started with trying to reduce the 80 column collision
pressure, but then stepped back even further and asked, why?

> +			int i;
> +
> +			for (i = 0; i < nr_rec; i++)
> +				trace_cxl_generic_event(dev_name(cxlds->dev),
> +							type,
> +							&payload->records[i]);

As far as I can tell trace_cxl_generic_event() always expects a
device-name as its first argument. So why not enforce that with
type-safety?  I.e. I think trace_cxl_generic_event() should take a
"struct device *", not a string unless it is really the case that any
old string will do as the first argument to the trace event. Otherwise
the trace point can do "__string(dev_name, dev_name(dev))", and mandate
that callers pass devices.

> +		}
> +
> +		if (trace_cxl_overflow_enabled() &&
> +		    (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW))
> +			trace_cxl_overflow(dev_name(cxlds->dev), type, payload);
> +
> +	} while (nr_rec);
> +
> +unlock_buffer:
> +	mutex_unlock(&cxlds->event_buf_lock);
> +}
> +
> +static void cxl_mem_free_event_buffer(void *data)
> +{
> +	struct cxl_dev_state *cxlds = data;
> +
> +	kvfree(cxlds->event_buf);
> +}
> +
> +/*
> + * There is a single buffer for reading event logs from the mailbox.  All logs
> + * share this buffer protected by the cxlds->event_buf_lock.
> + */
> +static struct cxl_get_event_payload *alloc_event_buf(struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_get_event_payload *buf;
> +
> +	dev_dbg(cxlds->dev, "Allocating event buffer size %zu\n",
> +		cxlds->payload_size);
> +
> +	buf = kvmalloc(cxlds->payload_size, GFP_KERNEL);
> +	if (buf && devm_add_action_or_reset(cxlds->dev,
> +			cxl_mem_free_event_buffer, cxlds))
> +		return NULL;
> +	return buf;
> +}
> +
> +/**
> + * cxl_mem_get_event_records - Get Event Records from the device
> + * @cxlds: The device data for the operation
> + *
> + * Retrieve all event records available on the device and report them as trace
> + * events.
> + *
> + * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> + */
> +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> +{
> +	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> +
> +	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
> +
> +	if (!cxlds->event_buf) {
> +		cxlds->event_buf = alloc_event_buf(cxlds);
> +		if (WARN_ON_ONCE(!cxlds->event_buf))
> +			return;
> +	}

What's the point of having an event_buf_lock if event_buf is reallocated
every call?

Just allocate event_buf once at the beginning of time during init if the
device supports event log retrieval, and fail the driver load if that
allocation fails. No runtime WARN() for memory allocation.

I notice this patch does not clear events, I trust that comes later in
the series, but I think it belongs here to make this patch a complete
standalone thought.

> +	if (status & CXLDEV_EVENT_STATUS_INFO)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> +	if (status & CXLDEV_EVENT_STATUS_WARN)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> +	if (status & CXLDEV_EVENT_STATUS_FAIL)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> +	if (status & CXLDEV_EVENT_STATUS_FATAL)
> +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);

This retrieval order should be flipped. If there is a FATAL pending I
expect a monitor wants that first and would be happy to parse the INFO
later. I would go so far as to say that if the INFO logger is looping
and a FATAL comes in the driver should get that out first before going
back for more INFO logs. That would mean executing Clear Events and
looping through the logs by priority until all the status bits fall
silent inside cxl_mem_get_records_log().

> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> +
>  /**
>   * cxl_mem_get_partition_info - Get partition info
>   * @cxlds: The device data for the operation
> @@ -846,6 +950,7 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
>  	}
>  
>  	mutex_init(&cxlds->mbox_mutex);
> +	mutex_init(&cxlds->event_buf_lock);
>  	cxlds->dev = dev;
>  
>  	return cxlds;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index f680450f0b16..d4baae74cd97 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -132,6 +132,13 @@ static inline int ways_to_cxl(unsigned int ways, u8 *iw)
>  #define CXLDEV_CAP_CAP_ID_SECONDARY_MAILBOX 0x3
>  #define CXLDEV_CAP_CAP_ID_MEMDEV 0x4000
>  
> +/* CXL 3.0 8.2.8.3.1 Event Status Register */
> +#define CXLDEV_DEV_EVENT_STATUS_OFFSET		0x00
> +#define CXLDEV_EVENT_STATUS_INFO		BIT(0)
> +#define CXLDEV_EVENT_STATUS_WARN		BIT(1)
> +#define CXLDEV_EVENT_STATUS_FAIL		BIT(2)
> +#define CXLDEV_EVENT_STATUS_FATAL		BIT(3)
> +
>  /* CXL 2.0 8.2.8.4 Mailbox Registers */
>  #define CXLDEV_MBOX_CAPS_OFFSET 0x00
>  #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index cd35f43fedd4..55d57f5a64bc 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -4,6 +4,7 @@
>  #define __CXL_MEM_H__
>  #include <uapi/linux/cxl_mem.h>
>  #include <linux/cdev.h>
> +#include <linux/uuid.h>
>  #include "cxl.h"
>  
>  /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
> @@ -250,12 +251,16 @@ struct cxl_dev_state {
>  
>  	bool msi_enabled;
>  
> +	struct cxl_get_event_payload *event_buf;
> +	struct mutex event_buf_lock;
> +

Missing kdoc.

>  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
>  };
>  
>  enum cxl_opcode {
>  	CXL_MBOX_OP_INVALID		= 0x0000,
>  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> +	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
>  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
>  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
>  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> @@ -325,6 +330,72 @@ struct cxl_mbox_identify {
>  	u8 qos_telemetry_caps;
>  } __packed;
>  
> +/*
> + * Common Event Record Format
> + * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
> + */
> +struct cxl_event_record_hdr {
> +	uuid_t id;
> +	u8 length;
> +	u8 flags[3];
> +	__le16 handle;
> +	__le16 related_handle;
> +	__le64 timestamp;
> +	u8 maint_op_class;
> +	u8 reserved[0xf];

Nit, but lets not copy the spec here and just make all the field sizes
decimal. It even saves a character to write 15 instead of 0xf, and @flags
is also decimal.

> +} __packed;
> +
> +#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
> +struct cxl_event_record_raw {
> +	struct cxl_event_record_hdr hdr;
> +	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
> +} __packed;
> +
> +/*
> + * Get Event Records output payload
> + * CXL rev 3.0 section 8.2.9.2.2; Table 8-50
> + */
> +#define CXL_GET_EVENT_FLAG_OVERFLOW		BIT(0)
> +#define CXL_GET_EVENT_FLAG_MORE_RECORDS		BIT(1)
> +struct cxl_get_event_payload {
> +	u8 flags;
> +	u8 reserved1;
> +	__le16 overflow_err_count;
> +	__le64 first_overflow_timestamp;
> +	__le64 last_overflow_timestamp;
> +	__le16 record_count;
> +	u8 reserved2[0xa];

Same nit.

> +	struct cxl_event_record_raw records[];
> +} __packed;
> +
> +/*
> + * CXL rev 3.0 section 8.2.9.2.2; Table 8-49
> + */
> +enum cxl_event_log_type {
> +	CXL_EVENT_TYPE_INFO = 0x00,
> +	CXL_EVENT_TYPE_WARN,
> +	CXL_EVENT_TYPE_FAIL,
> +	CXL_EVENT_TYPE_FATAL,
> +	CXL_EVENT_TYPE_MAX
> +};
> +
> +static inline const char *cxl_event_log_type_str(enum cxl_event_log_type type)
> +{
> +	switch (type) {
> +	case CXL_EVENT_TYPE_INFO:
> +		return "Informational";
> +	case CXL_EVENT_TYPE_WARN:
> +		return "Warning";
> +	case CXL_EVENT_TYPE_FAIL:
> +		return "Failure";
> +	case CXL_EVENT_TYPE_FATAL:
> +		return "Fatal";
> +	default:
> +		break;
> +	}
> +	return "<unknown>";
> +}
> +
>  struct cxl_mbox_get_partition_info {
>  	__le64 active_volatile_cap;
>  	__le64 active_persistent_cap;
> @@ -384,6 +455,7 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
>  struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
>  void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
>  void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds);
>  #ifdef CONFIG_CXL_SUSPEND
>  void cxl_mem_active_inc(void);
>  void cxl_mem_active_dec(void);
> diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> new file mode 100644
> index 000000000000..c03a1a894af8
> --- /dev/null
> +++ b/include/trace/events/cxl.h
> @@ -0,0 +1,126 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM cxl
> +
> +#if !defined(_CXL_TRACE_EVENTS_H) ||  defined(TRACE_HEADER_MULTI_READ)
> +#define _CXL_TRACE_EVENTS_H
> +
> +#include <asm-generic/unaligned.h>
> +#include <linux/tracepoint.h>
> +#include <cxlmem.h>
> +
> +TRACE_EVENT(cxl_overflow,
> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_get_event_payload *payload),
> +
> +	TP_ARGS(dev_name, log, payload),
> +
> +	TP_STRUCT__entry(
> +		__string(dev_name, dev_name)
> +		__field(int, log)
> +		__field(u64, first_ts)
> +		__field(u64, last_ts)
> +		__field(u16, count)
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(dev_name, dev_name);
> +		__entry->log = log;
> +		__entry->count = le16_to_cpu(payload->overflow_err_count);
> +		__entry->first_ts = le64_to_cpu(payload->first_overflow_timestamp);
> +		__entry->last_ts = le64_to_cpu(payload->last_overflow_timestamp);
> +	),
> +
> +	TP_printk("%s: EVENT LOG OVERFLOW log=%s : %u records from %llu to %llu",
> +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),
> +		__entry->count, __entry->first_ts, __entry->last_ts)
> +
> +);
> +
> +/*
> + * Common Event Record Format
> + * CXL 3.0 section 8.2.9.2.1; Table 8-42
> + */
> +#define CXL_EVENT_RECORD_FLAG_PERMANENT		BIT(2)
> +#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED	BIT(3)
> +#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED	BIT(4)
> +#define CXL_EVENT_RECORD_FLAG_HW_REPLACE	BIT(5)
> +#define show_hdr_flags(flags)	__print_flags(flags, " | ",			   \
> +	{ CXL_EVENT_RECORD_FLAG_PERMANENT,	"PERMANENT_CONDITION"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,	"MAINTENANCE_NEEDED"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,	"PERFORMANCE_DEGRADED"		}, \
> +	{ CXL_EVENT_RECORD_FLAG_HW_REPLACE,	"HARDWARE_REPLACEMENT_NEEDED"	}  \
> +)
> +
> +/*
> + * Define macros for the common header of each CXL event.
> + *
> + * Tracepoints using these macros must do 3 things:
> + *
> + *	1) Add CXL_EVT_TP_entry to TP_STRUCT__entry
> + *	2) Use CXL_EVT_TP_fast_assign within TP_fast_assign;
> + *	   pass the dev_name, log, and CXL event header
> + *	3) Use CXL_EVT_TP_printk() instead of TP_printk()
> + *
> + * See the generic_event tracepoint as an example.
> + */
> +#define CXL_EVT_TP_entry					\
> +	__string(dev_name, dev_name)				\
> +	__field(int, log)					\
> +	__field_struct(uuid_t, hdr_uuid)			\
> +	__field(u32, hdr_flags)					\
> +	__field(u16, hdr_handle)				\
> +	__field(u16, hdr_related_handle)			\
> +	__field(u64, hdr_timestamp)				\
> +	__field(u8, hdr_length)					\
> +	__field(u8, hdr_maint_op_class)
> +
> +#define CXL_EVT_TP_fast_assign(dname, l, hdr)					\
> +	__assign_str(dev_name, (dname));					\
> +	__entry->log = (l);							\
> +	memcpy(&__entry->hdr_uuid, &(hdr).id, sizeof(uuid_t));			\
> +	__entry->hdr_length = (hdr).length;					\
> +	__entry->hdr_flags = get_unaligned_le24((hdr).flags);			\
> +	__entry->hdr_handle = le16_to_cpu((hdr).handle);			\
> +	__entry->hdr_related_handle = le16_to_cpu((hdr).related_handle);	\
> +	__entry->hdr_timestamp = le64_to_cpu((hdr).timestamp);			\
> +	__entry->hdr_maint_op_class = (hdr).maint_op_class
> +
> +#define CXL_EVT_TP_printk(fmt, ...) \
> +	TP_printk("%s log=%s : time=%llu uuid=%pUb len=%d flags='%s' "		\
> +		"handle=%x related_handle=%x maint_op_class=%u"			\
> +		" : " fmt,							\
> +		__get_str(dev_name), cxl_event_log_type_str(__entry->log),	\
> +		__entry->hdr_timestamp, &__entry->hdr_uuid, __entry->hdr_length,\
> +		show_hdr_flags(__entry->hdr_flags), __entry->hdr_handle,	\
> +		__entry->hdr_related_handle, __entry->hdr_maint_op_class,	\
> +		##__VA_ARGS__)
> +
> +TRACE_EVENT(cxl_generic_event,
> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_event_record_raw *rec),
> +
> +	TP_ARGS(dev_name, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +		__array(u8, data, CXL_EVENT_RECORD_DATA_LENGTH)
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +		memcpy(__entry->data, &rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
> +	),
> +
> +	CXL_EVT_TP_printk("%s",
> +		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
> +);
> +
> +#endif /* _CXL_TRACE_EVENTS_H */
> +
> +/* This part must be outside protection */
> +#undef TRACE_INCLUDE_FILE
> +#define TRACE_INCLUDE_FILE cxl
> +#include <trace/define_trace.h>
> diff --git a/include/uapi/linux/cxl_mem.h b/include/uapi/linux/cxl_mem.h
> index c71021a2a9ed..70459be5bdd4 100644
> --- a/include/uapi/linux/cxl_mem.h
> +++ b/include/uapi/linux/cxl_mem.h
> @@ -24,6 +24,7 @@
>  	___C(IDENTIFY, "Identify Command"),                               \
>  	___C(RAW, "Raw device command"),                                  \
>  	___C(GET_SUPPORTED_LOGS, "Get Supported Logs"),                   \
> +	___C(GET_EVENT_RECORD, "Get Event Record"),                       \
>  	___C(GET_FW_INFO, "Get FW Info"),                                 \
>  	___C(GET_PARTITION_INFO, "Get Partition Information"),            \
>  	___C(GET_LSA, "Get Label Storage Area"),                          \

Yikes, no, this is an enum. New commands need to come at the end
otherwise different kernels will disagree about the command numbering.
Likely needs a comment here alerting future devs about the ABI breakage
danger here.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 01/11] cxl/pci: Add generic MSI-X/MSI irq support
  2022-12-02  0:34     ` Ira Weiny
@ 2022-12-02  2:00       ` Dan Williams
  2022-12-02 13:04         ` Jonathan Cameron
  0 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2022-12-02  2:00 UTC (permalink / raw)
  To: Ira Weiny, Dan Williams
  Cc: Davidlohr Bueso, Bjorn Helgaas, Jonathan Cameron,
	Alison Schofield, Vishal Verma, Ben Widawsky, Steven Rostedt,
	Dave Jiang, linux-kernel, linux-cxl

Ira Weiny wrote:
> On Thu, Dec 01, 2022 at 04:23:21PM -0800, Dan Williams wrote:
> > ira.weiny@ wrote:
> > > From: Davidlohr Bueso <dave@stgolabs.net>
> > > 
> > > Currently the only CXL features targeted for irq support require their
> > > message numbers to be within the first 16 entries.  The device may
> > > however support less than 16 entries depending on the support it
> > > provides.
> > > 
> > > Attempt to allocate these 16 irq vectors.  If the device supports less
> > > then the PCI infrastructure will allocate that number.
> > 
> > What happens if the device supports 16, but irq-core allocates less? I
> > believe the answer is with the first user, but this patch does not
> > include a user.
> > 
> > > Store the number of vectors actually allocated in the device state for
> > > later use by individual functions.
> > 
> > The patch does not do that.
> 
> Sorry missed updating this message.
> 
> > 
> > I know this patch has gone through a lot of discussion, but this
> > mismatch shows it should really be squashed with the first user because
> > it does not stand on its own anymore.
> 
> It is separate because it was Davidlohr's to begin with.
> 
> I'll squash it back.
> 
> > 
> > > Upon successful allocation, users can plug in their respective isr at
> > > any point thereafter, for example, if the irq setup is not done in the
> > > PCI driver, such as the case of the CXL-PMU.
> > > 
> > > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> > > 
> > > ---
> > > Changes from V1:
> > > 	Jonathan
> > > 		pci_alloc_irq_vectors() cleans up the vectors automatically
> > > 		use msi_enabled rather than nr_irq_vecs
> > > 
> > > Changes from Ira
> > > 	Remove reviews
> > > 	Allocate up to a static 16 vectors.
> > > 	Change cover letter
> > > ---
> > >  drivers/cxl/cxlmem.h |  3 +++
> > >  drivers/cxl/cxlpci.h |  6 ++++++
> > >  drivers/cxl/pci.c    | 23 +++++++++++++++++++++++
> > >  3 files changed, 32 insertions(+)
> > > 
> > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > index 88e3a8e54b6a..cd35f43fedd4 100644
> > > --- a/drivers/cxl/cxlmem.h
> > > +++ b/drivers/cxl/cxlmem.h
> > > @@ -211,6 +211,7 @@ struct cxl_endpoint_dvsec_info {
> > >   * @info: Cached DVSEC information about the device.
> > >   * @serial: PCIe Device Serial Number
> > >   * @doe_mbs: PCI DOE mailbox array
> > > + * @msi_enabled: MSI-X/MSI has been enabled
> > >   * @mbox_send: @dev specific transport for transmitting mailbox commands
> > >   *
> > >   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> > > @@ -247,6 +248,8 @@ struct cxl_dev_state {
> > >  
> > >  	struct xarray doe_mbs;
> > >  
> > > +	bool msi_enabled;
> > > +
> > 
> > This goes unused in this patch and it also duplicates what the core
> > offers with pdev->{msi,msix}_enabled.
> 
> I tried to argue that with Jonathan and lost.  What I had in V1 was to store
> the number actually allocated.  Then if a function reports something higher
> later it can't be used.

A successful pci_alloc_irq_vectors() call assigns a vector number to all
interrupt sources on the device regardless of how many interrupt sources
there are. If the device has 32 interrupt sources and 16 irqs are returned
from pci_alloc_irq_vectors() then each interrupt source will be sharing
a vector with one or more other vectors. All PCI IRQ vectors are shared.

So I do not see the point of this msi_enabled flag cxl_dev_state. If
pci_alloc_irq_vectors() returns at least 1 then you are good to go.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH V2 03/11] cxl/mem: Implement Clear Event Records command
  2022-12-01  0:27 ` [PATCH V2 03/11] cxl/mem: Implement Clear " ira.weiny
  2022-12-01 13:26   ` Jonathan Cameron
@ 2022-12-02  2:29   ` Dan Williams
  2022-12-02 13:18     ` Jonathan Cameron
                       ` (2 more replies)
  1 sibling, 3 replies; 64+ messages in thread
From: Dan Williams @ 2022-12-02  2:29 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

ira.weiny@ wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL rev 3.0 section 8.2.9.2.3 defines the Clear Event Records mailbox
> command.  After an event record is read it needs to be cleared from the
> event log.
> 
> Implement cxl_clear_event_record() to clear all record retrieved from
> the device.
> 
> Each record is cleared explicitly.  A clear all bit is specified but
> events could arrive between a get and any final clear all operation.
> This means events would be missed.
> Therefore each event is cleared specifically.

Note that the spec has a better reason for why Clear All has limited
usage:

"Clear All Events is only allowed when the Event Log has overflowed;
otherwise, the device shall return Invalid Input."

Will need to wait and see if we need that to keep pace with a device
with a high event frequency.

> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes from V1:
> 	Clear Event Record allows for u8 handles while Get Event Record
> 	allows for u16 records to be returned.  Based on Jonathan's
> 	feedback; allow for all event records to be handled in this
> 	clear.  Which means a double loop with potentially multiple
> 	Clear Event payloads being sent to clear all events sent.
> 
> Changes from RFC:
> 	Jonathan
> 		Clean up init of payload and use return code.
> 		Also report any error to clear the event.
> 		s/v3.0/rev 3.0
> ---
>  drivers/cxl/core/mbox.c      | 61 +++++++++++++++++++++++++++++++-----
>  drivers/cxl/cxlmem.h         | 14 +++++++++
>  include/uapi/linux/cxl_mem.h |  1 +
>  3 files changed, 69 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 70b681027a3d..076a3df0ba38 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -52,6 +52,7 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
>  #endif
>  	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
>  	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
> +	CXL_CMD(CLEAR_EVENT_RECORD, CXL_VARIABLE_PAYLOAD, 0, 0),
>  	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
>  	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
>  	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
> @@ -708,6 +709,42 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>  
> +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> +				  enum cxl_event_log_type log,
> +				  struct cxl_get_event_payload *get_pl,
> +				  u16 total)
> +{
> +	struct cxl_mbox_clear_event_payload payload = {
> +		.event_log = log,
> +	};
> +	int cnt;
> +
> +	/*
> +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> +	 * Record can return up to 0xffff records.
> +	 */
> +	for (cnt = 0; cnt < total; /* cnt incremented internally */) {
> +		u8 nr_recs = min_t(u8, (total - cnt),
> +				   CXL_CLEAR_EVENT_MAX_HANDLES);

This seems overly complicated. @total is a duplicate of
@get_pl->record_count, and the 2 loops feel like it could be cut
down to one.

> +		int i, rc;
> +
> +		for (i = 0; i < nr_recs; i++, cnt++) {
> +			payload.handle[i] = get_pl->records[cnt].hdr.handle;
> +			dev_dbg(cxlds->dev, "Event log '%s': Clearning %u\n",

While I do think this operation is a mix of clearing and cleaning event
records, I don't think "Clearning" is a word.

> +				cxl_event_log_type_str(log),
> +				le16_to_cpu(payload.handle[i]));
> +		}
> +		payload.nr_recs = nr_recs;
> +
> +		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> +				       &payload, sizeof(payload), NULL, 0);
> +		if (rc)
> +			return rc;
> +	}
> +
> +	return 0;
> +}
> +
>  static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
>  				    enum cxl_event_log_type type)
>  {
> @@ -732,13 +769,22 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
>  		}
>  
>  		nr_rec = le16_to_cpu(payload->record_count);
> -		if (trace_cxl_generic_event_enabled()) {
> +		if (nr_rec > 0) {
>  			int i;
>  
> -			for (i = 0; i < nr_rec; i++)
> -				trace_cxl_generic_event(dev_name(cxlds->dev),
> -							type,
> -							&payload->records[i]);
> +			if (trace_cxl_generic_event_enabled()) {

Again, trace_cxl_generic_event_enabled() injects some awkward
formatting here to micro-optimize looping. Any performance benefit this
code might offer is likely offset by the extra human effort to read it.

> +				for (i = 0; i < nr_rec; i++)
> +					trace_cxl_generic_event(dev_name(cxlds->dev),
> +								type,
> +								&payload->records[i]);
> +			}
> +
> +			rc = cxl_clear_event_record(cxlds, type, payload, nr_rec);
> +			if (rc) {
> +				dev_err(cxlds->dev, "Event log '%s': Failed to clear events : %d",
> +					cxl_event_log_type_str(type), rc);
> +				return;
> +			}
>  		}
>  
>  		if (trace_cxl_overflow_enabled() &&
> @@ -780,10 +826,11 @@ static struct cxl_get_event_payload *alloc_event_buf(struct cxl_dev_state *cxlds
>   * cxl_mem_get_event_records - Get Event Records from the device
>   * @cxlds: The device data for the operation
>   *
> - * Retrieve all event records available on the device and report them as trace
> - * events.
> + * Retrieve all event records available on the device, report them as trace
> + * events, and clear them.
>   *
>   * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> + * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
>   */
>  void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
>  {
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 55d57f5a64bc..1ae9962c5a06 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -261,6 +261,7 @@ enum cxl_opcode {
>  	CXL_MBOX_OP_INVALID		= 0x0000,
>  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
>  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> +	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
>  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
>  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
>  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> @@ -396,6 +397,19 @@ static inline const char *cxl_event_log_type_str(enum cxl_event_log_type type)
>  	return "<unknown>";
>  }
>  
> +/*
> + * Clear Event Records input payload
> + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> + */
> +#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
> +struct cxl_mbox_clear_event_payload {
> +	u8 event_log;		/* enum cxl_event_log_type */
> +	u8 clear_flags;
> +	u8 nr_recs;
> +	u8 reserved[3];
> +	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];
> +};
> +
>  struct cxl_mbox_get_partition_info {
>  	__le64 active_volatile_cap;
>  	__le64 active_persistent_cap;
> diff --git a/include/uapi/linux/cxl_mem.h b/include/uapi/linux/cxl_mem.h
> index 70459be5bdd4..7c1ad8062792 100644
> --- a/include/uapi/linux/cxl_mem.h
> +++ b/include/uapi/linux/cxl_mem.h
> @@ -25,6 +25,7 @@
>  	___C(RAW, "Raw device command"),                                  \
>  	___C(GET_SUPPORTED_LOGS, "Get Supported Logs"),                   \
>  	___C(GET_EVENT_RECORD, "Get Event Record"),                       \
> +	___C(CLEAR_EVENT_RECORD, "Clear Event Record"),                   \
>  	___C(GET_FW_INFO, "Get FW Info"),                                 \
>  	___C(GET_PARTITION_INFO, "Get Partition Information"),            \
>  	___C(GET_LSA, "Get Label Storage Area"),                          \

Same, "yikes" / "must be at the end of the enum" feedback.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH V2 04/11] cxl/mem: Clear events on driver load
  2022-12-01  0:27 ` [PATCH V2 04/11] cxl/mem: Clear events on driver load ira.weiny
  2022-12-01 13:30   ` Jonathan Cameron
@ 2022-12-02  2:48   ` Dan Williams
  2022-12-02 16:34     ` Ira Weiny
  1 sibling, 1 reply; 64+ messages in thread
From: Dan Williams @ 2022-12-02  2:48 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Ira Weiny, Jonathan Cameron, Dave Jiang, Alison Schofield,
	Vishal Verma, Ben Widawsky, Steven Rostedt, Davidlohr Bueso,
	linux-kernel, linux-cxl

cxl/mem is cxl_mem.ko, This is cxl/pci.

ira.weiny@ wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> The information contained in the events prior to the driver loading can
> be queried at any time through other mailbox commands.
> 
> Ensure a clean slate of events by reading and clearing the events.  The
> events are sent to the trace buffer but it is not anticipated to have
> anyone listening to it at driver load time.

This is easy to guarantee with modprobe policy, so I am not sure it is
worth stating.

This breakdown feels odd. I would split the trace event definitions into
its own lead in patch since that is a pile of definitions that can be
merged on their own. Then squash get, clear, and this patch into one
patch as they don't have much reason to go in separately.

> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
>  drivers/cxl/pci.c            | 2 ++
>  tools/testing/cxl/test/mem.c | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 8f86f85d89c7..11e95a95195a 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -521,6 +521,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> +	cxl_mem_get_event_records(cxlds);
> +
>  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
>  		rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
>  
> diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> index aa2df3a15051..e2f5445d24ff 100644
> --- a/tools/testing/cxl/test/mem.c
> +++ b/tools/testing/cxl/test/mem.c
> @@ -285,6 +285,8 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> +	cxl_mem_get_event_records(cxlds);
> +

This hunk likely goes with the first patch that actually implements some
mocked events.

>  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
>  		rc = devm_cxl_add_nvdimm(dev, cxlmd);
>  
> -- 
> 2.37.2
> 



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 02/11] cxl/mem: Implement Get Event Records command
  2022-12-02  0:09     ` Ira Weiny
@ 2022-12-02  4:40       ` Steven Rostedt
  2022-12-02  5:00         ` Steven Rostedt
  0 siblings, 1 reply; 64+ messages in thread
From: Steven Rostedt @ 2022-12-02  4:40 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Thu, 1 Dec 2022 16:09:17 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> Dropping that into cxlmem.h does not compile.  I've given it another go but
> because I use cxl_event_log_type_str() in a file where trace points are used
> CREATE_TRACE_POINTS is defined and I get the following error.
> 
> || drivers/cxl/core/mbox.c: In function ‘cxl_mem_get_records_log’:
> drivers/cxl/cxlmem.h|386 col 7| error: implicit declaration of function ‘__print_symbolic’; did you mean ‘sprint_symbol’?  [-Werror=implicit-function-declaration]                        
> ||   386 |       __print_symbolic(type,                            \
> ||       |       ^~~~~~~~~~~~~~~~
> 
> I got it to work with the patch below on top of this one.[3]  But it is kind of
> ugly.  The only way I could get __print_symbolic() to be defined was to
> redefine it in mbox.c.[1]  Then throw it in it's own header as in [3]

I played around a bit, and with the below patch, you can just have:


#define cxl_event_log_type_str(type)				\
	__print_symbolic(type,					\
		{ CXL_EVENT_TYPE_INFO, "Informational" },	\
		{ CXL_EVENT_TYPE_WARN, "Warning" },		\
		{ CXL_EVENT_TYPE_FAIL, "Failure" },		\
		{ CXL_EVENT_TYPE_FATAL, "Fatal" })

And everything else should "just work" :-)

I can work on a more formal patch if this works for you. And thinking about
this, perhaps we could add this throughout the kernel!

-- Steve

diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h
index 00723935dcc7..ee41057674a2 100644
--- a/include/trace/define_trace.h
+++ b/include/trace/define_trace.h
@@ -132,4 +132,25 @@
 /* We may be processing more files */
 #define CREATE_TRACE_POINTS
 
+#ifndef __DEFINE_PRINT_SYMBOLIC_STR
+#define __DEFINE_PRINT_SYMBOLIC_STR
+static inline const char *
+__print_symbolic_str(int type, struct trace_print_flags *symbols)
+{
+	for (; symbols->name != NULL; symbols++) {
+		if (type == symbols->mask)
+			return symbols->name;
+	}
+	return "<invalid>";
+}
+#endif
+
+#undef __print_symbolic
+#define __print_symbolic(value, symbol_array...)			\
+	({								\
+		static const struct trace_print_flags symbols[] =	\
+			{ symbol_array, { -1, NULL }};			\
+		__print_symbolic_str(value, symbols);			\
+	})
+
 #endif /* CREATE_TRACE_POINTS */
diff --git a/include/trace/stages/stage7_class_define.h b/include/trace/stages/stage7_class_define.h
index 8a7ec24c246d..6fe83397f65d 100644
--- a/include/trace/stages/stage7_class_define.h
+++ b/include/trace/stages/stage7_class_define.h
@@ -6,7 +6,6 @@
 #define __entry REC
 
 #undef __print_flags
-#undef __print_symbolic
 #undef __print_hex
 #undef __print_hex_str
 #undef __get_dynamic_array

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 02/11] cxl/mem: Implement Get Event Records command
  2022-12-02  4:40       ` Steven Rostedt
@ 2022-12-02  5:00         ` Steven Rostedt
  2022-12-02 21:31           ` Ira Weiny
  0 siblings, 1 reply; 64+ messages in thread
From: Steven Rostedt @ 2022-12-02  5:00 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Thu, 1 Dec 2022 23:40:52 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> +#undef __print_symbolic
> +#define __print_symbolic(value, symbol_array...)			\
> +	({								\
> +		static const struct trace_print_flags symbols[] =	\
> +			{ symbol_array, { -1, NULL }};			\
> +		__print_symbolic_str(value, symbols);			\
> +	})
> +
>  #endif /* CREATE_TRACE_POINTS */

Bah, I want this outside that #ifdef

> diff --git a/include/trace/stages/stage7_class_define.h b/include/trace/stages/stage7_class_define.h
> index 8a7ec24c246d..6fe83397f65d 100644
> --- a/include/trace/stages/stage7_class_define.h
> +++ b/include/trace/stages/stage7_class_define.h
> @@ -6,7 +6,6 @@

I also don't think I need to touch stage7.

New patch:

diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h
index 00723935dcc7..9d665f634614 100644
--- a/include/trace/define_trace.h
+++ b/include/trace/define_trace.h
@@ -133,3 +133,24 @@
 #define CREATE_TRACE_POINTS
 
 #endif /* CREATE_TRACE_POINTS */
+
+#ifndef __DEFINE_PRINT_SYMBOLIC_STR
+#define __DEFINE_PRINT_SYMBOLIC_STR
+static inline const char *
+__print_symbolic_str(int type, struct trace_print_flags *symbols)
+{
+	for (; symbols->name != NULL; symbols++) {
+		if (type == symbols->mask)
+			return symbols->name;
+	}
+	return "<invalid>";
+}
+#endif
+
+#undef __print_symbolic
+#define __print_symbolic(value, symbol_array...)			\
+	({								\
+		static const struct trace_print_flags symbols[] =	\
+			{ symbol_array, { -1, NULL }};			\
+		__print_symbolic_str(value, symbols);			\
+	})

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* RE: [PATCH V2 05/11] cxl/mem: Trace General Media Event Record
  2022-12-01  0:27 ` [PATCH V2 05/11] cxl/mem: Trace General Media Event Record ira.weiny
  2022-12-01 18:54   ` Dave Jiang
@ 2022-12-02  6:18   ` Dan Williams
  1 sibling, 0 replies; 64+ messages in thread
From: Dan Williams @ 2022-12-02  6:18 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Ira Weiny, Steven Rostedt, Jonathan Cameron, Alison Schofield,
	Vishal Verma, Ben Widawsky, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

ira.weiny@ wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL rev 3.0 section 8.2.9.2.1.1 defines the General Media Event Record.
> 
> Determine if the event read is a general media record and if so trace
> the record as a General Media Event Record.
> 
> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes from V1:
> 	Jonathan
> 		fix spec references for CXL rev 3.0
> 		Make flags all caps
> 
> Changes from RFC v2:
> 	Output DPA flags as a single field
> 	Ensure names of fields match what TP_print outputs
> 	Steven
> 		prefix TRACE_EVENT with 'cxl_'
> 	Jonathan
> 		Remove Reserved field
> 
> Changes from RFC:
> 	Add reserved byte array
> 	Use common CXL event header record macros
> 	Jonathan
> 		Use unaligned_le{24,16} for unaligned fields
> 		Don't use the inverse of phy addr mask
> 	Dave Jiang
> 		s/cxl_gen_media_event/general_media
> 		s/cxl_evt_gen_media/cxl_event_gen_media
> ---
>  drivers/cxl/core/mbox.c    |  40 ++++++++++--
>  drivers/cxl/cxlmem.h       |  19 ++++++
>  include/trace/events/cxl.h | 124 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 179 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 076a3df0ba38..20191fe55bba 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -709,6 +709,38 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>  
> +/*
> + * General Media Event Record
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + */
> +static const uuid_t gen_media_event_uuid =
> +	UUID_INIT(0xfbcd0a77, 0xc260, 0x417f,
> +		  0x85, 0xa9, 0x08, 0x8b, 0x16, 0x21, 0xeb, 0xa6);
> +

Move this next to the other UUID_INITs in cxlmem.h.

> +static bool cxl_event_tracing_enabled(void)
> +{
> +	return trace_cxl_generic_event_enabled() ||
> +	       trace_cxl_general_media_enabled();
> +}

...and now the micro-optimization gets more complicated. The mailbox
command is an uncached PCI mmio memcpy(), the incremental cycles this
enabled check is saving would be difficult to spot in a profile. So
unless it has a worthwhile perf-profile impact I prefer the simplicity
of the straight through code.

> +
> +static void cxl_trace_event_record(const char *dev_name,
> +				   enum cxl_event_log_type type,
> +				   struct cxl_event_record_raw *record)
> +{
> +	uuid_t *id = &record->hdr.id;
> +
> +	if (uuid_equal(id, &gen_media_event_uuid)) {
> +		struct cxl_event_gen_media *rec =
> +				(struct cxl_event_gen_media *)record;
> +
> +		trace_cxl_general_media(dev_name, type, rec);
> +		return;
> +	}
> +
> +	/* For unknown record types print just the header */
> +	trace_cxl_generic_event(dev_name, type, record);
> +}
> +
>  static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
>  				  enum cxl_event_log_type log,
>  				  struct cxl_get_event_payload *get_pl,
> @@ -772,11 +804,11 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
>  		if (nr_rec > 0) {
>  			int i;
>  
> -			if (trace_cxl_generic_event_enabled()) {
> +			if (cxl_event_tracing_enabled()) {
>  				for (i = 0; i < nr_rec; i++)
> -					trace_cxl_generic_event(dev_name(cxlds->dev),
> -								type,
> -								&payload->records[i]);
> +					cxl_trace_event_record(dev_name(cxlds->dev),
> +							       type,
> +							       &payload->records[i]);

Changing the same lines multiple times in the same series is sometihng
that sets off my complexity alarm bells.

>  			}
>  
>  			rc = cxl_clear_event_record(cxlds, type, payload, nr_rec);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 1ae9962c5a06..10696debefa8 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -410,6 +410,25 @@ struct cxl_mbox_clear_event_payload {
>  	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];
>  };
>  
> +/*
> + * General Media Event Record
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + */
> +#define CXL_EVENT_GEN_MED_COMP_ID_SIZE	0x10
> +struct cxl_event_gen_media {
> +	struct cxl_event_record_hdr hdr;
> +	__le64 phys_addr;
> +	u8 descriptor;
> +	u8 type;
> +	u8 transaction_type;
> +	u8 validity_flags[2];
> +	u8 channel;
> +	u8 rank;
> +	u8 device[3];
> +	u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
> +	u8 reserved[0x2e];

s/0x2e/46/

> +} __packed;
> +
>  struct cxl_mbox_get_partition_info {
>  	__le64 active_volatile_cap;
>  	__le64 active_persistent_cap;
> diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> index c03a1a894af8..a4d6bd64e9bc 100644
> --- a/include/trace/events/cxl.h
> +++ b/include/trace/events/cxl.h
> @@ -118,6 +118,130 @@ TRACE_EVENT(cxl_generic_event,
>  		__print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
>  );
>  
> +/*
> + * Physical Address field masks
> + *
> + * General Media Event Record
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + *
> + * DRAM Event Record
> + * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> + */
> +#define CXL_DPA_FLAGS_MASK			0x3F
> +#define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
> +
> +#define CXL_DPA_VOLATILE			BIT(0)
> +#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
> +#define show_dpa_flags(flags)	__print_flags(flags, "|",		   \
> +	{ CXL_DPA_VOLATILE,			"VOLATILE"		}, \
> +	{ CXL_DPA_NOT_REPAIRABLE,		"NOT_REPAIRABLE"	}  \
> +)
> +
> +/*
> + * General Media Event Record - GMER
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + */
> +#define CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT		BIT(0)
> +#define CXL_GMER_EVT_DESC_THRESHOLD_EVENT		BIT(1)
> +#define CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW		BIT(2)
> +#define show_event_desc_flags(flags)	__print_flags(flags, "|",		   \
> +	{ CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,		"UNCORRECTABLE_EVENT"	}, \
> +	{ CXL_GMER_EVT_DESC_THRESHOLD_EVENT,		"THRESHOLD_EVENT"	}, \
> +	{ CXL_GMER_EVT_DESC_POISON_LIST_OVERFLOW,	"POISON_LIST_OVERFLOW"	}  \
> +)
> +
> +#define CXL_GMER_MEM_EVT_TYPE_ECC_ERROR			0x00
> +#define CXL_GMER_MEM_EVT_TYPE_INV_ADDR			0x01
> +#define CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR		0x02
> +#define show_mem_event_type(type)	__print_symbolic(type,			\
> +	{ CXL_GMER_MEM_EVT_TYPE_ECC_ERROR,		"ECC Error" },		\
> +	{ CXL_GMER_MEM_EVT_TYPE_INV_ADDR,		"Invalid Address" },	\
> +	{ CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,	"Data Path Error" }	\
> +)
> +
> +#define CXL_GMER_TRANS_UNKNOWN				0x00
> +#define CXL_GMER_TRANS_HOST_READ			0x01
> +#define CXL_GMER_TRANS_HOST_WRITE			0x02
> +#define CXL_GMER_TRANS_HOST_SCAN_MEDIA			0x03
> +#define CXL_GMER_TRANS_HOST_INJECT_POISON		0x04
> +#define CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB		0x05
> +#define CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT	0x06
> +#define show_trans_type(type)	__print_symbolic(type,					\
> +	{ CXL_GMER_TRANS_UNKNOWN,			"Unknown" },			\
> +	{ CXL_GMER_TRANS_HOST_READ,			"Host Read" },			\
> +	{ CXL_GMER_TRANS_HOST_WRITE,			"Host Write" },			\
> +	{ CXL_GMER_TRANS_HOST_SCAN_MEDIA,		"Host Scan Media" },		\
> +	{ CXL_GMER_TRANS_HOST_INJECT_POISON,		"Host Inject Poison" },		\
> +	{ CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,		"Internal Media Scrub" },	\
> +	{ CXL_GMER_TRANS_INTERNAL_MEDIA_MANAGEMENT,	"Internal Media Management" }	\
> +)
> +
> +#define CXL_GMER_VALID_CHANNEL				BIT(0)
> +#define CXL_GMER_VALID_RANK				BIT(1)
> +#define CXL_GMER_VALID_DEVICE				BIT(2)
> +#define CXL_GMER_VALID_COMPONENT			BIT(3)
> +#define show_valid_flags(flags)	__print_flags(flags, "|",		   \
> +	{ CXL_GMER_VALID_CHANNEL,			"CHANNEL"	}, \
> +	{ CXL_GMER_VALID_RANK,				"RANK"		}, \
> +	{ CXL_GMER_VALID_DEVICE,			"DEVICE"	}, \
> +	{ CXL_GMER_VALID_COMPONENT,			"COMPONENT"	}  \
> +)
> +
> +TRACE_EVENT(cxl_general_media,
> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_event_gen_media *rec),
> +
> +	TP_ARGS(dev_name, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +		/* General Media */
> +		__field(u64, dpa)
> +		__field(u8, descriptor)
> +		__field(u8, type)
> +		__field(u8, transaction_type)
> +		__field(u8, channel)
> +		__field(u32, device)
> +		__array(u8, comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE)
> +		__field(u16, validity_flags)
> +		/* Following are out of order to pack trace record */
> +		__field(u8, rank)
> +		__field(u8, dpa_flags)
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +
> +		/* General Media */
> +		__entry->dpa = le64_to_cpu(rec->phys_addr);
> +		__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
> +		/* Mask after flags have been parsed */
> +		__entry->dpa &= CXL_DPA_MASK;
> +		__entry->descriptor = rec->descriptor;
> +		__entry->type = rec->type;
> +		__entry->transaction_type = rec->transaction_type;
> +		__entry->channel = rec->channel;
> +		__entry->rank = rec->rank;
> +		__entry->device = get_unaligned_le24(rec->device);
> +		memcpy(__entry->comp_id, &rec->component_id,
> +			CXL_EVENT_GEN_MED_COMP_ID_SIZE);
> +		__entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
> +	),
> +
> +	CXL_EVT_TP_printk("dpa=%llx dpa_flags='%s' " \
> +		"descriptor='%s' type='%s' transaction_type='%s' channel=%u rank=%u " \
> +		"device=%x comp_id=%s validity_flags='%s'",
> +		__entry->dpa, show_dpa_flags(__entry->dpa_flags),
> +		show_event_desc_flags(__entry->descriptor),
> +		show_mem_event_type(__entry->type),
> +		show_trans_type(__entry->transaction_type),
> +		__entry->channel, __entry->rank, __entry->device,
> +		__print_hex(__entry->comp_id, CXL_EVENT_GEN_MED_COMP_ID_SIZE),
> +		show_valid_flags(__entry->validity_flags)
> +	)
> +);
> +
>  #endif /* _CXL_TRACE_EVENTS_H */
>  
>  /* This part must be outside protection */
> -- 
> 2.37.2
> 



^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH V2 07/11] cxl/mem: Trace Memory Module Event Record
  2022-12-01  0:27 ` [PATCH V2 07/11] cxl/mem: Trace Memory Module " ira.weiny
  2022-12-01 13:31   ` Jonathan Cameron
  2022-12-01 18:57   ` Dave Jiang
@ 2022-12-02  6:25   ` Dan Williams
  2 siblings, 0 replies; 64+ messages in thread
From: Dan Williams @ 2022-12-02  6:25 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Ira Weiny, Steven Rostedt, Alison Schofield, Vishal Verma,
	Ben Widawsky, Jonathan Cameron, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

ira.weiny@ wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL rev 3.0 section 8.2.9.2.1.3 defines the Memory Module Event Record.
> 
> Determine if the event read is memory module record and if so trace the
> record.
> 
> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes from V1:
> 	Use all caps for flag fields
> 
> Changes from RFC v2:
> 	Ensure field names match TP_print output
> 	Steven
> 		prefix TRACE_EVENT with 'cxl_'
> 	Jonathan
> 		Remove reserved field
> 		Define a 1bit and 2 bit status decoder
> 		Fix paren alignment
> 
> Changes from RFC:
> 	Clean up spec reference
> 	Add reserved data
> 	Use new CXL header macros
> 	Jonathan
> 		Use else if
> 		Use get_unaligned_le*() for unaligned fields
> 	Dave Jiang
> 		s/cxl_mem_mod_event/memory_module
> 		s/cxl_evt_mem_mod_rec/cxl_event_mem_module
> ---
>  drivers/cxl/core/mbox.c    |  17 ++++-
>  drivers/cxl/cxlmem.h       |  26 +++++++
>  include/trace/events/cxl.h | 144 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 186 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 66fc50d89bf4..30840b711381 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -725,11 +725,20 @@ static const uuid_t dram_event_uuid =
>  	UUID_INIT(0x601dcbb3, 0x9c06, 0x4eab,
>  		  0xb8, 0xaf, 0x4e, 0x9b, 0xfb, 0x5c, 0x96, 0x24);
>  
> +/*
> + * Memory Module Event Record
> + * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
> + */
> +static const uuid_t mem_mod_event_uuid =
> +	UUID_INIT(0xfe927475, 0xdd59, 0x4339,
> +		  0xa5, 0x86, 0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74);
> +
>  static bool cxl_event_tracing_enabled(void)
>  {
>  	return trace_cxl_generic_event_enabled() ||
>  	       trace_cxl_general_media_enabled() ||
> -	       trace_cxl_dram_enabled();
> +	       trace_cxl_dram_enabled() ||
> +	       trace_cxl_memory_module_enabled();
>  }
>  
>  static void cxl_trace_event_record(const char *dev_name,
> @@ -749,6 +758,12 @@ static void cxl_trace_event_record(const char *dev_name,
>  
>  		trace_cxl_dram(dev_name, type, rec);
>  		return;
> +	} else if (uuid_equal(id, &mem_mod_event_uuid)) {
> +		struct cxl_event_mem_module *rec =
> +				(struct cxl_event_mem_module *)record;
> +
> +		trace_cxl_memory_module(dev_name, type, rec);
> +		return;

Replace these early returns with a final else that calls
trace_cxl_generic_event()

>  	}
>  
>  	/* For unknown record types print just the header */
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index f5f63a475478..450b410f29f6 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -452,6 +452,32 @@ struct cxl_event_dram {
>  	u8 reserved[0x17];
>  } __packed;
>  
> +/*
> + * Get Health Info Record
> + * CXL rev 3.0 section 8.2.9.8.3.1; Table 8-100
> + */
> +struct cxl_get_health_info {
> +	u8 health_status;
> +	u8 media_status;
> +	u8 add_status;
> +	u8 life_used;
> +	u8 device_temp[2];
> +	u8 dirty_shutdown_cnt[4];
> +	u8 cor_vol_err_cnt[4];
> +	u8 cor_per_err_cnt[4];
> +} __packed;
> +
> +/*
> + * Memory Module Event Record
> + * CXL rev 3.0 section 8.2.9.2.1.3; Table 8-45
> + */
> +struct cxl_event_mem_module {
> +	struct cxl_event_record_hdr hdr;
> +	u8 event_type;
> +	struct cxl_get_health_info info;
> +	u8 reserved[0x3d];

Decimal size please, otherwise the rest looks good to me.

> +} __packed;
> +
>  struct cxl_mbox_get_partition_info {
>  	__le64 active_volatile_cap;
>  	__le64 active_persistent_cap;
> diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> index 474390f895d9..48786d6c9615 100644
> --- a/include/trace/events/cxl.h
> +++ b/include/trace/events/cxl.h
> @@ -334,6 +334,150 @@ TRACE_EVENT(cxl_dram,
>  	)
>  );
>  
> +/*
> + * Memory Module Event Record - MMER
> + *
> + * CXL res 3.0 section 8.2.9.2.1.3; Table 8-45
> + */
> +#define CXL_MMER_HEALTH_STATUS_CHANGE		0x00
> +#define CXL_MMER_MEDIA_STATUS_CHANGE		0x01
> +#define CXL_MMER_LIFE_USED_CHANGE		0x02
> +#define CXL_MMER_TEMP_CHANGE			0x03
> +#define CXL_MMER_DATA_PATH_ERROR		0x04
> +#define CXL_MMER_LAS_ERROR			0x05
> +#define show_dev_evt_type(type)	__print_symbolic(type,			   \
> +	{ CXL_MMER_HEALTH_STATUS_CHANGE,	"Health Status Change"	}, \
> +	{ CXL_MMER_MEDIA_STATUS_CHANGE,		"Media Status Change"	}, \
> +	{ CXL_MMER_LIFE_USED_CHANGE,		"Life Used Change"	}, \
> +	{ CXL_MMER_TEMP_CHANGE,			"Temperature Change"	}, \
> +	{ CXL_MMER_DATA_PATH_ERROR,		"Data Path Error"	}, \
> +	{ CXL_MMER_LAS_ERROR,			"LSA Error"		}  \
> +)
> +
> +/*
> + * Device Health Information - DHI
> + *
> + * CXL res 3.0 section 8.2.9.8.3.1; Table 8-100
> + */
> +#define CXL_DHI_HS_MAINTENANCE_NEEDED				BIT(0)
> +#define CXL_DHI_HS_PERFORMANCE_DEGRADED				BIT(1)
> +#define CXL_DHI_HS_HW_REPLACEMENT_NEEDED			BIT(2)
> +#define show_health_status_flags(flags)	__print_flags(flags, "|",	   \
> +	{ CXL_DHI_HS_MAINTENANCE_NEEDED,	"MAINTENANCE_NEEDED"	}, \
> +	{ CXL_DHI_HS_PERFORMANCE_DEGRADED,	"PERFORMANCE_DEGRADED"	}, \
> +	{ CXL_DHI_HS_HW_REPLACEMENT_NEEDED,	"REPLACEMENT_NEEDED"	}  \
> +)
> +
> +#define CXL_DHI_MS_NORMAL							0x00
> +#define CXL_DHI_MS_NOT_READY							0x01
> +#define CXL_DHI_MS_WRITE_PERSISTENCY_LOST					0x02
> +#define CXL_DHI_MS_ALL_DATA_LOST						0x03
> +#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS			0x04
> +#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN			0x05
> +#define CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT				0x06
> +#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS				0x07
> +#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN				0x08
> +#define CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT					0x09
> +#define show_media_status(ms)	__print_symbolic(ms,			   \
> +	{ CXL_DHI_MS_NORMAL,						   \
> +		"Normal"						}, \
> +	{ CXL_DHI_MS_NOT_READY,						   \
> +		"Not Ready"						}, \
> +	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOST,				   \
> +		"Write Persistency Lost"				}, \
> +	{ CXL_DHI_MS_ALL_DATA_LOST,					   \
> +		"All Data Lost"						}, \
> +	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_POWER_LOSS,		   \
> +		"Write Persistency Loss in the Event of Power Loss"	}, \
> +	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_EVENT_SHUTDOWN,		   \
> +		"Write Persistency Loss in Event of Shutdown"		}, \
> +	{ CXL_DHI_MS_WRITE_PERSISTENCY_LOSS_IMMINENT,			   \
> +		"Write Persistency Loss Imminent"			}, \
> +	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_POWER_LOSS,		   \
> +		"All Data Loss in Event of Power Loss"			}, \
> +	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_EVENT_SHUTDOWN,		   \
> +		"All Data loss in the Event of Shutdown"		}, \
> +	{ CXL_DHI_MS_WRITE_ALL_DATA_LOSS_IMMINENT,			   \
> +		"All Data Loss Imminent"				}  \
> +)
> +
> +#define CXL_DHI_AS_NORMAL		0x0
> +#define CXL_DHI_AS_WARNING		0x1
> +#define CXL_DHI_AS_CRITICAL		0x2
> +#define show_two_bit_status(as) __print_symbolic(as,	   \
> +	{ CXL_DHI_AS_NORMAL,		"Normal"	}, \
> +	{ CXL_DHI_AS_WARNING,		"Warning"	}, \
> +	{ CXL_DHI_AS_CRITICAL,		"Critical"	}  \
> +)
> +#define show_one_bit_status(as) __print_symbolic(as,	   \
> +	{ CXL_DHI_AS_NORMAL,		"Normal"	}, \
> +	{ CXL_DHI_AS_WARNING,		"Warning"	}  \
> +)
> +
> +#define CXL_DHI_AS_LIFE_USED(as)			(as & 0x3)
> +#define CXL_DHI_AS_DEV_TEMP(as)				((as & 0xC) >> 2)
> +#define CXL_DHI_AS_COR_VOL_ERR_CNT(as)			((as & 0x10) >> 4)
> +#define CXL_DHI_AS_COR_PER_ERR_CNT(as)			((as & 0x20) >> 5)
> +
> +TRACE_EVENT(cxl_memory_module,
> +
> +	TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
> +		 struct cxl_event_mem_module *rec),
> +
> +	TP_ARGS(dev_name, log, rec),
> +
> +	TP_STRUCT__entry(
> +		CXL_EVT_TP_entry
> +
> +		/* Memory Module Event */
> +		__field(u8, event_type)
> +
> +		/* Device Health Info */
> +		__field(u8, health_status)
> +		__field(u8, media_status)
> +		__field(u8, life_used)
> +		__field(u32, dirty_shutdown_cnt)
> +		__field(u32, cor_vol_err_cnt)
> +		__field(u32, cor_per_err_cnt)
> +		__field(s16, device_temp)
> +		__field(u8, add_status)
> +	),
> +
> +	TP_fast_assign(
> +		CXL_EVT_TP_fast_assign(dev_name, log, rec->hdr);
> +
> +		/* Memory Module Event */
> +		__entry->event_type = rec->event_type;
> +
> +		/* Device Health Info */
> +		__entry->health_status = rec->info.health_status;
> +		__entry->media_status = rec->info.media_status;
> +		__entry->life_used = rec->info.life_used;
> +		__entry->dirty_shutdown_cnt = get_unaligned_le32(rec->info.dirty_shutdown_cnt);
> +		__entry->cor_vol_err_cnt = get_unaligned_le32(rec->info.cor_vol_err_cnt);
> +		__entry->cor_per_err_cnt = get_unaligned_le32(rec->info.cor_per_err_cnt);
> +		__entry->device_temp = get_unaligned_le16(rec->info.device_temp);
> +		__entry->add_status = rec->info.add_status;
> +	),
> +
> +	CXL_EVT_TP_printk("event_type='%s' health_status='%s' media_status='%s' " \
> +		"as_life_used=%s as_dev_temp=%s as_cor_vol_err_cnt=%s " \
> +		"as_cor_per_err_cnt=%s life_used=%u device_temp=%d " \
> +		"dirty_shutdown_cnt=%u cor_vol_err_cnt=%u cor_per_err_cnt=%u",
> +		show_dev_evt_type(__entry->event_type),
> +		show_health_status_flags(__entry->health_status),
> +		show_media_status(__entry->media_status),
> +		show_two_bit_status(CXL_DHI_AS_LIFE_USED(__entry->add_status)),
> +		show_two_bit_status(CXL_DHI_AS_DEV_TEMP(__entry->add_status)),
> +		show_one_bit_status(CXL_DHI_AS_COR_VOL_ERR_CNT(__entry->add_status)),
> +		show_one_bit_status(CXL_DHI_AS_COR_PER_ERR_CNT(__entry->add_status)),
> +		__entry->life_used, __entry->device_temp,
> +		__entry->dirty_shutdown_cnt, __entry->cor_vol_err_cnt,
> +		__entry->cor_per_err_cnt
> +	)
> +);
> +
> +
>  #endif /* _CXL_TRACE_EVENTS_H */
>  
>  /* This part must be outside protection */
> -- 
> 2.37.2
> 



^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH V2 08/11] cxl/mem: Wire up event interrupts
  2022-12-01  0:27 ` [PATCH V2 08/11] cxl/mem: Wire up event interrupts ira.weiny
  2022-12-01 14:21   ` Jonathan Cameron
  2022-12-01 18:35   ` Davidlohr Bueso
@ 2022-12-02  7:37   ` Dan Williams
  2022-12-02 14:19     ` Jonathan Cameron
  2 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2022-12-02  7:37 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

ira.weiny@ wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> CXL device events are signaled via interrupts.  Each event log may have
> a different interrupt message number.  These message numbers are
> reported in the Get Event Interrupt Policy mailbox command.
> 
> Add interrupt support for event logs.  Interrupts are allocated as
> shared interrupts.  Therefore, all or some event logs can share the same
> message number.

Definitely squash patch1 with this one, especially because this shows
that the ->msi_enabled portion of patch1 was unnecessary, see below.

> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes from V1:
> 	Remove unneeded evt_int_policy from struct cxl_dev_state
> 	defer Dynamic Capacity support
> 	Dave Jiang
> 		s/irq/rc
> 		use IRQ_NONE to signal the irq was not for us.
> 	Jonathan
> 		use msi_enabled rather than nr_irq_vec
> 		On failure explicitly set CXL_INT_NONE
> 		Add comment for Get Event Interrupt Policy
> 		use devm_request_threaded_irq()
> 		Use individual handler/thread functions for each of the
> 		logs rather than struct cxl_event_irq_id.
> 
> Changes from RFC v2
> 	Adjust to new irq 16 vector allocation
> 	Jonathan
> 		Remove CXL_INT_RES
> 	Use irq threads to ensure mailbox commands are executed outside irq context
> 	Adjust for optional Dynamic Capacity log
> ---
>  drivers/cxl/core/mbox.c      |  44 +++++++++++-
>  drivers/cxl/cxlmem.h         |  30 ++++++++
>  drivers/cxl/pci.c            | 130 +++++++++++++++++++++++++++++++++++
>  include/uapi/linux/cxl_mem.h |   2 +
>  4 files changed, 204 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 30840b711381..1e00b49d8b06 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -53,6 +53,8 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
>  	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
>  	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
>  	CXL_CMD(CLEAR_EVENT_RECORD, CXL_VARIABLE_PAYLOAD, 0, 0),
> +	CXL_CMD(GET_EVT_INT_POLICY, 0, 0x5, 0),
> +	CXL_CMD(SET_EVT_INT_POLICY, 0x5, 0, 0),
>  	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
>  	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
>  	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
> @@ -806,8 +808,8 @@ static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
>  	return 0;
>  }
>  
> -static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> -				    enum cxl_event_log_type type)
> +void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> +			     enum cxl_event_log_type type)
>  {
>  	struct cxl_get_event_payload *payload;
>  	u16 nr_rec;
> @@ -857,6 +859,7 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
>  unlock_buffer:
>  	mutex_unlock(&cxlds->event_buf_lock);
>  }
> +EXPORT_SYMBOL_NS_GPL(cxl_mem_get_records_log, CXL);
>  
>  static void cxl_mem_free_event_buffer(void *data)
>  {
> @@ -916,6 +919,43 @@ void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
>  
> +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> +			     struct cxl_event_interrupt_policy *policy)
> +{
> +	int rc;
> +
> +	policy->info_settings = CXL_INT_MSI_MSIX;
> +	policy->warn_settings = CXL_INT_MSI_MSIX;
> +	policy->failure_settings = CXL_INT_MSI_MSIX;
> +	policy->fatal_settings = CXL_INT_MSI_MSIX;

I think this needs to be careful not to undo events that the BIOS
steered to itself in firmware-first mode, which raises another question,
does firmware-first mean more the OS needs to backoff on some event-log
handling as well?

> +
> +	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_SET_EVT_INT_POLICY,
> +			       policy, sizeof(*policy), NULL, 0);
> +	if (rc < 0) {
> +		dev_err(cxlds->dev, "Failed to set event interrupt policy : %d",
> +			rc);
> +
> +		policy->info_settings = CXL_INT_NONE;
> +		policy->warn_settings = CXL_INT_NONE;
> +		policy->failure_settings = CXL_INT_NONE;
> +		policy->fatal_settings = CXL_INT_NONE;
> +
> +		return rc;
> +	}
> +
> +	/* Retrieve interrupt message numbers */
> +	rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVT_INT_POLICY, NULL, 0,
> +			       policy, sizeof(*policy));
> +	if (rc < 0) {
> +		dev_err(cxlds->dev, "Failed to get event interrupt policy : %d",
> +			rc);
> +		return rc;
> +	}
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_event_config_msgnums, CXL);
> +
>  /**
>   * cxl_mem_get_partition_info - Get partition info
>   * @cxlds: The device data for the operation
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 450b410f29f6..2d384b0fc2b3 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -179,6 +179,30 @@ struct cxl_endpoint_dvsec_info {
>  	struct range dvsec_range[2];
>  };
>  
> +/**
> + * Event Interrupt Policy
> + *
> + * CXL rev 3.0 section 8.2.9.2.4; Table 8-52
> + */
> +enum cxl_event_int_mode {
> +	CXL_INT_NONE		= 0x00,
> +	CXL_INT_MSI_MSIX	= 0x01,
> +	CXL_INT_FW		= 0x02
> +};
> +#define CXL_EVENT_INT_MODE_MASK 0x3
> +#define CXL_EVENT_INT_MSGNUM(setting) (((setting) & 0xf0) >> 4)
> +struct cxl_event_interrupt_policy {
> +	u8 info_settings;
> +	u8 warn_settings;
> +	u8 failure_settings;
> +	u8 fatal_settings;
> +} __packed;
> +
> +static inline bool cxl_evt_int_is_msi(u8 setting)
> +{
> +	return CXL_INT_MSI_MSIX == (setting & CXL_EVENT_INT_MODE_MASK);
> +}
> +
>  /**
>   * struct cxl_dev_state - The driver device state
>   *
> @@ -262,6 +286,8 @@ enum cxl_opcode {
>  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
>  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
>  	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> +	CXL_MBOX_OP_GET_EVT_INT_POLICY	= 0x0102,
> +	CXL_MBOX_OP_SET_EVT_INT_POLICY	= 0x0103,
>  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
>  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
>  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> @@ -537,7 +563,11 @@ int cxl_mem_create_range_info(struct cxl_dev_state *cxlds);
>  struct cxl_dev_state *cxl_dev_state_create(struct device *dev);
>  void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
>  void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
> +void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> +			     enum cxl_event_log_type type);
>  void cxl_mem_get_event_records(struct cxl_dev_state *cxlds);
> +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> +			     struct cxl_event_interrupt_policy *policy);
>  #ifdef CONFIG_CXL_SUSPEND
>  void cxl_mem_active_inc(void);
>  void cxl_mem_active_dec(void);
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 11e95a95195a..3c0b9199f11a 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -449,6 +449,134 @@ static void cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
>  	cxlds->msi_enabled = true;
>  }
>  
> +static irqreturn_t cxl_event_info_thread(int irq, void *id)
> +{
> +	struct cxl_dev_state *cxlds = id;
> +
> +	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> +	return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t cxl_event_info_handler(int irq, void *id)
> +{
> +	struct cxl_dev_state *cxlds = id;
> +	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> +
> +	if (CXLDEV_EVENT_STATUS_INFO & status)
> +		return IRQ_WAKE_THREAD;
> +	return IRQ_NONE;
> +}
> +
> +static irqreturn_t cxl_event_warn_thread(int irq, void *id)
> +{
> +	struct cxl_dev_state *cxlds = id;
> +
> +	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> +	return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t cxl_event_warn_handler(int irq, void *id)
> +{
> +	struct cxl_dev_state *cxlds = id;
> +	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> +
> +	if (CXLDEV_EVENT_STATUS_WARN & status)
> +		return IRQ_WAKE_THREAD;
> +	return IRQ_NONE;
> +}
> +
> +static irqreturn_t cxl_event_failure_thread(int irq, void *id)
> +{
> +	struct cxl_dev_state *cxlds = id;
> +
> +	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> +	return IRQ_HANDLED;
> +}

So I think one of the nice side effects of moving log priorty handling
inside of cxl_mem_get_records_log() and looping through all log types in
priority order until all status is clear is that an INFO interrupt also
triggers a check of the FATAL status for free.

You likely do not even need to do the status read in hardirq part of the
handler, just unconditionally wake the event handler thread. I.e. just
pass NULL for @handler to devm_request_threaded_irq() and let the
thread_fn figure it all out in priority order.

> +
> +static irqreturn_t cxl_event_failure_handler(int irq, void *id)
> +{
> +	struct cxl_dev_state *cxlds = id;
> +	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> +
> +	if (CXLDEV_EVENT_STATUS_FAIL & status)
> +		return IRQ_WAKE_THREAD;
> +	return IRQ_NONE;
> +}
> +
> +static irqreturn_t cxl_event_fatal_thread(int irq, void *id)
> +{
> +	struct cxl_dev_state *cxlds = id;
> +
> +	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
> +	return IRQ_HANDLED;
> +}
> +
> +static irqreturn_t cxl_event_fatal_handler(int irq, void *id)
> +{
> +	struct cxl_dev_state *cxlds = id;
> +	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> +
> +	if (CXLDEV_EVENT_STATUS_FATAL & status)
> +		return IRQ_WAKE_THREAD;
> +	return IRQ_NONE;
> +}
> +
> +static void cxl_event_irqsetup(struct cxl_dev_state *cxlds)
> +{
> +	struct cxl_event_interrupt_policy policy;
> +	struct device *dev = cxlds->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	u8 setting;
> +	int rc;
> +
> +	if (cxl_event_config_msgnums(cxlds, &policy))
> +		return;
> +
> +	setting = policy.info_settings;
> +	if (cxl_evt_int_is_msi(setting)) {

So pci_irq_vector() automatically handles checking if msi is enabled and
will return a failure if either MSI is not enabled, or the message
number did not get a vector.

With that insight I would do something like this (untested):

@@ -521,7 +521,14 @@ static irqreturn_t cxl_event_fatal_handler(int irq, void *id)
        return IRQ_NONE;
 }
 
-static void cxl_event_irqsetup(struct cxl_dev_state *cxlds)
+static int cxl_evt_irq(struct pci_dev *pdev, u8 setting)
+{
+       if ((setting & CXL_EVENT_INT_MODE_MASK) != CXL_INT_MSI_MSIX)
+               return -ENXIO;
+       return pci_irq_vector(pdev, CXL_EVENT_INT_MSGNUM(setting));
+}
+
+static int cxl_event_irqsetup(struct cxl_dev_state *cxlds)
 {
        struct cxl_event_interrupt_policy policy;
        struct device *dev = cxlds->dev;
@@ -529,18 +536,17 @@ static void cxl_event_irqsetup(struct cxl_dev_state *cxlds)
        u8 setting;
        int rc;
 
-       if (cxl_event_config_msgnums(cxlds, &policy))
-               return;
+       rc = cxl_event_config_msgnums(cxlds, &policy);
+       if (rc)
+               return rc;
 
-       setting = policy.info_settings;
-       if (cxl_evt_int_is_msi(setting)) {
-               rc = devm_request_threaded_irq(dev,
-                               pci_irq_vector(pdev, CXL_EVENT_INT_MSGNUM(setting)),
-                               cxl_event_info_handler, cxl_event_info_thread,
-                               IRQF_SHARED, NULL, cxlds);
-               if (rc)
-                       dev_err(dev, "Failed to get interrupt for %s event log\n",
-                               cxl_event_log_type_str(CXL_EVENT_TYPE_INFO));
+       rc = devm_request_threaded_irq(dev,
+                                      cxl_evt_irq(pdev, policy.info_settings),
+                                      NULL, cxl_event_info_thread, IRQF_SHARED,
+                                      NULL, cxlds);
+       if (rc) {
+               dev_err(dev, "Failed to get interrupt for INFO event log\n");
+               return rc;
        }
 
        setting = policy.warn_settings;



> +		rc = devm_request_threaded_irq(dev,
> +				pci_irq_vector(pdev, CXL_EVENT_INT_MSGNUM(setting)),
> +				cxl_event_info_handler, cxl_event_info_thread,
> +				IRQF_SHARED, NULL, cxlds);
> +		if (rc)
> +			dev_err(dev, "Failed to get interrupt for %s event log\n",
> +				cxl_event_log_type_str(CXL_EVENT_TYPE_INFO));

Per above, no need to use cxl_event_log_type_str() in these.

> +	}
> +
> +	setting = policy.warn_settings;
> +	if (cxl_evt_int_is_msi(setting)) {
> +		rc = devm_request_threaded_irq(dev,
> +				pci_irq_vector(pdev, CXL_EVENT_INT_MSGNUM(setting)),
> +				cxl_event_warn_handler, cxl_event_warn_thread,
> +				IRQF_SHARED, NULL, cxlds);
> +		if (rc)
> +			dev_err(dev, "Failed to get interrupt for %s event log\n",
> +				cxl_event_log_type_str(CXL_EVENT_TYPE_WARN));
> +	}
> +
> +	setting = policy.failure_settings;
> +	if (cxl_evt_int_is_msi(setting)) {
> +		rc = devm_request_threaded_irq(dev,
> +				pci_irq_vector(pdev, CXL_EVENT_INT_MSGNUM(setting)),
> +				cxl_event_failure_handler, cxl_event_failure_thread,
> +				IRQF_SHARED, NULL, cxlds);
> +		if (rc)
> +			dev_err(dev, "Failed to get interrupt for %s event log\n",
> +				cxl_event_log_type_str(CXL_EVENT_TYPE_FAIL));
> +	}
> +
> +	setting = policy.fatal_settings;
> +	if (cxl_evt_int_is_msi(setting)) {
> +		rc = devm_request_threaded_irq(dev,
> +				pci_irq_vector(pdev, CXL_EVENT_INT_MSGNUM(setting)),
> +				cxl_event_fatal_handler, cxl_event_fatal_thread,
> +				IRQF_SHARED, NULL, cxlds);
> +		if (rc)
> +			dev_err(dev, "Failed to get interrupt for %s event log\n",
> +				cxl_event_log_type_str(CXL_EVENT_TYPE_FATAL));
> +	}
> +}
> +
>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct cxl_register_map map;
> @@ -516,6 +644,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  		return rc;
>  
>  	cxl_pci_alloc_irq_vectors(cxlds);

There should be fail return here, or a comment why this can be skipped,
especially if the device claims to support events.

> +	if (cxlds->msi_enabled)
> +		cxl_event_irqsetup(cxlds);

Per above, do this unconditionally.

>  
>  	cxlmd = devm_cxl_add_memdev(cxlds);
>  	if (IS_ERR(cxlmd))
> diff --git a/include/uapi/linux/cxl_mem.h b/include/uapi/linux/cxl_mem.h
> index 7c1ad8062792..a8204802fcca 100644
> --- a/include/uapi/linux/cxl_mem.h
> +++ b/include/uapi/linux/cxl_mem.h
> @@ -26,6 +26,8 @@
>  	___C(GET_SUPPORTED_LOGS, "Get Supported Logs"),                   \
>  	___C(GET_EVENT_RECORD, "Get Event Record"),                       \
>  	___C(CLEAR_EVENT_RECORD, "Clear Event Record"),                   \
> +	___C(GET_EVT_INT_POLICY, "Get Event Interrupt Policy"),           \
> +	___C(SET_EVT_INT_POLICY, "Set Event Interrupt Policy"),           \
>  	___C(GET_FW_INFO, "Get FW Info"),                                 \
>  	___C(GET_PARTITION_INFO, "Get Partition Information"),            \
>  	___C(GET_LSA, "Get Label Storage Area"),                          \

Same, "at the end" comment.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* RE: [PATCH V2 09/11] cxl/test: Add generic mock events
  2022-12-01  0:27 ` [PATCH V2 09/11] cxl/test: Add generic mock events ira.weiny
  2022-12-01 14:37   ` Jonathan Cameron
@ 2022-12-02  8:07   ` Dan Williams
  1 sibling, 0 replies; 64+ messages in thread
From: Dan Williams @ 2022-12-02  8:07 UTC (permalink / raw)
  To: ira.weiny, Dan Williams
  Cc: Ira Weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Jonathan Cameron, Davidlohr Bueso, Dave Jiang,
	linux-kernel, linux-cxl

ira.weiny@ wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Facilitate testing basic Get/Clear Event functionality by creating
> multiple logs and generic events with made up UUID's.
> 
> Data is completely made up with data patterns which should be easy to
> spot in trace output.
> 
> A single sysfs entry resets the event data and triggers collecting the
> events for testing.
> 
> Events are returned one at a time which is within the specification even
> though it does not exercise the full capabilities of what a device may
> do.
> 
> Test traces are easy to obtain with a small script such as this:
> 
> 	#!/bin/bash -x
> 
> 	devices=`find /sys/devices/platform -name cxl_mem*`
> 
> 	# Turn on tracing
> 	echo "" > /sys/kernel/tracing/trace
> 	echo 1 > /sys/kernel/tracing/events/cxl/enable
> 	echo 1 > /sys/kernel/tracing/tracing_on
> 
> 	# Generate fake interrupt
> 	for device in $devices; do
> 	        echo 1 > $device/event_trigger
> 	done
> 
> 	# Turn off tracing and report events
> 	echo 0 > /sys/kernel/tracing/tracing_on
> 	cat /sys/kernel/tracing/trace
> 

Nice, should be straightforward to copy-pasta that into a unit test
script.

> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> 
> ---
> Changes from v1:
> 	Fix up for new structures
> 	Jonathan
> 		Update based on specification discussion
> 
> Changes from RFC v2:
> 	Adjust to simulate the event status register
> 
> Changes from RFC:
> 	Separate out the event code
> 	Adjust for struct changes.
> 	Clean up devm_cxl_mock_event_logs()
> 	Clean up naming and comments
> 	Jonathan
> 		Remove dynamic allocation of event logs
> 		Clean up comment
> 		Remove unneeded xarray
> 		Ensure event_trigger sysfs is valid prior to the driver
> 		going active.
> 	Dan
> 		Remove the fill/reset event sysfs as these operations
> 		can be done together
> ---
>  drivers/cxl/core/mbox.c         |  33 +++--
>  drivers/cxl/cxlmem.h            |   1 +
>  tools/testing/cxl/test/Kbuild   |   2 +-
>  tools/testing/cxl/test/events.c | 242 ++++++++++++++++++++++++++++++++
>  tools/testing/cxl/test/events.h |   9 ++
>  tools/testing/cxl/test/mem.c    |  35 ++++-
>  6 files changed, 307 insertions(+), 15 deletions(-)
>  create mode 100644 tools/testing/cxl/test/events.c
>  create mode 100644 tools/testing/cxl/test/events.h
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 1e00b49d8b06..17659b9a0408 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -886,20 +886,9 @@ static struct cxl_get_event_payload *alloc_event_buf(struct cxl_dev_state *cxlds
>  	return buf;
>  }
>  
> -/**
> - * cxl_mem_get_event_records - Get Event Records from the device
> - * @cxlds: The device data for the operation
> - *
> - * Retrieve all event records available on the device, report them as trace
> - * events, and clear them.
> - *
> - * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> - * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
> - */
> -void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> +/* Direct call for mock testing */
> +void __cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status)
>  {
> -	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> -
>  	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
>  
>  	if (!cxlds->event_buf) {
> @@ -917,6 +906,24 @@ void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
>  	if (status & CXLDEV_EVENT_STATUS_FATAL)
>  		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
>  }
> +EXPORT_SYMBOL_NS_GPL(__cxl_mem_get_event_records, CXL);
> +
> +/**
> + * cxl_mem_get_event_records - Get Event Records from the device
> + * @cxlds: The device data for the operation
> + *
> + * Retrieve all event records available on the device, report them as trace
> + * events, and clear them.
> + *
> + * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> + * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
> + */
> +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> +{
> +	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> +
> +	__cxl_mem_get_event_records(cxlds, status);
> +}
>  EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);

The unit tests should be minimally disruptive to the mainlie code, and
giving them their own private export is outside that spirit. Can you
rearrange so that the unit tests don't need their own export? Like all
callers pass the status?

>  
>  int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 2d384b0fc2b3..10e3c1c893f3 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -565,6 +565,7 @@ void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds
>  void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
>  void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
>  			     enum cxl_event_log_type type);
> +void __cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status);
>  void cxl_mem_get_event_records(struct cxl_dev_state *cxlds);
>  int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
>  			     struct cxl_event_interrupt_policy *policy);
> diff --git a/tools/testing/cxl/test/Kbuild b/tools/testing/cxl/test/Kbuild
> index 4e59e2c911f6..64b14b83d8d9 100644
> --- a/tools/testing/cxl/test/Kbuild
> +++ b/tools/testing/cxl/test/Kbuild
> @@ -7,4 +7,4 @@ obj-m += cxl_mock_mem.o
>  
>  cxl_test-y := cxl.o
>  cxl_mock-y := mock.o
> -cxl_mock_mem-y := mem.o
> +cxl_mock_mem-y := mem.o events.o
> diff --git a/tools/testing/cxl/test/events.c b/tools/testing/cxl/test/events.c
> new file mode 100644
> index 000000000000..a3d2ec7cc9fe
> --- /dev/null
> +++ b/tools/testing/cxl/test/events.c
> @@ -0,0 +1,242 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright(c) 2022 Intel Corporation. All rights reserved.
> +
> +#include <cxlmem.h>
> +#include <trace/events/cxl.h>
> +
> +#include "events.h"
> +
> +#define CXL_TEST_EVENT_CNT_MAX 15
> +
> +/* Set a number of events to return at a time for simulation.  */
> +#define CXL_TEST_EVENT_CNT 3
> +
> +struct mock_event_log {
> +	u16 clear_idx;
> +	u16 cur_idx;
> +	u16 nr_events;
> +	struct cxl_event_record_raw *events[CXL_TEST_EVENT_CNT_MAX];
> +};
> +
> +struct mock_event_store {
> +	struct cxl_dev_state *cxlds;
> +	struct mock_event_log mock_logs[CXL_EVENT_TYPE_MAX];
> +	u32 ev_status;
> +};
> +
> +DEFINE_XARRAY(mock_dev_event_store);
> +
> +struct mock_event_log *find_event_log(struct device *dev, int log_type)
> +{
> +	struct mock_event_store *mes = xa_load(&mock_dev_event_store,
> +					       (unsigned long)dev);
> +
> +	if (!mes || log_type >= CXL_EVENT_TYPE_MAX)
> +		return NULL;
> +	return &mes->mock_logs[log_type];
> +}
> +
> +struct cxl_event_record_raw *get_cur_event(struct mock_event_log *log)
> +{
> +	return log->events[log->cur_idx];
> +}
> +
> +void reset_event_log(struct mock_event_log *log)
> +{
> +	log->cur_idx = 0;
> +	log->clear_idx = 0;
> +}
> +
> +/* Handle can never be 0 use 1 based indexing for handle */
> +u16 get_clear_handle(struct mock_event_log *log)
> +{
> +	return log->clear_idx + 1;
> +}
> +
> +/* Handle can never be 0 use 1 based indexing for handle */
> +__le16 get_cur_event_handle(struct mock_event_log *log)
> +{
> +	u16 cur_handle = log->cur_idx + 1;
> +
> +	return cpu_to_le16(cur_handle);
> +}
> +
> +static bool log_empty(struct mock_event_log *log)
> +{
> +	return log->cur_idx == log->nr_events;
> +}
> +
> +static void event_store_add_event(struct mock_event_store *mes,
> +				  enum cxl_event_log_type log_type,
> +				  struct cxl_event_record_raw *event)
> +{
> +	struct mock_event_log *log;
> +
> +	if (WARN_ON(log_type >= CXL_EVENT_TYPE_MAX))
> +		return;
> +
> +	log = &mes->mock_logs[log_type];
> +	if (WARN_ON(log->nr_events >= CXL_TEST_EVENT_CNT_MAX))
> +		return;
> +
> +	log->events[log->nr_events] = event;
> +	log->nr_events++;
> +}
> +
> +int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +{
> +	struct cxl_get_event_payload *pl;
> +	struct mock_event_log *log;
> +	u8 log_type;
> +	int i;
> +
> +	if (cmd->size_in != sizeof(log_type))
> +		return -EINVAL;
> +
> +	if (cmd->size_out < struct_size(pl, records, CXL_TEST_EVENT_CNT))
> +		return -EINVAL;
> +
> +	log_type = *((u8 *)cmd->payload_in);
> +	if (log_type >= CXL_EVENT_TYPE_MAX)
> +		return -EINVAL;
> +
> +	memset(cmd->payload_out, 0, cmd->size_out);
> +
> +	log = find_event_log(cxlds->dev, log_type);
> +	if (!log || log_empty(log))
> +		return 0;
> +
> +	pl = cmd->payload_out;
> +
> +	for (i = 0; i < CXL_TEST_EVENT_CNT && !log_empty(log); i++) {
> +		memcpy(&pl->records[i], get_cur_event(log), sizeof(pl->records[i]));
> +		pl->records[i].hdr.handle = get_cur_event_handle(log);
> +		log->cur_idx++;
> +	}
> +
> +	pl->record_count = cpu_to_le16(i);
> +	if (!log_empty(log))
> +		pl->flags |= CXL_GET_EVENT_FLAG_MORE_RECORDS;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(mock_get_event);
> +
> +int mock_clear_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
> +{
> +	struct cxl_mbox_clear_event_payload *pl = cmd->payload_in;
> +	struct mock_event_log *log;
> +	u8 log_type = pl->event_log;
> +	u16 handle;
> +	int nr;
> +
> +	if (log_type >= CXL_EVENT_TYPE_MAX)
> +		return -EINVAL;
> +
> +	log = find_event_log(cxlds->dev, log_type);
> +	if (!log)
> +		return 0; /* No mock data in this log */
> +
> +	/*
> +	 * This check is technically not invalid per the specification AFAICS.
> +	 * (The host could 'guess' handles and clear them in order).
> +	 * However, this is not good behavior for the host so test it.
> +	 */
> +	if (log->clear_idx + pl->nr_recs > log->cur_idx) {
> +		dev_err(cxlds->dev,
> +			"Attempting to clear more events than returned!\n");
> +		return -EINVAL;
> +	}
> +
> +	/* Check handle order prior to clearing events */
> +	for (nr = 0, handle = get_clear_handle(log);
> +	     nr < pl->nr_recs;
> +	     nr++, handle++) {
> +		if (handle != le16_to_cpu(pl->handle[nr])) {
> +			dev_err(cxlds->dev, "Clearing events out of order\n");
> +			return -EINVAL;
> +		}
> +	}
> +
> +	/* Clear events */
> +	log->clear_idx += pl->nr_recs;
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(mock_clear_event);

Why is this exported? The caller is in the same compilation unit.

> +
> +void cxl_mock_event_trigger(struct device *dev)
> +{
> +	struct mock_event_store *mes = xa_load(&mock_dev_event_store,
> +					       (unsigned long)dev);
> +	int i;
> +
> +	for (i = CXL_EVENT_TYPE_INFO; i < CXL_EVENT_TYPE_MAX; i++) {
> +		struct mock_event_log *log;
> +
> +		log = find_event_log(dev, i);
> +		if (log)
> +			reset_event_log(log);
> +	}
> +
> +	__cxl_mem_get_event_records(mes->cxlds, mes->ev_status);
> +}
> +EXPORT_SYMBOL_GPL(cxl_mock_event_trigger);

Ditto on the export question.

> +
> +struct cxl_event_record_raw maint_needed = {
> +	.hdr = {
> +		.id = UUID_INIT(0xDEADBEEF, 0xCAFE, 0xBABE,
> +				0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
> +		.length = sizeof(struct cxl_event_record_raw),
> +		.flags[0] = CXL_EVENT_RECORD_FLAG_MAINT_NEEDED,
> +		/* .handle = Set dynamically */
> +		.related_handle = cpu_to_le16(0xa5b6),
> +	},
> +	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
> +};
> +
> +struct cxl_event_record_raw hardware_replace = {
> +	.hdr = {
> +		.id = UUID_INIT(0xBABECAFE, 0xBEEF, 0xDEAD,
> +				0xa5, 0x5a, 0xa5, 0x5a, 0xa5, 0xa5, 0x5a, 0xa5),
> +		.length = sizeof(struct cxl_event_record_raw),
> +		.flags[0] = CXL_EVENT_RECORD_FLAG_HW_REPLACE,
> +		/* .handle = Set dynamically */
> +		.related_handle = cpu_to_le16(0xb6a5),
> +	},
> +	.data = { 0xDE, 0xAD, 0xBE, 0xEF },
> +};
> +
> +u32 cxl_mock_add_event_logs(struct cxl_dev_state *cxlds)
> +{
> +	struct device *dev = cxlds->dev;
> +	struct mock_event_store *mes;
> +
> +	mes = devm_kzalloc(dev, sizeof(*mes), GFP_KERNEL);
> +	if (WARN_ON(!mes))
> +		return 0;

Why a WARN? Perhaps make this return 'struct mock_event_store *' or an
ERR_PTR.


> +	mes->cxlds = cxlds;
> +
> +	if (xa_insert(&mock_dev_event_store, (unsigned long)dev, mes,
> +		      GFP_KERNEL)) {
> +		dev_err(dev, "Event store not available for %s\n",
> +			dev_name(dev));

Downgraded to a dev_err() rather than a WARN is nice, but I think this
just wants to be something that hangs off of dev_get_drvdata(). See what
Dave and I came up with for his extra variables in the security unit
test. You might want to base on the for-6.2/cxl-security branch since
there are some other conflicts with dev_groups too.

> +		return 0;
> +	}
> +
> +	event_store_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
> +	mes->ev_status |= CXLDEV_EVENT_STATUS_INFO;
> +
> +	event_store_add_event(mes, CXL_EVENT_TYPE_FATAL, &hardware_replace);
> +	mes->ev_status |= CXLDEV_EVENT_STATUS_FATAL;
> +
> +	return mes->ev_status;
> +}
> +EXPORT_SYMBOL_GPL(cxl_mock_add_event_logs);

Another export to remove.

> +
> +void cxl_mock_remove_event_logs(struct device *dev)

Unnecessary if cxl_mock_event_store move to dev_get_drvdata().

> +{
> +	struct mock_event_store *mes;
> +
> +	mes = xa_erase(&mock_dev_event_store, (unsigned long)dev);
> +}
> +EXPORT_SYMBOL_GPL(cxl_mock_remove_event_logs);
> diff --git a/tools/testing/cxl/test/events.h b/tools/testing/cxl/test/events.h
> new file mode 100644
> index 000000000000..5bebc6a0a01b
> --- /dev/null
> +++ b/tools/testing/cxl/test/events.h
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#include <cxlmem.h>
> +
> +int mock_get_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> +int mock_clear_event(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> +u32 cxl_mock_add_event_logs(struct cxl_dev_state *cxlds);
> +void cxl_mock_remove_event_logs(struct device *dev);
> +void cxl_mock_event_trigger(struct device *dev);
> diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> index e2f5445d24ff..333fa8527a07 100644
> --- a/tools/testing/cxl/test/mem.c
> +++ b/tools/testing/cxl/test/mem.c
> @@ -8,6 +8,7 @@
>  #include <linux/sizes.h>
>  #include <linux/bits.h>
>  #include <cxlmem.h>
> +#include "events.h"
>  
>  #define LSA_SIZE SZ_128K
>  #define DEV_SIZE SZ_2G
> @@ -224,6 +225,12 @@ static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *
>  	case CXL_MBOX_OP_GET_PARTITION_INFO:
>  		rc = mock_partition_info(cxlds, cmd);
>  		break;
> +	case CXL_MBOX_OP_GET_EVENT_RECORD:
> +		rc = mock_get_event(cxlds, cmd);
> +		break;
> +	case CXL_MBOX_OP_CLEAR_EVENT_RECORD:
> +		rc = mock_clear_event(cxlds, cmd);
> +		break;
>  	case CXL_MBOX_OP_SET_LSA:
>  		rc = mock_set_lsa(cxlds, cmd);
>  		break;
> @@ -245,11 +252,27 @@ static void label_area_release(void *lsa)
>  	vfree(lsa);
>  }
>  
> +static ssize_t event_trigger_store(struct device *dev,
> +				   struct device_attribute *attr,
> +				   const char *buf, size_t count)
> +{
> +	cxl_mock_event_trigger(dev);
> +	return count;
> +}
> +static DEVICE_ATTR_WO(event_trigger);
> +
> +static struct attribute *cxl_mock_event_attrs[] = {
> +	&dev_attr_event_trigger.attr,
> +	NULL
> +};
> +ATTRIBUTE_GROUPS(cxl_mock_event);
> +
>  static int cxl_mock_mem_probe(struct platform_device *pdev)
>  {
>  	struct device *dev = &pdev->dev;
>  	struct cxl_memdev *cxlmd;
>  	struct cxl_dev_state *cxlds;
> +	u32 ev_status;
>  	void *lsa;
>  	int rc;
>  
> @@ -281,11 +304,13 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
>  	if (rc)
>  		return rc;
>  
> +	ev_status = cxl_mock_add_event_logs(cxlds);
> +
>  	cxlmd = devm_cxl_add_memdev(cxlds);
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> -	cxl_mem_get_event_records(cxlds);
> +	__cxl_mem_get_event_records(cxlds, ev_status);
>  
>  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
>  		rc = devm_cxl_add_nvdimm(dev, cxlmd);
> @@ -293,6 +318,12 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
>  	return 0;
>  }
>  
> +static int cxl_mock_mem_remove(struct platform_device *pdev)

It's moot with the comment above about moving to drvdata, but I
otherwise would have expected:

devm_add_action_or_reset(dev, cxl_mock_remove_event_logs, mes);

...in the probe path. Otherwse, I just got rid of the only .revove()
handler in the CXL subsystem the for-6.2/cxl-rch branch.

> +{
> +	cxl_mock_remove_event_logs(&pdev->dev);
> +	return 0;
> +}
> +
>  static const struct platform_device_id cxl_mock_mem_ids[] = {
>  	{ .name = "cxl_mem", },
>  	{ },
> @@ -301,9 +332,11 @@ MODULE_DEVICE_TABLE(platform, cxl_mock_mem_ids);
>  
>  static struct platform_driver cxl_mock_mem_driver = {
>  	.probe = cxl_mock_mem_probe,
> +	.remove = cxl_mock_mem_remove,
>  	.id_table = cxl_mock_mem_ids,
>  	.driver = {
>  		.name = KBUILD_MODNAME,
> +		.dev_groups = cxl_mock_event_groups,

Looks good, but you'll need to append yours to the ones that Dave added
for the security state machine.

>  	},
>  };
>  
> -- 
> 2.37.2
> 



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 01/11] cxl/pci: Add generic MSI-X/MSI irq support
  2022-12-02  2:00       ` Dan Williams
@ 2022-12-02 13:04         ` Jonathan Cameron
  0 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2022-12-02 13:04 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ira Weiny, Davidlohr Bueso, Bjorn Helgaas, Alison Schofield,
	Vishal Verma, Ben Widawsky, Steven Rostedt, Dave Jiang,
	linux-kernel, linux-cxl

On Thu, 1 Dec 2022 18:00:59 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> Ira Weiny wrote:
> > On Thu, Dec 01, 2022 at 04:23:21PM -0800, Dan Williams wrote:  
> > > ira.weiny@ wrote:  
> > > > From: Davidlohr Bueso <dave@stgolabs.net>
> > > > 
> > > > Currently the only CXL features targeted for irq support require their
> > > > message numbers to be within the first 16 entries.  The device may
> > > > however support less than 16 entries depending on the support it
> > > > provides.
> > > > 
> > > > Attempt to allocate these 16 irq vectors.  If the device supports less
> > > > then the PCI infrastructure will allocate that number.  
> > > 
> > > What happens if the device supports 16, but irq-core allocates less? I
> > > believe the answer is with the first user, but this patch does not
> > > include a user.
> > >   
> > > > Store the number of vectors actually allocated in the device state for
> > > > later use by individual functions.  
> > > 
> > > The patch does not do that.  
> > 
> > Sorry missed updating this message.
> >   
> > > 
> > > I know this patch has gone through a lot of discussion, but this
> > > mismatch shows it should really be squashed with the first user because
> > > it does not stand on its own anymore.  
> > 
> > It is separate because it was Davidlohr's to begin with.
> > 
> > I'll squash it back.
> >   
> > >   
> > > > Upon successful allocation, users can plug in their respective isr at
> > > > any point thereafter, for example, if the irq setup is not done in the
> > > > PCI driver, such as the case of the CXL-PMU.
> > > > 
> > > > Cc: Bjorn Helgaas <helgaas@kernel.org>
> > > > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > Co-developed-by: Ira Weiny <ira.weiny@intel.com>
> > > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> > > > 
> > > > ---
> > > > Changes from V1:
> > > > 	Jonathan
> > > > 		pci_alloc_irq_vectors() cleans up the vectors automatically
> > > > 		use msi_enabled rather than nr_irq_vecs
> > > > 
> > > > Changes from Ira
> > > > 	Remove reviews
> > > > 	Allocate up to a static 16 vectors.
> > > > 	Change cover letter
> > > > ---
> > > >  drivers/cxl/cxlmem.h |  3 +++
> > > >  drivers/cxl/cxlpci.h |  6 ++++++
> > > >  drivers/cxl/pci.c    | 23 +++++++++++++++++++++++
> > > >  3 files changed, 32 insertions(+)
> > > > 
> > > > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > > > index 88e3a8e54b6a..cd35f43fedd4 100644
> > > > --- a/drivers/cxl/cxlmem.h
> > > > +++ b/drivers/cxl/cxlmem.h
> > > > @@ -211,6 +211,7 @@ struct cxl_endpoint_dvsec_info {
> > > >   * @info: Cached DVSEC information about the device.
> > > >   * @serial: PCIe Device Serial Number
> > > >   * @doe_mbs: PCI DOE mailbox array
> > > > + * @msi_enabled: MSI-X/MSI has been enabled
> > > >   * @mbox_send: @dev specific transport for transmitting mailbox commands
> > > >   *
> > > >   * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
> > > > @@ -247,6 +248,8 @@ struct cxl_dev_state {
> > > >  
> > > >  	struct xarray doe_mbs;
> > > >  
> > > > +	bool msi_enabled;
> > > > +  
> > > 
> > > This goes unused in this patch and it also duplicates what the core
> > > offers with pdev->{msi,msix}_enabled.  
> > 
> > I tried to argue that with Jonathan and lost.  What I had in V1 was to store
> > the number actually allocated.  Then if a function reports something higher
> > later it can't be used.  
> 
> A successful pci_alloc_irq_vectors() call assigns a vector number to all
> interrupt sources on the device regardless of how many interrupt sources
> there are. If the device has 32 interrupt sources and 16 irqs are returned
> from pci_alloc_irq_vectors() then each interrupt source will be sharing
> a vector with one or more other vectors. All PCI IRQ vectors are shared.

Assuming my understanding is correct...
Subtle tweak to that description (not that it matters in practice).
Some of the vectors will be shared. For MSI at least it is up to the
device to assign msgnums in whatever way it likes such that they are
fit in the number that were enabled.  So it 'could' put them all on the
first msgnum if it wants to, or put any that would otherwise have been
greater than 16 on msgnum 15.  Impdef how it decides that spread.

MSIX is has a layer of indirection in control of software, so it gets
more complex...



> 
> So I do not see the point of this msi_enabled flag cxl_dev_state. If
> pci_alloc_irq_vectors() returns at least 1 then you are good to go.


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 03/11] cxl/mem: Implement Clear Event Records command
  2022-12-02  2:29   ` Dan Williams
@ 2022-12-02 13:18     ` Jonathan Cameron
  2022-12-02 13:34     ` Steven Rostedt
  2022-12-02 23:49     ` Ira Weiny
  2 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2022-12-02 13:18 UTC (permalink / raw)
  To: Dan Williams
  Cc: ira.weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl


> > +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> > +				  enum cxl_event_log_type log,
> > +				  struct cxl_get_event_payload *get_pl,
> > +				  u16 total)
> > +{
> > +	struct cxl_mbox_clear_event_payload payload = {
> > +		.event_log = log,
> > +	};
> > +	int cnt;
> > +
> > +	/*
> > +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> > +	 * Record can return up to 0xffff records.
> > +	 */
> > +	for (cnt = 0; cnt < total; /* cnt incremented internally */) {
> > +		u8 nr_recs = min_t(u8, (total - cnt),
> > +				   CXL_CLEAR_EVENT_MAX_HANDLES);  
> 
> This seems overly complicated. @total is a duplicate of
> @get_pl->record_count, and the 2 loops feel like it could be cut
> down to one.


You could do something nasty like
	for (i = 0; i < total; i++) {

		...
		payload.handle[i % CLEAR_EVENT_MAX_HANDLES] = ...
		if (i % CXL_CLEAR_EVENT_MAX_HANDLES == CXL_CLEAR_EVENT_MAX_HANDLE - 1) {
			send command.
		}
	}

but that looks worse to me than the double loop.

Making outer loop
	for (j = 0; j <= total / CXL_CLEAR_EVENT_MAX_HANDLES; j++)
might bet clearer but then you'd have to do
records[j * CXL_CLEAR_EVENT_MAX_HANDLES + i] which isn't nice.

Ah well, Ira gets to try and find a happy compromise.


...

> > diff --git a/include/uapi/linux/cxl_mem.h b/include/uapi/linux/cxl_mem.h
> > index 70459be5bdd4..7c1ad8062792 100644
> > --- a/include/uapi/linux/cxl_mem.h
> > +++ b/include/uapi/linux/cxl_mem.h
> > @@ -25,6 +25,7 @@
> >  	___C(RAW, "Raw device command"),                                  \
> >  	___C(GET_SUPPORTED_LOGS, "Get Supported Logs"),                   \
> >  	___C(GET_EVENT_RECORD, "Get Event Record"),                       \
> > +	___C(CLEAR_EVENT_RECORD, "Clear Event Record"),                   \
> >  	___C(GET_FW_INFO, "Get FW Info"),                                 \
> >  	___C(GET_PARTITION_INFO, "Get Partition Information"),            \
> >  	___C(GET_LSA, "Get Label Storage Area"),                          \  
> 
> Same, "yikes" / "must be at the end of the enum" feedback.

Macro magic makes that non obvious.. Not that I'd ever said I thought this trick
was a bad idea ;) 


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 03/11] cxl/mem: Implement Clear Event Records command
  2022-12-02  2:29   ` Dan Williams
  2022-12-02 13:18     ` Jonathan Cameron
@ 2022-12-02 13:34     ` Steven Rostedt
  2022-12-02 19:27       ` Dan Williams
  2022-12-02 23:49     ` Ira Weiny
  2 siblings, 1 reply; 64+ messages in thread
From: Steven Rostedt @ 2022-12-02 13:34 UTC (permalink / raw)
  To: Dan Williams
  Cc: ira.weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Thu, 1 Dec 2022 18:29:20 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> >  static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> >  				    enum cxl_event_log_type type)
> >  {
> > @@ -732,13 +769,22 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> >  		}
> >  
> >  		nr_rec = le16_to_cpu(payload->record_count);
> > -		if (trace_cxl_generic_event_enabled()) {
> > +		if (nr_rec > 0) {
> >  			int i;
> >  
> > -			for (i = 0; i < nr_rec; i++)
> > -				trace_cxl_generic_event(dev_name(cxlds->dev),
> > -							type,
> > -							&payload->records[i]);
> > +			if (trace_cxl_generic_event_enabled()) {  
> 
> Again, trace_cxl_generic_event_enabled() injects some awkward
> formatting here to micro-optimize looping. Any performance benefit this
> code might offer is likely offset by the extra human effort to read it.

This is commonly used throughout the kernel, and highly suggested for use to
encapsulate any work being done only for tracing, when tracing is disabled.
It uses static_braches/jump_labels which makes the loop into a 'nop' when
tracing is off. That is, there is zero overhead for the for loop below (and
there's not even a branch to skip it!)

But sure, if you really don't care as it's not a fast path, then keep it
out. I like people to keep the habit of doing this, because otherwise it
tends to creep into the fast paths.

-- Steve

> 
> > +				for (i = 0; i < nr_rec; i++)
> > +					trace_cxl_generic_event(dev_name(cxlds->dev),
> > +								type,
> > +								&payload->records[i]);
> > +			}
> > +
> > +			rc = cxl_clear_event_record(cxlds, type, payload, nr_rec);
> > +			if (rc) {
> > +				dev_err(cxlds->dev, "Event log '%s': Failed to clear events : %d",
> > +					cxl_event_log_type_str(type), rc);
> > +				return;
> > +			}
> >  		}
> >  

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 08/11] cxl/mem: Wire up event interrupts
  2022-12-02  7:37   ` Dan Williams
@ 2022-12-02 14:19     ` Jonathan Cameron
  2022-12-02 19:43       ` Dan Williams
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2022-12-02 14:19 UTC (permalink / raw)
  To: Dan Williams
  Cc: ira.weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl


> > +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> > +			     struct cxl_event_interrupt_policy *policy)
> > +{
> > +	int rc;
> > +
> > +	policy->info_settings = CXL_INT_MSI_MSIX;
> > +	policy->warn_settings = CXL_INT_MSI_MSIX;
> > +	policy->failure_settings = CXL_INT_MSI_MSIX;
> > +	policy->fatal_settings = CXL_INT_MSI_MSIX;  
> 
> I think this needs to be careful not to undo events that the BIOS
> steered to itself in firmware-first mode, which raises another question,
> does firmware-first mean more the OS needs to backoff on some event-log
> handling as well?

Hmm. Does the _OSC cover these.  There is one for Memory error reporting
that I think covers it (refers to 12.2.3.2)

Note that should cover any means of obtaining these, not just interrupt
driven - so including the initial record clear.

..

> > +
> > +static irqreturn_t cxl_event_failure_thread(int irq, void *id)
> > +{
> > +	struct cxl_dev_state *cxlds = id;
> > +
> > +	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> > +	return IRQ_HANDLED;
> > +}  
> 
> So I think one of the nice side effects of moving log priorty handling
> inside of cxl_mem_get_records_log() and looping through all log types in
> priority order until all status is clear is that an INFO interrupt also
> triggers a check of the FATAL status for free.
> 

I go the opposite way on this in thinking that an interrupt should only
ever be used to handle the things it was registered for - so we should
not be clearing fatal records in the handler triggered for info events.

Doing other actions like this relies on subtlies of the generic interrupt
handling code which happens to force interrupt threads on a shared interrupt
line to be serialized.  I'm not sure we are safe at all the interrupt
isn't shared unless we put a lock around the whole thing (we have one
because of the buffer mutex though).

If going this way I think the lock needs a rename.
It's not just protecting the buffer used, but also serialize multiple
interrupt threads.

Jonathan

> You likely do not even need to do the status read in hardirq part of the
> handler, just unconditionally wake the event handler thread. I.e. just
> pass NULL for @handler to devm_request_threaded_irq() and let the
> thread_fn figure it all out in priority order.




^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 04/11] cxl/mem: Clear events on driver load
  2022-12-02  2:48   ` Dan Williams
@ 2022-12-02 16:34     ` Ira Weiny
  2022-12-02 23:34       ` Dan Williams
  0 siblings, 1 reply; 64+ messages in thread
From: Ira Weiny @ 2022-12-02 16:34 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
	Ben Widawsky, Steven Rostedt, Davidlohr Bueso, linux-kernel,
	linux-cxl

On Thu, Dec 01, 2022 at 06:48:12PM -0800, Dan Williams wrote:
> cxl/mem is cxl_mem.ko, This is cxl/pci.
> 
> ira.weiny@ wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > The information contained in the events prior to the driver loading can
> > be queried at any time through other mailbox commands.
> > 
> > Ensure a clean slate of events by reading and clearing the events.  The
> > events are sent to the trace buffer but it is not anticipated to have
> > anyone listening to it at driver load time.
> 
> This is easy to guarantee with modprobe policy, so I am not sure it is
> worth stating.

Fair enough.  But there was some discussion early on regarding why reading and
clearing on startup was a good thing.  This showed that we chose to do that and
why we don't care.  I'll remove it.

> 
> This breakdown feels odd. I would split the trace event definitions into
> its own lead in patch since that is a pile of definitions that can be
> merged on their own. Then squash get, clear, and this patch into one
> patch as they don't have much reason to go in separately.

I agree that splitting the Get/Clear/and this patch was odd.  However,
splitting Get/Clear made the discussion on those operations easier IMO.

As a result this did not really belong in either of those patches on their own.

It is also very clearly a do one thing per patch situation.

> 
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > ---
> >  drivers/cxl/pci.c            | 2 ++
> >  tools/testing/cxl/test/mem.c | 2 ++
> >  2 files changed, 4 insertions(+)
> > 
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 8f86f85d89c7..11e95a95195a 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -521,6 +521,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (IS_ERR(cxlmd))
> >  		return PTR_ERR(cxlmd);
> >  
> > +	cxl_mem_get_event_records(cxlds);
> > +
> >  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
> >  		rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
> >  
> > diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> > index aa2df3a15051..e2f5445d24ff 100644
> > --- a/tools/testing/cxl/test/mem.c
> > +++ b/tools/testing/cxl/test/mem.c
> > @@ -285,6 +285,8 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
> >  	if (IS_ERR(cxlmd))
> >  		return PTR_ERR(cxlmd);
> >  
> > +	cxl_mem_get_event_records(cxlds);
> > +
> 
> This hunk likely goes with the first patch that actually implements some
> mocked events.

If this patch was squashed into the other patches yes.  But as a patch which
does exactly 1 thing "Clear events on driver load" it works IMO.  I could just
have well put this patch at the very end.

Now that the Get/Clear operations are more settled I'll split this out and
squash it as you suggest.  Jonathan suggested squashing Get/Clear too but again
I really prefer the 1 thing/patch and each of those operations seemed like a
good breakdown.

Ira

> 
> >  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
> >  		rc = devm_cxl_add_nvdimm(dev, cxlmd);
> >  
> > -- 
> > 2.37.2
> > 
> 
> 

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 03/11] cxl/mem: Implement Clear Event Records command
  2022-12-02 13:34     ` Steven Rostedt
@ 2022-12-02 19:27       ` Dan Williams
  2022-12-02 21:28         ` Ira Weiny
  0 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2022-12-02 19:27 UTC (permalink / raw)
  To: Steven Rostedt, Dan Williams
  Cc: ira.weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

Steven Rostedt wrote:
> On Thu, 1 Dec 2022 18:29:20 -0800
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > >  static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> > >  				    enum cxl_event_log_type type)
> > >  {
> > > @@ -732,13 +769,22 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> > >  		}
> > >  
> > >  		nr_rec = le16_to_cpu(payload->record_count);
> > > -		if (trace_cxl_generic_event_enabled()) {
> > > +		if (nr_rec > 0) {
> > >  			int i;
> > >  
> > > -			for (i = 0; i < nr_rec; i++)
> > > -				trace_cxl_generic_event(dev_name(cxlds->dev),
> > > -							type,
> > > -							&payload->records[i]);
> > > +			if (trace_cxl_generic_event_enabled()) {  
> > 
> > Again, trace_cxl_generic_event_enabled() injects some awkward
> > formatting here to micro-optimize looping. Any performance benefit this
> > code might offer is likely offset by the extra human effort to read it.
> 
> This is commonly used throughout the kernel, and highly suggested for use to
> encapsulate any work being done only for tracing, when tracing is disabled.
> It uses static_braches/jump_labels which makes the loop into a 'nop' when
> tracing is off. That is, there is zero overhead for the for loop below (and
> there's not even a branch to skip it!)
> 
> But sure, if you really don't care as it's not a fast path, then keep it
> out. I like people to keep the habit of doing this, because otherwise it
> tends to creep into the fast paths.

Duly noted. It makes a lot of sense when you are tracing in a fast path
to skip any and all preamble code. In this case we are doing it after
doing a whole series of uncached PCI mmio reads with all the stalling
and serialization that implies. 

Speaking of which, this probably wants a cond_resched() after each loop
iteration.

I'll note it is also a tracepoint that is likely to be enabled most of
the time in production.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 08/11] cxl/mem: Wire up event interrupts
  2022-12-02 14:19     ` Jonathan Cameron
@ 2022-12-02 19:43       ` Dan Williams
  2022-12-05 13:01         ` Jonathan Cameron
  0 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2022-12-02 19:43 UTC (permalink / raw)
  To: Jonathan Cameron, Dan Williams
  Cc: ira.weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

Jonathan Cameron wrote:
> 
> > > +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> > > +			     struct cxl_event_interrupt_policy *policy)
> > > +{
> > > +	int rc;
> > > +
> > > +	policy->info_settings = CXL_INT_MSI_MSIX;
> > > +	policy->warn_settings = CXL_INT_MSI_MSIX;
> > > +	policy->failure_settings = CXL_INT_MSI_MSIX;
> > > +	policy->fatal_settings = CXL_INT_MSI_MSIX;  
> > 
> > I think this needs to be careful not to undo events that the BIOS
> > steered to itself in firmware-first mode, which raises another question,
> > does firmware-first mean more the OS needs to backoff on some event-log
> > handling as well?
> 
> Hmm. Does the _OSC cover these.  There is one for Memory error reporting
> that I think covers it (refers to 12.2.3.2)
> 
> Note that should cover any means of obtaining these, not just interrupt
> driven - so including the initial record clear.
> 
> ..
> 
> > > +
> > > +static irqreturn_t cxl_event_failure_thread(int irq, void *id)
> > > +{
> > > +	struct cxl_dev_state *cxlds = id;
> > > +
> > > +	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> > > +	return IRQ_HANDLED;
> > > +}  
> > 
> > So I think one of the nice side effects of moving log priorty handling
> > inside of cxl_mem_get_records_log() and looping through all log types in
> > priority order until all status is clear is that an INFO interrupt also
> > triggers a check of the FATAL status for free.
> > 
> 
> I go the opposite way on this in thinking that an interrupt should only
> ever be used to handle the things it was registered for - so we should
> not be clearing fatal records in the handler triggered for info events.

I would agree with you if this was a fast path and if the hardware
mechanism did not involve shared status register that tells you
that both FATAL and INFO are pending retrieval through a mechanism.
Compare that to the separation between admin and IO queues in NVME.

If the handler is going to loop on the status register then it must be
careful not to starve out FATAL while processing INFO.

> Doing other actions like this relies on subtlies of the generic interrupt
> handling code which happens to force interrupt threads on a shared interrupt
> line to be serialized.  I'm not sure we are safe at all the interrupt
> isn't shared unless we put a lock around the whole thing (we have one
> because of the buffer mutex though).

The interrupt is likely shared since there is no performance benefit to
entice hardware vendors spend transistor budget on more vector space for
events. The events architecture does not merit that spend.

> If going this way I think the lock needs a rename.
> It's not just protecting the buffer used, but also serialize multiple
> interrupt threads.

I will let Ira decide if he wants to rename, but in my mind the shared
event buffer *is* the data being locked, the fact that multiple threads
might be contending for it is immaterial.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 03/11] cxl/mem: Implement Clear Event Records command
  2022-12-02 19:27       ` Dan Williams
@ 2022-12-02 21:28         ` Ira Weiny
  0 siblings, 0 replies; 64+ messages in thread
From: Ira Weiny @ 2022-12-02 21:28 UTC (permalink / raw)
  To: Dan Williams
  Cc: Steven Rostedt, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Fri, Dec 02, 2022 at 11:27:07AM -0800, Dan Williams wrote:
> Steven Rostedt wrote:
> > On Thu, 1 Dec 2022 18:29:20 -0800
> > Dan Williams <dan.j.williams@intel.com> wrote:
> > 
> > > >  static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> > > >  				    enum cxl_event_log_type type)
> > > >  {
> > > > @@ -732,13 +769,22 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> > > >  		}
> > > >  
> > > >  		nr_rec = le16_to_cpu(payload->record_count);
> > > > -		if (trace_cxl_generic_event_enabled()) {
> > > > +		if (nr_rec > 0) {
> > > >  			int i;
> > > >  
> > > > -			for (i = 0; i < nr_rec; i++)
> > > > -				trace_cxl_generic_event(dev_name(cxlds->dev),
> > > > -							type,
> > > > -							&payload->records[i]);
> > > > +			if (trace_cxl_generic_event_enabled()) {  
> > > 
> > > Again, trace_cxl_generic_event_enabled() injects some awkward
> > > formatting here to micro-optimize looping. Any performance benefit this
> > > code might offer is likely offset by the extra human effort to read it.
> > 
> > This is commonly used throughout the kernel, and highly suggested for use to
> > encapsulate any work being done only for tracing, when tracing is disabled.
> > It uses static_braches/jump_labels which makes the loop into a 'nop' when
> > tracing is off. That is, there is zero overhead for the for loop below (and
> > there's not even a branch to skip it!)
> > 
> > But sure, if you really don't care as it's not a fast path, then keep it
> > out. I like people to keep the habit of doing this, because otherwise it
> > tends to creep into the fast paths.

Thanks for chiming in here Steven.  I should have pushed back on this.

> 
> Duly noted. It makes a lot of sense when you are tracing in a fast path
> to skip any and all preamble code. In this case we are doing it after
> doing a whole series of uncached PCI mmio reads with all the stalling
> and serialization that implies. 
> 
> Speaking of which, this probably wants a cond_resched() after each loop
> iteration.
> 
> I'll note it is also a tracepoint that is likely to be enabled most of
> the time in production.

Ok I did not have any of these in there originally and I will remove them now.

Thanks!
Ira

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 02/11] cxl/mem: Implement Get Event Records command
  2022-12-02  5:00         ` Steven Rostedt
@ 2022-12-02 21:31           ` Ira Weiny
  0 siblings, 0 replies; 64+ messages in thread
From: Ira Weiny @ 2022-12-02 21:31 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dan Williams, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Fri, Dec 02, 2022 at 12:00:59AM -0500, Steven Rostedt wrote:
> On Thu, 1 Dec 2022 23:40:52 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > +#undef __print_symbolic
> > +#define __print_symbolic(value, symbol_array...)			\
> > +	({								\
> > +		static const struct trace_print_flags symbols[] =	\
> > +			{ symbol_array, { -1, NULL }};			\
> > +		__print_symbolic_str(value, symbols);			\
> > +	})
> > +
> >  #endif /* CREATE_TRACE_POINTS */
> 
> Bah, I want this outside that #ifdef
> 
> > diff --git a/include/trace/stages/stage7_class_define.h b/include/trace/stages/stage7_class_define.h
> > index 8a7ec24c246d..6fe83397f65d 100644
> > --- a/include/trace/stages/stage7_class_define.h
> > +++ b/include/trace/stages/stage7_class_define.h
> > @@ -6,7 +6,6 @@
> 
> I also don't think I need to touch stage7.
> 
> New patch:

I'm still going to defer this but will include a follow up patch to add this
back in after the bulk of the series gets in.

Thanks for helping here.  I still want to understand this all better but I have
to focus on the main features ATM.

Thanks again!
Ira

> 
> diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h
> index 00723935dcc7..9d665f634614 100644
> --- a/include/trace/define_trace.h
> +++ b/include/trace/define_trace.h
> @@ -133,3 +133,24 @@
>  #define CREATE_TRACE_POINTS
>  
>  #endif /* CREATE_TRACE_POINTS */
> +
> +#ifndef __DEFINE_PRINT_SYMBOLIC_STR
> +#define __DEFINE_PRINT_SYMBOLIC_STR
> +static inline const char *
> +__print_symbolic_str(int type, struct trace_print_flags *symbols)
> +{
> +	for (; symbols->name != NULL; symbols++) {
> +		if (type == symbols->mask)
> +			return symbols->name;
> +	}
> +	return "<invalid>";
> +}
> +#endif
> +
> +#undef __print_symbolic
> +#define __print_symbolic(value, symbol_array...)			\
> +	({								\
> +		static const struct trace_print_flags symbols[] =	\
> +			{ symbol_array, { -1, NULL }};			\
> +		__print_symbolic_str(value, symbols);			\
> +	})

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 02/11] cxl/mem: Implement Get Event Records command
  2022-12-02  1:39   ` Dan Williams
@ 2022-12-02 21:47     ` Ira Weiny
  2022-12-03 21:33       ` Dan Williams
  0 siblings, 1 reply; 64+ messages in thread
From: Ira Weiny @ 2022-12-02 21:47 UTC (permalink / raw)
  To: Dan Williams
  Cc: Steven Rostedt, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Thu, Dec 01, 2022 at 05:39:12PM -0800, Dan Williams wrote:
> ira.weiny@ wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 

[snip]

> >  
> > +#define CREATE_TRACE_POINTS
> > +#include <trace/events/cxl.h>
> > +
> >  #include "core.h"
> >  
> >  static bool cxl_raw_allow_all;
> > @@ -48,6 +51,7 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
> >  	CXL_CMD(RAW, CXL_VARIABLE_PAYLOAD, CXL_VARIABLE_PAYLOAD, 0),
> >  #endif
> >  	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
> > +	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
> >  	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
> >  	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
> >  	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
> 
> Similar to this patch:
> 
> https://lore.kernel.org/linux-cxl/166993221008.1995348.11651567302609703175.stgit@dwillia2-xfh.jf.intel.com/
> 
> CXL_MEM_COMMAND_ID_GET_EVENT_RECORD, should be added to the "always
> kernel" / cxlds->exclusive_cmds mask.

Done for all the commands.  I'll rebase as well before sending this out.

> 
> > @@ -704,6 +708,106 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
> >  }
> >  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
> >  
> > +static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> > +				    enum cxl_event_log_type type)
> > +{
> > +	struct cxl_get_event_payload *payload;
> > +	u16 nr_rec;
> > +
> > +	mutex_lock(&cxlds->event_buf_lock);
> > +
> > +	payload = cxlds->event_buf;
> > +
> > +	do {
> > +		u8 log_type = type;
> > +		int rc;
> > +
> > +		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVENT_RECORD,
> > +				       &log_type, sizeof(log_type),
> > +				       payload, cxlds->payload_size);
> > +		if (rc) {
> > +			dev_err(cxlds->dev, "Event log '%s': Failed to query event records : %d",
> > +				cxl_event_log_type_str(type), rc);
> > +			goto unlock_buffer;
> > +		}
> > +
> > +		nr_rec = le16_to_cpu(payload->record_count);
> > +		if (trace_cxl_generic_event_enabled()) {
> 
> This feels like a premature micro-optimization as none of this code is
> fast path and it is dwarfed by the cost of executing the mailbox
> command. I started with trying to reduce the 80 column collision
> pressure, but then stepped back even further and asked, why?

Because Steven told me to.  :-(  I should have been smarter than that.

> 
> > +			int i;
> > +
> > +			for (i = 0; i < nr_rec; i++)
> > +				trace_cxl_generic_event(dev_name(cxlds->dev),
> > +							type,
> > +							&payload->records[i]);
> 
> As far as I can tell trace_cxl_generic_event() always expects a
> device-name as its first argument. So why not enforce that with
> type-safety?  I.e. I think trace_cxl_generic_event() should take a
> "struct device *", not a string unless it is really the case that any
> old string will do as the first argument to the trace event. Otherwise
> the trace point can do "__string(dev_name, dev_name(dev))", and mandate
> that callers pass devices.

From a trace point view 'any old string' will do.  There was nothing else the
trace needed from struct device so I skipped it.

[snip]

> > +
> > +/**
> > + * cxl_mem_get_event_records - Get Event Records from the device
> > + * @cxlds: The device data for the operation
> > + *
> > + * Retrieve all event records available on the device and report them as trace
> > + * events.
> > + *
> > + * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> > + */
> > +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> > +{
> > +	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > +
> > +	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
> > +
> > +	if (!cxlds->event_buf) {
> > +		cxlds->event_buf = alloc_event_buf(cxlds);
> > +		if (WARN_ON_ONCE(!cxlds->event_buf))
> > +			return;
> > +	}
> 
> What's the point of having an event_buf_lock if event_buf is reallocated
> every call?

This is only called on start up.

> 
> Just allocate event_buf once at the beginning of time during init if the
> device supports event log retrieval, and fail the driver load if that
> allocation fails. No runtime WARN() for memory allocation.

It was.  I'll make that more clear in the next series.

> 
> I notice this patch does not clear events, I trust that comes later in
> the series, but I think it belongs here to make this patch a complete
> standalone thought.

Squashed.  But it does make for a large patch.  Which I'm not a fan of for
review.  Lucky that now we have a lot of review on the parts.

> 
> > +	if (status & CXLDEV_EVENT_STATUS_INFO)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> > +	if (status & CXLDEV_EVENT_STATUS_WARN)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> > +	if (status & CXLDEV_EVENT_STATUS_FAIL)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> > +	if (status & CXLDEV_EVENT_STATUS_FATAL)
> > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
> 
> This retrieval order should be flipped. If there is a FATAL pending I
> expect a monitor wants that first and would be happy to parse the INFO
> later. I would go so far as to say that if the INFO logger is looping
> and a FATAL comes in the driver should get that out first before going
> back for more INFO logs. That would mean executing Clear Events and
> looping through the logs by priority until all the status bits fall
> silent inside cxl_mem_get_records_log().

I'll flip them.  And determine if this is really what we want to do for the
irq.

The issue with the irq handling calling a single function which checks all
status is that we may end up with some odd interrupts doing nothing depending
on racing etc.

A buffer per log would eliminate this to some extent if the message numbers are
not shared.  I don't doubt that vendors are unlikely to burn more than 1
message number so the irq may indeed be shared and serialized anyway.

For simplicity I'll throw them all together.

> 
> > +}
> > +EXPORT_SYMBOL_NS_GPL(cxl_mem_get_event_records, CXL);
> > +
> >  /**
> >   * cxl_mem_get_partition_info - Get partition info
> >   * @cxlds: The device data for the operation
> > @@ -846,6 +950,7 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
> >  	}
> >  
> >  	mutex_init(&cxlds->mbox_mutex);
> > +	mutex_init(&cxlds->event_buf_lock);
> >  	cxlds->dev = dev;
> >  
> >  	return cxlds;
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index f680450f0b16..d4baae74cd97 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -132,6 +132,13 @@ static inline int ways_to_cxl(unsigned int ways, u8 *iw)
> >  #define CXLDEV_CAP_CAP_ID_SECONDARY_MAILBOX 0x3
> >  #define CXLDEV_CAP_CAP_ID_MEMDEV 0x4000
> >  
> > +/* CXL 3.0 8.2.8.3.1 Event Status Register */
> > +#define CXLDEV_DEV_EVENT_STATUS_OFFSET		0x00
> > +#define CXLDEV_EVENT_STATUS_INFO		BIT(0)
> > +#define CXLDEV_EVENT_STATUS_WARN		BIT(1)
> > +#define CXLDEV_EVENT_STATUS_FAIL		BIT(2)
> > +#define CXLDEV_EVENT_STATUS_FATAL		BIT(3)
> > +
> >  /* CXL 2.0 8.2.8.4 Mailbox Registers */
> >  #define CXLDEV_MBOX_CAPS_OFFSET 0x00
> >  #define   CXLDEV_MBOX_CAP_PAYLOAD_SIZE_MASK GENMASK(4, 0)
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index cd35f43fedd4..55d57f5a64bc 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -4,6 +4,7 @@
> >  #define __CXL_MEM_H__
> >  #include <uapi/linux/cxl_mem.h>
> >  #include <linux/cdev.h>
> > +#include <linux/uuid.h>
> >  #include "cxl.h"
> >  
> >  /* CXL 2.0 8.2.8.5.1.1 Memory Device Status Register */
> > @@ -250,12 +251,16 @@ struct cxl_dev_state {
> >  
> >  	bool msi_enabled;
> >  
> > +	struct cxl_get_event_payload *event_buf;
> > +	struct mutex event_buf_lock;
> > +
> 
> Missing kdoc.

Got it from Jonathan.

> 
> >  	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
> >  };
> >  
> >  enum cxl_opcode {
> >  	CXL_MBOX_OP_INVALID		= 0x0000,
> >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> > +	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> > @@ -325,6 +330,72 @@ struct cxl_mbox_identify {
> >  	u8 qos_telemetry_caps;
> >  } __packed;
> >  
> > +/*
> > + * Common Event Record Format
> > + * CXL rev 3.0 section 8.2.9.2.1; Table 8-42
> > + */
> > +struct cxl_event_record_hdr {
> > +	uuid_t id;
> > +	u8 length;
> > +	u8 flags[3];
> > +	__le16 handle;
> > +	__le16 related_handle;
> > +	__le64 timestamp;
> > +	u8 maint_op_class;
> > +	u8 reserved[0xf];
> 
> Nit, but lets not copy the spec here and just make all the field sizes
> decimal. It even saves a character to write 15 instead of 0xf, and @flags
> is also decimal.

Ok.

> 
> > +} __packed;
> > +
> > +#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
> > +struct cxl_event_record_raw {
> > +	struct cxl_event_record_hdr hdr;
> > +	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
> > +} __packed;
> > +
> > +/*
> > + * Get Event Records output payload
> > + * CXL rev 3.0 section 8.2.9.2.2; Table 8-50
> > + */
> > +#define CXL_GET_EVENT_FLAG_OVERFLOW		BIT(0)
> > +#define CXL_GET_EVENT_FLAG_MORE_RECORDS		BIT(1)
> > +struct cxl_get_event_payload {
> > +	u8 flags;
> > +	u8 reserved1;
> > +	__le16 overflow_err_count;
> > +	__le64 first_overflow_timestamp;
> > +	__le64 last_overflow_timestamp;
> > +	__le16 record_count;
> > +	u8 reserved2[0xa];
> 
> Same nit.

Done.

[snip]

> > +
> > +/* This part must be outside protection */
> > +#undef TRACE_INCLUDE_FILE
> > +#define TRACE_INCLUDE_FILE cxl
> > +#include <trace/define_trace.h>
> > diff --git a/include/uapi/linux/cxl_mem.h b/include/uapi/linux/cxl_mem.h
> > index c71021a2a9ed..70459be5bdd4 100644
> > --- a/include/uapi/linux/cxl_mem.h
> > +++ b/include/uapi/linux/cxl_mem.h
> > @@ -24,6 +24,7 @@
> >  	___C(IDENTIFY, "Identify Command"),                               \
> >  	___C(RAW, "Raw device command"),                                  \
> >  	___C(GET_SUPPORTED_LOGS, "Get Supported Logs"),                   \
> > +	___C(GET_EVENT_RECORD, "Get Event Record"),                       \
> >  	___C(GET_FW_INFO, "Get FW Info"),                                 \
> >  	___C(GET_PARTITION_INFO, "Get Partition Information"),            \
> >  	___C(GET_LSA, "Get Label Storage Area"),                          \
> 
> Yikes, no, this is an enum. New commands need to come at the end
> otherwise different kernels will disagree about the command numbering.

Ooops sorry about that.  I somehow thought these were tied to the command
values.

Thanks, Changed for all the commands,

> Likely needs a comment here alerting future devs about the ABI breakage
> danger here.

Additional patch added.

Ira

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 04/11] cxl/mem: Clear events on driver load
  2022-12-02 16:34     ` Ira Weiny
@ 2022-12-02 23:34       ` Dan Williams
  2022-12-03 21:00         ` Ira Weiny
  0 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2022-12-02 23:34 UTC (permalink / raw)
  To: Ira Weiny, Dan Williams
  Cc: Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
	Ben Widawsky, Steven Rostedt, Davidlohr Bueso, linux-kernel,
	linux-cxl

Ira Weiny wrote:
> On Thu, Dec 01, 2022 at 06:48:12PM -0800, Dan Williams wrote:
> > cxl/mem is cxl_mem.ko, This is cxl/pci.
> > 
> > ira.weiny@ wrote:
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > The information contained in the events prior to the driver loading can
> > > be queried at any time through other mailbox commands.
> > > 
> > > Ensure a clean slate of events by reading and clearing the events.  The
> > > events are sent to the trace buffer but it is not anticipated to have
> > > anyone listening to it at driver load time.
> > 
> > This is easy to guarantee with modprobe policy, so I am not sure it is
> > worth stating.
> 
> Fair enough.  But there was some discussion early on regarding why reading and
> clearing on startup was a good thing.  This showed that we chose to do that and
> why we don't care.  I'll remove it.
> 
> > 
> > This breakdown feels odd. I would split the trace event definitions into
> > its own lead in patch since that is a pile of definitions that can be
> > merged on their own. Then squash get, clear, and this patch into one
> > patch as they don't have much reason to go in separately.
> 
> I agree that splitting the Get/Clear/and this patch was odd.  However,
> splitting Get/Clear made the discussion on those operations easier IMO.
> 
> As a result this did not really belong in either of those patches on their own.
> 
> It is also very clearly a do one thing per patch situation.
> 
> > 
> > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> > > Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> > > ---
> > >  drivers/cxl/pci.c            | 2 ++
> > >  tools/testing/cxl/test/mem.c | 2 ++
> > >  2 files changed, 4 insertions(+)
> > > 
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index 8f86f85d89c7..11e95a95195a 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -521,6 +521,8 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  	if (IS_ERR(cxlmd))
> > >  		return PTR_ERR(cxlmd);
> > >  
> > > +	cxl_mem_get_event_records(cxlds);
> > > +
> > >  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
> > >  		rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
> > >  
> > > diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> > > index aa2df3a15051..e2f5445d24ff 100644
> > > --- a/tools/testing/cxl/test/mem.c
> > > +++ b/tools/testing/cxl/test/mem.c
> > > @@ -285,6 +285,8 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
> > >  	if (IS_ERR(cxlmd))
> > >  		return PTR_ERR(cxlmd);
> > >  
> > > +	cxl_mem_get_event_records(cxlds);
> > > +
> > 
> > This hunk likely goes with the first patch that actually implements some
> > mocked events.
> 
> If this patch was squashed into the other patches yes.  But as a patch which
> does exactly 1 thing "Clear events on driver load" it works IMO.  I could just
> have well put this patch at the very end.
> 
> Now that the Get/Clear operations are more settled I'll split this out and
> squash it as you suggest.  Jonathan suggested squashing Get/Clear too but again
> I really prefer the 1 thing/patch and each of those operations seemed like a
> good breakdown.
> 

I'll preface this by saying if you ask 3 kernel developers how to split
a patch series you'll get 5 answers. For me though, a patch should be a
bisectable full-thought. That at each step of a series the kernel is
incrementally better in a way that makes sense. The kernel that gets Get
Events likely needs to clear them too to complete 1 full thought about
enbling Event handling. Otherwise a kernel that just retrieves some
events until they overflow feels like a POC.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 03/11] cxl/mem: Implement Clear Event Records command
  2022-12-02  2:29   ` Dan Williams
  2022-12-02 13:18     ` Jonathan Cameron
  2022-12-02 13:34     ` Steven Rostedt
@ 2022-12-02 23:49     ` Ira Weiny
  2022-12-03  1:14       ` Dan Williams
  2 siblings, 1 reply; 64+ messages in thread
From: Ira Weiny @ 2022-12-02 23:49 UTC (permalink / raw)
  To: Dan Williams
  Cc: Alison Schofield, Vishal Verma, Ben Widawsky, Steven Rostedt,
	Jonathan Cameron, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Thu, Dec 01, 2022 at 06:29:20PM -0800, Dan Williams wrote:
> ira.weiny@ wrote:
> > From: Ira Weiny <ira.weiny@intel.com>
> > 
> > CXL rev 3.0 section 8.2.9.2.3 defines the Clear Event Records mailbox
> > command.  After an event record is read it needs to be cleared from the
> > event log.
> > 
> > Implement cxl_clear_event_record() to clear all record retrieved from
> > the device.
> > 
> > Each record is cleared explicitly.  A clear all bit is specified but
> > events could arrive between a get and any final clear all operation.
> > This means events would be missed.
> > Therefore each event is cleared specifically.
> 
> Note that the spec has a better reason for why Clear All has limited
> usage:
> 
> "Clear All Events is only allowed when the Event Log has overflowed;
> otherwise, the device shall return Invalid Input."
> 
> Will need to wait and see if we need that to keep pace with a device
> with a high event frequency.

Perhaps.  But yea I would wait and see.

[snip]

> > +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> > +				  enum cxl_event_log_type log,
> > +				  struct cxl_get_event_payload *get_pl,
> > +				  u16 total)
> > +{
> > +	struct cxl_mbox_clear_event_payload payload = {
> > +		.event_log = log,
> > +	};
> > +	int cnt;
> > +
> > +	/*
> > +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> > +	 * Record can return up to 0xffff records.
> > +	 */
> > +	for (cnt = 0; cnt < total; /* cnt incremented internally */) {
> > +		u8 nr_recs = min_t(u8, (total - cnt),
> > +				   CXL_CLEAR_EVENT_MAX_HANDLES);
> 
> This seems overly complicated. @total is a duplicate of
> @get_pl->record_count, and the 2 loops feel like it could be cut
> down to one.

Sure, total is redundant to pass to the function.

However, 2 loops is IMO not at all overly complicated.  Note that the 2 loops
do not do the same thing.  The inner loop is filling in the payload for the
Clear command.  There is really no way around doing this.

Now that I've had time to think about it:

	Are you suggesting we issue a single mailbox command for every handle?

That would be a single loop.  But a lot more mailbox commands.

> 
> > +		int i, rc;
> > +
> > +		for (i = 0; i < nr_recs; i++, cnt++) {
> > +			payload.handle[i] = get_pl->records[cnt].hdr.handle;
> > +			dev_dbg(cxlds->dev, "Event log '%s': Clearning %u\n",
> 
> While I do think this operation is a mix of clearing and cleaning event
> records, I don't think "Clearning" is a word.

LOL...  I'll fix it.  :-D

> 
> > +				cxl_event_log_type_str(log),
> > +				le16_to_cpu(payload.handle[i]));
> > +		}
> > +		payload.nr_recs = nr_recs;
> > +
> > +		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_CLEAR_EVENT_RECORD,
> > +				       &payload, sizeof(payload), NULL, 0);
> > +		if (rc)
> > +			return rc;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> >  static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> >  				    enum cxl_event_log_type type)
> >  {
> > @@ -732,13 +769,22 @@ static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> >  		}
> >  
> >  		nr_rec = le16_to_cpu(payload->record_count);
> > -		if (trace_cxl_generic_event_enabled()) {
> > +		if (nr_rec > 0) {
> >  			int i;
> >  
> > -			for (i = 0; i < nr_rec; i++)
> > -				trace_cxl_generic_event(dev_name(cxlds->dev),
> > -							type,
> > -							&payload->records[i]);
> > +			if (trace_cxl_generic_event_enabled()) {
> 
> Again, trace_cxl_generic_event_enabled() injects some awkward
> formatting here to micro-optimize looping. Any performance benefit this
> code might offer is likely offset by the extra human effort to read it.

Agreed.  Gone.

> 
> > +				for (i = 0; i < nr_rec; i++)
> > +					trace_cxl_generic_event(dev_name(cxlds->dev),
> > +								type,
> > +								&payload->records[i]);
> > +			}
> > +
> > +			rc = cxl_clear_event_record(cxlds, type, payload, nr_rec);
> > +			if (rc) {
> > +				dev_err(cxlds->dev, "Event log '%s': Failed to clear events : %d",
> > +					cxl_event_log_type_str(type), rc);
> > +				return;
> > +			}
> >  		}
> >  
> >  		if (trace_cxl_overflow_enabled() &&
> > @@ -780,10 +826,11 @@ static struct cxl_get_event_payload *alloc_event_buf(struct cxl_dev_state *cxlds
> >   * cxl_mem_get_event_records - Get Event Records from the device
> >   * @cxlds: The device data for the operation
> >   *
> > - * Retrieve all event records available on the device and report them as trace
> > - * events.
> > + * Retrieve all event records available on the device, report them as trace
> > + * events, and clear them.
> >   *
> >   * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> > + * See CXL rev 3.0 @8.2.9.2.3 Clear Event Records
> >   */
> >  void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> >  {
> > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> > index 55d57f5a64bc..1ae9962c5a06 100644
> > --- a/drivers/cxl/cxlmem.h
> > +++ b/drivers/cxl/cxlmem.h
> > @@ -261,6 +261,7 @@ enum cxl_opcode {
> >  	CXL_MBOX_OP_INVALID		= 0x0000,
> >  	CXL_MBOX_OP_RAW			= CXL_MBOX_OP_INVALID,
> >  	CXL_MBOX_OP_GET_EVENT_RECORD	= 0x0100,
> > +	CXL_MBOX_OP_CLEAR_EVENT_RECORD	= 0x0101,
> >  	CXL_MBOX_OP_GET_FW_INFO		= 0x0200,
> >  	CXL_MBOX_OP_ACTIVATE_FW		= 0x0202,
> >  	CXL_MBOX_OP_GET_SUPPORTED_LOGS	= 0x0400,
> > @@ -396,6 +397,19 @@ static inline const char *cxl_event_log_type_str(enum cxl_event_log_type type)
> >  	return "<unknown>";
> >  }
> >  
> > +/*
> > + * Clear Event Records input payload
> > + * CXL rev 3.0 section 8.2.9.2.3; Table 8-51
> > + */
> > +#define CXL_CLEAR_EVENT_MAX_HANDLES (0xff)
> > +struct cxl_mbox_clear_event_payload {
> > +	u8 event_log;		/* enum cxl_event_log_type */
> > +	u8 clear_flags;
> > +	u8 nr_recs;
> > +	u8 reserved[3];
> > +	__le16 handle[CXL_CLEAR_EVENT_MAX_HANDLES];
> > +};
> > +
> >  struct cxl_mbox_get_partition_info {
> >  	__le64 active_volatile_cap;
> >  	__le64 active_persistent_cap;
> > diff --git a/include/uapi/linux/cxl_mem.h b/include/uapi/linux/cxl_mem.h
> > index 70459be5bdd4..7c1ad8062792 100644
> > --- a/include/uapi/linux/cxl_mem.h
> > +++ b/include/uapi/linux/cxl_mem.h
> > @@ -25,6 +25,7 @@
> >  	___C(RAW, "Raw device command"),                                  \
> >  	___C(GET_SUPPORTED_LOGS, "Get Supported Logs"),                   \
> >  	___C(GET_EVENT_RECORD, "Get Event Record"),                       \
> > +	___C(CLEAR_EVENT_RECORD, "Clear Event Record"),                   \
> >  	___C(GET_FW_INFO, "Get FW Info"),                                 \
> >  	___C(GET_PARTITION_INFO, "Get Partition Information"),            \
> >  	___C(GET_LSA, "Get Label Storage Area"),                          \
> 
> Same, "yikes" / "must be at the end of the enum" feedback.

Yep,
Ira

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 03/11] cxl/mem: Implement Clear Event Records command
  2022-12-02 23:49     ` Ira Weiny
@ 2022-12-03  1:14       ` Dan Williams
  2022-12-06  7:35         ` Ira Weiny
  0 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2022-12-03  1:14 UTC (permalink / raw)
  To: Ira Weiny, Dan Williams
  Cc: Alison Schofield, Vishal Verma, Ben Widawsky, Steven Rostedt,
	Jonathan Cameron, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

Ira Weiny wrote:
> On Thu, Dec 01, 2022 at 06:29:20PM -0800, Dan Williams wrote:
> > ira.weiny@ wrote:
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> > > CXL rev 3.0 section 8.2.9.2.3 defines the Clear Event Records mailbox
> > > command.  After an event record is read it needs to be cleared from the
> > > event log.
> > > 
> > > Implement cxl_clear_event_record() to clear all record retrieved from
> > > the device.
> > > 
> > > Each record is cleared explicitly.  A clear all bit is specified but
> > > events could arrive between a get and any final clear all operation.
> > > This means events would be missed.
> > > Therefore each event is cleared specifically.
> > 
> > Note that the spec has a better reason for why Clear All has limited
> > usage:
> > 
> > "Clear All Events is only allowed when the Event Log has overflowed;
> > otherwise, the device shall return Invalid Input."
> > 
> > Will need to wait and see if we need that to keep pace with a device
> > with a high event frequency.
> 
> Perhaps.  But yea I would wait and see.
> 
> [snip]
> 
> > > +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> > > +				  enum cxl_event_log_type log,
> > > +				  struct cxl_get_event_payload *get_pl,
> > > +				  u16 total)
> > > +{
> > > +	struct cxl_mbox_clear_event_payload payload = {
> > > +		.event_log = log,
> > > +	};
> > > +	int cnt;
> > > +
> > > +	/*
> > > +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> > > +	 * Record can return up to 0xffff records.
> > > +	 */
> > > +	for (cnt = 0; cnt < total; /* cnt incremented internally */) {
> > > +		u8 nr_recs = min_t(u8, (total - cnt),
> > > +				   CXL_CLEAR_EVENT_MAX_HANDLES);
> > 
> > This seems overly complicated. @total is a duplicate of
> > @get_pl->record_count, and the 2 loops feel like it could be cut
> > down to one.
> 
> Sure, total is redundant to pass to the function.
> 
> However, 2 loops is IMO not at all overly complicated.  Note that the 2 loops
> do not do the same thing.  The inner loop is filling in the payload for the
> Clear command.  There is really no way around doing this.
> 
> Now that I've had time to think about it:
> 
> 	Are you suggesting we issue a single mailbox command for every handle?
> 
> That would be a single loop.  But a lot more mailbox commands.

I was thinking something like this pseudo code

int tosend = le16_to_cpu(get_pl->record_count);
int added = 0;

    for (i = 0; i < tosend; i++) {
    	add_to_clear(added++);
    	if (added == MAX)
    		send_mailbox();
	added = 0;
    }

    if (added)
    	send_mailbox();

...where it batches and sends every 256 and one more send afterwards for
any stragglers.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 04/11] cxl/mem: Clear events on driver load
  2022-12-02 23:34       ` Dan Williams
@ 2022-12-03 21:00         ` Ira Weiny
  0 siblings, 0 replies; 64+ messages in thread
From: Ira Weiny @ 2022-12-03 21:00 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jonathan Cameron, Dave Jiang, Alison Schofield, Vishal Verma,
	Ben Widawsky, Steven Rostedt, Davidlohr Bueso, linux-kernel,
	linux-cxl

On Fri, Dec 02, 2022 at 03:34:20PM -0800, Dan Williams wrote:
> Ira Weiny wrote:
> > On Thu, Dec 01, 2022 at 06:48:12PM -0800, Dan Williams wrote:
> > > cxl/mem is cxl_mem.ko, This is cxl/pci.

[snip]

> > > > +	cxl_mem_get_event_records(cxlds);
> > > > +
> > > 
> > > This hunk likely goes with the first patch that actually implements some
> > > mocked events.
> > 
> > If this patch was squashed into the other patches yes.  But as a patch which
> > does exactly 1 thing "Clear events on driver load" it works IMO.  I could just
> > have well put this patch at the very end.
> > 
> > Now that the Get/Clear operations are more settled I'll split this out and
> > squash it as you suggest.  Jonathan suggested squashing Get/Clear too but again
> > I really prefer the 1 thing/patch and each of those operations seemed like a
> > good breakdown.
> > 
> 
> I'll preface this by saying if you ask 3 kernel developers how to split
> a patch series you'll get 5 answers.

Indeed.

> For me though, a patch should be a
> bisectable full-thought. That at each step of a series the kernel is
> incrementally better in a way that makes sense. The kernel that gets Get
> Events likely needs to clear them too to complete 1 full thought about
> enbling Event handling. Otherwise a kernel that just retrieves some
> events until they overflow feels like a POC.

I've squashed it.

Ira

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 02/11] cxl/mem: Implement Get Event Records command
  2022-12-02 21:47     ` Ira Weiny
@ 2022-12-03 21:33       ` Dan Williams
  0 siblings, 0 replies; 64+ messages in thread
From: Dan Williams @ 2022-12-03 21:33 UTC (permalink / raw)
  To: Ira Weiny, Dan Williams
  Cc: Steven Rostedt, Alison Schofield, Vishal Verma, Ben Widawsky,
	Jonathan Cameron, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

Ira Weiny wrote:
> On Thu, Dec 01, 2022 at 05:39:12PM -0800, Dan Williams wrote:
> > ira.weiny@ wrote:
> > > From: Ira Weiny <ira.weiny@intel.com>
> > > 
> 
> [snip]
> 
> > >  
> > > +#define CREATE_TRACE_POINTS
> > > +#include <trace/events/cxl.h>
> > > +
> > >  #include "core.h"
> > >  
> > >  static bool cxl_raw_allow_all;
> > > @@ -48,6 +51,7 @@ static struct cxl_mem_command cxl_mem_commands[CXL_MEM_COMMAND_ID_MAX] = {
> > >  	CXL_CMD(RAW, CXL_VARIABLE_PAYLOAD, CXL_VARIABLE_PAYLOAD, 0),
> > >  #endif
> > >  	CXL_CMD(GET_SUPPORTED_LOGS, 0, CXL_VARIABLE_PAYLOAD, CXL_CMD_FLAG_FORCE_ENABLE),
> > > +	CXL_CMD(GET_EVENT_RECORD, 1, CXL_VARIABLE_PAYLOAD, 0),
> > >  	CXL_CMD(GET_FW_INFO, 0, 0x50, 0),
> > >  	CXL_CMD(GET_PARTITION_INFO, 0, 0x20, 0),
> > >  	CXL_CMD(GET_LSA, 0x8, CXL_VARIABLE_PAYLOAD, 0),
> > 
> > Similar to this patch:
> > 
> > https://lore.kernel.org/linux-cxl/166993221008.1995348.11651567302609703175.stgit@dwillia2-xfh.jf.intel.com/
> > 
> > CXL_MEM_COMMAND_ID_GET_EVENT_RECORD, should be added to the "always
> > kernel" / cxlds->exclusive_cmds mask.
> 
> Done for all the commands.  I'll rebase as well before sending this out.
> 
> > 
> > > @@ -704,6 +708,106 @@ int cxl_enumerate_cmds(struct cxl_dev_state *cxlds)
> > >  }
> > >  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
> > >  
> > > +static void cxl_mem_get_records_log(struct cxl_dev_state *cxlds,
> > > +				    enum cxl_event_log_type type)
> > > +{
> > > +	struct cxl_get_event_payload *payload;
> > > +	u16 nr_rec;
> > > +
> > > +	mutex_lock(&cxlds->event_buf_lock);
> > > +
> > > +	payload = cxlds->event_buf;
> > > +
> > > +	do {
> > > +		u8 log_type = type;
> > > +		int rc;
> > > +
> > > +		rc = cxl_mbox_send_cmd(cxlds, CXL_MBOX_OP_GET_EVENT_RECORD,
> > > +				       &log_type, sizeof(log_type),
> > > +				       payload, cxlds->payload_size);
> > > +		if (rc) {
> > > +			dev_err(cxlds->dev, "Event log '%s': Failed to query event records : %d",
> > > +				cxl_event_log_type_str(type), rc);
> > > +			goto unlock_buffer;
> > > +		}
> > > +
> > > +		nr_rec = le16_to_cpu(payload->record_count);
> > > +		if (trace_cxl_generic_event_enabled()) {
> > 
> > This feels like a premature micro-optimization as none of this code is
> > fast path and it is dwarfed by the cost of executing the mailbox
> > command. I started with trying to reduce the 80 column collision
> > pressure, but then stepped back even further and asked, why?
> 
> Because Steven told me to.  :-(  I should have been smarter than that.

You did the right thing, I failed to jump in sooner on this set.

> 
> > 
> > > +			int i;
> > > +
> > > +			for (i = 0; i < nr_rec; i++)
> > > +				trace_cxl_generic_event(dev_name(cxlds->dev),
> > > +							type,
> > > +							&payload->records[i]);
> > 
> > As far as I can tell trace_cxl_generic_event() always expects a
> > device-name as its first argument. So why not enforce that with
> > type-safety?  I.e. I think trace_cxl_generic_event() should take a
> > "struct device *", not a string unless it is really the case that any
> > old string will do as the first argument to the trace event. Otherwise
> > the trace point can do "__string(dev_name, dev_name(dev))", and mandate
> > that callers pass devices.
> 
> From a trace point view 'any old string' will do.  There was nothing else the
> trace needed from struct device so I skipped it.

I'd prefer more fine-grained type safety wherever possible.

> 
> [snip]
> 
> > > +
> > > +/**
> > > + * cxl_mem_get_event_records - Get Event Records from the device
> > > + * @cxlds: The device data for the operation
> > > + *
> > > + * Retrieve all event records available on the device and report them as trace
> > > + * events.
> > > + *
> > > + * See CXL rev 3.0 @8.2.9.2.2 Get Event Records
> > > + */
> > > +void cxl_mem_get_event_records(struct cxl_dev_state *cxlds)
> > > +{
> > > +	u32 status = readl(cxlds->regs.status + CXLDEV_DEV_EVENT_STATUS_OFFSET);
> > > +
> > > +	dev_dbg(cxlds->dev, "Reading event logs: %x\n", status);
> > > +
> > > +	if (!cxlds->event_buf) {
> > > +		cxlds->event_buf = alloc_event_buf(cxlds);
> > > +		if (WARN_ON_ONCE(!cxlds->event_buf))
> > > +			return;
> > > +	}
> > 
> > What's the point of having an event_buf_lock if event_buf is reallocated
> > every call?
> 
> This is only called on start up.

cxl_mem_get_event_records() is called all the time. The place to
allocate buffers attached to 'struct cxl_dev_state' at start up is
cxl_dev_state_create(), or sometime after cxl_enumerate_cmds() if you
want to wait and see if the device supports events and CXL _OSC says the
driver can drive events.

> > Just allocate event_buf once at the beginning of time during init if the
> > device supports event log retrieval, and fail the driver load if that
> > allocation fails. No runtime WARN() for memory allocation.
> 
> It was.  I'll make that more clear in the next series.
> 
> > 
> > I notice this patch does not clear events, I trust that comes later in
> > the series, but I think it belongs here to make this patch a complete
> > standalone thought.
> 
> Squashed.  But it does make for a large patch.  Which I'm not a fan of for
> review.  Lucky that now we have a lot of review on the parts.
> 
> > 
> > > +	if (status & CXLDEV_EVENT_STATUS_INFO)
> > > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_INFO);
> > > +	if (status & CXLDEV_EVENT_STATUS_WARN)
> > > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_WARN);
> > > +	if (status & CXLDEV_EVENT_STATUS_FAIL)
> > > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> > > +	if (status & CXLDEV_EVENT_STATUS_FATAL)
> > > +		cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FATAL);
> > 
> > This retrieval order should be flipped. If there is a FATAL pending I
> > expect a monitor wants that first and would be happy to parse the INFO
> > later. I would go so far as to say that if the INFO logger is looping
> > and a FATAL comes in the driver should get that out first before going
> > back for more INFO logs. That would mean executing Clear Events and
> > looping through the logs by priority until all the status bits fall
> > silent inside cxl_mem_get_records_log().
> 
> I'll flip them.  And determine if this is really what we want to do for the
> irq.
> 
> The issue with the irq handling calling a single function which checks all
> status is that we may end up with some odd interrupts doing nothing depending
> on racing etc.

If an event handler wakes and reads 0-status bits because another
handler did it then return IRQ_HANDLED. You'll have this problem whether
you have a central function or not, because there's only one status
register for multiple sources.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 08/11] cxl/mem: Wire up event interrupts
  2022-12-02 19:43       ` Dan Williams
@ 2022-12-05 13:01         ` Jonathan Cameron
  2022-12-05 16:35           ` Dan Williams
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2022-12-05 13:01 UTC (permalink / raw)
  To: Dan Williams
  Cc: ira.weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Fri, 2 Dec 2022 11:43:29 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> Jonathan Cameron wrote:
> >   
> > > > +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> > > > +			     struct cxl_event_interrupt_policy *policy)
> > > > +{
> > > > +	int rc;
> > > > +
> > > > +	policy->info_settings = CXL_INT_MSI_MSIX;
> > > > +	policy->warn_settings = CXL_INT_MSI_MSIX;
> > > > +	policy->failure_settings = CXL_INT_MSI_MSIX;
> > > > +	policy->fatal_settings = CXL_INT_MSI_MSIX;    
> > > 
> > > I think this needs to be careful not to undo events that the BIOS
> > > steered to itself in firmware-first mode, which raises another question,
> > > does firmware-first mean more the OS needs to backoff on some event-log
> > > handling as well?  
> > 
> > Hmm. Does the _OSC cover these.  There is one for Memory error reporting
> > that I think covers it (refers to 12.2.3.2)
> > 
> > Note that should cover any means of obtaining these, not just interrupt
> > driven - so including the initial record clear.
> > 
> > ..
> >   
> > > > +
> > > > +static irqreturn_t cxl_event_failure_thread(int irq, void *id)
> > > > +{
> > > > +	struct cxl_dev_state *cxlds = id;
> > > > +
> > > > +	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> > > > +	return IRQ_HANDLED;
> > > > +}    
> > > 
> > > So I think one of the nice side effects of moving log priorty handling
> > > inside of cxl_mem_get_records_log() and looping through all log types in
> > > priority order until all status is clear is that an INFO interrupt also
> > > triggers a check of the FATAL status for free.
> > >   
> > 
> > I go the opposite way on this in thinking that an interrupt should only
> > ever be used to handle the things it was registered for - so we should
> > not be clearing fatal records in the handler triggered for info events.  
> 
> I would agree with you if this was a fast path and if the hardware
> mechanism did not involve shared status register that tells you
> that both FATAL and INFO are pending retrieval through a mechanism.
> Compare that to the separation between admin and IO queues in NVME.
> 
> If the handler is going to loop on the status register then it must be
> careful not to starve out FATAL while processing INFO.
> 
> > Doing other actions like this relies on subtlies of the generic interrupt
> > handling code which happens to force interrupt threads on a shared interrupt
> > line to be serialized.  I'm not sure we are safe at all the interrupt
> > isn't shared unless we put a lock around the whole thing (we have one
> > because of the buffer mutex though).  
> 
> The interrupt is likely shared since there is no performance benefit to
> entice hardware vendors spend transistor budget on more vector space for
> events. The events architecture does not merit that spend.
> 
> > If going this way I think the lock needs a rename.
> > It's not just protecting the buffer used, but also serialize multiple
> > interrupt threads.  
> 
> I will let Ira decide if he wants to rename, but in my mind the shared
> event buffer *is* the data being locked, the fact that multiple threads
> might be contending for it is immaterial.

It isn't he only thing being protected.  Access to the device is also
being serialized including the data in it's registers.

If someone comes along later and decides to implement multiple buffers
and there for gets rid of the lock. boom.


Jonathan


^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 08/11] cxl/mem: Wire up event interrupts
  2022-12-05 13:01         ` Jonathan Cameron
@ 2022-12-05 16:35           ` Dan Williams
  2022-12-06  9:38             ` Jonathan Cameron
  0 siblings, 1 reply; 64+ messages in thread
From: Dan Williams @ 2022-12-05 16:35 UTC (permalink / raw)
  To: Jonathan Cameron, Dan Williams
  Cc: ira.weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

Jonathan Cameron wrote:
> On Fri, 2 Dec 2022 11:43:29 -0800
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > Jonathan Cameron wrote:
> > >   
> > > > > +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> > > > > +			     struct cxl_event_interrupt_policy *policy)
> > > > > +{
> > > > > +	int rc;
> > > > > +
> > > > > +	policy->info_settings = CXL_INT_MSI_MSIX;
> > > > > +	policy->warn_settings = CXL_INT_MSI_MSIX;
> > > > > +	policy->failure_settings = CXL_INT_MSI_MSIX;
> > > > > +	policy->fatal_settings = CXL_INT_MSI_MSIX;    
> > > > 
> > > > I think this needs to be careful not to undo events that the BIOS
> > > > steered to itself in firmware-first mode, which raises another question,
> > > > does firmware-first mean more the OS needs to backoff on some event-log
> > > > handling as well?  
> > > 
> > > Hmm. Does the _OSC cover these.  There is one for Memory error reporting
> > > that I think covers it (refers to 12.2.3.2)
> > > 
> > > Note that should cover any means of obtaining these, not just interrupt
> > > driven - so including the initial record clear.
> > > 
> > > ..
> > >   
> > > > > +
> > > > > +static irqreturn_t cxl_event_failure_thread(int irq, void *id)
> > > > > +{
> > > > > +	struct cxl_dev_state *cxlds = id;
> > > > > +
> > > > > +	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> > > > > +	return IRQ_HANDLED;
> > > > > +}    
> > > > 
> > > > So I think one of the nice side effects of moving log priorty handling
> > > > inside of cxl_mem_get_records_log() and looping through all log types in
> > > > priority order until all status is clear is that an INFO interrupt also
> > > > triggers a check of the FATAL status for free.
> > > >   
> > > 
> > > I go the opposite way on this in thinking that an interrupt should only
> > > ever be used to handle the things it was registered for - so we should
> > > not be clearing fatal records in the handler triggered for info events.  
> > 
> > I would agree with you if this was a fast path and if the hardware
> > mechanism did not involve shared status register that tells you
> > that both FATAL and INFO are pending retrieval through a mechanism.
> > Compare that to the separation between admin and IO queues in NVME.
> > 
> > If the handler is going to loop on the status register then it must be
> > careful not to starve out FATAL while processing INFO.
> > 
> > > Doing other actions like this relies on subtlies of the generic interrupt
> > > handling code which happens to force interrupt threads on a shared interrupt
> > > line to be serialized.  I'm not sure we are safe at all the interrupt
> > > isn't shared unless we put a lock around the whole thing (we have one
> > > because of the buffer mutex though).  
> > 
> > The interrupt is likely shared since there is no performance benefit to
> > entice hardware vendors spend transistor budget on more vector space for
> > events. The events architecture does not merit that spend.
> > 
> > > If going this way I think the lock needs a rename.
> > > It's not just protecting the buffer used, but also serialize multiple
> > > interrupt threads.  
> > 
> > I will let Ira decide if he wants to rename, but in my mind the shared
> > event buffer *is* the data being locked, the fact that multiple threads
> > might be contending for it is immaterial.
> 
> It isn't he only thing being protected.  Access to the device is also
> being serialized including the data in it's registers.
> 
> If someone comes along later and decides to implement multiple buffers
> and there for gets rid of the lock. boom.

That's what the mailbox mutex is protecting against. If there is an
aspect of the hardware state that is not protected by that then that's a
bug.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 03/11] cxl/mem: Implement Clear Event Records command
  2022-12-03  1:14       ` Dan Williams
@ 2022-12-06  7:35         ` Ira Weiny
  0 siblings, 0 replies; 64+ messages in thread
From: Ira Weiny @ 2022-12-06  7:35 UTC (permalink / raw)
  To: Dan Williams
  Cc: Alison Schofield, Vishal Verma, Ben Widawsky, Steven Rostedt,
	Jonathan Cameron, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Fri, Dec 02, 2022 at 05:14:27PM -0800, Dan Williams wrote:
> Ira Weiny wrote:
> > On Thu, Dec 01, 2022 at 06:29:20PM -0800, Dan Williams wrote:
> > > ira.weiny@ wrote:
> > > > From: Ira Weiny <ira.weiny@intel.com>
> > > > 
> > > > CXL rev 3.0 section 8.2.9.2.3 defines the Clear Event Records mailbox
> > > > command.  After an event record is read it needs to be cleared from the
> > > > event log.
> > > > 
> > > > Implement cxl_clear_event_record() to clear all record retrieved from
> > > > the device.
> > > > 
> > > > Each record is cleared explicitly.  A clear all bit is specified but
> > > > events could arrive between a get and any final clear all operation.
> > > > This means events would be missed.
> > > > Therefore each event is cleared specifically.
> > > 
> > > Note that the spec has a better reason for why Clear All has limited
> > > usage:
> > > 
> > > "Clear All Events is only allowed when the Event Log has overflowed;
> > > otherwise, the device shall return Invalid Input."
> > > 
> > > Will need to wait and see if we need that to keep pace with a device
> > > with a high event frequency.
> > 
> > Perhaps.  But yea I would wait and see.
> > 
> > [snip]
> > 
> > > > +static int cxl_clear_event_record(struct cxl_dev_state *cxlds,
> > > > +				  enum cxl_event_log_type log,
> > > > +				  struct cxl_get_event_payload *get_pl,
> > > > +				  u16 total)
> > > > +{
> > > > +	struct cxl_mbox_clear_event_payload payload = {
> > > > +		.event_log = log,
> > > > +	};
> > > > +	int cnt;
> > > > +
> > > > +	/*
> > > > +	 * Clear Event Records uses u8 for the handle cnt while Get Event
> > > > +	 * Record can return up to 0xffff records.
> > > > +	 */
> > > > +	for (cnt = 0; cnt < total; /* cnt incremented internally */) {
> > > > +		u8 nr_recs = min_t(u8, (total - cnt),
> > > > +				   CXL_CLEAR_EVENT_MAX_HANDLES);
> > > 
> > > This seems overly complicated. @total is a duplicate of
> > > @get_pl->record_count, and the 2 loops feel like it could be cut
> > > down to one.
> > 
> > Sure, total is redundant to pass to the function.
> > 
> > However, 2 loops is IMO not at all overly complicated.  Note that the 2 loops
> > do not do the same thing.  The inner loop is filling in the payload for the
> > Clear command.  There is really no way around doing this.
> > 
> > Now that I've had time to think about it:
> > 
> > 	Are you suggesting we issue a single mailbox command for every handle?
> > 
> > That would be a single loop.  But a lot more mailbox commands.
> 
> I was thinking something like this pseudo code
> 
> int tosend = le16_to_cpu(get_pl->record_count);
> int added = 0;
> 
>     for (i = 0; i < tosend; i++) {
>     	add_to_clear(added++);
>     	if (added == MAX)
>     		send_mailbox();
> 	added = 0;
>     }
> 
>     if (added)
>     	send_mailbox();
> 
> ...where it batches and sends every 256 and one more send afterwards for
> any stragglers.

Ok I'm not convinced it makes that much difference but I don't have the
fortitude to try and look at the assembly to argue...  ;-)

Done.

Ira

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH V2 08/11] cxl/mem: Wire up event interrupts
  2022-12-05 16:35           ` Dan Williams
@ 2022-12-06  9:38             ` Jonathan Cameron
  0 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2022-12-06  9:38 UTC (permalink / raw)
  To: Dan Williams
  Cc: ira.weiny, Alison Schofield, Vishal Verma, Ben Widawsky,
	Steven Rostedt, Davidlohr Bueso, Dave Jiang, linux-kernel,
	linux-cxl

On Mon, 5 Dec 2022 08:35:34 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> Jonathan Cameron wrote:
> > On Fri, 2 Dec 2022 11:43:29 -0800
> > Dan Williams <dan.j.williams@intel.com> wrote:
> >   
> > > Jonathan Cameron wrote:  
> > > >     
> > > > > > +int cxl_event_config_msgnums(struct cxl_dev_state *cxlds,
> > > > > > +			     struct cxl_event_interrupt_policy *policy)
> > > > > > +{
> > > > > > +	int rc;
> > > > > > +
> > > > > > +	policy->info_settings = CXL_INT_MSI_MSIX;
> > > > > > +	policy->warn_settings = CXL_INT_MSI_MSIX;
> > > > > > +	policy->failure_settings = CXL_INT_MSI_MSIX;
> > > > > > +	policy->fatal_settings = CXL_INT_MSI_MSIX;      
> > > > > 
> > > > > I think this needs to be careful not to undo events that the BIOS
> > > > > steered to itself in firmware-first mode, which raises another question,
> > > > > does firmware-first mean more the OS needs to backoff on some event-log
> > > > > handling as well?    
> > > > 
> > > > Hmm. Does the _OSC cover these.  There is one for Memory error reporting
> > > > that I think covers it (refers to 12.2.3.2)
> > > > 
> > > > Note that should cover any means of obtaining these, not just interrupt
> > > > driven - so including the initial record clear.
> > > > 
> > > > ..
> > > >     
> > > > > > +
> > > > > > +static irqreturn_t cxl_event_failure_thread(int irq, void *id)
> > > > > > +{
> > > > > > +	struct cxl_dev_state *cxlds = id;
> > > > > > +
> > > > > > +	cxl_mem_get_records_log(cxlds, CXL_EVENT_TYPE_FAIL);
> > > > > > +	return IRQ_HANDLED;
> > > > > > +}      
> > > > > 
> > > > > So I think one of the nice side effects of moving log priorty handling
> > > > > inside of cxl_mem_get_records_log() and looping through all log types in
> > > > > priority order until all status is clear is that an INFO interrupt also
> > > > > triggers a check of the FATAL status for free.
> > > > >     
> > > > 
> > > > I go the opposite way on this in thinking that an interrupt should only
> > > > ever be used to handle the things it was registered for - so we should
> > > > not be clearing fatal records in the handler triggered for info events.    
> > > 
> > > I would agree with you if this was a fast path and if the hardware
> > > mechanism did not involve shared status register that tells you
> > > that both FATAL and INFO are pending retrieval through a mechanism.
> > > Compare that to the separation between admin and IO queues in NVME.
> > > 
> > > If the handler is going to loop on the status register then it must be
> > > careful not to starve out FATAL while processing INFO.
> > >   
> > > > Doing other actions like this relies on subtlies of the generic interrupt
> > > > handling code which happens to force interrupt threads on a shared interrupt
> > > > line to be serialized.  I'm not sure we are safe at all the interrupt
> > > > isn't shared unless we put a lock around the whole thing (we have one
> > > > because of the buffer mutex though).    
> > > 
> > > The interrupt is likely shared since there is no performance benefit to
> > > entice hardware vendors spend transistor budget on more vector space for
> > > events. The events architecture does not merit that spend.
> > >   
> > > > If going this way I think the lock needs a rename.
> > > > It's not just protecting the buffer used, but also serialize multiple
> > > > interrupt threads.    
> > > 
> > > I will let Ira decide if he wants to rename, but in my mind the shared
> > > event buffer *is* the data being locked, the fact that multiple threads
> > > might be contending for it is immaterial.  
> > 
> > It isn't he only thing being protected.  Access to the device is also
> > being serialized including the data in it's registers.
> > 
> > If someone comes along later and decides to implement multiple buffers
> > and there for gets rid of the lock. boom.  
> 
> That's what the mailbox mutex is protecting against. If there is an
> aspect of the hardware state that is not protected by that then that's a
> bug.
> 
Wrong level of locking. This is about a race on multiple commands
1) Read data from interrupt thread 1.
2) Read same data from interrupt thread 2.
3) Clear data from interrupt thread 1.
4) Clear data from interrupt thread 2. Boom (well minor error we
probably eat but not good practice).





^ permalink raw reply	[flat|nested] 64+ messages in thread

end of thread, other threads:[~2022-12-06  9:38 UTC | newest]

Thread overview: 64+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-01  0:27 [PATCH V2 00/11] CXL: Process event logs ira.weiny
2022-12-01  0:27 ` [PATCH V2 01/11] cxl/pci: Add generic MSI-X/MSI irq support ira.weiny
2022-12-01 10:18   ` Jonathan Cameron
2022-12-01 18:37   ` Dave Jiang
2022-12-02  0:23   ` Dan Williams
2022-12-02  0:34     ` Ira Weiny
2022-12-02  2:00       ` Dan Williams
2022-12-02 13:04         ` Jonathan Cameron
2022-12-01  0:27 ` [PATCH V2 02/11] cxl/mem: Implement Get Event Records command ira.weiny
2022-12-01 13:06   ` Jonathan Cameron
2022-12-01 15:10     ` Ira Weiny
2022-12-01 17:38   ` Steven Rostedt
2022-12-02  0:09     ` Ira Weiny
2022-12-02  4:40       ` Steven Rostedt
2022-12-02  5:00         ` Steven Rostedt
2022-12-02 21:31           ` Ira Weiny
2022-12-02  1:39   ` Dan Williams
2022-12-02 21:47     ` Ira Weiny
2022-12-03 21:33       ` Dan Williams
2022-12-01  0:27 ` [PATCH V2 03/11] cxl/mem: Implement Clear " ira.weiny
2022-12-01 13:26   ` Jonathan Cameron
2022-12-01 15:30     ` Ira Weiny
2022-12-02  2:29   ` Dan Williams
2022-12-02 13:18     ` Jonathan Cameron
2022-12-02 13:34     ` Steven Rostedt
2022-12-02 19:27       ` Dan Williams
2022-12-02 21:28         ` Ira Weiny
2022-12-02 23:49     ` Ira Weiny
2022-12-03  1:14       ` Dan Williams
2022-12-06  7:35         ` Ira Weiny
2022-12-01  0:27 ` [PATCH V2 04/11] cxl/mem: Clear events on driver load ira.weiny
2022-12-01 13:30   ` Jonathan Cameron
2022-12-01 17:02     ` Ira Weiny
2022-12-02  2:48   ` Dan Williams
2022-12-02 16:34     ` Ira Weiny
2022-12-02 23:34       ` Dan Williams
2022-12-03 21:00         ` Ira Weiny
2022-12-01  0:27 ` [PATCH V2 05/11] cxl/mem: Trace General Media Event Record ira.weiny
2022-12-01 18:54   ` Dave Jiang
2022-12-02  6:18   ` Dan Williams
2022-12-01  0:27 ` [PATCH V2 06/11] cxl/mem: Trace DRAM " ira.weiny
2022-12-01 18:55   ` Dave Jiang
2022-12-01  0:27 ` [PATCH V2 07/11] cxl/mem: Trace Memory Module " ira.weiny
2022-12-01 13:31   ` Jonathan Cameron
2022-12-01 18:57   ` Dave Jiang
2022-12-02  6:25   ` Dan Williams
2022-12-01  0:27 ` [PATCH V2 08/11] cxl/mem: Wire up event interrupts ira.weiny
2022-12-01 14:21   ` Jonathan Cameron
2022-12-01 17:23     ` Ira Weiny
2022-12-01 18:35   ` Davidlohr Bueso
2022-12-02  7:37   ` Dan Williams
2022-12-02 14:19     ` Jonathan Cameron
2022-12-02 19:43       ` Dan Williams
2022-12-05 13:01         ` Jonathan Cameron
2022-12-05 16:35           ` Dan Williams
2022-12-06  9:38             ` Jonathan Cameron
2022-12-01  0:27 ` [PATCH V2 09/11] cxl/test: Add generic mock events ira.weiny
2022-12-01 14:37   ` Jonathan Cameron
2022-12-01 17:49     ` Ira Weiny
2022-12-02  8:07   ` Dan Williams
2022-12-01  0:27 ` [PATCH V2 10/11] cxl/test: Add specific events ira.weiny
2022-12-01 21:00   ` Dave Jiang
2022-12-01  0:27 ` [PATCH V2 11/11] cxl/test: Simulate event log overflow ira.weiny
2022-12-01 21:28   ` Dave Jiang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).