All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] cxl: add poison creation event handler
@ 2024-04-17  7:50 ` Shiyang Ruan via
  0 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan @ 2024-04-17  7:50 UTC (permalink / raw)
  To: qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny, alison.schofield

Changes: RFCv2 -> v3:
1. patch1: removed changes for flags
2. changed the main idea of this patchset: not for injection event
     handling, but for creation;
3. removed GET_POISON_LIST command while receiving POISON event;
4. dropped poison report in debugfs;
5. added DER event handler to handle POISON event, in case POISON event is
     sent by DER;
After the above changes, this patchset becomes smaller.  Main code changes
are in patch2, which seems could't be divided smaller.


Currently driver only traces cxl events, poison creation (for both vmem
and pmem type) on cxl memdev is silent.  OS needs to be notified then it
could handle poison pages in time.  Per CXL spec, the device error event
could be signaled through FW-First and OS-First methods.

So, add poison creation event handler in OS-First method:
  - Qemu:
    - CXL device reports POISON creation event to OS by MSI by sending
      GMER/DER after injecting a poison record;
  - CXL driver:                                   <--- this patchset
    a. parse the POISON event from GMER/DER;
    b. translate poisoned DPA to HPA (PFN);
    c. enqueue poisoned PFN to memory_failure's work queue;

Shiyang Ruan (2):
  cxl/core: correct length of DPA field masks
  cxl/core: add poison creation event handler

 drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
 drivers/cxl/core/trace.h  |   2 +-
 drivers/cxl/cxlmem.h      |   8 +--
 include/linux/cxl-event.h |  18 +++++-
 4 files changed, 126 insertions(+), 21 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v3 0/2] cxl: add poison creation event handler
@ 2024-04-17  7:50 ` Shiyang Ruan via
  0 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan via @ 2024-04-17  7:50 UTC (permalink / raw)
  To: qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny, alison.schofield

Changes: RFCv2 -> v3:
1. patch1: removed changes for flags
2. changed the main idea of this patchset: not for injection event
     handling, but for creation;
3. removed GET_POISON_LIST command while receiving POISON event;
4. dropped poison report in debugfs;
5. added DER event handler to handle POISON event, in case POISON event is
     sent by DER;
After the above changes, this patchset becomes smaller.  Main code changes
are in patch2, which seems could't be divided smaller.


Currently driver only traces cxl events, poison creation (for both vmem
and pmem type) on cxl memdev is silent.  OS needs to be notified then it
could handle poison pages in time.  Per CXL spec, the device error event
could be signaled through FW-First and OS-First methods.

So, add poison creation event handler in OS-First method:
  - Qemu:
    - CXL device reports POISON creation event to OS by MSI by sending
      GMER/DER after injecting a poison record;
  - CXL driver:                                   <--- this patchset
    a. parse the POISON event from GMER/DER;
    b. translate poisoned DPA to HPA (PFN);
    c. enqueue poisoned PFN to memory_failure's work queue;

Shiyang Ruan (2):
  cxl/core: correct length of DPA field masks
  cxl/core: add poison creation event handler

 drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
 drivers/cxl/core/trace.h  |   2 +-
 drivers/cxl/cxlmem.h      |   8 +--
 include/linux/cxl-event.h |  18 +++++-
 4 files changed, 126 insertions(+), 21 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v3 1/2] cxl/core: correct length of DPA field masks
  2024-04-17  7:50 ` Shiyang Ruan via
@ 2024-04-17  7:50   ` Shiyang Ruan via
  -1 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan @ 2024-04-17  7:50 UTC (permalink / raw)
  To: qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny,
	alison.schofield, stable

The length of Physical Address in General Media Event Record/DRAM Event
Record is 64-bit, so the field mask should be defined as such length.
Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
mask off the upper-32-bits of DPA addresses. The cxl_poison event is
unaffected.

If userspace was doing its own DPA-to-HPA translation this could lead to
incorrect page retirement decisions, but there is no known consumer
(like rasdaemon) of this event today.

Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
Cc: <stable@vger.kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/trace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index e5f13260fc52..cdfce932d5b1 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
  * DRAM Event Record
  * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
  */
-#define CXL_DPA_FLAGS_MASK			0x3F
+#define CXL_DPA_FLAGS_MASK			0x3FULL
 #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
 
 #define CXL_DPA_VOLATILE			BIT(0)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 1/2] cxl/core: correct length of DPA field masks
@ 2024-04-17  7:50   ` Shiyang Ruan via
  0 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan via @ 2024-04-17  7:50 UTC (permalink / raw)
  To: qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny,
	alison.schofield, stable

The length of Physical Address in General Media Event Record/DRAM Event
Record is 64-bit, so the field mask should be defined as such length.
Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
mask off the upper-32-bits of DPA addresses. The cxl_poison event is
unaffected.

If userspace was doing its own DPA-to-HPA translation this could lead to
incorrect page retirement decisions, but there is no known consumer
(like rasdaemon) of this event today.

Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
Cc: <stable@vger.kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/trace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index e5f13260fc52..cdfce932d5b1 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
  * DRAM Event Record
  * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
  */
-#define CXL_DPA_FLAGS_MASK			0x3F
+#define CXL_DPA_FLAGS_MASK			0x3FULL
 #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
 
 #define CXL_DPA_VOLATILE			BIT(0)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 2/2] cxl/core: add poison creation event handler
  2024-04-17  7:50 ` Shiyang Ruan via
@ 2024-04-17  7:50   ` Shiyang Ruan
  -1 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan via @ 2024-04-17  7:50 UTC (permalink / raw)
  To: qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny, alison.schofield

Currently driver only traces cxl events, poison creation (for both vmem
and pmem type) on cxl memdev is silent.  OS needs to be notified then it
could handle poison pages in time.  Per CXL spec, the device error event
could be signaled through FW-First and OS-First methods.

So, add poison creation event handler in OS-First method:
  - Qemu:
    - CXL device reports POISON creation event to OS by MSI by sending
      GMER/DER after injecting a poison record;
  - CXL driver:
    a. parse the POISON event from GMER/DER;
    b. translate poisoned DPA to HPA (PFN);
    c. enqueue poisoned PFN to memory_failure's work queue;

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
 drivers/cxl/cxlmem.h      |   8 +--
 include/linux/cxl-event.h |  18 +++++-
 3 files changed, 125 insertions(+), 20 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index f0f54aeccc87..76af0d73859d 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-			    enum cxl_event_log_type type,
-			    enum cxl_event_type event_type,
-			    const uuid_t *uuid, union cxl_event *evt)
+static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
+			      u64 dpa)
 {
-	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
+	u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
+	unsigned long pfn = PHYS_PFN(hpa);
+
+	if (!IS_ENABLED(CONFIG_MEMORY_FAILURE))
+		return;
+
+	memory_failure_queue(pfn, MF_ACTION_REQUIRED);
+}
+
+static int __cxl_report_poison(struct device *dev, void *arg)
+{
+	struct cxl_endpoint_decoder *cxled;
+	struct cxl_memdev *cxlmd;
+	u64 dpa = *(u64 *)arg;
+
+	cxled = to_cxl_endpoint_decoder(dev);
+	if (!cxled || !cxled->dpa_res || !resource_size(cxled->dpa_res))
+		return 0;
+
+	if (cxled->mode == CXL_DECODER_MIXED) {
+		dev_dbg(dev, "poison list read unsupported in mixed mode\n");
+		return 0;
+	}
+
+	if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
+		return 0;
+
+	cxlmd = cxled_to_memdev(cxled);
+	cxl_report_poison(cxlmd, cxled->cxld.region, dpa);
+
+	return 1;
+}
+
+static void cxl_event_handle_poison(struct cxl_memdev *cxlmd, u64 dpa)
+{
+	struct cxl_port *port = cxlmd->endpoint;
+
+	/*
+	 * No region is mapped to this endpoint, that is to say no HPA is
+	 * mapped.
+	 */
+	if (!port || !is_cxl_endpoint(port) ||
+	    cxl_num_decoders_committed(port) == 0)
+		return;
+
+	device_for_each_child(&port->dev, &dpa, __cxl_report_poison);
+}
+
+static void cxl_event_handle_general_media(struct cxl_memdev *cxlmd,
+					   enum cxl_event_log_type type,
+					   struct cxl_event_gen_media *rec)
+{
+	u64 dpa = le64_to_cpu(rec->phys_addr) & CXL_DPA_MASK;
+
+	if (type == CXL_EVENT_TYPE_FAIL) {
+		switch (rec->transaction_type) {
+		case CXL_EVENT_TRANSACTION_READ:
+		case CXL_EVENT_TRANSACTION_WRITE:
+		case CXL_EVENT_TRANSACTION_INJECT_POISON:
+			cxl_event_handle_poison(cxlmd, dpa);
+			break;
+		default:
+			break;
+		}
+	}
+}
+
+static void cxl_event_handle_dram(struct cxl_memdev *cxlmd,
+				  enum cxl_event_log_type type,
+				  struct cxl_event_dram *rec)
+{
+	u64 dpa = le64_to_cpu(rec->phys_addr) & CXL_DPA_MASK;
+
+	if (type == CXL_EVENT_TYPE_FAIL) {
+		switch (rec->transaction_type) {
+		case CXL_EVENT_TRANSACTION_READ:
+		case CXL_EVENT_TRANSACTION_WRITE:
+		case CXL_EVENT_TRANSACTION_INJECT_POISON:
+			cxl_event_handle_poison(cxlmd, dpa);
+			break;
+		default:
+			break;
+		}
+	}
+}
+
+void cxl_event_handle_record(struct cxl_memdev *cxlmd,
+			     enum cxl_event_log_type type,
+			     enum cxl_event_type event_type,
+			     const uuid_t *uuid, union cxl_event *evt)
+{
+	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
 		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
-	else if (event_type == CXL_CPER_EVENT_DRAM)
+		cxl_event_handle_general_media(cxlmd, type, &evt->gen_media);
+	} else if (event_type == CXL_CPER_EVENT_DRAM) {
 		trace_cxl_dram(cxlmd, type, &evt->dram);
-	else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
+		cxl_event_handle_dram(cxlmd, type, &evt->dram);
+	} else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
 		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
 	else
 		trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
 }
-EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
+EXPORT_SYMBOL_NS_GPL(cxl_event_handle_record, CXL);
 
-static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-				     enum cxl_event_log_type type,
-				     struct cxl_event_record_raw *record)
+static void __cxl_event_handle_record(struct cxl_memdev *cxlmd,
+				      enum cxl_event_log_type type,
+				      struct cxl_event_record_raw *record)
 {
 	enum cxl_event_type ev_type = CXL_CPER_EVENT_GENERIC;
 	const uuid_t *uuid = &record->id;
@@ -867,7 +958,7 @@ static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 	else if (uuid_equal(uuid, &CXL_EVENT_MEM_MODULE_UUID))
 		ev_type = CXL_CPER_EVENT_MEM_MODULE;
 
-	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
+	cxl_event_handle_record(cxlmd, type, ev_type, uuid, &record->event);
 }
 
 static int cxl_clear_event_record(struct cxl_memdev_state *mds,
@@ -979,8 +1070,8 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
 			break;
 
 		for (i = 0; i < nr_rec; i++)
-			__cxl_event_trace_record(cxlmd, type,
-						 &payload->records[i]);
+			__cxl_event_handle_record(cxlmd, type,
+						  &payload->records[i]);
 
 		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
 			trace_cxl_overflow(cxlmd, type, payload);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 36cee9c30ceb..ba1347de5651 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -822,10 +822,10 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 				  unsigned long *cmds);
 void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-			    enum cxl_event_log_type type,
-			    enum cxl_event_type event_type,
-			    const uuid_t *uuid, union cxl_event *evt);
+void cxl_event_handle_record(struct cxl_memdev *cxlmd,
+			     enum cxl_event_log_type type,
+			     enum cxl_event_type event_type,
+			     const uuid_t *uuid, union cxl_event *evt);
 int cxl_set_timestamp(struct cxl_memdev_state *mds);
 int cxl_poison_state_init(struct cxl_memdev_state *mds);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
index 03fa6d50d46f..8189bed76c12 100644
--- a/include/linux/cxl-event.h
+++ b/include/linux/cxl-event.h
@@ -23,6 +23,20 @@ struct cxl_event_generic {
 	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
 } __packed;
 
+/*
+ * Event transaction type
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+enum cxl_event_transaction_type {
+	CXL_EVENT_TRANSACTION_UNKNOWN = 0X00,
+	CXL_EVENT_TRANSACTION_READ,
+	CXL_EVENT_TRANSACTION_WRITE,
+	CXL_EVENT_TRANSACTION_SCAN_MEDIA,
+	CXL_EVENT_TRANSACTION_INJECT_POISON,
+	CXL_EVENT_TRANSACTION_MEDIA_SCRUB,
+	CXL_EVENT_TRANSACTION_MEDIA_MANAGEMENT,
+};
+
 /*
  * General Media Event Record
  * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
@@ -33,7 +47,7 @@ struct cxl_event_gen_media {
 	__le64 phys_addr;
 	u8 descriptor;
 	u8 type;
-	u8 transaction_type;
+	u8 transaction_type;	/* enum cxl_event_transaction_type */
 	u8 validity_flags[2];
 	u8 channel;
 	u8 rank;
@@ -52,7 +66,7 @@ struct cxl_event_dram {
 	__le64 phys_addr;
 	u8 descriptor;
 	u8 type;
-	u8 transaction_type;
+	u8 transaction_type;	/* enum cxl_event_transaction_type */
 	u8 validity_flags[2];
 	u8 channel;
 	u8 rank;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 2/2] cxl/core: add poison creation event handler
@ 2024-04-17  7:50   ` Shiyang Ruan
  0 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan @ 2024-04-17  7:50 UTC (permalink / raw)
  To: qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny, alison.schofield

Currently driver only traces cxl events, poison creation (for both vmem
and pmem type) on cxl memdev is silent.  OS needs to be notified then it
could handle poison pages in time.  Per CXL spec, the device error event
could be signaled through FW-First and OS-First methods.

So, add poison creation event handler in OS-First method:
  - Qemu:
    - CXL device reports POISON creation event to OS by MSI by sending
      GMER/DER after injecting a poison record;
  - CXL driver:
    a. parse the POISON event from GMER/DER;
    b. translate poisoned DPA to HPA (PFN);
    c. enqueue poisoned PFN to memory_failure's work queue;

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
 drivers/cxl/cxlmem.h      |   8 +--
 include/linux/cxl-event.h |  18 +++++-
 3 files changed, 125 insertions(+), 20 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index f0f54aeccc87..76af0d73859d 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-			    enum cxl_event_log_type type,
-			    enum cxl_event_type event_type,
-			    const uuid_t *uuid, union cxl_event *evt)
+static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
+			      u64 dpa)
 {
-	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
+	u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
+	unsigned long pfn = PHYS_PFN(hpa);
+
+	if (!IS_ENABLED(CONFIG_MEMORY_FAILURE))
+		return;
+
+	memory_failure_queue(pfn, MF_ACTION_REQUIRED);
+}
+
+static int __cxl_report_poison(struct device *dev, void *arg)
+{
+	struct cxl_endpoint_decoder *cxled;
+	struct cxl_memdev *cxlmd;
+	u64 dpa = *(u64 *)arg;
+
+	cxled = to_cxl_endpoint_decoder(dev);
+	if (!cxled || !cxled->dpa_res || !resource_size(cxled->dpa_res))
+		return 0;
+
+	if (cxled->mode == CXL_DECODER_MIXED) {
+		dev_dbg(dev, "poison list read unsupported in mixed mode\n");
+		return 0;
+	}
+
+	if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
+		return 0;
+
+	cxlmd = cxled_to_memdev(cxled);
+	cxl_report_poison(cxlmd, cxled->cxld.region, dpa);
+
+	return 1;
+}
+
+static void cxl_event_handle_poison(struct cxl_memdev *cxlmd, u64 dpa)
+{
+	struct cxl_port *port = cxlmd->endpoint;
+
+	/*
+	 * No region is mapped to this endpoint, that is to say no HPA is
+	 * mapped.
+	 */
+	if (!port || !is_cxl_endpoint(port) ||
+	    cxl_num_decoders_committed(port) == 0)
+		return;
+
+	device_for_each_child(&port->dev, &dpa, __cxl_report_poison);
+}
+
+static void cxl_event_handle_general_media(struct cxl_memdev *cxlmd,
+					   enum cxl_event_log_type type,
+					   struct cxl_event_gen_media *rec)
+{
+	u64 dpa = le64_to_cpu(rec->phys_addr) & CXL_DPA_MASK;
+
+	if (type == CXL_EVENT_TYPE_FAIL) {
+		switch (rec->transaction_type) {
+		case CXL_EVENT_TRANSACTION_READ:
+		case CXL_EVENT_TRANSACTION_WRITE:
+		case CXL_EVENT_TRANSACTION_INJECT_POISON:
+			cxl_event_handle_poison(cxlmd, dpa);
+			break;
+		default:
+			break;
+		}
+	}
+}
+
+static void cxl_event_handle_dram(struct cxl_memdev *cxlmd,
+				  enum cxl_event_log_type type,
+				  struct cxl_event_dram *rec)
+{
+	u64 dpa = le64_to_cpu(rec->phys_addr) & CXL_DPA_MASK;
+
+	if (type == CXL_EVENT_TYPE_FAIL) {
+		switch (rec->transaction_type) {
+		case CXL_EVENT_TRANSACTION_READ:
+		case CXL_EVENT_TRANSACTION_WRITE:
+		case CXL_EVENT_TRANSACTION_INJECT_POISON:
+			cxl_event_handle_poison(cxlmd, dpa);
+			break;
+		default:
+			break;
+		}
+	}
+}
+
+void cxl_event_handle_record(struct cxl_memdev *cxlmd,
+			     enum cxl_event_log_type type,
+			     enum cxl_event_type event_type,
+			     const uuid_t *uuid, union cxl_event *evt)
+{
+	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
 		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
-	else if (event_type == CXL_CPER_EVENT_DRAM)
+		cxl_event_handle_general_media(cxlmd, type, &evt->gen_media);
+	} else if (event_type == CXL_CPER_EVENT_DRAM) {
 		trace_cxl_dram(cxlmd, type, &evt->dram);
-	else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
+		cxl_event_handle_dram(cxlmd, type, &evt->dram);
+	} else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
 		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
 	else
 		trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
 }
-EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
+EXPORT_SYMBOL_NS_GPL(cxl_event_handle_record, CXL);
 
-static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-				     enum cxl_event_log_type type,
-				     struct cxl_event_record_raw *record)
+static void __cxl_event_handle_record(struct cxl_memdev *cxlmd,
+				      enum cxl_event_log_type type,
+				      struct cxl_event_record_raw *record)
 {
 	enum cxl_event_type ev_type = CXL_CPER_EVENT_GENERIC;
 	const uuid_t *uuid = &record->id;
@@ -867,7 +958,7 @@ static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 	else if (uuid_equal(uuid, &CXL_EVENT_MEM_MODULE_UUID))
 		ev_type = CXL_CPER_EVENT_MEM_MODULE;
 
-	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
+	cxl_event_handle_record(cxlmd, type, ev_type, uuid, &record->event);
 }
 
 static int cxl_clear_event_record(struct cxl_memdev_state *mds,
@@ -979,8 +1070,8 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
 			break;
 
 		for (i = 0; i < nr_rec; i++)
-			__cxl_event_trace_record(cxlmd, type,
-						 &payload->records[i]);
+			__cxl_event_handle_record(cxlmd, type,
+						  &payload->records[i]);
 
 		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
 			trace_cxl_overflow(cxlmd, type, payload);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 36cee9c30ceb..ba1347de5651 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -822,10 +822,10 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 				  unsigned long *cmds);
 void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-			    enum cxl_event_log_type type,
-			    enum cxl_event_type event_type,
-			    const uuid_t *uuid, union cxl_event *evt);
+void cxl_event_handle_record(struct cxl_memdev *cxlmd,
+			     enum cxl_event_log_type type,
+			     enum cxl_event_type event_type,
+			     const uuid_t *uuid, union cxl_event *evt);
 int cxl_set_timestamp(struct cxl_memdev_state *mds);
 int cxl_poison_state_init(struct cxl_memdev_state *mds);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
index 03fa6d50d46f..8189bed76c12 100644
--- a/include/linux/cxl-event.h
+++ b/include/linux/cxl-event.h
@@ -23,6 +23,20 @@ struct cxl_event_generic {
 	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
 } __packed;
 
+/*
+ * Event transaction type
+ * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
+ */
+enum cxl_event_transaction_type {
+	CXL_EVENT_TRANSACTION_UNKNOWN = 0X00,
+	CXL_EVENT_TRANSACTION_READ,
+	CXL_EVENT_TRANSACTION_WRITE,
+	CXL_EVENT_TRANSACTION_SCAN_MEDIA,
+	CXL_EVENT_TRANSACTION_INJECT_POISON,
+	CXL_EVENT_TRANSACTION_MEDIA_SCRUB,
+	CXL_EVENT_TRANSACTION_MEDIA_MANAGEMENT,
+};
+
 /*
  * General Media Event Record
  * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
@@ -33,7 +47,7 @@ struct cxl_event_gen_media {
 	__le64 phys_addr;
 	u8 descriptor;
 	u8 type;
-	u8 transaction_type;
+	u8 transaction_type;	/* enum cxl_event_transaction_type */
 	u8 validity_flags[2];
 	u8 channel;
 	u8 rank;
@@ -52,7 +66,7 @@ struct cxl_event_dram {
 	__le64 phys_addr;
 	u8 descriptor;
 	u8 type;
-	u8 transaction_type;
+	u8 transaction_type;	/* enum cxl_event_transaction_type */
 	u8 validity_flags[2];
 	u8 channel;
 	u8 rank;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] cxl/core: add poison creation event handler
  2024-04-17  7:50   ` Shiyang Ruan
  (?)
@ 2024-04-17 17:30   ` Dave Jiang
  2024-04-18  9:01       ` Shiyang Ruan
  -1 siblings, 1 reply; 28+ messages in thread
From: Dave Jiang @ 2024-04-17 17:30 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny, alison.schofield



On 4/17/24 12:50 AM, Shiyang Ruan wrote:
> Currently driver only traces cxl events, poison creation (for both vmem
> and pmem type) on cxl memdev is silent.  OS needs to be notified then it
> could handle poison pages in time.  Per CXL spec, the device error event
> could be signaled through FW-First and OS-First methods.

Please consider below for better clarity:
Currently the driver only traces CXL events. Poison creation (for both ram
and pmem type) on a CXL memdev is silent. The OS needs to be notified so it
can handle poison pages. Per CXL spec, the device error event
can be signaled through the FW-First method or the OS-First method.

> 
> So, add poison creation event handler in OS-First method:
>   - Qemu:
>     - CXL device reports POISON creation event to OS by MSI by sending
>       GMER/DER after injecting a poison record;

Can probably drop the QEMU changes and this is the kernel commit log.

>   - CXL driver:
>     a. parse the POISON event from GMER/DER;
>     b. translate poisoned DPA to HPA (PFN);
>     c. enqueue poisoned PFN to memory_failure's work queue;
> 
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
>  drivers/cxl/cxlmem.h      |   8 +--
>  include/linux/cxl-event.h |  18 +++++-
>  3 files changed, 125 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index f0f54aeccc87..76af0d73859d 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>  
> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> -			    enum cxl_event_log_type type,
> -			    enum cxl_event_type event_type,
> -			    const uuid_t *uuid, union cxl_event *evt)
> +static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,

I think this needs to be changed to __cxl_report_poison() and the function below to cxl_report_poison(). Otherwise it goes against typical Linux methodology of having the __functionX() as the raw functionality function called by a functionX() wrapper. 

DJ

> +			      u64 dpa)
>  {
> -	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
> +	u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
> +	unsigned long pfn = PHYS_PFN(hpa);
> +
> +	if (!IS_ENABLED(CONFIG_MEMORY_FAILURE))
> +		return;
> +
> +	memory_failure_queue(pfn, MF_ACTION_REQUIRED);
> +}
> +
> +static int __cxl_report_poison(struct device *dev, void *arg)
> +{
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_memdev *cxlmd;
> +	u64 dpa = *(u64 *)arg;
> +
> +	cxled = to_cxl_endpoint_decoder(dev);
> +	if (!cxled || !cxled->dpa_res || !resource_size(cxled->dpa_res))
> +		return 0;
> +
> +	if (cxled->mode == CXL_DECODER_MIXED) {
> +		dev_dbg(dev, "poison list read unsupported in mixed mode\n");
> +		return 0;
> +	}
> +
> +	if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
> +		return 0;
> +
> +	cxlmd = cxled_to_memdev(cxled);
> +	cxl_report_poison(cxlmd, cxled->cxld.region, dpa);
> +
> +	return 1;
> +}
> +
> +static void cxl_event_handle_poison(struct cxl_memdev *cxlmd, u64 dpa)
> +{
> +	struct cxl_port *port = cxlmd->endpoint;
> +
> +	/*
> +	 * No region is mapped to this endpoint, that is to say no HPA is
> +	 * mapped.
> +	 */
> +	if (!port || !is_cxl_endpoint(port) ||
> +	    cxl_num_decoders_committed(port) == 0)
> +		return;
> +
> +	device_for_each_child(&port->dev, &dpa, __cxl_report_poison);
> +}
> +
> +static void cxl_event_handle_general_media(struct cxl_memdev *cxlmd,
> +					   enum cxl_event_log_type type,
> +					   struct cxl_event_gen_media *rec)
> +{
> +	u64 dpa = le64_to_cpu(rec->phys_addr) & CXL_DPA_MASK;
> +
> +	if (type == CXL_EVENT_TYPE_FAIL) {
> +		switch (rec->transaction_type) {
> +		case CXL_EVENT_TRANSACTION_READ:
> +		case CXL_EVENT_TRANSACTION_WRITE:
> +		case CXL_EVENT_TRANSACTION_INJECT_POISON:
> +			cxl_event_handle_poison(cxlmd, dpa);
> +			break;
> +		default:
> +			break;
> +		}
> +	}
> +}
> +
> +static void cxl_event_handle_dram(struct cxl_memdev *cxlmd,
> +				  enum cxl_event_log_type type,
> +				  struct cxl_event_dram *rec)
> +{
> +	u64 dpa = le64_to_cpu(rec->phys_addr) & CXL_DPA_MASK;
> +
> +	if (type == CXL_EVENT_TYPE_FAIL) {
> +		switch (rec->transaction_type) {
> +		case CXL_EVENT_TRANSACTION_READ:
> +		case CXL_EVENT_TRANSACTION_WRITE:
> +		case CXL_EVENT_TRANSACTION_INJECT_POISON:
> +			cxl_event_handle_poison(cxlmd, dpa);
> +			break;
> +		default:
> +			break;
> +		}
> +	}
> +}
> +
> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +			     enum cxl_event_log_type type,
> +			     enum cxl_event_type event_type,
> +			     const uuid_t *uuid, union cxl_event *evt)
> +{
> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
>  		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
> -	else if (event_type == CXL_CPER_EVENT_DRAM)
> +		cxl_event_handle_general_media(cxlmd, type, &evt->gen_media);
> +	} else if (event_type == CXL_CPER_EVENT_DRAM) {
>  		trace_cxl_dram(cxlmd, type, &evt->dram);
> -	else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
> +		cxl_event_handle_dram(cxlmd, type, &evt->dram);
> +	} else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
>  		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
>  	else
>  		trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
>  }
> -EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
> +EXPORT_SYMBOL_NS_GPL(cxl_event_handle_record, CXL);
>  
> -static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> -				     enum cxl_event_log_type type,
> -				     struct cxl_event_record_raw *record)
> +static void __cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +				      enum cxl_event_log_type type,
> +				      struct cxl_event_record_raw *record)
>  {
>  	enum cxl_event_type ev_type = CXL_CPER_EVENT_GENERIC;
>  	const uuid_t *uuid = &record->id;
> @@ -867,7 +958,7 @@ static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>  	else if (uuid_equal(uuid, &CXL_EVENT_MEM_MODULE_UUID))
>  		ev_type = CXL_CPER_EVENT_MEM_MODULE;
>  
> -	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
> +	cxl_event_handle_record(cxlmd, type, ev_type, uuid, &record->event);
>  }
>  
>  static int cxl_clear_event_record(struct cxl_memdev_state *mds,
> @@ -979,8 +1070,8 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
>  			break;
>  
>  		for (i = 0; i < nr_rec; i++)
> -			__cxl_event_trace_record(cxlmd, type,
> -						 &payload->records[i]);
> +			__cxl_event_handle_record(cxlmd, type,
> +						  &payload->records[i]);
>  
>  		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
>  			trace_cxl_overflow(cxlmd, type, payload);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 36cee9c30ceb..ba1347de5651 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -822,10 +822,10 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>  void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>  				  unsigned long *cmds);
>  void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> -			    enum cxl_event_log_type type,
> -			    enum cxl_event_type event_type,
> -			    const uuid_t *uuid, union cxl_event *evt);
> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +			     enum cxl_event_log_type type,
> +			     enum cxl_event_type event_type,
> +			     const uuid_t *uuid, union cxl_event *evt);
>  int cxl_set_timestamp(struct cxl_memdev_state *mds);
>  int cxl_poison_state_init(struct cxl_memdev_state *mds);
>  int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
> diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
> index 03fa6d50d46f..8189bed76c12 100644
> --- a/include/linux/cxl-event.h
> +++ b/include/linux/cxl-event.h
> @@ -23,6 +23,20 @@ struct cxl_event_generic {
>  	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
>  } __packed;
>  
> +/*
> + * Event transaction type
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + */
> +enum cxl_event_transaction_type {
> +	CXL_EVENT_TRANSACTION_UNKNOWN = 0X00,
> +	CXL_EVENT_TRANSACTION_READ,
> +	CXL_EVENT_TRANSACTION_WRITE,
> +	CXL_EVENT_TRANSACTION_SCAN_MEDIA,
> +	CXL_EVENT_TRANSACTION_INJECT_POISON,
> +	CXL_EVENT_TRANSACTION_MEDIA_SCRUB,
> +	CXL_EVENT_TRANSACTION_MEDIA_MANAGEMENT,
> +};
> +
>  /*
>   * General Media Event Record
>   * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> @@ -33,7 +47,7 @@ struct cxl_event_gen_media {
>  	__le64 phys_addr;
>  	u8 descriptor;
>  	u8 type;
> -	u8 transaction_type;
> +	u8 transaction_type;	/* enum cxl_event_transaction_type */
>  	u8 validity_flags[2];
>  	u8 channel;
>  	u8 rank;
> @@ -52,7 +66,7 @@ struct cxl_event_dram {
>  	__le64 phys_addr;
>  	u8 descriptor;
>  	u8 type;
> -	u8 transaction_type;
> +	u8 transaction_type;	/* enum cxl_event_transaction_type */
>  	u8 validity_flags[2];
>  	u8 channel;
>  	u8 rank;

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] cxl/core: add poison creation event handler
  2024-04-17 17:30   ` Dave Jiang
@ 2024-04-18  9:01       ` Shiyang Ruan
  0 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan via @ 2024-04-18  9:01 UTC (permalink / raw)
  To: Dave Jiang
  Cc: qemu-devel, linux-cxl, Jonathan.Cameron, dan.j.williams, dave,
	ira.weiny, alison.schofield



在 2024/4/18 1:30, Dave Jiang 写道:
> 
> 
> On 4/17/24 12:50 AM, Shiyang Ruan wrote:
>> Currently driver only traces cxl events, poison creation (for both vmem
>> and pmem type) on cxl memdev is silent.  OS needs to be notified then it
>> could handle poison pages in time.  Per CXL spec, the device error event
>> could be signaled through FW-First and OS-First methods.
> 
> Please consider below for better clarity:
> Currently the driver only traces CXL events. Poison creation (for both ram
> and pmem type) on a CXL memdev is silent. The OS needs to be notified so it
> can handle poison pages. Per CXL spec, the device error event
> can be signaled through the FW-First method or the OS-First method.

Thanks, this is better.

> 
>>
>> So, add poison creation event handler in OS-First method:
>>    - Qemu:
>>      - CXL device reports POISON creation event to OS by MSI by sending
>>        GMER/DER after injecting a poison record;
> 
> Can probably drop the QEMU changes and this is the kernel commit log.

Ok.

> 
>>    - CXL driver:
>>      a. parse the POISON event from GMER/DER;
>>      b. translate poisoned DPA to HPA (PFN);
>>      c. enqueue poisoned PFN to memory_failure's work queue;
>>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
>> ---
>>   drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
>>   drivers/cxl/cxlmem.h      |   8 +--
>>   include/linux/cxl-event.h |  18 +++++-
>>   3 files changed, 125 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
>> index f0f54aeccc87..76af0d73859d 100644
>> --- a/drivers/cxl/core/mbox.c
>> +++ b/drivers/cxl/core/mbox.c
>> @@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>>   
>> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>> -			    enum cxl_event_log_type type,
>> -			    enum cxl_event_type event_type,
>> -			    const uuid_t *uuid, union cxl_event *evt)
>> +static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
> 
> I think this needs to be changed to __cxl_report_poison() and the function below to cxl_report_poison(). Otherwise it goes against typical Linux methodology of having the __functionX() as the raw functionality function called by a functionX() wrapper.

This function was designed to do the real reporting work, and could be 
called at other places (actually did in previous version).  Now that it 
is called only below in this version, yes, it's better to change the names.


--
Thanks,
Ruan.

> 
> DJ
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] cxl/core: add poison creation event handler
@ 2024-04-18  9:01       ` Shiyang Ruan
  0 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan @ 2024-04-18  9:01 UTC (permalink / raw)
  To: Dave Jiang
  Cc: qemu-devel, linux-cxl, Jonathan.Cameron, dan.j.williams, dave,
	ira.weiny, alison.schofield



在 2024/4/18 1:30, Dave Jiang 写道:
> 
> 
> On 4/17/24 12:50 AM, Shiyang Ruan wrote:
>> Currently driver only traces cxl events, poison creation (for both vmem
>> and pmem type) on cxl memdev is silent.  OS needs to be notified then it
>> could handle poison pages in time.  Per CXL spec, the device error event
>> could be signaled through FW-First and OS-First methods.
> 
> Please consider below for better clarity:
> Currently the driver only traces CXL events. Poison creation (for both ram
> and pmem type) on a CXL memdev is silent. The OS needs to be notified so it
> can handle poison pages. Per CXL spec, the device error event
> can be signaled through the FW-First method or the OS-First method.

Thanks, this is better.

> 
>>
>> So, add poison creation event handler in OS-First method:
>>    - Qemu:
>>      - CXL device reports POISON creation event to OS by MSI by sending
>>        GMER/DER after injecting a poison record;
> 
> Can probably drop the QEMU changes and this is the kernel commit log.

Ok.

> 
>>    - CXL driver:
>>      a. parse the POISON event from GMER/DER;
>>      b. translate poisoned DPA to HPA (PFN);
>>      c. enqueue poisoned PFN to memory_failure's work queue;
>>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
>> ---
>>   drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
>>   drivers/cxl/cxlmem.h      |   8 +--
>>   include/linux/cxl-event.h |  18 +++++-
>>   3 files changed, 125 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
>> index f0f54aeccc87..76af0d73859d 100644
>> --- a/drivers/cxl/core/mbox.c
>> +++ b/drivers/cxl/core/mbox.c
>> @@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>>   
>> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>> -			    enum cxl_event_log_type type,
>> -			    enum cxl_event_type event_type,
>> -			    const uuid_t *uuid, union cxl_event *evt)
>> +static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
> 
> I think this needs to be changed to __cxl_report_poison() and the function below to cxl_report_poison(). Otherwise it goes against typical Linux methodology of having the __functionX() as the raw functionality function called by a functionX() wrapper.

This function was designed to do the real reporting work, and could be 
called at other places (actually did in previous version).  Now that it 
is called only below in this version, yes, it's better to change the names.


--
Thanks,
Ruan.

> 
> DJ
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] cxl/core: add poison creation event handler
  2024-04-17  7:50   ` Shiyang Ruan
  (?)
  (?)
@ 2024-04-21 12:14   ` kernel test robot
  -1 siblings, 0 replies; 28+ messages in thread
From: kernel test robot @ 2024-04-21 12:14 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl
  Cc: oe-kbuild-all, Jonathan.Cameron, dan.j.williams, dave, ira.weiny,
	alison.schofield

Hi Shiyang,

kernel test robot noticed the following build errors:

[auto build test ERROR on cxl/next]
[also build test ERROR on linus/master v6.9-rc4 next-20240419]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Shiyang-Ruan/cxl-core-correct-length-of-DPA-field-masks/20240417-155443
base:   https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git next
patch link:    https://lore.kernel.org/r/20240417075053.3273543-3-ruansy.fnst%40fujitsu.com
patch subject: [PATCH v3 2/2] cxl/core: add poison creation event handler
config: csky-randconfig-002-20240421 (https://download.01.org/0day-ci/archive/20240421/202404212044.uSxtGRtL-lkp@intel.com/config)
compiler: csky-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240421/202404212044.uSxtGRtL-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202404212044.uSxtGRtL-lkp@intel.com/

All errors (new ones prefixed by >>):

   `.exit.text' referenced in section `__jump_table' of fs/ceph/super.o: defined in discarded section `.exit.text' of fs/ceph/super.o
   `.exit.text' referenced in section `__jump_table' of fs/ceph/super.o: defined in discarded section `.exit.text' of fs/ceph/super.o
   csky-linux-ld: drivers/cxl/core/mbox.o: in function `__cxl_report_poison':
   mbox.c:(.text+0xa12): undefined reference to `cxl_trace_hpa'
>> csky-linux-ld: mbox.c:(.text+0xa44): undefined reference to `cxl_trace_hpa'
   `.exit.text' referenced in section `__jump_table' of drivers/target/target_core_configfs.o: defined in discarded section `.exit.text' of drivers/target/target_core_configfs.o
   `.exit.text' referenced in section `__jump_table' of drivers/target/target_core_configfs.o: defined in discarded section `.exit.text' of drivers/target/target_core_configfs.o
   `.exit.text' referenced in section `__jump_table' of net/ceph/ceph_common.o: defined in discarded section `.exit.text' of net/ceph/ceph_common.o
   `.exit.text' referenced in section `__jump_table' of net/ceph/ceph_common.o: defined in discarded section `.exit.text' of net/ceph/ceph_common.o

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks
  2024-04-17  7:50   ` Shiyang Ruan via
  (?)
@ 2024-04-23 17:35   ` Ira Weiny
  -1 siblings, 0 replies; 28+ messages in thread
From: Ira Weiny @ 2024-04-23 17:35 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny,
	alison.schofield, stable

Shiyang Ruan wrote:
> The length of Physical Address in General Media Event Record/DRAM Event
> Record is 64-bit, so the field mask should be defined as such length.
> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
> unaffected.
> 
> If userspace was doing its own DPA-to-HPA translation this could lead to
> incorrect page retirement decisions, but there is no known consumer
> (like rasdaemon) of this event today.
> 
> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
> Cc: <stable@vger.kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Ira Weiny <ira.weiny@intel.com>

Apologies I thought I saw this go in before.  But perhaps it was a
different mask.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  drivers/cxl/core/trace.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e5f13260fc52..cdfce932d5b1 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
>   * DRAM Event Record
>   * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>   */
> -#define CXL_DPA_FLAGS_MASK			0x3F
> +#define CXL_DPA_FLAGS_MASK			0x3FULL
>  #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
>  
>  #define CXL_DPA_VOLATILE			BIT(0)
> -- 
> 2.34.1
> 



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks
  2024-04-17  7:50   ` Shiyang Ruan via
  (?)
  (?)
@ 2024-04-23 17:35   ` Dan Williams
  -1 siblings, 0 replies; 28+ messages in thread
From: Dan Williams @ 2024-04-23 17:35 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny,
	alison.schofield, stable

Shiyang Ruan wrote:
> The length of Physical Address in General Media Event Record/DRAM Event
> Record is 64-bit, so the field mask should be defined as such length.
> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
> unaffected.
> 
> If userspace was doing its own DPA-to-HPA translation this could lead to
> incorrect page retirement decisions, but there is no known consumer
> (like rasdaemon) of this event today.
> 
> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
> Cc: <stable@vger.kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  drivers/cxl/core/trace.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e5f13260fc52..cdfce932d5b1 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
>   * DRAM Event Record
>   * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>   */
> -#define CXL_DPA_FLAGS_MASK			0x3F
> +#define CXL_DPA_FLAGS_MASK			0x3FULL
>  #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
>  
>  #define CXL_DPA_VOLATILE			BIT(0)

Looks good,

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks
  2024-04-17  7:50   ` Shiyang Ruan via
                     ` (2 preceding siblings ...)
  (?)
@ 2024-04-23 17:42   ` Alison Schofield
  2024-04-23 21:04     ` Ira Weiny
  -1 siblings, 1 reply; 28+ messages in thread
From: Alison Schofield @ 2024-04-23 17:42 UTC (permalink / raw)
  To: Shiyang Ruan
  Cc: qemu-devel, linux-cxl, Jonathan.Cameron, dan.j.williams, dave,
	ira.weiny, stable

On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
> The length of Physical Address in General Media Event Record/DRAM Event
> Record is 64-bit, so the field mask should be defined as such length.
> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
> unaffected.
> 
> If userspace was doing its own DPA-to-HPA translation this could lead to
> incorrect page retirement decisions, but there is no known consumer
> (like rasdaemon) of this event today.
> 

So, an invalid DPA is emitted in the trace event log and that could
lead to 'incorrect page retirement decisions...'

> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
> Cc: <stable@vger.kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  drivers/cxl/core/trace.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e5f13260fc52..cdfce932d5b1 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
>   * DRAM Event Record
>   * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>   */
> -#define CXL_DPA_FLAGS_MASK			0x3F
> +#define CXL_DPA_FLAGS_MASK			0x3FULL
>  #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
>  
>  #define CXL_DPA_VOLATILE			BIT(0)

This works but I'm thinking this is the time to convene on one 
CXL_EVENT_DPA_MASK for both all CXL events, rather than having
cxl_poison event be different.

I prefer how poison defines it:

cxlmem.h:#define CXL_POISON_START_MASK          GENMASK_ULL(63, 6)

Can we rename that CXL_EVENT_DPA_MASK and use for all events?

--Alison

> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] cxl/core: add poison creation event handler
  2024-04-17  7:50   ` Shiyang Ruan
                     ` (2 preceding siblings ...)
  (?)
@ 2024-04-23 17:57   ` Ira Weiny
  2024-05-03 10:42       ` Shiyang Ruan via
  -1 siblings, 1 reply; 28+ messages in thread
From: Ira Weiny @ 2024-04-23 17:57 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny, alison.schofield

Shiyang Ruan wrote:
> Currently driver only traces cxl events, poison creation (for both vmem
> and pmem type) on cxl memdev is silent.  OS needs to be notified then it
> could handle poison pages in time.  Per CXL spec, the device error event
> could be signaled through FW-First and OS-First methods.
> 
> So, add poison creation event handler in OS-First method:
>   - Qemu:
>     - CXL device reports POISON creation event to OS by MSI by sending
>       GMER/DER after injecting a poison record;
>   - CXL driver:
>     a. parse the POISON event from GMER/DER;
>     b. translate poisoned DPA to HPA (PFN);
>     c. enqueue poisoned PFN to memory_failure's work queue;

I'm a bit confused by the need for this patch.  Perhaps a bit more detail
here?

More comments below.

> 
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
>  drivers/cxl/cxlmem.h      |   8 +--
>  include/linux/cxl-event.h |  18 +++++-
>  3 files changed, 125 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index f0f54aeccc87..76af0d73859d 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>  
> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> -			    enum cxl_event_log_type type,
> -			    enum cxl_event_type event_type,
> -			    const uuid_t *uuid, union cxl_event *evt)
> +static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
> +			      u64 dpa)
>  {
> -	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
> +	u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
> +	unsigned long pfn = PHYS_PFN(hpa);
> +
> +	if (!IS_ENABLED(CONFIG_MEMORY_FAILURE))
> +		return;
> +
> +	memory_failure_queue(pfn, MF_ACTION_REQUIRED);

I thought that ras daemon was supposed to take care of this when the trace
event occurred.  Alison is working on the HPA data for that path.

> +}
> +
> +static int __cxl_report_poison(struct device *dev, void *arg)
> +{
> +	struct cxl_endpoint_decoder *cxled;
> +	struct cxl_memdev *cxlmd;
> +	u64 dpa = *(u64 *)arg;
> +
> +	cxled = to_cxl_endpoint_decoder(dev);
> +	if (!cxled || !cxled->dpa_res || !resource_size(cxled->dpa_res))
> +		return 0;
> +
> +	if (cxled->mode == CXL_DECODER_MIXED) {
> +		dev_dbg(dev, "poison list read unsupported in mixed mode\n");
> +		return 0;
> +	}
> +
> +	if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
> +		return 0;
> +
> +	cxlmd = cxled_to_memdev(cxled);
> +	cxl_report_poison(cxlmd, cxled->cxld.region, dpa);
> +
> +	return 1;
> +}
> +
> +static void cxl_event_handle_poison(struct cxl_memdev *cxlmd, u64 dpa)
> +{
> +	struct cxl_port *port = cxlmd->endpoint;
> +
> +	/*
> +	 * No region is mapped to this endpoint, that is to say no HPA is
> +	 * mapped.
> +	 */
> +	if (!port || !is_cxl_endpoint(port) ||
> +	    cxl_num_decoders_committed(port) == 0)
> +		return;
> +
> +	device_for_each_child(&port->dev, &dpa, __cxl_report_poison);
> +}
> +
> +static void cxl_event_handle_general_media(struct cxl_memdev *cxlmd,
> +					   enum cxl_event_log_type type,
> +					   struct cxl_event_gen_media *rec)
> +{
> +	u64 dpa = le64_to_cpu(rec->phys_addr) & CXL_DPA_MASK;
> +
> +	if (type == CXL_EVENT_TYPE_FAIL) {

Why only FAIL and not FATAL?

> +		switch (rec->transaction_type) {
> +		case CXL_EVENT_TRANSACTION_READ:
> +		case CXL_EVENT_TRANSACTION_WRITE:
> +		case CXL_EVENT_TRANSACTION_INJECT_POISON:

Why not scan media?

> +			cxl_event_handle_poison(cxlmd, dpa);
> +			break;
> +		default:
> +			break;
> +		}
> +	}
> +}
> +
> +static void cxl_event_handle_dram(struct cxl_memdev *cxlmd,
> +				  enum cxl_event_log_type type,
> +				  struct cxl_event_dram *rec)
> +{
> +	u64 dpa = le64_to_cpu(rec->phys_addr) & CXL_DPA_MASK;
> +
> +	if (type == CXL_EVENT_TYPE_FAIL) {
> +		switch (rec->transaction_type) {
> +		case CXL_EVENT_TRANSACTION_READ:
> +		case CXL_EVENT_TRANSACTION_WRITE:
> +		case CXL_EVENT_TRANSACTION_INJECT_POISON:
> +			cxl_event_handle_poison(cxlmd, dpa);
> +			break;
> +		default:
> +			break;
> +		}
> +	}
> +}
> +
> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +			     enum cxl_event_log_type type,
> +			     enum cxl_event_type event_type,
> +			     const uuid_t *uuid, union cxl_event *evt)
> +{
> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
>  		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
> -	else if (event_type == CXL_CPER_EVENT_DRAM)
> +		cxl_event_handle_general_media(cxlmd, type, &evt->gen_media);
> +	} else if (event_type == CXL_CPER_EVENT_DRAM) {
>  		trace_cxl_dram(cxlmd, type, &evt->dram);
> -	else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
> +		cxl_event_handle_dram(cxlmd, type, &evt->dram);
> +	} else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
>  		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
>  	else
>  		trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
>  }
> -EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
> +EXPORT_SYMBOL_NS_GPL(cxl_event_handle_record, CXL);
>

Why all the churn with changing the names of functions?

Ira

>  
> -static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> -				     enum cxl_event_log_type type,
> -				     struct cxl_event_record_raw *record)
> +static void __cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +				      enum cxl_event_log_type type,
> +				      struct cxl_event_record_raw *record)
>  {
>  	enum cxl_event_type ev_type = CXL_CPER_EVENT_GENERIC;
>  	const uuid_t *uuid = &record->id;
> @@ -867,7 +958,7 @@ static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>  	else if (uuid_equal(uuid, &CXL_EVENT_MEM_MODULE_UUID))
>  		ev_type = CXL_CPER_EVENT_MEM_MODULE;
>  
> -	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
> +	cxl_event_handle_record(cxlmd, type, ev_type, uuid, &record->event);
>  }
>  
>  static int cxl_clear_event_record(struct cxl_memdev_state *mds,
> @@ -979,8 +1070,8 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
>  			break;
>  
>  		for (i = 0; i < nr_rec; i++)
> -			__cxl_event_trace_record(cxlmd, type,
> -						 &payload->records[i]);
> +			__cxl_event_handle_record(cxlmd, type,
> +						  &payload->records[i]);
>  
>  		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
>  			trace_cxl_overflow(cxlmd, type, payload);
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 36cee9c30ceb..ba1347de5651 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -822,10 +822,10 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>  void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>  				  unsigned long *cmds);
>  void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> -			    enum cxl_event_log_type type,
> -			    enum cxl_event_type event_type,
> -			    const uuid_t *uuid, union cxl_event *evt);
> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +			     enum cxl_event_log_type type,
> +			     enum cxl_event_type event_type,
> +			     const uuid_t *uuid, union cxl_event *evt);
>  int cxl_set_timestamp(struct cxl_memdev_state *mds);
>  int cxl_poison_state_init(struct cxl_memdev_state *mds);
>  int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
> diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
> index 03fa6d50d46f..8189bed76c12 100644
> --- a/include/linux/cxl-event.h
> +++ b/include/linux/cxl-event.h
> @@ -23,6 +23,20 @@ struct cxl_event_generic {
>  	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
>  } __packed;
>  
> +/*
> + * Event transaction type
> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> + */
> +enum cxl_event_transaction_type {
> +	CXL_EVENT_TRANSACTION_UNKNOWN = 0X00,
> +	CXL_EVENT_TRANSACTION_READ,
> +	CXL_EVENT_TRANSACTION_WRITE,
> +	CXL_EVENT_TRANSACTION_SCAN_MEDIA,
> +	CXL_EVENT_TRANSACTION_INJECT_POISON,
> +	CXL_EVENT_TRANSACTION_MEDIA_SCRUB,
> +	CXL_EVENT_TRANSACTION_MEDIA_MANAGEMENT,
> +};
> +
>  /*
>   * General Media Event Record
>   * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
> @@ -33,7 +47,7 @@ struct cxl_event_gen_media {
>  	__le64 phys_addr;
>  	u8 descriptor;
>  	u8 type;
> -	u8 transaction_type;
> +	u8 transaction_type;	/* enum cxl_event_transaction_type */
>  	u8 validity_flags[2];
>  	u8 channel;
>  	u8 rank;
> @@ -52,7 +66,7 @@ struct cxl_event_dram {
>  	__le64 phys_addr;
>  	u8 descriptor;
>  	u8 type;
> -	u8 transaction_type;
> +	u8 transaction_type;	/* enum cxl_event_transaction_type */
>  	u8 validity_flags[2];
>  	u8 channel;
>  	u8 rank;
> -- 
> 2.34.1
> 



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] cxl/core: add poison creation event handler
  2024-04-17  7:50   ` Shiyang Ruan
                     ` (3 preceding siblings ...)
  (?)
@ 2024-04-23 18:40   ` Dan Williams
  2024-05-03 11:32       ` Shiyang Ruan via
  -1 siblings, 1 reply; 28+ messages in thread
From: Dan Williams @ 2024-04-23 18:40 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl
  Cc: Jonathan.Cameron, dan.j.williams, dave, ira.weiny, alison.schofield

Shiyang Ruan wrote:
> Currently driver only traces cxl events, poison creation (for both vmem
> and pmem type) on cxl memdev is silent.

As it should be.

> OS needs to be notified then it could handle poison pages in time.

No, it was always the case that latent poison is an "action optional"
event. I am not understanding the justification for this approach. What
breaks if the kernel does not forward events to memory_failure_queue()?

Consider that in the CPU consumption case that the firmware first path
will do its own memory_failure_queue() and in the native case the MCE
handler will take care of this. So that leaves pages that are accessed
by DMA or background operation that encounter poison. Those are "action
optional" scenarios and it is not clear to me how the driver tells the
difference.

This needs more precision on which agent is repsonsible for what level
of reporting. The distribution of responsibility between ACPI GHES,
EDAC, and the CXL driver is messy and I expect this changelog to
demonstrate it understands all those considerations.

> Per CXL spec, the device error event could be signaled through
> FW-First and OS-First methods.
> 
> So, add poison creation event handler in OS-First method:
>   - Qemu:

Why is QEMU relevant for this patch? QEMU is only a development vehicle
the upstream enabling should be reference shipping or expected to be
shipping hardware implementations.

>     - CXL device reports POISON creation event to OS by MSI by sending
>       GMER/DER after injecting a poison record;

When you say "inject" here do you mean "add to the poison list if
present". Because "inject" to me means the "Inject Poison" Memory Device
Command.

>   - CXL driver:
>     a. parse the POISON event from GMER/DER;
>     b. translate poisoned DPA to HPA (PFN);
>     c. enqueue poisoned PFN to memory_failure's work queue;
> 
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
>  drivers/cxl/cxlmem.h      |   8 +--
>  include/linux/cxl-event.h |  18 +++++-
>  3 files changed, 125 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index f0f54aeccc87..76af0d73859d 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>  
> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> -			    enum cxl_event_log_type type,
> -			    enum cxl_event_type event_type,
> -			    const uuid_t *uuid, union cxl_event *evt)
> +static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
> +			      u64 dpa)
>  {
> -	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
> +	u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
> +	unsigned long pfn = PHYS_PFN(hpa);
> +
> +	if (!IS_ENABLED(CONFIG_MEMORY_FAILURE))
> +		return;

No need for this check, memory_failure_queue() is already stubbed out in
the CONFIG_MEMORY_FAILURE=n case.

> +	memory_failure_queue(pfn, MF_ACTION_REQUIRED);

My expectation is MF_ACTION_REQUIRED is not appropriate for CXL event
reported errors since action is only required for direct consumption
events and those need not be reported through the device event queue.

It would be useful to collaborate with a BIOS firmware engineer so that
the kernel ends up with similar logic as is used to set CPER record
severity, or at least understands why it would want to be different.
See how ghes_handle_memory_failure() determines the
memory_failure_queue() flags.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks
  2024-04-23 17:42   ` Alison Schofield
@ 2024-04-23 21:04     ` Ira Weiny
  2024-04-25 10:05         ` Shiyang Ruan via
  0 siblings, 1 reply; 28+ messages in thread
From: Ira Weiny @ 2024-04-23 21:04 UTC (permalink / raw)
  To: Alison Schofield, Shiyang Ruan
  Cc: qemu-devel, linux-cxl, Jonathan.Cameron, dan.j.williams, dave,
	ira.weiny, stable

Alison Schofield wrote:
> On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:

[snip]

> > diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> > index e5f13260fc52..cdfce932d5b1 100644
> > --- a/drivers/cxl/core/trace.h
> > +++ b/drivers/cxl/core/trace.h
> > @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
> >   * DRAM Event Record
> >   * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> >   */
> > -#define CXL_DPA_FLAGS_MASK			0x3F
> > +#define CXL_DPA_FLAGS_MASK			0x3FULL
> >  #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
> >  
> >  #define CXL_DPA_VOLATILE			BIT(0)
> 
> This works but I'm thinking this is the time to convene on one 
> CXL_EVENT_DPA_MASK for both all CXL events, rather than having
> cxl_poison event be different.
> 
> I prefer how poison defines it:
> 
> cxlmem.h:#define CXL_POISON_START_MASK          GENMASK_ULL(63, 6)
> 
> Can we rename that CXL_EVENT_DPA_MASK and use for all events?

Ah!  Great catch.  I dont' know why I only masked off the 2 used bits.
That was short sighted of me.

Yes we should consolidate these.
Ira

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks
  2024-04-23 21:04     ` Ira Weiny
@ 2024-04-25 10:05         ` Shiyang Ruan via
  0 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan @ 2024-04-25 10:05 UTC (permalink / raw)
  To: Ira Weiny, Alison Schofield
  Cc: qemu-devel, linux-cxl, Jonathan.Cameron, dan.j.williams, dave, stable



在 2024/4/24 5:04, Ira Weiny 写道:
> Alison Schofield wrote:
>> On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
> 
> [snip]
> 
>>> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
>>> index e5f13260fc52..cdfce932d5b1 100644
>>> --- a/drivers/cxl/core/trace.h
>>> +++ b/drivers/cxl/core/trace.h
>>> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
>>>    * DRAM Event Record
>>>    * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>>>    */
>>> -#define CXL_DPA_FLAGS_MASK			0x3F
>>> +#define CXL_DPA_FLAGS_MASK			0x3FULL
>>>   #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
>>>   
>>>   #define CXL_DPA_VOLATILE			BIT(0)
>>
>> This works but I'm thinking this is the time to convene on one
>> CXL_EVENT_DPA_MASK for both all CXL events, rather than having
>> cxl_poison event be different.
>>
>> I prefer how poison defines it:
>>
>> cxlmem.h:#define CXL_POISON_START_MASK          GENMASK_ULL(63, 6)
>>
>> Can we rename that CXL_EVENT_DPA_MASK and use for all events?

cxlmem.h:CXL_POISON_START_MASK is defined for MBOX commands (poison 
record, the lower 3 bits indicate "Error Source"), but CXL_DPA_MASK here 
is for events.  They have different meaning though their values are 
same.  So, I don't think we should consolidate them.

> 
> Ah!  Great catch.  I dont' know why I only masked off the 2 used bits.

Per spec, the lowest 2 bits of CXL event's DPA are defined for "Volatile 
or not" and "not repairable".  So there is no mistake here.

> That was short sighted of me.
> 
> Yes we should consolidate these.
> Ira

--
Thanks,
Ruan.


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks
@ 2024-04-25 10:05         ` Shiyang Ruan via
  0 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan via @ 2024-04-25 10:05 UTC (permalink / raw)
  To: Ira Weiny, Alison Schofield
  Cc: qemu-devel, linux-cxl, Jonathan.Cameron, dan.j.williams, dave, stable



在 2024/4/24 5:04, Ira Weiny 写道:
> Alison Schofield wrote:
>> On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
> 
> [snip]
> 
>>> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
>>> index e5f13260fc52..cdfce932d5b1 100644
>>> --- a/drivers/cxl/core/trace.h
>>> +++ b/drivers/cxl/core/trace.h
>>> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
>>>    * DRAM Event Record
>>>    * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>>>    */
>>> -#define CXL_DPA_FLAGS_MASK			0x3F
>>> +#define CXL_DPA_FLAGS_MASK			0x3FULL
>>>   #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
>>>   
>>>   #define CXL_DPA_VOLATILE			BIT(0)
>>
>> This works but I'm thinking this is the time to convene on one
>> CXL_EVENT_DPA_MASK for both all CXL events, rather than having
>> cxl_poison event be different.
>>
>> I prefer how poison defines it:
>>
>> cxlmem.h:#define CXL_POISON_START_MASK          GENMASK_ULL(63, 6)
>>
>> Can we rename that CXL_EVENT_DPA_MASK and use for all events?

cxlmem.h:CXL_POISON_START_MASK is defined for MBOX commands (poison 
record, the lower 3 bits indicate "Error Source"), but CXL_DPA_MASK here 
is for events.  They have different meaning though their values are 
same.  So, I don't think we should consolidate them.

> 
> Ah!  Great catch.  I dont' know why I only masked off the 2 used bits.

Per spec, the lowest 2 bits of CXL event's DPA are defined for "Volatile 
or not" and "not repairable".  So there is no mistake here.

> That was short sighted of me.
> 
> Yes we should consolidate these.
> Ira

--
Thanks,
Ruan.



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks
  2024-04-25 10:05         ` Shiyang Ruan via
  (?)
@ 2024-04-25 16:04         ` Ira Weiny
  -1 siblings, 0 replies; 28+ messages in thread
From: Ira Weiny @ 2024-04-25 16:04 UTC (permalink / raw)
  To: Shiyang Ruan, Ira Weiny, Alison Schofield
  Cc: qemu-devel, linux-cxl, Jonathan.Cameron, dan.j.williams, dave, stable

Shiyang Ruan wrote:
> 
> 
> 在 2024/4/24 5:04, Ira Weiny 写道:
> > Alison Schofield wrote:
> >> On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
> > 
> > [snip]
> > 
> >>> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> >>> index e5f13260fc52..cdfce932d5b1 100644
> >>> --- a/drivers/cxl/core/trace.h
> >>> +++ b/drivers/cxl/core/trace.h
> >>> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
> >>>    * DRAM Event Record
> >>>    * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> >>>    */
> >>> -#define CXL_DPA_FLAGS_MASK			0x3F
> >>> +#define CXL_DPA_FLAGS_MASK			0x3FULL
> >>>   #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
> >>>   
> >>>   #define CXL_DPA_VOLATILE			BIT(0)
> >>
> >> This works but I'm thinking this is the time to convene on one
> >> CXL_EVENT_DPA_MASK for both all CXL events, rather than having
> >> cxl_poison event be different.
> >>
> >> I prefer how poison defines it:
> >>
> >> cxlmem.h:#define CXL_POISON_START_MASK          GENMASK_ULL(63, 6)
> >>
> >> Can we rename that CXL_EVENT_DPA_MASK and use for all events?
> 
> cxlmem.h:CXL_POISON_START_MASK is defined for MBOX commands (poison 
> record, the lower 3 bits indicate "Error Source"), but CXL_DPA_MASK here 
> is for events.  They have different meaning though their values are 
> same.  So, I don't think we should consolidate them.

By definition the DPA in all these payloads can't use the lower 6 bits.
We are defining a mask to get the DPA.  This has nothing to do with what
may be stored in the other 6 bits.

Defining a common DPA mask is correct per both sections of the spec.

> 
> > 
> > Ah!  Great catch.  I dont' know why I only masked off the 2 used bits.
> 
> Per spec, the lowest 2 bits of CXL event's DPA are defined for "Volatile 
> or not" and "not repairable".  So there is no mistake here.

I appreciate your not calling out my code as a bug.  :-D

However, bits [5:2] are also Reserved.  So it is incorrect to mask off
only the lower 2 bits.  Even though the reserved bits must be 0 per the
spec, it is still better to properly mask all reserved bits because they
are by definition not part of the DPA.

Could you create a common macro for the next version of the patch?

Thanks,
Ira

> 
> > That was short sighted of me.
> > 
> > Yes we should consolidate these.
> > Ira
> 
> --
> Thanks,
> Ruan.
> 



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks
  2024-04-17  7:50   ` Shiyang Ruan via
                     ` (3 preceding siblings ...)
  (?)
@ 2024-04-30 21:00   ` Alison Schofield
  2024-05-03 11:37       ` Shiyang Ruan
  -1 siblings, 1 reply; 28+ messages in thread
From: Alison Schofield @ 2024-04-30 21:00 UTC (permalink / raw)
  To: Shiyang Ruan
  Cc: qemu-devel, linux-cxl, Jonathan.Cameron, dan.j.williams, dave,
	ira.weiny, stable

On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
> The length of Physical Address in General Media Event Record/DRAM Event
> Record is 64-bit, so the field mask should be defined as such length.
> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
> unaffected.
> 
> If userspace was doing its own DPA-to-HPA translation this could lead to
> incorrect page retirement decisions, but there is no known consumer
> (like rasdaemon) of this event today.
> 
> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
> Cc: <stable@vger.kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>

Hi Ruan,

This fixup is important for the Event DPA->HPA translation work, so I
grabbed it, updated it with most* of the review comments, and posted
with that set. I expect you saw that in your mailbox.

DaveJ queued it in a topic branch for 6.10 here:
https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.10/dpa-to-hpa

*I did not create a common mask for events and poison because I wanted to
limit the changes. If you'd like to make that change it would be welcomed.

-- Alison

> ---
>  drivers/cxl/core/trace.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e5f13260fc52..cdfce932d5b1 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
>   * DRAM Event Record
>   * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>   */
> -#define CXL_DPA_FLAGS_MASK			0x3F
> +#define CXL_DPA_FLAGS_MASK			0x3FULL
>  #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
>  
>  #define CXL_DPA_VOLATILE			BIT(0)
> -- 
> 2.34.1
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] cxl/core: add poison creation event handler
  2024-04-23 17:57   ` Ira Weiny
@ 2024-05-03 10:42       ` Shiyang Ruan via
  0 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan @ 2024-05-03 10:42 UTC (permalink / raw)
  To: Ira Weiny
  Cc: linux-cxl, qemu-devel, Jonathan.Cameron, dan.j.williams, dave,
	alison.schofield



在 2024/4/24 1:57, Ira Weiny 写道:
> Shiyang Ruan wrote:
>> Currently driver only traces cxl events, poison creation (for both vmem
>> and pmem type) on cxl memdev is silent.  OS needs to be notified then it
>> could handle poison pages in time.  Per CXL spec, the device error event
>> could be signaled through FW-First and OS-First methods.
>>
>> So, add poison creation event handler in OS-First method:
>>    - Qemu:
>>      - CXL device reports POISON creation event to OS by MSI by sending
>>        GMER/DER after injecting a poison record;
>>    - CXL driver:
>>      a. parse the POISON event from GMER/DER;
>>      b. translate poisoned DPA to HPA (PFN);
>>      c. enqueue poisoned PFN to memory_failure's work queue;
> 
> I'm a bit confused by the need for this patch.  Perhaps a bit more detail
> here?

Yes, I should have wrote more details.

I want to check and make sure the HWPOISON on a CXL device (type3) is 
working properly.  For example, a FSDAX filesystem created on a 
type3-pmem device, then it gets a POISON bit, the OS should be able to 
handle this POISON event: find the relevant process

Currently I'm using Qemu with several simulated CXL devices, and using 
poison injection API of Qemu to create POISON records, but OS isn't 
notified.  Only when we actively call list POISON records (cxl list -L) 
will the driver fetch them and log into trace events, then we see the 
POISON records.  Memory failure wasn't triggered too.

That's why I said "poison creation on cxl memdev is silent".  Per spec, 
POISON creation should be notified to OS.  Since not familiar with 
firmware part, I'm try adding this notification for OS-First.

> 
> More comments below.
> 
>>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
>> ---
>>   drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
>>   drivers/cxl/cxlmem.h      |   8 +--
>>   include/linux/cxl-event.h |  18 +++++-
>>   3 files changed, 125 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
>> index f0f54aeccc87..76af0d73859d 100644
>> --- a/drivers/cxl/core/mbox.c
>> +++ b/drivers/cxl/core/mbox.c
>> @@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>>   
>> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>> -			    enum cxl_event_log_type type,
>> -			    enum cxl_event_type event_type,
>> -			    const uuid_t *uuid, union cxl_event *evt)
>> +static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
>> +			      u64 dpa)
>>   {
>> -	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
>> +	u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
>> +	unsigned long pfn = PHYS_PFN(hpa);
>> +
>> +	if (!IS_ENABLED(CONFIG_MEMORY_FAILURE))
>> +		return;
>> +
>> +	memory_failure_queue(pfn, MF_ACTION_REQUIRED);
> 
> I thought that ras daemon was supposed to take care of this when the trace
> event occurred.  Alison is working on the HPA data for that path.

It seems to save CXL trace events/memory-failures to DB and report to 
others, but cannot let OS call memory failure.

> 
>> +}
>> +
>> +static int __cxl_report_poison(struct device *dev, void *arg)
>> +{
>> +	struct cxl_endpoint_decoder *cxled;
>> +	struct cxl_memdev *cxlmd;
>> +	u64 dpa = *(u64 *)arg;
>> +
>> +	cxled = to_cxl_endpoint_decoder(dev);
>> +	if (!cxled || !cxled->dpa_res || !resource_size(cxled->dpa_res))
>> +		return 0;
>> +
>> +	if (cxled->mode == CXL_DECODER_MIXED) {
>> +		dev_dbg(dev, "poison list read unsupported in mixed mode\n");
>> +		return 0;
>> +	}
>> +
>> +	if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
>> +		return 0;
>> +
>> +	cxlmd = cxled_to_memdev(cxled);
>> +	cxl_report_poison(cxlmd, cxled->cxld.region, dpa);
>> +
>> +	return 1;
>> +}
>> +
>> +static void cxl_event_handle_poison(struct cxl_memdev *cxlmd, u64 dpa)
>> +{
>> +	struct cxl_port *port = cxlmd->endpoint;
>> +
>> +	/*
>> +	 * No region is mapped to this endpoint, that is to say no HPA is
>> +	 * mapped.
>> +	 */
>> +	if (!port || !is_cxl_endpoint(port) ||
>> +	    cxl_num_decoders_committed(port) == 0)
>> +		return;
>> +
>> +	device_for_each_child(&port->dev, &dpa, __cxl_report_poison);
>> +}
>> +
>> +static void cxl_event_handle_general_media(struct cxl_memdev *cxlmd,
>> +					   enum cxl_event_log_type type,
>> +					   struct cxl_event_gen_media *rec)
>> +{
>> +	u64 dpa = le64_to_cpu(rec->phys_addr) & CXL_DPA_MASK;
>> +
>> +	if (type == CXL_EVENT_TYPE_FAIL) {
> 
> Why only FAIL and not FATAL?

POISON is recoverable(could be cleared by sending CLEAR_POISON command) 
and won't cause serious problems, so only use failure event here.

> 
>> +		switch (rec->transaction_type) {
>> +		case CXL_EVENT_TRANSACTION_READ:
>> +		case CXL_EVENT_TRANSACTION_WRITE:
>> +		case CXL_EVENT_TRANSACTION_INJECT_POISON:
> 
> Why not scan media?

Yes, this type should be handled too.  Will add.

> 
>> +			cxl_event_handle_poison(cxlmd, dpa);
>> +			break;
>> +		default:
>> +			break;
>> +		}
>> +	}
>> +}
>> +
>> +static void cxl_event_handle_dram(struct cxl_memdev *cxlmd,
>> +				  enum cxl_event_log_type type,
>> +				  struct cxl_event_dram *rec)
>> +{
>> +	u64 dpa = le64_to_cpu(rec->phys_addr) & CXL_DPA_MASK;
>> +
>> +	if (type == CXL_EVENT_TYPE_FAIL) {
>> +		switch (rec->transaction_type) {
>> +		case CXL_EVENT_TRANSACTION_READ:
>> +		case CXL_EVENT_TRANSACTION_WRITE:
>> +		case CXL_EVENT_TRANSACTION_INJECT_POISON:
>> +			cxl_event_handle_poison(cxlmd, dpa);
>> +			break;
>> +		default:
>> +			break;
>> +		}
>> +	}
>> +}
>> +
>> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
>> +			     enum cxl_event_log_type type,
>> +			     enum cxl_event_type event_type,
>> +			     const uuid_t *uuid, union cxl_event *evt)
>> +{
>> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
>>   		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
>> -	else if (event_type == CXL_CPER_EVENT_DRAM)
>> +		cxl_event_handle_general_media(cxlmd, type, &evt->gen_media);
>> +	} else if (event_type == CXL_CPER_EVENT_DRAM) {
>>   		trace_cxl_dram(cxlmd, type, &evt->dram);
>> -	else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
>> +		cxl_event_handle_dram(cxlmd, type, &evt->dram);
>> +	} else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
>>   		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
>>   	else
>>   		trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
>>   }
>> -EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
>> +EXPORT_SYMBOL_NS_GPL(cxl_event_handle_record, CXL);
>>
> 
> Why all the churn with changing the names of functions?

After adding cxl_report_poison(), these functions not only traces events 
but also contains event handler.  So rename them.

--
Thanks,
Ruan.

> 
> Ira
> 
>>   
>> -static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>> -				     enum cxl_event_log_type type,
>> -				     struct cxl_event_record_raw *record)
>> +static void __cxl_event_handle_record(struct cxl_memdev *cxlmd,
>> +				      enum cxl_event_log_type type,
>> +				      struct cxl_event_record_raw *record)
>>   {
>>   	enum cxl_event_type ev_type = CXL_CPER_EVENT_GENERIC;
>>   	const uuid_t *uuid = &record->id;
>> @@ -867,7 +958,7 @@ static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>>   	else if (uuid_equal(uuid, &CXL_EVENT_MEM_MODULE_UUID))
>>   		ev_type = CXL_CPER_EVENT_MEM_MODULE;
>>   
>> -	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
>> +	cxl_event_handle_record(cxlmd, type, ev_type, uuid, &record->event);
>>   }
>>   
>>   static int cxl_clear_event_record(struct cxl_memdev_state *mds,
>> @@ -979,8 +1070,8 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
>>   			break;
>>   
>>   		for (i = 0; i < nr_rec; i++)
>> -			__cxl_event_trace_record(cxlmd, type,
>> -						 &payload->records[i]);
>> +			__cxl_event_handle_record(cxlmd, type,
>> +						  &payload->records[i]);
>>   
>>   		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
>>   			trace_cxl_overflow(cxlmd, type, payload);
>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>> index 36cee9c30ceb..ba1347de5651 100644
>> --- a/drivers/cxl/cxlmem.h
>> +++ b/drivers/cxl/cxlmem.h
>> @@ -822,10 +822,10 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>>   void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>>   				  unsigned long *cmds);
>>   void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
>> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>> -			    enum cxl_event_log_type type,
>> -			    enum cxl_event_type event_type,
>> -			    const uuid_t *uuid, union cxl_event *evt);
>> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
>> +			     enum cxl_event_log_type type,
>> +			     enum cxl_event_type event_type,
>> +			     const uuid_t *uuid, union cxl_event *evt);
>>   int cxl_set_timestamp(struct cxl_memdev_state *mds);
>>   int cxl_poison_state_init(struct cxl_memdev_state *mds);
>>   int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
>> diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
>> index 03fa6d50d46f..8189bed76c12 100644
>> --- a/include/linux/cxl-event.h
>> +++ b/include/linux/cxl-event.h
>> @@ -23,6 +23,20 @@ struct cxl_event_generic {
>>   	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
>>   } __packed;
>>   
>> +/*
>> + * Event transaction type
>> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
>> + */
>> +enum cxl_event_transaction_type {
>> +	CXL_EVENT_TRANSACTION_UNKNOWN = 0X00,
>> +	CXL_EVENT_TRANSACTION_READ,
>> +	CXL_EVENT_TRANSACTION_WRITE,
>> +	CXL_EVENT_TRANSACTION_SCAN_MEDIA,
>> +	CXL_EVENT_TRANSACTION_INJECT_POISON,
>> +	CXL_EVENT_TRANSACTION_MEDIA_SCRUB,
>> +	CXL_EVENT_TRANSACTION_MEDIA_MANAGEMENT,
>> +};
>> +
>>   /*
>>    * General Media Event Record
>>    * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
>> @@ -33,7 +47,7 @@ struct cxl_event_gen_media {
>>   	__le64 phys_addr;
>>   	u8 descriptor;
>>   	u8 type;
>> -	u8 transaction_type;
>> +	u8 transaction_type;	/* enum cxl_event_transaction_type */
>>   	u8 validity_flags[2];
>>   	u8 channel;
>>   	u8 rank;
>> @@ -52,7 +66,7 @@ struct cxl_event_dram {
>>   	__le64 phys_addr;
>>   	u8 descriptor;
>>   	u8 type;
>> -	u8 transaction_type;
>> +	u8 transaction_type;	/* enum cxl_event_transaction_type */
>>   	u8 validity_flags[2];
>>   	u8 channel;
>>   	u8 rank;
>> -- 
>> 2.34.1
>>
> 
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] cxl/core: add poison creation event handler
@ 2024-05-03 10:42       ` Shiyang Ruan via
  0 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan via @ 2024-05-03 10:42 UTC (permalink / raw)
  To: Ira Weiny
  Cc: linux-cxl, qemu-devel, Jonathan.Cameron, dan.j.williams, dave,
	alison.schofield



在 2024/4/24 1:57, Ira Weiny 写道:
> Shiyang Ruan wrote:
>> Currently driver only traces cxl events, poison creation (for both vmem
>> and pmem type) on cxl memdev is silent.  OS needs to be notified then it
>> could handle poison pages in time.  Per CXL spec, the device error event
>> could be signaled through FW-First and OS-First methods.
>>
>> So, add poison creation event handler in OS-First method:
>>    - Qemu:
>>      - CXL device reports POISON creation event to OS by MSI by sending
>>        GMER/DER after injecting a poison record;
>>    - CXL driver:
>>      a. parse the POISON event from GMER/DER;
>>      b. translate poisoned DPA to HPA (PFN);
>>      c. enqueue poisoned PFN to memory_failure's work queue;
> 
> I'm a bit confused by the need for this patch.  Perhaps a bit more detail
> here?

Yes, I should have wrote more details.

I want to check and make sure the HWPOISON on a CXL device (type3) is 
working properly.  For example, a FSDAX filesystem created on a 
type3-pmem device, then it gets a POISON bit, the OS should be able to 
handle this POISON event: find the relevant process

Currently I'm using Qemu with several simulated CXL devices, and using 
poison injection API of Qemu to create POISON records, but OS isn't 
notified.  Only when we actively call list POISON records (cxl list -L) 
will the driver fetch them and log into trace events, then we see the 
POISON records.  Memory failure wasn't triggered too.

That's why I said "poison creation on cxl memdev is silent".  Per spec, 
POISON creation should be notified to OS.  Since not familiar with 
firmware part, I'm try adding this notification for OS-First.

> 
> More comments below.
> 
>>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
>> ---
>>   drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
>>   drivers/cxl/cxlmem.h      |   8 +--
>>   include/linux/cxl-event.h |  18 +++++-
>>   3 files changed, 125 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
>> index f0f54aeccc87..76af0d73859d 100644
>> --- a/drivers/cxl/core/mbox.c
>> +++ b/drivers/cxl/core/mbox.c
>> @@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>>   
>> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>> -			    enum cxl_event_log_type type,
>> -			    enum cxl_event_type event_type,
>> -			    const uuid_t *uuid, union cxl_event *evt)
>> +static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
>> +			      u64 dpa)
>>   {
>> -	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
>> +	u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
>> +	unsigned long pfn = PHYS_PFN(hpa);
>> +
>> +	if (!IS_ENABLED(CONFIG_MEMORY_FAILURE))
>> +		return;
>> +
>> +	memory_failure_queue(pfn, MF_ACTION_REQUIRED);
> 
> I thought that ras daemon was supposed to take care of this when the trace
> event occurred.  Alison is working on the HPA data for that path.

It seems to save CXL trace events/memory-failures to DB and report to 
others, but cannot let OS call memory failure.

> 
>> +}
>> +
>> +static int __cxl_report_poison(struct device *dev, void *arg)
>> +{
>> +	struct cxl_endpoint_decoder *cxled;
>> +	struct cxl_memdev *cxlmd;
>> +	u64 dpa = *(u64 *)arg;
>> +
>> +	cxled = to_cxl_endpoint_decoder(dev);
>> +	if (!cxled || !cxled->dpa_res || !resource_size(cxled->dpa_res))
>> +		return 0;
>> +
>> +	if (cxled->mode == CXL_DECODER_MIXED) {
>> +		dev_dbg(dev, "poison list read unsupported in mixed mode\n");
>> +		return 0;
>> +	}
>> +
>> +	if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
>> +		return 0;
>> +
>> +	cxlmd = cxled_to_memdev(cxled);
>> +	cxl_report_poison(cxlmd, cxled->cxld.region, dpa);
>> +
>> +	return 1;
>> +}
>> +
>> +static void cxl_event_handle_poison(struct cxl_memdev *cxlmd, u64 dpa)
>> +{
>> +	struct cxl_port *port = cxlmd->endpoint;
>> +
>> +	/*
>> +	 * No region is mapped to this endpoint, that is to say no HPA is
>> +	 * mapped.
>> +	 */
>> +	if (!port || !is_cxl_endpoint(port) ||
>> +	    cxl_num_decoders_committed(port) == 0)
>> +		return;
>> +
>> +	device_for_each_child(&port->dev, &dpa, __cxl_report_poison);
>> +}
>> +
>> +static void cxl_event_handle_general_media(struct cxl_memdev *cxlmd,
>> +					   enum cxl_event_log_type type,
>> +					   struct cxl_event_gen_media *rec)
>> +{
>> +	u64 dpa = le64_to_cpu(rec->phys_addr) & CXL_DPA_MASK;
>> +
>> +	if (type == CXL_EVENT_TYPE_FAIL) {
> 
> Why only FAIL and not FATAL?

POISON is recoverable(could be cleared by sending CLEAR_POISON command) 
and won't cause serious problems, so only use failure event here.

> 
>> +		switch (rec->transaction_type) {
>> +		case CXL_EVENT_TRANSACTION_READ:
>> +		case CXL_EVENT_TRANSACTION_WRITE:
>> +		case CXL_EVENT_TRANSACTION_INJECT_POISON:
> 
> Why not scan media?

Yes, this type should be handled too.  Will add.

> 
>> +			cxl_event_handle_poison(cxlmd, dpa);
>> +			break;
>> +		default:
>> +			break;
>> +		}
>> +	}
>> +}
>> +
>> +static void cxl_event_handle_dram(struct cxl_memdev *cxlmd,
>> +				  enum cxl_event_log_type type,
>> +				  struct cxl_event_dram *rec)
>> +{
>> +	u64 dpa = le64_to_cpu(rec->phys_addr) & CXL_DPA_MASK;
>> +
>> +	if (type == CXL_EVENT_TYPE_FAIL) {
>> +		switch (rec->transaction_type) {
>> +		case CXL_EVENT_TRANSACTION_READ:
>> +		case CXL_EVENT_TRANSACTION_WRITE:
>> +		case CXL_EVENT_TRANSACTION_INJECT_POISON:
>> +			cxl_event_handle_poison(cxlmd, dpa);
>> +			break;
>> +		default:
>> +			break;
>> +		}
>> +	}
>> +}
>> +
>> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
>> +			     enum cxl_event_log_type type,
>> +			     enum cxl_event_type event_type,
>> +			     const uuid_t *uuid, union cxl_event *evt)
>> +{
>> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
>>   		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
>> -	else if (event_type == CXL_CPER_EVENT_DRAM)
>> +		cxl_event_handle_general_media(cxlmd, type, &evt->gen_media);
>> +	} else if (event_type == CXL_CPER_EVENT_DRAM) {
>>   		trace_cxl_dram(cxlmd, type, &evt->dram);
>> -	else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
>> +		cxl_event_handle_dram(cxlmd, type, &evt->dram);
>> +	} else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
>>   		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
>>   	else
>>   		trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
>>   }
>> -EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
>> +EXPORT_SYMBOL_NS_GPL(cxl_event_handle_record, CXL);
>>
> 
> Why all the churn with changing the names of functions?

After adding cxl_report_poison(), these functions not only traces events 
but also contains event handler.  So rename them.

--
Thanks,
Ruan.

> 
> Ira
> 
>>   
>> -static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>> -				     enum cxl_event_log_type type,
>> -				     struct cxl_event_record_raw *record)
>> +static void __cxl_event_handle_record(struct cxl_memdev *cxlmd,
>> +				      enum cxl_event_log_type type,
>> +				      struct cxl_event_record_raw *record)
>>   {
>>   	enum cxl_event_type ev_type = CXL_CPER_EVENT_GENERIC;
>>   	const uuid_t *uuid = &record->id;
>> @@ -867,7 +958,7 @@ static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>>   	else if (uuid_equal(uuid, &CXL_EVENT_MEM_MODULE_UUID))
>>   		ev_type = CXL_CPER_EVENT_MEM_MODULE;
>>   
>> -	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
>> +	cxl_event_handle_record(cxlmd, type, ev_type, uuid, &record->event);
>>   }
>>   
>>   static int cxl_clear_event_record(struct cxl_memdev_state *mds,
>> @@ -979,8 +1070,8 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
>>   			break;
>>   
>>   		for (i = 0; i < nr_rec; i++)
>> -			__cxl_event_trace_record(cxlmd, type,
>> -						 &payload->records[i]);
>> +			__cxl_event_handle_record(cxlmd, type,
>> +						  &payload->records[i]);
>>   
>>   		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
>>   			trace_cxl_overflow(cxlmd, type, payload);
>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>> index 36cee9c30ceb..ba1347de5651 100644
>> --- a/drivers/cxl/cxlmem.h
>> +++ b/drivers/cxl/cxlmem.h
>> @@ -822,10 +822,10 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>>   void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>>   				  unsigned long *cmds);
>>   void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
>> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>> -			    enum cxl_event_log_type type,
>> -			    enum cxl_event_type event_type,
>> -			    const uuid_t *uuid, union cxl_event *evt);
>> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
>> +			     enum cxl_event_log_type type,
>> +			     enum cxl_event_type event_type,
>> +			     const uuid_t *uuid, union cxl_event *evt);
>>   int cxl_set_timestamp(struct cxl_memdev_state *mds);
>>   int cxl_poison_state_init(struct cxl_memdev_state *mds);
>>   int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
>> diff --git a/include/linux/cxl-event.h b/include/linux/cxl-event.h
>> index 03fa6d50d46f..8189bed76c12 100644
>> --- a/include/linux/cxl-event.h
>> +++ b/include/linux/cxl-event.h
>> @@ -23,6 +23,20 @@ struct cxl_event_generic {
>>   	u8 data[CXL_EVENT_RECORD_DATA_LENGTH];
>>   } __packed;
>>   
>> +/*
>> + * Event transaction type
>> + * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
>> + */
>> +enum cxl_event_transaction_type {
>> +	CXL_EVENT_TRANSACTION_UNKNOWN = 0X00,
>> +	CXL_EVENT_TRANSACTION_READ,
>> +	CXL_EVENT_TRANSACTION_WRITE,
>> +	CXL_EVENT_TRANSACTION_SCAN_MEDIA,
>> +	CXL_EVENT_TRANSACTION_INJECT_POISON,
>> +	CXL_EVENT_TRANSACTION_MEDIA_SCRUB,
>> +	CXL_EVENT_TRANSACTION_MEDIA_MANAGEMENT,
>> +};
>> +
>>   /*
>>    * General Media Event Record
>>    * CXL rev 3.0 Section 8.2.9.2.1.1; Table 8-43
>> @@ -33,7 +47,7 @@ struct cxl_event_gen_media {
>>   	__le64 phys_addr;
>>   	u8 descriptor;
>>   	u8 type;
>> -	u8 transaction_type;
>> +	u8 transaction_type;	/* enum cxl_event_transaction_type */
>>   	u8 validity_flags[2];
>>   	u8 channel;
>>   	u8 rank;
>> @@ -52,7 +66,7 @@ struct cxl_event_dram {
>>   	__le64 phys_addr;
>>   	u8 descriptor;
>>   	u8 type;
>> -	u8 transaction_type;
>> +	u8 transaction_type;	/* enum cxl_event_transaction_type */
>>   	u8 validity_flags[2];
>>   	u8 channel;
>>   	u8 rank;
>> -- 
>> 2.34.1
>>
> 
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] cxl/core: add poison creation event handler
  2024-04-23 18:40   ` Dan Williams
@ 2024-05-03 11:32       ` Shiyang Ruan via
  0 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan @ 2024-05-03 11:32 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, qemu-devel, Jonathan.Cameron, dave, ira.weiny,
	alison.schofield



在 2024/4/24 2:40, Dan Williams 写道:
> Shiyang Ruan wrote:
>> Currently driver only traces cxl events, poison creation (for both vmem
>> and pmem type) on cxl memdev is silent.
> 
> As it should be.
> 
>> OS needs to be notified then it could handle poison pages in time.
> 
> No, it was always the case that latent poison is an "action optional"
> event. I am not understanding the justification for this approach. What
> breaks if the kernel does not forward events to memory_failure_queue()?

I think for type3(pmem) device, it should be handled like NVDIMM.  If 
there are processes or filesystems running on it, they could be notified 
then operate a friendly shutdown if POISON happens.

> 
> Consider that in the CPU consumption case that the firmware first path
> will do its own memory_failure_queue() and in the native case the MCE
> handler will take care of this. So that leaves pages that are accessed
> by DMA or background operation that encounter poison. Those are "action
> optional" scenarios and it is not clear to me how the driver tells the
> difference.

So for real CXL device, it always use FW-First path to notify such 
failure event?  Then, there is nothing to do with OS-First path?

> 
> This needs more precision on which agent is repsonsible for what level
> of reporting. The distribution of responsibility between ACPI GHES,
> EDAC, and the CXL driver is messy and I expect this changelog to
> demonstrate it understands all those considerations.

Ok, I'll try to understand them.

> 
>> Per CXL spec, the device error event could be signaled through
>> FW-First and OS-First methods.
>>
>> So, add poison creation event handler in OS-First method:
>>    - Qemu:
> 
> Why is QEMU relevant for this patch? QEMU is only a development vehicle
> the upstream enabling should be reference shipping or expected to be
> shipping hardware implementations.

Yes, but currently we don't have a real CXL device so developing and 
verification could only be done on Qemu with simulated CXL device.

> 
>>      - CXL device reports POISON creation event to OS by MSI by sending
>>        GMER/DER after injecting a poison record;
> 
> When you say "inject" here do you mean "add to the poison list if
> present". Because "inject" to me means the "Inject Poison" Memory Device
> Command.

It's a Qemu qmp command called "cxl-inject-poison", which only adds a 
given address,length record to CXL's poison list, doesn't send 
INJECT_POISON mbox command.

> 
>>    - CXL driver:
>>      a. parse the POISON event from GMER/DER;
>>      b. translate poisoned DPA to HPA (PFN);
>>      c. enqueue poisoned PFN to memory_failure's work queue;
>>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
>> ---
>>   drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
>>   drivers/cxl/cxlmem.h      |   8 +--
>>   include/linux/cxl-event.h |  18 +++++-
>>   3 files changed, 125 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
>> index f0f54aeccc87..76af0d73859d 100644
>> --- a/drivers/cxl/core/mbox.c
>> +++ b/drivers/cxl/core/mbox.c
>> @@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>>   
>> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>> -			    enum cxl_event_log_type type,
>> -			    enum cxl_event_type event_type,
>> -			    const uuid_t *uuid, union cxl_event *evt)
>> +static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
>> +			      u64 dpa)
>>   {
>> -	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
>> +	u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
>> +	unsigned long pfn = PHYS_PFN(hpa);
>> +
>> +	if (!IS_ENABLED(CONFIG_MEMORY_FAILURE))
>> +		return;
> 
> No need for this check, memory_failure_queue() is already stubbed out in
> the CONFIG_MEMORY_FAILURE=n case.
Yes, I'm overthinking it.

> 
>> +	memory_failure_queue(pfn, MF_ACTION_REQUIRED);
> 
> My expectation is MF_ACTION_REQUIRED is not appropriate for CXL event
> reported errors since action is only required for direct consumption
> events and those need not be reported through the device event queue.
Got it.

> 
> It would be useful to collaborate with a BIOS firmware engineer so that
> the kernel ends up with similar logic as is used to set CPER record
> severity, or at least understands why it would want to be different.
> See how ghes_handle_memory_failure() determines the
> memory_failure_queue() flags.

Ok, thanks for the advice.


--
Ruan.




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] cxl/core: add poison creation event handler
@ 2024-05-03 11:32       ` Shiyang Ruan via
  0 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan via @ 2024-05-03 11:32 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-cxl, qemu-devel, Jonathan.Cameron, dave, ira.weiny,
	alison.schofield



在 2024/4/24 2:40, Dan Williams 写道:
> Shiyang Ruan wrote:
>> Currently driver only traces cxl events, poison creation (for both vmem
>> and pmem type) on cxl memdev is silent.
> 
> As it should be.
> 
>> OS needs to be notified then it could handle poison pages in time.
> 
> No, it was always the case that latent poison is an "action optional"
> event. I am not understanding the justification for this approach. What
> breaks if the kernel does not forward events to memory_failure_queue()?

I think for type3(pmem) device, it should be handled like NVDIMM.  If 
there are processes or filesystems running on it, they could be notified 
then operate a friendly shutdown if POISON happens.

> 
> Consider that in the CPU consumption case that the firmware first path
> will do its own memory_failure_queue() and in the native case the MCE
> handler will take care of this. So that leaves pages that are accessed
> by DMA or background operation that encounter poison. Those are "action
> optional" scenarios and it is not clear to me how the driver tells the
> difference.

So for real CXL device, it always use FW-First path to notify such 
failure event?  Then, there is nothing to do with OS-First path?

> 
> This needs more precision on which agent is repsonsible for what level
> of reporting. The distribution of responsibility between ACPI GHES,
> EDAC, and the CXL driver is messy and I expect this changelog to
> demonstrate it understands all those considerations.

Ok, I'll try to understand them.

> 
>> Per CXL spec, the device error event could be signaled through
>> FW-First and OS-First methods.
>>
>> So, add poison creation event handler in OS-First method:
>>    - Qemu:
> 
> Why is QEMU relevant for this patch? QEMU is only a development vehicle
> the upstream enabling should be reference shipping or expected to be
> shipping hardware implementations.

Yes, but currently we don't have a real CXL device so developing and 
verification could only be done on Qemu with simulated CXL device.

> 
>>      - CXL device reports POISON creation event to OS by MSI by sending
>>        GMER/DER after injecting a poison record;
> 
> When you say "inject" here do you mean "add to the poison list if
> present". Because "inject" to me means the "Inject Poison" Memory Device
> Command.

It's a Qemu qmp command called "cxl-inject-poison", which only adds a 
given address,length record to CXL's poison list, doesn't send 
INJECT_POISON mbox command.

> 
>>    - CXL driver:
>>      a. parse the POISON event from GMER/DER;
>>      b. translate poisoned DPA to HPA (PFN);
>>      c. enqueue poisoned PFN to memory_failure's work queue;
>>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
>> ---
>>   drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
>>   drivers/cxl/cxlmem.h      |   8 +--
>>   include/linux/cxl-event.h |  18 +++++-
>>   3 files changed, 125 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
>> index f0f54aeccc87..76af0d73859d 100644
>> --- a/drivers/cxl/core/mbox.c
>> +++ b/drivers/cxl/core/mbox.c
>> @@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
>>   
>> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
>> -			    enum cxl_event_log_type type,
>> -			    enum cxl_event_type event_type,
>> -			    const uuid_t *uuid, union cxl_event *evt)
>> +static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
>> +			      u64 dpa)
>>   {
>> -	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
>> +	u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
>> +	unsigned long pfn = PHYS_PFN(hpa);
>> +
>> +	if (!IS_ENABLED(CONFIG_MEMORY_FAILURE))
>> +		return;
> 
> No need for this check, memory_failure_queue() is already stubbed out in
> the CONFIG_MEMORY_FAILURE=n case.
Yes, I'm overthinking it.

> 
>> +	memory_failure_queue(pfn, MF_ACTION_REQUIRED);
> 
> My expectation is MF_ACTION_REQUIRED is not appropriate for CXL event
> reported errors since action is only required for direct consumption
> events and those need not be reported through the device event queue.
Got it.

> 
> It would be useful to collaborate with a BIOS firmware engineer so that
> the kernel ends up with similar logic as is used to set CPER record
> severity, or at least understands why it would want to be different.
> See how ghes_handle_memory_failure() determines the
> memory_failure_queue() flags.

Ok, thanks for the advice.


--
Ruan.





^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks
  2024-04-30 21:00   ` Alison Schofield
@ 2024-05-03 11:37       ` Shiyang Ruan
  0 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan via @ 2024-05-03 11:37 UTC (permalink / raw)
  To: Alison Schofield; +Cc: qemu-devel, linux-cxl



在 2024/5/1 5:00, Alison Schofield 写道:
> On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
>> The length of Physical Address in General Media Event Record/DRAM Event
>> Record is 64-bit, so the field mask should be defined as such length.
>> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
>> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
>> unaffected.
>>
>> If userspace was doing its own DPA-to-HPA translation this could lead to
>> incorrect page retirement decisions, but there is no known consumer
>> (like rasdaemon) of this event today.
>>
>> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
>> Cc: <stable@vger.kernel.org>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Davidlohr Bueso <dave@stgolabs.net>
>> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Cc: Ira Weiny <ira.weiny@intel.com>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> 
> Hi Ruan,
> 
> This fixup is important for the Event DPA->HPA translation work, so I
> grabbed it, updated it with most* of the review comments, and posted
> with that set. I expect you saw that in your mailbox.

Yes, I saw that.  Nice fix!

> 
> DaveJ queued it in a topic branch for 6.10 here:
> https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.10/dpa-to-hpa
> 

That's good!

> *I did not create a common mask for events and poison because I wanted to
> limit the changes. If you'd like to make that change it would be welcomed.

Ok~


--
Thanks,
Ruan.

> 
> -- Alison
> 
>> ---
>>   drivers/cxl/core/trace.h | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
>> index e5f13260fc52..cdfce932d5b1 100644
>> --- a/drivers/cxl/core/trace.h
>> +++ b/drivers/cxl/core/trace.h
>> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
>>    * DRAM Event Record
>>    * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>>    */
>> -#define CXL_DPA_FLAGS_MASK			0x3F
>> +#define CXL_DPA_FLAGS_MASK			0x3FULL
>>   #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
>>   
>>   #define CXL_DPA_VOLATILE			BIT(0)
>> -- 
>> 2.34.1
>>


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/2] cxl/core: correct length of DPA field masks
@ 2024-05-03 11:37       ` Shiyang Ruan
  0 siblings, 0 replies; 28+ messages in thread
From: Shiyang Ruan @ 2024-05-03 11:37 UTC (permalink / raw)
  To: Alison Schofield; +Cc: qemu-devel, linux-cxl



在 2024/5/1 5:00, Alison Schofield 写道:
> On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
>> The length of Physical Address in General Media Event Record/DRAM Event
>> Record is 64-bit, so the field mask should be defined as such length.
>> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
>> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
>> unaffected.
>>
>> If userspace was doing its own DPA-to-HPA translation this could lead to
>> incorrect page retirement decisions, but there is no known consumer
>> (like rasdaemon) of this event today.
>>
>> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
>> Cc: <stable@vger.kernel.org>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Davidlohr Bueso <dave@stgolabs.net>
>> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Cc: Ira Weiny <ira.weiny@intel.com>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> 
> Hi Ruan,
> 
> This fixup is important for the Event DPA->HPA translation work, so I
> grabbed it, updated it with most* of the review comments, and posted
> with that set. I expect you saw that in your mailbox.

Yes, I saw that.  Nice fix!

> 
> DaveJ queued it in a topic branch for 6.10 here:
> https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.10/dpa-to-hpa
> 

That's good!

> *I did not create a common mask for events and poison because I wanted to
> limit the changes. If you'd like to make that change it would be welcomed.

Ok~


--
Thanks,
Ruan.

> 
> -- Alison
> 
>> ---
>>   drivers/cxl/core/trace.h | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
>> index e5f13260fc52..cdfce932d5b1 100644
>> --- a/drivers/cxl/core/trace.h
>> +++ b/drivers/cxl/core/trace.h
>> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
>>    * DRAM Event Record
>>    * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>>    */
>> -#define CXL_DPA_FLAGS_MASK			0x3F
>> +#define CXL_DPA_FLAGS_MASK			0x3FULL
>>   #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
>>   
>>   #define CXL_DPA_VOLATILE			BIT(0)
>> -- 
>> 2.34.1
>>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] cxl/core: add poison creation event handler
  2024-05-03 10:42       ` Shiyang Ruan via
@ 2024-05-08 16:15         ` Jonathan Cameron via
  -1 siblings, 0 replies; 28+ messages in thread
From: Jonathan Cameron @ 2024-05-08 16:15 UTC (permalink / raw)
  To: Shiyang Ruan
  Cc: Ira Weiny, linux-cxl, qemu-devel, dan.j.williams, dave,
	alison.schofield, shiju.jose

On Fri, 3 May 2024 18:42:31 +0800
Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:

> 在 2024/4/24 1:57, Ira Weiny 写道:
> > Shiyang Ruan wrote:  
> >> Currently driver only traces cxl events, poison creation (for both vmem
> >> and pmem type) on cxl memdev is silent.  OS needs to be notified then it
> >> could handle poison pages in time.  Per CXL spec, the device error event
> >> could be signaled through FW-First and OS-First methods.
> >>
> >> So, add poison creation event handler in OS-First method:
> >>    - Qemu:
> >>      - CXL device reports POISON creation event to OS by MSI by sending
> >>        GMER/DER after injecting a poison record;
> >>    - CXL driver:
> >>      a. parse the POISON event from GMER/DER;
> >>      b. translate poisoned DPA to HPA (PFN);
> >>      c. enqueue poisoned PFN to memory_failure's work queue;  
> > 
> > I'm a bit confused by the need for this patch.  Perhaps a bit more detail
> > here?  
> 
> Yes, I should have wrote more details.
> 
> I want to check and make sure the HWPOISON on a CXL device (type3) is 
> working properly.  For example, a FSDAX filesystem created on a 
> type3-pmem device, then it gets a POISON bit, the OS should be able to 
> handle this POISON event: find the relevant process
> 
> Currently I'm using Qemu with several simulated CXL devices, and using 
> poison injection API of Qemu to create POISON records, but OS isn't 
> notified.  Only when we actively call list POISON records (cxl list -L) 
> will the driver fetch them and log into trace events, then we see the 
> POISON records.  Memory failure wasn't triggered too.
> 

Indeed - QEMU emulation of this is not complete.  It should also be generating
the events. Ideally we'd even handle injecting silent poison (not yet detected
by the device) and have that generate the synchronous memory faults on an access.

> That's why I said "poison creation on cxl memdev is silent".  Per spec, 
> POISON creation should be notified to OS.  Since not familiar with 
> firmware part, I'm try adding this notification for OS-First.
> 
> > 
> > More comments below.
> >   
> >>
> >> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> >> ---
> >>   drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
> >>   drivers/cxl/cxlmem.h      |   8 +--
> >>   include/linux/cxl-event.h |  18 +++++-
> >>   3 files changed, 125 insertions(+), 20 deletions(-)
> >>
> >> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> >> index f0f54aeccc87..76af0d73859d 100644
> >> --- a/drivers/cxl/core/mbox.c
> >> +++ b/drivers/cxl/core/mbox.c
> >> @@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
> >>   }
> >>   EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
> >>   
> >> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> >> -			    enum cxl_event_log_type type,
> >> -			    enum cxl_event_type event_type,
> >> -			    const uuid_t *uuid, union cxl_event *evt)
> >> +static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
> >> +			      u64 dpa)
> >>   {
> >> -	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
> >> +	u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
> >> +	unsigned long pfn = PHYS_PFN(hpa);
> >> +
> >> +	if (!IS_ENABLED(CONFIG_MEMORY_FAILURE))
> >> +		return;
> >> +
> >> +	memory_failure_queue(pfn, MF_ACTION_REQUIRED);  
> > 
> > I thought that ras daemon was supposed to take care of this when the trace
> > event occurred.  Alison is working on the HPA data for that path.  
> 
> It seems to save CXL trace events/memory-failures to DB and report to 
> others, but cannot let OS call memory failure.

Interesting question of whose problem this is.  For corrected
errors it's policy stuff that belongs in userspace, but for known
memory failure, it might want to be in kernel.

Shiju (+Cc) pointed me a the existing rasdaemon handling for corrected
errors (statistics get too bad, so memory offlined) 
https://github.com/mchehab/rasdaemon/commit/9ae6b70effb8adc9572debc800b8e16173f74bb8

Poison detection via scrub though is reported via CPER Memory Error Section with
"Scrub uncorrected error" set. That triggers apei handling. On
x86 looks to be apei_mce_report_mem_error(). Also triggers
the ghes_do_memory_failure_path() and ultimately memory_failure().

Conveniently there was a patch fixing the sync path last year
that includes info on what happens in async case.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a70297d2213253853e95f5b49651f924990c6d3b

"In addition, for aysnchronous errors, kernel will notify the
process who owns the poisoned page by sending SIGBUS with BUS_MCERRR_A0
in early kill mode."

So I think the kernel should probably do the same for CXL poison
error records when it gets them.

Jonathan


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] cxl/core: add poison creation event handler
@ 2024-05-08 16:15         ` Jonathan Cameron via
  0 siblings, 0 replies; 28+ messages in thread
From: Jonathan Cameron via @ 2024-05-08 16:15 UTC (permalink / raw)
  To: Shiyang Ruan
  Cc: Ira Weiny, linux-cxl, qemu-devel, dan.j.williams, dave,
	alison.schofield, shiju.jose

On Fri, 3 May 2024 18:42:31 +0800
Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:

> 在 2024/4/24 1:57, Ira Weiny 写道:
> > Shiyang Ruan wrote:  
> >> Currently driver only traces cxl events, poison creation (for both vmem
> >> and pmem type) on cxl memdev is silent.  OS needs to be notified then it
> >> could handle poison pages in time.  Per CXL spec, the device error event
> >> could be signaled through FW-First and OS-First methods.
> >>
> >> So, add poison creation event handler in OS-First method:
> >>    - Qemu:
> >>      - CXL device reports POISON creation event to OS by MSI by sending
> >>        GMER/DER after injecting a poison record;
> >>    - CXL driver:
> >>      a. parse the POISON event from GMER/DER;
> >>      b. translate poisoned DPA to HPA (PFN);
> >>      c. enqueue poisoned PFN to memory_failure's work queue;  
> > 
> > I'm a bit confused by the need for this patch.  Perhaps a bit more detail
> > here?  
> 
> Yes, I should have wrote more details.
> 
> I want to check and make sure the HWPOISON on a CXL device (type3) is 
> working properly.  For example, a FSDAX filesystem created on a 
> type3-pmem device, then it gets a POISON bit, the OS should be able to 
> handle this POISON event: find the relevant process
> 
> Currently I'm using Qemu with several simulated CXL devices, and using 
> poison injection API of Qemu to create POISON records, but OS isn't 
> notified.  Only when we actively call list POISON records (cxl list -L) 
> will the driver fetch them and log into trace events, then we see the 
> POISON records.  Memory failure wasn't triggered too.
> 

Indeed - QEMU emulation of this is not complete.  It should also be generating
the events. Ideally we'd even handle injecting silent poison (not yet detected
by the device) and have that generate the synchronous memory faults on an access.

> That's why I said "poison creation on cxl memdev is silent".  Per spec, 
> POISON creation should be notified to OS.  Since not familiar with 
> firmware part, I'm try adding this notification for OS-First.
> 
> > 
> > More comments below.
> >   
> >>
> >> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> >> ---
> >>   drivers/cxl/core/mbox.c   | 119 +++++++++++++++++++++++++++++++++-----
> >>   drivers/cxl/cxlmem.h      |   8 +--
> >>   include/linux/cxl-event.h |  18 +++++-
> >>   3 files changed, 125 insertions(+), 20 deletions(-)
> >>
> >> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> >> index f0f54aeccc87..76af0d73859d 100644
> >> --- a/drivers/cxl/core/mbox.c
> >> +++ b/drivers/cxl/core/mbox.c
> >> @@ -837,25 +837,116 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
> >>   }
> >>   EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
> >>   
> >> -void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
> >> -			    enum cxl_event_log_type type,
> >> -			    enum cxl_event_type event_type,
> >> -			    const uuid_t *uuid, union cxl_event *evt)
> >> +static void cxl_report_poison(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
> >> +			      u64 dpa)
> >>   {
> >> -	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
> >> +	u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
> >> +	unsigned long pfn = PHYS_PFN(hpa);
> >> +
> >> +	if (!IS_ENABLED(CONFIG_MEMORY_FAILURE))
> >> +		return;
> >> +
> >> +	memory_failure_queue(pfn, MF_ACTION_REQUIRED);  
> > 
> > I thought that ras daemon was supposed to take care of this when the trace
> > event occurred.  Alison is working on the HPA data for that path.  
> 
> It seems to save CXL trace events/memory-failures to DB and report to 
> others, but cannot let OS call memory failure.

Interesting question of whose problem this is.  For corrected
errors it's policy stuff that belongs in userspace, but for known
memory failure, it might want to be in kernel.

Shiju (+Cc) pointed me a the existing rasdaemon handling for corrected
errors (statistics get too bad, so memory offlined) 
https://github.com/mchehab/rasdaemon/commit/9ae6b70effb8adc9572debc800b8e16173f74bb8

Poison detection via scrub though is reported via CPER Memory Error Section with
"Scrub uncorrected error" set. That triggers apei handling. On
x86 looks to be apei_mce_report_mem_error(). Also triggers
the ghes_do_memory_failure_path() and ultimately memory_failure().

Conveniently there was a patch fixing the sync path last year
that includes info on what happens in async case.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a70297d2213253853e95f5b49651f924990c6d3b

"In addition, for aysnchronous errors, kernel will notify the
process who owns the poisoned page by sending SIGBUS with BUS_MCERRR_A0
in early kill mode."

So I think the kernel should probably do the same for CXL poison
error records when it gets them.

Jonathan



^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2024-05-08 16:16 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-17  7:50 [PATCH v3 0/2] cxl: add poison creation event handler Shiyang Ruan
2024-04-17  7:50 ` Shiyang Ruan via
2024-04-17  7:50 ` [PATCH v3 1/2] cxl/core: correct length of DPA field masks Shiyang Ruan
2024-04-17  7:50   ` Shiyang Ruan via
2024-04-23 17:35   ` Ira Weiny
2024-04-23 17:35   ` Dan Williams
2024-04-23 17:42   ` Alison Schofield
2024-04-23 21:04     ` Ira Weiny
2024-04-25 10:05       ` Shiyang Ruan
2024-04-25 10:05         ` Shiyang Ruan via
2024-04-25 16:04         ` Ira Weiny
2024-04-30 21:00   ` Alison Schofield
2024-05-03 11:37     ` Shiyang Ruan via
2024-05-03 11:37       ` Shiyang Ruan
2024-04-17  7:50 ` [PATCH v3 2/2] cxl/core: add poison creation event handler Shiyang Ruan via
2024-04-17  7:50   ` Shiyang Ruan
2024-04-17 17:30   ` Dave Jiang
2024-04-18  9:01     ` Shiyang Ruan via
2024-04-18  9:01       ` Shiyang Ruan
2024-04-21 12:14   ` kernel test robot
2024-04-23 17:57   ` Ira Weiny
2024-05-03 10:42     ` Shiyang Ruan
2024-05-03 10:42       ` Shiyang Ruan via
2024-05-08 16:15       ` Jonathan Cameron
2024-05-08 16:15         ` Jonathan Cameron via
2024-04-23 18:40   ` Dan Williams
2024-05-03 11:32     ` Shiyang Ruan
2024-05-03 11:32       ` Shiyang Ruan via

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.