All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH SET] cxl: add poison event handler
@ 2024-02-09 11:54 ` Shiyang Ruan via
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

Currently driver only trace cxl events, poison injection on cxl memdev
is silent.  OS needs to be notified then it could handle poison range
in time.  Per CXL spec, the device error event could be signaled through
FW-First and OS-First methods.

This draft patchset adds poison event handler in OS-First method:
  - qemu:
    - CXL device report POISON event to OS by MSI by sending GMER after
      injecting a poison record
  - CXL driver
    a. read the POISON event through GMER;
    b. get POISON list;
    c. translate DPA to HPA;
    d. construct a mce instance, then call mce_log() to queue this mce
       instance;

This patchset includes 2 patches for qemu, 5 patches for cxl driver.

Shiyang Ruan (5):
  cxl/core: correct length of DPA field masks
  cxl/core: introduce cxl_memdev_dpa_to_hpa()
  cxl/core: introduce cxl_mem_report_poison()
  cxl/core: add report option for cxl_mem_get_poison()
  cxl/core: add poison injection event handler

 arch/x86/kernel/cpu/mce/core.c |  1 +
 drivers/cxl/core/mbox.c        | 82 +++++++++++++++++++++++++++-------
 drivers/cxl/core/memdev.c      | 16 ++++++-
 drivers/cxl/core/region.c      |  8 ++--
 drivers/cxl/core/trace.h       |  6 +--
 drivers/cxl/cxlmem.h           | 11 ++---
 drivers/cxl/pci.c              |  4 +-
 7 files changed, 97 insertions(+), 31 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC PATCH SET] cxl: add poison event handler
@ 2024-02-09 11:54 ` Shiyang Ruan via
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan via @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

Currently driver only trace cxl events, poison injection on cxl memdev
is silent.  OS needs to be notified then it could handle poison range
in time.  Per CXL spec, the device error event could be signaled through
FW-First and OS-First methods.

This draft patchset adds poison event handler in OS-First method:
  - qemu:
    - CXL device report POISON event to OS by MSI by sending GMER after
      injecting a poison record
  - CXL driver
    a. read the POISON event through GMER;
    b. get POISON list;
    c. translate DPA to HPA;
    d. construct a mce instance, then call mce_log() to queue this mce
       instance;

This patchset includes 2 patches for qemu, 5 patches for cxl driver.

Shiyang Ruan (5):
  cxl/core: correct length of DPA field masks
  cxl/core: introduce cxl_memdev_dpa_to_hpa()
  cxl/core: introduce cxl_mem_report_poison()
  cxl/core: add report option for cxl_mem_get_poison()
  cxl/core: add poison injection event handler

 arch/x86/kernel/cpu/mce/core.c |  1 +
 drivers/cxl/core/mbox.c        | 82 +++++++++++++++++++++++++++-------
 drivers/cxl/core/memdev.c      | 16 ++++++-
 drivers/cxl/core/region.c      |  8 ++--
 drivers/cxl/core/trace.h       |  6 +--
 drivers/cxl/cxlmem.h           | 11 ++---
 drivers/cxl/pci.c              |  4 +-
 7 files changed, 97 insertions(+), 31 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC PATCH 1/2] hw/cxl/type3: add missing flag bit for GMER
  2024-02-09 11:54 ` Shiyang Ruan via
@ 2024-02-09 11:54   ` Shiyang Ruan via
  -1 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

The "Volatile" should be set if current device is a volatile memory.
Per CXL Spec r3.0 8.2.9.2.1.1, Table 8-43.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 hw/mem/cxl_type3.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 52647b4ac7..d8fb63b1de 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -1285,6 +1285,9 @@ static const QemuUUID memory_module_uuid = {
                  0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74),
 };
 
+#define CXL_GMER_DPA_VOLATILE                           BIT(0)
+#define CXL_GMER_DPA_NOT_REPAIRABLE                     BIT(1)
+
 #define CXL_GMER_VALID_CHANNEL                          BIT(0)
 #define CXL_GMER_VALID_RANK                             BIT(1)
 #define CXL_GMER_VALID_DEVICE                           BIT(2)
@@ -1348,6 +1351,9 @@ void qmp_cxl_inject_general_media_event(const char *path, CxlEventLog log,
     cxl_assign_event_header(hdr, &gen_media_uuid, flags, sizeof(gem),
                             cxl_device_get_timestamp(&ct3d->cxl_dstate));
 
+    if (ct3d->hostvmem) {
+        dpa |= CXL_GMER_DPA_VOLATILE;
+    }
     stq_le_p(&gem.phys_addr, dpa);
     gem.descriptor = descriptor;
     gem.type = type;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 1/2] hw/cxl/type3: add missing flag bit for GMER
@ 2024-02-09 11:54   ` Shiyang Ruan via
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan via @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

The "Volatile" should be set if current device is a volatile memory.
Per CXL Spec r3.0 8.2.9.2.1.1, Table 8-43.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 hw/mem/cxl_type3.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 52647b4ac7..d8fb63b1de 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -1285,6 +1285,9 @@ static const QemuUUID memory_module_uuid = {
                  0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74),
 };
 
+#define CXL_GMER_DPA_VOLATILE                           BIT(0)
+#define CXL_GMER_DPA_NOT_REPAIRABLE                     BIT(1)
+
 #define CXL_GMER_VALID_CHANNEL                          BIT(0)
 #define CXL_GMER_VALID_RANK                             BIT(1)
 #define CXL_GMER_VALID_DEVICE                           BIT(2)
@@ -1348,6 +1351,9 @@ void qmp_cxl_inject_general_media_event(const char *path, CxlEventLog log,
     cxl_assign_event_header(hdr, &gen_media_uuid, flags, sizeof(gem),
                             cxl_device_get_timestamp(&ct3d->cxl_dstate));
 
+    if (ct3d->hostvmem) {
+        dpa |= CXL_GMER_DPA_VOLATILE;
+    }
     stq_le_p(&gem.phys_addr, dpa);
     gem.descriptor = descriptor;
     gem.type = type;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 2/2] hw/cxl/type3: send a GMER while injecting poison
  2024-02-09 11:54 ` Shiyang Ruan via
@ 2024-02-09 11:54   ` Shiyang Ruan via
  -1 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

Send a signal to OS to let it able to handle the poison range.

TODO: This is an rough draft, will add more parameters for
qmp_cxl_inject_poison() to set to GMER.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 hw/mem/cxl_type3.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index d8fb63b1de..813f7f2175 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -1116,6 +1116,11 @@ void qmp_cxl_inject_poison(const char *path, uint64_t start, uint64_t length,
 
     QLIST_INSERT_HEAD(&ct3d->poison_list, p, node);
     ct3d->poison_list_cnt++;
+
+    /* Emit an GMER event, let os handle it */
+    qmp_cxl_inject_general_media_event(path, CXL_EVENT_LOG_FAILURE, 0, start,
+                                       0, 0, 4, false, 0, false, 0,
+                                       false, 0, NULL, errp);
 }
 
 /* For uncorrectable errors include support for multiple header recording */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 2/2] hw/cxl/type3: send a GMER while injecting poison
@ 2024-02-09 11:54   ` Shiyang Ruan via
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan via @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

Send a signal to OS to let it able to handle the poison range.

TODO: This is an rough draft, will add more parameters for
qmp_cxl_inject_poison() to set to GMER.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 hw/mem/cxl_type3.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index d8fb63b1de..813f7f2175 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -1116,6 +1116,11 @@ void qmp_cxl_inject_poison(const char *path, uint64_t start, uint64_t length,
 
     QLIST_INSERT_HEAD(&ct3d->poison_list, p, node);
     ct3d->poison_list_cnt++;
+
+    /* Emit an GMER event, let os handle it */
+    qmp_cxl_inject_general_media_event(path, CXL_EVENT_LOG_FAILURE, 0, start,
+                                       0, 0, 4, false, 0, false, 0,
+                                       false, 0, NULL, errp);
 }
 
 /* For uncorrectable errors include support for multiple header recording */
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 1/5] cxl/core: correct length of DPA field masks
  2024-02-09 11:54 ` Shiyang Ruan via
@ 2024-02-09 11:54   ` Shiyang Ruan via
  -1 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

The length of Physical Address in General Media Event Record/DRAM Event
Record is 64-bit, so the field mask should be defined as such length.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/trace.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index 89445435303a..388a87d972c2 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -253,11 +253,11 @@ TRACE_EVENT(cxl_generic_event,
  * DRAM Event Record
  * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
  */
-#define CXL_DPA_FLAGS_MASK			0x3F
+#define CXL_DPA_FLAGS_MASK			0x3FULL
 #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
 
-#define CXL_DPA_VOLATILE			BIT(0)
-#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
+#define CXL_DPA_VOLATILE			BIT_ULL(0)
+#define CXL_DPA_NOT_REPAIRABLE			BIT_ULL(1)
 #define show_dpa_flags(flags)	__print_flags(flags, "|",		   \
 	{ CXL_DPA_VOLATILE,			"VOLATILE"		}, \
 	{ CXL_DPA_NOT_REPAIRABLE,		"NOT_REPAIRABLE"	}  \
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 1/5] cxl/core: correct length of DPA field masks
@ 2024-02-09 11:54   ` Shiyang Ruan via
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan via @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

The length of Physical Address in General Media Event Record/DRAM Event
Record is 64-bit, so the field mask should be defined as such length.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/trace.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index 89445435303a..388a87d972c2 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -253,11 +253,11 @@ TRACE_EVENT(cxl_generic_event,
  * DRAM Event Record
  * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
  */
-#define CXL_DPA_FLAGS_MASK			0x3F
+#define CXL_DPA_FLAGS_MASK			0x3FULL
 #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
 
-#define CXL_DPA_VOLATILE			BIT(0)
-#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
+#define CXL_DPA_VOLATILE			BIT_ULL(0)
+#define CXL_DPA_NOT_REPAIRABLE			BIT_ULL(1)
 #define show_dpa_flags(flags)	__print_flags(flags, "|",		   \
 	{ CXL_DPA_VOLATILE,			"VOLATILE"		}, \
 	{ CXL_DPA_NOT_REPAIRABLE,		"NOT_REPAIRABLE"	}  \
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 2/5] cxl/core: introduce cxl_memdev_dpa_to_hpa()
  2024-02-09 11:54 ` Shiyang Ruan via
@ 2024-02-09 11:54   ` Shiyang Ruan via
  -1 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

When a memdev is assigned to a region, its Device Physical Address will be
mapped to Host Physical Address.  Introduce this helper function to
translate HPA from a given memdev and its DPA.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/memdev.c | 12 ++++++++++++
 drivers/cxl/cxlmem.h      |  1 +
 2 files changed, 13 insertions(+)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index dae8802ecdb0..c304e709ef0e 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -319,6 +319,18 @@ static int cxl_validate_poison_dpa(struct cxl_memdev *cxlmd, u64 dpa)
 	return 0;
 }
 
+phys_addr_t cxl_memdev_dpa_to_hpa(struct cxl_memdev *cxlmd, u64 dpa)
+{
+	struct cxl_region *cxlr = cxl_dpa_to_region(cxlmd, dpa);
+
+	if (cxlr)
+		return cxlr->params.res->start + dpa;
+	else {
+		dev_dbg(&cxlmd->dev, "device belongs to no region.\n");
+		return 0;
+	}
+}
+
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
 {
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 5303d6942b88..97ddab421e63 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -833,6 +833,7 @@ int cxl_poison_state_init(struct cxl_memdev_state *mds);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		       struct cxl_region *cxlr);
 int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
+phys_addr_t cxl_memdev_dpa_to_hpa(struct cxl_memdev *cxlmd, u64 dpa);
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
 int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 2/5] cxl/core: introduce cxl_memdev_dpa_to_hpa()
@ 2024-02-09 11:54   ` Shiyang Ruan via
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan via @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

When a memdev is assigned to a region, its Device Physical Address will be
mapped to Host Physical Address.  Introduce this helper function to
translate HPA from a given memdev and its DPA.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/memdev.c | 12 ++++++++++++
 drivers/cxl/cxlmem.h      |  1 +
 2 files changed, 13 insertions(+)

diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index dae8802ecdb0..c304e709ef0e 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -319,6 +319,18 @@ static int cxl_validate_poison_dpa(struct cxl_memdev *cxlmd, u64 dpa)
 	return 0;
 }
 
+phys_addr_t cxl_memdev_dpa_to_hpa(struct cxl_memdev *cxlmd, u64 dpa)
+{
+	struct cxl_region *cxlr = cxl_dpa_to_region(cxlmd, dpa);
+
+	if (cxlr)
+		return cxlr->params.res->start + dpa;
+	else {
+		dev_dbg(&cxlmd->dev, "device belongs to no region.\n");
+		return 0;
+	}
+}
+
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
 {
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 5303d6942b88..97ddab421e63 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -833,6 +833,7 @@ int cxl_poison_state_init(struct cxl_memdev_state *mds);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		       struct cxl_region *cxlr);
 int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
+phys_addr_t cxl_memdev_dpa_to_hpa(struct cxl_memdev *cxlmd, u64 dpa);
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
 int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 3/5] cxl/core: introduce cxl_mem_report_poison()
  2024-02-09 11:54 ` Shiyang Ruan via
@ 2024-02-09 11:54   ` Shiyang Ruan via
  -1 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

If poison is detected(reported from cxl memdev), OS should be notified to
handle it.  Introduce this function:
  1. translate DPA to HPA;
  2. construct a MCE instance; (TODO: more details need to be filled)
  3. log it into MCE event queue;

After that, MCE mechanism can walk over its notifier chain to execute
specific handlers.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 arch/x86/kernel/cpu/mce/core.c |  1 +
 drivers/cxl/core/mbox.c        | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index bc39252bc54f..a64c0aceb7e0 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -131,6 +131,7 @@ void mce_setup(struct mce *m)
 	m->ppin = cpu_data(m->extcpu).ppin;
 	m->microcode = boot_cpu_data.microcode;
 }
+EXPORT_SYMBOL_GPL(mce_setup);
 
 DEFINE_PER_CPU(struct mce, injectm);
 EXPORT_PER_CPU_SYMBOL_GPL(injectm);
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 27166a411705..f9b6f50fbe80 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -4,6 +4,7 @@
 #include <linux/debugfs.h>
 #include <linux/ktime.h>
 #include <linux/mutex.h>
+#include <asm/mce.h>
 #include <asm/unaligned.h>
 #include <cxlpci.h>
 #include <cxlmem.h>
@@ -1290,6 +1291,38 @@ int cxl_set_timestamp(struct cxl_memdev_state *mds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_set_timestamp, CXL);
 
+static void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
+				  struct cxl_poison_record *poison)
+{
+	struct mce m;
+	u64 dpa = le64_to_cpu(poison->address) & CXL_POISON_START_MASK;
+	u64 len = le64_to_cpu(poison->length), i;
+	phys_addr_t phys_addr = cxl_memdev_dpa_to_hpa(cxlmd, dpa);
+
+	if (phys_addr)
+		return;
+
+	/*
+	 * Initialize struct mce.  Call preempt_disable() to avoid
+	 * "BUG: using smp_processor_id() in preemptible" for now, not sure
+	 * if this is a correct way.
+	 */
+	preempt_disable();
+	mce_setup(&m);
+	preempt_enable();
+
+	m.bank = -1;
+	/* Fake a memory read error with unknown channel */
+	m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV |
+		   MCI_STATUS_MISCV | 0x9f;
+	m.misc = (MCI_MISC_ADDR_PHYS << 6);
+
+	for (i = 0; i < len; i++) {
+		m.addr = phys_addr++;
+		mce_log(&m);
+	}
+}
+
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		       struct cxl_region *cxlr)
 {
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 3/5] cxl/core: introduce cxl_mem_report_poison()
@ 2024-02-09 11:54   ` Shiyang Ruan via
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan via @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

If poison is detected(reported from cxl memdev), OS should be notified to
handle it.  Introduce this function:
  1. translate DPA to HPA;
  2. construct a MCE instance; (TODO: more details need to be filled)
  3. log it into MCE event queue;

After that, MCE mechanism can walk over its notifier chain to execute
specific handlers.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 arch/x86/kernel/cpu/mce/core.c |  1 +
 drivers/cxl/core/mbox.c        | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index bc39252bc54f..a64c0aceb7e0 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -131,6 +131,7 @@ void mce_setup(struct mce *m)
 	m->ppin = cpu_data(m->extcpu).ppin;
 	m->microcode = boot_cpu_data.microcode;
 }
+EXPORT_SYMBOL_GPL(mce_setup);
 
 DEFINE_PER_CPU(struct mce, injectm);
 EXPORT_PER_CPU_SYMBOL_GPL(injectm);
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 27166a411705..f9b6f50fbe80 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -4,6 +4,7 @@
 #include <linux/debugfs.h>
 #include <linux/ktime.h>
 #include <linux/mutex.h>
+#include <asm/mce.h>
 #include <asm/unaligned.h>
 #include <cxlpci.h>
 #include <cxlmem.h>
@@ -1290,6 +1291,38 @@ int cxl_set_timestamp(struct cxl_memdev_state *mds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_set_timestamp, CXL);
 
+static void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
+				  struct cxl_poison_record *poison)
+{
+	struct mce m;
+	u64 dpa = le64_to_cpu(poison->address) & CXL_POISON_START_MASK;
+	u64 len = le64_to_cpu(poison->length), i;
+	phys_addr_t phys_addr = cxl_memdev_dpa_to_hpa(cxlmd, dpa);
+
+	if (phys_addr)
+		return;
+
+	/*
+	 * Initialize struct mce.  Call preempt_disable() to avoid
+	 * "BUG: using smp_processor_id() in preemptible" for now, not sure
+	 * if this is a correct way.
+	 */
+	preempt_disable();
+	mce_setup(&m);
+	preempt_enable();
+
+	m.bank = -1;
+	/* Fake a memory read error with unknown channel */
+	m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV |
+		   MCI_STATUS_MISCV | 0x9f;
+	m.misc = (MCI_MISC_ADDR_PHYS << 6);
+
+	for (i = 0; i < len; i++) {
+		m.addr = phys_addr++;
+		mce_log(&m);
+	}
+}
+
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		       struct cxl_region *cxlr)
 {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 4/5] cxl/core: add report option for cxl_mem_get_poison()
  2024-02-09 11:54 ` Shiyang Ruan via
@ 2024-02-09 11:54   ` Shiyang Ruan via
  -1 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

When a poison event is received, driver uses GET_POISON_LIST command
to get the poison list.  Now driver has cxl_mem_get_poison(), so
reuse it and add a parameter 'bool report', report poison record to MCE
if set true.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/mbox.c   | 7 +++++--
 drivers/cxl/core/memdev.c | 4 ++--
 drivers/cxl/core/region.c | 8 ++++----
 drivers/cxl/cxlmem.h      | 2 +-
 4 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index f9b6f50fbe80..e1c67159acc4 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1324,7 +1324,7 @@ static void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
 }
 
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
-		       struct cxl_region *cxlr)
+		       struct cxl_region *cxlr, bool report)
 {
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct cxl_mbox_poison_out *po;
@@ -1355,10 +1355,13 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		if (rc)
 			break;
 
-		for (int i = 0; i < le16_to_cpu(po->count); i++)
+		for (int i = 0; i < le16_to_cpu(po->count); i++) {
 			trace_cxl_poison(cxlmd, cxlr, &po->record[i],
 					 po->flags, po->overflow_ts,
 					 CXL_POISON_TRACE_LIST);
+			if (report)
+				cxl_mem_report_poison(cxlmd, &po->record[i]);
+		}
 
 		/* Protect against an uncleared _FLAG_MORE */
 		nr_records = nr_records + le16_to_cpu(po->count);
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index c304e709ef0e..320bcb8af5b0 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -200,14 +200,14 @@ static int cxl_get_poison_by_memdev(struct cxl_memdev *cxlmd)
 	if (resource_size(&cxlds->pmem_res)) {
 		offset = cxlds->pmem_res.start;
 		length = resource_size(&cxlds->pmem_res);
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc)
 			return rc;
 	}
 	if (resource_size(&cxlds->ram_res)) {
 		offset = cxlds->ram_res.start;
 		length = resource_size(&cxlds->ram_res);
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		/*
 		 * Invalid Physical Address is not an error for
 		 * volatile addresses. Device support is optional.
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index ce0e2d82bb2b..b008c7e13560 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2386,7 +2386,7 @@ static int cxl_get_poison_unmapped(struct cxl_memdev *cxlmd,
 	if (ctx->mode == CXL_DECODER_RAM) {
 		offset = ctx->offset;
 		length = resource_size(&cxlds->ram_res) - offset;
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc == -EFAULT)
 			rc = 0;
 		if (rc)
@@ -2404,7 +2404,7 @@ static int cxl_get_poison_unmapped(struct cxl_memdev *cxlmd,
 		return 0;
 	}
 
-	return cxl_mem_get_poison(cxlmd, offset, length, NULL);
+	return cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 }
 
 static int poison_by_decoder(struct device *dev, void *arg)
@@ -2438,7 +2438,7 @@ static int poison_by_decoder(struct device *dev, void *arg)
 	if (cxled->skip) {
 		offset = cxled->dpa_res->start - cxled->skip;
 		length = cxled->skip;
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc == -EFAULT && cxled->mode == CXL_DECODER_RAM)
 			rc = 0;
 		if (rc)
@@ -2447,7 +2447,7 @@ static int poison_by_decoder(struct device *dev, void *arg)
 
 	offset = cxled->dpa_res->start;
 	length = cxled->dpa_res->end - offset + 1;
-	rc = cxl_mem_get_poison(cxlmd, offset, length, cxled->cxld.region);
+	rc = cxl_mem_get_poison(cxlmd, offset, length, cxled->cxld.region, false);
 	if (rc == -EFAULT && cxled->mode == CXL_DECODER_RAM)
 		rc = 0;
 	if (rc)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 97ddab421e63..f0877f055f53 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -831,7 +831,7 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 int cxl_set_timestamp(struct cxl_memdev_state *mds);
 int cxl_poison_state_init(struct cxl_memdev_state *mds);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
-		       struct cxl_region *cxlr);
+		       struct cxl_region *cxlr, bool report);
 int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
 phys_addr_t cxl_memdev_dpa_to_hpa(struct cxl_memdev *cxlmd, u64 dpa);
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 4/5] cxl/core: add report option for cxl_mem_get_poison()
@ 2024-02-09 11:54   ` Shiyang Ruan via
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan via @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

When a poison event is received, driver uses GET_POISON_LIST command
to get the poison list.  Now driver has cxl_mem_get_poison(), so
reuse it and add a parameter 'bool report', report poison record to MCE
if set true.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/mbox.c   | 7 +++++--
 drivers/cxl/core/memdev.c | 4 ++--
 drivers/cxl/core/region.c | 8 ++++----
 drivers/cxl/cxlmem.h      | 2 +-
 4 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index f9b6f50fbe80..e1c67159acc4 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -1324,7 +1324,7 @@ static void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
 }
 
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
-		       struct cxl_region *cxlr)
+		       struct cxl_region *cxlr, bool report)
 {
 	struct cxl_memdev_state *mds = to_cxl_memdev_state(cxlmd->cxlds);
 	struct cxl_mbox_poison_out *po;
@@ -1355,10 +1355,13 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 		if (rc)
 			break;
 
-		for (int i = 0; i < le16_to_cpu(po->count); i++)
+		for (int i = 0; i < le16_to_cpu(po->count); i++) {
 			trace_cxl_poison(cxlmd, cxlr, &po->record[i],
 					 po->flags, po->overflow_ts,
 					 CXL_POISON_TRACE_LIST);
+			if (report)
+				cxl_mem_report_poison(cxlmd, &po->record[i]);
+		}
 
 		/* Protect against an uncleared _FLAG_MORE */
 		nr_records = nr_records + le16_to_cpu(po->count);
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index c304e709ef0e..320bcb8af5b0 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -200,14 +200,14 @@ static int cxl_get_poison_by_memdev(struct cxl_memdev *cxlmd)
 	if (resource_size(&cxlds->pmem_res)) {
 		offset = cxlds->pmem_res.start;
 		length = resource_size(&cxlds->pmem_res);
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc)
 			return rc;
 	}
 	if (resource_size(&cxlds->ram_res)) {
 		offset = cxlds->ram_res.start;
 		length = resource_size(&cxlds->ram_res);
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		/*
 		 * Invalid Physical Address is not an error for
 		 * volatile addresses. Device support is optional.
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index ce0e2d82bb2b..b008c7e13560 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -2386,7 +2386,7 @@ static int cxl_get_poison_unmapped(struct cxl_memdev *cxlmd,
 	if (ctx->mode == CXL_DECODER_RAM) {
 		offset = ctx->offset;
 		length = resource_size(&cxlds->ram_res) - offset;
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc == -EFAULT)
 			rc = 0;
 		if (rc)
@@ -2404,7 +2404,7 @@ static int cxl_get_poison_unmapped(struct cxl_memdev *cxlmd,
 		return 0;
 	}
 
-	return cxl_mem_get_poison(cxlmd, offset, length, NULL);
+	return cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 }
 
 static int poison_by_decoder(struct device *dev, void *arg)
@@ -2438,7 +2438,7 @@ static int poison_by_decoder(struct device *dev, void *arg)
 	if (cxled->skip) {
 		offset = cxled->dpa_res->start - cxled->skip;
 		length = cxled->skip;
-		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL, false);
 		if (rc == -EFAULT && cxled->mode == CXL_DECODER_RAM)
 			rc = 0;
 		if (rc)
@@ -2447,7 +2447,7 @@ static int poison_by_decoder(struct device *dev, void *arg)
 
 	offset = cxled->dpa_res->start;
 	length = cxled->dpa_res->end - offset + 1;
-	rc = cxl_mem_get_poison(cxlmd, offset, length, cxled->cxld.region);
+	rc = cxl_mem_get_poison(cxlmd, offset, length, cxled->cxld.region, false);
 	if (rc == -EFAULT && cxled->mode == CXL_DECODER_RAM)
 		rc = 0;
 	if (rc)
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 97ddab421e63..f0877f055f53 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -831,7 +831,7 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 int cxl_set_timestamp(struct cxl_memdev_state *mds);
 int cxl_poison_state_init(struct cxl_memdev_state *mds);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
-		       struct cxl_region *cxlr);
+		       struct cxl_region *cxlr, bool report);
 int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
 phys_addr_t cxl_memdev_dpa_to_hpa(struct cxl_memdev *cxlmd, u64 dpa);
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 5/5] cxl/core: add poison injection event handler
  2024-02-09 11:54 ` Shiyang Ruan via
@ 2024-02-09 11:54   ` Shiyang Ruan via
  -1 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

Currently driver only trace cxl events, poison injection on cxl memdev
is silent.  OS needs to be notified then it could handle poison range
in time.  Per CXL spec, the device error event could be signaled through
FW-First and OS-First methods.

So, add poison event handler in OS-First method:
  - qemu:
    - CXL device report POISON event to OS by MSI by sending GMER after
      injecting a poison record
  - CXL driver
    a. read the POISON event through GMER;   <-- this patch
    b. get POISON list;
    c. translate DPA to HPA;
    d. construct a mce instance, then call mce_log() to queue this mce
       instance;

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/mbox.c | 42 ++++++++++++++++++++++++++++-------------
 drivers/cxl/cxlmem.h    |  8 ++++----
 drivers/cxl/pci.c       |  4 ++--
 3 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index e1c67159acc4..fa65a98ada16 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -838,25 +838,41 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-			    enum cxl_event_log_type type,
-			    enum cxl_event_type event_type,
-			    const uuid_t *uuid, union cxl_event *evt)
+static void cxl_event_handle_poison(struct cxl_memdev *cxlmd,
+				    struct cxl_event_gen_media *rec)
 {
-	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
+	u64 phys_addr = rec->phys_addr & CXL_DPA_MASK, len;
+
+	if (rec->phys_addr & CXL_DPA_VOLATILE)
+		len = resource_size(&cxlmd->cxlds->ram_res) - phys_addr;
+	else
+		len = resource_size(&cxlmd->cxlds->dpa_res) - phys_addr;
+
+	cxl_mem_get_poison(cxlmd, phys_addr, len, NULL, true);
+}
+
+void cxl_event_handle_record(struct cxl_memdev *cxlmd,
+			     enum cxl_event_log_type type,
+			     enum cxl_event_type event_type,
+			     const uuid_t *uuid, union cxl_event *evt)
+{
+	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
 		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
-	else if (event_type == CXL_CPER_EVENT_DRAM)
+		/* handle poison event */
+		if (type == CXL_EVENT_TYPE_FAIL)
+			cxl_event_handle_poison(cxlmd, &evt->gen_media);
+	} else if (event_type == CXL_CPER_EVENT_DRAM)
 		trace_cxl_dram(cxlmd, type, &evt->dram);
 	else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
 		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
 	else
 		trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
 }
-EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
+EXPORT_SYMBOL_NS_GPL(cxl_event_handle_record, CXL);
 
-static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-				     enum cxl_event_log_type type,
-				     struct cxl_event_record_raw *record)
+static void __cxl_event_handle_record(struct cxl_memdev *cxlmd,
+				      enum cxl_event_log_type type,
+				      struct cxl_event_record_raw *record)
 {
 	enum cxl_event_type ev_type = CXL_CPER_EVENT_GENERIC;
 	const uuid_t *uuid = &record->id;
@@ -868,7 +884,7 @@ static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 	else if (uuid_equal(uuid, &CXL_EVENT_MEM_MODULE_UUID))
 		ev_type = CXL_CPER_EVENT_MEM_MODULE;
 
-	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
+	cxl_event_handle_record(cxlmd, type, ev_type, uuid, &record->event);
 }
 
 static int cxl_clear_event_record(struct cxl_memdev_state *mds,
@@ -979,8 +995,8 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
 			break;
 
 		for (i = 0; i < nr_rec; i++)
-			__cxl_event_trace_record(cxlmd, type,
-						 &payload->records[i]);
+			__cxl_event_handle_record(cxlmd, type,
+						  &payload->records[i]);
 
 		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
 			trace_cxl_overflow(cxlmd, type, payload);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index f0877f055f53..1e9e3b9c11d1 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -824,10 +824,10 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 				  unsigned long *cmds);
 void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-			    enum cxl_event_log_type type,
-			    enum cxl_event_type event_type,
-			    const uuid_t *uuid, union cxl_event *evt);
+void cxl_event_handle_record(struct cxl_memdev *cxlmd,
+			     enum cxl_event_log_type type,
+			     enum cxl_event_type event_type,
+			     const uuid_t *uuid, union cxl_event *evt);
 int cxl_set_timestamp(struct cxl_memdev_state *mds);
 int cxl_poison_state_init(struct cxl_memdev_state *mds);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 233e7c42c161..29a5e641decd 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1003,8 +1003,8 @@ static void cxl_cper_event_call(enum cxl_event_type ev_type,
 	hdr_flags = get_unaligned_le24(rec->event.generic.hdr.flags);
 	log_type = FIELD_GET(CXL_EVENT_HDR_FLAGS_REC_SEVERITY, hdr_flags);
 
-	cxl_event_trace_record(cxlds->cxlmd, log_type, ev_type,
-			       &uuid_null, &rec->event);
+	cxl_event_handle_record(cxlds->cxlmd, log_type, ev_type,
+				&uuid_null, &rec->event);
 }
 
 static int __init cxl_pci_driver_init(void)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 5/5] cxl/core: add poison injection event handler
@ 2024-02-09 11:54   ` Shiyang Ruan via
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan via @ 2024-02-09 11:54 UTC (permalink / raw)
  To: qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

Currently driver only trace cxl events, poison injection on cxl memdev
is silent.  OS needs to be notified then it could handle poison range
in time.  Per CXL spec, the device error event could be signaled through
FW-First and OS-First methods.

So, add poison event handler in OS-First method:
  - qemu:
    - CXL device report POISON event to OS by MSI by sending GMER after
      injecting a poison record
  - CXL driver
    a. read the POISON event through GMER;   <-- this patch
    b. get POISON list;
    c. translate DPA to HPA;
    d. construct a mce instance, then call mce_log() to queue this mce
       instance;

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
 drivers/cxl/core/mbox.c | 42 ++++++++++++++++++++++++++++-------------
 drivers/cxl/cxlmem.h    |  8 ++++----
 drivers/cxl/pci.c       |  4 ++--
 3 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index e1c67159acc4..fa65a98ada16 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -838,25 +838,41 @@ int cxl_enumerate_cmds(struct cxl_memdev_state *mds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_enumerate_cmds, CXL);
 
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-			    enum cxl_event_log_type type,
-			    enum cxl_event_type event_type,
-			    const uuid_t *uuid, union cxl_event *evt)
+static void cxl_event_handle_poison(struct cxl_memdev *cxlmd,
+				    struct cxl_event_gen_media *rec)
 {
-	if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
+	u64 phys_addr = rec->phys_addr & CXL_DPA_MASK, len;
+
+	if (rec->phys_addr & CXL_DPA_VOLATILE)
+		len = resource_size(&cxlmd->cxlds->ram_res) - phys_addr;
+	else
+		len = resource_size(&cxlmd->cxlds->dpa_res) - phys_addr;
+
+	cxl_mem_get_poison(cxlmd, phys_addr, len, NULL, true);
+}
+
+void cxl_event_handle_record(struct cxl_memdev *cxlmd,
+			     enum cxl_event_log_type type,
+			     enum cxl_event_type event_type,
+			     const uuid_t *uuid, union cxl_event *evt)
+{
+	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
 		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
-	else if (event_type == CXL_CPER_EVENT_DRAM)
+		/* handle poison event */
+		if (type == CXL_EVENT_TYPE_FAIL)
+			cxl_event_handle_poison(cxlmd, &evt->gen_media);
+	} else if (event_type == CXL_CPER_EVENT_DRAM)
 		trace_cxl_dram(cxlmd, type, &evt->dram);
 	else if (event_type == CXL_CPER_EVENT_MEM_MODULE)
 		trace_cxl_memory_module(cxlmd, type, &evt->mem_module);
 	else
 		trace_cxl_generic_event(cxlmd, type, uuid, &evt->generic);
 }
-EXPORT_SYMBOL_NS_GPL(cxl_event_trace_record, CXL);
+EXPORT_SYMBOL_NS_GPL(cxl_event_handle_record, CXL);
 
-static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-				     enum cxl_event_log_type type,
-				     struct cxl_event_record_raw *record)
+static void __cxl_event_handle_record(struct cxl_memdev *cxlmd,
+				      enum cxl_event_log_type type,
+				      struct cxl_event_record_raw *record)
 {
 	enum cxl_event_type ev_type = CXL_CPER_EVENT_GENERIC;
 	const uuid_t *uuid = &record->id;
@@ -868,7 +884,7 @@ static void __cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 	else if (uuid_equal(uuid, &CXL_EVENT_MEM_MODULE_UUID))
 		ev_type = CXL_CPER_EVENT_MEM_MODULE;
 
-	cxl_event_trace_record(cxlmd, type, ev_type, uuid, &record->event);
+	cxl_event_handle_record(cxlmd, type, ev_type, uuid, &record->event);
 }
 
 static int cxl_clear_event_record(struct cxl_memdev_state *mds,
@@ -979,8 +995,8 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
 			break;
 
 		for (i = 0; i < nr_rec; i++)
-			__cxl_event_trace_record(cxlmd, type,
-						 &payload->records[i]);
+			__cxl_event_handle_record(cxlmd, type,
+						  &payload->records[i]);
 
 		if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
 			trace_cxl_overflow(cxlmd, type, payload);
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index f0877f055f53..1e9e3b9c11d1 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -824,10 +824,10 @@ void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
 				  unsigned long *cmds);
 void cxl_mem_get_event_records(struct cxl_memdev_state *mds, u32 status);
-void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
-			    enum cxl_event_log_type type,
-			    enum cxl_event_type event_type,
-			    const uuid_t *uuid, union cxl_event *evt);
+void cxl_event_handle_record(struct cxl_memdev *cxlmd,
+			     enum cxl_event_log_type type,
+			     enum cxl_event_type event_type,
+			     const uuid_t *uuid, union cxl_event *evt);
 int cxl_set_timestamp(struct cxl_memdev_state *mds);
 int cxl_poison_state_init(struct cxl_memdev_state *mds);
 int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 233e7c42c161..29a5e641decd 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1003,8 +1003,8 @@ static void cxl_cper_event_call(enum cxl_event_type ev_type,
 	hdr_flags = get_unaligned_le24(rec->event.generic.hdr.flags);
 	log_type = FIELD_GET(CXL_EVENT_HDR_FLAGS_REC_SEVERITY, hdr_flags);
 
-	cxl_event_trace_record(cxlds->cxlmd, log_type, ev_type,
-			       &uuid_null, &rec->event);
+	cxl_event_handle_record(cxlds->cxlmd, log_type, ev_type,
+				&uuid_null, &rec->event);
 }
 
 static int __init cxl_pci_driver_init(void)
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 40+ messages in thread

* RE: [RFC PATCH 1/5] cxl/core: correct length of DPA field masks
  2024-02-09 11:54   ` Shiyang Ruan via
  (?)
@ 2024-02-10  6:34   ` Dan Williams
  2024-02-19 10:49       ` Shiyang Ruan
  -1 siblings, 1 reply; 40+ messages in thread
From: Dan Williams @ 2024-02-10  6:34 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

Shiyang Ruan wrote:
> The length of Physical Address in General Media Event Record/DRAM Event
> Record is 64-bit, so the field mask should be defined as such length.

Can you include this user visible side-effect of this change. Looks like
this could cause usages of CXL_DPA_FLAGS_MASK to return an incorrect
result?

> 
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  drivers/cxl/core/trace.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index 89445435303a..388a87d972c2 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -253,11 +253,11 @@ TRACE_EVENT(cxl_generic_event,
>   * DRAM Event Record
>   * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>   */
> -#define CXL_DPA_FLAGS_MASK			0x3F
> +#define CXL_DPA_FLAGS_MASK			0x3FULL
>  #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
>  
> -#define CXL_DPA_VOLATILE			BIT(0)
> -#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
> +#define CXL_DPA_VOLATILE			BIT_ULL(0)
> +#define CXL_DPA_NOT_REPAIRABLE			BIT_ULL(1)
>  #define show_dpa_flags(flags)	__print_flags(flags, "|",		   \
>  	{ CXL_DPA_VOLATILE,			"VOLATILE"		}, \
>  	{ CXL_DPA_NOT_REPAIRABLE,		"NOT_REPAIRABLE"	}  \
> -- 
> 2.34.1
> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [RFC PATCH 2/5] cxl/core: introduce cxl_memdev_dpa_to_hpa()
  2024-02-09 11:54   ` Shiyang Ruan via
  (?)
@ 2024-02-10  6:39   ` Dan Williams
  -1 siblings, 0 replies; 40+ messages in thread
From: Dan Williams @ 2024-02-10  6:39 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

Shiyang Ruan wrote:
> When a memdev is assigned to a region, its Device Physical Address will be
> mapped to Host Physical Address.  Introduce this helper function to
> translate HPA from a given memdev and its DPA.
> 
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  drivers/cxl/core/memdev.c | 12 ++++++++++++
>  drivers/cxl/cxlmem.h      |  1 +
>  2 files changed, 13 insertions(+)
> 
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index dae8802ecdb0..c304e709ef0e 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -319,6 +319,18 @@ static int cxl_validate_poison_dpa(struct cxl_memdev *cxlmd, u64 dpa)
>  	return 0;
>  }
>  
> +phys_addr_t cxl_memdev_dpa_to_hpa(struct cxl_memdev *cxlmd, u64 dpa)
> +{
> +	struct cxl_region *cxlr = cxl_dpa_to_region(cxlmd, dpa);
> +
> +	if (cxlr)
> +		return cxlr->params.res->start + dpa;
> +	else {
> +		dev_dbg(&cxlmd->dev, "device belongs to no region.\n");
> +		return 0;
> +	}
> +}

Hmm no, I would not say memdevs are assigned to regions, *endpoint
decoders* are assigned to regions. cxl_dpa_to_region() is only an
internal helper when the endpoint decoder is unknown. Otherwise endpoint
decoders have a direct-link to their region, if mapped. See usage of
"cxled->cxld.region".

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [RFC PATCH 3/5] cxl/core: introduce cxl_mem_report_poison()
  2024-02-09 11:54   ` Shiyang Ruan via
  (?)
@ 2024-02-10  6:46   ` Dan Williams
  2024-03-14 15:23       ` Shiyang Ruan via
  -1 siblings, 1 reply; 40+ messages in thread
From: Dan Williams @ 2024-02-10  6:46 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

Shiyang Ruan wrote:
> If poison is detected(reported from cxl memdev), OS should be notified to
> handle it.  Introduce this function:
>   1. translate DPA to HPA;
>   2. construct a MCE instance; (TODO: more details need to be filled)
>   3. log it into MCE event queue;
> 
> After that, MCE mechanism can walk over its notifier chain to execute
> specific handlers.
> 
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  arch/x86/kernel/cpu/mce/core.c |  1 +
>  drivers/cxl/core/mbox.c        | 33 +++++++++++++++++++++++++++++++++
>  2 files changed, 34 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index bc39252bc54f..a64c0aceb7e0 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -131,6 +131,7 @@ void mce_setup(struct mce *m)
>  	m->ppin = cpu_data(m->extcpu).ppin;
>  	m->microcode = boot_cpu_data.microcode;
>  }
> +EXPORT_SYMBOL_GPL(mce_setup);

No, mce_setup() is x86 specific and the CXL subsystem is CPU
architecture independent. My expectation is that CXL should translate
errors for edac similar to how the ACPI GHES code does it. See usage of
edac_raw_mc_handle_error() and memory_failure_queue().

Otherwise an MCE is a CPU consumption of poison event, and CXL is
reporting device-side discovery of poison.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [RFC PATCH 4/5] cxl/core: add report option for cxl_mem_get_poison()
  2024-02-09 11:54   ` Shiyang Ruan via
  (?)
@ 2024-02-10  6:49   ` Dan Williams
  2024-03-14 15:01       ` Shiyang Ruan via
  -1 siblings, 1 reply; 40+ messages in thread
From: Dan Williams @ 2024-02-10  6:49 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

Shiyang Ruan wrote:
> When a poison event is received, driver uses GET_POISON_LIST command
> to get the poison list.  Now driver has cxl_mem_get_poison(), so
> reuse it and add a parameter 'bool report', report poison record to MCE
> if set true.

If the memory error record has the poison event, why does the poison
list need to be retrieved by the kernel? I would expect it is sufficient
to just report the single poison event and leave it to userspace to
react to that event and retrieve more data if it wants.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [RFC PATCH 5/5] cxl/core: add poison injection event handler
  2024-02-09 11:54   ` Shiyang Ruan via
  (?)
@ 2024-02-10  6:54   ` Dan Williams
  -1 siblings, 0 replies; 40+ messages in thread
From: Dan Williams @ 2024-02-10  6:54 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams

Shiyang Ruan wrote:
> Currently driver only trace cxl events, poison injection on cxl memdev
> is silent.  OS needs to be notified then it could handle poison range
> in time.  Per CXL spec, the device error event could be signaled through
> FW-First and OS-First methods.
> 
> So, add poison event handler in OS-First method:
>   - qemu:
>     - CXL device report POISON event to OS by MSI by sending GMER after
>       injecting a poison record

QEMU details do not belong in a kernel changelog. It is ok for an RFC,
but my hope is that this can be tested on hardware after being proven on
QEMU.

>   - CXL driver
>     a. read the POISON event through GMER;   <-- this patch
>     b. get POISON list;
>     c. translate DPA to HPA;
>     d. construct a mce instance, then call mce_log() to queue this mce
>        instance;

It is not clear to me why the kernel should proactively fire machine
check notifications on injection? The changelog needs to make clear why
the kernel should do this, and the consequences of not going it.

For CPU consumed poison the machine check event will already fire. For
background discovery of poison, that should translate to a
memory_failure() notification with teh MF_ACTION_REQUIRED flag cleared.
Userspace, like rasdaemon, can then make a page offline decision.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH SET] cxl: add poison event handler
  2024-02-09 11:54 ` Shiyang Ruan via
                   ` (7 preceding siblings ...)
  (?)
@ 2024-02-13  0:20 ` Dave Jiang
  -1 siblings, 0 replies; 40+ messages in thread
From: Dave Jiang @ 2024-02-13  0:20 UTC (permalink / raw)
  To: Shiyang Ruan, qemu-devel, linux-cxl; +Cc: Jonathan.Cameron, dan.j.williams



On 2/9/24 4:54 AM, Shiyang Ruan wrote:
> Currently driver only trace cxl events, poison injection on cxl memdev
> is silent.  OS needs to be notified then it could handle poison range
> in time.  Per CXL spec, the device error event could be signaled through
> FW-First and OS-First methods.
> 
> This draft patchset adds poison event handler in OS-First method:
>   - qemu:
>     - CXL device report POISON event to OS by MSI by sending GMER after
>       injecting a poison record
>   - CXL driver
>     a. read the POISON event through GMER;
>     b. get POISON list;
>     c. translate DPA to HPA;
>     d. construct a mce instance, then call mce_log() to queue this mce
>        instance;
> 
> This patchset includes 2 patches for qemu, 5 patches for cxl driver.

Hi Ruan,
Next time please split this out and post as 2 different series. You can have the CXL driver series cover letter reference the QEMU changes. And please add QEMU to subject prefix for the QEMU patches. Thank you!

> 
> Shiyang Ruan (5):
>   cxl/core: correct length of DPA field masks
>   cxl/core: introduce cxl_memdev_dpa_to_hpa()
>   cxl/core: introduce cxl_mem_report_poison()
>   cxl/core: add report option for cxl_mem_get_poison()
>   cxl/core: add poison injection event handler
> 
>  arch/x86/kernel/cpu/mce/core.c |  1 +
>  drivers/cxl/core/mbox.c        | 82 +++++++++++++++++++++++++++-------
>  drivers/cxl/core/memdev.c      | 16 ++++++-
>  drivers/cxl/core/region.c      |  8 ++--
>  drivers/cxl/core/trace.h       |  6 +--
>  drivers/cxl/cxlmem.h           | 11 ++---
>  drivers/cxl/pci.c              |  4 +-
>  7 files changed, 97 insertions(+), 31 deletions(-)
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 1/2] hw/cxl/type3: add missing flag bit for GMER
  2024-02-09 11:54   ` Shiyang Ruan via
@ 2024-02-13 16:27     ` Jonathan Cameron via
  -1 siblings, 0 replies; 40+ messages in thread
From: Jonathan Cameron @ 2024-02-13 16:27 UTC (permalink / raw)
  To: Shiyang Ruan; +Cc: qemu-devel, linux-cxl, dan.j.williams

On Fri,  9 Feb 2024 19:54:11 +0800
Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:

> The "Volatile" should be set if current device is a volatile memory.
> Per CXL Spec r3.0 8.2.9.2.1.1, Table 8-43.

Whilst true, we haven't previously adjusted the injected record
to conform to the device.  I don't think there is anything preventing
you setting this bit in the dpa fields whilst injecting.

I'm not against filling this in automatically but we should be
masking the relevant bits out of what is injected and filling it in based
on the dpa that is provided and which memory that falls within.


> 
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  hw/mem/cxl_type3.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index 52647b4ac7..d8fb63b1de 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -1285,6 +1285,9 @@ static const QemuUUID memory_module_uuid = {
>                   0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74),
>  };
>  
> +#define CXL_GMER_DPA_VOLATILE                           BIT(0)
> +#define CXL_GMER_DPA_NOT_REPAIRABLE                     BIT(1)
> +
>  #define CXL_GMER_VALID_CHANNEL                          BIT(0)
>  #define CXL_GMER_VALID_RANK                             BIT(1)
>  #define CXL_GMER_VALID_DEVICE                           BIT(2)
> @@ -1348,6 +1351,9 @@ void qmp_cxl_inject_general_media_event(const char *path, CxlEventLog log,
>      cxl_assign_event_header(hdr, &gen_media_uuid, flags, sizeof(gem),
>                              cxl_device_get_timestamp(&ct3d->cxl_dstate));
>  
> +    if (ct3d->hostvmem) {
> +        dpa |= CXL_GMER_DPA_VOLATILE;
> +    }

The presence of volatile memory isn't sufficient.  You may have a device
with both volatile and persistent memory. It will get even fiddlier with
Dynamic Capacity which will need checking as well once that support is
ready.



>      stq_le_p(&gem.phys_addr, dpa);
>      gem.descriptor = descriptor;
>      gem.type = type;


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 1/2] hw/cxl/type3: add missing flag bit for GMER
@ 2024-02-13 16:27     ` Jonathan Cameron via
  0 siblings, 0 replies; 40+ messages in thread
From: Jonathan Cameron via @ 2024-02-13 16:27 UTC (permalink / raw)
  To: Shiyang Ruan; +Cc: qemu-devel, linux-cxl, dan.j.williams

On Fri,  9 Feb 2024 19:54:11 +0800
Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:

> The "Volatile" should be set if current device is a volatile memory.
> Per CXL Spec r3.0 8.2.9.2.1.1, Table 8-43.

Whilst true, we haven't previously adjusted the injected record
to conform to the device.  I don't think there is anything preventing
you setting this bit in the dpa fields whilst injecting.

I'm not against filling this in automatically but we should be
masking the relevant bits out of what is injected and filling it in based
on the dpa that is provided and which memory that falls within.


> 
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  hw/mem/cxl_type3.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index 52647b4ac7..d8fb63b1de 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -1285,6 +1285,9 @@ static const QemuUUID memory_module_uuid = {
>                   0x79, 0xba, 0xb1, 0x13, 0xb7, 0x74),
>  };
>  
> +#define CXL_GMER_DPA_VOLATILE                           BIT(0)
> +#define CXL_GMER_DPA_NOT_REPAIRABLE                     BIT(1)
> +
>  #define CXL_GMER_VALID_CHANNEL                          BIT(0)
>  #define CXL_GMER_VALID_RANK                             BIT(1)
>  #define CXL_GMER_VALID_DEVICE                           BIT(2)
> @@ -1348,6 +1351,9 @@ void qmp_cxl_inject_general_media_event(const char *path, CxlEventLog log,
>      cxl_assign_event_header(hdr, &gen_media_uuid, flags, sizeof(gem),
>                              cxl_device_get_timestamp(&ct3d->cxl_dstate));
>  
> +    if (ct3d->hostvmem) {
> +        dpa |= CXL_GMER_DPA_VOLATILE;
> +    }

The presence of volatile memory isn't sufficient.  You may have a device
with both volatile and persistent memory. It will get even fiddlier with
Dynamic Capacity which will need checking as well once that support is
ready.



>      stq_le_p(&gem.phys_addr, dpa);
>      gem.descriptor = descriptor;
>      gem.type = type;



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 2/2] hw/cxl/type3: send a GMER while injecting poison
  2024-02-09 11:54   ` Shiyang Ruan via
@ 2024-02-13 16:32     ` Jonathan Cameron via
  -1 siblings, 0 replies; 40+ messages in thread
From: Jonathan Cameron @ 2024-02-13 16:32 UTC (permalink / raw)
  To: Shiyang Ruan; +Cc: qemu-devel, linux-cxl, dan.j.williams

On Fri,  9 Feb 2024 19:54:12 +0800
Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:

> Send a signal to OS to let it able to handle the poison range.
> 
> TODO: This is an rough draft, will add more parameters for
> qmp_cxl_inject_poison() to set to GMER.

I wonder if that's the best plan or if we should think about
providing some default memory topology and perhaps a means
to override it.   Adding more parameters to the injection
commands works for qmp injection but not for mailbox based
injection from the host for example.

If we have a DPA to channel, rank, etc default mapping then
we should be able to fill them all in based on just the DPA
but allow them to be overridden from the existing GMER
injection (so other test cases can be poked).

One comment inline.

Jonathan

> 
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  hw/mem/cxl_type3.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index d8fb63b1de..813f7f2175 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -1116,6 +1116,11 @@ void qmp_cxl_inject_poison(const char *path, uint64_t start, uint64_t length,
>  
>      QLIST_INSERT_HEAD(&ct3d->poison_list, p, node);
>      ct3d->poison_list_cnt++;
> +
> +    /* Emit an GMER event, let os handle it */
> +    qmp_cxl_inject_general_media_event(path, CXL_EVENT_LOG_FAILURE, 0, start,
> +                                       0, 0, 4, false, 0, false, 0,
> +                                       false, 0, NULL, errp);

This results in some slightly messy duplication of effort. I'd like to see the
parts we need factored out of the qmp_cxl_inject_general_media_event()
so that we don't end up looking up the device twice and similar.
Pull the code we need in both places out to
cxl_inject_general_media_event(CXLType3 *ct3d, ...

helper in a precursor patch and reuse that code here.

  }
>  
>  /* For uncorrectable errors include support for multiple header recording */


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 2/2] hw/cxl/type3: send a GMER while injecting poison
@ 2024-02-13 16:32     ` Jonathan Cameron via
  0 siblings, 0 replies; 40+ messages in thread
From: Jonathan Cameron via @ 2024-02-13 16:32 UTC (permalink / raw)
  To: Shiyang Ruan; +Cc: qemu-devel, linux-cxl, dan.j.williams

On Fri,  9 Feb 2024 19:54:12 +0800
Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:

> Send a signal to OS to let it able to handle the poison range.
> 
> TODO: This is an rough draft, will add more parameters for
> qmp_cxl_inject_poison() to set to GMER.

I wonder if that's the best plan or if we should think about
providing some default memory topology and perhaps a means
to override it.   Adding more parameters to the injection
commands works for qmp injection but not for mailbox based
injection from the host for example.

If we have a DPA to channel, rank, etc default mapping then
we should be able to fill them all in based on just the DPA
but allow them to be overridden from the existing GMER
injection (so other test cases can be poked).

One comment inline.

Jonathan

> 
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  hw/mem/cxl_type3.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index d8fb63b1de..813f7f2175 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -1116,6 +1116,11 @@ void qmp_cxl_inject_poison(const char *path, uint64_t start, uint64_t length,
>  
>      QLIST_INSERT_HEAD(&ct3d->poison_list, p, node);
>      ct3d->poison_list_cnt++;
> +
> +    /* Emit an GMER event, let os handle it */
> +    qmp_cxl_inject_general_media_event(path, CXL_EVENT_LOG_FAILURE, 0, start,
> +                                       0, 0, 4, false, 0, false, 0,
> +                                       false, 0, NULL, errp);

This results in some slightly messy duplication of effort. I'd like to see the
parts we need factored out of the qmp_cxl_inject_general_media_event()
so that we don't end up looking up the device twice and similar.
Pull the code we need in both places out to
cxl_inject_general_media_event(CXLType3 *ct3d, ...

helper in a precursor patch and reuse that code here.

  }
>  
>  /* For uncorrectable errors include support for multiple header recording */



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 5/5] cxl/core: add poison injection event handler
  2024-02-09 11:54   ` Shiyang Ruan via
@ 2024-02-13 16:51     ` Jonathan Cameron via
  -1 siblings, 0 replies; 40+ messages in thread
From: Jonathan Cameron @ 2024-02-13 16:51 UTC (permalink / raw)
  To: Shiyang Ruan; +Cc: qemu-devel, linux-cxl, dan.j.williams


> +
> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +			     enum cxl_event_log_type type,
> +			     enum cxl_event_type event_type,
> +			     const uuid_t *uuid, union cxl_event *evt)
> +{
> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
>  		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
> -	else if (event_type == CXL_CPER_EVENT_DRAM)
> +		/* handle poison event */
> +		if (type == CXL_EVENT_TYPE_FAIL)
> +			cxl_event_handle_poison(cxlmd, &evt->gen_media); 

I'm not 100% convinced this is necessary poison causing.  Also
the text tells us we should see 'an appropriate event'.
DRAM one seems likely to be chosen by some vendors.

The fatal check maybe makes it a little more likely (maybe though
I'm not sure anything says a device must log it to the failure log)
but it might be Memory Event Type 1, which is the host tried to
access an invalid address.  Sure poison might be returned to that
error but what would the main kernel memory handling do with it?
Something is very wrong
but it's not corrupted device memory.  TE state violations are in there
as well. Sure poison is returned on reads (I think - haven't checked).

IF the aim here is to say 'maybe there is poison, better check the
poison list'. Then that is reasonable but we should ensure things
like timer expiry are definitely ruled out and rename the function
to make it clear it might not find poison.

Jonathan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 5/5] cxl/core: add poison injection event handler
@ 2024-02-13 16:51     ` Jonathan Cameron via
  0 siblings, 0 replies; 40+ messages in thread
From: Jonathan Cameron via @ 2024-02-13 16:51 UTC (permalink / raw)
  To: Shiyang Ruan; +Cc: qemu-devel, linux-cxl, dan.j.williams


> +
> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
> +			     enum cxl_event_log_type type,
> +			     enum cxl_event_type event_type,
> +			     const uuid_t *uuid, union cxl_event *evt)
> +{
> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
>  		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
> -	else if (event_type == CXL_CPER_EVENT_DRAM)
> +		/* handle poison event */
> +		if (type == CXL_EVENT_TYPE_FAIL)
> +			cxl_event_handle_poison(cxlmd, &evt->gen_media); 

I'm not 100% convinced this is necessary poison causing.  Also
the text tells us we should see 'an appropriate event'.
DRAM one seems likely to be chosen by some vendors.

The fatal check maybe makes it a little more likely (maybe though
I'm not sure anything says a device must log it to the failure log)
but it might be Memory Event Type 1, which is the host tried to
access an invalid address.  Sure poison might be returned to that
error but what would the main kernel memory handling do with it?
Something is very wrong
but it's not corrupted device memory.  TE state violations are in there
as well. Sure poison is returned on reads (I think - haven't checked).

IF the aim here is to say 'maybe there is poison, better check the
poison list'. Then that is reasonable but we should ensure things
like timer expiry are definitely ruled out and rename the function
to make it clear it might not find poison.

Jonathan


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 3/5] cxl/core: introduce cxl_mem_report_poison()
  2024-02-09 11:54   ` Shiyang Ruan via
  (?)
  (?)
@ 2024-02-15  1:19   ` Tony Luck
  -1 siblings, 0 replies; 40+ messages in thread
From: Tony Luck @ 2024-02-15  1:19 UTC (permalink / raw)
  To: Shiyang Ruan; +Cc: qemu-devel, linux-cxl, Jonathan.Cameron, dan.j.williams

On Fri, Feb 09, 2024 at 07:54:15PM +0800, Shiyang Ruan wrote:
> If poison is detected(reported from cxl memdev), OS should be notified to
> handle it.  Introduce this function:
>   1. translate DPA to HPA;
>   2. construct a MCE instance; (TODO: more details need to be filled)
>   3. log it into MCE event queue;
> 
> After that, MCE mechanism can walk over its notifier chain to execute
> specific handlers.

This looks like a useful proof of concept patch to pass errors to all
the existing logging systems (console, mcelog, rasdaemon, EDAC). But
it's a bare minimum (just passing the address and dropping any other
interesting information about the error). I think we need something
more advanced that covers more CXL error types.

> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
>  arch/x86/kernel/cpu/mce/core.c |  1 +
>  drivers/cxl/core/mbox.c        | 33 +++++++++++++++++++++++++++++++++
>  2 files changed, 34 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index bc39252bc54f..a64c0aceb7e0 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -131,6 +131,7 @@ void mce_setup(struct mce *m)
>  	m->ppin = cpu_data(m->extcpu).ppin;
>  	m->microcode = boot_cpu_data.microcode;
>  }
> +EXPORT_SYMBOL_GPL(mce_setup);
>  
>  DEFINE_PER_CPU(struct mce, injectm);
>  EXPORT_PER_CPU_SYMBOL_GPL(injectm);
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index 27166a411705..f9b6f50fbe80 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -4,6 +4,7 @@
>  #include <linux/debugfs.h>
>  #include <linux/ktime.h>
>  #include <linux/mutex.h>
> +#include <asm/mce.h>
>  #include <asm/unaligned.h>
>  #include <cxlpci.h>
>  #include <cxlmem.h>
> @@ -1290,6 +1291,38 @@ int cxl_set_timestamp(struct cxl_memdev_state *mds)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_set_timestamp, CXL);
>  
> +static void cxl_mem_report_poison(struct cxl_memdev *cxlmd,
> +				  struct cxl_poison_record *poison)
> +{
> +	struct mce m;
> +	u64 dpa = le64_to_cpu(poison->address) & CXL_POISON_START_MASK;
> +	u64 len = le64_to_cpu(poison->length), i;
> +	phys_addr_t phys_addr = cxl_memdev_dpa_to_hpa(cxlmd, dpa);
> +
> +	if (phys_addr)
> +		return;
> +
> +	/*
> +	 * Initialize struct mce.  Call preempt_disable() to avoid
> +	 * "BUG: using smp_processor_id() in preemptible" for now, not sure
> +	 * if this is a correct way.
> +	 */
> +	preempt_disable();
> +	mce_setup(&m);
> +	preempt_enable();
> +
> +	m.bank = -1;
> +	/* Fake a memory read error with unknown channel */
> +	m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV |
> +		   MCI_STATUS_MISCV | 0x9f;
> +	m.misc = (MCI_MISC_ADDR_PHYS << 6);
> +
> +	for (i = 0; i < len; i++) {
> +		m.addr = phys_addr++;
> +		mce_log(&m);

This loop looks wrong. What values do you expect for "len" (a.k.a.
poison->length)? Creating one log for each byte in the range will
be very noisy!

> +	}
> +}
> +
>  int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
>  		       struct cxl_region *cxlr)
>  {
> -- 
> 2.34.1

-Tony

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 1/5] cxl/core: correct length of DPA field masks
  2024-02-10  6:34   ` Dan Williams
@ 2024-02-19 10:49       ` Shiyang Ruan
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan via @ 2024-02-19 10:49 UTC (permalink / raw)
  To: Dan Williams; +Cc: Jonathan.Cameron, qemu-devel, linux-cxl



在 2024/2/10 14:34, Dan Williams 写道:
> Shiyang Ruan wrote:
>> The length of Physical Address in General Media Event Record/DRAM Event
>> Record is 64-bit, so the field mask should be defined as such length.
> 
> Can you include this user visible side-effect of this change. Looks like
> this could cause usages of CXL_DPA_FLAGS_MASK to return an incorrect
> result?

Ok.  Will change it to this:

The length of Physical Address in General Media Event Record/DRAM Event 
Record is 64bit, per CXL Spec r3.0 - 8.2.9.2.1.1, Table 8-43.  Currently 
CXL_DPA_FLAGS_MASK is defined as int (32bit), then CXL_DPA_MASK is a int 
too, it will be 0x00000000FFFFFFC0 while using "->dpa & CXL_DPA_MASK" to 
obtain real physical address (to drop flags in lower bits), in this case 
the higher 32bit of ->dpa will be lost.

To avoid this, define CXL_DPA_FLAGS_MASK as 64bit: 0x3FULL.


--
Thanks,
Ruan.

> 
>>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
>> ---
>>   drivers/cxl/core/trace.h | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
>> index 89445435303a..388a87d972c2 100644
>> --- a/drivers/cxl/core/trace.h
>> +++ b/drivers/cxl/core/trace.h
>> @@ -253,11 +253,11 @@ TRACE_EVENT(cxl_generic_event,
>>    * DRAM Event Record
>>    * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>>    */
>> -#define CXL_DPA_FLAGS_MASK			0x3F
>> +#define CXL_DPA_FLAGS_MASK			0x3FULL
>>   #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
>>   
>> -#define CXL_DPA_VOLATILE			BIT(0)
>> -#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
>> +#define CXL_DPA_VOLATILE			BIT_ULL(0)
>> +#define CXL_DPA_NOT_REPAIRABLE			BIT_ULL(1)
>>   #define show_dpa_flags(flags)	__print_flags(flags, "|",		   \
>>   	{ CXL_DPA_VOLATILE,			"VOLATILE"		}, \
>>   	{ CXL_DPA_NOT_REPAIRABLE,		"NOT_REPAIRABLE"	}  \
>> -- 
>> 2.34.1
>>
> 
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 1/5] cxl/core: correct length of DPA field masks
@ 2024-02-19 10:49       ` Shiyang Ruan
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan @ 2024-02-19 10:49 UTC (permalink / raw)
  To: Dan Williams; +Cc: Jonathan.Cameron, qemu-devel, linux-cxl



在 2024/2/10 14:34, Dan Williams 写道:
> Shiyang Ruan wrote:
>> The length of Physical Address in General Media Event Record/DRAM Event
>> Record is 64-bit, so the field mask should be defined as such length.
> 
> Can you include this user visible side-effect of this change. Looks like
> this could cause usages of CXL_DPA_FLAGS_MASK to return an incorrect
> result?

Ok.  Will change it to this:

The length of Physical Address in General Media Event Record/DRAM Event 
Record is 64bit, per CXL Spec r3.0 - 8.2.9.2.1.1, Table 8-43.  Currently 
CXL_DPA_FLAGS_MASK is defined as int (32bit), then CXL_DPA_MASK is a int 
too, it will be 0x00000000FFFFFFC0 while using "->dpa & CXL_DPA_MASK" to 
obtain real physical address (to drop flags in lower bits), in this case 
the higher 32bit of ->dpa will be lost.

To avoid this, define CXL_DPA_FLAGS_MASK as 64bit: 0x3FULL.


--
Thanks,
Ruan.

> 
>>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
>> ---
>>   drivers/cxl/core/trace.h | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
>> index 89445435303a..388a87d972c2 100644
>> --- a/drivers/cxl/core/trace.h
>> +++ b/drivers/cxl/core/trace.h
>> @@ -253,11 +253,11 @@ TRACE_EVENT(cxl_generic_event,
>>    * DRAM Event Record
>>    * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>>    */
>> -#define CXL_DPA_FLAGS_MASK			0x3F
>> +#define CXL_DPA_FLAGS_MASK			0x3FULL
>>   #define CXL_DPA_MASK				(~CXL_DPA_FLAGS_MASK)
>>   
>> -#define CXL_DPA_VOLATILE			BIT(0)
>> -#define CXL_DPA_NOT_REPAIRABLE			BIT(1)
>> +#define CXL_DPA_VOLATILE			BIT_ULL(0)
>> +#define CXL_DPA_NOT_REPAIRABLE			BIT_ULL(1)
>>   #define show_dpa_flags(flags)	__print_flags(flags, "|",		   \
>>   	{ CXL_DPA_VOLATILE,			"VOLATILE"		}, \
>>   	{ CXL_DPA_NOT_REPAIRABLE,		"NOT_REPAIRABLE"	}  \
>> -- 
>> 2.34.1
>>
> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 1/5] cxl/core: correct length of DPA field masks
  2024-02-19 10:49       ` Shiyang Ruan
  (?)
@ 2024-02-22  2:27       ` Dan Williams
  -1 siblings, 0 replies; 40+ messages in thread
From: Dan Williams @ 2024-02-22  2:27 UTC (permalink / raw)
  To: Shiyang Ruan, Dan Williams
  Cc: Jonathan.Cameron, qemu-devel, linux-cxl, ira.weiny, dave

[ add Ira and Davidlohr ]

Shiyang Ruan wrote:
> 
> 
> 在 2024/2/10 14:34, Dan Williams 写道:
> > Shiyang Ruan wrote:
> >> The length of Physical Address in General Media Event Record/DRAM Event
> >> Record is 64-bit, so the field mask should be defined as such length.
> > 
> > Can you include this user visible side-effect of this change. Looks like
> > this could cause usages of CXL_DPA_FLAGS_MASK to return an incorrect
> > result?
> 
> Ok.  Will change it to this:
> 
> The length of Physical Address in General Media Event Record/DRAM Event 
> Record is 64bit, per CXL Spec r3.0 - 8.2.9.2.1.1, Table 8-43.  Currently 
> CXL_DPA_FLAGS_MASK is defined as int (32bit), then CXL_DPA_MASK is a int 
> too, it will be 0x00000000FFFFFFC0 while using "->dpa & CXL_DPA_MASK" to 
> obtain real physical address (to drop flags in lower bits), in this case 
> the higher 32bit of ->dpa will be lost.
> 
> To avoid this, define CXL_DPA_FLAGS_MASK as 64bit: 0x3FULL.

That part was clear, what is missing is the impact. For example, this
bug only impacts the output of the dpa field in the cxl_general_media
and cxl_dram tracepoints, but no other part of the CXL code.

So in this case the impact of the bug is low, but in the worst case an
wrong size mask could compromise the security / integrity of the system.

So I would expect a changelog like this to make the importance of the
fix clear and to notify the original stakeholders where the bug was
introduced.

---
The length of Physical Address in General Media Event Record/DRAM Event
Record is 64-bit, so the field mask should be defined as such length.
Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
mask off the upper-32-bits of DPA addresses. The cxl_poison event is
unaffected.

If userspace was doing its own DPA-to-HPA translation this could lead to
incorrect page retirement decisions, but there is no known consumer
(like rasdaemon) of this event today.

Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
Cc: <stable@vger.kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Ira Weiny <ira.weiny@intel.com>
---

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 4/5] cxl/core: add report option for cxl_mem_get_poison()
  2024-02-10  6:49   ` Dan Williams
@ 2024-03-14 15:01       ` Shiyang Ruan via
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan @ 2024-03-14 15:01 UTC (permalink / raw)
  To: Dan Williams; +Cc: Jonathan.Cameron, qemu-devel, linux-cxl



在 2024/2/10 14:49, Dan Williams 写道:
> Shiyang Ruan wrote:
>> When a poison event is received, driver uses GET_POISON_LIST command
>> to get the poison list.  Now driver has cxl_mem_get_poison(), so
>> reuse it and add a parameter 'bool report', report poison record to MCE
>> if set true.
> 
> If the memory error record has the poison event, why does the poison
> list need to be retrieved by the kernel? I would expect it is sufficient
> to just report the single poison event and leave it to userspace to
> react to that event and retrieve more data if it wants.

The GMER has only physical address field, no range/length of the POISON, 
we can't get the poison range from the single event record.  Since the 
POISON range is injected by one command, one GMER is sent to driver, we 
have to use GET_POISON_LIST command to get the length.


--
Thanks,
Ruan.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 4/5] cxl/core: add report option for cxl_mem_get_poison()
@ 2024-03-14 15:01       ` Shiyang Ruan via
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan via @ 2024-03-14 15:01 UTC (permalink / raw)
  To: Dan Williams; +Cc: Jonathan.Cameron, qemu-devel, linux-cxl



在 2024/2/10 14:49, Dan Williams 写道:
> Shiyang Ruan wrote:
>> When a poison event is received, driver uses GET_POISON_LIST command
>> to get the poison list.  Now driver has cxl_mem_get_poison(), so
>> reuse it and add a parameter 'bool report', report poison record to MCE
>> if set true.
> 
> If the memory error record has the poison event, why does the poison
> list need to be retrieved by the kernel? I would expect it is sufficient
> to just report the single poison event and leave it to userspace to
> react to that event and retrieve more data if it wants.

The GMER has only physical address field, no range/length of the POISON, 
we can't get the poison range from the single event record.  Since the 
POISON range is injected by one command, one GMER is sent to driver, we 
have to use GET_POISON_LIST command to get the length.


--
Thanks,
Ruan.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 3/5] cxl/core: introduce cxl_mem_report_poison()
  2024-02-10  6:46   ` Dan Williams
@ 2024-03-14 15:23       ` Shiyang Ruan via
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan @ 2024-03-14 15:23 UTC (permalink / raw)
  To: Dan Williams; +Cc: Jonathan.Cameron, qemu-devel, linux-cxl



在 2024/2/10 14:46, Dan Williams 写道:
> Shiyang Ruan wrote:
>> If poison is detected(reported from cxl memdev), OS should be notified to
>> handle it.  Introduce this function:
>>    1. translate DPA to HPA;
>>    2. construct a MCE instance; (TODO: more details need to be filled)
>>    3. log it into MCE event queue;
>>
>> After that, MCE mechanism can walk over its notifier chain to execute
>> specific handlers.
>>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
>> ---
>>   arch/x86/kernel/cpu/mce/core.c |  1 +
>>   drivers/cxl/core/mbox.c        | 33 +++++++++++++++++++++++++++++++++
>>   2 files changed, 34 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
>> index bc39252bc54f..a64c0aceb7e0 100644
>> --- a/arch/x86/kernel/cpu/mce/core.c
>> +++ b/arch/x86/kernel/cpu/mce/core.c
>> @@ -131,6 +131,7 @@ void mce_setup(struct mce *m)
>>   	m->ppin = cpu_data(m->extcpu).ppin;
>>   	m->microcode = boot_cpu_data.microcode;
>>   }
>> +EXPORT_SYMBOL_GPL(mce_setup);
> 
> No, mce_setup() is x86 specific and the CXL subsystem is CPU
> architecture independent. My expectation is that CXL should translate
> errors for edac similar to how the ACPI GHES code does it. See usage of
> edac_raw_mc_handle_error() and memory_failure_queue().
> 
> Otherwise an MCE is a CPU consumption of poison event, and CXL is
> reporting device-side discovery of poison.

Yes, I misunderstood here.  I was mean to use MCE to finally call 
memory_failure(). I think memory_failure_queue() is what I need.

   void memory_failure_queue(unsigned long pfn, int flags)

But it can only queue one PFN at a time, we may need to make it support 
queuing a range of PFN.


--
Thanks,
Ruan.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 3/5] cxl/core: introduce cxl_mem_report_poison()
@ 2024-03-14 15:23       ` Shiyang Ruan via
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan via @ 2024-03-14 15:23 UTC (permalink / raw)
  To: Dan Williams; +Cc: Jonathan.Cameron, qemu-devel, linux-cxl



在 2024/2/10 14:46, Dan Williams 写道:
> Shiyang Ruan wrote:
>> If poison is detected(reported from cxl memdev), OS should be notified to
>> handle it.  Introduce this function:
>>    1. translate DPA to HPA;
>>    2. construct a MCE instance; (TODO: more details need to be filled)
>>    3. log it into MCE event queue;
>>
>> After that, MCE mechanism can walk over its notifier chain to execute
>> specific handlers.
>>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
>> ---
>>   arch/x86/kernel/cpu/mce/core.c |  1 +
>>   drivers/cxl/core/mbox.c        | 33 +++++++++++++++++++++++++++++++++
>>   2 files changed, 34 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
>> index bc39252bc54f..a64c0aceb7e0 100644
>> --- a/arch/x86/kernel/cpu/mce/core.c
>> +++ b/arch/x86/kernel/cpu/mce/core.c
>> @@ -131,6 +131,7 @@ void mce_setup(struct mce *m)
>>   	m->ppin = cpu_data(m->extcpu).ppin;
>>   	m->microcode = boot_cpu_data.microcode;
>>   }
>> +EXPORT_SYMBOL_GPL(mce_setup);
> 
> No, mce_setup() is x86 specific and the CXL subsystem is CPU
> architecture independent. My expectation is that CXL should translate
> errors for edac similar to how the ACPI GHES code does it. See usage of
> edac_raw_mc_handle_error() and memory_failure_queue().
> 
> Otherwise an MCE is a CPU consumption of poison event, and CXL is
> reporting device-side discovery of poison.

Yes, I misunderstood here.  I was mean to use MCE to finally call 
memory_failure(). I think memory_failure_queue() is what I need.

   void memory_failure_queue(unsigned long pfn, int flags)

But it can only queue one PFN at a time, we may need to make it support 
queuing a range of PFN.


--
Thanks,
Ruan.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 5/5] cxl/core: add poison injection event handler
  2024-02-13 16:51     ` Jonathan Cameron via
@ 2024-03-15  2:29       ` Shiyang Ruan via
  -1 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan @ 2024-03-15  2:29 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: qemu-devel, linux-cxl, dan.j.williams



在 2024/2/14 0:51, Jonathan Cameron 写道:
> 
>> +
>> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
>> +			     enum cxl_event_log_type type,
>> +			     enum cxl_event_type event_type,
>> +			     const uuid_t *uuid, union cxl_event *evt)
>> +{
>> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
>>   		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
>> -	else if (event_type == CXL_CPER_EVENT_DRAM)
>> +		/* handle poison event */
>> +		if (type == CXL_EVENT_TYPE_FAIL)
>> +			cxl_event_handle_poison(cxlmd, &evt->gen_media);
> 
> I'm not 100% convinced this is necessary poison causing.  Also
> the text tells us we should see 'an appropriate event'.
> DRAM one seems likely to be chosen by some vendors.

I think it's right to use DRAM Event Record for volatile-memdev, but 
should poison on a persistent-memdev also use DRAM Event Record too? 
Though its 'Physical Address' feild has the 'Volatile' bit too, which is 
same as General Media Event Record.  I am a bit confused about this.

> 
> The fatal check maybe makes it a little more likely (maybe though
> I'm not sure anything says a device must log it to the failure log)
> but it might be Memory Event Type 1, which is the host tried to
> access an invalid address.  Sure poison might be returned to that
> error but what would the main kernel memory handling do with it?
> Something is very wrong
> but it's not corrupted device memory.  TE state violations are in there
> as well. Sure poison is returned on reads (I think - haven't checked).
> 
> IF the aim here is to say 'maybe there is poison, better check the
> poison list'. Then that is reasonable but we should ensure things
> like timer expiry are definitely ruled out and rename the function
> to make it clear it might not find poison.

I forgot to distinguish the 'Transaction Type' here. Host Inject Poison 
is 0x04h. And other types should also have their specific handle method.


--
Thanks,
Ruan.

> 
> Jonathan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 5/5] cxl/core: add poison injection event handler
@ 2024-03-15  2:29       ` Shiyang Ruan via
  0 siblings, 0 replies; 40+ messages in thread
From: Shiyang Ruan via @ 2024-03-15  2:29 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: qemu-devel, linux-cxl, dan.j.williams



在 2024/2/14 0:51, Jonathan Cameron 写道:
> 
>> +
>> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
>> +			     enum cxl_event_log_type type,
>> +			     enum cxl_event_type event_type,
>> +			     const uuid_t *uuid, union cxl_event *evt)
>> +{
>> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
>>   		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
>> -	else if (event_type == CXL_CPER_EVENT_DRAM)
>> +		/* handle poison event */
>> +		if (type == CXL_EVENT_TYPE_FAIL)
>> +			cxl_event_handle_poison(cxlmd, &evt->gen_media);
> 
> I'm not 100% convinced this is necessary poison causing.  Also
> the text tells us we should see 'an appropriate event'.
> DRAM one seems likely to be chosen by some vendors.

I think it's right to use DRAM Event Record for volatile-memdev, but 
should poison on a persistent-memdev also use DRAM Event Record too? 
Though its 'Physical Address' feild has the 'Volatile' bit too, which is 
same as General Media Event Record.  I am a bit confused about this.

> 
> The fatal check maybe makes it a little more likely (maybe though
> I'm not sure anything says a device must log it to the failure log)
> but it might be Memory Event Type 1, which is the host tried to
> access an invalid address.  Sure poison might be returned to that
> error but what would the main kernel memory handling do with it?
> Something is very wrong
> but it's not corrupted device memory.  TE state violations are in there
> as well. Sure poison is returned on reads (I think - haven't checked).
> 
> IF the aim here is to say 'maybe there is poison, better check the
> poison list'. Then that is reasonable but we should ensure things
> like timer expiry are definitely ruled out and rename the function
> to make it clear it might not find poison.

I forgot to distinguish the 'Transaction Type' here. Host Inject Poison 
is 0x04h. And other types should also have their specific handle method.


--
Thanks,
Ruan.

> 
> Jonathan


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 5/5] cxl/core: add poison injection event handler
  2024-03-15  2:29       ` Shiyang Ruan via
@ 2024-04-05 17:35         ` Jonathan Cameron via
  -1 siblings, 0 replies; 40+ messages in thread
From: Jonathan Cameron @ 2024-04-05 17:35 UTC (permalink / raw)
  To: Shiyang Ruan; +Cc: qemu-devel, linux-cxl, dan.j.williams

On Fri, 15 Mar 2024 10:29:07 +0800
Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:

> 在 2024/2/14 0:51, Jonathan Cameron 写道:
> >   
> >> +
> >> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
> >> +			     enum cxl_event_log_type type,
> >> +			     enum cxl_event_type event_type,
> >> +			     const uuid_t *uuid, union cxl_event *evt)
> >> +{
> >> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
> >>   		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
> >> -	else if (event_type == CXL_CPER_EVENT_DRAM)
> >> +		/* handle poison event */
> >> +		if (type == CXL_EVENT_TYPE_FAIL)
> >> +			cxl_event_handle_poison(cxlmd, &evt->gen_media);  
> > 
> > I'm not 100% convinced this is necessary poison causing.  Also
> > the text tells us we should see 'an appropriate event'.
> > DRAM one seems likely to be chosen by some vendors.  
> 
> I think it's right to use DRAM Event Record for volatile-memdev, but 
> should poison on a persistent-memdev also use DRAM Event Record too? 
> Though its 'Physical Address' feild has the 'Volatile' bit too, which is 
> same as General Media Event Record.  I am a bit confused about this.

That is indeed 'novel' in a DRAM device, but maybe it could be battery
backed and have a path to say a flash device that isn't visible to CXL
and form which the DRAM is refilled on power restore?

Anyhow, doesn't make sense for persistent memory that doesn't correspond
to all the other stuff in the DRAM event.
> 
> > 
> > The fatal check maybe makes it a little more likely (maybe though
> > I'm not sure anything says a device must log it to the failure log)
> > but it might be Memory Event Type 1, which is the host tried to
> > access an invalid address.  Sure poison might be returned to that
> > error but what would the main kernel memory handling do with it?
> > Something is very wrong
> > but it's not corrupted device memory.  TE state violations are in there
> > as well. Sure poison is returned on reads (I think - haven't checked).
> > 
> > IF the aim here is to say 'maybe there is poison, better check the
> > poison list'. Then that is reasonable but we should ensure things
> > like timer expiry are definitely ruled out and rename the function
> > to make it clear it might not find poison.  
> 
> I forgot to distinguish the 'Transaction Type' here. Host Inject Poison 
> is 0x04h. And other types should also have their specific handle method.
Yes. If you can use transaction type that solves this issue I think.
> 
> 
> --
> Thanks,
> Ruan.
> 
> > 
> > Jonathan  


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 5/5] cxl/core: add poison injection event handler
@ 2024-04-05 17:35         ` Jonathan Cameron via
  0 siblings, 0 replies; 40+ messages in thread
From: Jonathan Cameron via @ 2024-04-05 17:35 UTC (permalink / raw)
  To: Shiyang Ruan; +Cc: qemu-devel, linux-cxl, dan.j.williams

On Fri, 15 Mar 2024 10:29:07 +0800
Shiyang Ruan <ruansy.fnst@fujitsu.com> wrote:

> 在 2024/2/14 0:51, Jonathan Cameron 写道:
> >   
> >> +
> >> +void cxl_event_handle_record(struct cxl_memdev *cxlmd,
> >> +			     enum cxl_event_log_type type,
> >> +			     enum cxl_event_type event_type,
> >> +			     const uuid_t *uuid, union cxl_event *evt)
> >> +{
> >> +	if (event_type == CXL_CPER_EVENT_GEN_MEDIA) {
> >>   		trace_cxl_general_media(cxlmd, type, &evt->gen_media);
> >> -	else if (event_type == CXL_CPER_EVENT_DRAM)
> >> +		/* handle poison event */
> >> +		if (type == CXL_EVENT_TYPE_FAIL)
> >> +			cxl_event_handle_poison(cxlmd, &evt->gen_media);  
> > 
> > I'm not 100% convinced this is necessary poison causing.  Also
> > the text tells us we should see 'an appropriate event'.
> > DRAM one seems likely to be chosen by some vendors.  
> 
> I think it's right to use DRAM Event Record for volatile-memdev, but 
> should poison on a persistent-memdev also use DRAM Event Record too? 
> Though its 'Physical Address' feild has the 'Volatile' bit too, which is 
> same as General Media Event Record.  I am a bit confused about this.

That is indeed 'novel' in a DRAM device, but maybe it could be battery
backed and have a path to say a flash device that isn't visible to CXL
and form which the DRAM is refilled on power restore?

Anyhow, doesn't make sense for persistent memory that doesn't correspond
to all the other stuff in the DRAM event.
> 
> > 
> > The fatal check maybe makes it a little more likely (maybe though
> > I'm not sure anything says a device must log it to the failure log)
> > but it might be Memory Event Type 1, which is the host tried to
> > access an invalid address.  Sure poison might be returned to that
> > error but what would the main kernel memory handling do with it?
> > Something is very wrong
> > but it's not corrupted device memory.  TE state violations are in there
> > as well. Sure poison is returned on reads (I think - haven't checked).
> > 
> > IF the aim here is to say 'maybe there is poison, better check the
> > poison list'. Then that is reasonable but we should ensure things
> > like timer expiry are definitely ruled out and rename the function
> > to make it clear it might not find poison.  
> 
> I forgot to distinguish the 'Transaction Type' here. Host Inject Poison 
> is 0x04h. And other types should also have their specific handle method.
Yes. If you can use transaction type that solves this issue I think.
> 
> 
> --
> Thanks,
> Ruan.
> 
> > 
> > Jonathan  



^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2024-04-05 17:36 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-09 11:54 [RFC PATCH SET] cxl: add poison event handler Shiyang Ruan
2024-02-09 11:54 ` Shiyang Ruan via
2024-02-09 11:54 ` [RFC PATCH 1/2] hw/cxl/type3: add missing flag bit for GMER Shiyang Ruan
2024-02-09 11:54   ` Shiyang Ruan via
2024-02-13 16:27   ` Jonathan Cameron
2024-02-13 16:27     ` Jonathan Cameron via
2024-02-09 11:54 ` [RFC PATCH 2/2] hw/cxl/type3: send a GMER while injecting poison Shiyang Ruan
2024-02-09 11:54   ` Shiyang Ruan via
2024-02-13 16:32   ` Jonathan Cameron
2024-02-13 16:32     ` Jonathan Cameron via
2024-02-09 11:54 ` [RFC PATCH 1/5] cxl/core: correct length of DPA field masks Shiyang Ruan
2024-02-09 11:54   ` Shiyang Ruan via
2024-02-10  6:34   ` Dan Williams
2024-02-19 10:49     ` Shiyang Ruan via
2024-02-19 10:49       ` Shiyang Ruan
2024-02-22  2:27       ` Dan Williams
2024-02-09 11:54 ` [RFC PATCH 2/5] cxl/core: introduce cxl_memdev_dpa_to_hpa() Shiyang Ruan
2024-02-09 11:54   ` Shiyang Ruan via
2024-02-10  6:39   ` Dan Williams
2024-02-09 11:54 ` [RFC PATCH 3/5] cxl/core: introduce cxl_mem_report_poison() Shiyang Ruan
2024-02-09 11:54   ` Shiyang Ruan via
2024-02-10  6:46   ` Dan Williams
2024-03-14 15:23     ` Shiyang Ruan
2024-03-14 15:23       ` Shiyang Ruan via
2024-02-15  1:19   ` Tony Luck
2024-02-09 11:54 ` [RFC PATCH 4/5] cxl/core: add report option for cxl_mem_get_poison() Shiyang Ruan
2024-02-09 11:54   ` Shiyang Ruan via
2024-02-10  6:49   ` Dan Williams
2024-03-14 15:01     ` Shiyang Ruan
2024-03-14 15:01       ` Shiyang Ruan via
2024-02-09 11:54 ` [RFC PATCH 5/5] cxl/core: add poison injection event handler Shiyang Ruan
2024-02-09 11:54   ` Shiyang Ruan via
2024-02-10  6:54   ` Dan Williams
2024-02-13 16:51   ` Jonathan Cameron
2024-02-13 16:51     ` Jonathan Cameron via
2024-03-15  2:29     ` Shiyang Ruan
2024-03-15  2:29       ` Shiyang Ruan via
2024-04-05 17:35       ` Jonathan Cameron
2024-04-05 17:35         ` Jonathan Cameron via
2024-02-13  0:20 ` [RFC PATCH SET] cxl: add poison " Dave Jiang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.