linux-cxl.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] hw/cxl: Poison get, inject, clear
@ 2023-02-17 18:18 Jonathan Cameron
  2023-02-17 18:18 ` [PATCH 1/6] hw/cxl: Move enum ret_code definition to cxl_device.h Jonathan Cameron
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Jonathan Cameron @ 2023-02-17 18:18 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Gregory Price,
	Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

Note Alison has stated the kernel series will be post 6.3 material
so this one isn't quite as urgent as the patches it is based on.
However I think this series in a good state (plus I have lots more queued
behind it) hence promoting it from RFC.

Changes since RFC v2: Thanks to Markus for review.
 - Improve documentation for QMP interface
 - Add better description of baseline series
 - Include precursor refactors around ret_code / CXLRetCode as this is now
   the first series in suggeste merge order to rely on those.
 - Include Ira's cxl_device_get_timestamp() function as it was better than
   the equivalent in the RFC.

Based on following series (in order)
1. [PATCH v4 00/10] hw/cxl: CXL emulation cleanups and minor fixes for upstream
2. [PATCH v4 0/8] hw/cxl: RAS error emulation and injection
3. [PATCH 0/2] hw/cxl: Passthrough HDM decoder emulation
4. [PATCH v2 0/2] hw/mem: CXL Type-3 Volatile Memory Support

Based on: Message-Id: 20230206172816.8201-1-Jonathan.Cameron@huawei.com
Based-on: Message-id: 20230217172924.25239-1-Jonathan.Cameron@huawei.com
Based-on: Message-id: 20230125152703.9928-1-Jonathan.Cameron@huawei.com
Based-on: Message-id: 20230217175657.26632-1-Jonathan.Cameron@huawei.com

The series supports:
1) Injection of variable length poison regions via QMP (to fake real
   memory corruption and ensure we deal with odd overflow corner cases
   such as clearing the middle of a large region making the list overflow
   as we go from one long entry to two smaller entries.
2) Read of poison list via the CXL mailbox.
3) Injection via the poison injection mailbox command (limited to 64 byte
   entries)
4) Clearing of poison injected via either method.

The implementation is meant to be a valid combination of impdef choices
based on what the spec allowed. There are a number of places where it could
be made more sophisticated that we might consider in future:
* Fusing adjacent poison entries if the types match.
* Separate injection list and main poison list, to test out limits on
  injected poison list being smaller than the main list.
* Poison list overflow event (needs event log support in general)
* Connecting up to the poison list error record generation (rather complex
  and not needed for currently kernel handling testing).

As the kernel code is currently fairly simple, it is likely that the above
does not yet matter but who knows what will turn up in future!

Kernel patches:
 [PATCH v6 0/6] CXL Poison List Retrieval & Tracing
 Message-id: cover.1675983077.git.alison.schofield@intel.com
 [PATCH v2 0/6] cxl: CXL Inject & Clear Poison
 cover.1674101475.git.alison.schofield@intel.com


Ira Weiny (1):
  hw/cxl: Introduce cxl_device_get_timestamp() utility function

Jonathan Cameron (5):
  hw/cxl: Move enum ret_code definition to cxl_device.h
  hw/cxl: rename mailbox return code type from ret_code to CXLRetCode
  hw/cxl: QMP based poison injection support
  hw/cxl: Add poison injection via the mailbox.
  hw/cxl: Add clear poison mailbox command support.

 hw/cxl/cxl-device-utils.c   |  15 ++
 hw/cxl/cxl-mailbox-utils.c  | 300 +++++++++++++++++++++++++++---------
 hw/mem/cxl_type3.c          |  92 +++++++++++
 hw/mem/cxl_type3_stubs.c    |   3 +
 hw/mem/meson.build          |   2 +
 include/hw/cxl/cxl_device.h |  51 ++++++
 qapi/cxl.json               |  16 ++
 7 files changed, 410 insertions(+), 69 deletions(-)

-- 
2.37.2


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/6] hw/cxl: Move enum ret_code definition to cxl_device.h
  2023-02-17 18:18 [PATCH 0/6] hw/cxl: Poison get, inject, clear Jonathan Cameron
@ 2023-02-17 18:18 ` Jonathan Cameron
  2023-02-22  1:47   ` Ira Weiny
  2023-02-17 18:18 ` [PATCH 2/6] hw/cxl: rename mailbox return code type from ret_code to CXLRetCode Jonathan Cameron
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Jonathan Cameron @ 2023-02-17 18:18 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Gregory Price,
	Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

Needs tidy up and rename to something more generic now it is
in a header.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/cxl/cxl-mailbox-utils.c  | 28 ----------------------------
 include/hw/cxl/cxl_device.h | 28 ++++++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index cc9c8b7380..d9bd5daa0d 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -64,34 +64,6 @@ enum {
         #define SET_LSA       0x3
 };
 
-/* 8.2.8.4.5.1 Command Return Codes */
-typedef enum {
-    CXL_MBOX_SUCCESS = 0x0,
-    CXL_MBOX_BG_STARTED = 0x1,
-    CXL_MBOX_INVALID_INPUT = 0x2,
-    CXL_MBOX_UNSUPPORTED = 0x3,
-    CXL_MBOX_INTERNAL_ERROR = 0x4,
-    CXL_MBOX_RETRY_REQUIRED = 0x5,
-    CXL_MBOX_BUSY = 0x6,
-    CXL_MBOX_MEDIA_DISABLED = 0x7,
-    CXL_MBOX_FW_XFER_IN_PROGRESS = 0x8,
-    CXL_MBOX_FW_XFER_OUT_OF_ORDER = 0x9,
-    CXL_MBOX_FW_AUTH_FAILED = 0xa,
-    CXL_MBOX_FW_INVALID_SLOT = 0xb,
-    CXL_MBOX_FW_ROLLEDBACK = 0xc,
-    CXL_MBOX_FW_REST_REQD = 0xd,
-    CXL_MBOX_INVALID_HANDLE = 0xe,
-    CXL_MBOX_INVALID_PA = 0xf,
-    CXL_MBOX_INJECT_POISON_LIMIT = 0x10,
-    CXL_MBOX_PERMANENT_MEDIA_FAILURE = 0x11,
-    CXL_MBOX_ABORTED = 0x12,
-    CXL_MBOX_INVALID_SECURITY_STATE = 0x13,
-    CXL_MBOX_INCORRECT_PASSPHRASE = 0x14,
-    CXL_MBOX_UNSUPPORTED_MAILBOX = 0x15,
-    CXL_MBOX_INVALID_PAYLOAD_LENGTH = 0x16,
-    CXL_MBOX_MAX = 0x17
-} ret_code;
-
 struct cxl_cmd;
 typedef ret_code (*opcode_handler)(struct cxl_cmd *cmd,
                                    CXLDeviceState *cxl_dstate, uint16_t *len);
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index edb9791bab..d01c6b29c5 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -82,6 +82,34 @@
     (CXL_DEVICE_CAP_REG_SIZE + CXL_DEVICE_STATUS_REGISTERS_LENGTH +     \
      CXL_MAILBOX_REGISTERS_LENGTH + CXL_MEMORY_DEVICE_REGISTERS_LENGTH)
 
+/* 8.2.8.4.5.1 Command Return Codes */
+typedef enum {
+    CXL_MBOX_SUCCESS = 0x0,
+    CXL_MBOX_BG_STARTED = 0x1,
+    CXL_MBOX_INVALID_INPUT = 0x2,
+    CXL_MBOX_UNSUPPORTED = 0x3,
+    CXL_MBOX_INTERNAL_ERROR = 0x4,
+    CXL_MBOX_RETRY_REQUIRED = 0x5,
+    CXL_MBOX_BUSY = 0x6,
+    CXL_MBOX_MEDIA_DISABLED = 0x7,
+    CXL_MBOX_FW_XFER_IN_PROGRESS = 0x8,
+    CXL_MBOX_FW_XFER_OUT_OF_ORDER = 0x9,
+    CXL_MBOX_FW_AUTH_FAILED = 0xa,
+    CXL_MBOX_FW_INVALID_SLOT = 0xb,
+    CXL_MBOX_FW_ROLLEDBACK = 0xc,
+    CXL_MBOX_FW_REST_REQD = 0xd,
+    CXL_MBOX_INVALID_HANDLE = 0xe,
+    CXL_MBOX_INVALID_PA = 0xf,
+    CXL_MBOX_INJECT_POISON_LIMIT = 0x10,
+    CXL_MBOX_PERMANENT_MEDIA_FAILURE = 0x11,
+    CXL_MBOX_ABORTED = 0x12,
+    CXL_MBOX_INVALID_SECURITY_STATE = 0x13,
+    CXL_MBOX_INCORRECT_PASSPHRASE = 0x14,
+    CXL_MBOX_UNSUPPORTED_MAILBOX = 0x15,
+    CXL_MBOX_INVALID_PAYLOAD_LENGTH = 0x16,
+    CXL_MBOX_MAX = 0x17
+} ret_code;
+
 typedef struct cxl_device_state {
     MemoryRegion device_registers;
 
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/6] hw/cxl: rename mailbox return code type from ret_code to CXLRetCode
  2023-02-17 18:18 [PATCH 0/6] hw/cxl: Poison get, inject, clear Jonathan Cameron
  2023-02-17 18:18 ` [PATCH 1/6] hw/cxl: Move enum ret_code definition to cxl_device.h Jonathan Cameron
@ 2023-02-17 18:18 ` Jonathan Cameron
  2023-02-22  1:47   ` Ira Weiny
  2023-02-17 18:18 ` [PATCH 3/6] hw/cxl: Introduce cxl_device_get_timestamp() utility function Jonathan Cameron
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Jonathan Cameron @ 2023-02-17 18:18 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Gregory Price,
	Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

This enum typedef used to be local to one file, so having a generic
name wasn't a big problem even if it wasn't compliant with QEMU naming
conventions.  Now it is in cxl_device.h to support use outside of
cxl-mailbox-utils.c rename it.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/cxl/cxl-mailbox-utils.c  | 62 ++++++++++++++++++-------------------
 include/hw/cxl/cxl_device.h |  2 +-
 2 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index d9bd5daa0d..67aca3fd6c 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -23,7 +23,7 @@
  *     FOO    = 0x7f,
  *          #define BAR 0
  *  2. Implement the handler
- *    static ret_code cmd_foo_bar(struct cxl_cmd *cmd,
+ *    static CXLRetCode cmd_foo_bar(struct cxl_cmd *cmd,
  *                                  CXLDeviceState *cxl_dstate, uint16_t *len)
  *  3. Add the command to the cxl_cmd_set[][]
  *    [FOO][BAR] = { "FOO_BAR", cmd_foo_bar, x, y },
@@ -65,7 +65,7 @@ enum {
 };
 
 struct cxl_cmd;
-typedef ret_code (*opcode_handler)(struct cxl_cmd *cmd,
+typedef CXLRetCode (*opcode_handler)(struct cxl_cmd *cmd,
                                    CXLDeviceState *cxl_dstate, uint16_t *len);
 struct cxl_cmd {
     const char *name;
@@ -77,16 +77,16 @@ struct cxl_cmd {
 
 #define DEFINE_MAILBOX_HANDLER_ZEROED(name, size)                         \
     uint16_t __zero##name = size;                                         \
-    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
-                               CXLDeviceState *cxl_dstate, uint16_t *len) \
+    static CXLRetCode cmd_##name(struct cxl_cmd *cmd,                       \
+                                 CXLDeviceState *cxl_dstate, uint16_t *len) \
     {                                                                     \
         *len = __zero##name;                                              \
         memset(cmd->payload, 0, *len);                                    \
         return CXL_MBOX_SUCCESS;                                          \
     }
 #define DEFINE_MAILBOX_HANDLER_NOP(name)                                  \
-    static ret_code cmd_##name(struct cxl_cmd *cmd,                       \
-                               CXLDeviceState *cxl_dstate, uint16_t *len) \
+    static CXLRetCode cmd_##name(struct cxl_cmd *cmd,                       \
+                                 CXLDeviceState *cxl_dstate, uint16_t *len) \
     {                                                                     \
         return CXL_MBOX_SUCCESS;                                          \
     }
@@ -97,9 +97,9 @@ DEFINE_MAILBOX_HANDLER_ZEROED(events_get_interrupt_policy, 4);
 DEFINE_MAILBOX_HANDLER_NOP(events_set_interrupt_policy);
 
 /* 8.2.9.2.1 */
-static ret_code cmd_firmware_update_get_info(struct cxl_cmd *cmd,
-                                             CXLDeviceState *cxl_dstate,
-                                             uint16_t *len)
+static CXLRetCode cmd_firmware_update_get_info(struct cxl_cmd *cmd,
+                                               CXLDeviceState *cxl_dstate,
+                                               uint16_t *len)
 {
     struct {
         uint8_t slots_supported;
@@ -131,9 +131,9 @@ static ret_code cmd_firmware_update_get_info(struct cxl_cmd *cmd,
 }
 
 /* 8.2.9.3.1 */
-static ret_code cmd_timestamp_get(struct cxl_cmd *cmd,
-                                  CXLDeviceState *cxl_dstate,
-                                  uint16_t *len)
+static CXLRetCode cmd_timestamp_get(struct cxl_cmd *cmd,
+                                    CXLDeviceState *cxl_dstate,
+                                    uint16_t *len)
 {
     uint64_t time, delta;
     uint64_t final_time = 0;
@@ -153,7 +153,7 @@ static ret_code cmd_timestamp_get(struct cxl_cmd *cmd,
 }
 
 /* 8.2.9.3.2 */
-static ret_code cmd_timestamp_set(struct cxl_cmd *cmd,
+static CXLRetCode cmd_timestamp_set(struct cxl_cmd *cmd,
                                   CXLDeviceState *cxl_dstate,
                                   uint16_t *len)
 {
@@ -173,9 +173,9 @@ static const QemuUUID cel_uuid = {
 };
 
 /* 8.2.9.4.1 */
-static ret_code cmd_logs_get_supported(struct cxl_cmd *cmd,
-                                       CXLDeviceState *cxl_dstate,
-                                       uint16_t *len)
+static CXLRetCode cmd_logs_get_supported(struct cxl_cmd *cmd,
+                                         CXLDeviceState *cxl_dstate,
+                                         uint16_t *len)
 {
     struct {
         uint16_t entries;
@@ -196,9 +196,9 @@ static ret_code cmd_logs_get_supported(struct cxl_cmd *cmd,
 }
 
 /* 8.2.9.4.2 */
-static ret_code cmd_logs_get_log(struct cxl_cmd *cmd,
-                                 CXLDeviceState *cxl_dstate,
-                                 uint16_t *len)
+static CXLRetCode cmd_logs_get_log(struct cxl_cmd *cmd,
+                                   CXLDeviceState *cxl_dstate,
+                                   uint16_t *len)
 {
     struct {
         QemuUUID uuid;
@@ -237,9 +237,9 @@ static ret_code cmd_logs_get_log(struct cxl_cmd *cmd,
 }
 
 /* 8.2.9.5.1.1 */
-static ret_code cmd_identify_memory_device(struct cxl_cmd *cmd,
-                                           CXLDeviceState *cxl_dstate,
-                                           uint16_t *len)
+static CXLRetCode cmd_identify_memory_device(struct cxl_cmd *cmd,
+                                             CXLDeviceState *cxl_dstate,
+                                             uint16_t *len)
 {
     struct {
         char fw_revision[0x10];
@@ -281,9 +281,9 @@ static ret_code cmd_identify_memory_device(struct cxl_cmd *cmd,
     return CXL_MBOX_SUCCESS;
 }
 
-static ret_code cmd_ccls_get_partition_info(struct cxl_cmd *cmd,
-                                           CXLDeviceState *cxl_dstate,
-                                           uint16_t *len)
+static CXLRetCode cmd_ccls_get_partition_info(struct cxl_cmd *cmd,
+                                              CXLDeviceState *cxl_dstate,
+                                              uint16_t *len)
 {
     struct {
         uint64_t active_vmem;
@@ -311,9 +311,9 @@ static ret_code cmd_ccls_get_partition_info(struct cxl_cmd *cmd,
     return CXL_MBOX_SUCCESS;
 }
 
-static ret_code cmd_ccls_get_lsa(struct cxl_cmd *cmd,
-                                 CXLDeviceState *cxl_dstate,
-                                 uint16_t *len)
+static CXLRetCode cmd_ccls_get_lsa(struct cxl_cmd *cmd,
+                                   CXLDeviceState *cxl_dstate,
+                                   uint16_t *len)
 {
     struct {
         uint32_t offset;
@@ -336,9 +336,9 @@ static ret_code cmd_ccls_get_lsa(struct cxl_cmd *cmd,
     return CXL_MBOX_SUCCESS;
 }
 
-static ret_code cmd_ccls_set_lsa(struct cxl_cmd *cmd,
-                                 CXLDeviceState *cxl_dstate,
-                                 uint16_t *len)
+static CXLRetCode cmd_ccls_set_lsa(struct cxl_cmd *cmd,
+                                   CXLDeviceState *cxl_dstate,
+                                   uint16_t *len)
 {
     struct set_lsa_pl {
         uint32_t offset;
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index d01c6b29c5..b737c3699f 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -108,7 +108,7 @@ typedef enum {
     CXL_MBOX_UNSUPPORTED_MAILBOX = 0x15,
     CXL_MBOX_INVALID_PAYLOAD_LENGTH = 0x16,
     CXL_MBOX_MAX = 0x17
-} ret_code;
+} CXLRetCode;
 
 typedef struct cxl_device_state {
     MemoryRegion device_registers;
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/6] hw/cxl: Introduce cxl_device_get_timestamp() utility function
  2023-02-17 18:18 [PATCH 0/6] hw/cxl: Poison get, inject, clear Jonathan Cameron
  2023-02-17 18:18 ` [PATCH 1/6] hw/cxl: Move enum ret_code definition to cxl_device.h Jonathan Cameron
  2023-02-17 18:18 ` [PATCH 2/6] hw/cxl: rename mailbox return code type from ret_code to CXLRetCode Jonathan Cameron
@ 2023-02-17 18:18 ` Jonathan Cameron
  2023-02-17 18:18 ` [PATCH 4/6] hw/cxl: QMP based poison injection support Jonathan Cameron
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Jonathan Cameron @ 2023-02-17 18:18 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Gregory Price,
	Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

From: Ira Weiny <ira.weiny@intel.com>

There are new users of this functionality coming shortly so factor
it out from the GET_TIMESTAMP mailbox command handling.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/cxl/cxl-device-utils.c   | 15 +++++++++++++++
 hw/cxl/cxl-mailbox-utils.c  | 11 +----------
 include/hw/cxl/cxl_device.h |  2 ++
 3 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index 4c5e88aaf5..86e1cea8ce 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -269,3 +269,18 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
 
     cxl_initialize_mailbox(cxl_dstate);
 }
+
+uint64_t cxl_device_get_timestamp(CXLDeviceState *cxl_dstate)
+{
+    uint64_t time, delta;
+    uint64_t final_time = 0;
+
+    if (cxl_dstate->timestamp.set) {
+        /* Find the delta from the last time the host set the time. */
+        time = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+        delta = time - cxl_dstate->timestamp.last_set;
+        final_time = cxl_dstate->timestamp.host_set + delta;
+    }
+
+    return final_time;
+}
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 67aca3fd6c..580366ed2f 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -135,17 +135,8 @@ static CXLRetCode cmd_timestamp_get(struct cxl_cmd *cmd,
                                     CXLDeviceState *cxl_dstate,
                                     uint16_t *len)
 {
-    uint64_t time, delta;
-    uint64_t final_time = 0;
-
-    if (cxl_dstate->timestamp.set) {
-        /* First find the delta from the last time the host set the time. */
-        time = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-        delta = time - cxl_dstate->timestamp.last_set;
-        final_time = cxl_dstate->timestamp.host_set + delta;
-    }
+    uint64_t final_time = cxl_device_get_timestamp(cxl_dstate);
 
-    /* Then adjust the actual time */
     stq_le_p(cmd->payload, final_time);
     *len = 8;
 
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index b737c3699f..44fea2d649 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -315,4 +315,6 @@ MemTxResult cxl_type3_read(PCIDevice *d, hwaddr host_addr, uint64_t *data,
 MemTxResult cxl_type3_write(PCIDevice *d, hwaddr host_addr, uint64_t data,
                             unsigned size, MemTxAttrs attrs);
 
+uint64_t cxl_device_get_timestamp(CXLDeviceState *cxlds);
+
 #endif
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/6] hw/cxl: QMP based poison injection support
  2023-02-17 18:18 [PATCH 0/6] hw/cxl: Poison get, inject, clear Jonathan Cameron
                   ` (2 preceding siblings ...)
  2023-02-17 18:18 ` [PATCH 3/6] hw/cxl: Introduce cxl_device_get_timestamp() utility function Jonathan Cameron
@ 2023-02-17 18:18 ` Jonathan Cameron
  2023-02-22  1:14   ` Ira Weiny
  2023-02-17 18:18 ` [PATCH 5/6] hw/cxl: Add poison injection via the mailbox Jonathan Cameron
  2023-02-17 18:18 ` [PATCH 6/6] hw/cxl: Add clear poison mailbox command support Jonathan Cameron
  5 siblings, 1 reply; 15+ messages in thread
From: Jonathan Cameron @ 2023-02-17 18:18 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Gregory Price,
	Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

Inject poison using qmp command cxl-inject-poison to add an entry to the
poison list.

For now, the poison is not returned CXL.mem reads, but only via the
mailbox command Get Poison List.

See CXL rev 3.0, sec 8.2.9.8.4.1 Get Poison list (Opcode 4300h)

Kernel patches to use this interface here:
https://lore.kernel.org/linux-cxl/cover.1665606782.git.alison.schofield@intel.com/

To inject poison using qmp (telnet to the qmp port)
{ "execute": "qmp_capabilities" }

{ "execute": "cxl-inject-poison",
    "arguments": {
         "path": "/machine/peripheral/cxl-pmem0",
         "start": 2048,
         "length": 256
    }
}

Adjusted to select a device on your machine.

Note that the poison list supported is kept short enough to avoid the
complexity of state machine that is needed to handle the MORE flag.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

---
v3:
Improve QMP documentation.

v2:
Moved to QMP to allow for single command.
Update reference in coverletter
Added specific setting of type for this approach to injection.
Drop the unnecessary ct3d class get_poison_list callback.
Block overlapping regions from being injected
Handle list overflow
Use Ira's utility function to get the timestamps
---
 hw/cxl/cxl-mailbox-utils.c  | 82 +++++++++++++++++++++++++++++++++++++
 hw/mem/cxl_type3.c          | 56 +++++++++++++++++++++++++
 hw/mem/cxl_type3_stubs.c    |  3 ++
 hw/mem/meson.build          |  2 +
 include/hw/cxl/cxl_device.h | 20 +++++++++
 qapi/cxl.json               | 16 ++++++++
 6 files changed, 179 insertions(+)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 580366ed2f..cf3cfb10a1 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -62,6 +62,8 @@ enum {
         #define GET_PARTITION_INFO     0x0
         #define GET_LSA       0x2
         #define SET_LSA       0x3
+    MEDIA_AND_POISON = 0x43,
+        #define GET_POISON_LIST        0x0
 };
 
 struct cxl_cmd;
@@ -267,6 +269,8 @@ static CXLRetCode cmd_identify_memory_device(struct cxl_cmd *cmd,
     id->persistent_capacity = cxl_dstate->pmem_size / CXL_CAPACITY_MULTIPLIER;
     id->volatile_capacity = cxl_dstate->vmem_size / CXL_CAPACITY_MULTIPLIER;
     id->lsa_size = cvc->get_lsa_size(ct3d);
+    id->poison_list_max_mer[1] = 0x1; /* 256 poison records */
+    id->inject_poison_limit = 0; /* No limit - so limited by main poison record limit */
 
     *len = sizeof(*id);
     return CXL_MBOX_SUCCESS;
@@ -356,6 +360,82 @@ static CXLRetCode cmd_ccls_set_lsa(struct cxl_cmd *cmd,
     return CXL_MBOX_SUCCESS;
 }
 
+/*
+ * This is very inefficient, but good enough for now!
+ * Also the payload will always fit, so no need to handle the MORE flag and
+ * make this stateful. We may want to allow longer poison lists to aid
+ * testing that kernel functionality.
+ */
+static CXLRetCode cmd_media_get_poison_list(struct cxl_cmd *cmd,
+                                            CXLDeviceState *cxl_dstate,
+                                            uint16_t *len)
+{
+    struct get_poison_list_pl {
+        uint64_t pa;
+        uint64_t length;
+    } QEMU_PACKED;
+
+    struct get_poison_list_out_pl {
+        uint8_t flags;
+        uint8_t rsvd1;
+        uint64_t overflow_timestamp;
+        uint16_t count;
+        uint8_t rsvd2[0x14];
+        struct {
+            uint64_t addr;
+            uint32_t length;
+            uint32_t resv;
+        } QEMU_PACKED records[];
+    } QEMU_PACKED;
+
+    struct get_poison_list_pl *in = (void *)cmd->payload;
+    struct get_poison_list_out_pl *out = (void *)cmd->payload;
+    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
+    uint16_t record_count = 0, i = 0;
+    uint64_t query_start = in->pa;
+    uint64_t query_length = in->length;
+    CXLPoisonList *poison_list = &ct3d->poison_list;
+    CXLPoison *ent;
+    uint16_t out_pl_len;
+
+    QLIST_FOREACH(ent, poison_list, node) {
+        /* Check for no overlap */
+        if (ent->start >= query_start + query_length ||
+            ent->start + ent->length <= query_start) {
+            continue;
+        }
+        record_count++;
+    }
+    out_pl_len = sizeof(*out) + record_count * sizeof(out->records[0]);
+    assert(out_pl_len <= CXL_MAILBOX_MAX_PAYLOAD_SIZE);
+
+    memset(out, 0, out_pl_len);
+    QLIST_FOREACH(ent, poison_list, node) {
+        uint64_t start, stop;
+
+        /* Check for no overlap */
+        if (ent->start >= query_start + query_length ||
+            ent->start + ent->length <= query_start) {
+            continue;
+        }
+
+        /* Deal with overlap */
+        start = MAX(ent->start & 0xffffffffffffffc0, query_start);
+        stop = MIN((ent->start & 0xffffffffffffffc0) + ent->length,
+                   query_start + query_length);
+        out->records[i].addr = start | (ent->type & 0x3);
+        out->records[i].length = (stop - start) / 64;
+        i++;
+    }
+    if (ct3d->poison_list_overflowed) {
+        out->flags = (1 << 1);
+        out->overflow_timestamp = ct3d->poison_list_overflow_ts;
+    }
+    out->count = record_count;
+    *len = out_pl_len;
+    return CXL_MBOX_SUCCESS;
+}
+
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
 #define IMMEDIATE_DATA_CHANGE (1 << 2)
 #define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -383,6 +463,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
     [CCLS][GET_LSA] = { "CCLS_GET_LSA", cmd_ccls_get_lsa, 8, 0 },
     [CCLS][SET_LSA] = { "CCLS_SET_LSA", cmd_ccls_set_lsa,
         ~0, IMMEDIATE_CONFIG_CHANGE | IMMEDIATE_DATA_CHANGE },
+    [MEDIA_AND_POISON][GET_POISON_LIST] = { "MEDIA_AND_POISON_GET_POISON_LIST",
+        cmd_media_get_poison_list, 16, 0 },
 };
 
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 8b7727a75b..3585f78b4e 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -925,6 +925,62 @@ static void set_lsa(CXLType3Dev *ct3d, const void *buf, uint64_t size,
      */
 }
 
+void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d)
+{
+        ct3d->poison_list_overflowed = true;
+        ct3d->poison_list_overflow_ts =
+            cxl_device_get_timestamp(&ct3d->cxl_dstate);
+}
+
+void qmp_cxl_inject_poison(const char *path, uint64_t start, uint64_t length,
+                           Error **errp)
+{
+    Object *obj = object_resolve_path(path, NULL);
+    CXLType3Dev *ct3d;
+    CXLPoison *p;
+
+    if (length % 64) {
+        error_setg(errp, "Poison injection must be in multiples of 64 bytes");
+        return;
+    }
+    if (start % 64) {
+        error_setg(errp, "Poison start address must be 64 byte aligned");
+        return;
+    }
+    if (!obj) {
+        error_setg(errp, "Unable to resolve path");
+        return;
+    }
+    if (!object_dynamic_cast(obj, TYPE_CXL_TYPE3)) {
+        error_setg(errp, "Path does not point to a CXL type 3 device");
+        return;
+    }
+
+    ct3d = CXL_TYPE3(obj);
+
+    QLIST_FOREACH(p, &ct3d->poison_list, node) {
+        if (((start >= p->start) && (start < p->start + p->length)) ||
+            ((start + length > p->start) &&
+             (start + length <= p->start + p->length))) {
+            error_setg(errp, "Overlap with existing poisoned region not supported");
+            return;
+        }
+    }
+
+    if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
+        cxl_set_poison_list_overflowed(ct3d);
+        return;
+    }
+
+    p = g_new0(CXLPoison, 1);
+    p->length = length;
+    p->start = start;
+    p->type = CXL_POISON_TYPE_INTERNAL; /* Different from injected via the mbox */
+
+    QLIST_INSERT_HEAD(&ct3d->poison_list, p, node);
+    ct3d->poison_list_cnt++;
+}
+
 /* For uncorrectable errors include support for multiple header recording */
 void qmp_cxl_inject_uncorrectable_errors(const char *path,
                                          CXLUncorErrorRecordList *errors,
diff --git a/hw/mem/cxl_type3_stubs.c b/hw/mem/cxl_type3_stubs.c
index b6b51ced54..6055190ca6 100644
--- a/hw/mem/cxl_type3_stubs.c
+++ b/hw/mem/cxl_type3_stubs.c
@@ -2,6 +2,9 @@
 #include "qemu/osdep.h"
 #include "qapi/qapi-commands-cxl.h"
 
+void qmp_cxl_inject_poison(const char *path, uint64_t start, uint64_t length,
+                           Error **errp) {}
+
 void qmp_cxl_inject_uncorrectable_errors(const char *path,
                                          CXLUncorErrorRecordList *errors,
                                          Error **errp) {}
diff --git a/hw/mem/meson.build b/hw/mem/meson.build
index 56c2618b84..930c67e390 100644
--- a/hw/mem/meson.build
+++ b/hw/mem/meson.build
@@ -10,3 +10,5 @@ softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('cxl_type3_stubs.c'))
 softmmu_ss.add_all(when: 'CONFIG_MEM_DEVICE', if_true: mem_ss)
 
 softmmu_ss.add(when: 'CONFIG_SPARSE_MEM', if_true: files('sparse-mem.c'))
+softmmu_ss.add(when: 'CONFIG_CXL_MEM_DEVICE', if_false: files('cxl_type3_stubs.c'))
+softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('cxl_type3_stubs.c'))
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 44fea2d649..3cb77fe8a5 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -270,6 +270,18 @@ typedef struct CXLError {
 
 typedef QTAILQ_HEAD(, CXLError) CXLErrorList;
 
+typedef struct CXLPoison {
+    uint64_t start, length;
+    uint8_t type;
+#define CXL_POISON_TYPE_EXTERNAL 0x1
+#define CXL_POISON_TYPE_INTERNAL 0x2
+#define CXL_POISON_TYPE_INJECTED 0x3
+    QLIST_ENTRY(CXLPoison) node;
+} CXLPoison;
+
+typedef QLIST_HEAD(, CXLPoison) CXLPoisonList;
+#define CXL_POISON_LIST_LIMIT 256
+
 struct CXLType3Dev {
     /* Private */
     PCIDevice parent_obj;
@@ -292,6 +304,12 @@ struct CXLType3Dev {
 
     /* Error injection */
     CXLErrorList error_list;
+
+    /* Poison Injection - cache */
+    CXLPoisonList poison_list;
+    unsigned int poison_list_cnt;
+    bool poison_list_overflowed;
+    uint64_t poison_list_overflow_ts;
 };
 
 #define TYPE_CXL_TYPE3 "cxl-type3"
@@ -317,4 +335,6 @@ MemTxResult cxl_type3_write(PCIDevice *d, hwaddr host_addr, uint64_t data,
 
 uint64_t cxl_device_get_timestamp(CXLDeviceState *cxlds);
 
+void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d);
+
 #endif
diff --git a/qapi/cxl.json b/qapi/cxl.json
index ac7e167fa2..bc099d695e 100644
--- a/qapi/cxl.json
+++ b/qapi/cxl.json
@@ -5,6 +5,22 @@
 # = CXL devices
 ##
 
+##
+# @cxl-inject-poison:
+#
+# Poison records indicate that a CXL memory device knows that a particular
+# memory region may be corrupted. This may be because of locally detected
+# errors (e.g. ECC failure) or poisoned writes received from other components
+# in the system. This injection mechanism enables testing of the OS handling
+# of poison records which may be queried via the CXL mailbox.
+#
+# @path: CXL type 3 device canonical QOM path
+# @start: Start address
+# @length: Length of poison to inject
+##
+{ 'command': 'cxl-inject-poison',
+  'data': { 'path': 'str', 'start': 'uint64', 'length': 'uint64' }}
+
 ##
 # @CxlUncorErrorType:
 #
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/6] hw/cxl: Add poison injection via the mailbox.
  2023-02-17 18:18 [PATCH 0/6] hw/cxl: Poison get, inject, clear Jonathan Cameron
                   ` (3 preceding siblings ...)
  2023-02-17 18:18 ` [PATCH 4/6] hw/cxl: QMP based poison injection support Jonathan Cameron
@ 2023-02-17 18:18 ` Jonathan Cameron
  2023-02-22  1:18   ` Ira Weiny
  2023-02-17 18:18 ` [PATCH 6/6] hw/cxl: Add clear poison mailbox command support Jonathan Cameron
  5 siblings, 1 reply; 15+ messages in thread
From: Jonathan Cameron @ 2023-02-17 18:18 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Gregory Price,
	Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

Very simple implementation to allow testing of corresponding
kernel code. Note that for now we track each 64 byte section
independently.  Whilst a valid implementation choice, it may
make sense to fuse entries so as to prove out more complex
corners of the kernel code.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/cxl/cxl-mailbox-utils.c | 40 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index cf3cfb10a1..7d3f7bcd3a 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -64,6 +64,7 @@ enum {
         #define SET_LSA       0x3
     MEDIA_AND_POISON = 0x43,
         #define GET_POISON_LIST        0x0
+        #define INJECT_POISON          0x1
 };
 
 struct cxl_cmd;
@@ -436,6 +437,43 @@ static CXLRetCode cmd_media_get_poison_list(struct cxl_cmd *cmd,
     return CXL_MBOX_SUCCESS;
 }
 
+static CXLRetCode cmd_media_inject_poison(struct cxl_cmd *cmd,
+                                          CXLDeviceState *cxl_dstate,
+                                          uint16_t *len)
+{
+    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
+    CXLPoisonList *poison_list = &ct3d->poison_list;
+    CXLPoison *ent;
+    struct inject_poison_pl {
+        uint64_t dpa;
+    };
+    struct inject_poison_pl *in = (void *)cmd->payload;
+    CXLPoison *p;
+
+    QLIST_FOREACH(ent, poison_list, node) {
+        if (ent->start == in->dpa && ent->length == 64) {
+            return CXL_MBOX_SUCCESS;
+        }
+    }
+
+    if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
+        return CXL_MBOX_INJECT_POISON_LIMIT;
+    }
+    p = g_new0(CXLPoison, 1);
+
+    p->length = 64;
+    p->start = in->dpa;
+    p->type = CXL_POISON_TYPE_INJECTED;
+
+    /*
+     * Possible todo: Merge with existing entry if next to it and if same type
+     */
+    QLIST_INSERT_HEAD(poison_list, p, node);
+    ct3d->poison_list_cnt++;
+
+    return CXL_MBOX_SUCCESS;
+}
+
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
 #define IMMEDIATE_DATA_CHANGE (1 << 2)
 #define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -465,6 +503,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
         ~0, IMMEDIATE_CONFIG_CHANGE | IMMEDIATE_DATA_CHANGE },
     [MEDIA_AND_POISON][GET_POISON_LIST] = { "MEDIA_AND_POISON_GET_POISON_LIST",
         cmd_media_get_poison_list, 16, 0 },
+    [MEDIA_AND_POISON][INJECT_POISON] = { "MEDIA_AND_POISON_INJECT_POISON",
+        cmd_media_inject_poison, 8, 0 },
 };
 
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 6/6] hw/cxl: Add clear poison mailbox command support.
  2023-02-17 18:18 [PATCH 0/6] hw/cxl: Poison get, inject, clear Jonathan Cameron
                   ` (4 preceding siblings ...)
  2023-02-17 18:18 ` [PATCH 5/6] hw/cxl: Add poison injection via the mailbox Jonathan Cameron
@ 2023-02-17 18:18 ` Jonathan Cameron
  2023-02-22  1:31   ` Ira Weiny
  5 siblings, 1 reply; 15+ messages in thread
From: Jonathan Cameron @ 2023-02-17 18:18 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Gregory Price,
	Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

Current implementation is very simple so many of the corner
cases do not exist (e.g. fragmenting larger poison list entries)

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/cxl/cxl-mailbox-utils.c  | 77 +++++++++++++++++++++++++++++++++++++
 hw/mem/cxl_type3.c          | 36 +++++++++++++++++
 include/hw/cxl/cxl_device.h |  1 +
 3 files changed, 114 insertions(+)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 7d3f7bcd3a..f56c76b205 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -65,6 +65,7 @@ enum {
     MEDIA_AND_POISON = 0x43,
         #define GET_POISON_LIST        0x0
         #define INJECT_POISON          0x1
+        #define CLEAR_POISON           0x2
 };
 
 struct cxl_cmd;
@@ -474,6 +475,80 @@ static CXLRetCode cmd_media_inject_poison(struct cxl_cmd *cmd,
     return CXL_MBOX_SUCCESS;
 }
 
+static CXLRetCode cmd_media_clear_poison(struct cxl_cmd *cmd,
+                                         CXLDeviceState *cxl_dstate,
+                                         uint16_t *len)
+{
+    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
+    CXLPoisonList *poison_list = &ct3d->poison_list;
+    CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d);
+    struct clear_poison_pl {
+        uint64_t dpa;
+        uint8_t data[64];
+    };
+    CXLPoison *ent;
+
+    struct clear_poison_pl *in = (void *)cmd->payload;
+
+    if (in->dpa + 64 > cxl_dstate->mem_size) {
+        return CXL_MBOX_INVALID_PA;
+    }
+
+    QLIST_FOREACH(ent, poison_list, node) {
+        /*
+         * Test for contained in entry. Simpler than general case
+         * as clearing 64 bytes and entries 64 byte aligned
+         */
+        if ((in->dpa < ent->start) || (in->dpa >= ent->start + ent->length)) {
+            continue;
+        }
+        /* Do accounting early as we know one will go away */
+        ct3d->poison_list_cnt--;
+        if (in->dpa > ent->start) {
+            CXLPoison *frag;
+            if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
+                cxl_set_poison_list_overflowed(ct3d);
+                break;
+            }
+            frag = g_new0(CXLPoison, 1);
+
+            frag->start = ent->start;
+            frag->length = in->dpa - ent->start;
+            frag->type = ent->type;
+
+            QLIST_INSERT_HEAD(poison_list, frag, node);
+            ct3d->poison_list_cnt++;
+        }
+        if (in->dpa + 64 < ent->start + ent->length) {
+            CXLPoison *frag;
+
+            if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
+                cxl_set_poison_list_overflowed(ct3d);
+                break;
+            }
+
+            frag = g_new0(CXLPoison, 1);
+
+            frag->start = in->dpa + 64;
+            frag->length = ent->start + ent->length - frag->start;
+            frag->type = ent->type;
+            QLIST_INSERT_HEAD(poison_list, frag, node);
+            ct3d->poison_list_cnt++;
+        }
+        /* Any fragments have been added, free original entry */
+        QLIST_REMOVE(ent, node);
+        g_free(ent);
+        break;
+    }
+    /* Clearing a region with no poison is not an error so always do so */
+    if (cvc->set_cacheline)
+        if (!cvc->set_cacheline(ct3d, in->dpa, in->data)) {
+            return CXL_MBOX_INTERNAL_ERROR;
+        }
+
+    return CXL_MBOX_SUCCESS;
+}
+
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
 #define IMMEDIATE_DATA_CHANGE (1 << 2)
 #define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -505,6 +580,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
         cmd_media_get_poison_list, 16, 0 },
     [MEDIA_AND_POISON][INJECT_POISON] = { "MEDIA_AND_POISON_INJECT_POISON",
         cmd_media_inject_poison, 8, 0 },
+    [MEDIA_AND_POISON][CLEAR_POISON] = { "MEDIA_AND_POISON_CLEAR_POISON",
+        cmd_media_clear_poison, 72, 0 },
 };
 
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 3585f78b4e..8adc725edc 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -925,6 +925,41 @@ static void set_lsa(CXLType3Dev *ct3d, const void *buf, uint64_t size,
      */
 }
 
+static bool set_cacheline(CXLType3Dev *ct3d, uint64_t dpa_offset, uint8_t *data)
+{
+    MemoryRegion *vmr = NULL, *pmr = NULL;
+    AddressSpace *as;
+
+    if (ct3d->hostvmem) {
+        vmr = host_memory_backend_get_memory(ct3d->hostvmem);
+    }
+    if (ct3d->hostpmem) {
+        pmr = host_memory_backend_get_memory(ct3d->hostpmem);
+    }
+
+    if (!vmr && !pmr) {
+        return false;
+    }
+
+    if (dpa_offset + 64 > int128_get64(ct3d->cxl_dstate.mem_size)) {
+        return false;
+    }
+
+    if (vmr) {
+        if (dpa_offset <= int128_get64(vmr->size)) {
+            as = &ct3d->hostvmem_as;
+        } else {
+            as = &ct3d->hostpmem_as;
+            dpa_offset -= vmr->size;
+        }
+    } else {
+        as = &ct3d->hostpmem_as;
+    }
+
+    address_space_write(as, dpa_offset, MEMTXATTRS_UNSPECIFIED, &data, 64);
+    return true;
+}
+
 void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d)
 {
         ct3d->poison_list_overflowed = true;
@@ -1146,6 +1181,7 @@ static void ct3_class_init(ObjectClass *oc, void *data)
     cvc->get_lsa_size = get_lsa_size;
     cvc->get_lsa = get_lsa;
     cvc->set_lsa = set_lsa;
+    cvc->set_cacheline = set_cacheline;
 }
 
 static const TypeInfo ct3d_info = {
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 3cb77fe8a5..0a05f21e40 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -326,6 +326,7 @@ struct CXLType3Class {
                         uint64_t offset);
     void (*set_lsa)(CXLType3Dev *ct3d, const void *buf, uint64_t size,
                     uint64_t offset);
+    bool (*set_cacheline)(CXLType3Dev *ct3d, uint64_t dpa_offset, uint8_t *data);
 };
 
 MemTxResult cxl_type3_read(PCIDevice *d, hwaddr host_addr, uint64_t *data,
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/6] hw/cxl: QMP based poison injection support
  2023-02-17 18:18 ` [PATCH 4/6] hw/cxl: QMP based poison injection support Jonathan Cameron
@ 2023-02-22  1:14   ` Ira Weiny
  2023-02-22 17:53     ` Jonathan Cameron
  0 siblings, 1 reply; 15+ messages in thread
From: Ira Weiny @ 2023-02-22  1:14 UTC (permalink / raw)
  To: Jonathan Cameron, qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Gregory Price,
	Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

Jonathan Cameron wrote:
> Inject poison using qmp command cxl-inject-poison to add an entry to the
> poison list.
> 
> For now, the poison is not returned CXL.mem reads, but only via the
> mailbox command Get Poison List.
> 
> See CXL rev 3.0, sec 8.2.9.8.4.1 Get Poison list (Opcode 4300h)
> 
> Kernel patches to use this interface here:
> https://lore.kernel.org/linux-cxl/cover.1665606782.git.alison.schofield@intel.com/
> 
> To inject poison using qmp (telnet to the qmp port)
> { "execute": "qmp_capabilities" }
> 
> { "execute": "cxl-inject-poison",
>     "arguments": {
>          "path": "/machine/peripheral/cxl-pmem0",
>          "start": 2048,
>          "length": 256
>     }
> }
> 
> Adjusted to select a device on your machine.
> 
> Note that the poison list supported is kept short enough to avoid the
> complexity of state machine that is needed to handle the MORE flag.
> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> ---
> v3:
> Improve QMP documentation.
> 
> v2:
> Moved to QMP to allow for single command.
> Update reference in coverletter
> Added specific setting of type for this approach to injection.
> Drop the unnecessary ct3d class get_poison_list callback.
> Block overlapping regions from being injected
> Handle list overflow
> Use Ira's utility function to get the timestamps
> ---
>  hw/cxl/cxl-mailbox-utils.c  | 82 +++++++++++++++++++++++++++++++++++++
>  hw/mem/cxl_type3.c          | 56 +++++++++++++++++++++++++
>  hw/mem/cxl_type3_stubs.c    |  3 ++
>  hw/mem/meson.build          |  2 +
>  include/hw/cxl/cxl_device.h | 20 +++++++++
>  qapi/cxl.json               | 16 ++++++++
>  6 files changed, 179 insertions(+)
> 
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index 580366ed2f..cf3cfb10a1 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -62,6 +62,8 @@ enum {
>          #define GET_PARTITION_INFO     0x0
>          #define GET_LSA       0x2
>          #define SET_LSA       0x3
> +    MEDIA_AND_POISON = 0x43,
> +        #define GET_POISON_LIST        0x0
>  };
>  
>  struct cxl_cmd;
> @@ -267,6 +269,8 @@ static CXLRetCode cmd_identify_memory_device(struct cxl_cmd *cmd,
>      id->persistent_capacity = cxl_dstate->pmem_size / CXL_CAPACITY_MULTIPLIER;
>      id->volatile_capacity = cxl_dstate->vmem_size / CXL_CAPACITY_MULTIPLIER;
>      id->lsa_size = cvc->get_lsa_size(ct3d);
> +    id->poison_list_max_mer[1] = 0x1; /* 256 poison records */

Using st24_le_p() would be more robust I think.

> +    id->inject_poison_limit = 0; /* No limit - so limited by main poison record limit */
>  
>      *len = sizeof(*id);
>      return CXL_MBOX_SUCCESS;
> @@ -356,6 +360,82 @@ static CXLRetCode cmd_ccls_set_lsa(struct cxl_cmd *cmd,
>      return CXL_MBOX_SUCCESS;
>  }
>  
> +/*
> + * This is very inefficient, but good enough for now!
> + * Also the payload will always fit, so no need to handle the MORE flag and
> + * make this stateful. We may want to allow longer poison lists to aid
> + * testing that kernel functionality.
> + */
> +static CXLRetCode cmd_media_get_poison_list(struct cxl_cmd *cmd,
> +                                            CXLDeviceState *cxl_dstate,
> +                                            uint16_t *len)
> +{
> +    struct get_poison_list_pl {
> +        uint64_t pa;
> +        uint64_t length;
> +    } QEMU_PACKED;
> +
> +    struct get_poison_list_out_pl {
> +        uint8_t flags;
> +        uint8_t rsvd1;
> +        uint64_t overflow_timestamp;
> +        uint16_t count;
> +        uint8_t rsvd2[0x14];
> +        struct {
> +            uint64_t addr;
> +            uint32_t length;
> +            uint32_t resv;
> +        } QEMU_PACKED records[];
> +    } QEMU_PACKED;
> +
> +    struct get_poison_list_pl *in = (void *)cmd->payload;
> +    struct get_poison_list_out_pl *out = (void *)cmd->payload;
> +    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
> +    uint16_t record_count = 0, i = 0;
> +    uint64_t query_start = in->pa;

Should we verify Bits[5:0] are 0?

> +    uint64_t query_length = in->length;

Isn't in->length in units of 64bytes?  Does that get converted somewhere?

> +    CXLPoisonList *poison_list = &ct3d->poison_list;
> +    CXLPoison *ent;
> +    uint16_t out_pl_len;
> +
> +    QLIST_FOREACH(ent, poison_list, node) {
> +        /* Check for no overlap */
> +        if (ent->start >= query_start + query_length ||
> +            ent->start + ent->length <= query_start) {
> +            continue;
> +        }
> +        record_count++;
> +    }
> +    out_pl_len = sizeof(*out) + record_count * sizeof(out->records[0]);
> +    assert(out_pl_len <= CXL_MAILBOX_MAX_PAYLOAD_SIZE);
> +
> +    memset(out, 0, out_pl_len);
> +    QLIST_FOREACH(ent, poison_list, node) {
> +        uint64_t start, stop;
> +
> +        /* Check for no overlap */
> +        if (ent->start >= query_start + query_length ||
> +            ent->start + ent->length <= query_start) {
> +            continue;
> +        }
> +
> +        /* Deal with overlap */
> +        start = MAX(ent->start & 0xffffffffffffffc0, query_start);
> +        stop = MIN((ent->start & 0xffffffffffffffc0) + ent->length,
> +                   query_start + query_length);
> +        out->records[i].addr = start | (ent->type & 0x3);

cpu_to_le64()?

> +        out->records[i].length = (stop - start) / 64;

cpu_to_le32()?

> +        i++;
> +    }
> +    if (ct3d->poison_list_overflowed) {
> +        out->flags = (1 << 1);
> +        out->overflow_timestamp = ct3d->poison_list_overflow_ts;

cpu_to_le64()?

> +    }
> +    out->count = record_count;
> +    *len = out_pl_len;
> +    return CXL_MBOX_SUCCESS;
> +}
> +
>  #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
>  #define IMMEDIATE_DATA_CHANGE (1 << 2)
>  #define IMMEDIATE_POLICY_CHANGE (1 << 3)
> @@ -383,6 +463,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
>      [CCLS][GET_LSA] = { "CCLS_GET_LSA", cmd_ccls_get_lsa, 8, 0 },
>      [CCLS][SET_LSA] = { "CCLS_SET_LSA", cmd_ccls_set_lsa,
>          ~0, IMMEDIATE_CONFIG_CHANGE | IMMEDIATE_DATA_CHANGE },
> +    [MEDIA_AND_POISON][GET_POISON_LIST] = { "MEDIA_AND_POISON_GET_POISON_LIST",
> +        cmd_media_get_poison_list, 16, 0 },
>  };
>  
>  void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index 8b7727a75b..3585f78b4e 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -925,6 +925,62 @@ static void set_lsa(CXLType3Dev *ct3d, const void *buf, uint64_t size,
>       */
>  }
>  
> +void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d)
> +{
> +        ct3d->poison_list_overflowed = true;
> +        ct3d->poison_list_overflow_ts =
> +            cxl_device_get_timestamp(&ct3d->cxl_dstate);
> +}
> +
> +void qmp_cxl_inject_poison(const char *path, uint64_t start, uint64_t length,
> +                           Error **errp)
> +{
> +    Object *obj = object_resolve_path(path, NULL);
> +    CXLType3Dev *ct3d;
> +    CXLPoison *p;
> +
> +    if (length % 64) {
> +        error_setg(errp, "Poison injection must be in multiples of 64 bytes");
> +        return;
> +    }
> +    if (start % 64) {
> +        error_setg(errp, "Poison start address must be 64 byte aligned");
> +        return;
> +    }
> +    if (!obj) {
> +        error_setg(errp, "Unable to resolve path");
> +        return;
> +    }
> +    if (!object_dynamic_cast(obj, TYPE_CXL_TYPE3)) {
> +        error_setg(errp, "Path does not point to a CXL type 3 device");
> +        return;
> +    }
> +
> +    ct3d = CXL_TYPE3(obj);
> +
> +    QLIST_FOREACH(p, &ct3d->poison_list, node) {
> +        if (((start >= p->start) && (start < p->start + p->length)) ||
> +            ((start + length > p->start) &&
> +             (start + length <= p->start + p->length))) {
> +            error_setg(errp, "Overlap with existing poisoned region not supported");
> +            return;
> +        }
> +    }
> +
> +    if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
> +        cxl_set_poison_list_overflowed(ct3d);
> +        return;
> +    }
> +
> +    p = g_new0(CXLPoison, 1);
> +    p->length = length;
> +    p->start = start;
> +    p->type = CXL_POISON_TYPE_INTERNAL; /* Different from injected via the mbox */
> +
> +    QLIST_INSERT_HEAD(&ct3d->poison_list, p, node);
> +    ct3d->poison_list_cnt++;
> +}
> +
>  /* For uncorrectable errors include support for multiple header recording */
>  void qmp_cxl_inject_uncorrectable_errors(const char *path,
>                                           CXLUncorErrorRecordList *errors,
> diff --git a/hw/mem/cxl_type3_stubs.c b/hw/mem/cxl_type3_stubs.c
> index b6b51ced54..6055190ca6 100644
> --- a/hw/mem/cxl_type3_stubs.c
> +++ b/hw/mem/cxl_type3_stubs.c
> @@ -2,6 +2,9 @@
>  #include "qemu/osdep.h"
>  #include "qapi/qapi-commands-cxl.h"
>  
> +void qmp_cxl_inject_poison(const char *path, uint64_t start, uint64_t length,
> +                           Error **errp) {}
> +
>  void qmp_cxl_inject_uncorrectable_errors(const char *path,
>                                           CXLUncorErrorRecordList *errors,
>                                           Error **errp) {}
> diff --git a/hw/mem/meson.build b/hw/mem/meson.build
> index 56c2618b84..930c67e390 100644
> --- a/hw/mem/meson.build
> +++ b/hw/mem/meson.build
> @@ -10,3 +10,5 @@ softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('cxl_type3_stubs.c'))
>  softmmu_ss.add_all(when: 'CONFIG_MEM_DEVICE', if_true: mem_ss)
>  
>  softmmu_ss.add(when: 'CONFIG_SPARSE_MEM', if_true: files('sparse-mem.c'))
> +softmmu_ss.add(when: 'CONFIG_CXL_MEM_DEVICE', if_false: files('cxl_type3_stubs.c'))
> +softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('cxl_type3_stubs.c'))
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index 44fea2d649..3cb77fe8a5 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -270,6 +270,18 @@ typedef struct CXLError {
>  
>  typedef QTAILQ_HEAD(, CXLError) CXLErrorList;
>  
> +typedef struct CXLPoison {
> +    uint64_t start, length;
> +    uint8_t type;
> +#define CXL_POISON_TYPE_EXTERNAL 0x1
> +#define CXL_POISON_TYPE_INTERNAL 0x2
> +#define CXL_POISON_TYPE_INJECTED 0x3
> +    QLIST_ENTRY(CXLPoison) node;
> +} CXLPoison;
> +
> +typedef QLIST_HEAD(, CXLPoison) CXLPoisonList;
> +#define CXL_POISON_LIST_LIMIT 256
> +
>  struct CXLType3Dev {
>      /* Private */
>      PCIDevice parent_obj;
> @@ -292,6 +304,12 @@ struct CXLType3Dev {
>  
>      /* Error injection */
>      CXLErrorList error_list;
> +
> +    /* Poison Injection - cache */
> +    CXLPoisonList poison_list;
> +    unsigned int poison_list_cnt;
> +    bool poison_list_overflowed;
> +    uint64_t poison_list_overflow_ts;
>  };
>  
>  #define TYPE_CXL_TYPE3 "cxl-type3"
> @@ -317,4 +335,6 @@ MemTxResult cxl_type3_write(PCIDevice *d, hwaddr host_addr, uint64_t data,
>  
>  uint64_t cxl_device_get_timestamp(CXLDeviceState *cxlds);
>  
> +void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d);
> +
>  #endif
> diff --git a/qapi/cxl.json b/qapi/cxl.json
> index ac7e167fa2..bc099d695e 100644
> --- a/qapi/cxl.json
> +++ b/qapi/cxl.json
> @@ -5,6 +5,22 @@
>  # = CXL devices
>  ##
>  
> +##
> +# @cxl-inject-poison:
> +#
> +# Poison records indicate that a CXL memory device knows that a particular
> +# memory region may be corrupted. This may be because of locally detected
> +# errors (e.g. ECC failure) or poisoned writes received from other components
> +# in the system. This injection mechanism enables testing of the OS handling
> +# of poison records which may be queried via the CXL mailbox.
> +#
> +# @path: CXL type 3 device canonical QOM path
> +# @start: Start address

NIT: "Must be 64 bytes aligned."

> +# @length: Length of poison to inject

NIT: "Must be in multiples of 64 bytes."

Ira

> +##
> +{ 'command': 'cxl-inject-poison',
> +  'data': { 'path': 'str', 'start': 'uint64', 'length': 'uint64' }}
> +
>  ##
>  # @CxlUncorErrorType:
>  #
> -- 
> 2.37.2
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/6] hw/cxl: Add poison injection via the mailbox.
  2023-02-17 18:18 ` [PATCH 5/6] hw/cxl: Add poison injection via the mailbox Jonathan Cameron
@ 2023-02-22  1:18   ` Ira Weiny
  2023-02-27 14:57     ` Jonathan Cameron
  0 siblings, 1 reply; 15+ messages in thread
From: Ira Weiny @ 2023-02-22  1:18 UTC (permalink / raw)
  To: Jonathan Cameron, qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Gregory Price,
	Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

Jonathan Cameron wrote:
> Very simple implementation to allow testing of corresponding
> kernel code. Note that for now we track each 64 byte section
> independently.  Whilst a valid implementation choice, it may
> make sense to fuse entries so as to prove out more complex
> corners of the kernel code.
> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  hw/cxl/cxl-mailbox-utils.c | 40 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 40 insertions(+)
> 
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index cf3cfb10a1..7d3f7bcd3a 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -64,6 +64,7 @@ enum {
>          #define SET_LSA       0x3
>      MEDIA_AND_POISON = 0x43,
>          #define GET_POISON_LIST        0x0
> +        #define INJECT_POISON          0x1
>  };
>  
>  struct cxl_cmd;
> @@ -436,6 +437,43 @@ static CXLRetCode cmd_media_get_poison_list(struct cxl_cmd *cmd,
>      return CXL_MBOX_SUCCESS;
>  }
>  
> +static CXLRetCode cmd_media_inject_poison(struct cxl_cmd *cmd,
> +                                          CXLDeviceState *cxl_dstate,
> +                                          uint16_t *len)
> +{
> +    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
> +    CXLPoisonList *poison_list = &ct3d->poison_list;
> +    CXLPoison *ent;
> +    struct inject_poison_pl {
> +        uint64_t dpa;
> +    };
> +    struct inject_poison_pl *in = (void *)cmd->payload;
> +    CXLPoison *p;
> +
> +    QLIST_FOREACH(ent, poison_list, node) {
> +        if (ent->start == in->dpa && ent->length == 64) {

How does this interact with the QMP inject poison?  Should this be
checking the range of the entries?

Ira

> +            return CXL_MBOX_SUCCESS;
> +        }
> +    }
> +
> +    if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
> +        return CXL_MBOX_INJECT_POISON_LIMIT;
> +    }
> +    p = g_new0(CXLPoison, 1);
> +
> +    p->length = 64;
> +    p->start = in->dpa;
> +    p->type = CXL_POISON_TYPE_INJECTED;
> +
> +    /*
> +     * Possible todo: Merge with existing entry if next to it and if same type
> +     */
> +    QLIST_INSERT_HEAD(poison_list, p, node);
> +    ct3d->poison_list_cnt++;
> +
> +    return CXL_MBOX_SUCCESS;
> +}
> +
>  #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
>  #define IMMEDIATE_DATA_CHANGE (1 << 2)
>  #define IMMEDIATE_POLICY_CHANGE (1 << 3)
> @@ -465,6 +503,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
>          ~0, IMMEDIATE_CONFIG_CHANGE | IMMEDIATE_DATA_CHANGE },
>      [MEDIA_AND_POISON][GET_POISON_LIST] = { "MEDIA_AND_POISON_GET_POISON_LIST",
>          cmd_media_get_poison_list, 16, 0 },
> +    [MEDIA_AND_POISON][INJECT_POISON] = { "MEDIA_AND_POISON_INJECT_POISON",
> +        cmd_media_inject_poison, 8, 0 },
>  };
>  
>  void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
> -- 
> 2.37.2
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 6/6] hw/cxl: Add clear poison mailbox command support.
  2023-02-17 18:18 ` [PATCH 6/6] hw/cxl: Add clear poison mailbox command support Jonathan Cameron
@ 2023-02-22  1:31   ` Ira Weiny
  0 siblings, 0 replies; 15+ messages in thread
From: Ira Weiny @ 2023-02-22  1:31 UTC (permalink / raw)
  To: Jonathan Cameron, qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Gregory Price,
	Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

Jonathan Cameron wrote:
> Current implementation is very simple so many of the corner
> cases do not exist (e.g. fragmenting larger poison list entries)
> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  hw/cxl/cxl-mailbox-utils.c  | 77 +++++++++++++++++++++++++++++++++++++
>  hw/mem/cxl_type3.c          | 36 +++++++++++++++++
>  include/hw/cxl/cxl_device.h |  1 +
>  3 files changed, 114 insertions(+)
> 
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index 7d3f7bcd3a..f56c76b205 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -65,6 +65,7 @@ enum {
>      MEDIA_AND_POISON = 0x43,
>          #define GET_POISON_LIST        0x0
>          #define INJECT_POISON          0x1
> +        #define CLEAR_POISON           0x2
>  };
>  
>  struct cxl_cmd;
> @@ -474,6 +475,80 @@ static CXLRetCode cmd_media_inject_poison(struct cxl_cmd *cmd,
>      return CXL_MBOX_SUCCESS;
>  }
>  
> +static CXLRetCode cmd_media_clear_poison(struct cxl_cmd *cmd,
> +                                         CXLDeviceState *cxl_dstate,
> +                                         uint16_t *len)
> +{
> +    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
> +    CXLPoisonList *poison_list = &ct3d->poison_list;
> +    CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d);
> +    struct clear_poison_pl {
> +        uint64_t dpa;
> +        uint8_t data[64];
> +    };
> +    CXLPoison *ent;
> +
> +    struct clear_poison_pl *in = (void *)cmd->payload;
> +
> +    if (in->dpa + 64 > cxl_dstate->mem_size) {
> +        return CXL_MBOX_INVALID_PA;
> +    }
> +
> +    QLIST_FOREACH(ent, poison_list, node) {
> +        /*
> +         * Test for contained in entry. Simpler than general case
> +         * as clearing 64 bytes and entries 64 byte aligned
> +         */
> +        if ((in->dpa < ent->start) || (in->dpa >= ent->start + ent->length)) {
> +            continue;
> +        }
> +        /* Do accounting early as we know one will go away */
> +        ct3d->poison_list_cnt--;
> +        if (in->dpa > ent->start) {
> +            CXLPoison *frag;
> +            if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {

Isn't this always impossible because poison_list_cnt was just decremented?

I wonder if the early accounting is correct with this check.

> +                cxl_set_poison_list_overflowed(ct3d);
> +                break;
> +            }
> +            frag = g_new0(CXLPoison, 1);
> +
> +            frag->start = ent->start;
> +            frag->length = in->dpa - ent->start;
> +            frag->type = ent->type;
> +
> +            QLIST_INSERT_HEAD(poison_list, frag, node);
> +            ct3d->poison_list_cnt++;
> +        }
> +        if (in->dpa + 64 < ent->start + ent->length) {
> +            CXLPoison *frag;
> +
> +            if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
> +                cxl_set_poison_list_overflowed(ct3d);
> +                break;
> +            }
> +
> +            frag = g_new0(CXLPoison, 1);
> +
> +            frag->start = in->dpa + 64;
> +            frag->length = ent->start + ent->length - frag->start;
> +            frag->type = ent->type;
> +            QLIST_INSERT_HEAD(poison_list, frag, node);
> +            ct3d->poison_list_cnt++;
> +        }
> +        /* Any fragments have been added, free original entry */
> +        QLIST_REMOVE(ent, node);

Seems safer to decrement here and check limit prior to adding the
fragments above.

> +        g_free(ent);
> +        break;
> +    }
> +    /* Clearing a region with no poison is not an error so always do so */
> +    if (cvc->set_cacheline)

Per Qemu coding you need '{'.  But is this check needed? ...

> +        if (!cvc->set_cacheline(ct3d, in->dpa, in->data)) {
> +            return CXL_MBOX_INTERNAL_ERROR;
> +        }
> +
> +    return CXL_MBOX_SUCCESS;
> +}
> +
>  #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
>  #define IMMEDIATE_DATA_CHANGE (1 << 2)
>  #define IMMEDIATE_POLICY_CHANGE (1 << 3)
> @@ -505,6 +580,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
>          cmd_media_get_poison_list, 16, 0 },
>      [MEDIA_AND_POISON][INJECT_POISON] = { "MEDIA_AND_POISON_INJECT_POISON",
>          cmd_media_inject_poison, 8, 0 },
> +    [MEDIA_AND_POISON][CLEAR_POISON] = { "MEDIA_AND_POISON_CLEAR_POISON",
> +        cmd_media_clear_poison, 72, 0 },
>  };
>  
>  void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index 3585f78b4e..8adc725edc 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -925,6 +925,41 @@ static void set_lsa(CXLType3Dev *ct3d, const void *buf, uint64_t size,
>       */
>  }
>  
> +static bool set_cacheline(CXLType3Dev *ct3d, uint64_t dpa_offset, uint8_t *data)
> +{
> +    MemoryRegion *vmr = NULL, *pmr = NULL;
> +    AddressSpace *as;
> +
> +    if (ct3d->hostvmem) {
> +        vmr = host_memory_backend_get_memory(ct3d->hostvmem);
> +    }
> +    if (ct3d->hostpmem) {
> +        pmr = host_memory_backend_get_memory(ct3d->hostpmem);
> +    }
> +
> +    if (!vmr && !pmr) {
> +        return false;
> +    }
> +
> +    if (dpa_offset + 64 > int128_get64(ct3d->cxl_dstate.mem_size)) {
> +        return false;
> +    }
> +
> +    if (vmr) {
> +        if (dpa_offset <= int128_get64(vmr->size)) {
> +            as = &ct3d->hostvmem_as;
> +        } else {
> +            as = &ct3d->hostpmem_as;
> +            dpa_offset -= vmr->size;
> +        }
> +    } else {
> +        as = &ct3d->hostpmem_as;
> +    }
> +
> +    address_space_write(as, dpa_offset, MEMTXATTRS_UNSPECIFIED, &data, 64);
> +    return true;
> +}
> +
>  void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d)
>  {
>          ct3d->poison_list_overflowed = true;
> @@ -1146,6 +1181,7 @@ static void ct3_class_init(ObjectClass *oc, void *data)
>      cvc->get_lsa_size = get_lsa_size;
>      cvc->get_lsa = get_lsa;
>      cvc->set_lsa = set_lsa;
> +    cvc->set_cacheline = set_cacheline;

...  This is always set?

Ira

>  }
>  
>  static const TypeInfo ct3d_info = {
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index 3cb77fe8a5..0a05f21e40 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -326,6 +326,7 @@ struct CXLType3Class {
>                          uint64_t offset);
>      void (*set_lsa)(CXLType3Dev *ct3d, const void *buf, uint64_t size,
>                      uint64_t offset);
> +    bool (*set_cacheline)(CXLType3Dev *ct3d, uint64_t dpa_offset, uint8_t *data);
>  };
>  
>  MemTxResult cxl_type3_read(PCIDevice *d, hwaddr host_addr, uint64_t *data,
> -- 
> 2.37.2
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/6] hw/cxl: Move enum ret_code definition to cxl_device.h
  2023-02-17 18:18 ` [PATCH 1/6] hw/cxl: Move enum ret_code definition to cxl_device.h Jonathan Cameron
@ 2023-02-22  1:47   ` Ira Weiny
  2023-02-24 15:10     ` Jonathan Cameron
  0 siblings, 1 reply; 15+ messages in thread
From: Ira Weiny @ 2023-02-22  1:47 UTC (permalink / raw)
  To: Jonathan Cameron, qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Gregory Price,
	Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

Jonathan Cameron wrote:
> Needs tidy up and rename to something more generic now it is
> in a header.

I'm not opposed to this change and patch 2 but I don't see where
CXLRetCode is being used outside of cxl-mailbox-utils.c in this series.

Despite that reservation I think this is a good clarification.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  hw/cxl/cxl-mailbox-utils.c  | 28 ----------------------------
>  include/hw/cxl/cxl_device.h | 28 ++++++++++++++++++++++++++++
>  2 files changed, 28 insertions(+), 28 deletions(-)
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/6] hw/cxl: rename mailbox return code type from ret_code to CXLRetCode
  2023-02-17 18:18 ` [PATCH 2/6] hw/cxl: rename mailbox return code type from ret_code to CXLRetCode Jonathan Cameron
@ 2023-02-22  1:47   ` Ira Weiny
  0 siblings, 0 replies; 15+ messages in thread
From: Ira Weiny @ 2023-02-22  1:47 UTC (permalink / raw)
  To: Jonathan Cameron, qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Gregory Price,
	Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

Jonathan Cameron wrote:
> This enum typedef used to be local to one file, so having a generic
> name wasn't a big problem even if it wasn't compliant with QEMU naming
> conventions.  Now it is in cxl_device.h to support use outside of
> cxl-mailbox-utils.c rename it.

Same comment as 1/6 but still.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  hw/cxl/cxl-mailbox-utils.c  | 62 ++++++++++++++++++-------------------
>  include/hw/cxl/cxl_device.h |  2 +-
>  2 files changed, 32 insertions(+), 32 deletions(-)
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/6] hw/cxl: QMP based poison injection support
  2023-02-22  1:14   ` Ira Weiny
@ 2023-02-22 17:53     ` Jonathan Cameron
  0 siblings, 0 replies; 15+ messages in thread
From: Jonathan Cameron @ 2023-02-22 17:53 UTC (permalink / raw)
  To: Ira Weiny
  Cc: qemu-devel, Michael Tsirkin, Ben Widawsky, linux-cxl, linuxarm,
	Gregory Price, Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

On Tue, 21 Feb 2023 17:14:05 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> Jonathan Cameron wrote:
> > Inject poison using qmp command cxl-inject-poison to add an entry to the
> > poison list.
> > 
> > For now, the poison is not returned CXL.mem reads, but only via the
> > mailbox command Get Poison List.
> > 
> > See CXL rev 3.0, sec 8.2.9.8.4.1 Get Poison list (Opcode 4300h)
> > 
> > Kernel patches to use this interface here:
> > https://lore.kernel.org/linux-cxl/cover.1665606782.git.alison.schofield@intel.com/
> > 
> > To inject poison using qmp (telnet to the qmp port)
> > { "execute": "qmp_capabilities" }
> > 
> > { "execute": "cxl-inject-poison",
> >     "arguments": {
> >          "path": "/machine/peripheral/cxl-pmem0",
> >          "start": 2048,
> >          "length": 256
> >     }
> > }
> > 
> > Adjusted to select a device on your machine.
> > 
> > Note that the poison list supported is kept short enough to avoid the
> > complexity of state machine that is needed to handle the MORE flag.
> > 
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > 
> > ---
> > v3:
> > Improve QMP documentation.
> > 
> > v2:
> > Moved to QMP to allow for single command.
> > Update reference in coverletter
> > Added specific setting of type for this approach to injection.
> > Drop the unnecessary ct3d class get_poison_list callback.
> > Block overlapping regions from being injected
> > Handle list overflow
> > Use Ira's utility function to get the timestamps
> > ---
> >  hw/cxl/cxl-mailbox-utils.c  | 82 +++++++++++++++++++++++++++++++++++++
> >  hw/mem/cxl_type3.c          | 56 +++++++++++++++++++++++++
> >  hw/mem/cxl_type3_stubs.c    |  3 ++
> >  hw/mem/meson.build          |  2 +
> >  include/hw/cxl/cxl_device.h | 20 +++++++++
> >  qapi/cxl.json               | 16 ++++++++
> >  6 files changed, 179 insertions(+)
> > 
> > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> > index 580366ed2f..cf3cfb10a1 100644
> > --- a/hw/cxl/cxl-mailbox-utils.c
> > +++ b/hw/cxl/cxl-mailbox-utils.c
> > @@ -62,6 +62,8 @@ enum {
> >          #define GET_PARTITION_INFO     0x0
> >          #define GET_LSA       0x2
> >          #define SET_LSA       0x3
> > +    MEDIA_AND_POISON = 0x43,
> > +        #define GET_POISON_LIST        0x0
> >  };
> >  
> >  struct cxl_cmd;
> > @@ -267,6 +269,8 @@ static CXLRetCode cmd_identify_memory_device(struct cxl_cmd *cmd,
> >      id->persistent_capacity = cxl_dstate->pmem_size / CXL_CAPACITY_MULTIPLIER;
> >      id->volatile_capacity = cxl_dstate->vmem_size / CXL_CAPACITY_MULTIPLIER;
> >      id->lsa_size = cvc->get_lsa_size(ct3d);
> > +    id->poison_list_max_mer[1] = 0x1; /* 256 poison records */  
> 
> Using st24_le_p() would be more robust I think.

Ah the wonder of patch reordering. The patch adding that was after this is my queue.
I'll pull it down to before here so we can use it.

There are a bunch of endian issues around here (such as the lsa_size just above
this) but good not to introduce a new one.

I'll circle back around to clean the rest up at a later date.


> 
> > +    id->inject_poison_limit = 0; /* No limit - so limited by main poison record limit */
> >  
> >      *len = sizeof(*id);
> >      return CXL_MBOX_SUCCESS;
> > @@ -356,6 +360,82 @@ static CXLRetCode cmd_ccls_set_lsa(struct cxl_cmd *cmd,
> >      return CXL_MBOX_SUCCESS;
> >  }
> >  
> > +/*
> > + * This is very inefficient, but good enough for now!
> > + * Also the payload will always fit, so no need to handle the MORE flag and
> > + * make this stateful. We may want to allow longer poison lists to aid
> > + * testing that kernel functionality.
> > + */
> > +static CXLRetCode cmd_media_get_poison_list(struct cxl_cmd *cmd,
> > +                                            CXLDeviceState *cxl_dstate,
> > +                                            uint16_t *len)
> > +{
> > +    struct get_poison_list_pl {
> > +        uint64_t pa;
> > +        uint64_t length;
> > +    } QEMU_PACKED;
> > +
> > +    struct get_poison_list_out_pl {
> > +        uint8_t flags;
> > +        uint8_t rsvd1;
> > +        uint64_t overflow_timestamp;
> > +        uint16_t count;
> > +        uint8_t rsvd2[0x14];
> > +        struct {
> > +            uint64_t addr;
> > +            uint32_t length;
> > +            uint32_t resv;
> > +        } QEMU_PACKED records[];
> > +    } QEMU_PACKED;
> > +
> > +    struct get_poison_list_pl *in = (void *)cmd->payload;
> > +    struct get_poison_list_out_pl *out = (void *)cmd->payload;
> > +    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
> > +    uint16_t record_count = 0, i = 0;
> > +    uint64_t query_start = in->pa;  
> 
> Should we verify Bits[5:0] are 0?

Yes, plus looking again at this code, it needs a bunch of le handling like
all the other mailbox commands.  In the interests of not making that worse
I'll fix this one up whilst we are here.

> 
> > +    uint64_t query_length = in->length;  
> 
> Isn't in->length in units of 64bytes?  Does that get converted somewhere?
Good spot. I guess all my test values happen to be small so I never noticed.
Clearly need to update those!



> 
> > +    CXLPoisonList *poison_list = &ct3d->poison_list;
> > +    CXLPoison *ent;
> > +    uint16_t out_pl_len;
> > +
> > +    QLIST_FOREACH(ent, poison_list, node) {
> > +        /* Check for no overlap */
> > +        if (ent->start >= query_start + query_length ||
> > +            ent->start + ent->length <= query_start) {
> > +            continue;
> > +        }
> > +        record_count++;
> > +    }
> > +    out_pl_len = sizeof(*out) + record_count * sizeof(out->records[0]);
> > +    assert(out_pl_len <= CXL_MAILBOX_MAX_PAYLOAD_SIZE);
> > +
> > +    memset(out, 0, out_pl_len);
> > +    QLIST_FOREACH(ent, poison_list, node) {
> > +        uint64_t start, stop;
> > +
> > +        /* Check for no overlap */
> > +        if (ent->start >= query_start + query_length ||
> > +            ent->start + ent->length <= query_start) {
> > +            continue;
> > +        }
> > +
> > +        /* Deal with overlap */
> > +        start = MAX(ent->start & 0xffffffffffffffc0, query_start);
> > +        stop = MIN((ent->start & 0xffffffffffffffc0) + ent->length,
> > +                   query_start + query_length);
> > +        out->records[i].addr = start | (ent->type & 0x3);  
> 
> cpu_to_le64()?
> 
> > +        out->records[i].length = (stop - start) / 64;  
> 
> cpu_to_le32()?

yup. Though I'll use stw_le_p() etc

Usual problem of when you've looked at code too many times you forget
the newer issues identified in other patches that apply here.

Oh for that big endian CXL emulated machine to make all these
stand out in spectacular fashion.

> 
> > +        i++;
> > +    }
> > +    if (ct3d->poison_list_overflowed) {
> > +        out->flags = (1 << 1);
> > +        out->overflow_timestamp = ct3d->poison_list_overflow_ts;  
> 
> cpu_to_le64()?
> 
> > +    }
> > +    out->count = record_count;
> > +    *len = out_pl_len;
> > +    return CXL_MBOX_SUCCESS;
> > +}

...

> >  void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
> > diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> > index 8b7727a75b..3585f78b4e 100644
> > --- a/hw/mem/cxl_type3.c
> > +++ b/hw/mem/cxl_type3.c
> > @@ -925,6 +925,62 @@ static void set_lsa(CXLType3Dev *ct3d, const void *buf, uint64_t size,

> > +void qmp_cxl_inject_poison(const char *path, uint64_t start, uint64_t length,
> > +                           Error **errp)
> > +{
> > +    Object *obj = object_resolve_path(path, NULL);
> > +    CXLType3Dev *ct3d;
> > +    CXLPoison *p;
> > +
> > +    if (length % 64) {
> > +        error_setg(errp, "Poison injection must be in multiples of 64 bytes");
> > +        return;
> > +    }

Relevant to below response.  Length is 64 byte aligned, not to be multiplied by
64 (unlike the CXL commands).

...
> > diff --git a/hw/mem/cxl_type3_stubs.c b/hw/mem/cxl_type3_stubs.c
> > index b6b51ced54..6055190ca6 100644
> > --- a/hw/mem/cxl_type3_stubs.c
> > +++ b/hw/mem/cxl_type3_stubs.c
> > @@ -2,6 +2,9 @@
> >  #include "qemu/osdep.h"
> >  #include "qapi/qapi-commands-cxl.h"
> >  
> > +void qmp_cxl_inject_poison(const char *path, uint64_t start, uint64_t length,
> > +                           Error **errp) {}

Added an error_setg()  inline with Philippe's feedback.

> > +
> >  void qmp_cxl_inject_uncorrectable_errors(const char *path,
> >                                           CXLUncorErrorRecordList *errors,
> >                                           Error **errp) {}
> > diff --git a/hw/mem/meson.build b/hw/mem/meson.build
> > index 56c2618b84..930c67e390 100644
> > --- a/hw/mem/meson.build
> > +++ b/hw/mem/meson.build
> > @@ -10,3 +10,5 @@ softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('cxl_type3_stubs.c'))
> >  softmmu_ss.add_all(when: 'CONFIG_MEM_DEVICE', if_true: mem_ss)
> >  
> >  softmmu_ss.add(when: 'CONFIG_SPARSE_MEM', if_true: files('sparse-mem.c'))
> > +softmmu_ss.add(when: 'CONFIG_CXL_MEM_DEVICE', if_false: files('cxl_type3_stubs.c'))
> > +softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('cxl_type3_stubs.c'))

Gah. Rebase error. This is duplicated.

> > diff --git a/qapi/cxl.json b/qapi/cxl.json
> > index ac7e167fa2..bc099d695e 100644
> > --- a/qapi/cxl.json
> > +++ b/qapi/cxl.json
> > @@ -5,6 +5,22 @@
> >  # = CXL devices
> >  ##
> >  
> > +##
> > +# @cxl-inject-poison:
> > +#
> > +# Poison records indicate that a CXL memory device knows that a particular
> > +# memory region may be corrupted. This may be because of locally detected
> > +# errors (e.g. ECC failure) or poisoned writes received from other components
> > +# in the system. This injection mechanism enables testing of the OS handling
> > +# of poison records which may be queried via the CXL mailbox.
> > +#
> > +# @path: CXL type 3 device canonical QOM path
> > +# @start: Start address  
> 
> NIT: "Must be 64 bytes aligned."
> 
> > +# @length: Length of poison to inject  
> 
> NIT: "Must be in multiples of 64 bytes."
Must be an integer multiple of.  I decided the CXL convention of
* 64 was not intuitive so at least for the qapi I can use real numbers!
> 
> Ira

Thanks for your feedback - very useful! Also makes it clear I need
to do a scrub of all the mailbox commands to ensure little endian
setters and getters are used.

Some other changes coming from Philippe's review of the AER set
may affect this so those will have to come to a conclusion before
I post v2.

Thanks,

Jonathan
> 
> > +##
> > +{ 'command': 'cxl-inject-poison',
> > +  'data': { 'path': 'str', 'start': 'uint64', 'length': 'uint64' }}
> > +
> >  ##
> >  # @CxlUncorErrorType:
> >  #
> > -- 
> > 2.37.2
> >   
> 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/6] hw/cxl: Move enum ret_code definition to cxl_device.h
  2023-02-22  1:47   ` Ira Weiny
@ 2023-02-24 15:10     ` Jonathan Cameron
  0 siblings, 0 replies; 15+ messages in thread
From: Jonathan Cameron @ 2023-02-24 15:10 UTC (permalink / raw)
  To: Ira Weiny
  Cc: qemu-devel, Michael Tsirkin, Ben Widawsky, linux-cxl, linuxarm,
	Gregory Price, Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

On Tue, 21 Feb 2023 17:47:23 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> Jonathan Cameron wrote:
> > Needs tidy up and rename to something more generic now it is
> > in a header.  
> 
> I'm not opposed to this change and patch 2 but I don't see where
> CXLRetCode is being used outside of cxl-mailbox-utils.c in this series.
> 
> Despite that reservation I think this is a good clarification.
> 
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>

Ah.  This is a mess due to patch reordering as this was originally
meant as a precursor for your event injection series then I had to drag
it forwards to here to avoid introducing more use of the old naming.

Let me see if I can swap it round and just do the rename in this series
and push the move back to the start of that series - which I'll post
soon.

As it's the same code changes in a different order I'll pick up your RB.
Give me a shout if you'd rather I didn't!

Thanks,

Jonathan



> 
> > 
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> >  hw/cxl/cxl-mailbox-utils.c  | 28 ----------------------------
> >  include/hw/cxl/cxl_device.h | 28 ++++++++++++++++++++++++++++
> >  2 files changed, 28 insertions(+), 28 deletions(-)
> >   


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 5/6] hw/cxl: Add poison injection via the mailbox.
  2023-02-22  1:18   ` Ira Weiny
@ 2023-02-27 14:57     ` Jonathan Cameron
  0 siblings, 0 replies; 15+ messages in thread
From: Jonathan Cameron @ 2023-02-27 14:57 UTC (permalink / raw)
  To: Ira Weiny
  Cc: qemu-devel, Michael Tsirkin, Ben Widawsky, linux-cxl, linuxarm,
	Gregory Price, Philippe Mathieu-Daudé,
	Mike Maslenkin, Markus Armbruster, Dave Jiang, alison.schofield

On Tue, 21 Feb 2023 17:18:01 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> Jonathan Cameron wrote:
> > Very simple implementation to allow testing of corresponding
> > kernel code. Note that for now we track each 64 byte section
> > independently.  Whilst a valid implementation choice, it may
> > make sense to fuse entries so as to prove out more complex
> > corners of the kernel code.
> > 
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> >  hw/cxl/cxl-mailbox-utils.c | 40 ++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 40 insertions(+)
> > 
> > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> > index cf3cfb10a1..7d3f7bcd3a 100644
> > --- a/hw/cxl/cxl-mailbox-utils.c
> > +++ b/hw/cxl/cxl-mailbox-utils.c
> > @@ -64,6 +64,7 @@ enum {
> >          #define SET_LSA       0x3
> >      MEDIA_AND_POISON = 0x43,
> >          #define GET_POISON_LIST        0x0
> > +        #define INJECT_POISON          0x1
> >  };
> >  
> >  struct cxl_cmd;
> > @@ -436,6 +437,43 @@ static CXLRetCode cmd_media_get_poison_list(struct cxl_cmd *cmd,
> >      return CXL_MBOX_SUCCESS;
> >  }
> >  
> > +static CXLRetCode cmd_media_inject_poison(struct cxl_cmd *cmd,
> > +                                          CXLDeviceState *cxl_dstate,
> > +                                          uint16_t *len)
> > +{
> > +    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
> > +    CXLPoisonList *poison_list = &ct3d->poison_list;
> > +    CXLPoison *ent;
> > +    struct inject_poison_pl {
> > +        uint64_t dpa;
> > +    };
> > +    struct inject_poison_pl *in = (void *)cmd->payload;
> > +    CXLPoison *p;
> > +
> > +    QLIST_FOREACH(ent, poison_list, node) {
> > +        if (ent->start == in->dpa && ent->length == 64) {  
> 
> How does this interact with the QMP inject poison?  Should this be
> checking the range of the entries?

Good question and this implementation is definitely not right.
Having looked at the spec I'm not entirely sure what happens wrt
to the poison list if there is overlap. It leaves things less
sharply defined than I'd like.

The inject poison command calls out that it is not an error to inject poison
into a DPA that already has poison present - so a range match would
make more sense than what is here - I'll fix that.

It also calls out that the device "shall add a the new physical
address to the device's poison list and error source shall be set to an
injected event".

What it doesn't say is what should it do if there is already an entry
for a different poison type.  Should we update the type?  That's
nasty as it could lead to list overflow by turning one region into 2 or
3.

I guess no one really cares that much on the precise detail of poison
injection hence the spec is a little vague.

Anyhow, for now I'm thinking that if a range contains matches leave the
type alone is easy and I can't find anything to say that's not a valid
implementation choice.

As a side note, we don't yet have events support (that series is later in
the tree) so that fact injecting poison should add an entry to that isn't
happening.  I don't propose doing that until after the generic event injection
is done though as it will otherwise make that series more complex and
I doubt this is the only case we need to cover of these various error
reporting paths interacting.

Jonathan


> 
> Ira
> 
> > +            return CXL_MBOX_SUCCESS;
> > +        }
> > +    }
> > +
> > +    if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
> > +        return CXL_MBOX_INJECT_POISON_LIMIT;
> > +    }
> > +    p = g_new0(CXLPoison, 1);
> > +
> > +    p->length = 64;
> > +    p->start = in->dpa;
> > +    p->type = CXL_POISON_TYPE_INJECTED;
> > +
> > +    /*
> > +     * Possible todo: Merge with existing entry if next to it and if same type
> > +     */
> > +    QLIST_INSERT_HEAD(poison_list, p, node);
> > +    ct3d->poison_list_cnt++;
> > +
> > +    return CXL_MBOX_SUCCESS;
> > +}
> > +
> >  #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
> >  #define IMMEDIATE_DATA_CHANGE (1 << 2)
> >  #define IMMEDIATE_POLICY_CHANGE (1 << 3)
> > @@ -465,6 +503,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
> >          ~0, IMMEDIATE_CONFIG_CHANGE | IMMEDIATE_DATA_CHANGE },
> >      [MEDIA_AND_POISON][GET_POISON_LIST] = { "MEDIA_AND_POISON_GET_POISON_LIST",
> >          cmd_media_get_poison_list, 16, 0 },
> > +    [MEDIA_AND_POISON][INJECT_POISON] = { "MEDIA_AND_POISON_INJECT_POISON",
> > +        cmd_media_inject_poison, 8, 0 },
> >  };
> >  
> >  void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
> > -- 
> > 2.37.2
> >   
> 
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-02-27 14:57 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-17 18:18 [PATCH 0/6] hw/cxl: Poison get, inject, clear Jonathan Cameron
2023-02-17 18:18 ` [PATCH 1/6] hw/cxl: Move enum ret_code definition to cxl_device.h Jonathan Cameron
2023-02-22  1:47   ` Ira Weiny
2023-02-24 15:10     ` Jonathan Cameron
2023-02-17 18:18 ` [PATCH 2/6] hw/cxl: rename mailbox return code type from ret_code to CXLRetCode Jonathan Cameron
2023-02-22  1:47   ` Ira Weiny
2023-02-17 18:18 ` [PATCH 3/6] hw/cxl: Introduce cxl_device_get_timestamp() utility function Jonathan Cameron
2023-02-17 18:18 ` [PATCH 4/6] hw/cxl: QMP based poison injection support Jonathan Cameron
2023-02-22  1:14   ` Ira Weiny
2023-02-22 17:53     ` Jonathan Cameron
2023-02-17 18:18 ` [PATCH 5/6] hw/cxl: Add poison injection via the mailbox Jonathan Cameron
2023-02-22  1:18   ` Ira Weiny
2023-02-27 14:57     ` Jonathan Cameron
2023-02-17 18:18 ` [PATCH 6/6] hw/cxl: Add clear poison mailbox command support Jonathan Cameron
2023-02-22  1:31   ` Ira Weiny

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).