All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH v2 0/3] hw/cxl: Poison get, inject, clear
@ 2023-02-01 10:03 ` Jonathan Cameron via
  0 siblings, 0 replies; 12+ messages in thread
From: Jonathan Cameron @ 2023-02-01 10:03 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Alison Schofield

RFC mostly because we already have 4 CXL QEMU series waiting for review
ahead of this and I don't want to distract too much from those.

I posted an RFC of part of this series a long time back [1]. It's been more
or less entirely rewriten since then and gained support for mailbox based
injection and clearing (including writing the cacheline). Note that I still
don't check for poison on direct reads of memory. That may be added in future
but isn't really that useful for testing the kernel code - it will end up
being the same as injecting a Machine Check or equivalent for the Host
Physical Address in question.

The series supports:
1) Injection of variable length poison regions via QMP (to fake real
   memory corruption and ensure we deal with odd overflow corner cases
   such as clearing the middle of a large region making the list overflow
   as we go from one long entry to two smaller entries.
2) Read of poison list via the CXL mailbox.
3) Injection via the poison injection mailbox command (limited to 64 byte
   entries)
4) Clearing of poison injected via either method.

The implementation is meant to be a valid combination of imdef choices
based on what the spec allowed. There are a number of places where it could
be made more sophisticated that we might consider in future:
* Fusing adjacent poison entries if the types match.
* Separate injection list and main poison list, to test out limits on
  injected poison list being smaller than the main list.
* Poison list overflow event (needs event log support in general)
* Connecting up to the poison list error record generation (rather complex
  and not needed for currently kernel handling testing).

As the kernel code is currently fairly simple, it is likely that the above
does not yet matter but who knows what will turn up in future!

Tested against Alison's latest kernel patches.
https://lore.kernel.org/linux-cxl/cover.1674070170.git.alison.schofield@intel.com/
https://lore.kernel.org/linux-cxl/cover.1674101475.git.alison.schofield@intel.com/
and set timestamp patch
https://lore.kernel.org/linux-cxl/20230130151327.32415-1-Jonathan.Cameron@huawei.com/

[1] https://lore.kernel.org/linux-cxl/20220620162056.16790-1-Jonathan.Cameron@huawei.com/

Jonathan Cameron (3):
  hw/cxl: QMP based poison injection support
  hw/cxl: Add poison injection via the mailbox.
  hw/cxl: Add clear poison mailbox command support.

 hw/cxl/cxl-mailbox-utils.c  | 199 ++++++++++++++++++++++++++++++++++++
 hw/mem/cxl_type3.c          |  92 +++++++++++++++++
 hw/mem/cxl_type3_stubs.c    |   3 +
 hw/mem/meson.build          |   2 +
 include/hw/cxl/cxl_device.h |  21 ++++
 qapi/cxl.json               |  11 ++
 6 files changed, 328 insertions(+)

-- 
2.37.2


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [RFC PATCH v2 0/3] hw/cxl: Poison get, inject, clear
@ 2023-02-01 10:03 ` Jonathan Cameron via
  0 siblings, 0 replies; 12+ messages in thread
From: Jonathan Cameron via @ 2023-02-01 10:03 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Alison Schofield

RFC mostly because we already have 4 CXL QEMU series waiting for review
ahead of this and I don't want to distract too much from those.

I posted an RFC of part of this series a long time back [1]. It's been more
or less entirely rewriten since then and gained support for mailbox based
injection and clearing (including writing the cacheline). Note that I still
don't check for poison on direct reads of memory. That may be added in future
but isn't really that useful for testing the kernel code - it will end up
being the same as injecting a Machine Check or equivalent for the Host
Physical Address in question.

The series supports:
1) Injection of variable length poison regions via QMP (to fake real
   memory corruption and ensure we deal with odd overflow corner cases
   such as clearing the middle of a large region making the list overflow
   as we go from one long entry to two smaller entries.
2) Read of poison list via the CXL mailbox.
3) Injection via the poison injection mailbox command (limited to 64 byte
   entries)
4) Clearing of poison injected via either method.

The implementation is meant to be a valid combination of imdef choices
based on what the spec allowed. There are a number of places where it could
be made more sophisticated that we might consider in future:
* Fusing adjacent poison entries if the types match.
* Separate injection list and main poison list, to test out limits on
  injected poison list being smaller than the main list.
* Poison list overflow event (needs event log support in general)
* Connecting up to the poison list error record generation (rather complex
  and not needed for currently kernel handling testing).

As the kernel code is currently fairly simple, it is likely that the above
does not yet matter but who knows what will turn up in future!

Tested against Alison's latest kernel patches.
https://lore.kernel.org/linux-cxl/cover.1674070170.git.alison.schofield@intel.com/
https://lore.kernel.org/linux-cxl/cover.1674101475.git.alison.schofield@intel.com/
and set timestamp patch
https://lore.kernel.org/linux-cxl/20230130151327.32415-1-Jonathan.Cameron@huawei.com/

[1] https://lore.kernel.org/linux-cxl/20220620162056.16790-1-Jonathan.Cameron@huawei.com/

Jonathan Cameron (3):
  hw/cxl: QMP based poison injection support
  hw/cxl: Add poison injection via the mailbox.
  hw/cxl: Add clear poison mailbox command support.

 hw/cxl/cxl-mailbox-utils.c  | 199 ++++++++++++++++++++++++++++++++++++
 hw/mem/cxl_type3.c          |  92 +++++++++++++++++
 hw/mem/cxl_type3_stubs.c    |   3 +
 hw/mem/meson.build          |   2 +
 include/hw/cxl/cxl_device.h |  21 ++++
 qapi/cxl.json               |  11 ++
 6 files changed, 328 insertions(+)

-- 
2.37.2



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [RFC PATCH v2 1/3] hw/cxl: QMP based poison injection support
  2023-02-01 10:03 ` Jonathan Cameron via
@ 2023-02-01 10:03   ` Jonathan Cameron via
  -1 siblings, 0 replies; 12+ messages in thread
From: Jonathan Cameron @ 2023-02-01 10:03 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Alison Schofield

Inject poison using qmp command cxl-inject-poison to add an entry to the
poison list.

For now, the poison is not returned CXL.mem reads, but only via the
mailbox command Get Poison List.

See CXL rev 3.0, sec 8.2.9.8.4.1 Get Poison list (Opcode 4300h)

Kernel patches to use this interface here:
https://lore.kernel.org/linux-cxl/cover.1665606782.git.alison.schofield@intel.com/

To inject poison using qmp (telnet to the qmp port)
{ "execute": "qmp_capabilities" }

{ "execute": "cxl-inject-poison",
    "arguments": {
         "path": "/machine/peripheral/cxl-pmem0",
         "start": 2048,
         "length": 256
    }
}

Adjusted to select a device on your machine.

Note that the poison list supported is kept short enough to avoid the
complexity of state machine that is needed to handle the MORE flag.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

---
v2:
Moved to QMP to allow for single command.
Update reference in coverletter
Added specific setting of type for this approach to injection.
Drop the unnecessary ct3d class get_poison_list callback.
Block overlapping regions from being injected
Handle list overflow
Use Ira's utility function to get the timestamps
---
 hw/cxl/cxl-mailbox-utils.c  | 82 +++++++++++++++++++++++++++++++++++++
 hw/mem/cxl_type3.c          | 56 +++++++++++++++++++++++++
 hw/mem/cxl_type3_stubs.c    |  3 ++
 hw/mem/meson.build          |  2 +
 include/hw/cxl/cxl_device.h | 20 +++++++++
 qapi/cxl.json               | 11 +++++
 6 files changed, 174 insertions(+)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 580366ed2f..cf3cfb10a1 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -62,6 +62,8 @@ enum {
         #define GET_PARTITION_INFO     0x0
         #define GET_LSA       0x2
         #define SET_LSA       0x3
+    MEDIA_AND_POISON = 0x43,
+        #define GET_POISON_LIST        0x0
 };
 
 struct cxl_cmd;
@@ -267,6 +269,8 @@ static CXLRetCode cmd_identify_memory_device(struct cxl_cmd *cmd,
     id->persistent_capacity = cxl_dstate->pmem_size / CXL_CAPACITY_MULTIPLIER;
     id->volatile_capacity = cxl_dstate->vmem_size / CXL_CAPACITY_MULTIPLIER;
     id->lsa_size = cvc->get_lsa_size(ct3d);
+    id->poison_list_max_mer[1] = 0x1; /* 256 poison records */
+    id->inject_poison_limit = 0; /* No limit - so limited by main poison record limit */
 
     *len = sizeof(*id);
     return CXL_MBOX_SUCCESS;
@@ -356,6 +360,82 @@ static CXLRetCode cmd_ccls_set_lsa(struct cxl_cmd *cmd,
     return CXL_MBOX_SUCCESS;
 }
 
+/*
+ * This is very inefficient, but good enough for now!
+ * Also the payload will always fit, so no need to handle the MORE flag and
+ * make this stateful. We may want to allow longer poison lists to aid
+ * testing that kernel functionality.
+ */
+static CXLRetCode cmd_media_get_poison_list(struct cxl_cmd *cmd,
+                                            CXLDeviceState *cxl_dstate,
+                                            uint16_t *len)
+{
+    struct get_poison_list_pl {
+        uint64_t pa;
+        uint64_t length;
+    } QEMU_PACKED;
+
+    struct get_poison_list_out_pl {
+        uint8_t flags;
+        uint8_t rsvd1;
+        uint64_t overflow_timestamp;
+        uint16_t count;
+        uint8_t rsvd2[0x14];
+        struct {
+            uint64_t addr;
+            uint32_t length;
+            uint32_t resv;
+        } QEMU_PACKED records[];
+    } QEMU_PACKED;
+
+    struct get_poison_list_pl *in = (void *)cmd->payload;
+    struct get_poison_list_out_pl *out = (void *)cmd->payload;
+    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
+    uint16_t record_count = 0, i = 0;
+    uint64_t query_start = in->pa;
+    uint64_t query_length = in->length;
+    CXLPoisonList *poison_list = &ct3d->poison_list;
+    CXLPoison *ent;
+    uint16_t out_pl_len;
+
+    QLIST_FOREACH(ent, poison_list, node) {
+        /* Check for no overlap */
+        if (ent->start >= query_start + query_length ||
+            ent->start + ent->length <= query_start) {
+            continue;
+        }
+        record_count++;
+    }
+    out_pl_len = sizeof(*out) + record_count * sizeof(out->records[0]);
+    assert(out_pl_len <= CXL_MAILBOX_MAX_PAYLOAD_SIZE);
+
+    memset(out, 0, out_pl_len);
+    QLIST_FOREACH(ent, poison_list, node) {
+        uint64_t start, stop;
+
+        /* Check for no overlap */
+        if (ent->start >= query_start + query_length ||
+            ent->start + ent->length <= query_start) {
+            continue;
+        }
+
+        /* Deal with overlap */
+        start = MAX(ent->start & 0xffffffffffffffc0, query_start);
+        stop = MIN((ent->start & 0xffffffffffffffc0) + ent->length,
+                   query_start + query_length);
+        out->records[i].addr = start | (ent->type & 0x3);
+        out->records[i].length = (stop - start) / 64;
+        i++;
+    }
+    if (ct3d->poison_list_overflowed) {
+        out->flags = (1 << 1);
+        out->overflow_timestamp = ct3d->poison_list_overflow_ts;
+    }
+    out->count = record_count;
+    *len = out_pl_len;
+    return CXL_MBOX_SUCCESS;
+}
+
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
 #define IMMEDIATE_DATA_CHANGE (1 << 2)
 #define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -383,6 +463,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
     [CCLS][GET_LSA] = { "CCLS_GET_LSA", cmd_ccls_get_lsa, 8, 0 },
     [CCLS][SET_LSA] = { "CCLS_SET_LSA", cmd_ccls_set_lsa,
         ~0, IMMEDIATE_CONFIG_CHANGE | IMMEDIATE_DATA_CHANGE },
+    [MEDIA_AND_POISON][GET_POISON_LIST] = { "MEDIA_AND_POISON_GET_POISON_LIST",
+        cmd_media_get_poison_list, 16, 0 },
 };
 
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index beb931d59a..20f8d8e8aa 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -925,6 +925,62 @@ static void set_lsa(CXLType3Dev *ct3d, const void *buf, uint64_t size,
      */
 }
 
+void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d)
+{
+        ct3d->poison_list_overflowed = true;
+        ct3d->poison_list_overflow_ts =
+            cxl_device_get_timestamp(&ct3d->cxl_dstate);
+}
+
+void qmp_cxl_inject_poison(const char *path, uint64_t start, uint64_t length,
+                           Error **errp)
+{
+    Object *obj = object_resolve_path(path, NULL);
+    CXLType3Dev *ct3d;
+    CXLPoison *p;
+
+    if (length % 64) {
+        error_setg(errp, "Poison injection must be in multiples of 64 bytes");
+        return;
+    }
+    if (start % 64) {
+        error_setg(errp, "Poison start address must be 64 byte aligned");
+        return;
+    }
+    if (!obj) {
+        error_setg(errp, "Unable to resolve path");
+        return;
+    }
+    if (!object_dynamic_cast(obj, TYPE_CXL_TYPE3)) {
+        error_setg(errp, "Path does not point to a CXL type 3 device");
+        return;
+    }
+
+    ct3d = CXL_TYPE3(obj);
+
+    QLIST_FOREACH(p, &ct3d->poison_list, node) {
+        if (((start >= p->start) && (start < p->start + p->length)) ||
+            ((start + length > p->start) &&
+             (start + length <= p->start + p->length))) {
+            error_setg(errp, "Overlap with existing poisoned region not supported");
+            return;
+        }
+    }
+
+    if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
+        cxl_set_poison_list_overflowed(ct3d);
+        return;
+    }
+
+    p = g_new0(CXLPoison, 1);
+    p->length = length;
+    p->start = start;
+    p->type = CXL_POISON_TYPE_INTERNAL; /* Different from injected via the mbox */
+
+    QLIST_INSERT_HEAD(&ct3d->poison_list, p, node);
+    ct3d->poison_list_cnt++;
+}
+
 /* For uncorrectable errors include support for multiple header recording */
 void qmp_cxl_inject_uncorrectable_errors(const char *path,
                                          CXLUncorErrorRecordList *errors,
diff --git a/hw/mem/cxl_type3_stubs.c b/hw/mem/cxl_type3_stubs.c
index b6b51ced54..6055190ca6 100644
--- a/hw/mem/cxl_type3_stubs.c
+++ b/hw/mem/cxl_type3_stubs.c
@@ -2,6 +2,9 @@
 #include "qemu/osdep.h"
 #include "qapi/qapi-commands-cxl.h"
 
+void qmp_cxl_inject_poison(const char *path, uint64_t start, uint64_t length,
+                           Error **errp) {}
+
 void qmp_cxl_inject_uncorrectable_errors(const char *path,
                                          CXLUncorErrorRecordList *errors,
                                          Error **errp) {}
diff --git a/hw/mem/meson.build b/hw/mem/meson.build
index 56c2618b84..930c67e390 100644
--- a/hw/mem/meson.build
+++ b/hw/mem/meson.build
@@ -10,3 +10,5 @@ softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('cxl_type3_stubs.c'))
 softmmu_ss.add_all(when: 'CONFIG_MEM_DEVICE', if_true: mem_ss)
 
 softmmu_ss.add(when: 'CONFIG_SPARSE_MEM', if_true: files('sparse-mem.c'))
+softmmu_ss.add(when: 'CONFIG_CXL_MEM_DEVICE', if_false: files('cxl_type3_stubs.c'))
+softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('cxl_type3_stubs.c'))
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 44fea2d649..3cb77fe8a5 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -270,6 +270,18 @@ typedef struct CXLError {
 
 typedef QTAILQ_HEAD(, CXLError) CXLErrorList;
 
+typedef struct CXLPoison {
+    uint64_t start, length;
+    uint8_t type;
+#define CXL_POISON_TYPE_EXTERNAL 0x1
+#define CXL_POISON_TYPE_INTERNAL 0x2
+#define CXL_POISON_TYPE_INJECTED 0x3
+    QLIST_ENTRY(CXLPoison) node;
+} CXLPoison;
+
+typedef QLIST_HEAD(, CXLPoison) CXLPoisonList;
+#define CXL_POISON_LIST_LIMIT 256
+
 struct CXLType3Dev {
     /* Private */
     PCIDevice parent_obj;
@@ -292,6 +304,12 @@ struct CXLType3Dev {
 
     /* Error injection */
     CXLErrorList error_list;
+
+    /* Poison Injection - cache */
+    CXLPoisonList poison_list;
+    unsigned int poison_list_cnt;
+    bool poison_list_overflowed;
+    uint64_t poison_list_overflow_ts;
 };
 
 #define TYPE_CXL_TYPE3 "cxl-type3"
@@ -317,4 +335,6 @@ MemTxResult cxl_type3_write(PCIDevice *d, hwaddr host_addr, uint64_t data,
 
 uint64_t cxl_device_get_timestamp(CXLDeviceState *cxlds);
 
+void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d);
+
 #endif
diff --git a/qapi/cxl.json b/qapi/cxl.json
index 3c18556ee8..5b995db255 100644
--- a/qapi/cxl.json
+++ b/qapi/cxl.json
@@ -5,6 +5,17 @@
 # = CXL devices
 ##
 
+##
+# @cxl-inject-poison:
+#
+# @path: CXL type 3 device canonical QOM path
+#
+# @start: Start address
+# @length: Length of poison to inject
+##
+{ 'command': 'cxl-inject-poison',
+  'data': { 'path': 'str', 'start': 'uint64', 'length': 'uint64' }}
+
 ##
 # @CxlUncorErrorType:
 #
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH v2 1/3] hw/cxl: QMP based poison injection support
@ 2023-02-01 10:03   ` Jonathan Cameron via
  0 siblings, 0 replies; 12+ messages in thread
From: Jonathan Cameron via @ 2023-02-01 10:03 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Alison Schofield

Inject poison using qmp command cxl-inject-poison to add an entry to the
poison list.

For now, the poison is not returned CXL.mem reads, but only via the
mailbox command Get Poison List.

See CXL rev 3.0, sec 8.2.9.8.4.1 Get Poison list (Opcode 4300h)

Kernel patches to use this interface here:
https://lore.kernel.org/linux-cxl/cover.1665606782.git.alison.schofield@intel.com/

To inject poison using qmp (telnet to the qmp port)
{ "execute": "qmp_capabilities" }

{ "execute": "cxl-inject-poison",
    "arguments": {
         "path": "/machine/peripheral/cxl-pmem0",
         "start": 2048,
         "length": 256
    }
}

Adjusted to select a device on your machine.

Note that the poison list supported is kept short enough to avoid the
complexity of state machine that is needed to handle the MORE flag.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

---
v2:
Moved to QMP to allow for single command.
Update reference in coverletter
Added specific setting of type for this approach to injection.
Drop the unnecessary ct3d class get_poison_list callback.
Block overlapping regions from being injected
Handle list overflow
Use Ira's utility function to get the timestamps
---
 hw/cxl/cxl-mailbox-utils.c  | 82 +++++++++++++++++++++++++++++++++++++
 hw/mem/cxl_type3.c          | 56 +++++++++++++++++++++++++
 hw/mem/cxl_type3_stubs.c    |  3 ++
 hw/mem/meson.build          |  2 +
 include/hw/cxl/cxl_device.h | 20 +++++++++
 qapi/cxl.json               | 11 +++++
 6 files changed, 174 insertions(+)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 580366ed2f..cf3cfb10a1 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -62,6 +62,8 @@ enum {
         #define GET_PARTITION_INFO     0x0
         #define GET_LSA       0x2
         #define SET_LSA       0x3
+    MEDIA_AND_POISON = 0x43,
+        #define GET_POISON_LIST        0x0
 };
 
 struct cxl_cmd;
@@ -267,6 +269,8 @@ static CXLRetCode cmd_identify_memory_device(struct cxl_cmd *cmd,
     id->persistent_capacity = cxl_dstate->pmem_size / CXL_CAPACITY_MULTIPLIER;
     id->volatile_capacity = cxl_dstate->vmem_size / CXL_CAPACITY_MULTIPLIER;
     id->lsa_size = cvc->get_lsa_size(ct3d);
+    id->poison_list_max_mer[1] = 0x1; /* 256 poison records */
+    id->inject_poison_limit = 0; /* No limit - so limited by main poison record limit */
 
     *len = sizeof(*id);
     return CXL_MBOX_SUCCESS;
@@ -356,6 +360,82 @@ static CXLRetCode cmd_ccls_set_lsa(struct cxl_cmd *cmd,
     return CXL_MBOX_SUCCESS;
 }
 
+/*
+ * This is very inefficient, but good enough for now!
+ * Also the payload will always fit, so no need to handle the MORE flag and
+ * make this stateful. We may want to allow longer poison lists to aid
+ * testing that kernel functionality.
+ */
+static CXLRetCode cmd_media_get_poison_list(struct cxl_cmd *cmd,
+                                            CXLDeviceState *cxl_dstate,
+                                            uint16_t *len)
+{
+    struct get_poison_list_pl {
+        uint64_t pa;
+        uint64_t length;
+    } QEMU_PACKED;
+
+    struct get_poison_list_out_pl {
+        uint8_t flags;
+        uint8_t rsvd1;
+        uint64_t overflow_timestamp;
+        uint16_t count;
+        uint8_t rsvd2[0x14];
+        struct {
+            uint64_t addr;
+            uint32_t length;
+            uint32_t resv;
+        } QEMU_PACKED records[];
+    } QEMU_PACKED;
+
+    struct get_poison_list_pl *in = (void *)cmd->payload;
+    struct get_poison_list_out_pl *out = (void *)cmd->payload;
+    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
+    uint16_t record_count = 0, i = 0;
+    uint64_t query_start = in->pa;
+    uint64_t query_length = in->length;
+    CXLPoisonList *poison_list = &ct3d->poison_list;
+    CXLPoison *ent;
+    uint16_t out_pl_len;
+
+    QLIST_FOREACH(ent, poison_list, node) {
+        /* Check for no overlap */
+        if (ent->start >= query_start + query_length ||
+            ent->start + ent->length <= query_start) {
+            continue;
+        }
+        record_count++;
+    }
+    out_pl_len = sizeof(*out) + record_count * sizeof(out->records[0]);
+    assert(out_pl_len <= CXL_MAILBOX_MAX_PAYLOAD_SIZE);
+
+    memset(out, 0, out_pl_len);
+    QLIST_FOREACH(ent, poison_list, node) {
+        uint64_t start, stop;
+
+        /* Check for no overlap */
+        if (ent->start >= query_start + query_length ||
+            ent->start + ent->length <= query_start) {
+            continue;
+        }
+
+        /* Deal with overlap */
+        start = MAX(ent->start & 0xffffffffffffffc0, query_start);
+        stop = MIN((ent->start & 0xffffffffffffffc0) + ent->length,
+                   query_start + query_length);
+        out->records[i].addr = start | (ent->type & 0x3);
+        out->records[i].length = (stop - start) / 64;
+        i++;
+    }
+    if (ct3d->poison_list_overflowed) {
+        out->flags = (1 << 1);
+        out->overflow_timestamp = ct3d->poison_list_overflow_ts;
+    }
+    out->count = record_count;
+    *len = out_pl_len;
+    return CXL_MBOX_SUCCESS;
+}
+
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
 #define IMMEDIATE_DATA_CHANGE (1 << 2)
 #define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -383,6 +463,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
     [CCLS][GET_LSA] = { "CCLS_GET_LSA", cmd_ccls_get_lsa, 8, 0 },
     [CCLS][SET_LSA] = { "CCLS_SET_LSA", cmd_ccls_set_lsa,
         ~0, IMMEDIATE_CONFIG_CHANGE | IMMEDIATE_DATA_CHANGE },
+    [MEDIA_AND_POISON][GET_POISON_LIST] = { "MEDIA_AND_POISON_GET_POISON_LIST",
+        cmd_media_get_poison_list, 16, 0 },
 };
 
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index beb931d59a..20f8d8e8aa 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -925,6 +925,62 @@ static void set_lsa(CXLType3Dev *ct3d, const void *buf, uint64_t size,
      */
 }
 
+void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d)
+{
+        ct3d->poison_list_overflowed = true;
+        ct3d->poison_list_overflow_ts =
+            cxl_device_get_timestamp(&ct3d->cxl_dstate);
+}
+
+void qmp_cxl_inject_poison(const char *path, uint64_t start, uint64_t length,
+                           Error **errp)
+{
+    Object *obj = object_resolve_path(path, NULL);
+    CXLType3Dev *ct3d;
+    CXLPoison *p;
+
+    if (length % 64) {
+        error_setg(errp, "Poison injection must be in multiples of 64 bytes");
+        return;
+    }
+    if (start % 64) {
+        error_setg(errp, "Poison start address must be 64 byte aligned");
+        return;
+    }
+    if (!obj) {
+        error_setg(errp, "Unable to resolve path");
+        return;
+    }
+    if (!object_dynamic_cast(obj, TYPE_CXL_TYPE3)) {
+        error_setg(errp, "Path does not point to a CXL type 3 device");
+        return;
+    }
+
+    ct3d = CXL_TYPE3(obj);
+
+    QLIST_FOREACH(p, &ct3d->poison_list, node) {
+        if (((start >= p->start) && (start < p->start + p->length)) ||
+            ((start + length > p->start) &&
+             (start + length <= p->start + p->length))) {
+            error_setg(errp, "Overlap with existing poisoned region not supported");
+            return;
+        }
+    }
+
+    if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
+        cxl_set_poison_list_overflowed(ct3d);
+        return;
+    }
+
+    p = g_new0(CXLPoison, 1);
+    p->length = length;
+    p->start = start;
+    p->type = CXL_POISON_TYPE_INTERNAL; /* Different from injected via the mbox */
+
+    QLIST_INSERT_HEAD(&ct3d->poison_list, p, node);
+    ct3d->poison_list_cnt++;
+}
+
 /* For uncorrectable errors include support for multiple header recording */
 void qmp_cxl_inject_uncorrectable_errors(const char *path,
                                          CXLUncorErrorRecordList *errors,
diff --git a/hw/mem/cxl_type3_stubs.c b/hw/mem/cxl_type3_stubs.c
index b6b51ced54..6055190ca6 100644
--- a/hw/mem/cxl_type3_stubs.c
+++ b/hw/mem/cxl_type3_stubs.c
@@ -2,6 +2,9 @@
 #include "qemu/osdep.h"
 #include "qapi/qapi-commands-cxl.h"
 
+void qmp_cxl_inject_poison(const char *path, uint64_t start, uint64_t length,
+                           Error **errp) {}
+
 void qmp_cxl_inject_uncorrectable_errors(const char *path,
                                          CXLUncorErrorRecordList *errors,
                                          Error **errp) {}
diff --git a/hw/mem/meson.build b/hw/mem/meson.build
index 56c2618b84..930c67e390 100644
--- a/hw/mem/meson.build
+++ b/hw/mem/meson.build
@@ -10,3 +10,5 @@ softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('cxl_type3_stubs.c'))
 softmmu_ss.add_all(when: 'CONFIG_MEM_DEVICE', if_true: mem_ss)
 
 softmmu_ss.add(when: 'CONFIG_SPARSE_MEM', if_true: files('sparse-mem.c'))
+softmmu_ss.add(when: 'CONFIG_CXL_MEM_DEVICE', if_false: files('cxl_type3_stubs.c'))
+softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('cxl_type3_stubs.c'))
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 44fea2d649..3cb77fe8a5 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -270,6 +270,18 @@ typedef struct CXLError {
 
 typedef QTAILQ_HEAD(, CXLError) CXLErrorList;
 
+typedef struct CXLPoison {
+    uint64_t start, length;
+    uint8_t type;
+#define CXL_POISON_TYPE_EXTERNAL 0x1
+#define CXL_POISON_TYPE_INTERNAL 0x2
+#define CXL_POISON_TYPE_INJECTED 0x3
+    QLIST_ENTRY(CXLPoison) node;
+} CXLPoison;
+
+typedef QLIST_HEAD(, CXLPoison) CXLPoisonList;
+#define CXL_POISON_LIST_LIMIT 256
+
 struct CXLType3Dev {
     /* Private */
     PCIDevice parent_obj;
@@ -292,6 +304,12 @@ struct CXLType3Dev {
 
     /* Error injection */
     CXLErrorList error_list;
+
+    /* Poison Injection - cache */
+    CXLPoisonList poison_list;
+    unsigned int poison_list_cnt;
+    bool poison_list_overflowed;
+    uint64_t poison_list_overflow_ts;
 };
 
 #define TYPE_CXL_TYPE3 "cxl-type3"
@@ -317,4 +335,6 @@ MemTxResult cxl_type3_write(PCIDevice *d, hwaddr host_addr, uint64_t data,
 
 uint64_t cxl_device_get_timestamp(CXLDeviceState *cxlds);
 
+void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d);
+
 #endif
diff --git a/qapi/cxl.json b/qapi/cxl.json
index 3c18556ee8..5b995db255 100644
--- a/qapi/cxl.json
+++ b/qapi/cxl.json
@@ -5,6 +5,17 @@
 # = CXL devices
 ##
 
+##
+# @cxl-inject-poison:
+#
+# @path: CXL type 3 device canonical QOM path
+#
+# @start: Start address
+# @length: Length of poison to inject
+##
+{ 'command': 'cxl-inject-poison',
+  'data': { 'path': 'str', 'start': 'uint64', 'length': 'uint64' }}
+
 ##
 # @CxlUncorErrorType:
 #
-- 
2.37.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH v2 2/3] hw/cxl: Add poison injection via the mailbox.
  2023-02-01 10:03 ` Jonathan Cameron via
@ 2023-02-01 10:03   ` Jonathan Cameron via
  -1 siblings, 0 replies; 12+ messages in thread
From: Jonathan Cameron @ 2023-02-01 10:03 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Alison Schofield

Very simple implementation to allow testing of corresponding
kernel code. Note that for now we track each 64 byte section
independently.  Whilst a valid implementation choice, it may
make sense to fuse entries so as to prove out more complex
corners of the kernel code.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/cxl/cxl-mailbox-utils.c | 40 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index cf3cfb10a1..7d3f7bcd3a 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -64,6 +64,7 @@ enum {
         #define SET_LSA       0x3
     MEDIA_AND_POISON = 0x43,
         #define GET_POISON_LIST        0x0
+        #define INJECT_POISON          0x1
 };
 
 struct cxl_cmd;
@@ -436,6 +437,43 @@ static CXLRetCode cmd_media_get_poison_list(struct cxl_cmd *cmd,
     return CXL_MBOX_SUCCESS;
 }
 
+static CXLRetCode cmd_media_inject_poison(struct cxl_cmd *cmd,
+                                          CXLDeviceState *cxl_dstate,
+                                          uint16_t *len)
+{
+    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
+    CXLPoisonList *poison_list = &ct3d->poison_list;
+    CXLPoison *ent;
+    struct inject_poison_pl {
+        uint64_t dpa;
+    };
+    struct inject_poison_pl *in = (void *)cmd->payload;
+    CXLPoison *p;
+
+    QLIST_FOREACH(ent, poison_list, node) {
+        if (ent->start == in->dpa && ent->length == 64) {
+            return CXL_MBOX_SUCCESS;
+        }
+    }
+
+    if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
+        return CXL_MBOX_INJECT_POISON_LIMIT;
+    }
+    p = g_new0(CXLPoison, 1);
+
+    p->length = 64;
+    p->start = in->dpa;
+    p->type = CXL_POISON_TYPE_INJECTED;
+
+    /*
+     * Possible todo: Merge with existing entry if next to it and if same type
+     */
+    QLIST_INSERT_HEAD(poison_list, p, node);
+    ct3d->poison_list_cnt++;
+
+    return CXL_MBOX_SUCCESS;
+}
+
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
 #define IMMEDIATE_DATA_CHANGE (1 << 2)
 #define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -465,6 +503,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
         ~0, IMMEDIATE_CONFIG_CHANGE | IMMEDIATE_DATA_CHANGE },
     [MEDIA_AND_POISON][GET_POISON_LIST] = { "MEDIA_AND_POISON_GET_POISON_LIST",
         cmd_media_get_poison_list, 16, 0 },
+    [MEDIA_AND_POISON][INJECT_POISON] = { "MEDIA_AND_POISON_INJECT_POISON",
+        cmd_media_inject_poison, 8, 0 },
 };
 
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH v2 2/3] hw/cxl: Add poison injection via the mailbox.
@ 2023-02-01 10:03   ` Jonathan Cameron via
  0 siblings, 0 replies; 12+ messages in thread
From: Jonathan Cameron via @ 2023-02-01 10:03 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Alison Schofield

Very simple implementation to allow testing of corresponding
kernel code. Note that for now we track each 64 byte section
independently.  Whilst a valid implementation choice, it may
make sense to fuse entries so as to prove out more complex
corners of the kernel code.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/cxl/cxl-mailbox-utils.c | 40 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index cf3cfb10a1..7d3f7bcd3a 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -64,6 +64,7 @@ enum {
         #define SET_LSA       0x3
     MEDIA_AND_POISON = 0x43,
         #define GET_POISON_LIST        0x0
+        #define INJECT_POISON          0x1
 };
 
 struct cxl_cmd;
@@ -436,6 +437,43 @@ static CXLRetCode cmd_media_get_poison_list(struct cxl_cmd *cmd,
     return CXL_MBOX_SUCCESS;
 }
 
+static CXLRetCode cmd_media_inject_poison(struct cxl_cmd *cmd,
+                                          CXLDeviceState *cxl_dstate,
+                                          uint16_t *len)
+{
+    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
+    CXLPoisonList *poison_list = &ct3d->poison_list;
+    CXLPoison *ent;
+    struct inject_poison_pl {
+        uint64_t dpa;
+    };
+    struct inject_poison_pl *in = (void *)cmd->payload;
+    CXLPoison *p;
+
+    QLIST_FOREACH(ent, poison_list, node) {
+        if (ent->start == in->dpa && ent->length == 64) {
+            return CXL_MBOX_SUCCESS;
+        }
+    }
+
+    if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
+        return CXL_MBOX_INJECT_POISON_LIMIT;
+    }
+    p = g_new0(CXLPoison, 1);
+
+    p->length = 64;
+    p->start = in->dpa;
+    p->type = CXL_POISON_TYPE_INJECTED;
+
+    /*
+     * Possible todo: Merge with existing entry if next to it and if same type
+     */
+    QLIST_INSERT_HEAD(poison_list, p, node);
+    ct3d->poison_list_cnt++;
+
+    return CXL_MBOX_SUCCESS;
+}
+
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
 #define IMMEDIATE_DATA_CHANGE (1 << 2)
 #define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -465,6 +503,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
         ~0, IMMEDIATE_CONFIG_CHANGE | IMMEDIATE_DATA_CHANGE },
     [MEDIA_AND_POISON][GET_POISON_LIST] = { "MEDIA_AND_POISON_GET_POISON_LIST",
         cmd_media_get_poison_list, 16, 0 },
+    [MEDIA_AND_POISON][INJECT_POISON] = { "MEDIA_AND_POISON_INJECT_POISON",
+        cmd_media_inject_poison, 8, 0 },
 };
 
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
-- 
2.37.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH v2 3/3] hw/cxl: Add clear poison mailbox command support.
  2023-02-01 10:03 ` Jonathan Cameron via
@ 2023-02-01 10:03   ` Jonathan Cameron via
  -1 siblings, 0 replies; 12+ messages in thread
From: Jonathan Cameron @ 2023-02-01 10:03 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Alison Schofield

Current implementation is very simple so many of the corner
cases do not exist (e.g. fragmenting larger poison list entries)

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/cxl/cxl-mailbox-utils.c  | 77 +++++++++++++++++++++++++++++++++++++
 hw/mem/cxl_type3.c          | 36 +++++++++++++++++
 include/hw/cxl/cxl_device.h |  1 +
 3 files changed, 114 insertions(+)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 7d3f7bcd3a..f56c76b205 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -65,6 +65,7 @@ enum {
     MEDIA_AND_POISON = 0x43,
         #define GET_POISON_LIST        0x0
         #define INJECT_POISON          0x1
+        #define CLEAR_POISON           0x2
 };
 
 struct cxl_cmd;
@@ -474,6 +475,80 @@ static CXLRetCode cmd_media_inject_poison(struct cxl_cmd *cmd,
     return CXL_MBOX_SUCCESS;
 }
 
+static CXLRetCode cmd_media_clear_poison(struct cxl_cmd *cmd,
+                                         CXLDeviceState *cxl_dstate,
+                                         uint16_t *len)
+{
+    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
+    CXLPoisonList *poison_list = &ct3d->poison_list;
+    CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d);
+    struct clear_poison_pl {
+        uint64_t dpa;
+        uint8_t data[64];
+    };
+    CXLPoison *ent;
+
+    struct clear_poison_pl *in = (void *)cmd->payload;
+
+    if (in->dpa + 64 > cxl_dstate->mem_size) {
+        return CXL_MBOX_INVALID_PA;
+    }
+
+    QLIST_FOREACH(ent, poison_list, node) {
+        /*
+         * Test for contained in entry. Simpler than general case
+         * as clearing 64 bytes and entries 64 byte aligned
+         */
+        if ((in->dpa < ent->start) || (in->dpa >= ent->start + ent->length)) {
+            continue;
+        }
+        /* Do accounting early as we know one will go away */
+        ct3d->poison_list_cnt--;
+        if (in->dpa > ent->start) {
+            CXLPoison *frag;
+            if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
+                cxl_set_poison_list_overflowed(ct3d);
+                break;
+            }
+            frag = g_new0(CXLPoison, 1);
+
+            frag->start = ent->start;
+            frag->length = in->dpa - ent->start;
+            frag->type = ent->type;
+
+            QLIST_INSERT_HEAD(poison_list, frag, node);
+            ct3d->poison_list_cnt++;
+        }
+        if (in->dpa + 64 < ent->start + ent->length) {
+            CXLPoison *frag;
+
+            if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
+                cxl_set_poison_list_overflowed(ct3d);
+                break;
+            }
+
+            frag = g_new0(CXLPoison, 1);
+
+            frag->start = in->dpa + 64;
+            frag->length = ent->start + ent->length - frag->start;
+            frag->type = ent->type;
+            QLIST_INSERT_HEAD(poison_list, frag, node);
+            ct3d->poison_list_cnt++;
+        }
+        /* Any fragments have been added, free original entry */
+        QLIST_REMOVE(ent, node);
+        g_free(ent);
+        break;
+    }
+    /* Clearing a region with no poison is not an error so always do so */
+    if (cvc->set_cacheline)
+        if (!cvc->set_cacheline(ct3d, in->dpa, in->data)) {
+            return CXL_MBOX_INTERNAL_ERROR;
+        }
+
+    return CXL_MBOX_SUCCESS;
+}
+
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
 #define IMMEDIATE_DATA_CHANGE (1 << 2)
 #define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -505,6 +580,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
         cmd_media_get_poison_list, 16, 0 },
     [MEDIA_AND_POISON][INJECT_POISON] = { "MEDIA_AND_POISON_INJECT_POISON",
         cmd_media_inject_poison, 8, 0 },
+    [MEDIA_AND_POISON][CLEAR_POISON] = { "MEDIA_AND_POISON_CLEAR_POISON",
+        cmd_media_clear_poison, 72, 0 },
 };
 
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 20f8d8e8aa..75ecf95a2d 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -925,6 +925,41 @@ static void set_lsa(CXLType3Dev *ct3d, const void *buf, uint64_t size,
      */
 }
 
+static bool set_cacheline(CXLType3Dev *ct3d, uint64_t dpa_offset, uint8_t *data)
+{
+    MemoryRegion *vmr = NULL, *pmr = NULL;
+    AddressSpace *as;
+
+    if (ct3d->hostvmem) {
+        vmr = host_memory_backend_get_memory(ct3d->hostvmem);
+    }
+    if (ct3d->hostpmem) {
+        pmr = host_memory_backend_get_memory(ct3d->hostpmem);
+    }
+
+    if (!vmr && !pmr) {
+        return false;
+    }
+
+    if (dpa_offset + 64 > int128_get64(ct3d->cxl_dstate.mem_size)) {
+        return false;
+    }
+
+    if (vmr) {
+        if (dpa_offset <= int128_get64(vmr->size)) {
+            as = &ct3d->hostvmem_as;
+        } else {
+            as = &ct3d->hostpmem_as;
+            dpa_offset -= vmr->size;
+        }
+    } else {
+        as = &ct3d->hostpmem_as;
+    }
+
+    address_space_write(as, dpa_offset, MEMTXATTRS_UNSPECIFIED, &data, 64);
+    return true;
+}
+
 void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d)
 {
         ct3d->poison_list_overflowed = true;
@@ -1146,6 +1181,7 @@ static void ct3_class_init(ObjectClass *oc, void *data)
     cvc->get_lsa_size = get_lsa_size;
     cvc->get_lsa = get_lsa;
     cvc->set_lsa = set_lsa;
+    cvc->set_cacheline = set_cacheline;
 }
 
 static const TypeInfo ct3d_info = {
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 3cb77fe8a5..0a05f21e40 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -326,6 +326,7 @@ struct CXLType3Class {
                         uint64_t offset);
     void (*set_lsa)(CXLType3Dev *ct3d, const void *buf, uint64_t size,
                     uint64_t offset);
+    bool (*set_cacheline)(CXLType3Dev *ct3d, uint64_t dpa_offset, uint8_t *data);
 };
 
 MemTxResult cxl_type3_read(PCIDevice *d, hwaddr host_addr, uint64_t *data,
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH v2 3/3] hw/cxl: Add clear poison mailbox command support.
@ 2023-02-01 10:03   ` Jonathan Cameron via
  0 siblings, 0 replies; 12+ messages in thread
From: Jonathan Cameron via @ 2023-02-01 10:03 UTC (permalink / raw)
  To: qemu-devel, Michael Tsirkin
  Cc: Ben Widawsky, linux-cxl, linuxarm, Ira Weiny, Alison Schofield

Current implementation is very simple so many of the corner
cases do not exist (e.g. fragmenting larger poison list entries)

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/cxl/cxl-mailbox-utils.c  | 77 +++++++++++++++++++++++++++++++++++++
 hw/mem/cxl_type3.c          | 36 +++++++++++++++++
 include/hw/cxl/cxl_device.h |  1 +
 3 files changed, 114 insertions(+)

diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 7d3f7bcd3a..f56c76b205 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -65,6 +65,7 @@ enum {
     MEDIA_AND_POISON = 0x43,
         #define GET_POISON_LIST        0x0
         #define INJECT_POISON          0x1
+        #define CLEAR_POISON           0x2
 };
 
 struct cxl_cmd;
@@ -474,6 +475,80 @@ static CXLRetCode cmd_media_inject_poison(struct cxl_cmd *cmd,
     return CXL_MBOX_SUCCESS;
 }
 
+static CXLRetCode cmd_media_clear_poison(struct cxl_cmd *cmd,
+                                         CXLDeviceState *cxl_dstate,
+                                         uint16_t *len)
+{
+    CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
+    CXLPoisonList *poison_list = &ct3d->poison_list;
+    CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d);
+    struct clear_poison_pl {
+        uint64_t dpa;
+        uint8_t data[64];
+    };
+    CXLPoison *ent;
+
+    struct clear_poison_pl *in = (void *)cmd->payload;
+
+    if (in->dpa + 64 > cxl_dstate->mem_size) {
+        return CXL_MBOX_INVALID_PA;
+    }
+
+    QLIST_FOREACH(ent, poison_list, node) {
+        /*
+         * Test for contained in entry. Simpler than general case
+         * as clearing 64 bytes and entries 64 byte aligned
+         */
+        if ((in->dpa < ent->start) || (in->dpa >= ent->start + ent->length)) {
+            continue;
+        }
+        /* Do accounting early as we know one will go away */
+        ct3d->poison_list_cnt--;
+        if (in->dpa > ent->start) {
+            CXLPoison *frag;
+            if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
+                cxl_set_poison_list_overflowed(ct3d);
+                break;
+            }
+            frag = g_new0(CXLPoison, 1);
+
+            frag->start = ent->start;
+            frag->length = in->dpa - ent->start;
+            frag->type = ent->type;
+
+            QLIST_INSERT_HEAD(poison_list, frag, node);
+            ct3d->poison_list_cnt++;
+        }
+        if (in->dpa + 64 < ent->start + ent->length) {
+            CXLPoison *frag;
+
+            if (ct3d->poison_list_cnt == CXL_POISON_LIST_LIMIT) {
+                cxl_set_poison_list_overflowed(ct3d);
+                break;
+            }
+
+            frag = g_new0(CXLPoison, 1);
+
+            frag->start = in->dpa + 64;
+            frag->length = ent->start + ent->length - frag->start;
+            frag->type = ent->type;
+            QLIST_INSERT_HEAD(poison_list, frag, node);
+            ct3d->poison_list_cnt++;
+        }
+        /* Any fragments have been added, free original entry */
+        QLIST_REMOVE(ent, node);
+        g_free(ent);
+        break;
+    }
+    /* Clearing a region with no poison is not an error so always do so */
+    if (cvc->set_cacheline)
+        if (!cvc->set_cacheline(ct3d, in->dpa, in->data)) {
+            return CXL_MBOX_INTERNAL_ERROR;
+        }
+
+    return CXL_MBOX_SUCCESS;
+}
+
 #define IMMEDIATE_CONFIG_CHANGE (1 << 1)
 #define IMMEDIATE_DATA_CHANGE (1 << 2)
 #define IMMEDIATE_POLICY_CHANGE (1 << 3)
@@ -505,6 +580,8 @@ static struct cxl_cmd cxl_cmd_set[256][256] = {
         cmd_media_get_poison_list, 16, 0 },
     [MEDIA_AND_POISON][INJECT_POISON] = { "MEDIA_AND_POISON_INJECT_POISON",
         cmd_media_inject_poison, 8, 0 },
+    [MEDIA_AND_POISON][CLEAR_POISON] = { "MEDIA_AND_POISON_CLEAR_POISON",
+        cmd_media_clear_poison, 72, 0 },
 };
 
 void cxl_process_mailbox(CXLDeviceState *cxl_dstate)
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 20f8d8e8aa..75ecf95a2d 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -925,6 +925,41 @@ static void set_lsa(CXLType3Dev *ct3d, const void *buf, uint64_t size,
      */
 }
 
+static bool set_cacheline(CXLType3Dev *ct3d, uint64_t dpa_offset, uint8_t *data)
+{
+    MemoryRegion *vmr = NULL, *pmr = NULL;
+    AddressSpace *as;
+
+    if (ct3d->hostvmem) {
+        vmr = host_memory_backend_get_memory(ct3d->hostvmem);
+    }
+    if (ct3d->hostpmem) {
+        pmr = host_memory_backend_get_memory(ct3d->hostpmem);
+    }
+
+    if (!vmr && !pmr) {
+        return false;
+    }
+
+    if (dpa_offset + 64 > int128_get64(ct3d->cxl_dstate.mem_size)) {
+        return false;
+    }
+
+    if (vmr) {
+        if (dpa_offset <= int128_get64(vmr->size)) {
+            as = &ct3d->hostvmem_as;
+        } else {
+            as = &ct3d->hostpmem_as;
+            dpa_offset -= vmr->size;
+        }
+    } else {
+        as = &ct3d->hostpmem_as;
+    }
+
+    address_space_write(as, dpa_offset, MEMTXATTRS_UNSPECIFIED, &data, 64);
+    return true;
+}
+
 void cxl_set_poison_list_overflowed(CXLType3Dev *ct3d)
 {
         ct3d->poison_list_overflowed = true;
@@ -1146,6 +1181,7 @@ static void ct3_class_init(ObjectClass *oc, void *data)
     cvc->get_lsa_size = get_lsa_size;
     cvc->get_lsa = get_lsa;
     cvc->set_lsa = set_lsa;
+    cvc->set_cacheline = set_cacheline;
 }
 
 static const TypeInfo ct3d_info = {
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 3cb77fe8a5..0a05f21e40 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -326,6 +326,7 @@ struct CXLType3Class {
                         uint64_t offset);
     void (*set_lsa)(CXLType3Dev *ct3d, const void *buf, uint64_t size,
                     uint64_t offset);
+    bool (*set_cacheline)(CXLType3Dev *ct3d, uint64_t dpa_offset, uint8_t *data);
 };
 
 MemTxResult cxl_type3_read(PCIDevice *d, hwaddr host_addr, uint64_t *data,
-- 
2.37.2



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH v2 1/3] hw/cxl: QMP based poison injection support
  2023-02-01 10:03   ` Jonathan Cameron via
  (?)
@ 2023-02-01 12:14   ` Markus Armbruster
  2023-02-01 14:45       ` Jonathan Cameron via
  -1 siblings, 1 reply; 12+ messages in thread
From: Markus Armbruster @ 2023-02-01 12:14 UTC (permalink / raw)
  To: Jonathan Cameron via
  Cc: Michael Tsirkin, Jonathan Cameron, Ben Widawsky, linux-cxl,
	linuxarm, Ira Weiny, Alison Schofield

Jonathan Cameron via <qemu-devel@nongnu.org> writes:

> Inject poison using qmp command cxl-inject-poison to add an entry to the
> poison list.
>
> For now, the poison is not returned CXL.mem reads, but only via the
> mailbox command Get Poison List.
>
> See CXL rev 3.0, sec 8.2.9.8.4.1 Get Poison list (Opcode 4300h)
>
> Kernel patches to use this interface here:
> https://lore.kernel.org/linux-cxl/cover.1665606782.git.alison.schofield@intel.com/
>
> To inject poison using qmp (telnet to the qmp port)
> { "execute": "qmp_capabilities" }
>
> { "execute": "cxl-inject-poison",
>     "arguments": {
>          "path": "/machine/peripheral/cxl-pmem0",
>          "start": 2048,
>          "length": 256
>     }
> }
>
> Adjusted to select a device on your machine.
>
> Note that the poison list supported is kept short enough to avoid the
> complexity of state machine that is needed to handle the MORE flag.
>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

[...]

> diff --git a/qapi/cxl.json b/qapi/cxl.json
> index 3c18556ee8..5b995db255 100644
> --- a/qapi/cxl.json
> +++ b/qapi/cxl.json

There is no qapi/cxl.json in current master.  So this must be based on
some other patch(es).  Please point to it in the cover letter.  I like
to point both in human-readable and machine-readable form, e.g. like
this:

    Based on my "[PATCH 00/22] qapi: Remove simple unions from the schema
    language".

    Based-on: Message-Id: <20210913123932.3306639-1-armbru@redhat.com>

> @@ -5,6 +5,17 @@
>  # = CXL devices
>  ##
>  
> +##
> +# @cxl-inject-poison:
> +#
> +# @path: CXL type 3 device canonical QOM path
> +#
> +# @start: Start address
> +# @length: Length of poison to inject

Either separate all the arguments with blank lines, or none.

> +##
> +{ 'command': 'cxl-inject-poison',
> +  'data': { 'path': 'str', 'start': 'uint64', 'length': 'uint64' }}
> +
>  ##
>  # @CxlUncorErrorType:
>  #

Both commit message and doc comment are rather terse.

The commit message should make the case for the feature: why do we want
it?  This typically involves explaining the problem(s) it solves.

The doc comment ideally explains intended use.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH v2 1/3] hw/cxl: QMP based poison injection support
  2023-02-01 12:14   ` Markus Armbruster
@ 2023-02-01 14:45       ` Jonathan Cameron via
  0 siblings, 0 replies; 12+ messages in thread
From: Jonathan Cameron @ 2023-02-01 14:45 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Jonathan Cameron via, Michael Tsirkin, Ben Widawsky, linux-cxl,
	linuxarm, Ira Weiny, Alison Schofield

On Wed, 01 Feb 2023 13:14:06 +0100
Markus Armbruster <armbru@redhat.com> wrote:

> Jonathan Cameron via <qemu-devel@nongnu.org> writes:
> 
> > Inject poison using qmp command cxl-inject-poison to add an entry to the
> > poison list.
> >
> > For now, the poison is not returned CXL.mem reads, but only via the
> > mailbox command Get Poison List.
> >
> > See CXL rev 3.0, sec 8.2.9.8.4.1 Get Poison list (Opcode 4300h)
> >
> > Kernel patches to use this interface here:
> > https://lore.kernel.org/linux-cxl/cover.1665606782.git.alison.schofield@intel.com/
> >
> > To inject poison using qmp (telnet to the qmp port)
> > { "execute": "qmp_capabilities" }
> >
> > { "execute": "cxl-inject-poison",
> >     "arguments": {
> >          "path": "/machine/peripheral/cxl-pmem0",
> >          "start": 2048,
> >          "length": 256
> >     }
> > }
> >
> > Adjusted to select a device on your machine.
> >
> > Note that the poison list supported is kept short enough to avoid the
> > complexity of state machine that is needed to handle the MORE flag.
> >
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>  
> 
> [...]
> 
> > diff --git a/qapi/cxl.json b/qapi/cxl.json
> > index 3c18556ee8..5b995db255 100644
> > --- a/qapi/cxl.json
> > +++ b/qapi/cxl.json  
> 
> There is no qapi/cxl.json in current master.  So this must be based on
> some other patch(es).  Please point to it in the cover letter.  I like
> to point both in human-readable and machine-readable form, e.g. like
> this:
> 
>     Based on my "[PATCH 00/22] qapi: Remove simple unions from the schema
>     language".
> 
>     Based-on: Message-Id: <20210913123932.3306639-1-armbru@redhat.com>

Good point. I missed it in this series beyond a general reference to 'lots'.
Based on "[PATCH 0/2] hw/mem: CXL Type-3 Volatile Memory Support"
(which isn't mine)
Based-on: Message-ID: <20230131163847.23025-1-Jonathan.Cameron@huawei.com>

(and all the things that series does say it's based on)
What matters here is mostly in final patch of.
b) https://lore.kernel.org/linux-cxl/20230130155251.3430-1-Jonathan.Cameron@huawei.com/
   [PATCH v3 0/8] hw/cxl: RAS error emulation and injection
If you have time to look at that it would be appreciated as it's more
complex QMP usage than in here.

Sorry about that - I'll also start using your suggested format.

> 
> > @@ -5,6 +5,17 @@
> >  # = CXL devices
> >  ##
> >  
> > +##
> > +# @cxl-inject-poison:
> > +#
> > +# @path: CXL type 3 device canonical QOM path
> > +#
> > +# @start: Start address
> > +# @length: Length of poison to inject  
> 
> Either separate all the arguments with blank lines, or none.
> 
> > +##
> > +{ 'command': 'cxl-inject-poison',
> > +  'data': { 'path': 'str', 'start': 'uint64', 'length': 'uint64' }}
> > +
> >  ##
> >  # @CxlUncorErrorType:
> >  #  
> 
> Both commit message and doc comment are rather terse.
> 
> The commit message should make the case for the feature: why do we want
> it?  This typically involves explaining the problem(s) it solves.
> 
> The doc comment ideally explains intended use.

OK. I'll expand on this. It'll be a bit of fuzzy text that
boils down to we emulate so we can test the OS does the right thing
when it gets poison related events. I can add some generic fluff on
why a real device might implement this in the first place though
I'm not sure that will even matter to anyone reading these docs.

> 
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH v2 1/3] hw/cxl: QMP based poison injection support
@ 2023-02-01 14:45       ` Jonathan Cameron via
  0 siblings, 0 replies; 12+ messages in thread
From: Jonathan Cameron via @ 2023-02-01 14:45 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Jonathan Cameron via, Michael Tsirkin, Ben Widawsky, linux-cxl,
	linuxarm, Ira Weiny, Alison Schofield

On Wed, 01 Feb 2023 13:14:06 +0100
Markus Armbruster <armbru@redhat.com> wrote:

> Jonathan Cameron via <qemu-devel@nongnu.org> writes:
> 
> > Inject poison using qmp command cxl-inject-poison to add an entry to the
> > poison list.
> >
> > For now, the poison is not returned CXL.mem reads, but only via the
> > mailbox command Get Poison List.
> >
> > See CXL rev 3.0, sec 8.2.9.8.4.1 Get Poison list (Opcode 4300h)
> >
> > Kernel patches to use this interface here:
> > https://lore.kernel.org/linux-cxl/cover.1665606782.git.alison.schofield@intel.com/
> >
> > To inject poison using qmp (telnet to the qmp port)
> > { "execute": "qmp_capabilities" }
> >
> > { "execute": "cxl-inject-poison",
> >     "arguments": {
> >          "path": "/machine/peripheral/cxl-pmem0",
> >          "start": 2048,
> >          "length": 256
> >     }
> > }
> >
> > Adjusted to select a device on your machine.
> >
> > Note that the poison list supported is kept short enough to avoid the
> > complexity of state machine that is needed to handle the MORE flag.
> >
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>  
> 
> [...]
> 
> > diff --git a/qapi/cxl.json b/qapi/cxl.json
> > index 3c18556ee8..5b995db255 100644
> > --- a/qapi/cxl.json
> > +++ b/qapi/cxl.json  
> 
> There is no qapi/cxl.json in current master.  So this must be based on
> some other patch(es).  Please point to it in the cover letter.  I like
> to point both in human-readable and machine-readable form, e.g. like
> this:
> 
>     Based on my "[PATCH 00/22] qapi: Remove simple unions from the schema
>     language".
> 
>     Based-on: Message-Id: <20210913123932.3306639-1-armbru@redhat.com>

Good point. I missed it in this series beyond a general reference to 'lots'.
Based on "[PATCH 0/2] hw/mem: CXL Type-3 Volatile Memory Support"
(which isn't mine)
Based-on: Message-ID: <20230131163847.23025-1-Jonathan.Cameron@huawei.com>

(and all the things that series does say it's based on)
What matters here is mostly in final patch of.
b) https://lore.kernel.org/linux-cxl/20230130155251.3430-1-Jonathan.Cameron@huawei.com/
   [PATCH v3 0/8] hw/cxl: RAS error emulation and injection
If you have time to look at that it would be appreciated as it's more
complex QMP usage than in here.

Sorry about that - I'll also start using your suggested format.

> 
> > @@ -5,6 +5,17 @@
> >  # = CXL devices
> >  ##
> >  
> > +##
> > +# @cxl-inject-poison:
> > +#
> > +# @path: CXL type 3 device canonical QOM path
> > +#
> > +# @start: Start address
> > +# @length: Length of poison to inject  
> 
> Either separate all the arguments with blank lines, or none.
> 
> > +##
> > +{ 'command': 'cxl-inject-poison',
> > +  'data': { 'path': 'str', 'start': 'uint64', 'length': 'uint64' }}
> > +
> >  ##
> >  # @CxlUncorErrorType:
> >  #  
> 
> Both commit message and doc comment are rather terse.
> 
> The commit message should make the case for the feature: why do we want
> it?  This typically involves explaining the problem(s) it solves.
> 
> The doc comment ideally explains intended use.

OK. I'll expand on this. It'll be a bit of fuzzy text that
boils down to we emulate so we can test the OS does the right thing
when it gets poison related events. I can add some generic fluff on
why a real device might implement this in the first place though
I'm not sure that will even matter to anyone reading these docs.

> 
> 



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH v2 1/3] hw/cxl: QMP based poison injection support
  2023-02-01 14:45       ` Jonathan Cameron via
  (?)
@ 2023-02-01 16:10       ` Markus Armbruster
  -1 siblings, 0 replies; 12+ messages in thread
From: Markus Armbruster @ 2023-02-01 16:10 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Jonathan Cameron via, Michael Tsirkin, Ben Widawsky, linux-cxl,
	linuxarm, Ira Weiny, Alison Schofield

Jonathan Cameron <Jonathan.Cameron@Huawei.com> writes:

> On Wed, 01 Feb 2023 13:14:06 +0100
> Markus Armbruster <armbru@redhat.com> wrote:

[...]

>> Both commit message and doc comment are rather terse.
>> 
>> The commit message should make the case for the feature: why do we want
>> it?  This typically involves explaining the problem(s) it solves.
>> 
>> The doc comment ideally explains intended use.
>
> OK. I'll expand on this. It'll be a bit of fuzzy text that
> boils down to we emulate so we can test the OS does the right thing
> when it gets poison related events. I can add some generic fluff on
> why a real device might implement this in the first place though
> I'm not sure that will even matter to anyone reading these docs.

Use your judgement :)


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-02-01 16:11 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-01 10:03 [RFC PATCH v2 0/3] hw/cxl: Poison get, inject, clear Jonathan Cameron
2023-02-01 10:03 ` Jonathan Cameron via
2023-02-01 10:03 ` [RFC PATCH v2 1/3] hw/cxl: QMP based poison injection support Jonathan Cameron
2023-02-01 10:03   ` Jonathan Cameron via
2023-02-01 12:14   ` Markus Armbruster
2023-02-01 14:45     ` Jonathan Cameron
2023-02-01 14:45       ` Jonathan Cameron via
2023-02-01 16:10       ` Markus Armbruster
2023-02-01 10:03 ` [RFC PATCH v2 2/3] hw/cxl: Add poison injection via the mailbox Jonathan Cameron
2023-02-01 10:03   ` Jonathan Cameron via
2023-02-01 10:03 ` [RFC PATCH v2 3/3] hw/cxl: Add clear poison mailbox command support Jonathan Cameron
2023-02-01 10:03   ` Jonathan Cameron via

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.