qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/4] hw/cxl/ + /hw/pci/: PCI DOE + CXL CDAT emulation
@ 2021-02-01 15:16 Jonathan Cameron
  2021-02-01 15:16 ` [RFC PATCH 1/4] include/standard-headers/linux/pci_regs: temp hack to add necessary DOE definitions Jonathan Cameron
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Jonathan Cameron @ 2021-02-01 15:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Ben Widawsky, Michael S . Tsirkin, Jonathan Cameron,
	Vishal Verma, f.fangjian, Chris Browy, f4bug, linuxarm, jcm,
	Prashant V Agarwal, Igor Mammedov, Dan Williams

Whilst I know others are working on an implementation of at least some of
this, a desire to work on the kernel user of this required an
implementation so I threw this together in the meantime and am sending
it out on the basis that it may be of use to others.

As it is an RFC I have been lazy on some cleanup and error handling.
Will fix that in a future version if we decide to take this forwards.

Based on Ben's cxl-2.0v3 branch from https://gitlab.com/bwidawsk/qemu
Possible I've picked an unstable branch so this may or may not currently
apply :)  I'll rebase on Ben's next version when avaialble.

Jonathan Cameron (4):
  include/standard-headers/linux/pci_regs: temp hack to add necessary
    DOE definitions.
  hw/pci/pcie_doe: Introduce utility functions for PCIe DOE
  hw/cxl/cxl-cdat: Initial CDAT implementation for use by CXL devices
  hw/mem/cxl_type3: Enabled DOE mailbox for access to CDAT

 hw/cxl/cxl-cdat.c                         | 252 +++++++++++++++++++++
 hw/cxl/meson.build                        |   1 +
 hw/mem/cxl_type3.c                        |  49 ++++-
 hw/pci/meson.build                        |   2 +-
 hw/pci/pcie_doe.c                         | 257 ++++++++++++++++++++++
 include/hw/cxl/cxl.h                      |   1 +
 include/hw/cxl/cxl_cdat.h                 | 101 +++++++++
 include/hw/pci/doe.h                      |  40 ++++
 include/hw/pci/pci_ids.h                  |   2 +
 include/standard-headers/linux/pci_regs.h |  33 ++-
 10 files changed, 734 insertions(+), 4 deletions(-)
 create mode 100644 hw/cxl/cxl-cdat.c
 create mode 100644 hw/pci/pcie_doe.c
 create mode 100644 include/hw/cxl/cxl_cdat.h
 create mode 100644 include/hw/pci/doe.h

-- 
2.19.1



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [RFC PATCH 1/4] include/standard-headers/linux/pci_regs: temp hack to add necessary DOE definitions.
  2021-02-01 15:16 [RFC PATCH 0/4] hw/cxl/ + /hw/pci/: PCI DOE + CXL CDAT emulation Jonathan Cameron
@ 2021-02-01 15:16 ` Jonathan Cameron
  2021-02-02 15:39   ` Ben Widawsky
  2021-02-01 15:16 ` [RFC PATCH 2/4] hw/pci/pcie_doe: Introduce utility functions for PCIe DOE Jonathan Cameron
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Jonathan Cameron @ 2021-02-01 15:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Ben Widawsky, Michael S . Tsirkin, Jonathan Cameron,
	Vishal Verma, f.fangjian, Chris Browy, f4bug, linuxarm, jcm,
	Prashant V Agarwal, Igor Mammedov, Dan Williams

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 include/standard-headers/linux/pci_regs.h | 33 ++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
index e709ae8235..7e852d3dd0 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -730,7 +730,8 @@
 #define PCI_EXT_CAP_ID_DVSEC	0x23	/* Designated Vendor-Specific */
 #define PCI_EXT_CAP_ID_DLF	0x25	/* Data Link Feature */
 #define PCI_EXT_CAP_ID_PL_16GT	0x26	/* Physical Layer 16.0 GT/s */
-#define PCI_EXT_CAP_ID_MAX	PCI_EXT_CAP_ID_PL_16GT
+#define PCI_EXT_CAP_ID_DOE	0x2E	/* Data Object Exchange */
+#define PCI_EXT_CAP_ID_MAX	PCI_EXT_CAP_ID_DOE
 
 #define PCI_EXT_CAP_DSN_SIZEOF	12
 #define PCI_EXT_CAP_MCAST_ENDPOINT_SIZEOF 40
@@ -1092,4 +1093,34 @@
 #define  PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_MASK		0x000000F0
 #define  PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_SHIFT	4
 
+/* Data Object Exchange */
+#define PCI_DOE_CAP		0x04
+#define  PCI_DOE_CAP_INT_SUPPORT			0x00000001
+#define  PCI_DOE_CAP_INT_MSG_NUM			0x00000FFE
+
+#define PCI_DOE_CTRL		0x08
+#define  PCI_DOE_CTRL_DOE_ABORT				0x00000001
+#define  PCI_DOE_CTRL_DOE_INT_EN			0x00000002
+#define  PCI_DOE_CTRL_DOE_GO				0x80000000
+
+#define PCI_DOE_STATUS		0x0c
+#define  PCI_DOE_STATUS_DOE_BUSY			0x00000001
+#define  PCI_DOE_STATUS_INT_STATUS			0x00000002
+#define  PCI_DOE_STATUS_DOE_ERROR			0x00000004
+#define  PCI_DOE_STATUS_DATA_OBJECT_READY		0x80000000
+
+#define PCI_DOE_WRITE_MAILBOX	0x10
+#define PCI_DOE_READ_MAILBOX	0x14
+
+/* Data Object Format DOE ECN 6.xx.1 */
+#define PCI_DATA_OBJ_DW0_VID				0x0000ffff
+#define PCI_DATA_OBJ_DW0_TYPE				0x00ff0000
+#define PCI_DATA_OBJ_DW1_LEN				0x0003ffff
+
+/* DOE Discover Data Object */
+#define PCI_DOE_DIS_OBJ_TYPE	 0x1
+#define PCI_DOE_DIS_REQ_D0_DW0_INDEX			0x000000ff
+#define PCI_DOE_DIS_RSP_DO_DW0_VID			0x0000ffff
+#define PCI_DOE_DIS_RSP_D0_DW0_PROT			0x00ff0000
+#define PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX		0xff000000
 #endif /* LINUX_PCI_REGS_H */
-- 
2.19.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH 2/4] hw/pci/pcie_doe: Introduce utility functions for PCIe DOE
  2021-02-01 15:16 [RFC PATCH 0/4] hw/cxl/ + /hw/pci/: PCI DOE + CXL CDAT emulation Jonathan Cameron
  2021-02-01 15:16 ` [RFC PATCH 1/4] include/standard-headers/linux/pci_regs: temp hack to add necessary DOE definitions Jonathan Cameron
@ 2021-02-01 15:16 ` Jonathan Cameron
  2021-02-02 17:54   ` Ben Widawsky
  2021-02-01 15:16 ` [RFC PATCH 3/4] hw/cxl/cxl-cdat: Initial CDAT implementation for use by CXL devices Jonathan Cameron
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Jonathan Cameron @ 2021-02-01 15:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Ben Widawsky, Michael S . Tsirkin, Jonathan Cameron,
	Vishal Verma, f.fangjian, Chris Browy, f4bug, linuxarm, jcm,
	Prashant V Agarwal, Igor Mammedov, Dan Williams

This implements the ECN to the PCI 5.0 specification available at
https://members.pcisig.com/wg/PCI-SIG/document/14143

Does not currently support interrupts.

Note that currently no attempt is made to clean up allocated memory.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/pci/meson.build       |   2 +-
 hw/pci/pcie_doe.c        | 257 +++++++++++++++++++++++++++++++++++++++
 include/hw/pci/doe.h     |  40 ++++++
 include/hw/pci/pci_ids.h |   2 +
 4 files changed, 300 insertions(+), 1 deletion(-)

diff --git a/hw/pci/meson.build b/hw/pci/meson.build
index 5c4bbac817..7336620ee3 100644
--- a/hw/pci/meson.build
+++ b/hw/pci/meson.build
@@ -11,7 +11,7 @@ pci_ss.add(files(
 # The functions in these modules can be used by devices too.  Since we
 # allow plugging PCIe devices into PCI buses, include them even if
 # CONFIG_PCI_EXPRESS=n.
-pci_ss.add(files('pcie.c', 'pcie_aer.c'))
+pci_ss.add(files('pcie.c', 'pcie_aer.c',  'pcie_doe.c'))
 softmmu_ss.add(when: 'CONFIG_PCI_EXPRESS', if_true: files('pcie_port.c', 'pcie_host.c'))
 softmmu_ss.add_all(when: 'CONFIG_PCI', if_true: pci_ss)
 
diff --git a/hw/pci/pcie_doe.c b/hw/pci/pcie_doe.c
new file mode 100644
index 0000000000..8739c41280
--- /dev/null
+++ b/hw/pci/pcie_doe.c
@@ -0,0 +1,257 @@
+/*
+ * pcie_doe.c
+ * utility functions for pci express data object exchange introduced
+ * in PCI 5.0 Data Object Exchange (DOE) ECN
+ *
+ * Copyright (c) 2021 Jonathan Cameron <Jonathan.Cameron@huawei.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qemu/error-report.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/doe.h"
+#include "hw/qdev-properties.h"
+#include "qapi/error.h"
+#include "qemu/log.h"
+#include "qemu/module.h"
+#include "qemu/range.h"
+#include "qemu/rcu.h"
+#include "sysemu/hostmem.h"
+
+struct doe_handler {
+    uint16_t vendor_id;
+    uint8_t object_type;
+    doe_msg_handler_t handler;
+    void *priv;
+};
+
+static void doe_set_ctl(PCIEDOE *doe, uint32_t val)
+{
+    /* Abort */
+    if (val & PCI_DOE_CTRL_DOE_ABORT) {
+        doe->req_index = 0;
+        doe->rsp_index = 0;
+        doe->req_length = 0;
+        doe->error = false;
+        doe->data_object_ready = false;
+    }
+
+    if (val & PCI_DOE_CTRL_DOE_GO) {
+        GList *l;
+        uint16_t vendor_id = doe->store[0] & PCI_DATA_OBJ_DW0_VID;
+        uint8_t object_type = (doe->store[0] & PCI_DATA_OBJ_DW0_TYPE) >>
+            ctz32(PCI_DATA_OBJ_DW0_TYPE);
+        if ((doe->req_index != 3) || (doe->req_length != 3)) {
+            /*
+             * Not entirely clear what should happen if req_length is correct
+             * buf insufficient data has been received.
+             */
+            doe->error = true;
+            return;
+        }
+        /* Discovery protocol - DOE ECN */
+        if (vendor_id == PCI_VENDOR_ID_PCI_SIG &&
+            object_type == PCI_DOE_DIS_OBJ_TYPE) {
+            uint8_t index = doe->store[2] & PCI_DOE_DIS_REQ_D0_DW0_INDEX;
+            doe->store[1] = 3;
+            if (index == 0) {
+                /* First entry is this one, the discovery protocol itself */
+                uint8_t next;
+
+                if (doe->cb_list) {
+                    next = index + 1;
+                } else {
+                    next = 0;
+                }
+                doe->store[2] =
+                    (next << ctz32(PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX)) |
+                    (0 << ctz32(PCI_DOE_DIS_RSP_D0_DW0_PROT)) |
+                    0x0001;
+            } else {
+                /* Other entries based on register callbacks */
+                uint8_t next;
+                struct doe_handler *h;
+
+                h = g_list_nth_data(doe->cb_list, index - 1);
+                /*
+                 * Off end of list, Table 7-x4 in DOE ECN -
+                 * Vendor ID 0xFFFF if no more indices
+                 */
+                if (h == NULL) {
+                    doe->store[2] = 0xFFFF;
+                } else {
+                    if (g_list_nth(doe->cb_list, index)) {
+                        next = index + 1;
+                    } else {
+                        next = 0;
+                    }
+                    doe->store[2] =
+                        (next << ctz32(PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX)) |
+                        (h->object_type << ctz32(PCI_DOE_DIS_RSP_D0_DW0_PROT)) |
+                        h->vendor_id;
+                }
+            }
+            doe->data_object_ready = true;
+            doe->rsp_index = 0;
+        } else {
+            for (l = doe->cb_list; l != NULL; l = l->next) {
+                struct doe_handler *h = l->data;
+                if (h->vendor_id == vendor_id &&
+                    h->object_type == object_type) {
+                    int ret = h->handler(doe, h->vendor_id, h->object_type,
+                                         h->priv);
+                    if (ret) {
+                        /*
+                         * No response so as per 6.xx.1 in DOE ECN
+                         * "... within 1 second after the DOE Go bit was Set
+                         *  in the DOE Control register, otherwise the DOE
+                         *  instance must Set the DOE Error bit in the DOE
+                         *  Status register.."
+                         */
+                         doe->error = true;
+                        break;
+                    }
+                    doe->data_object_ready = true;
+                    doe->rsp_index = 0;
+                    break;
+                }
+            }
+            /* Comamnd not handled */
+            if (l == NULL) {
+                doe->error = true;
+            }
+        }
+        /* Reset input index to allow for a new message */
+        doe->req_index = 0;
+    }
+}
+
+static void doe_set_write_mailbox(PCIEDOE *doe, uint32_t val)
+{
+    if (doe->req_index == 1) {
+        if (val & 0x3FFFF) {
+            doe->req_length = val & PCI_DATA_OBJ_DW1_LEN;
+        } else {
+            doe->req_length = 1 << 18;
+        }
+    }
+    if (doe->req_length && doe->req_index == doe->req_length) {
+        /*
+         * 6.xx.1 Data Objects
+         * If the DW transferred does not match the indicated Length
+         * for a data object, then the data object must be
+         * silently discarded
+         */
+        return;
+    }
+    doe->store[doe->req_index] = val;
+    doe->req_index++;
+}
+
+static uint32_t doe_get_read_mailbox(PCIEDOE *doe)
+{
+    uint32_t val;
+
+    if (doe->rsp_index == 0) {
+        doe->rsp_length = doe->store[1] & PCI_DATA_OBJ_DW1_LEN;
+    }
+    if (!doe->data_object_ready) {
+        /* Underflow of the Read Data Mailbox Mechanism */
+        doe->error = true;
+        return 0;
+    }
+
+    val = doe->store[doe->rsp_index];
+    doe->rsp_index++;
+    if (doe->rsp_index == doe->rsp_length) {
+        doe->rsp_index = -1;
+        doe->data_object_ready = false;
+    }
+
+    return val;
+}
+
+static uint32_t doe_get_status(PCIEDOE *doe)
+{
+    uint32_t val = 0;
+
+    if (doe->busy) {
+        val |= PCI_DOE_STATUS_DOE_BUSY;
+    }
+    /* bit 1: interrupt not yet supported */
+    if (doe->error) {
+        val |= PCI_DOE_STATUS_DOE_ERROR;
+    }
+    if (doe->data_object_ready) {
+        val |= PCI_DOE_STATUS_DATA_OBJECT_READY;
+    }
+
+    return val;
+}
+
+void doe_add_message_handler(PCIEDOE *doe, uint16_t vendor_id,
+                             uint8_t object_type,
+                             const doe_msg_handler_t handler, void *priv)
+{
+    struct doe_handler *h = g_malloc0(sizeof(*handler));
+
+    h->vendor_id = vendor_id;
+    h->object_type = object_type;
+    h->handler = handler;
+    h->priv = priv;
+    doe->cb_list = g_list_append(doe->cb_list, h);
+}
+
+uint32_t pcie_doe_ecap(PCIEDOE *doe, PCIDevice *d, uint16_t offset)
+{
+    doe->doe_base = offset;
+    /* Length field is 18 bits and is in dwords */
+    doe->store = g_malloc0((1 << 18) * sizeof(uint32_t));
+
+    pcie_add_capability(d, PCI_EXT_CAP_ID_DOE, 1, offset, 0x18);
+    offset += 0x18;
+
+    return offset;
+}
+
+void pcie_doe_write(PCIEDOE *doe, uint32_t addr, uint32_t val, int len)
+{
+    if (len != 4) {
+        return;
+    }
+
+    switch (addr - doe->doe_base) {
+    case PCI_DOE_CTRL:
+        doe_set_ctl(doe, val);
+        break;
+    case PCI_DOE_WRITE_MAILBOX:
+        doe_set_write_mailbox(doe, val);
+        break;
+    }
+}
+
+uint32_t pcie_doe_read(PCIEDOE *doe, uint32_t addr, int len, int *found)
+{
+    if (len != 4) {
+        *found = 0;
+        return 0;
+    }
+
+    *found = 1;
+    switch (addr - doe->doe_base) {
+    case PCI_DOE_CAP:
+        return 0; /* No interrupt support */
+    case PCI_DOE_STATUS:
+        return doe_get_status(doe);
+    case PCI_DOE_READ_MAILBOX:
+        return doe_get_read_mailbox(doe);
+    default:
+        *found = 0;
+        return 0;
+    }
+}
+
diff --git a/include/hw/pci/doe.h b/include/hw/pci/doe.h
new file mode 100644
index 0000000000..364c866c53
--- /dev/null
+++ b/include/hw/pci/doe.h
@@ -0,0 +1,40 @@
+/*
+ * PCIE DOE emulation.
+ *
+ * Copyright (c) 2021 Jonathan Cameron <Jonathan.Cameron@huawei.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_PCIE_DOE_H_
+#define QEMU_PCIE_DOE_H_
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "hw/qdev-properties.h"
+#include "qapi/error.h"
+#include "qemu/module.h"
+
+typedef struct pcie_doe {
+    uint32_t doe_base;
+    GList *cb_list;
+    int req_index;
+    int req_length;
+    int rsp_index;
+    int rsp_length;
+    bool data_object_ready;
+    bool error;
+    bool busy;
+    uint32_t *store;
+} PCIEDOE;
+
+typedef int (*doe_msg_handler_t)(PCIEDOE *doe, uint16_t vendor_id,
+                                 uint8_t object_type, void *priv);
+
+uint32_t pcie_doe_ecap(PCIEDOE *doe, PCIDevice *d, uint16_t offset);
+void doe_add_message_handler(PCIEDOE *doe, uint16_t vendor_id,
+                             uint8_t object_type,
+                             const doe_msg_handler_t handler, void *priv);
+uint32_t pcie_doe_read(PCIEDOE *doe, uint32_t addr, int len, int *found);
+void pcie_doe_write(PCIEDOE *doe, uint32_t addr, uint32_t val, int len);
+#endif
diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
index 76bf3ed590..636b2e8017 100644
--- a/include/hw/pci/pci_ids.h
+++ b/include/hw/pci/pci_ids.h
@@ -157,6 +157,8 @@
 
 /* Vendors and devices.  Sort key: vendor first, device next. */
 
+#define PCI_VENDOR_ID_PCI_SIG            0x0001
+
 #define PCI_VENDOR_ID_LSI_LOGIC          0x1000
 #define PCI_DEVICE_ID_LSI_53C810         0x0001
 #define PCI_DEVICE_ID_LSI_53C895A        0x0012
-- 
2.19.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH 3/4] hw/cxl/cxl-cdat: Initial CDAT implementation for use by CXL devices
  2021-02-01 15:16 [RFC PATCH 0/4] hw/cxl/ + /hw/pci/: PCI DOE + CXL CDAT emulation Jonathan Cameron
  2021-02-01 15:16 ` [RFC PATCH 1/4] include/standard-headers/linux/pci_regs: temp hack to add necessary DOE definitions Jonathan Cameron
  2021-02-01 15:16 ` [RFC PATCH 2/4] hw/pci/pcie_doe: Introduce utility functions for PCIe DOE Jonathan Cameron
@ 2021-02-01 15:16 ` Jonathan Cameron
  2021-02-02 18:49   ` Ben Widawsky
  2021-02-01 15:16 ` [RFC PATCH 4/4] hw/mem/cxl_type3: Enabled DOE mailbox for access to CDAT Jonathan Cameron
  2021-02-03 17:32 ` [RFC PATCH 0/4] hw/cxl/ + /hw/pci/: PCI DOE + CXL CDAT emulation Jonathan Cameron
  4 siblings, 1 reply; 12+ messages in thread
From: Jonathan Cameron @ 2021-02-01 15:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Ben Widawsky, Michael S . Tsirkin, Jonathan Cameron,
	Vishal Verma, f.fangjian, Chris Browy, f4bug, linuxarm, jcm,
	Prashant V Agarwal, Igor Mammedov, Dan Williams

CDAT is an ACPI like format defined by the CXL consortium. It is
available from

https://www.uefi.org/node/4093

Here support for managing all the entires is introduced, along with
an implementation of a callback for a DOE mailbox which may be
used to read these values from CXL hardware by either firmware or
an OS.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/cxl/cxl-cdat.c         | 252 ++++++++++++++++++++++++++++++++++++++
 hw/cxl/meson.build        |   1 +
 include/hw/cxl/cxl_cdat.h | 101 +++++++++++++++
 3 files changed, 354 insertions(+)

diff --git a/hw/cxl/cxl-cdat.c b/hw/cxl/cxl-cdat.c
new file mode 100644
index 0000000000..6ed4c15cc0
--- /dev/null
+++ b/hw/cxl/cxl-cdat.c
@@ -0,0 +1,252 @@
+/*
+ * Support for CDAT entires as defined in
+ * Coherent Device Attribute Table (CDAT) Specification rev 1.02
+ * Available from uefi.org.
+ *
+ * Copyright (c) 2021 Jonathan Cameron <Jonathan.Cameron@huawei.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qemu/error-report.h"
+#include "hw/mem/memory-device.h"
+#include "hw/mem/pc-dimm.h"
+#include "hw/pci/pci.h"
+#include "hw/pci/doe.h"
+#include "hw/qdev-properties.h"
+#include "qapi/error.h"
+#include "qemu/log.h"
+#include "qemu/module.h"
+#include "qemu/range.h"
+#include "qemu/rcu.h"
+#include "sysemu/hostmem.h"
+#include "sysemu/numa.h"
+#include "hw/cxl/cxl.h"
+
+void cdat_add_dsmas(CXLCDAT *cdat, uint8_t dsmad_handle, uint8_t flags,
+                    uint64_t dpa_base, uint64_t dpa_length)
+{
+    struct cxl_cdat_dsmas *dsmas = g_malloc0(sizeof(*dsmas));
+    dsmas->dsmad_handle = dsmad_handle;
+    dsmas->flags = flags;
+    dsmas->dpa_base = dpa_base;
+    dsmas->dpa_length = dpa_length;
+    cdat->dsmas_list = g_list_append(cdat->dsmas_list, dsmas);
+}
+
+void cdat_add_dslbis(CXLCDAT *cdat, uint8_t handle, uint8_t flags,
+                     uint8_t data_type, uint64_t base_unit,
+                     uint16_t entry0, uint16_t entry1, uint16_t entry2)
+{
+    struct cxl_cdat_dslbis *dslbis = g_malloc0(sizeof(*dslbis));
+    dslbis->handle = handle;
+    dslbis->flags = flags;
+    dslbis->data_type = data_type;
+    dslbis->entry_base_unit = base_unit;
+    dslbis->entries[0] = entry0;
+    dslbis->entries[1] = entry1;
+    dslbis->entries[2] = entry2;
+    cdat->dslbis_list = g_list_append(cdat->dslbis_list, dslbis);
+}
+
+void cdat_add_dsmscis(CXLCDAT *cdat, uint8_t dsmas_handle,
+                      uint64_t memory_sc_size, uint64_t cache_attrs)
+{
+    struct cxl_cdat_dsmscis *dsmscis = g_malloc(sizeof(*dsmscis));
+    dsmscis->dsmas_handle = dsmas_handle;
+    dsmscis->memory_side_cache_size = memory_sc_size;
+    dsmscis->cache_attributes = cache_attrs;
+    cdat->dsmscis_list = g_list_append(cdat->dsmscis_list, dsmscis);
+}
+
+void cdat_add_dsis(CXLCDAT *cdat, uint8_t flags, uint8_t handle)
+{
+    struct cxl_cdat_dsis *dsis = g_malloc(sizeof(*dsis));
+    dsis->flags = flags;
+    dsis->handle = handle;
+    cdat->dsis_list = g_list_append(cdat->dsis_list, dsis);
+}
+
+void cdat_add_dsemts(CXLCDAT *cdat, uint8_t dsmas_handle,
+                     uint8_t efi_mem_type_attr, uint64_t dpa_offset,
+                     uint64_t dpa_length)
+{
+    struct cxl_cdat_dsemts *dsemts = g_malloc(sizeof(*dsemts));
+    dsemts->dsmas_handle = dsmas_handle;
+    dsemts->efi_mem_type_attr = efi_mem_type_attr;
+    dsemts->dpa_offset = dpa_offset;
+    dsemts->dpa_length = dpa_length;
+    cdat->dsemts_list = g_list_append(cdat->dsemts_list, dsemts);
+}
+
+struct cxl_cdat_sslbis *cdat_add_sslbis(CXLCDAT *cdat, uint8_t num_entries,
+                                        uint8_t data_type, uint64_t base_unit)
+{
+    struct cxl_cdat_sslbis *sslbis =
+        g_malloc(sizeof(*sslbis) + num_entries * sizeof(sslbis->entries[0]));
+    sslbis->num_entries = num_entries;
+    sslbis->data_type = data_type;
+    sslbis->base_unit = base_unit;
+    cdat->sslbis_list = g_list_append(cdat->sslbis_list, sslbis);
+    return sslbis;
+}
+
+int cdata_sslbis_set_entry(struct cxl_cdat_sslbis *sslbis, uint8_t index,
+                           uint16_t portx, uint16_t porty, uint16_t val)
+{
+    struct cxl_cdat_sslbis_entry *entry;
+    if (index >= sslbis->num_entries) {
+        return -1;
+    }
+    entry = &sslbis->entries[index];
+    entry->port_x_id = portx;
+    entry->port_y_id = porty;
+    entry->val = val;
+    return 0;
+}
+
+int cxl_table_access(PCIEDOE *doe, uint16_t vendor_id, uint8_t object_type,
+                     void *priv)
+{
+    uint8_t table_type;
+    int total_entries;
+    uint16_t entry_handle;
+    CXLCDAT *cdat = priv;
+    uint16_t next_entry;
+
+    if (doe->req_length != 3) {
+        /* optional error for unexpected command length */
+        return -1;
+    }
+
+    if ((doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_RCODE) != 0) {
+        /*
+         * Only table access code currently supported.
+         * Error indication is lack of Data Object Ready
+         */
+        return -1;
+    }
+
+    table_type = (doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_TYPE) >>
+        ctz32(CXL_DOE_TABLE_ACCESS_DW2_TYPE);
+    if (table_type != 0) {
+        /* Unsuported table ID so just don't set Data Object Ready */
+        return -1;
+    }
+    entry_handle = (doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE) >>
+        ctz32(CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE);
+
+    /* Assume entry handle == CDAT structure index */
+    total_entries = g_list_length(cdat->dsmas_list) +
+        g_list_length(cdat->dslbis_list) +
+        g_list_length(cdat->dsmscis_list) +
+        g_list_length(cdat->dsis_list) +
+        g_list_length(cdat->dsemts_list) +
+        g_list_length(cdat->sslbis_list);
+    if (entry_handle + 1 == total_entries) {
+        next_entry = 0xFFFF;
+    } else {
+        next_entry = entry_handle + 1;
+    }
+
+    if (entry_handle < g_list_length(cdat->dsmas_list)) {
+        const int dsmas_len = 24;
+        struct cxl_cdat_dsmas *dsmas =
+            g_list_nth_data(cdat->dsmas_list, entry_handle);
+
+        doe->store[1] = 3 + dsmas_len / sizeof(uint32_t);
+        doe->store[2] = next_entry << 16;
+        doe->store[3] = (dsmas_len << 16) | CXL_CDAT_DSMAS_TYPE;
+
+        /* cxl version of proximity domain */
+        doe->store[4] = dsmas->dsmad_handle;
+        /* flags in 2nd byte of [4] (bit 2 is non volatile) */
+        doe->store[4] |= (dsmas->flags << 8);
+        doe->store[5] = dsmas->dpa_base & 0xFFFFFFFF;
+        doe->store[6] = (dsmas->dpa_base >> 32) & 0xFFFFFFFF;
+        doe->store[7] = dsmas->dpa_length & 0xFFFFFFFF;
+        doe->store[8] = (dsmas->dpa_length >> 32) & 0xFFFFFFFF;
+        return 0;
+    }
+    entry_handle -= g_list_length(cdat->dsmas_list);
+    if (entry_handle < g_list_length(cdat->dslbis_list)) {
+        const int dslbis_len = 24;
+        struct cxl_cdat_dslbis *dslbis =
+            g_list_nth_data(cdat->dslbis_list, entry_handle);
+
+        doe->store[1] = 3 + dslbis_len / sizeof(uint32_t);
+        doe->store[2] = next_entry << 16;
+        doe->store[3] = (dslbis_len << 16) | CXL_CDAT_DSLBIS_TYPE;
+
+        doe->store[4] = (dslbis->data_type << 24) | (dslbis->flags << 8) |
+            dslbis->handle;
+        doe->store[5] = dslbis->entry_base_unit & 0xFFFFFFFF;
+        doe->store[6] = (dslbis->entry_base_unit >> 32) & 0xFFFFFFFF;
+        doe->store[7] = (dslbis->entries[1] << 16) | dslbis->entries[0];
+        doe->store[8] = dslbis->entries[2];
+        return 0;
+    }
+    entry_handle -= g_list_length(cdat->dslbis_list);
+    if (entry_handle < g_list_length(cdat->dsmscis_list)) {
+        const int dsmscis_len = 20;
+        struct cxl_cdat_dsmscis *dsmscis =
+            g_list_nth_data(cdat->dsmscis_list, entry_handle);
+
+        doe->store[1] = 3 + dsmscis_len / sizeof(uint32_t);
+        doe->store[2] = next_entry << 16;
+        doe->store[3] = (dsmscis_len << 16) | CXL_CDAT_DSMSCIS_TYPE;
+        doe->store[4] = dsmscis->dsmas_handle;
+        doe->store[5] = dsmscis->memory_side_cache_size & 0xffffffff;
+        doe->store[6] = (dsmscis->memory_side_cache_size >> 32) & 0xffffffff;
+        doe->store[7] = dsmscis->cache_attributes;
+    }
+    entry_handle -= g_list_length(cdat->dsmscis_list);
+    if (entry_handle < g_list_length(cdat->dsis_list)) {
+        const int dsis_len = 8;
+        struct cxl_cdat_dsis *dsis =
+            g_list_nth_data(cdat->dsis_list, entry_handle);
+
+        doe->store[1] = 3 + dsis_len / sizeof(uint32_t);
+        doe->store[2] = next_entry << 16;
+        doe->store[3] = (dsis_len << 16) | CXL_CDAT_DSMSCIS_TYPE;
+        doe->store[4] = (dsis->handle << 8) | dsis->flags;
+    }
+    entry_handle -= g_list_length(cdat->dsis_list);
+    if (entry_handle < g_list_length(cdat->dsemts_list)) {
+        const int dsemts_len = 24;
+        struct cxl_cdat_dsemts *dsemts =
+            g_list_nth_data(cdat->dsemts_list, entry_handle);
+
+        doe->store[1] = 3 + dsemts_len / sizeof(uint32_t);
+        doe->store[2] = next_entry << 16;
+        doe->store[3] = (dsemts_len << 16) | CXL_CDAT_DSEMTS_TYPE;
+        doe->store[4] = (dsemts->efi_mem_type_attr << 8) | dsemts->dsmas_handle;
+        doe->store[5] = dsemts->dpa_offset & 0xffffffff;
+        doe->store[6] = (dsemts->dpa_offset >> 32) & 0xffffffff;
+        doe->store[7] = dsemts->dpa_length & 0xffffffff;
+        doe->store[8] = (dsemts->dpa_length >> 32) & 0xffffffff;
+    }
+    entry_handle -= g_list_length(cdat->dsemts_list);
+    if (entry_handle < g_list_length(cdat->sslbis_list)) {
+        struct cxl_cdat_sslbis *sslbis =
+            g_list_nth_data(cdat->sslbis_list, entry_handle);
+        int sslbis_len = 16 + 8 * sslbis->num_entries;
+        int i;
+
+        doe->store[1] = 3 + sslbis_len / sizeof(uint32_t);
+        doe->store[2] = next_entry << 16;
+        doe->store[3] = (sslbis_len << 16) | CXL_CDAT_SSLBIS_TYPE;
+        doe->store[4] = sslbis->data_type;
+        doe->store[5] = sslbis->base_unit & 0xffffffff;
+        doe->store[6] = (sslbis->base_unit >> 32) & 0xffffffff;
+        for (i = 0; i < sslbis->num_entries; i++) {
+            doe->store[7 + i * 2] = (sslbis->entries[i].port_y_id << 8) |
+                sslbis->entries[i].port_x_id;
+            doe->store[7 + i * 2 + 1] = sslbis->entries[i].val;
+        }
+    }
+
+    return -1;
+}
diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
index 0eca715d10..9e2e5f4094 100644
--- a/hw/cxl/meson.build
+++ b/hw/cxl/meson.build
@@ -2,4 +2,5 @@ softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
   'cxl-component-utils.c',
   'cxl-device-utils.c',
   'cxl-mailbox-utils.c',
+  'cxl-cdat.c',
 ))
diff --git a/include/hw/cxl/cxl_cdat.h b/include/hw/cxl/cxl_cdat.h
new file mode 100644
index 0000000000..d0226c463c
--- /dev/null
+++ b/include/hw/cxl/cxl_cdat.h
@@ -0,0 +1,101 @@
+/*
+ * Support for CDAT entires as defined in
+ * Coherent Device Attribute Table (CDAT) Specification rev 1.02
+ * Available from uefi.org.
+ *
+ * Copyright (c) 2021 Jonathan Cameron <Jonathan.Cameron@huawei.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#ifndef QEMU_CXL_CDAT_H
+#define QEMU_CXL_CDAT_H
+#include "hw/pci/doe.h"
+/* DW2 is common to the request and response */
+#define CXL_DOE_TABLE_ACCESS_DW2_RCODE          0x000000ff
+#define CXL_DOE_TABLE_ACCESS_DW2_TYPE           0x0000ff00
+#define CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE    0xffff0000
+
+#define CXL_CDAT_DSMAS_TYPE     0
+#define CXL_CDAT_DSLBIS_TYPE    1
+#define CXL_CDAT_DSMSCIS_TYPE   2
+#define CXL_CDAT_DSIS_TYPE      3
+#define CXL_CDAT_DSEMTS_TYPE    4
+#define CXL_CDAT_SSLBIS_TYPE    5
+
+struct cxl_cdat_dsmas {
+    uint8_t dsmad_handle;
+    uint8_t flags;
+#define CDAT_DSMAS_FLAG_NV (1 << 2)
+    uint64_t dpa_base;
+    uint64_t dpa_length;
+};
+
+struct cxl_cdat_dslbis {
+    uint8_t handle;
+    uint8_t flags;
+    uint8_t data_type;
+    uint64_t entry_base_unit;
+    uint16_t entries[3]; /* 6 bytes */
+};
+
+struct cxl_cdat_dsmscis {
+    uint8_t dsmas_handle;
+    uint64_t memory_side_cache_size;
+    uint32_t cache_attributes;
+};
+
+struct cxl_cdat_dsis {
+    uint8_t flags;
+#define CDAT_DSIS_MEMORY_ATTACHED 0x01
+    uint8_t handle;
+};
+
+struct cxl_cdat_dsemts {
+    uint8_t dsmas_handle;
+    uint8_t efi_mem_type_attr;
+    uint64_t dpa_offset;
+    uint64_t dpa_length;
+};
+
+struct cxl_cdat_sslbis_entry {
+    uint16_t port_x_id;
+    uint16_t port_y_id;
+    uint16_t val;
+};
+
+struct cxl_cdat_sslbis {
+    uint8_t num_entries; /* needed to compute length */
+    uint8_t data_type;
+    uint64_t base_unit;
+    struct cxl_cdat_sslbis_entry entries[];
+};
+
+typedef struct cxl_cdat {
+    GList *dsmas_list;
+    GList *dslbis_list;
+    GList *dsmscis_list;
+    GList *dsis_list;
+    GList *dsemts_list;
+    GList *sslbis_list;
+} CXLCDAT;
+
+void cdat_add_dsmas(CXLCDAT *cdat, uint8_t dsmad_handle, uint8_t flags,
+                    uint64_t dpa_base, uint64_t dpa_length);
+void cdat_add_dslbis(CXLCDAT *cdat, uint8_t handle, uint8_t flags,
+                     uint8_t data_type, uint64_t base_unit,
+                     uint16_t entry0, uint16_t entry1, uint16_t entry2);
+void cdat_add_dsmscis(CXLCDAT *cdat, uint8_t dsmas_handle,
+                      uint64_t memory_sc_size, uint64_t cache_attrs);
+void cdat_add_dsis(CXLCDAT *cdat, uint8_t flags, uint8_t handle);
+void cdat_add_dsemts(CXLCDAT *cdat, uint8_t dsmas_handle,
+                     uint8_t efi_mem_type_attr, uint64_t dpa_offset,
+                     uint64_t dpa_length);
+struct cxl_cdat_sslbis *cdat_add_sslbis(CXLCDAT *cdat, uint8_t num_entries,
+                                        uint8_t data_type, uint64_t base_unit);
+int cdata_sslbis_set_entry(struct cxl_cdat_sslbis *sslbis, uint8_t index,
+                           uint16_t portx, uint16_t pory, uint16_t val);
+
+int cxl_table_access(PCIEDOE *doe, uint16_t vendor_id, uint8_t object_type,
+                     void *priv);
+#endif
-- 
2.19.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH 4/4] hw/mem/cxl_type3: Enabled DOE mailbox for access to CDAT
  2021-02-01 15:16 [RFC PATCH 0/4] hw/cxl/ + /hw/pci/: PCI DOE + CXL CDAT emulation Jonathan Cameron
                   ` (2 preceding siblings ...)
  2021-02-01 15:16 ` [RFC PATCH 3/4] hw/cxl/cxl-cdat: Initial CDAT implementation for use by CXL devices Jonathan Cameron
@ 2021-02-01 15:16 ` Jonathan Cameron
  2021-02-03 17:32 ` [RFC PATCH 0/4] hw/cxl/ + /hw/pci/: PCI DOE + CXL CDAT emulation Jonathan Cameron
  4 siblings, 0 replies; 12+ messages in thread
From: Jonathan Cameron @ 2021-02-01 15:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Ben Widawsky, Michael S . Tsirkin, Jonathan Cameron,
	Vishal Verma, f.fangjian, Chris Browy, f4bug, linuxarm, jcm,
	Prashant V Agarwal, Igor Mammedov, Dan Williams

This enables basic read back of CDAT entries via a DOE mailbox in
the devices PCI config space.

Note that for now a few possible entries are provided that
don't make a lot of sense for current instances of this device.
It will need to be made somewhat dynamic and possibly require
additional command line parameters.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 hw/mem/cxl_type3.c   | 49 ++++++++++++++++++++++++++++++++++++++++++--
 include/hw/cxl/cxl.h |  1 +
 2 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index dee5a8884b..3449f02090 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -4,6 +4,7 @@
 #include "hw/mem/memory-device.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/pci/pci.h"
+#include "hw/pci/doe.h"
 #include "hw/qdev-properties.h"
 #include "qapi/error.h"
 #include "qemu/log.h"
@@ -11,6 +12,7 @@
 #include "qemu/range.h"
 #include "qemu/rcu.h"
 #include "sysemu/hostmem.h"
+#include "sysemu/numa.h"
 #include "hw/cxl/cxl.h"
 
 typedef struct cxl_type3_dev {
@@ -21,6 +23,9 @@ typedef struct cxl_type3_dev {
     uint64_t size;
     HostMemoryBackend *hostmem;
 
+    PCIEDOE doe;
+    CXLCDAT cdat;
+
     /* State */
     CXLComponentState cxl_cstate;
     CXLDeviceState cxl_dstate;
@@ -28,7 +33,7 @@ typedef struct cxl_type3_dev {
 
 #define CT3(obj) OBJECT_CHECK(CXLType3Dev, (obj), TYPE_CXL_TYPE3_DEV)
 
-static void build_dvsecs(CXLType3Dev *ct3d)
+static int build_dvsecs(CXLType3Dev *ct3d)
 {
     CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
     uint8_t *dvsec;
@@ -55,6 +60,7 @@ static void build_dvsecs(CXLType3Dev *ct3d)
     };
     cxl_component_create_dvsec(cxl_cstate, REG_LOC_DVSEC_LENGTH, REG_LOC_DVSEC,
                                REG_LOC_DVSEC_REVID, dvsec);
+    return cxl_cstate->dvsec_offset;
 }
 
 static void hdm_decoder_commit(CXLType3Dev *ct3d, int which)
@@ -201,6 +207,7 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp)
     ComponentRegisters *regs = &cxl_cstate->crb;
     MemoryRegion *mr = &regs->component_registers;
     uint8_t *pci_conf = pci_dev->config;
+    int offset;
 
     if (!ct3d->cxl_dstate.pmem) {
         cxl_setup_memory(ct3d, errp);
@@ -213,7 +220,21 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp)
     cxl_cstate->dvsec_offset = 0x100;
 
     ct3d->cxl_cstate.pdev = pci_dev;
-    build_dvsecs(ct3d);
+    offset = build_dvsecs(ct3d);
+
+    /* Build DOE - no interrupt for now */
+    offset = pcie_doe_ecap(&ct3d->doe, pci_dev, offset);
+    doe_add_message_handler(&ct3d->doe, 0x1e98, 0x2, &cxl_table_access,
+                            &ct3d->cdat);
+
+    /* Store entires of CDAT  for read back from DOE */
+    cdat_add_dsmas(&ct3d->cdat, 0, CDAT_DSMAS_FLAG_NV, 0x3345000000,
+                   0x0001000000);
+    cdat_add_dsmas(&ct3d->cdat, 1, 0, 0x33460000000, 0x00010000000);
+    cdat_add_dslbis(&ct3d->cdat, 0, HMAT_LB_MEM_MEMORY,
+                    (uint8_t)HMAT_LB_DATA_ACCESS_LATENCY,
+                    1000, /* microsecs */
+                    90, 0, 0);
 
 #ifndef SET_PMEM_PADDR
     regs->special_ops = g_new0(MemoryRegionOps, 1);
@@ -287,6 +308,28 @@ static void pc_dimm_md_fill_device_info(const MemoryDeviceState *md,
     info->type = MEMORY_DEVICE_INFO_KIND_CXL;
 }
 
+static void ct3d_config_write(PCIDevice *pci_dev, uint32_t addr, uint32_t val, int len)
+{
+    CXLType3Dev *ct3d = CT3(pci_dev);
+
+    pci_default_write_config(pci_dev, addr, val, len);
+    pcie_doe_write(&ct3d->doe, addr, val, len);
+
+}
+
+static uint32_t ct3d_config_read(PCIDevice *pci_dev, uint32_t addr, int len)
+{
+    CXLType3Dev *ct3d = CT3(pci_dev);
+    int found = 0;
+    uint32_t val;
+
+    val = pcie_doe_read(&ct3d->doe, addr, len, &found);
+    if (found) {
+        return val;
+    }
+    return pci_default_read_config(pci_dev, addr, len);
+}
+
 static void ct3_class_init(ObjectClass *oc, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(oc);
@@ -298,6 +341,8 @@ static void ct3_class_init(ObjectClass *oc, void *data)
     pc->vendor_id = PCI_VENDOR_ID_INTEL;
     pc->device_id = 0xd93; /* LVF for now */
     pc->revision = 1;
+    pc->config_write = ct3d_config_write;
+    pc->config_read = ct3d_config_read;
 
     set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
     dc->desc = "CXL PMEM Device (Type 3)";
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 6961e47076..ec3b4d7398 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -13,6 +13,7 @@
 #include "cxl_pci.h"
 #include "cxl_component.h"
 #include "cxl_device.h"
+#include "cxl_cdat.h"
 
 #define COMPONENT_REG_BAR_IDX 0
 #define DEVICE_REG_BAR_IDX 2
-- 
2.19.1



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 1/4] include/standard-headers/linux/pci_regs: temp hack to add necessary DOE definitions.
  2021-02-01 15:16 ` [RFC PATCH 1/4] include/standard-headers/linux/pci_regs: temp hack to add necessary DOE definitions Jonathan Cameron
@ 2021-02-02 15:39   ` Ben Widawsky
  2021-02-02 23:13     ` Michael S. Tsirkin
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Widawsky @ 2021-02-02 15:39 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Thomas Huth, linuxarm, Michael S . Tsirkin, Vishal Verma,
	f.fangjian, Chris Browy, qemu-devel, f4bug, jcm,
	Prashant V Agarwal, Igor Mammedov, Dan Williams

On 21-02-01 23:16:26, Jonathan Cameron wrote:
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  include/standard-headers/linux/pci_regs.h | 33 ++++++++++++++++++++++-
>  1 file changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
> index e709ae8235..7e852d3dd0 100644
> --- a/include/standard-headers/linux/pci_regs.h
> +++ b/include/standard-headers/linux/pci_regs.h
> @@ -730,7 +730,8 @@
>  #define PCI_EXT_CAP_ID_DVSEC	0x23	/* Designated Vendor-Specific */
>  #define PCI_EXT_CAP_ID_DLF	0x25	/* Data Link Feature */
>  #define PCI_EXT_CAP_ID_PL_16GT	0x26	/* Physical Layer 16.0 GT/s */
> -#define PCI_EXT_CAP_ID_MAX	PCI_EXT_CAP_ID_PL_16GT
> +#define PCI_EXT_CAP_ID_DOE	0x2E	/* Data Object Exchange */
> +#define PCI_EXT_CAP_ID_MAX	PCI_EXT_CAP_ID_DOE
>  
>  #define PCI_EXT_CAP_DSN_SIZEOF	12
>  #define PCI_EXT_CAP_MCAST_ENDPOINT_SIZEOF 40
> @@ -1092,4 +1093,34 @@
>  #define  PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_MASK		0x000000F0
>  #define  PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_SHIFT	4
>  
> +/* Data Object Exchange */
> +#define PCI_DOE_CAP		0x04
> +#define  PCI_DOE_CAP_INT_SUPPORT			0x00000001
> +#define  PCI_DOE_CAP_INT_MSG_NUM			0x00000FFE
> +
> +#define PCI_DOE_CTRL		0x08
> +#define  PCI_DOE_CTRL_DOE_ABORT				0x00000001
> +#define  PCI_DOE_CTRL_DOE_INT_EN			0x00000002
> +#define  PCI_DOE_CTRL_DOE_GO				0x80000000
> +
> +#define PCI_DOE_STATUS		0x0c
> +#define  PCI_DOE_STATUS_DOE_BUSY			0x00000001
> +#define  PCI_DOE_STATUS_INT_STATUS			0x00000002
> +#define  PCI_DOE_STATUS_DOE_ERROR			0x00000004
> +#define  PCI_DOE_STATUS_DATA_OBJECT_READY		0x80000000
> +
> +#define PCI_DOE_WRITE_MAILBOX	0x10
> +#define PCI_DOE_READ_MAILBOX	0x14
> +
> +/* Data Object Format DOE ECN 6.xx.1 */
> +#define PCI_DATA_OBJ_DW0_VID				0x0000ffff
> +#define PCI_DATA_OBJ_DW0_TYPE				0x00ff0000
> +#define PCI_DATA_OBJ_DW1_LEN				0x0003ffff
> +
> +/* DOE Discover Data Object */
> +#define PCI_DOE_DIS_OBJ_TYPE	 0x1
> +#define PCI_DOE_DIS_REQ_D0_DW0_INDEX			0x000000ff
> +#define PCI_DOE_DIS_RSP_DO_DW0_VID			0x0000ffff
> +#define PCI_DOE_DIS_RSP_D0_DW0_PROT			0x00ff0000
> +#define PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX		0xff000000
>  #endif /* LINUX_PCI_REGS_H */

I think a lot of these should have had _MASK at the end.

As for the accuracy of the values, lgtm.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 2/4] hw/pci/pcie_doe: Introduce utility functions for PCIe DOE
  2021-02-01 15:16 ` [RFC PATCH 2/4] hw/pci/pcie_doe: Introduce utility functions for PCIe DOE Jonathan Cameron
@ 2021-02-02 17:54   ` Ben Widawsky
  2021-02-03 18:01     ` Jonathan Cameron
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Widawsky @ 2021-02-02 17:54 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Thomas Huth, linuxarm, Michael S . Tsirkin, Vishal Verma,
	f.fangjian, Chris Browy, qemu-devel, f4bug, jcm,
	Prashant V Agarwal, Igor Mammedov, Dan Williams

This was a bit more complicated than I was anticipating :-)

On 21-02-01 23:16:27, Jonathan Cameron wrote:
> This implements the ECN to the PCI 5.0 specification available at
> https://members.pcisig.com/wg/PCI-SIG/document/14143
> 
> Does not currently support interrupts.
> 
> Note that currently no attempt is made to clean up allocated memory.
> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  hw/pci/meson.build       |   2 +-
>  hw/pci/pcie_doe.c        | 257 +++++++++++++++++++++++++++++++++++++++
>  include/hw/pci/doe.h     |  40 ++++++
>  include/hw/pci/pci_ids.h |   2 +
>  4 files changed, 300 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/pci/meson.build b/hw/pci/meson.build
> index 5c4bbac817..7336620ee3 100644
> --- a/hw/pci/meson.build
> +++ b/hw/pci/meson.build
> @@ -11,7 +11,7 @@ pci_ss.add(files(
>  # The functions in these modules can be used by devices too.  Since we
>  # allow plugging PCIe devices into PCI buses, include them even if
>  # CONFIG_PCI_EXPRESS=n.
> -pci_ss.add(files('pcie.c', 'pcie_aer.c'))
> +pci_ss.add(files('pcie.c', 'pcie_aer.c',  'pcie_doe.c'))
>  softmmu_ss.add(when: 'CONFIG_PCI_EXPRESS', if_true: files('pcie_port.c', 'pcie_host.c'))
>  softmmu_ss.add_all(when: 'CONFIG_PCI', if_true: pci_ss)
>  
> diff --git a/hw/pci/pcie_doe.c b/hw/pci/pcie_doe.c
> new file mode 100644
> index 0000000000..8739c41280
> --- /dev/null
> +++ b/hw/pci/pcie_doe.c
> @@ -0,0 +1,257 @@
> +/*
> + * pcie_doe.c
> + * utility functions for pci express data object exchange introduced
> + * in PCI 5.0 Data Object Exchange (DOE) ECN
> + *
> + * Copyright (c) 2021 Jonathan Cameron <Jonathan.Cameron@huawei.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/units.h"
> +#include "qemu/error-report.h"
> +#include "hw/pci/pci.h"
> +#include "hw/pci/doe.h"
> +#include "hw/qdev-properties.h"
> +#include "qapi/error.h"
> +#include "qemu/log.h"
> +#include "qemu/module.h"
> +#include "qemu/range.h"
> +#include "qemu/rcu.h"
> +#include "sysemu/hostmem.h"
> +

I know it's RFC and quickly thrown together, but:

/* VID and Type */
#define DOE_DATA_OBJECT_HDR1 0
/* Length */
#define DOE_DATA_OBJECT_HDR2 1
/* Index */
#define DOE_DATA_OBJECT_REQUEST_DATA 2
#define DOE_DATA_OBJECT_RESPONSE_DATA 3

Then use that throughout below, please?

> +struct doe_handler {
> +    uint16_t vendor_id;
> +    uint8_t object_type;
> +    doe_msg_handler_t handler;
> +    void *priv;
> +};
> +
> +static void doe_set_ctl(PCIEDOE *doe, uint32_t val)
> +{
> +    /* Abort */
> +    if (val & PCI_DOE_CTRL_DOE_ABORT) {
> +        doe->req_index = 0;
> +        doe->rsp_index = 0;
> +        doe->req_length = 0;
> +        doe->error = false;
> +        doe->data_object_ready = false;
> +    }
> +
> +    if (val & PCI_DOE_CTRL_DOE_GO) {
> +        GList *l;
> +        uint16_t vendor_id = doe->store[0] & PCI_DATA_OBJ_DW0_VID;
> +        uint8_t object_type = (doe->store[0] & PCI_DATA_OBJ_DW0_TYPE) >>
> +            ctz32(PCI_DATA_OBJ_DW0_TYPE);

I think it'd be much nicer to read this as REG32/FIELD_EX32.

> +        if ((doe->req_index != 3) || (doe->req_length != 3)) {
> +            /*
> +             * Not entirely clear what should happen if req_length is correct
> +             * buf insufficient data has been received.

s/buf/but

Also, maybe for more resiliency and readability, with a comment about why '3':
if ((doe->req_index < 3) ...

I think it'd be much more readable if you pulled the index directly out of the
register store here instead of when you set.

len = doe->store[DOE_DATA_OBJECT_HDR2] & PCI_DATA_OBJ_DW1_LEN;
if (!len)
	len = 1<<18;

> +             */
> +            doe->error = true;
> +            return;

I don't think this should be an error:
"If the Length is shorter than expected for a specific data object, then the data
object must be silently discarded.

If the Length is greater than expected for a specific data object, then the
portion of the data object up to the expected length must be processednormally
and the remainder of the data object must be silently discarded."

> +        }
> +        /* Discovery protocol - DOE ECN */
> +        if (vendor_id == PCI_VENDOR_ID_PCI_SIG &&
> +            object_type == PCI_DOE_DIS_OBJ_TYPE) {
> +            uint8_t index = doe->store[2] & PCI_DOE_DIS_REQ_D0_DW0_INDEX;
> +            doe->store[1] = 3;
> +            if (index == 0) {
> +                /* First entry is this one, the discovery protocol itself */
> +                uint8_t next;
> +
> +                if (doe->cb_list) {
> +                    next = index + 1;
> +                } else {
> +                    next = 0;
> +                }

I think a comment here that you're terminating the list if no callbacks are
registered would be good.

> +                doe->store[2] =
> +                    (next << ctz32(PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX)) |
> +                    (0 << ctz32(PCI_DOE_DIS_RSP_D0_DW0_PROT)) |
> +                    0x0001;

Same comment about FIELD_DP|EX32. I'd try it and see how it looks.

Would probably be nice to #define some vendor ID, or use fields so it's obvious
why you have a 0x1.

> +            } else {
> +                /* Other entries based on register callbacks */
> +                uint8_t next;
> +                struct doe_handler *h;
> +
> +                h = g_list_nth_data(doe->cb_list, index - 1);

It wasn't immediately obvious to me why index - 1. Maybe it'd be a little nicer
if you had a helper:

static inline struct doe_handler *doe_get_handler(PCIEDOE *doe, int doe_index)
{
	struct doe_handler *ret = NULL;

	// assert(doe->cb_list);
	if (doe->cb_list)
		ret = g_list_nth_data(doe->cb_list, doe_index - 1);

	return ret;
}

> +                /*
> +                 * Off end of list, Table 7-x4 in DOE ECN -
> +                 * Vendor ID 0xFFFF if no more indices
> +                 */
> +                if (h == NULL) {
> +                    doe->store[2] = 0xFFFF;
> +                } else {
> +                    if (g_list_nth(doe->cb_list, index)) {
> +                        next = index + 1;
> +                    } else {
> +                        next = 0;
> +                    }

A bit confusing here as well IMO, but I don't know a better way to write it
other than adding comments about index effectively being off by 1.

> +                    doe->store[2] =
> +                        (next << ctz32(PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX)) |
> +                        (h->object_type << ctz32(PCI_DOE_DIS_RSP_D0_DW0_PROT)) |
> +                        h->vendor_id;
> +                }
> +            }
> +            doe->data_object_ready = true;
> +            doe->rsp_index = 0;
> +        } else {
> +            for (l = doe->cb_list; l != NULL; l = l->next) {
> +                struct doe_handler *h = l->data;
> +                if (h->vendor_id == vendor_id &&
> +                    h->object_type == object_type) {
> +                    int ret = h->handler(doe, h->vendor_id, h->object_type,
> +                                         h->priv);
> +                    if (ret) {
> +                        /*
> +                         * No response so as per 6.xx.1 in DOE ECN
> +                         * "... within 1 second after the DOE Go bit was Set
> +                         *  in the DOE Control register, otherwise the DOE
> +                         *  instance must Set the DOE Error bit in the DOE
> +                         *  Status register.."
> +                         */
> +                         doe->error = true;

extra whitespace

> +                        break;
> +                    }
> +                    doe->data_object_ready = true;
> +                    doe->rsp_index = 0;
> +                    break;
> +                }
> +            }
> +            /* Comamnd not handled */
> +            if (l == NULL) {
> +                doe->error = true;
> +            }
> +        }
> +        /* Reset input index to allow for a new message */
> +        doe->req_index = 0;
> +    }
> +}
> +
> +static void doe_set_write_mailbox(PCIEDOE *doe, uint32_t val)
> +{
> +    if (doe->req_index == 1) {
> +        if (val & 0x3FFFF) {

val & PCI_DATA_OBJ_DW1_LEN

> +            doe->req_length = val & PCI_DATA_OBJ_DW1_LEN;
> +        } else {
> +            doe->req_length = 1 << 18;
> +        }
> +    }
> +    if (doe->req_length && doe->req_index == doe->req_length) {
> +        /*
> +         * 6.xx.1 Data Objects
> +         * If the DW transferred does not match the indicated Length
> +         * for a data object, then the data object must be
> +         * silently discarded
> +         */
> +        return;
> +    }
> +    doe->store[doe->req_index] = val;
> +    doe->req_index++;
> +}
> +
> +static uint32_t doe_get_read_mailbox(PCIEDOE *doe)
> +{
> +    uint32_t val;
> +
> +    if (doe->rsp_index == 0) {
> +        doe->rsp_length = doe->store[1] & PCI_DATA_OBJ_DW1_LEN;
> +    }
> +    if (!doe->data_object_ready) {
> +        /* Underflow of the Read Data Mailbox Mechanism */
> +        doe->error = true;
> +        return 0;
> +    }

I don't think this should be an error condition. Could you please explain?

> +
> +    val = doe->store[doe->rsp_index];
> +    doe->rsp_index++;
> +    if (doe->rsp_index == doe->rsp_length) {
> +        doe->rsp_index = -1;
> +        doe->data_object_ready = false;
> +    }
> +
> +    return val;
> +}
> +
> +static uint32_t doe_get_status(PCIEDOE *doe)
> +{
> +    uint32_t val = 0;
> +
> +    if (doe->busy) {
> +        val |= PCI_DOE_STATUS_DOE_BUSY;
> +    }

Do you actually intend to model busy?

> +    /* bit 1: interrupt not yet supported */
> +    if (doe->error) {
> +        val |= PCI_DOE_STATUS_DOE_ERROR;
> +    }
> +    if (doe->data_object_ready) {
> +        val |= PCI_DOE_STATUS_DATA_OBJECT_READY;
> +    }
> +
> +    return val;
> +}
> +
> +void doe_add_message_handler(PCIEDOE *doe, uint16_t vendor_id,
> +                             uint8_t object_type,
> +                             const doe_msg_handler_t handler, void *priv)
> +{
> +    struct doe_handler *h = g_malloc0(sizeof(*handler));
> +
> +    h->vendor_id = vendor_id;
> +    h->object_type = object_type;
> +    h->handler = handler;
> +    h->priv = priv;
> +    doe->cb_list = g_list_append(doe->cb_list, h);
> +}
> +
> +uint32_t pcie_doe_ecap(PCIEDOE *doe, PCIDevice *d, uint16_t offset)
> +{
> +    doe->doe_base = offset;
> +    /* Length field is 18 bits and is in dwords */
> +    doe->store = g_malloc0((1 << 18) * sizeof(uint32_t));

I think it's fine if you're instantiating DOE to just have this be statically
allocated.

> +
> +    pcie_add_capability(d, PCI_EXT_CAP_ID_DOE, 1, offset, 0x18);
> +    offset += 0x18;
> +
> +    return offset;
> +}
> +
> +void pcie_doe_write(PCIEDOE *doe, uint32_t addr, uint32_t val, int len)
> +{
> +    if (len != 4) {
> +        return;
> +    } 

Do you want to check alignment also?

> +
> +    switch (addr - doe->doe_base) {
> +    case PCI_DOE_CTRL:
> +        doe_set_ctl(doe, val);
> +        break;
> +    case PCI_DOE_WRITE_MAILBOX:
> +        doe_set_write_mailbox(doe, val);
> +        break;
> +    }
> +}
> +
> +uint32_t pcie_doe_read(PCIEDOE *doe, uint32_t addr, int len, int *found)
> +{
> +    if (len != 4) {
> +        *found = 0;
> +        return 0;
> +    }

Same comment as _write

> +
> +    *found = 1;
> +    switch (addr - doe->doe_base) {
> +    case PCI_DOE_CAP:
> +        return 0; /* No interrupt support */
> +    case PCI_DOE_STATUS:
> +        return doe_get_status(doe);
> +    case PCI_DOE_READ_MAILBOX:
> +        return doe_get_read_mailbox(doe);
> +    default:
> +        *found = 0;
> +        return 0;
> +    }
> +}

I'm guessing you don't love 'found'. Perhaps you can have this return an int64_t
and negative values are errors?

I'm not yet convinced you even need that level of error handling though. I
suppose next patch will tell me.

> +
> diff --git a/include/hw/pci/doe.h b/include/hw/pci/doe.h
> new file mode 100644
> index 0000000000..364c866c53
> --- /dev/null
> +++ b/include/hw/pci/doe.h
> @@ -0,0 +1,40 @@
> +/*
> + * PCIE DOE emulation.
> + *
> + * Copyright (c) 2021 Jonathan Cameron <Jonathan.Cameron@huawei.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#ifndef QEMU_PCIE_DOE_H_
> +#define QEMU_PCIE_DOE_H_
> +#include "qemu/osdep.h"
> +#include "qemu/units.h"
> +#include "hw/qdev-properties.h"
> +#include "qapi/error.h"
> +#include "qemu/module.h"
> +
> +typedef struct pcie_doe {
> +    uint32_t doe_base;
> +    GList *cb_list;
> +    int req_index;
> +    int req_length;
> +    int rsp_index;
> +    int rsp_length;
> +    bool data_object_ready;
> +    bool error;
> +    bool busy;
> +    uint32_t *store;
> +} PCIEDOE;
> +
> +typedef int (*doe_msg_handler_t)(PCIEDOE *doe, uint16_t vendor_id,
> +                                 uint8_t object_type, void *priv);
> +
> +uint32_t pcie_doe_ecap(PCIEDOE *doe, PCIDevice *d, uint16_t offset);
> +void doe_add_message_handler(PCIEDOE *doe, uint16_t vendor_id,
> +                             uint8_t object_type,
> +                             const doe_msg_handler_t handler, void *priv);
> +uint32_t pcie_doe_read(PCIEDOE *doe, uint32_t addr, int len, int *found);
> +void pcie_doe_write(PCIEDOE *doe, uint32_t addr, uint32_t val, int len);
> +#endif
> diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
> index 76bf3ed590..636b2e8017 100644
> --- a/include/hw/pci/pci_ids.h
> +++ b/include/hw/pci/pci_ids.h
> @@ -157,6 +157,8 @@
>  
>  /* Vendors and devices.  Sort key: vendor first, device next. */
>  
> +#define PCI_VENDOR_ID_PCI_SIG            0x0001
> +
>  #define PCI_VENDOR_ID_LSI_LOGIC          0x1000
>  #define PCI_DEVICE_ID_LSI_53C810         0x0001
>  #define PCI_DEVICE_ID_LSI_53C895A        0x0012
> -- 
> 2.19.1
> 
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 3/4] hw/cxl/cxl-cdat: Initial CDAT implementation for use by CXL devices
  2021-02-01 15:16 ` [RFC PATCH 3/4] hw/cxl/cxl-cdat: Initial CDAT implementation for use by CXL devices Jonathan Cameron
@ 2021-02-02 18:49   ` Ben Widawsky
  2021-02-02 19:18     ` [Linuxarm] " Jonathan Cameron
  0 siblings, 1 reply; 12+ messages in thread
From: Ben Widawsky @ 2021-02-02 18:49 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Thomas Huth, linuxarm, Michael S . Tsirkin, Vishal Verma,
	f.fangjian, Chris Browy, qemu-devel, f4bug, jcm,
	Prashant V Agarwal, Igor Mammedov, Dan Williams

On 21-02-01 23:16:28, Jonathan Cameron wrote:
> CDAT is an ACPI like format defined by the CXL consortium. It is
> available from
> 
> https://www.uefi.org/node/4093
> 
> Here support for managing all the entires is introduced, along with
> an implementation of a callback for a DOE mailbox which may be
> used to read these values from CXL hardware by either firmware or
> an OS.
> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

I seem to be missing one critical thing, where is the CDAT header (Table 1 from
the spec) actually populated, length, revision, checksum, etc?

General, probably underthought-out, comment:

With the CXLType3Class I added since you probably wrote these patches, I wonder
if that'd be a better fit for populating some of these tables, having them
populate dynamically when the DOE/CDAT request actually comes in.

I have no strong opinion. The one advantage I see to using the class is
operations like managing handles, length calculations, etc can be managed by the
helper library rather than the device implementation having to do it. OTTOH if
you have devices that want to do weird things, they might lose some flexibility.

I think with just a few methods added to the CXLType3Class you could pretty
trivially build up a nice sane default CDAT for any CXL device.

> ---
>  hw/cxl/cxl-cdat.c         | 252 ++++++++++++++++++++++++++++++++++++++
>  hw/cxl/meson.build        |   1 +
>  include/hw/cxl/cxl_cdat.h | 101 +++++++++++++++
>  3 files changed, 354 insertions(+)
> 
> diff --git a/hw/cxl/cxl-cdat.c b/hw/cxl/cxl-cdat.c
> new file mode 100644
> index 0000000000..6ed4c15cc0
> --- /dev/null
> +++ b/hw/cxl/cxl-cdat.c
> @@ -0,0 +1,252 @@
> +/*
> + * Support for CDAT entires as defined in
> + * Coherent Device Attribute Table (CDAT) Specification rev 1.02
> + * Available from uefi.org.
> + *
> + * Copyright (c) 2021 Jonathan Cameron <Jonathan.Cameron@huawei.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +#include "qemu/osdep.h"
> +#include "qemu/units.h"
> +#include "qemu/error-report.h"
> +#include "hw/mem/memory-device.h"
> +#include "hw/mem/pc-dimm.h"
> +#include "hw/pci/pci.h"
> +#include "hw/pci/doe.h"
> +#include "hw/qdev-properties.h"
> +#include "qapi/error.h"
> +#include "qemu/log.h"
> +#include "qemu/module.h"
> +#include "qemu/range.h"
> +#include "qemu/rcu.h"
> +#include "sysemu/hostmem.h"
> +#include "sysemu/numa.h"
> +#include "hw/cxl/cxl.h"
> +
> +void cdat_add_dsmas(CXLCDAT *cdat, uint8_t dsmad_handle, uint8_t flags,
> +                    uint64_t dpa_base, uint64_t dpa_length)
> +{
> +    struct cxl_cdat_dsmas *dsmas = g_malloc0(sizeof(*dsmas));
> +    dsmas->dsmad_handle = dsmad_handle;
> +    dsmas->flags = flags;
> +    dsmas->dpa_base = dpa_base;
> +    dsmas->dpa_length = dpa_length;
> +    cdat->dsmas_list = g_list_append(cdat->dsmas_list, dsmas);
> +}
> +
> +void cdat_add_dslbis(CXLCDAT *cdat, uint8_t handle, uint8_t flags,
> +                     uint8_t data_type, uint64_t base_unit,
> +                     uint16_t entry0, uint16_t entry1, uint16_t entry2)
> +{
> +    struct cxl_cdat_dslbis *dslbis = g_malloc0(sizeof(*dslbis));
> +    dslbis->handle = handle;
> +    dslbis->flags = flags;
> +    dslbis->data_type = data_type;
> +    dslbis->entry_base_unit = base_unit;
> +    dslbis->entries[0] = entry0;
> +    dslbis->entries[1] = entry1;
> +    dslbis->entries[2] = entry2;
> +    cdat->dslbis_list = g_list_append(cdat->dslbis_list, dslbis);
> +}
> +
> +void cdat_add_dsmscis(CXLCDAT *cdat, uint8_t dsmas_handle,
> +                      uint64_t memory_sc_size, uint64_t cache_attrs)
> +{
> +    struct cxl_cdat_dsmscis *dsmscis = g_malloc(sizeof(*dsmscis));
> +    dsmscis->dsmas_handle = dsmas_handle;
> +    dsmscis->memory_side_cache_size = memory_sc_size;
> +    dsmscis->cache_attributes = cache_attrs;
> +    cdat->dsmscis_list = g_list_append(cdat->dsmscis_list, dsmscis);
> +}
> +
> +void cdat_add_dsis(CXLCDAT *cdat, uint8_t flags, uint8_t handle)
> +{
> +    struct cxl_cdat_dsis *dsis = g_malloc(sizeof(*dsis));
> +    dsis->flags = flags;
> +    dsis->handle = handle;
> +    cdat->dsis_list = g_list_append(cdat->dsis_list, dsis);
> +}
> +
> +void cdat_add_dsemts(CXLCDAT *cdat, uint8_t dsmas_handle,
> +                     uint8_t efi_mem_type_attr, uint64_t dpa_offset,
> +                     uint64_t dpa_length)
> +{
> +    struct cxl_cdat_dsemts *dsemts = g_malloc(sizeof(*dsemts));
> +    dsemts->dsmas_handle = dsmas_handle;
> +    dsemts->efi_mem_type_attr = efi_mem_type_attr;
> +    dsemts->dpa_offset = dpa_offset;
> +    dsemts->dpa_length = dpa_length;
> +    cdat->dsemts_list = g_list_append(cdat->dsemts_list, dsemts);
> +}
> +
> +struct cxl_cdat_sslbis *cdat_add_sslbis(CXLCDAT *cdat, uint8_t num_entries,
> +                                        uint8_t data_type, uint64_t base_unit)
> +{
> +    struct cxl_cdat_sslbis *sslbis =
> +        g_malloc(sizeof(*sslbis) + num_entries * sizeof(sslbis->entries[0]));
> +    sslbis->num_entries = num_entries;
> +    sslbis->data_type = data_type;
> +    sslbis->base_unit = base_unit;
> +    cdat->sslbis_list = g_list_append(cdat->sslbis_list, sslbis);
> +    return sslbis;
> +}
> +
> +int cdata_sslbis_set_entry(struct cxl_cdat_sslbis *sslbis, uint8_t index,
> +                           uint16_t portx, uint16_t porty, uint16_t val)
> +{
> +    struct cxl_cdat_sslbis_entry *entry;
> +    if (index >= sslbis->num_entries) {
> +        return -1;
> +    }
> +    entry = &sslbis->entries[index];
> +    entry->port_x_id = portx;
> +    entry->port_y_id = porty;
> +    entry->val = val;
> +    return 0;
> +}
> +
> +int cxl_table_access(PCIEDOE *doe, uint16_t vendor_id, uint8_t object_type,
> +                     void *priv)
> +{
> +    uint8_t table_type;
> +    int total_entries;
> +    uint16_t entry_handle;
> +    CXLCDAT *cdat = priv;
> +    uint16_t next_entry;
> +
> +    if (doe->req_length != 3) {
> +        /* optional error for unexpected command length */
> +        return -1;
> +    }
> +
> +    if ((doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_RCODE) != 0) {
> +        /*
> +         * Only table access code currently supported.
> +         * Error indication is lack of Data Object Ready
> +         */
> +        return -1;
> +    }
> +
> +    table_type = (doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_TYPE) >>
> +        ctz32(CXL_DOE_TABLE_ACCESS_DW2_TYPE);
> +    if (table_type != 0) {
> +        /* Unsuported table ID so just don't set Data Object Ready */
> +        return -1;
> +    }
> +    entry_handle = (doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE) >>
> +        ctz32(CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE);
> +
> +    /* Assume entry handle == CDAT structure index */
> +    total_entries = g_list_length(cdat->dsmas_list) +
> +        g_list_length(cdat->dslbis_list) +
> +        g_list_length(cdat->dsmscis_list) +
> +        g_list_length(cdat->dsis_list) +
> +        g_list_length(cdat->dsemts_list) +
> +        g_list_length(cdat->sslbis_list);
> +    if (entry_handle + 1 == total_entries) {
> +        next_entry = 0xFFFF;
> +    } else {
> +        next_entry = entry_handle + 1;
> +    }
> +
> +    if (entry_handle < g_list_length(cdat->dsmas_list)) {
> +        const int dsmas_len = 24;
> +        struct cxl_cdat_dsmas *dsmas =
> +            g_list_nth_data(cdat->dsmas_list, entry_handle);
> +
> +        doe->store[1] = 3 + dsmas_len / sizeof(uint32_t);
> +        doe->store[2] = next_entry << 16;
> +        doe->store[3] = (dsmas_len << 16) | CXL_CDAT_DSMAS_TYPE;
> +
> +        /* cxl version of proximity domain */
> +        doe->store[4] = dsmas->dsmad_handle;
> +        /* flags in 2nd byte of [4] (bit 2 is non volatile) */
> +        doe->store[4] |= (dsmas->flags << 8);
> +        doe->store[5] = dsmas->dpa_base & 0xFFFFFFFF;
> +        doe->store[6] = (dsmas->dpa_base >> 32) & 0xFFFFFFFF;
> +        doe->store[7] = dsmas->dpa_length & 0xFFFFFFFF;
> +        doe->store[8] = (dsmas->dpa_length >> 32) & 0xFFFFFFFF;
> +        return 0;
> +    }
> +    entry_handle -= g_list_length(cdat->dsmas_list);
> +    if (entry_handle < g_list_length(cdat->dslbis_list)) {
> +        const int dslbis_len = 24;
> +        struct cxl_cdat_dslbis *dslbis =
> +            g_list_nth_data(cdat->dslbis_list, entry_handle);
> +
> +        doe->store[1] = 3 + dslbis_len / sizeof(uint32_t);
> +        doe->store[2] = next_entry << 16;
> +        doe->store[3] = (dslbis_len << 16) | CXL_CDAT_DSLBIS_TYPE;
> +
> +        doe->store[4] = (dslbis->data_type << 24) | (dslbis->flags << 8) |
> +            dslbis->handle;
> +        doe->store[5] = dslbis->entry_base_unit & 0xFFFFFFFF;
> +        doe->store[6] = (dslbis->entry_base_unit >> 32) & 0xFFFFFFFF;
> +        doe->store[7] = (dslbis->entries[1] << 16) | dslbis->entries[0];
> +        doe->store[8] = dslbis->entries[2];
> +        return 0;
> +    }
> +    entry_handle -= g_list_length(cdat->dslbis_list);
> +    if (entry_handle < g_list_length(cdat->dsmscis_list)) {
> +        const int dsmscis_len = 20;
> +        struct cxl_cdat_dsmscis *dsmscis =
> +            g_list_nth_data(cdat->dsmscis_list, entry_handle);
> +
> +        doe->store[1] = 3 + dsmscis_len / sizeof(uint32_t);
> +        doe->store[2] = next_entry << 16;
> +        doe->store[3] = (dsmscis_len << 16) | CXL_CDAT_DSMSCIS_TYPE;
> +        doe->store[4] = dsmscis->dsmas_handle;
> +        doe->store[5] = dsmscis->memory_side_cache_size & 0xffffffff;
> +        doe->store[6] = (dsmscis->memory_side_cache_size >> 32) & 0xffffffff;
> +        doe->store[7] = dsmscis->cache_attributes;
> +    }
> +    entry_handle -= g_list_length(cdat->dsmscis_list);
> +    if (entry_handle < g_list_length(cdat->dsis_list)) {
> +        const int dsis_len = 8;
> +        struct cxl_cdat_dsis *dsis =
> +            g_list_nth_data(cdat->dsis_list, entry_handle);
> +
> +        doe->store[1] = 3 + dsis_len / sizeof(uint32_t);
> +        doe->store[2] = next_entry << 16;
> +        doe->store[3] = (dsis_len << 16) | CXL_CDAT_DSMSCIS_TYPE;
> +        doe->store[4] = (dsis->handle << 8) | dsis->flags;
> +    }
> +    entry_handle -= g_list_length(cdat->dsis_list);
> +    if (entry_handle < g_list_length(cdat->dsemts_list)) {
> +        const int dsemts_len = 24;
> +        struct cxl_cdat_dsemts *dsemts =
> +            g_list_nth_data(cdat->dsemts_list, entry_handle);
> +
> +        doe->store[1] = 3 + dsemts_len / sizeof(uint32_t);
> +        doe->store[2] = next_entry << 16;
> +        doe->store[3] = (dsemts_len << 16) | CXL_CDAT_DSEMTS_TYPE;
> +        doe->store[4] = (dsemts->efi_mem_type_attr << 8) | dsemts->dsmas_handle;
> +        doe->store[5] = dsemts->dpa_offset & 0xffffffff;
> +        doe->store[6] = (dsemts->dpa_offset >> 32) & 0xffffffff;
> +        doe->store[7] = dsemts->dpa_length & 0xffffffff;
> +        doe->store[8] = (dsemts->dpa_length >> 32) & 0xffffffff;
> +    }
> +    entry_handle -= g_list_length(cdat->dsemts_list);
> +    if (entry_handle < g_list_length(cdat->sslbis_list)) {
> +        struct cxl_cdat_sslbis *sslbis =
> +            g_list_nth_data(cdat->sslbis_list, entry_handle);
> +        int sslbis_len = 16 + 8 * sslbis->num_entries;
> +        int i;
> +
> +        doe->store[1] = 3 + sslbis_len / sizeof(uint32_t);
> +        doe->store[2] = next_entry << 16;
> +        doe->store[3] = (sslbis_len << 16) | CXL_CDAT_SSLBIS_TYPE;
> +        doe->store[4] = sslbis->data_type;
> +        doe->store[5] = sslbis->base_unit & 0xffffffff;
> +        doe->store[6] = (sslbis->base_unit >> 32) & 0xffffffff;
> +        for (i = 0; i < sslbis->num_entries; i++) {
> +            doe->store[7 + i * 2] = (sslbis->entries[i].port_y_id << 8) |
> +                sslbis->entries[i].port_x_id;
> +            doe->store[7 + i * 2 + 1] = sslbis->entries[i].val;
> +        }
> +    }
> +
> +    return -1;
> +}
> diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> index 0eca715d10..9e2e5f4094 100644
> --- a/hw/cxl/meson.build
> +++ b/hw/cxl/meson.build
> @@ -2,4 +2,5 @@ softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
>    'cxl-component-utils.c',
>    'cxl-device-utils.c',
>    'cxl-mailbox-utils.c',
> +  'cxl-cdat.c',
>  ))
> diff --git a/include/hw/cxl/cxl_cdat.h b/include/hw/cxl/cxl_cdat.h
> new file mode 100644
> index 0000000000..d0226c463c
> --- /dev/null
> +++ b/include/hw/cxl/cxl_cdat.h
> @@ -0,0 +1,101 @@
> +/*
> + * Support for CDAT entires as defined in
> + * Coherent Device Attribute Table (CDAT) Specification rev 1.02
> + * Available from uefi.org.
> + *
> + * Copyright (c) 2021 Jonathan Cameron <Jonathan.Cameron@huawei.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +#ifndef QEMU_CXL_CDAT_H
> +#define QEMU_CXL_CDAT_H
> +#include "hw/pci/doe.h"
> +/* DW2 is common to the request and response */
> +#define CXL_DOE_TABLE_ACCESS_DW2_RCODE          0x000000ff
> +#define CXL_DOE_TABLE_ACCESS_DW2_TYPE           0x0000ff00
> +#define CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE    0xffff0000
> +
> +#define CXL_CDAT_DSMAS_TYPE     0
> +#define CXL_CDAT_DSLBIS_TYPE    1
> +#define CXL_CDAT_DSMSCIS_TYPE   2
> +#define CXL_CDAT_DSIS_TYPE      3
> +#define CXL_CDAT_DSEMTS_TYPE    4
> +#define CXL_CDAT_SSLBIS_TYPE    5
> +
> +struct cxl_cdat_dsmas {
> +    uint8_t dsmad_handle;
> +    uint8_t flags;
> +#define CDAT_DSMAS_FLAG_NV (1 << 2)
> +    uint64_t dpa_base;
> +    uint64_t dpa_length;
> +};
> +
> +struct cxl_cdat_dslbis {
> +    uint8_t handle;
> +    uint8_t flags;
> +    uint8_t data_type;
> +    uint64_t entry_base_unit;
> +    uint16_t entries[3]; /* 6 bytes */
> +};
> +
> +struct cxl_cdat_dsmscis {
> +    uint8_t dsmas_handle;
> +    uint64_t memory_side_cache_size;
> +    uint32_t cache_attributes;
> +};
> +
> +struct cxl_cdat_dsis {
> +    uint8_t flags;
> +#define CDAT_DSIS_MEMORY_ATTACHED 0x01
> +    uint8_t handle;
> +};
> +
> +struct cxl_cdat_dsemts {
> +    uint8_t dsmas_handle;
> +    uint8_t efi_mem_type_attr;
> +    uint64_t dpa_offset;
> +    uint64_t dpa_length;
> +};
> +
> +struct cxl_cdat_sslbis_entry {
> +    uint16_t port_x_id;
> +    uint16_t port_y_id;
> +    uint16_t val;
> +};
> +
> +struct cxl_cdat_sslbis {
> +    uint8_t num_entries; /* needed to compute length */
> +    uint8_t data_type;
> +    uint64_t base_unit;
> +    struct cxl_cdat_sslbis_entry entries[];
> +};
> +
> +typedef struct cxl_cdat {
> +    GList *dsmas_list;
> +    GList *dslbis_list;
> +    GList *dsmscis_list;
> +    GList *dsis_list;
> +    GList *dsemts_list;
> +    GList *sslbis_list;
> +} CXLCDAT;
> +
> +void cdat_add_dsmas(CXLCDAT *cdat, uint8_t dsmad_handle, uint8_t flags,
> +                    uint64_t dpa_base, uint64_t dpa_length);
> +void cdat_add_dslbis(CXLCDAT *cdat, uint8_t handle, uint8_t flags,
> +                     uint8_t data_type, uint64_t base_unit,
> +                     uint16_t entry0, uint16_t entry1, uint16_t entry2);
> +void cdat_add_dsmscis(CXLCDAT *cdat, uint8_t dsmas_handle,
> +                      uint64_t memory_sc_size, uint64_t cache_attrs);
> +void cdat_add_dsis(CXLCDAT *cdat, uint8_t flags, uint8_t handle);
> +void cdat_add_dsemts(CXLCDAT *cdat, uint8_t dsmas_handle,
> +                     uint8_t efi_mem_type_attr, uint64_t dpa_offset,
> +                     uint64_t dpa_length);
> +struct cxl_cdat_sslbis *cdat_add_sslbis(CXLCDAT *cdat, uint8_t num_entries,
> +                                        uint8_t data_type, uint64_t base_unit);
> +int cdata_sslbis_set_entry(struct cxl_cdat_sslbis *sslbis, uint8_t index,
> +                           uint16_t portx, uint16_t pory, uint16_t val);
> +
> +int cxl_table_access(PCIEDOE *doe, uint16_t vendor_id, uint8_t object_type,
> +                     void *priv);
> +#endif
> -- 
> 2.19.1
> 
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Linuxarm]  Re: [RFC PATCH 3/4] hw/cxl/cxl-cdat: Initial CDAT implementation for use by CXL devices
  2021-02-02 18:49   ` Ben Widawsky
@ 2021-02-02 19:18     ` Jonathan Cameron
  0 siblings, 0 replies; 12+ messages in thread
From: Jonathan Cameron @ 2021-02-02 19:18 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Thomas Huth, linuxarm, Michael S . Tsirkin, Vishal Verma,
	f.fangjian, Chris Browy, qemu-devel, f4bug, jcm,
	Prashant V Agarwal, Igor Mammedov, Dan Williams

On Tue, 2 Feb 2021 10:49:23 -0800
Ben Widawsky <ben@bwidawsk.net> wrote:

> On 21-02-01 23:16:28, Jonathan Cameron wrote:
> > CDAT is an ACPI like format defined by the CXL consortium. It is
> > available from
> > 
> > https://www.uefi.org/node/4093
> > 
> > Here support for managing all the entires is introduced, along with
> > an implementation of a callback for a DOE mailbox which may be
> > used to read these values from CXL hardware by either firmware or
> > an OS.
> > 
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>  
> 
> I seem to be missing one critical thing, where is the CDAT header (Table 1 from
> the spec) actually populated, length, revision, checksum, etc?

An excellent question.  As far as I can see there is no actual means defined
to read that from a device.   The DOE stuff in the CXL spec just accesses 'entries'
not the header (thus as far as I can tell making a good chunk of the CDAT spec
entirely pointless).

It's on my list to raise on the SSWG list / call.  Perhaps I'm missing something...

> 
> General, probably underthought-out, comment:
> 
> With the CXLType3Class I added since you probably wrote these patches, I wonder
> if that'd be a better fit for populating some of these tables, having them
> populate dynamically when the DOE/CDAT request actually comes in.
> 
> I have no strong opinion. The one advantage I see to using the class is
> operations like managing handles, length calculations, etc can be managed by the
> helper library rather than the device implementation having to do it. OTTOH if
> you have devices that want to do weird things, they might lose some flexibility.
> 
> I think with just a few methods added to the CXLType3Class you could pretty
> trivially build up a nice sane default CDAT for any CXL device.

I'm not convinced there is such a thing as a sane default, but can certainly
look at this if we take this particular implementation forwards.
I suspect we'll also want to make a lot of this configurable from the command
line or similar on a per device basis.

Thanks,

Jonathan


> 
> > ---
> >  hw/cxl/cxl-cdat.c         | 252 ++++++++++++++++++++++++++++++++++++++
> >  hw/cxl/meson.build        |   1 +
> >  include/hw/cxl/cxl_cdat.h | 101 +++++++++++++++
> >  3 files changed, 354 insertions(+)
> > 
> > diff --git a/hw/cxl/cxl-cdat.c b/hw/cxl/cxl-cdat.c
> > new file mode 100644
> > index 0000000000..6ed4c15cc0
> > --- /dev/null
> > +++ b/hw/cxl/cxl-cdat.c
> > @@ -0,0 +1,252 @@
> > +/*
> > + * Support for CDAT entires as defined in
> > + * Coherent Device Attribute Table (CDAT) Specification rev 1.02
> > + * Available from uefi.org.
> > + *
> > + * Copyright (c) 2021 Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +#include "qemu/osdep.h"
> > +#include "qemu/units.h"
> > +#include "qemu/error-report.h"
> > +#include "hw/mem/memory-device.h"
> > +#include "hw/mem/pc-dimm.h"
> > +#include "hw/pci/pci.h"
> > +#include "hw/pci/doe.h"
> > +#include "hw/qdev-properties.h"
> > +#include "qapi/error.h"
> > +#include "qemu/log.h"
> > +#include "qemu/module.h"
> > +#include "qemu/range.h"
> > +#include "qemu/rcu.h"
> > +#include "sysemu/hostmem.h"
> > +#include "sysemu/numa.h"
> > +#include "hw/cxl/cxl.h"
> > +
> > +void cdat_add_dsmas(CXLCDAT *cdat, uint8_t dsmad_handle, uint8_t flags,
> > +                    uint64_t dpa_base, uint64_t dpa_length)
> > +{
> > +    struct cxl_cdat_dsmas *dsmas = g_malloc0(sizeof(*dsmas));
> > +    dsmas->dsmad_handle = dsmad_handle;
> > +    dsmas->flags = flags;
> > +    dsmas->dpa_base = dpa_base;
> > +    dsmas->dpa_length = dpa_length;
> > +    cdat->dsmas_list = g_list_append(cdat->dsmas_list, dsmas);
> > +}
> > +
> > +void cdat_add_dslbis(CXLCDAT *cdat, uint8_t handle, uint8_t flags,
> > +                     uint8_t data_type, uint64_t base_unit,
> > +                     uint16_t entry0, uint16_t entry1, uint16_t entry2)
> > +{
> > +    struct cxl_cdat_dslbis *dslbis = g_malloc0(sizeof(*dslbis));
> > +    dslbis->handle = handle;
> > +    dslbis->flags = flags;
> > +    dslbis->data_type = data_type;
> > +    dslbis->entry_base_unit = base_unit;
> > +    dslbis->entries[0] = entry0;
> > +    dslbis->entries[1] = entry1;
> > +    dslbis->entries[2] = entry2;
> > +    cdat->dslbis_list = g_list_append(cdat->dslbis_list, dslbis);
> > +}
> > +
> > +void cdat_add_dsmscis(CXLCDAT *cdat, uint8_t dsmas_handle,
> > +                      uint64_t memory_sc_size, uint64_t cache_attrs)
> > +{
> > +    struct cxl_cdat_dsmscis *dsmscis = g_malloc(sizeof(*dsmscis));
> > +    dsmscis->dsmas_handle = dsmas_handle;
> > +    dsmscis->memory_side_cache_size = memory_sc_size;
> > +    dsmscis->cache_attributes = cache_attrs;
> > +    cdat->dsmscis_list = g_list_append(cdat->dsmscis_list, dsmscis);
> > +}
> > +
> > +void cdat_add_dsis(CXLCDAT *cdat, uint8_t flags, uint8_t handle)
> > +{
> > +    struct cxl_cdat_dsis *dsis = g_malloc(sizeof(*dsis));
> > +    dsis->flags = flags;
> > +    dsis->handle = handle;
> > +    cdat->dsis_list = g_list_append(cdat->dsis_list, dsis);
> > +}
> > +
> > +void cdat_add_dsemts(CXLCDAT *cdat, uint8_t dsmas_handle,
> > +                     uint8_t efi_mem_type_attr, uint64_t dpa_offset,
> > +                     uint64_t dpa_length)
> > +{
> > +    struct cxl_cdat_dsemts *dsemts = g_malloc(sizeof(*dsemts));
> > +    dsemts->dsmas_handle = dsmas_handle;
> > +    dsemts->efi_mem_type_attr = efi_mem_type_attr;
> > +    dsemts->dpa_offset = dpa_offset;
> > +    dsemts->dpa_length = dpa_length;
> > +    cdat->dsemts_list = g_list_append(cdat->dsemts_list, dsemts);
> > +}
> > +
> > +struct cxl_cdat_sslbis *cdat_add_sslbis(CXLCDAT *cdat, uint8_t num_entries,
> > +                                        uint8_t data_type, uint64_t base_unit)
> > +{
> > +    struct cxl_cdat_sslbis *sslbis =
> > +        g_malloc(sizeof(*sslbis) + num_entries * sizeof(sslbis->entries[0]));
> > +    sslbis->num_entries = num_entries;
> > +    sslbis->data_type = data_type;
> > +    sslbis->base_unit = base_unit;
> > +    cdat->sslbis_list = g_list_append(cdat->sslbis_list, sslbis);
> > +    return sslbis;
> > +}
> > +
> > +int cdata_sslbis_set_entry(struct cxl_cdat_sslbis *sslbis, uint8_t index,
> > +                           uint16_t portx, uint16_t porty, uint16_t val)
> > +{
> > +    struct cxl_cdat_sslbis_entry *entry;
> > +    if (index >= sslbis->num_entries) {
> > +        return -1;
> > +    }
> > +    entry = &sslbis->entries[index];
> > +    entry->port_x_id = portx;
> > +    entry->port_y_id = porty;
> > +    entry->val = val;
> > +    return 0;
> > +}
> > +
> > +int cxl_table_access(PCIEDOE *doe, uint16_t vendor_id, uint8_t object_type,
> > +                     void *priv)
> > +{
> > +    uint8_t table_type;
> > +    int total_entries;
> > +    uint16_t entry_handle;
> > +    CXLCDAT *cdat = priv;
> > +    uint16_t next_entry;
> > +
> > +    if (doe->req_length != 3) {
> > +        /* optional error for unexpected command length */
> > +        return -1;
> > +    }
> > +
> > +    if ((doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_RCODE) != 0) {
> > +        /*
> > +         * Only table access code currently supported.
> > +         * Error indication is lack of Data Object Ready
> > +         */
> > +        return -1;
> > +    }
> > +
> > +    table_type = (doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_TYPE) >>
> > +        ctz32(CXL_DOE_TABLE_ACCESS_DW2_TYPE);
> > +    if (table_type != 0) {
> > +        /* Unsuported table ID so just don't set Data Object Ready */
> > +        return -1;
> > +    }
> > +    entry_handle = (doe->store[2] & CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE) >>
> > +        ctz32(CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE);
> > +
> > +    /* Assume entry handle == CDAT structure index */
> > +    total_entries = g_list_length(cdat->dsmas_list) +
> > +        g_list_length(cdat->dslbis_list) +
> > +        g_list_length(cdat->dsmscis_list) +
> > +        g_list_length(cdat->dsis_list) +
> > +        g_list_length(cdat->dsemts_list) +
> > +        g_list_length(cdat->sslbis_list);
> > +    if (entry_handle + 1 == total_entries) {
> > +        next_entry = 0xFFFF;
> > +    } else {
> > +        next_entry = entry_handle + 1;
> > +    }
> > +
> > +    if (entry_handle < g_list_length(cdat->dsmas_list)) {
> > +        const int dsmas_len = 24;
> > +        struct cxl_cdat_dsmas *dsmas =
> > +            g_list_nth_data(cdat->dsmas_list, entry_handle);
> > +
> > +        doe->store[1] = 3 + dsmas_len / sizeof(uint32_t);
> > +        doe->store[2] = next_entry << 16;
> > +        doe->store[3] = (dsmas_len << 16) | CXL_CDAT_DSMAS_TYPE;
> > +
> > +        /* cxl version of proximity domain */
> > +        doe->store[4] = dsmas->dsmad_handle;
> > +        /* flags in 2nd byte of [4] (bit 2 is non volatile) */
> > +        doe->store[4] |= (dsmas->flags << 8);
> > +        doe->store[5] = dsmas->dpa_base & 0xFFFFFFFF;
> > +        doe->store[6] = (dsmas->dpa_base >> 32) & 0xFFFFFFFF;
> > +        doe->store[7] = dsmas->dpa_length & 0xFFFFFFFF;
> > +        doe->store[8] = (dsmas->dpa_length >> 32) & 0xFFFFFFFF;
> > +        return 0;
> > +    }
> > +    entry_handle -= g_list_length(cdat->dsmas_list);
> > +    if (entry_handle < g_list_length(cdat->dslbis_list)) {
> > +        const int dslbis_len = 24;
> > +        struct cxl_cdat_dslbis *dslbis =
> > +            g_list_nth_data(cdat->dslbis_list, entry_handle);
> > +
> > +        doe->store[1] = 3 + dslbis_len / sizeof(uint32_t);
> > +        doe->store[2] = next_entry << 16;
> > +        doe->store[3] = (dslbis_len << 16) | CXL_CDAT_DSLBIS_TYPE;
> > +
> > +        doe->store[4] = (dslbis->data_type << 24) | (dslbis->flags << 8) |
> > +            dslbis->handle;
> > +        doe->store[5] = dslbis->entry_base_unit & 0xFFFFFFFF;
> > +        doe->store[6] = (dslbis->entry_base_unit >> 32) & 0xFFFFFFFF;
> > +        doe->store[7] = (dslbis->entries[1] << 16) | dslbis->entries[0];
> > +        doe->store[8] = dslbis->entries[2];
> > +        return 0;
> > +    }
> > +    entry_handle -= g_list_length(cdat->dslbis_list);
> > +    if (entry_handle < g_list_length(cdat->dsmscis_list)) {
> > +        const int dsmscis_len = 20;
> > +        struct cxl_cdat_dsmscis *dsmscis =
> > +            g_list_nth_data(cdat->dsmscis_list, entry_handle);
> > +
> > +        doe->store[1] = 3 + dsmscis_len / sizeof(uint32_t);
> > +        doe->store[2] = next_entry << 16;
> > +        doe->store[3] = (dsmscis_len << 16) | CXL_CDAT_DSMSCIS_TYPE;
> > +        doe->store[4] = dsmscis->dsmas_handle;
> > +        doe->store[5] = dsmscis->memory_side_cache_size & 0xffffffff;
> > +        doe->store[6] = (dsmscis->memory_side_cache_size >> 32) & 0xffffffff;
> > +        doe->store[7] = dsmscis->cache_attributes;
> > +    }
> > +    entry_handle -= g_list_length(cdat->dsmscis_list);
> > +    if (entry_handle < g_list_length(cdat->dsis_list)) {
> > +        const int dsis_len = 8;
> > +        struct cxl_cdat_dsis *dsis =
> > +            g_list_nth_data(cdat->dsis_list, entry_handle);
> > +
> > +        doe->store[1] = 3 + dsis_len / sizeof(uint32_t);
> > +        doe->store[2] = next_entry << 16;
> > +        doe->store[3] = (dsis_len << 16) | CXL_CDAT_DSMSCIS_TYPE;
> > +        doe->store[4] = (dsis->handle << 8) | dsis->flags;
> > +    }
> > +    entry_handle -= g_list_length(cdat->dsis_list);
> > +    if (entry_handle < g_list_length(cdat->dsemts_list)) {
> > +        const int dsemts_len = 24;
> > +        struct cxl_cdat_dsemts *dsemts =
> > +            g_list_nth_data(cdat->dsemts_list, entry_handle);
> > +
> > +        doe->store[1] = 3 + dsemts_len / sizeof(uint32_t);
> > +        doe->store[2] = next_entry << 16;
> > +        doe->store[3] = (dsemts_len << 16) | CXL_CDAT_DSEMTS_TYPE;
> > +        doe->store[4] = (dsemts->efi_mem_type_attr << 8) | dsemts->dsmas_handle;
> > +        doe->store[5] = dsemts->dpa_offset & 0xffffffff;
> > +        doe->store[6] = (dsemts->dpa_offset >> 32) & 0xffffffff;
> > +        doe->store[7] = dsemts->dpa_length & 0xffffffff;
> > +        doe->store[8] = (dsemts->dpa_length >> 32) & 0xffffffff;
> > +    }
> > +    entry_handle -= g_list_length(cdat->dsemts_list);
> > +    if (entry_handle < g_list_length(cdat->sslbis_list)) {
> > +        struct cxl_cdat_sslbis *sslbis =
> > +            g_list_nth_data(cdat->sslbis_list, entry_handle);
> > +        int sslbis_len = 16 + 8 * sslbis->num_entries;
> > +        int i;
> > +
> > +        doe->store[1] = 3 + sslbis_len / sizeof(uint32_t);
> > +        doe->store[2] = next_entry << 16;
> > +        doe->store[3] = (sslbis_len << 16) | CXL_CDAT_SSLBIS_TYPE;
> > +        doe->store[4] = sslbis->data_type;
> > +        doe->store[5] = sslbis->base_unit & 0xffffffff;
> > +        doe->store[6] = (sslbis->base_unit >> 32) & 0xffffffff;
> > +        for (i = 0; i < sslbis->num_entries; i++) {
> > +            doe->store[7 + i * 2] = (sslbis->entries[i].port_y_id << 8) |
> > +                sslbis->entries[i].port_x_id;
> > +            doe->store[7 + i * 2 + 1] = sslbis->entries[i].val;
> > +        }
> > +    }
> > +
> > +    return -1;
> > +}
> > diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> > index 0eca715d10..9e2e5f4094 100644
> > --- a/hw/cxl/meson.build
> > +++ b/hw/cxl/meson.build
> > @@ -2,4 +2,5 @@ softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
> >    'cxl-component-utils.c',
> >    'cxl-device-utils.c',
> >    'cxl-mailbox-utils.c',
> > +  'cxl-cdat.c',
> >  ))
> > diff --git a/include/hw/cxl/cxl_cdat.h b/include/hw/cxl/cxl_cdat.h
> > new file mode 100644
> > index 0000000000..d0226c463c
> > --- /dev/null
> > +++ b/include/hw/cxl/cxl_cdat.h
> > @@ -0,0 +1,101 @@
> > +/*
> > + * Support for CDAT entires as defined in
> > + * Coherent Device Attribute Table (CDAT) Specification rev 1.02
> > + * Available from uefi.org.
> > + *
> > + * Copyright (c) 2021 Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +#ifndef QEMU_CXL_CDAT_H
> > +#define QEMU_CXL_CDAT_H
> > +#include "hw/pci/doe.h"
> > +/* DW2 is common to the request and response */
> > +#define CXL_DOE_TABLE_ACCESS_DW2_RCODE          0x000000ff
> > +#define CXL_DOE_TABLE_ACCESS_DW2_TYPE           0x0000ff00
> > +#define CXL_DOE_TABLE_ACCESS_DW2_ENTRYHANDLE    0xffff0000
> > +
> > +#define CXL_CDAT_DSMAS_TYPE     0
> > +#define CXL_CDAT_DSLBIS_TYPE    1
> > +#define CXL_CDAT_DSMSCIS_TYPE   2
> > +#define CXL_CDAT_DSIS_TYPE      3
> > +#define CXL_CDAT_DSEMTS_TYPE    4
> > +#define CXL_CDAT_SSLBIS_TYPE    5
> > +
> > +struct cxl_cdat_dsmas {
> > +    uint8_t dsmad_handle;
> > +    uint8_t flags;
> > +#define CDAT_DSMAS_FLAG_NV (1 << 2)
> > +    uint64_t dpa_base;
> > +    uint64_t dpa_length;
> > +};
> > +
> > +struct cxl_cdat_dslbis {
> > +    uint8_t handle;
> > +    uint8_t flags;
> > +    uint8_t data_type;
> > +    uint64_t entry_base_unit;
> > +    uint16_t entries[3]; /* 6 bytes */
> > +};
> > +
> > +struct cxl_cdat_dsmscis {
> > +    uint8_t dsmas_handle;
> > +    uint64_t memory_side_cache_size;
> > +    uint32_t cache_attributes;
> > +};
> > +
> > +struct cxl_cdat_dsis {
> > +    uint8_t flags;
> > +#define CDAT_DSIS_MEMORY_ATTACHED 0x01
> > +    uint8_t handle;
> > +};
> > +
> > +struct cxl_cdat_dsemts {
> > +    uint8_t dsmas_handle;
> > +    uint8_t efi_mem_type_attr;
> > +    uint64_t dpa_offset;
> > +    uint64_t dpa_length;
> > +};
> > +
> > +struct cxl_cdat_sslbis_entry {
> > +    uint16_t port_x_id;
> > +    uint16_t port_y_id;
> > +    uint16_t val;
> > +};
> > +
> > +struct cxl_cdat_sslbis {
> > +    uint8_t num_entries; /* needed to compute length */
> > +    uint8_t data_type;
> > +    uint64_t base_unit;
> > +    struct cxl_cdat_sslbis_entry entries[];
> > +};
> > +
> > +typedef struct cxl_cdat {
> > +    GList *dsmas_list;
> > +    GList *dslbis_list;
> > +    GList *dsmscis_list;
> > +    GList *dsis_list;
> > +    GList *dsemts_list;
> > +    GList *sslbis_list;
> > +} CXLCDAT;
> > +
> > +void cdat_add_dsmas(CXLCDAT *cdat, uint8_t dsmad_handle, uint8_t flags,
> > +                    uint64_t dpa_base, uint64_t dpa_length);
> > +void cdat_add_dslbis(CXLCDAT *cdat, uint8_t handle, uint8_t flags,
> > +                     uint8_t data_type, uint64_t base_unit,
> > +                     uint16_t entry0, uint16_t entry1, uint16_t entry2);
> > +void cdat_add_dsmscis(CXLCDAT *cdat, uint8_t dsmas_handle,
> > +                      uint64_t memory_sc_size, uint64_t cache_attrs);
> > +void cdat_add_dsis(CXLCDAT *cdat, uint8_t flags, uint8_t handle);
> > +void cdat_add_dsemts(CXLCDAT *cdat, uint8_t dsmas_handle,
> > +                     uint8_t efi_mem_type_attr, uint64_t dpa_offset,
> > +                     uint64_t dpa_length);
> > +struct cxl_cdat_sslbis *cdat_add_sslbis(CXLCDAT *cdat, uint8_t num_entries,
> > +                                        uint8_t data_type, uint64_t base_unit);
> > +int cdata_sslbis_set_entry(struct cxl_cdat_sslbis *sslbis, uint8_t index,
> > +                           uint16_t portx, uint16_t pory, uint16_t val);
> > +
> > +int cxl_table_access(PCIEDOE *doe, uint16_t vendor_id, uint8_t object_type,
> > +                     void *priv);
> > +#endif
> > -- 
> > 2.19.1
> > 
> >   
> _______________________________________________
> Linuxarm mailing list -- linuxarm@openeuler.org
> To unsubscribe send an email to linuxarm-leave@openeuler.org



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 1/4] include/standard-headers/linux/pci_regs: temp hack to add necessary DOE definitions.
  2021-02-02 15:39   ` Ben Widawsky
@ 2021-02-02 23:13     ` Michael S. Tsirkin
  0 siblings, 0 replies; 12+ messages in thread
From: Michael S. Tsirkin @ 2021-02-02 23:13 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Thomas Huth, linuxarm, Vishal Verma, f.fangjian, Chris Browy,
	qemu-devel, f4bug, Jonathan Cameron, Prashant V Agarwal,
	Igor Mammedov, Dan Williams, jcm

On Tue, Feb 02, 2021 at 07:39:51AM -0800, Ben Widawsky wrote:
> On 21-02-01 23:16:26, Jonathan Cameron wrote:
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> >  include/standard-headers/linux/pci_regs.h | 33 ++++++++++++++++++++++-
> >  1 file changed, 32 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
> > index e709ae8235..7e852d3dd0 100644
> > --- a/include/standard-headers/linux/pci_regs.h
> > +++ b/include/standard-headers/linux/pci_regs.h
> > @@ -730,7 +730,8 @@
> >  #define PCI_EXT_CAP_ID_DVSEC	0x23	/* Designated Vendor-Specific */
> >  #define PCI_EXT_CAP_ID_DLF	0x25	/* Data Link Feature */
> >  #define PCI_EXT_CAP_ID_PL_16GT	0x26	/* Physical Layer 16.0 GT/s */
> > -#define PCI_EXT_CAP_ID_MAX	PCI_EXT_CAP_ID_PL_16GT
> > +#define PCI_EXT_CAP_ID_DOE	0x2E	/* Data Object Exchange */
> > +#define PCI_EXT_CAP_ID_MAX	PCI_EXT_CAP_ID_DOE
> >  
> >  #define PCI_EXT_CAP_DSN_SIZEOF	12
> >  #define PCI_EXT_CAP_MCAST_ENDPOINT_SIZEOF 40
> > @@ -1092,4 +1093,34 @@
> >  #define  PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_MASK		0x000000F0
> >  #define  PCI_PL_16GT_LE_CTRL_USP_TX_PRESET_SHIFT	4
> >  
> > +/* Data Object Exchange */
> > +#define PCI_DOE_CAP		0x04
> > +#define  PCI_DOE_CAP_INT_SUPPORT			0x00000001
> > +#define  PCI_DOE_CAP_INT_MSG_NUM			0x00000FFE
> > +
> > +#define PCI_DOE_CTRL		0x08
> > +#define  PCI_DOE_CTRL_DOE_ABORT				0x00000001
> > +#define  PCI_DOE_CTRL_DOE_INT_EN			0x00000002
> > +#define  PCI_DOE_CTRL_DOE_GO				0x80000000
> > +
> > +#define PCI_DOE_STATUS		0x0c
> > +#define  PCI_DOE_STATUS_DOE_BUSY			0x00000001
> > +#define  PCI_DOE_STATUS_INT_STATUS			0x00000002
> > +#define  PCI_DOE_STATUS_DOE_ERROR			0x00000004
> > +#define  PCI_DOE_STATUS_DATA_OBJECT_READY		0x80000000
> > +
> > +#define PCI_DOE_WRITE_MAILBOX	0x10
> > +#define PCI_DOE_READ_MAILBOX	0x14
> > +
> > +/* Data Object Format DOE ECN 6.xx.1 */
> > +#define PCI_DATA_OBJ_DW0_VID				0x0000ffff
> > +#define PCI_DATA_OBJ_DW0_TYPE				0x00ff0000
> > +#define PCI_DATA_OBJ_DW1_LEN				0x0003ffff
> > +
> > +/* DOE Discover Data Object */
> > +#define PCI_DOE_DIS_OBJ_TYPE	 0x1
> > +#define PCI_DOE_DIS_REQ_D0_DW0_INDEX			0x000000ff
> > +#define PCI_DOE_DIS_RSP_DO_DW0_VID			0x0000ffff
> > +#define PCI_DOE_DIS_RSP_D0_DW0_PROT			0x00ff0000
> > +#define PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX		0xff000000
> >  #endif /* LINUX_PCI_REGS_H */
> 
> I think a lot of these should have had _MASK at the end.
> 
> As for the accuracy of the values, lgtm.

just add them in the source file where they are used.
standard-headers are over-written by scripts, adding
your own macros there won't help.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 0/4] hw/cxl/ + /hw/pci/: PCI DOE + CXL CDAT emulation
  2021-02-01 15:16 [RFC PATCH 0/4] hw/cxl/ + /hw/pci/: PCI DOE + CXL CDAT emulation Jonathan Cameron
                   ` (3 preceding siblings ...)
  2021-02-01 15:16 ` [RFC PATCH 4/4] hw/mem/cxl_type3: Enabled DOE mailbox for access to CDAT Jonathan Cameron
@ 2021-02-03 17:32 ` Jonathan Cameron
  4 siblings, 0 replies; 12+ messages in thread
From: Jonathan Cameron @ 2021-02-03 17:32 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Ben Widawsky, Michael S . Tsirkin, Vishal Verma,
	f.fangjian, Chris Browy, f4bug, linuxarm, jcm,
	Prashant V Agarwal, Igor Mammedov, Dan Williams

On Mon, 1 Feb 2021 23:16:25 +0800
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> Whilst I know others are working on an implementation of at least some of
> this, a desire to work on the kernel user of this required an
> implementation so I threw this together in the meantime and am sending
> it out on the basis that it may be of use to others.
> 
> As it is an RFC I have been lazy on some cleanup and error handling.
> Will fix that in a future version if we decide to take this forwards.

In the interests of info for anyone seeing this, note there is an
alternative version from Chris Browy and colleagues.

https://lore.kernel.org/qemu-devel/alpine.LRH.2.23.451.2102021543190.30097@server4/

Note that it's proved useful because we disagreed on a few things some
of which I believe need spec clarification.
(definitely one spec misread in this version - oops).

> 
> Based on Ben's cxl-2.0v3 branch from https://gitlab.com/bwidawsk/qemu
> Possible I've picked an unstable branch so this may or may not currently
> apply :)  I'll rebase on Ben's next version when avaialble.
> 
> Jonathan Cameron (4):
>   include/standard-headers/linux/pci_regs: temp hack to add necessary
>     DOE definitions.
>   hw/pci/pcie_doe: Introduce utility functions for PCIe DOE
>   hw/cxl/cxl-cdat: Initial CDAT implementation for use by CXL devices
>   hw/mem/cxl_type3: Enabled DOE mailbox for access to CDAT
> 
>  hw/cxl/cxl-cdat.c                         | 252 +++++++++++++++++++++
>  hw/cxl/meson.build                        |   1 +
>  hw/mem/cxl_type3.c                        |  49 ++++-
>  hw/pci/meson.build                        |   2 +-
>  hw/pci/pcie_doe.c                         | 257 ++++++++++++++++++++++
>  include/hw/cxl/cxl.h                      |   1 +
>  include/hw/cxl/cxl_cdat.h                 | 101 +++++++++
>  include/hw/pci/doe.h                      |  40 ++++
>  include/hw/pci/pci_ids.h                  |   2 +
>  include/standard-headers/linux/pci_regs.h |  33 ++-
>  10 files changed, 734 insertions(+), 4 deletions(-)
>  create mode 100644 hw/cxl/cxl-cdat.c
>  create mode 100644 hw/pci/pcie_doe.c
>  create mode 100644 include/hw/cxl/cxl_cdat.h
>  create mode 100644 include/hw/pci/doe.h
> 



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC PATCH 2/4] hw/pci/pcie_doe: Introduce utility functions for PCIe DOE
  2021-02-02 17:54   ` Ben Widawsky
@ 2021-02-03 18:01     ` Jonathan Cameron
  0 siblings, 0 replies; 12+ messages in thread
From: Jonathan Cameron @ 2021-02-03 18:01 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Thomas Huth, linuxarm, Michael S . Tsirkin, Vishal Verma,
	f.fangjian, Chris Browy, qemu-devel, f4bug, jcm,
	Prashant V Agarwal, Igor Mammedov, Dan Williams

On Tue, 2 Feb 2021 09:54:11 -0800
Ben Widawsky <ben@bwidawsk.net> wrote:

> This was a bit more complicated than I was anticipating :-)
> 
> On 21-02-01 23:16:27, Jonathan Cameron wrote:
> > This implements the ECN to the PCI 5.0 specification available at
> > https://members.pcisig.com/wg/PCI-SIG/document/14143
> > 
> > Does not currently support interrupts.
> > 
> > Note that currently no attempt is made to clean up allocated memory.
> > 
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Hi Ben,

Thanks, for taking a look. 

I've only commented when more info to give or disagree.

> > ---
> >  hw/pci/meson.build       |   2 +-
> >  hw/pci/pcie_doe.c        | 257 +++++++++++++++++++++++++++++++++++++++
> >  include/hw/pci/doe.h     |  40 ++++++
> >  include/hw/pci/pci_ids.h |   2 +
> >  4 files changed, 300 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/pci/meson.build b/hw/pci/meson.build
> > index 5c4bbac817..7336620ee3 100644
> > --- a/hw/pci/meson.build
> > +++ b/hw/pci/meson.build
> > @@ -11,7 +11,7 @@ pci_ss.add(files(
> >  # The functions in these modules can be used by devices too.  Since we
> >  # allow plugging PCIe devices into PCI buses, include them even if
> >  # CONFIG_PCI_EXPRESS=n.
> > -pci_ss.add(files('pcie.c', 'pcie_aer.c'))
> > +pci_ss.add(files('pcie.c', 'pcie_aer.c',  'pcie_doe.c'))
> >  softmmu_ss.add(when: 'CONFIG_PCI_EXPRESS', if_true: files('pcie_port.c', 'pcie_host.c'))
> >  softmmu_ss.add_all(when: 'CONFIG_PCI', if_true: pci_ss)
> >  
> > diff --git a/hw/pci/pcie_doe.c b/hw/pci/pcie_doe.c
> > new file mode 100644
> > index 0000000000..8739c41280
> > --- /dev/null
> > +++ b/hw/pci/pcie_doe.c
> > @@ -0,0 +1,257 @@
> > +/*
> > + * pcie_doe.c
> > + * utility functions for pci express data object exchange introduced
> > + * in PCI 5.0 Data Object Exchange (DOE) ECN
> > + *
> > + * Copyright (c) 2021 Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/units.h"
> > +#include "qemu/error-report.h"
> > +#include "hw/pci/pci.h"
> > +#include "hw/pci/doe.h"
> > +#include "hw/qdev-properties.h"
> > +#include "qapi/error.h"
> > +#include "qemu/log.h"
> > +#include "qemu/module.h"
> > +#include "qemu/range.h"
> > +#include "qemu/rcu.h"
> > +#include "sysemu/hostmem.h"
> > +  
> 
> I know it's RFC and quickly thrown together, but:
> 
> /* VID and Type */
> #define DOE_DATA_OBJECT_HDR1 0
> /* Length */
> #define DOE_DATA_OBJECT_HDR2 1
> /* Index */
> #define DOE_DATA_OBJECT_REQUEST_DATA 2
> #define DOE_DATA_OBJECT_RESPONSE_DATA 3
> 
> Then use that throughout below, please?
> 
> > +struct doe_handler {
> > +    uint16_t vendor_id;
> > +    uint8_t object_type;
> > +    doe_msg_handler_t handler;
> > +    void *priv;
> > +};
> > +
> > +static void doe_set_ctl(PCIEDOE *doe, uint32_t val)
> > +{
> > +    /* Abort */
> > +    if (val & PCI_DOE_CTRL_DOE_ABORT) {
> > +        doe->req_index = 0;
> > +        doe->rsp_index = 0;
> > +        doe->req_length = 0;
> > +        doe->error = false;
> > +        doe->data_object_ready = false;
> > +    }
> > +
> > +    if (val & PCI_DOE_CTRL_DOE_GO) {
> > +        GList *l;
> > +        uint16_t vendor_id = doe->store[0] & PCI_DATA_OBJ_DW0_VID;
> > +        uint8_t object_type = (doe->store[0] & PCI_DATA_OBJ_DW0_TYPE) >>
> > +            ctz32(PCI_DATA_OBJ_DW0_TYPE);  
> 
> I think it'd be much nicer to read this as REG32/FIELD_EX32.

Probably :)

> 
> > +        if ((doe->req_index != 3) || (doe->req_length != 3)) {
> > +            /*
> > +             * Not entirely clear what should happen if req_length is correct
> > +             * buf insufficient data has been received.  
> 
> s/buf/but
> 
> Also, maybe for more resiliency and readability, with a comment about why '3':
> if ((doe->req_index < 3) ...
> 
> I think it'd be much more readable if you pulled the index directly out of the
> register store here instead of when you set.
> 
> len = doe->store[DOE_DATA_OBJECT_HDR2] & PCI_DATA_OBJ_DW1_LEN;
> if (!len)
> 	len = 1<<18;
> 
> > +             */
> > +            doe->error = true;
> > +            return;  
> 
> I don't think this should be an error:
> "If the Length is shorter than expected for a specific data object, then the data
> object must be silently discarded.

Comment was to tell me to come back and check it.  Hinges on the bit below.

> 
> If the Length is greater than expected for a specific data object, then the
> portion of the data object up to the expected length must be processed normally
> and the remainder of the data object must be silently discarded."

There is a gotcha around that one that still has me scratching my head.
"Any of the following events must result in the DOE error bit being set:"
...
"Optionally, if the associated data object protocol does not provide an
alternative mechanism for reporting such errors, the transfer of a data object that
does not correspond to the expected length of the data object".

So I think the normally and dropping statement you've quoted may mean
that you wouldn't expect software to check the error bit before continuing to
write to the mailbox. Thus it should carry on allowing writes and just dropping
them.

Of course, an implementation might optionally 'not' set the error in this case.


> 
> > +        }
> > +        /* Discovery protocol - DOE ECN */
> > +        if (vendor_id == PCI_VENDOR_ID_PCI_SIG &&
> > +            object_type == PCI_DOE_DIS_OBJ_TYPE) {
> > +            uint8_t index = doe->store[2] & PCI_DOE_DIS_REQ_D0_DW0_INDEX;
> > +            doe->store[1] = 3;
> > +            if (index == 0) {
> > +                /* First entry is this one, the discovery protocol itself */
> > +                uint8_t next;
> > +
> > +                if (doe->cb_list) {
> > +                    next = index + 1;
> > +                } else {
> > +                    next = 0;
> > +                }  
> 
> I think a comment here that you're terminating the list if no callbacks are
> registered would be good.
> 
> > +                doe->store[2] =
> > +                    (next << ctz32(PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX)) |
> > +                    (0 << ctz32(PCI_DOE_DIS_RSP_D0_DW0_PROT)) |
> > +                    0x0001;  
> 
> Same comment about FIELD_DP|EX32. I'd try it and see how it looks.
> 
> Would probably be nice to #define some vendor ID, or use fields so it's obvious
> why you have a 0x1.
> 
> > +            } else {
> > +                /* Other entries based on register callbacks */
> > +                uint8_t next;
> > +                struct doe_handler *h;
> > +
> > +                h = g_list_nth_data(doe->cb_list, index - 1);  
> 
> It wasn't immediately obvious to me why index - 1. Maybe it'd be a little nicer
> if you had a helper:
> 
> static inline struct doe_handler *doe_get_handler(PCIEDOE *doe, int doe_index)
> {
> 	struct doe_handler *ret = NULL;
> 
> 	// assert(doe->cb_list);
> 	if (doe->cb_list)
> 		ret = g_list_nth_data(doe->cb_list, doe_index - 1);
> 
> 	return ret;
> }
> 
> > +                /*
> > +                 * Off end of list, Table 7-x4 in DOE ECN -
> > +                 * Vendor ID 0xFFFF if no more indices
> > +                 */
> > +                if (h == NULL) {
> > +                    doe->store[2] = 0xFFFF;
> > +                } else {
> > +                    if (g_list_nth(doe->cb_list, index)) {
> > +                        next = index + 1;
> > +                    } else {
> > +                        next = 0;
> > +                    }  
> 
> A bit confusing here as well IMO, but I don't know a better way to write it
> other than adding comments about index effectively being off by 1.
> 
> > +                    doe->store[2] =
> > +                        (next << ctz32(PCI_DOE_DIS_RSP_D0_DW0_NEXT_INDEX)) |
> > +                        (h->object_type << ctz32(PCI_DOE_DIS_RSP_D0_DW0_PROT)) |
> > +                        h->vendor_id;
> > +                }
> > +            }
> > +            doe->data_object_ready = true;
> > +            doe->rsp_index = 0;
> > +        } else {
> > +            for (l = doe->cb_list; l != NULL; l = l->next) {
> > +                struct doe_handler *h = l->data;
> > +                if (h->vendor_id == vendor_id &&
> > +                    h->object_type == object_type) {
> > +                    int ret = h->handler(doe, h->vendor_id, h->object_type,
> > +                                         h->priv);
> > +                    if (ret) {
> > +                        /*
> > +                         * No response so as per 6.xx.1 in DOE ECN
> > +                         * "... within 1 second after the DOE Go bit was Set
> > +                         *  in the DOE Control register, otherwise the DOE
> > +                         *  instance must Set the DOE Error bit in the DOE
> > +                         *  Status register.."
> > +                         */
> > +                         doe->error = true;  
> 
> extra whitespace
> 
> > +                        break;
> > +                    }
> > +                    doe->data_object_ready = true;
> > +                    doe->rsp_index = 0;
> > +                    break;
> > +                }
> > +            }
> > +            /* Comamnd not handled */
> > +            if (l == NULL) {
> > +                doe->error = true;
> > +            }
> > +        }
> > +        /* Reset input index to allow for a new message */
> > +        doe->req_index = 0;
> > +    }
> > +}
> > +
> > +static void doe_set_write_mailbox(PCIEDOE *doe, uint32_t val)
> > +{
> > +    if (doe->req_index == 1) {
> > +        if (val & 0x3FFFF) {  
> 
> val & PCI_DATA_OBJ_DW1_LEN
> 
> > +            doe->req_length = val & PCI_DATA_OBJ_DW1_LEN;
> > +        } else {
> > +            doe->req_length = 1 << 18;
> > +        }
> > +    }
> > +    if (doe->req_length && doe->req_index == doe->req_length) {
> > +        /*
> > +         * 6.xx.1 Data Objects
> > +         * If the DW transferred does not match the indicated Length
> > +         * for a data object, then the data object must be
> > +         * silently discarded
> > +         */
> > +        return;
> > +    }
> > +    doe->store[doe->req_index] = val;
> > +    doe->req_index++;
> > +}
> > +
> > +static uint32_t doe_get_read_mailbox(PCIEDOE *doe)
> > +{
> > +    uint32_t val;
> > +
> > +    if (doe->rsp_index == 0) {
> > +        doe->rsp_length = doe->store[1] & PCI_DATA_OBJ_DW1_LEN;
> > +    }
> > +    if (!doe->data_object_ready) {
> > +        /* Underflow of the Read Data Mailbox Mechanism */
> > +        doe->error = true;
> > +        return 0;
> > +    }  
> 
> I don't think this should be an error condition. Could you please explain?
More fun with spec...
"Any of the following events must result in the DOE error bit being set"
"Underflow of the Read Data Mailbox Mechanism".

A read before data_object_ready is to me a read of an empty mailbox so
hits that.  Open to interpretation...

I guess I'll have annoying our PCI-SIG people...

> 
> > +
> > +    val = doe->store[doe->rsp_index];
> > +    doe->rsp_index++;
> > +    if (doe->rsp_index == doe->rsp_length) {
> > +        doe->rsp_index = -1;
> > +        doe->data_object_ready = false;
> > +    }
> > +
> > +    return val;
> > +}
> > +
> > +static uint32_t doe_get_status(PCIEDOE *doe)
> > +{
> > +    uint32_t val = 0;
> > +
> > +    if (doe->busy) {
> > +        val |= PCI_DOE_STATUS_DOE_BUSY;
> > +    }  
> 
> Do you actually intend to model busy?
It's an interesting corner.  We probably want to be able to to check
that OS code is doing the right thing.  + It is possible we'll get writes
from multiple CPUs so maybe it can race?

Obviously right now it does nothing.
 
> 
> > +    /* bit 1: interrupt not yet supported */
> > +    if (doe->error) {
> > +        val |= PCI_DOE_STATUS_DOE_ERROR;
> > +    }
> > +    if (doe->data_object_ready) {
> > +        val |= PCI_DOE_STATUS_DATA_OBJECT_READY;
> > +    }
> > +
> > +    return val;
> > +}
> > +
> > +void doe_add_message_handler(PCIEDOE *doe, uint16_t vendor_id,
> > +                             uint8_t object_type,
> > +                             const doe_msg_handler_t handler, void *priv)
> > +{
> > +    struct doe_handler *h = g_malloc0(sizeof(*handler));
> > +
> > +    h->vendor_id = vendor_id;
> > +    h->object_type = object_type;
> > +    h->handler = handler;
> > +    h->priv = priv;
> > +    doe->cb_list = g_list_append(doe->cb_list, h);
> > +}
> > +
> > +uint32_t pcie_doe_ecap(PCIEDOE *doe, PCIDevice *d, uint16_t offset)
> > +{
> > +    doe->doe_base = offset;
> > +    /* Length field is 18 bits and is in dwords */
> > +    doe->store = g_malloc0((1 << 18) * sizeof(uint32_t));  
> 
> I think it's fine if you're instantiating DOE to just have this be statically
> allocated.

Probably true for now.  Might want to make it flexible given how large it
needs to be is dependent on what commands are actually supported.
Nice to be smaller if we never need it to be so big.

> 
> > +
> > +    pcie_add_capability(d, PCI_EXT_CAP_ID_DOE, 1, offset, 0x18);
> > +    offset += 0x18;
> > +
> > +    return offset;
> > +}
> > +
> > +void pcie_doe_write(PCIEDOE *doe, uint32_t addr, uint32_t val, int len)
> > +{
> > +    if (len != 4) {
> > +        return;
> > +    }   
> 
> Do you want to check alignment also?

I 'think' config writes are always aligned, but will check.

> 
> > +
> > +    switch (addr - doe->doe_base) {
> > +    case PCI_DOE_CTRL:
> > +        doe_set_ctl(doe, val);
> > +        break;
> > +    case PCI_DOE_WRITE_MAILBOX:
> > +        doe_set_write_mailbox(doe, val);
> > +        break;
> > +    }
> > +}
> > +
> > +uint32_t pcie_doe_read(PCIEDOE *doe, uint32_t addr, int len, int *found)
> > +{
> > +    if (len != 4) {
> > +        *found = 0;
> > +        return 0;
> > +    }  
> 
> Same comment as _write
> 
> > +
> > +    *found = 1;
> > +    switch (addr - doe->doe_base) {
> > +    case PCI_DOE_CAP:
> > +        return 0; /* No interrupt support */
> > +    case PCI_DOE_STATUS:
> > +        return doe_get_status(doe);
> > +    case PCI_DOE_READ_MAILBOX:
> > +        return doe_get_read_mailbox(doe);
> > +    default:
> > +        *found = 0;
> > +        return 0;
> > +    }
> > +}  
> 
> I'm guessing you don't love 'found'. Perhaps you can have this return an int64_t
> and negative values are errors?

I thought about it but that is also rather uggly.

> 
> I'm not yet convinced you even need that level of error handling though. I
> suppose next patch will tell me.

Found isn't about error handling, it's about saving time if we have lots
of device specific config space callbacks.  Don't want to waste time running
callbacks as a given config register should only be handled once.

> 
> > +
> > diff --git a/include/hw/pci/doe.h b/include/hw/pci/doe.h
> > new file mode 100644
> > index 0000000000..364c866c53
> > --- /dev/null
> > +++ b/include/hw/pci/doe.h
> > @@ -0,0 +1,40 @@
> > +/*
> > + * PCIE DOE emulation.
> > + *
> > + * Copyright (c) 2021 Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + */
> > +
> > +#ifndef QEMU_PCIE_DOE_H_
> > +#define QEMU_PCIE_DOE_H_
> > +#include "qemu/osdep.h"
> > +#include "qemu/units.h"
> > +#include "hw/qdev-properties.h"
> > +#include "qapi/error.h"
> > +#include "qemu/module.h"
> > +
> > +typedef struct pcie_doe {
> > +    uint32_t doe_base;
> > +    GList *cb_list;
> > +    int req_index;
> > +    int req_length;
> > +    int rsp_index;
> > +    int rsp_length;
> > +    bool data_object_ready;
> > +    bool error;
> > +    bool busy;
> > +    uint32_t *store;
> > +} PCIEDOE;
> > +
> > +typedef int (*doe_msg_handler_t)(PCIEDOE *doe, uint16_t vendor_id,
> > +                                 uint8_t object_type, void *priv);
> > +
> > +uint32_t pcie_doe_ecap(PCIEDOE *doe, PCIDevice *d, uint16_t offset);
> > +void doe_add_message_handler(PCIEDOE *doe, uint16_t vendor_id,
> > +                             uint8_t object_type,
> > +                             const doe_msg_handler_t handler, void *priv);
> > +uint32_t pcie_doe_read(PCIEDOE *doe, uint32_t addr, int len, int *found);
> > +void pcie_doe_write(PCIEDOE *doe, uint32_t addr, uint32_t val, int len);
> > +#endif
> > diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
> > index 76bf3ed590..636b2e8017 100644
> > --- a/include/hw/pci/pci_ids.h
> > +++ b/include/hw/pci/pci_ids.h
> > @@ -157,6 +157,8 @@
> >  
> >  /* Vendors and devices.  Sort key: vendor first, device next. */
> >  
> > +#define PCI_VENDOR_ID_PCI_SIG            0x0001
> > +
> >  #define PCI_VENDOR_ID_LSI_LOGIC          0x1000
> >  #define PCI_DEVICE_ID_LSI_53C810         0x0001
> >  #define PCI_DEVICE_ID_LSI_53C895A        0x0012
> > -- 
> > 2.19.1
> > 
> >   



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-02-03 18:28 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-01 15:16 [RFC PATCH 0/4] hw/cxl/ + /hw/pci/: PCI DOE + CXL CDAT emulation Jonathan Cameron
2021-02-01 15:16 ` [RFC PATCH 1/4] include/standard-headers/linux/pci_regs: temp hack to add necessary DOE definitions Jonathan Cameron
2021-02-02 15:39   ` Ben Widawsky
2021-02-02 23:13     ` Michael S. Tsirkin
2021-02-01 15:16 ` [RFC PATCH 2/4] hw/pci/pcie_doe: Introduce utility functions for PCIe DOE Jonathan Cameron
2021-02-02 17:54   ` Ben Widawsky
2021-02-03 18:01     ` Jonathan Cameron
2021-02-01 15:16 ` [RFC PATCH 3/4] hw/cxl/cxl-cdat: Initial CDAT implementation for use by CXL devices Jonathan Cameron
2021-02-02 18:49   ` Ben Widawsky
2021-02-02 19:18     ` [Linuxarm] " Jonathan Cameron
2021-02-01 15:16 ` [RFC PATCH 4/4] hw/mem/cxl_type3: Enabled DOE mailbox for access to CDAT Jonathan Cameron
2021-02-03 17:32 ` [RFC PATCH 0/4] hw/cxl/ + /hw/pci/: PCI DOE + CXL CDAT emulation Jonathan Cameron

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).