* [RFC PATCH 00/25] Introduce CXL 2.0 Emulation
@ 2020-11-11  5:46 Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 01/25] Temp: Add the PCI_EXT_ID_DVSEC definition to the qemu pci_regs.h copy Ben Widawsky
                   ` (26 more replies)
  0 siblings, 27 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky

Introduce emulation of Compute Express Link 2.0, which was released
today at https://www.computeexpresslink.org/.

I've pushed a branch here: https://gitlab.com/bwidawsk/qemu/-/tree/cxl-2.0

The emulation has been critical to get the Linux enabling started
(https://lore.kernel.org/linux-cxl/). It would also be an ideal place to land
regression tests for different topology handling, and there may be applications
for this emulation as a way for a guest to manipulate its address space relative
to different performance memories. I am new to QEMU development, so please
forgive me and point me in the right direction if I have severely misinterpreted
where a piece of infrastructure belongs.

Three of the five CXL component types are emulated with some level of functionality:
host bridge, root port, and memory device. Upstream ports and downstream ports
aren't implemented (the two components needed to make up a switch).

CXL 2.0 is built on top of PCIe (see spec for details). As a result, much of the
implementation utilizes existing PCI paradigms. To implement the host bridge,
I've chosen to use PXB (PCI Expander Bridge). It seemed to be the most natural
fit even though it doesn't directly map to how hardware will work. For
persistent capacity of the memory device, I utilized the memory subsystem
(hw/mem).

We have 3 reasons why this work is valuable:
1. OS driver development and testing
2. OS driver regression testing
3. Possible guest support for HDMs

In more detail, the three benefits of carrying this enabling in
upstream QEMU are:

1. Linux driver feature development benefits from emulation, both because
hardware is not initially available and because, as seen with
NVDIMM/PMEM emulation, there is value in being able to share
topologies with system-software developers even after hardware is
available.

2. The Linux kernel's unit test suite for NVDIMM/PMEM ended up injecting fake
resources via custom modules (nfit_test). In retrospect a QEMU emulation of
nfit_test capabilities would have made the test environment more portable, and
allowed for easier community contributions of example configurations.

3. Guest support for HDMs is still being fleshed out, but in short the emulation
provides a standardized mechanism for the guest to give the host feedback about
the size and placement needs of its memory. After the host gives the guest a
physical window mapping to the CXL device, the emulated HDM decoders give the
guest a way to tell the host how much memory it wants and where. There are
likely simpler ways to do this, but they'd require inventing a new interface,
and the guest's code for programming the HDM decoder would diverge from the
host's. Since we've already done this work, why not use it?
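
As a rough illustration of point 3, the sketch below shows how a single
committed HDM decoder could map a host physical address into a device-relative
offset. It is a minimal standalone sketch, not code from this series: the
struct hdm_decoder type and the hdm_base(), hdm_size() and hdm_translate()
helpers are hypothetical names, interleaving is ignored, and the way the SIZE
registers are combined is an assumption. Only the BASE_LO layout (bits 31:28
defined) follows the register definitions in the series.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical software view of one decoder. The field names mirror the
 * CXL_HDM_DECODER0_* registers; the struct itself is not part of the series.
 */
struct hdm_decoder {
    uint32_t base_lo;
    uint32_t base_hi;
    uint32_t size_lo;
    uint32_t size_hi;
    bool committed;
};

static inline uint64_t hdm_base(const struct hdm_decoder *d)
{
    /* Only bits 31:28 of BASE_LO are defined, so mask the rest off. */
    return ((uint64_t)d->base_hi << 32) | (d->base_lo & 0xf0000000u);
}

static inline uint64_t hdm_size(const struct hdm_decoder *d)
{
    /* Assumption: SIZE_HI:SIZE_LO is simply a 64-bit byte count. */
    return ((uint64_t)d->size_hi << 32) | d->size_lo;
}

/*
 * Translate a host physical address to a device-relative offset.
 * Returns false if the decoder is not committed or the address is outside
 * the programmed window. No interleave handling (assumption).
 */
static inline bool hdm_translate(const struct hdm_decoder *d, uint64_t hpa,
                                 uint64_t *dpa)
{
    uint64_t base = hdm_base(d);

    if (!d->committed || hpa < base || hpa - base >= hdm_size(d)) {
        return false;
    }
    *dpa = hpa - base;
    return true;
}
```

With a window programmed at 0x110000000 and a size of 256 MiB, an access to
0x110001000 would resolve to offset 0x1000, while an access at or beyond
0x120000000 would be rejected.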

There is quite a long list of work to do for full spec compliance, but I don't
believe that any of it precludes merging. Off the top of my head:
- Main host bridge support (WIP)
- Interleaving
- Better Tests
- Huge swaths of firmware functionality
- Hot plug support
- Emulating volatile capacity

The flow of the patches in general is to define all the data structures and
registers associated with the various components in a top-down manner: host
bridge, component, ports, devices. Then, the actual implementation is done in
the same order.

The summary is:
1-8: Put infrastructure in place for emulation of the components.
9-11: Create the concept of a CXL bus and plumb into PXB
12-16: Implement host bridges
17: Implement a root port
18: Implement a memory device
19: Implement HDM decoders
20-24: ACPI bits
25: Start working on enabling the main host bridge

Ben Widawsky (23):
  hw/pci/cxl: Add a CXL component type (interface)
  hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
  hw/cxl/device: Introduce a CXL device (8.2.8)
  hw/cxl/device: Implement the CAP array (8.2.8.1-2)
  hw/cxl/device: Add device status (8.2.8.3)
  hw/cxl/device: Implement basic mailbox (8.2.8.4)
  hw/cxl/device: Add memory devices (8.2.8.5)
  hw/pxb: Use a type for realizing expanders
  hw/pci/cxl: Create a CXL bus type
  hw/pxb: Allow creation of a CXL PXB (host bridge)
  acpi/pci: Consolidate host bridge setup
  hw/pci: Plumb _UID through host bridges
  hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
  acpi/pxb/cxl: Reserve host bridge MMIO
  hw/pxb/cxl: Add "windows" for host bridges
  hw/cxl/rp: Add a root port
  hw/cxl/device: Add a memory device (8.2.8.5)
  hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)
  acpi/cxl: Add _OSC implementation (9.14.2)
  acpi/cxl: Create the CEDT (9.14.1)
  Temp: acpi/cxl: Add ACPI0017 (CEDT awareness)
  WIP: i386/cxl: Initialize a host bridge
  qtest/cxl: Add very basic sanity tests

Jonathan Cameron (1):
  Temp: Add the PCI_EXT_ID_DVSEC definition to the qemu pci_regs.h copy.

Vishal Verma (1):
  acpi/cxl: Introduce a compat-driver UUID for CXL _OSC

 MAINTAINERS                               |   6 +
 hw/Kconfig                                |   1 +
 hw/acpi/Kconfig                           |   5 +
 hw/acpi/cxl.c                             | 198 +++++++++++++
 hw/acpi/meson.build                       |   1 +
 hw/arm/virt.c                             |   1 +
 hw/core/machine.c                         |  26 ++
 hw/core/numa.c                            |   3 +
 hw/cxl/Kconfig                            |   3 +
 hw/cxl/cxl-component-utils.c              | 192 +++++++++++++
 hw/cxl/cxl-device-utils.c                 | 293 +++++++++++++++++++
 hw/cxl/cxl-mailbox-utils.c                | 139 +++++++++
 hw/cxl/meson.build                        |   5 +
 hw/i386/acpi-build.c                      |  87 +++++-
 hw/i386/microvm.c                         |   1 +
 hw/i386/pc.c                              |   2 +
 hw/mem/Kconfig                            |   5 +
 hw/mem/cxl_type3.c                        | 334 ++++++++++++++++++++++
 hw/mem/meson.build                        |   1 +
 hw/meson.build                            |   1 +
 hw/pci-bridge/Kconfig                     |   5 +
 hw/pci-bridge/cxl_root_port.c             | 231 +++++++++++++++
 hw/pci-bridge/meson.build                 |   1 +
 hw/pci-bridge/pci_expander_bridge.c       | 209 +++++++++++++-
 hw/pci-bridge/pcie_root_port.c            |   6 +-
 hw/pci/pci.c                              |  32 ++-
 hw/pci/pcie.c                             |  30 ++
 hw/ppc/spapr.c                            |   2 +
 include/hw/acpi/cxl.h                     |  27 ++
 include/hw/boards.h                       |   2 +
 include/hw/cxl/cxl.h                      |  30 ++
 include/hw/cxl/cxl_component.h            | 181 ++++++++++++
 include/hw/cxl/cxl_device.h               | 199 +++++++++++++
 include/hw/cxl/cxl_pci.h                  | 155 ++++++++++
 include/hw/pci/pci.h                      |  15 +
 include/hw/pci/pci_bridge.h               |  25 ++
 include/hw/pci/pci_bus.h                  |   8 +
 include/hw/pci/pci_ids.h                  |   1 +
 include/standard-headers/linux/pci_regs.h |   1 +
 monitor/hmp-cmds.c                        |  15 +
 qapi/machine.json                         |   1 +
 tests/qtest/cxl-test.c                    |  93 ++++++
 tests/qtest/meson.build                   |   4 +
 43 files changed, 2547 insertions(+), 30 deletions(-)
 create mode 100644 hw/acpi/cxl.c
 create mode 100644 hw/cxl/Kconfig
 create mode 100644 hw/cxl/cxl-component-utils.c
 create mode 100644 hw/cxl/cxl-device-utils.c
 create mode 100644 hw/cxl/cxl-mailbox-utils.c
 create mode 100644 hw/cxl/meson.build
 create mode 100644 hw/mem/cxl_type3.c
 create mode 100644 hw/pci-bridge/cxl_root_port.c
 create mode 100644 include/hw/acpi/cxl.h
 create mode 100644 include/hw/cxl/cxl.h
 create mode 100644 include/hw/cxl/cxl_component.h
 create mode 100644 include/hw/cxl/cxl_device.h
 create mode 100644 include/hw/cxl/cxl_pci.h
 create mode 100644 tests/qtest/cxl-test.c

-- 
2.29.2




* [RFC PATCH 01/25] Temp: Add the PCI_EXT_ID_DVSEC definition to the qemu pci_regs.h copy.
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 02/25] hw/pci/cxl: Add a CXL component type (interface) Ben Widawsky
                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky, Jonathan Cameron

From: Jonathan Cameron <Jonathan.Cameron@huawei.com>

This hasn't yet been added to the Linux kernel tree, so for the purposes
of this RFC just add it locally.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 include/standard-headers/linux/pci_regs.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
index a95d55f9f2..5d0b79b9da 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -723,6 +723,7 @@
 #define PCI_EXT_CAP_ID_DPC	0x1D	/* Downstream Port Containment */
 #define PCI_EXT_CAP_ID_L1SS	0x1E	/* L1 PM Substates */
 #define PCI_EXT_CAP_ID_PTM	0x1F	/* Precision Time Measurement */
+#define PCI_EXT_CAP_ID_DVSEC	0x23    /* Designated Vendor-Specific */
 #define PCI_EXT_CAP_ID_DLF	0x25	/* Data Link Feature */
 #define PCI_EXT_CAP_ID_PL_16GT	0x26	/* Physical Layer 16.0 GT/s */
 #define PCI_EXT_CAP_ID_MAX	PCI_EXT_CAP_ID_PL_16GT
-- 
2.29.2




* [RFC PATCH 02/25] hw/pci/cxl: Add a CXL component type (interface)
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 01/25] Temp: Add the PCI_EXT_ID_DVSEC definition to the qemu pci_regs.h copy Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 03/25] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5) Ben Widawsky
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky, Michael S. Tsirkin

A CXL component is a hardware entity that implements CXL component
registers from the CXL 2.0 spec (8.2.3). Currently these represent 3
general types.
1. Host Bridge
2. Ports (root, upstream, downstream)
3. Devices (memory, other)

A CXL component can be conceptually thought of as a PCIe device with
extra functionality when enumerated and enabled. For this reason, CXL
support here builds on existing PCI code paths, and will continue to do so.

Host bridges will typically need to be handled specially and so they can
implement this newly introduced interface or not. All other components
should implement this interface. Implementing this interface allows the
core PCI code to treat these devices as special where appropriate.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/pci/pci.c         | 10 ++++++++++
 include/hw/pci/pci.h |  8 ++++++++
 2 files changed, 18 insertions(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 0131d9d02c..db88788c4b 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -192,6 +192,11 @@ static const TypeInfo pci_bus_info = {
     .class_init = pci_bus_class_init,
 };
 
+static const TypeInfo cxl_interface_info = {
+    .name          = INTERFACE_CXL_DEVICE,
+    .parent        = TYPE_INTERFACE,
+};
+
 static const TypeInfo pcie_interface_info = {
     .name          = INTERFACE_PCIE_DEVICE,
     .parent        = TYPE_INTERFACE,
@@ -2113,6 +2118,10 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
         pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS;
     }
 
+    if (object_class_dynamic_cast(klass, INTERFACE_CXL_DEVICE)) {
+        pci_dev->cap_present |= QEMU_PCIE_CAP_CXL;
+    }
+
     pci_dev = do_pci_register_device(pci_dev,
                                      object_get_typename(OBJECT(qdev)),
                                      pci_dev->devfn, errp);
@@ -2839,6 +2848,7 @@ static void pci_register_types(void)
     type_register_static(&pci_bus_info);
     type_register_static(&pcie_bus_info);
     type_register_static(&conventional_pci_interface_info);
+    type_register_static(&cxl_interface_info);
     type_register_static(&pcie_interface_info);
     type_register_static(&pci_device_type_info);
 }
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 72ce649eee..4e6fd59fdd 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -194,6 +194,8 @@ enum {
     QEMU_PCIE_LNKSTA_DLLLA = (1 << QEMU_PCIE_LNKSTA_DLLLA_BITNR),
 #define QEMU_PCIE_EXTCAP_INIT_BITNR 9
     QEMU_PCIE_EXTCAP_INIT = (1 << QEMU_PCIE_EXTCAP_INIT_BITNR),
+#define QEMU_PCIE_CXL_BITNR 10
+    QEMU_PCIE_CAP_CXL = (1 << QEMU_PCIE_CXL_BITNR),
 };
 
 #define TYPE_PCI_DEVICE "pci-device"
@@ -201,6 +203,12 @@ typedef struct PCIDeviceClass PCIDeviceClass;
 DECLARE_OBJ_CHECKERS(PCIDevice, PCIDeviceClass,
                      PCI_DEVICE, TYPE_PCI_DEVICE)
 
+/*
+ * Implemented by devices that can be plugged on CXL buses. In the spec, this is
+ * actually a "CXL Component", but we name it device to match the PCI naming.
+ */
+#define INTERFACE_CXL_DEVICE "cxl-device"
+
 /* Implemented by devices that can be plugged on PCI Express buses */
 #define INTERFACE_PCIE_DEVICE "pci-express-device"
 
-- 
2.29.2




* [RFC PATCH 03/25] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 01/25] Temp: Add the PCI_EXT_ID_DVSEC definition to the qemu pci_regs.h copy Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 02/25] hw/pci/cxl: Add a CXL component type (interface) Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-16 12:03   ` Jonathan Cameron
  2020-11-11  5:47 ` [RFC PATCH 04/25] hw/cxl/device: Introduce a CXL device (8.2.8) Ben Widawsky
                   ` (23 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky

A CXL 2.0 component is any entity in the CXL topology. All components
have an analogous function in PCIe. Except for the CXL host bridge, all
have a PCIe config space that is accessible via the common PCIe
mechanisms. CXL components are enumerated via DVSEC fields in the
extended PCIe header space. CXL components will minimally implement some
subset of CXL.mem and CXL.cache registers defined in 8.2.5 of the CXL
2.0 specification. Two headers and a utility library are introduced to
support the minimum functionality needed to enumerate components.

The cxl_pci header manages bits associated with PCI, specifically the
DVSEC and related fields. The cxl_component.h variant has data
structures and APIs that are useful for drivers implementing any of the
CXL 2.0 components. The library takes care of making use of the DVSEC
bits and the CXL.[mem|cache] registers.
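
For illustration, the first DVSEC header dword that the library writes can be
sketched in standalone C as follows. The bit positions (vendor ID in bits 15:0,
revision in bits 19:16, length in bits 31:20) and the two assertions are taken
from cxl_component_create_dvsec() in this patch; the dvsec_header1_*() helper
names are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define CXL_VENDOR_ID 0x1e98 /* same value as in cxl_pci.h in this patch */

/*
 * Pack DVSEC header 1: vendor ID in bits 15:0, revision in bits 19:16,
 * length in bits 31:20. Hypothetical helper, not part of the series.
 */
static inline uint32_t dvsec_header1_pack(uint16_t length, uint8_t rev)
{
    assert((length & 0xf000) == 0); /* length is a 12-bit field */
    assert((rev & 0xf0) == 0);      /* revision is a 4-bit field */
    return ((uint32_t)length << 20) | ((uint32_t)rev << 16) | CXL_VENDOR_ID;
}

/* Accessors that undo the packing above. */
static inline uint16_t dvsec_header1_length(uint32_t hdr)
{
    return (uint16_t)(hdr >> 20);
}

static inline uint8_t dvsec_header1_rev(uint32_t hdr)
{
    return (uint8_t)((hdr >> 16) & 0xf);
}

static inline uint16_t dvsec_header1_vendor(uint32_t hdr)
{
    return (uint16_t)(hdr & 0xffff);
}
```

For example, a CXL device DVSEC with length 0x38 and revision 1 would pack to
0x03811e98 under this layout.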

None of the mechanisms required to enumerate a CXL-capable host bridge
are introduced at this point.

Note that the CXL.mem and CXL.cache registers used are always 4B wide.
It's possible in the future that this constraint will not hold.
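
To make the 4B register layout concrete, here is a minimal standalone sketch of
how the capability headers are packed into 32-bit words. The field positions
follow the FIELD() definitions in cxl_component.h in this patch; the
field_dp32(), field_ex32(), cxl_cap_header() and cxl_cap_entry() helpers are
hypothetical stand-ins for QEMU's register macros and are not part of the
series.

```c
#include <stdint.h>

/* Deposit `value` into the `len`-bit field that starts at bit `shift`. */
static inline uint32_t field_dp32(uint32_t reg, unsigned shift, unsigned len,
                                  uint32_t value)
{
    uint32_t mask = (len == 32 ? 0xffffffffu : ((1u << len) - 1u)) << shift;

    return (reg & ~mask) | ((value << shift) & mask);
}

/* Extract the `len`-bit field that starts at bit `shift`. */
static inline uint32_t field_ex32(uint32_t reg, unsigned shift, unsigned len)
{
    uint32_t mask = len == 32 ? 0xffffffffu : ((1u << len) - 1u);

    return (reg >> shift) & mask;
}

/*
 * Top-level CXL capability header:
 * ID[15:0], VERSION[19:16], CACHE_MEM_VERSION[23:20], ARRAY_SIZE[31:24].
 */
static inline uint32_t cxl_cap_header(uint16_t id, uint8_t version,
                                      uint8_t cm_version, uint8_t array_size)
{
    uint32_t reg = 0;

    reg = field_dp32(reg, 0, 16, id);
    reg = field_dp32(reg, 16, 4, version);
    reg = field_dp32(reg, 20, 4, cm_version);
    reg = field_dp32(reg, 24, 8, array_size);
    return reg;
}

/* Per-capability header: ID[15:0], VERSION[19:16], PTR[31:20]. */
static inline uint32_t cxl_cap_entry(uint16_t id, uint8_t version,
                                     uint16_t ptr)
{
    uint32_t reg = 0;

    reg = field_dp32(reg, 0, 16, id);
    reg = field_dp32(reg, 16, 4, version);
    reg = field_dp32(reg, 20, 12, ptr);
    return reg;
}
```

With these layouts, a header announcing four capabilities packs to 0x04110001,
and a RAS entry (ID 2, version 1) pointing at offset 0x80 packs to 0x08010002.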

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

--
It's tempting to have a more generalized DVSEC infrastructure. As far as
I can tell, the amount this would actually save in terms of code is
minimal because most of DVSEC is vendor specific.
---
 MAINTAINERS                    |   6 ++
 hw/Kconfig                     |   1 +
 hw/cxl/Kconfig                 |   3 +
 hw/cxl/cxl-component-utils.c   | 192 +++++++++++++++++++++++++++++++++
 hw/cxl/cxl-device-utils.c      |   0
 hw/cxl/meson.build             |   3 +
 hw/meson.build                 |   1 +
 include/hw/cxl/cxl.h           |  17 +++
 include/hw/cxl/cxl_component.h | 181 +++++++++++++++++++++++++++++++
 include/hw/cxl/cxl_pci.h       | 133 +++++++++++++++++++++++
 10 files changed, 537 insertions(+)
 create mode 100644 hw/cxl/Kconfig
 create mode 100644 hw/cxl/cxl-component-utils.c
 create mode 100644 hw/cxl/cxl-device-utils.c
 create mode 100644 hw/cxl/meson.build
 create mode 100644 include/hw/cxl/cxl.h
 create mode 100644 include/hw/cxl/cxl_component.h
 create mode 100644 include/hw/cxl/cxl_pci.h

diff --git a/MAINTAINERS b/MAINTAINERS
index c1d16026ba..02b8e2274d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2184,6 +2184,12 @@ F: qapi/block*.json
 F: qapi/transaction.json
 T: git https://repo.or.cz/qemu/armbru.git block-next
 
+Compute Express Link
+M: Ben Widawsky <ben.widawsky@intel.com>
+S: Supported
+F: hw/cxl/
+F: include/hw/cxl/
+
 Dirty Bitmaps
 M: Eric Blake <eblake@redhat.com>
 M: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
diff --git a/hw/Kconfig b/hw/Kconfig
index 4de1797ffd..efed27805a 100644
--- a/hw/Kconfig
+++ b/hw/Kconfig
@@ -6,6 +6,7 @@ source audio/Kconfig
 source block/Kconfig
 source char/Kconfig
 source core/Kconfig
+source cxl/Kconfig
 source display/Kconfig
 source dma/Kconfig
 source gpio/Kconfig
diff --git a/hw/cxl/Kconfig b/hw/cxl/Kconfig
new file mode 100644
index 0000000000..8e67519b16
--- /dev/null
+++ b/hw/cxl/Kconfig
@@ -0,0 +1,3 @@
+config CXL
+    bool
+    default y if PCI_EXPRESS
diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
new file mode 100644
index 0000000000..c52bd5bfc7
--- /dev/null
+++ b/hw/cxl/cxl-component-utils.c
@@ -0,0 +1,192 @@
+/*
+ * CXL Utility library for components
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/pci/pci.h"
+#include "hw/cxl/cxl.h"
+
+static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
+                                       unsigned size)
+{
+    CXLComponentState *cxl_cstate = opaque;
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+    uint32_t *cache_mem = cregs->cache_mem_registers;
+
+    if (size != 4) {
+        qemu_log_mask(LOG_UNIMP, "%uB component register read (RAZ)\n", size);
+        return 0;
+    }
+
+    if (cregs->special_ops && cregs->special_ops->read) {
+        return cregs->special_ops->read(cxl_cstate, offset, size);
+    } else {
+        return cache_mem[offset >> 2];
+    }
+}
+
+static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
+                                    unsigned size)
+{
+    CXLComponentState *cxl_cstate = opaque;
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+
+    if (size != 4) {
+        qemu_log_mask(LOG_UNIMP, "%uB component register write (WI)\n", size);
+        return;
+    }
+
+    if (cregs->special_ops && cregs->special_ops->write) {
+        cregs->special_ops->write(cxl_cstate, offset, value, size);
+    }
+}
+
+static const MemoryRegionOps cache_mem_ops = {
+    .read = cxl_cache_mem_read_reg,
+    .write = cxl_cache_mem_write_reg,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+    },
+};
+
+void cxl_component_register_block_init(Object *obj,
+                                       CXLComponentState *cxl_cstate,
+                                       const char *type)
+{
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+
+    memory_region_init(&cregs->component_registers, obj, type, 0x10000);
+    memory_region_init_io(&cregs->io, obj, NULL, cregs, ".io", 0x1000);
+    memory_region_init_io(&cregs->cache_mem, obj, &cache_mem_ops, cregs,
+                          ".cache_mem", 0x1000);
+
+    memory_region_add_subregion(&cregs->component_registers, 0, &cregs->io);
+    memory_region_add_subregion(&cregs->component_registers, 0x1000,
+                                &cregs->cache_mem);
+}
+
+static void ras_init_common(uint32_t *reg_state)
+{
+    reg_state[R_CXL_RAS_UNC_ERR_STATUS] = 0;
+    reg_state[R_CXL_RAS_UNC_ERR_MASK] = 0x1efff;
+    reg_state[R_CXL_RAS_UNC_ERR_SEVERITY] = 0x1efff;
+    reg_state[R_CXL_RAS_COR_ERR_STATUS] = 0;
+    reg_state[R_CXL_RAS_COR_ERR_MASK] = 0x3f;
+    reg_state[R_CXL_RAS_ERR_CAP_CTRL] = 0; /* CXL switches and devices must set */
+}
+
+static void hdm_init_common(uint32_t *reg_state)
+{
+    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0);
+    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 0);
+}
+
+void cxl_component_register_init_common(uint32_t *reg_state, enum reg_type type)
+{
+    int caps = 0;
+    switch (type) {
+    case CXL2_DOWNSTREAM_PORT:
+    case CXL2_DEVICE:
+        /* CAP, RAS, Link */
+        caps = 3;
+        break;
+    case CXL2_UPSTREAM_PORT:
+    case CXL2_TYPE3_DEVICE:
+    case CXL2_LOGICAL_DEVICE:
+        /* + HDM */
+        caps = 4;
+        break;
+    case CXL2_ROOT_PORT:
+        /* + Extended Security, + Snoop */
+        caps = 6;
+        break;
+    default:
+        abort();
+    }
+
+    memset(reg_state, 0, 0x1000);
+
+    /* CXL Capability Header Register */
+    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ID, 1);
+    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, VERSION, 1);
+    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 1);
+    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
+
+
+#define init_cap_reg(reg, id, version)                                        \
+    do {                                                                      \
+        int which = R_CXL_##reg##_CAPABILITY_HEADER;                          \
+        reg_state[which] = FIELD_DP32(reg_state[which],                       \
+                                      CXL_##reg##_CAPABILITY_HEADER, ID, id); \
+        reg_state[which] =                                                    \
+            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER,       \
+                       VERSION, version);                                     \
+        reg_state[which] =                                                    \
+            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER, PTR,  \
+                       CXL_##reg##_REGISTERS_OFFSET);                         \
+    } while (0)
+
+    init_cap_reg(RAS, 2, 1);
+    ras_init_common(reg_state);
+
+    init_cap_reg(LINK, 4, 2);
+
+    if (caps < 4) {
+        return;
+    }
+
+    init_cap_reg(HDM, 5, 1);
+    hdm_init_common(reg_state);
+
+    if (caps < 6) {
+        return;
+    }
+
+    init_cap_reg(EXTSEC, 6, 1);
+    init_cap_reg(SNOOP, 8, 1);
+
+#undef init_cap_reg
+}
+
+/*
+ * Helper to create a DVSEC header for a CXL entity. The caller is responsible
+ * for tracking the valid offset.
+ *
+ * This function will build the DVSEC header on behalf of the caller and then
+ * copy in the remaining data for the vendor specific bits.
+ */
+void cxl_component_create_dvsec(CXLComponentState *cxl, uint16_t length,
+                                uint16_t type, uint8_t rev, uint8_t *body)
+{
+    PCIDevice *pdev = cxl->pdev;
+    uint16_t offset = cxl->dvsec_offset;
+
+    assert(offset >= PCI_CFG_SPACE_SIZE && offset < PCI_CFG_SPACE_EXP_SIZE);
+    assert((length & 0xf000) == 0);
+    assert((rev & 0xf0) == 0);
+
+    /* Create the DVSEC in the MCFG space */
+    pcie_add_capability(pdev, PCI_EXT_CAP_ID_DVSEC, 1, offset, length);
+    pci_set_long(pdev->config + offset + PCIE_DVSEC_HEADER_OFFSET,
+                 (length << 20) | (rev << 16) | CXL_VENDOR_ID);
+    pci_set_word(pdev->config + offset + PCIE_DVSEC_ID_OFFSET, type);
+    memcpy(pdev->config + offset + sizeof(struct dvsec_header),
+           body + sizeof(struct dvsec_header),
+           length - sizeof(struct dvsec_header));
+
+    /* Update state for future DVSEC additions */
+    range_init_nofail(&cxl->dvsecs[type], cxl->dvsec_offset, length);
+    cxl->dvsec_offset += length;
+}
diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
new file mode 100644
index 0000000000..00c3876a0f
--- /dev/null
+++ b/hw/cxl/meson.build
@@ -0,0 +1,3 @@
+softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
+  'cxl-component-utils.c',
+))
diff --git a/hw/meson.build b/hw/meson.build
index 010de7219c..3e440c341a 100644
--- a/hw/meson.build
+++ b/hw/meson.build
@@ -6,6 +6,7 @@ subdir('block')
 subdir('char')
 subdir('core')
 subdir('cpu')
+subdir('cxl')
 subdir('display')
 subdir('dma')
 subdir('gpio')
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
new file mode 100644
index 0000000000..55f6cc30a5
--- /dev/null
+++ b/include/hw/cxl/cxl.h
@@ -0,0 +1,17 @@
+/*
+ * QEMU CXL Support
+ *
+ * Copyright (c) 2020 Intel
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef CXL_H
+#define CXL_H
+
+#include "cxl_pci.h"
+#include "cxl_component.h"
+
+#endif
+
diff --git a/include/hw/cxl/cxl_component.h b/include/hw/cxl/cxl_component.h
new file mode 100644
index 0000000000..014d9d10d3
--- /dev/null
+++ b/include/hw/cxl/cxl_component.h
@@ -0,0 +1,181 @@
+/*
+ * QEMU CXL Component
+ *
+ * Copyright (c) 2020 Intel
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef CXL_COMPONENT_H
+#define CXL_COMPONENT_H
+
+/* CXL 2.0 - 8.2.4 */
+#define CXL2_COMPONENT_IO_REGION_SIZE 0x1000
+#define CXL2_COMPONENT_CM_REGION_SIZE 0x1000
+#define CXL2_COMPONENT_BLOCK_SIZE 0x10000
+
+#include "qemu/range.h"
+#include "qemu/typedefs.h"
+#include "hw/register.h"
+
+enum reg_type {
+    CXL2_DEVICE,
+    CXL2_TYPE3_DEVICE,
+    CXL2_LOGICAL_DEVICE,
+    CXL2_ROOT_PORT,
+    CXL2_UPSTREAM_PORT,
+    CXL2_DOWNSTREAM_PORT
+};
+
+/*
+ * Capability registers are defined at the top of the CXL.cache/mem region and
+ * are packed. For our purposes we will always define the caps in the same
+ * order.
+ * CXL 2.0 - 8.2.5 Table 142 for details.
+ */
+
+/* CXL 2.0 - 8.2.5.1 */
+REG32(CXL_CAPABILITY_HEADER, 0)
+    FIELD(CXL_CAPABILITY_HEADER, ID, 0, 16)
+    FIELD(CXL_CAPABILITY_HEADER, VERSION, 16, 4)
+    FIELD(CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 20, 4)
+    FIELD(CXL_CAPABILITY_HEADER, ARRAY_SIZE, 24, 8)
+
+#define CXLx_CAPABILITY_HEADER(type, offset)                  \
+    REG32(CXL_##type##_CAPABILITY_HEADER, offset)             \
+        FIELD(CXL_##type##_CAPABILITY_HEADER, ID, 0, 16)      \
+        FIELD(CXL_##type##_CAPABILITY_HEADER, VERSION, 16, 4) \
+        FIELD(CXL_##type##_CAPABILITY_HEADER, PTR, 20, 12)
+CXLx_CAPABILITY_HEADER(RAS, 0x4)
+CXLx_CAPABILITY_HEADER(LINK, 0x8)
+CXLx_CAPABILITY_HEADER(HDM, 0xc)
+CXLx_CAPABILITY_HEADER(EXTSEC, 0x10)
+CXLx_CAPABILITY_HEADER(SNOOP, 0x14)
+
+/*
+ * Capability structures contain the actual registers that the CXL component
+ * implements. Some of these are specific to certain types of components, but
+ * this implementation leaves enough space regardless.
+ */
+/* 8.2.5.9 - CXL RAS Capability Structure */
+#define CXL_RAS_REGISTERS_OFFSET 0x80 /* Give ample space for caps before this */
+#define CXL_RAS_REGISTERS_SIZE   0x58
+REG32(CXL_RAS_UNC_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET)
+REG32(CXL_RAS_UNC_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x4)
+REG32(CXL_RAS_UNC_ERR_SEVERITY, CXL_RAS_REGISTERS_OFFSET + 0x8)
+REG32(CXL_RAS_COR_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET + 0xc)
+REG32(CXL_RAS_COR_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x10)
+REG32(CXL_RAS_ERR_CAP_CTRL, CXL_RAS_REGISTERS_OFFSET + 0x14)
+
+/* 8.2.5.10 - CXL Security Capability Structure */
+#define CXL_SEC_REGISTERS_OFFSET (CXL_RAS_REGISTERS_OFFSET + CXL_RAS_REGISTERS_SIZE)
+#define CXL_SEC_REGISTERS_SIZE   0 /* We don't implement 1.1 downstream ports */
+
+/* 8.2.5.11 - CXL Link Capability Structure */
+#define CXL_LINK_REGISTERS_OFFSET (CXL_SEC_REGISTERS_OFFSET + CXL_SEC_REGISTERS_SIZE)
+#define CXL_LINK_REGISTERS_SIZE   0x38
+
+/* 8.2.5.12 - CXL HDM Decoder Capability Structure */
+#define HDM_DECODE_MAX 10 /* 8.2.5.12.1 */
+#define CXL_HDM_REGISTERS_OFFSET \
+    (CXL_LINK_REGISTERS_OFFSET + CXL_LINK_REGISTERS_SIZE) /* 8.2.5.12 */
+#define CXL_HDM_REGISTERS_SIZE (0x20 + HDM_DECODE_MAX * 10)
+#define HDM_DECODER_INIT(n, base)                          \
+    REG32(CXL_HDM_DECODER##n##_BASE_LO, base + 0x10)       \
+        FIELD(CXL_HDM_DECODER##n##_BASE_LO, L, 28, 4)      \
+    REG32(CXL_HDM_DECODER##n##_BASE_HI, base + 0x14)       \
+        FIELD(CXL_HDM_DECODER##n##_BASE_HI, H, 0, 32)      \
+    REG32(CXL_HDM_DECODER##n##_SIZE_LO, base + 0x18)       \
+    REG32(CXL_HDM_DECODER##n##_SIZE_HI, base + 0x1C)       \
+    REG32(CXL_HDM_DECODER##n##_CTRL, base + 0x20)          \
+        FIELD(CXL_HDM_DECODER##n##_CTRL, IG, 0, 4)         \
+        FIELD(CXL_HDM_DECODER##n##_CTRL, IW, 4, 4)         \
+        FIELD(CXL_HDM_DECODER##n##_CTRL, LOCK, 8, 1)       \
+        FIELD(CXL_HDM_DECODER##n##_CTRL, COMMIT, 9, 1)     \
+        FIELD(CXL_HDM_DECODER##n##_CTRL, COMMITTED, 10, 1) \
+        FIELD(CXL_HDM_DECODER##n##_CTRL, ERROR, 11, 1)     \
+        FIELD(CXL_HDM_DECODER##n##_CTRL, TYPE, 12, 1)      \
+    REG32(CXL_HDM_DECODER##n##_TARGET_LIST_LO, base + 0x24) \
+    REG32(CXL_HDM_DECODER##n##_TARGET_LIST_HI, base + 0x28)
+REG32(CXL_HDM_DECODER_CAPABILITY, CXL_HDM_REGISTERS_OFFSET)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0, 4)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, TARGET_COUNT, 4, 4)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, INTERLEAVE_256B, 8, 1)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, INTELEAVE_4K, 9, 1)
+    FIELD(CXL_HDM_DECODER_CAPABILITY, POISON_ON_ERR_CAP, 10, 1)
+REG32(CXL_HDM_DECODER_GLOBAL_CONTROL, CXL_HDM_REGISTERS_OFFSET + 0x4)
+    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, POISON_ON_ERR_EN, 0, 1)
+    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 1, 1)
+
+HDM_DECODER_INIT(0, CXL_HDM_REGISTERS_OFFSET);
+
+/* 8.2.5.13 - CXL Extended Security Capability Structure (Root complex only) */
+#define EXTSEC_ENTRY_MAX        256
+#define CXL_EXTSEC_REGISTERS_OFFSET (CXL_HDM_REGISTERS_OFFSET + CXL_HDM_REGISTERS_SIZE)
+#define CXL_EXTSEC_REGISTERS_SIZE   (8 * EXTSEC_ENTRY_MAX + 4)
+
+/* 8.2.5.14 - CXL IDE Capability Structure */
+#define CXL_IDE_REGISTERS_OFFSET (CXL_EXTSEC_REGISTERS_OFFSET + CXL_EXTSEC_REGISTERS_SIZE)
+#define CXL_IDE_REGISTERS_SIZE   0
+
+/* 8.2.5.15 - CXL Snoop Filter Capability Structure */
+#define CXL_SNOOP_REGISTERS_OFFSET (CXL_IDE_REGISTERS_OFFSET + CXL_IDE_REGISTERS_SIZE)
+#define CXL_SNOOP_REGISTERS_SIZE   0x8
+
+_Static_assert((CXL_SNOOP_REGISTERS_OFFSET + CXL_SNOOP_REGISTERS_SIZE) < 0x1000,
+               "No space for registers");
+
+typedef struct component_registers {
+    /*
+     * Main memory region to be registered with QEMU core.
+     */
+    MemoryRegion component_registers;
+
+    /*
+     * 8.2.4 Table 141:
+     *   0x0000 - 0x0fff CXL.io registers
+     *   0x1000 - 0x1fff CXL.cache and CXL.mem
+     *   0x2000 - 0xdfff Implementation specific
+     *   0xe000 - 0xe3ff CXL ARB/MUX registers
+     *   0xe400 - 0xffff RSVD
+     */
+    uint32_t io_registers[CXL2_COMPONENT_IO_REGION_SIZE >> 2];
+    MemoryRegion io;
+
+    uint32_t cache_mem_registers[CXL2_COMPONENT_CM_REGION_SIZE >> 2];
+    MemoryRegion cache_mem;
+
+    MemoryRegion impl_specific;
+    MemoryRegion arb_mux;
+    MemoryRegion rsvd;
+
+    /* special_ops is used for any component that needs any specific handling */
+    MemoryRegionOps *special_ops;
+} ComponentRegisters;
+
+/*
+ * A CXL component represents all entities in a CXL hierarchy. This includes
+ * host bridges, root ports, upstream/downstream ports, and devices.
+ */
+typedef struct cxl_component {
+    ComponentRegisters crb;
+    union {
+        struct {
+            Range dvsecs[CXL20_MAX_DVSEC];
+            uint16_t dvsec_offset;
+            struct PCIDevice *pdev;
+        };
+    };
+} CXLComponentState;
+
+void cxl_component_register_block_init(Object *obj,
+                                       CXLComponentState *cxl_cstate,
+                                       const char *type);
+void cxl_component_register_init_common(uint32_t *reg_state,
+                                        enum reg_type type);
+
+void cxl_component_create_dvsec(CXLComponentState *cxl_cstate, uint16_t length,
+                                uint16_t type, uint8_t rev, uint8_t *body);
+
+#endif
diff --git a/include/hw/cxl/cxl_pci.h b/include/hw/cxl/cxl_pci.h
new file mode 100644
index 0000000000..b403770424
--- /dev/null
+++ b/include/hw/cxl/cxl_pci.h
@@ -0,0 +1,133 @@
+/*
+ * QEMU CXL PCI interfaces
+ *
+ * Copyright (c) 2020 Intel
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef CXL_PCI_H
+#define CXL_PCI_H
+
+#include "hw/pci/pci.h"
+#include "hw/pci/pcie.h"
+
+#define CXL_VENDOR_ID 0x1e98
+
+#define PCIE_DVSEC_HEADER_OFFSET 0x4 /* Offset from start of extended cap */
+#define PCIE_DVSEC_ID_OFFSET     0x8
+
+#define PCIE_CXL_DEVICE_DVSEC_LENGTH 0x38
+#define PCIE_CXL_DEVICE_DVSEC_REVID  1
+
+#define EXTENSIONS_PORT_DVSEC_LENGTH 0x28
+#define EXTENSIONS_PORT_DVSEC_REVID  1
+
+#define GPF_PORT_DVSEC_LENGTH 0x10
+#define GPF_PORT_DVSEC_REVID  0
+
+#define PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0 0x14
+#define PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0  1
+
+#define REG_LOC_DVSEC_LENGTH 0x24
+#define REG_LOC_DVSEC_REVID  0
+
+enum {
+    PCIE_CXL_DEVICE_DVSEC      = 0,
+    NON_CXL_FUNCTION_MAP_DVSEC = 2,
+    EXTENSIONS_PORT_DVSEC      = 3,
+    GPF_PORT_DVSEC             = 4,
+    GPF_DEVICE_DVSEC           = 5,
+    PCIE_FLEXBUS_PORT_DVSEC    = 7,
+    REG_LOC_DVSEC              = 8,
+    MLD_DVSEC                  = 9,
+    CXL20_MAX_DVSEC
+};
+
+struct dvsec_header {
+    uint32_t cap_hdr;
+    uint32_t dv_hdr1;
+    uint16_t dv_hdr2;
+} __attribute__((__packed__));
+_Static_assert(sizeof(struct dvsec_header) == 10,
+               "dvsec header size incorrect");
+
+/*
+ * CXL 2.0 devices must implement certain DVSEC IDs, and can [optionally]
+ * implement others.
+ *
+ * CXL 2.0 Device: 0, [2], 5, 8
+ * CXL 2.0 RP: 3, 4, 7, 8
+ * CXL 2.0 Upstream Port: [2], 7, 8
+ * CXL 2.0 Downstream Port: 3, 4, 7, 8
+ */
+
+/* CXL 2.0 - 8.1.5 (ID 0003) */
+struct dvsec_port {
+    struct dvsec_header hdr;
+    uint16_t status;
+    uint16_t control;
+    uint8_t alt_bus_base;
+    uint8_t alt_bus_limit;
+    uint16_t alt_memory_base;
+    uint16_t alt_memory_limit;
+    uint16_t alt_prefetch_base;
+    uint16_t alt_prefetch_limit;
+    uint32_t alt_prefetch_base_high;
+    uint32_t alt_prefetch_base_low;
+    uint32_t rcrb_base;
+    uint32_t rcrb_base_high;
+};
+_Static_assert(sizeof(struct dvsec_port) == 0x28, "dvsec port size incorrect");
+#define PORT_CONTROL_OVERRIDE_OFFSET 0xc
+#define PORT_CONTROL_UNMASK_SBR      1
+#define PORT_CONTROL_ALT_MEMID_EN    4
+
+/* CXL 2.0 - 8.1.6 GPF DVSEC (ID 0004) */
+struct dvsec_port_gpf {
+    struct dvsec_header hdr;
+    uint16_t rsvd;
+    uint16_t phase1_ctrl;
+    uint16_t phase2_ctrl;
+};
+_Static_assert(sizeof(struct dvsec_port_gpf) == 0x10,
+               "dvsec port GPF size incorrect");
+
+/* CXL 2.0 - 8.1.8/8.2.1.3 Flexbus DVSEC (ID 0007) */
+struct dvsec_port_flexbus {
+    struct dvsec_header hdr;
+    uint16_t cap;
+    uint16_t ctrl;
+    uint16_t status;
+    uint32_t rcvd_mod_ts_data;
+};
+_Static_assert(sizeof(struct dvsec_port_flexbus) == 0x14,
+               "dvsec port flexbus size incorrect");
+
+/* CXL 2.0 - 8.1.9 Register Locator DVSEC (ID 0008) */
+struct dvsec_register_locator {
+    struct dvsec_header hdr;
+    uint16_t rsvd;
+    uint32_t reg0_base_lo;
+    uint32_t reg0_base_hi;
+    uint32_t reg1_base_lo;
+    uint32_t reg1_base_hi;
+    uint32_t reg2_base_lo;
+    uint32_t reg2_base_hi;
+};
+_Static_assert(sizeof(struct dvsec_register_locator) == 0x24,
+               "dvsec register locator size incorrect");
+#define BEI_BAR_10H 0
+#define BEI_BAR_14H 1
+#define BEI_BAR_18H 2
+#define BEI_BAR_1cH 3
+#define BEI_BAR_20H 4
+#define BEI_BAR_24H 5
+
+#define RBI_EMPTY          0
+#define RBI_COMPONENT_REG  (1 << 8)
+#define RBI_BAR_VIRT_ACL   (2 << 8)
+#define RBI_CXL_DEVICE_REG (3 << 8)
+
+#endif
-- 
2.29.2




* [RFC PATCH 04/25] hw/cxl/device: Introduce a CXL device (8.2.8)
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (2 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 03/25] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5) Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-16 13:07   ` Jonathan Cameron
  2020-11-11  5:47 ` [RFC PATCH 05/25] hw/cxl/device: Implement the CAP array (8.2.8.1-2) Ben Widawsky
                   ` (22 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky

A CXL device is a type of CXL component. Conceptually, a CXL device
would be a leaf node in a CXL topology. From an emulation perspective,
CXL devices are the most complex and so the actual implementation is
reserved for discrete commits.

This new device type is geared specifically towards the eventual
implementation of a Type3 CXL.mem device, 8.2.8.5 in the CXL 2.0
specification.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
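Not part of the commit: a minimal standalone sketch of the offset
arithmetic behind the MMIO layout comment in cxl_device.h. The macro
values are copied from this patch. pow2ceil_u64() is a hypothetical
stand-in for QEMU's pow2ceil(), and payload_offset() and mmio_end() are
names made up for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from include/hw/cxl/cxl_device.h in this patch. */
#define CXL_DEVICE_REGISTERS_OFFSET  0x80
#define CXL_DEVICE_REGISTERS_LENGTH  0x8
#define CXL_MAILBOX_REGISTERS_OFFSET \
    (CXL_DEVICE_REGISTERS_OFFSET + CXL_DEVICE_REGISTERS_LENGTH)
#define CXL_MAILBOX_REGISTERS_SIZE   0x20
#define CXL_MAILBOX_PAYLOAD_SHIFT    11
#define CXL_MAILBOX_MAX_PAYLOAD_SIZE (1 << CXL_MAILBOX_PAYLOAD_SHIFT)
#define CXL_MAILBOX_REGISTERS_LENGTH \
    (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)

/* Hypothetical stand-in for QEMU's pow2ceil(): round up to a power of two. */
static uint64_t pow2ceil_u64(uint64_t v)
{
    uint64_t p = 1;

    while (p < v) {
        p <<= 1;
    }
    return p;
}

/* Start of the command payload ("n" in the diagram). */
static uint64_t payload_offset(void)
{
    return CXL_MAILBOX_REGISTERS_OFFSET + CXL_MAILBOX_REGISTERS_SIZE;
}

/* End of the used MMIO space ("n + PAYLOAD_SIZE_MAX" in the diagram). */
static uint64_t mmio_end(void)
{
    return CXL_MAILBOX_REGISTERS_OFFSET + CXL_MAILBOX_REGISTERS_LENGTH;
}
```

With these values the payload starts at 0xa8 and the used space ends at
0x8a8, so a BAR rounded up to a power of two would be 0x1000 bytes.
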
 include/hw/cxl/cxl.h        |   1 +
 include/hw/cxl/cxl_device.h | 193 ++++++++++++++++++++++++++++++++++++
 2 files changed, 194 insertions(+)
 create mode 100644 include/hw/cxl/cxl_device.h

diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 55f6cc30a5..23f52c4cf9 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -12,6 +12,7 @@
 
 #include "cxl_pci.h"
 #include "cxl_component.h"
+#include "cxl_device.h"
 
 #endif
 
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
new file mode 100644
index 0000000000..491eca6e05
--- /dev/null
+++ b/include/hw/cxl/cxl_device.h
@@ -0,0 +1,193 @@
+/*
+ * QEMU CXL Devices
+ *
+ * Copyright (c) 2020 Intel
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#ifndef CXL_DEVICE_H
+#define CXL_DEVICE_H
+
+#include "hw/register.h"
+
+/*
+ * The following is how a CXL device's MMIO space is laid out. The only
+ * requirement from the spec is that the capabilities array and the capability
+ * headers start at offset 0 and are contiguously packed. The headers themselves
+ * provide offsets to the register fields. For this emulation, registers will
+ * start at offset 0x80 (m == 0x80). No secondary mailbox is implemented which
+ * means that n = m + sizeof(mailbox registers) + sizeof(device registers).
+ *
+ * This is roughly described in 8.2.8 Figure 138 of the CXL 2.0 spec.
+ *
+ * n + PAYLOAD_SIZE_MAX  +---------------------------------+
+ *                       |                                 |
+ *                  ^    |                                 |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  |    |         Command Payload         |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  n    +---------------------------------+
+ *                  ^    |                                 |
+ *                  |    |    Device Capability Registers  |
+ *                  |    |    x, mailbox, y                |
+ *                  |    |                                 |
+ *                  m    +---------------------------------+
+ *                  ^    |     Device Capability Header y  |
+ *                  |    +---------------------------------+
+ *                  |    | Device Capability Header Mailbox|
+ *                  |    +---------------------------------+
+ *                  |    |     Device Capability Header x  |
+ *                  |    +---------------------------------+
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  |    |      Device Cap Array[0..n]     |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  |    |                                 |
+ *                  0    +---------------------------------+
+ */
+
+#define CXL_DEVICE_CAP_HDR1_OFFSET 0x10 /* Figure 138 */
+#define CXL_DEVICE_CAP_REG_SIZE 0x10 /* 8.2.8.2 */
+
+#define CXL_DEVICE_REGISTERS_OFFSET 0x80 /* Read comment above */
+#define CXL_DEVICE_REGISTERS_LENGTH 0x8 /* 8.2.8.3.1 */
+
+#define CXL_MAILBOX_REGISTERS_OFFSET \
+    (CXL_DEVICE_REGISTERS_OFFSET + CXL_DEVICE_REGISTERS_LENGTH)
+#define CXL_MAILBOX_REGISTERS_SIZE 0x20
+#define CXL_MAILBOX_PAYLOAD_SHIFT 11
+#define CXL_MAILBOX_MAX_PAYLOAD_SIZE (1 << CXL_MAILBOX_PAYLOAD_SHIFT)
+#define CXL_MAILBOX_REGISTERS_LENGTH \
+    (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
+
+typedef struct cxl_device_state {
+    /* Boss container and caps registers */
+    MemoryRegion device_registers;
+
+    MemoryRegion caps;
+    MemoryRegion device;
+    MemoryRegion mailbox;
+
+    MemoryRegion *pmem;
+    MemoryRegion *vmem;
+
+    bool active;
+    uint16_t command;
+    uint16_t payload_size;
+    union {
+        uint8_t caps_reg_state[CXL_DEVICE_CAP_REG_SIZE * 4]; /* ARRAY + 3 CAPS */
+        uint32_t caps_reg_state32[0];
+    };
+} CXLDeviceState;
+
+/* Initialize the register block for a device */
+void cxl_device_register_block_init(Object *obj, CXLDeviceState *dev);
+
+/* Set up default values for the register block */
+void cxl_device_register_init_common(CXLDeviceState *dev);
+
+/* CXL 2.0 - 8.2.8.1 */
+REG32(CXL_DEV_CAP_ARRAY, 0) /* 48b!?!?! */
+    FIELD(CXL_DEV_CAP_ARRAY, CAP_ID, 0, 16)
+    FIELD(CXL_DEV_CAP_ARRAY, CAP_VERSION, 16, 8)
+REG32(CXL_DEV_CAP_ARRAY2, 4) /* We're going to pretend it's 64b */
+    FIELD(CXL_DEV_CAP_ARRAY2, CAP_COUNT, 0, 16)
+
+/*
+ * In 8.2.8.2, this is listed as a 128b register, but 8.2.8 says:
+ * > No registers defined in Section 8.2.8 are larger than 64-bits wide so that
+ * > is the maximum access size allowed for these registers. If this rule is not
+ * > followed, the behavior is undefined
+ *
+ * Here we've chosen to make it 4 dwords. The spec allows any pow2 multiple
+ * access to be used for a register (2 qwords, 8 words, 16 bytes).
+ *
+ * XXX: This is supposedly fixed for the release version of the spec. If this
+ * comment is still here, I've failed.
+ */
+#define CXL_DEVICE_CAPABILITY_HEADER_REGISTER(n, offset)                            \
+    REG32(CXL_DEV_##n##_CAP_HDR0, offset)                 \
+        FIELD(CXL_DEV_##n##_CAP_HDR0, CAP_ID, 0, 16)      \
+        FIELD(CXL_DEV_##n##_CAP_HDR0, CAP_VERSION, 16, 8) \
+    REG32(CXL_DEV_##n##_CAP_HDR1, offset + 4)             \
+        FIELD(CXL_DEV_##n##_CAP_HDR1, CAP_OFFSET, 0, 32)  \
+    REG32(CXL_DEV_##n##_CAP_HDR2, offset + 8)             \
+        FIELD(CXL_DEV_##n##_CAP_HDR2, CAP_LENGTH, 0, 32)
+
+CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
+CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
+                                               CXL_DEVICE_CAP_REG_SIZE)
+
+REG32(CXL_DEV_MAILBOX_CAP, 0)
+    FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5)
+    FIELD(CXL_DEV_MAILBOX_CAP, INT_CAP, 5, 1)
+    FIELD(CXL_DEV_MAILBOX_CAP, BG_INT_CAP, 6, 1)
+    FIELD(CXL_DEV_MAILBOX_CAP, MSI_N, 7, 4)
+
+REG32(CXL_DEV_MAILBOX_CTRL, 4)
+    FIELD(CXL_DEV_MAILBOX_CTRL, DOORBELL, 0, 1)
+    FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 1)
+    FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
+
+enum {
+    CXL_CMD_EVENTS              = 0x1,
+    CXL_CMD_IDENTIFY            = 0x40,
+};
+
+REG32(CXL_DEV_MAILBOX_CMD, 8)
+    FIELD(CXL_DEV_MAILBOX_CMD, OP, 0, 16)
+    FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
+
+/* 8.2.8.4.5.1 Command Return Codes */
+enum {
+    RET_SUCCESS                 = 0x0,
+    RET_BG_STARTED              = 0x1, /* Background Command Started */
+    RET_EINVAL                  = 0x2, /* Invalid Input */
+    RET_ENOTSUP                 = 0x3, /* Unsupported */
+    RET_ENODEV                  = 0x4, /* Internal Error */
+    RET_ERESTART                = 0x5, /* Retry Required */
+    RET_EBUSY                   = 0x6, /* Busy */
+    RET_MEDIA_DISABLED          = 0x7, /* Media Disabled */
+    RET_FW_EBUSY                = 0x8, /* FW Transfer in Progress */
+    RET_FW_OOO                  = 0x9, /* FW Transfer Out of Order */
+    RET_FW_AUTH                 = 0xa, /* FW Authentication Failed */
+    RET_FW_EBADSLT              = 0xb, /* Invalid Slot */
+    RET_FW_ROLLBACK             = 0xc, /* Activation Failed, FW Rolled Back */
+    RET_FW_REBOOT               = 0xd, /* Activation Failed, Cold Reset Required */
+    RET_ENOENT                  = 0xe, /* Invalid Handle */
+    RET_EFAULT                  = 0xf, /* Invalid Physical Address */
+    RET_POISON_E2BIG            = 0x10, /* Inject Poison Limit Reached */
+    RET_EIO                     = 0x11, /* Permanent Media Failure */
+    RET_ECANCELED               = 0x12, /* Aborted */
+    RET_EACCESS                 = 0x13, /* Invalid Security State */
+    RET_EPERM                   = 0x14, /* Incorrect Passphrase */
+    RET_EPROTONOSUPPORT         = 0x15, /* Unsupported Mailbox */
+    RET_EMSGSIZE                = 0x16, /* Invalid Payload Length */
+    RET_MAX                     = 0x17
+};
+
+/* XXX: actually a 64b register */
+REG32(CXL_DEV_MAILBOX_STS, 0x10)
+    FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)
+    FIELD(CXL_DEV_MAILBOX_STS, ERRNO, 32, 16)
+    FIELD(CXL_DEV_MAILBOX_STS, VENDOR_ERRNO, 48, 16)
+
+/* XXX: actually a 64b register */
+REG32(CXL_DEV_BG_CMD_STS, 0x18)
+    FIELD(CXL_DEV_BG_CMD_STS, BG, 0, 16)
+    FIELD(CXL_DEV_BG_CMD_STS, DONE, 16, 7)
+    FIELD(CXL_DEV_BG_CMD_STS, ERRNO, 32, 16)
+    FIELD(CXL_DEV_BG_CMD_STS, VENDOR_ERRNO, 48, 16)
+
+REG32(CXL_DEV_CMD_PAYLOAD, 0x20)
+
+#endif
-- 
2.29.2




* [RFC PATCH 05/25] hw/cxl/device: Implement the CAP array (8.2.8.1-2)
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (3 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 04/25] hw/cxl/device: Introduce a CXL device (8.2.8) Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-16 13:11   ` Jonathan Cameron
  2020-11-11  5:47 ` [RFC PATCH 06/25] hw/cxl/device: Add device status (8.2.8.3) Ben Widawsky
                   ` (21 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky

This implements all device MMIO up to the first capability. That
includes the CXL Device Capabilities Array Register, as well as all of
the CXL Device Capability Header Registers. The latter are filled in as
they are implemented in the following patches.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
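Not part of the commit: a rough standalone sketch of what the read path
returns for the capabilities array register. The field positions follow
the REG32/FIELD definitions from patch 04 (CAP_ID in bits 15:0 and
CAP_VERSION in bits 23:16 of the first dword, CAP_COUNT in bits 15:0 of
the second). store_le32(), cap_array_init() and reg_read() are
hypothetical helpers written for this example; they are not QEMU
functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value little-endian at state[offset]. */
static void store_le32(uint8_t *state, uint64_t offset, uint32_t val)
{
    for (unsigned i = 0; i < 4; i++) {
        state[offset + i] = (uint8_t)(val >> (8 * i));
    }
}

/* Fill the two dwords of the capabilities array register. */
static void cap_array_init(uint8_t *state, uint16_t cap_count)
{
    uint32_t dw0 = 0;

    dw0 |= 0u << 0;   /* CAP_ID, bits 15:0 */
    dw0 |= 1u << 16;  /* CAP_VERSION, bits 23:16 */
    store_le32(state, 0, dw0);
    store_le32(state, 4, cap_count); /* CAP_COUNT, bits 15:0 of dword 1 */
}

/* Little-endian read of 4 or 8 bytes; a misaligned read returns 0. */
static uint64_t reg_read(const uint8_t *state, uint64_t offset, unsigned size)
{
    uint64_t v = 0;

    if (offset & (size - 1)) {
        return 0;
    }
    for (unsigned i = 0; i < size; i++) {
        v |= (uint64_t)state[offset + i] << (8 * i);
    }
    return v;
}
```

The byte buffer keeps the example independent of host endianness, which
is the same reason the patch reads through ldn_le_p().
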
 hw/cxl/cxl-device-utils.c | 73 +++++++++++++++++++++++++++++++++++++++
 hw/cxl/meson.build        |  1 +
 2 files changed, 74 insertions(+)

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index e69de29bb2..a391bb15c6 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -0,0 +1,73 @@
+/*
+ * CXL Utility library for devices
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/cxl/cxl.h"
+
+static uint64_t caps_reg_read(void *opaque, hwaddr offset, unsigned size)
+{
+    CXLDeviceState *cxl_dstate = opaque;
+
+    switch (size) {
+    case 4:
+        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
+            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+            return 0;
+        }
+        break;
+    case 8:
+        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
+            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+            return 0;
+        }
+        break;
+    }
+
+    return ldn_le_p(cxl_dstate->caps_reg_state + offset, size);
+}
+
+static const MemoryRegionOps caps_ops = {
+    .read = caps_reg_read,
+    .write = NULL,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+};
+
+void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
+{
+    /* This will be a BAR, so needs to be rounded up to pow2 for PCI spec */
+    memory_region_init(
+        &cxl_dstate->device_registers, obj, "device-registers",
+        pow2ceil(CXL_MAILBOX_REGISTERS_LENGTH + CXL_MAILBOX_REGISTERS_OFFSET));
+
+    memory_region_init_io(&cxl_dstate->caps, obj, &caps_ops, cxl_dstate,
+                          "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
+
+    memory_region_add_subregion(&cxl_dstate->device_registers, 0,
+                                &cxl_dstate->caps);
+}
+
+void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
+{
+    uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
+    const int cap_count = 0;
+
+    /* CXL Device Capabilities Array Register */
+    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
+    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_VERSION, 1);
+    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY2, CAP_COUNT, cap_count);
+}
diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
index 00c3876a0f..47154d6850 100644
--- a/hw/cxl/meson.build
+++ b/hw/cxl/meson.build
@@ -1,3 +1,4 @@
 softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
   'cxl-component-utils.c',
+  'cxl-device-utils.c',
 ))
-- 
2.29.2




* [RFC PATCH 06/25] hw/cxl/device: Add device status (8.2.8.3)
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (4 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 05/25] hw/cxl/device: Implement the CAP array (8.2.8.1-2) Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-16 13:16   ` Jonathan Cameron
  2020-11-11  5:47 ` [RFC PATCH 07/25] hw/cxl/device: Implement basic mailbox (8.2.8.4) Ben Widawsky
                   ` (20 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky

This implements the CXL device status registers from 8.2.8.3.1 in the
CXL 2.0 specification. It is capability ID 0001h.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
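Not part of the commit: a simplified function form of what the
cxl_device_cap_init() macro below does for one capability header. The
two constants mirror CXL_DEVICE_CAP_HDR1_OFFSET and
CXL_DEVICE_CAP_REG_SIZE from patch 04. cap_hdr_fill() and its slot
parameter are hypothetical, and unlike the macro (which uses
FIELD_DP32) this sketch overwrites each dword instead of preserving
unrelated bits.

```c
#include <assert.h>
#include <stdint.h>

#define CAP_HDR1_OFFSET 0x10 /* mirrors CXL_DEVICE_CAP_HDR1_OFFSET */
#define CAP_REG_SIZE    0x10 /* mirrors CXL_DEVICE_CAP_REG_SIZE */

/*
 * Hypothetical function form of cxl_device_cap_init(). Slot 0 is the first
 * header after the capabilities array register.
 */
static void cap_hdr_fill(uint32_t *caps32, unsigned slot, uint16_t cap_id,
                         uint32_t reg_offset, uint32_t reg_length)
{
    unsigned which = (CAP_HDR1_OFFSET + slot * CAP_REG_SIZE) / 4;

    caps32[which]     = (uint32_t)cap_id | (1u << 16); /* CAP_ID, CAP_VERSION=1 */
    caps32[which + 1] = reg_offset;                    /* CAP_OFFSET */
    caps32[which + 2] = reg_length;                    /* CAP_LENGTH */
}
```

For the device status capability this would be called with cap_id 1,
offset 0x80 and length 0x8, filling dwords 4 through 6 of the caps
register state.
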
 hw/cxl/cxl-device-utils.c   | 45 +++++++++++++++++++++++++++++++++-
 include/hw/cxl/cxl_device.h | 49 ++++++++++++-------------------------
 2 files changed, 60 insertions(+), 34 deletions(-)

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index a391bb15c6..78144e103c 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -33,6 +33,42 @@ static uint64_t caps_reg_read(void *opaque, hwaddr offset, unsigned size)
     return ldn_le_p(cxl_dstate->caps_reg_state + offset, size);
 }
 
+static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
+{
+    uint64_t retval = 0;
+
+    switch (size) {
+    case 4:
+        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
+            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+            return 0;
+        }
+        break;
+    case 8:
+        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
+            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+            return 0;
+        }
+        break;
+    }
+
+    return ldn_le_p(&retval, size);
+}
+
+static const MemoryRegionOps dev_ops = {
+    .read = dev_reg_read,
+    .write = NULL,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+};
+
 static const MemoryRegionOps caps_ops = {
     .read = caps_reg_read,
     .write = NULL,
@@ -56,18 +92,25 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
 
     memory_region_init_io(&cxl_dstate->caps, obj, &caps_ops, cxl_dstate,
                           "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
+    memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
+                          "device-status", CXL_DEVICE_REGISTERS_LENGTH);
 
     memory_region_add_subregion(&cxl_dstate->device_registers, 0,
                                 &cxl_dstate->caps);
+    memory_region_add_subregion(&cxl_dstate->device_registers,
+                                CXL_DEVICE_REGISTERS_OFFSET,
+                                &cxl_dstate->device);
 }
 
 void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
 {
     uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
-    const int cap_count = 0;
+    const int cap_count = 1;
 
     /* CXL Device Capabilities Array Register */
     ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
     ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_VERSION, 1);
     ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY2, CAP_COUNT, cap_count);
+
+    cxl_device_cap_init(cxl_dstate, DEVICE, 1);
 }
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 491eca6e05..2c674fdc9c 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -127,6 +127,22 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
 CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
                                                CXL_DEVICE_CAP_REG_SIZE)
 
+#define cxl_device_cap_init(dstate, reg, cap_id)                                   \
+    do {                                                                           \
+        uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
+        int which = R_CXL_DEV_##reg##_CAP_HDR0;                                    \
+        cap_hdrs[which] =                                                          \
+            FIELD_DP32(cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_ID, cap_id); \
+        cap_hdrs[which] = FIELD_DP32(                                              \
+            cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_VERSION, 1);            \
+        cap_hdrs[which + 1] =                                                      \
+            FIELD_DP32(cap_hdrs[which + 1], CXL_DEV_##reg##_CAP_HDR1,              \
+                       CAP_OFFSET, CXL_##reg##_REGISTERS_OFFSET);                  \
+        cap_hdrs[which + 2] =                                                      \
+            FIELD_DP32(cap_hdrs[which + 2], CXL_DEV_##reg##_CAP_HDR2,              \
+                       CAP_LENGTH, CXL_##reg##_REGISTERS_LENGTH);                  \
+    } while (0)
+
 REG32(CXL_DEV_MAILBOX_CAP, 0)
     FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5)
     FIELD(CXL_DEV_MAILBOX_CAP, INT_CAP, 5, 1)
@@ -138,43 +154,10 @@ REG32(CXL_DEV_MAILBOX_CTRL, 4)
     FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 1)
     FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
 
-enum {
-    CXL_CMD_EVENTS              = 0x1,
-    CXL_CMD_IDENTIFY            = 0x40,
-};
-
 REG32(CXL_DEV_MAILBOX_CMD, 8)
     FIELD(CXL_DEV_MAILBOX_CMD, OP, 0, 16)
     FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
 
-/* 8.2.8.4.5.1 Command Return Codes */
-enum {
-    RET_SUCCESS                 = 0x0,
-    RET_BG_STARTED              = 0x1, /* Background Command Started */
-    RET_EINVAL                  = 0x2, /* Invalid Input */
-    RET_ENOTSUP                 = 0x3, /* Unsupported */
-    RET_ENODEV                  = 0x4, /* Internal Error */
-    RET_ERESTART                = 0x5, /* Retry Required */
-    RET_EBUSY                   = 0x6, /* Busy */
-    RET_MEDIA_DISABLED          = 0x7, /* Media Disabled */
-    RET_FW_EBUSY                = 0x8, /* FW Transfer in Progress */
-    RET_FW_OOO                  = 0x9, /* FW Transfer Out of Order */
-    RET_FW_AUTH                 = 0xa, /* FW Authentication Failed */
-    RET_FW_EBADSLT              = 0xb, /* Invalid Slot */
-    RET_FW_ROLLBACK             = 0xc, /* Activation Failed, FW Rolled Back */
-    RET_FW_REBOOT               = 0xd, /* Activation Failed, Cold Reset Required */
-    RET_ENOENT                  = 0xe, /* Invalid Handle */
-    RET_EFAULT                  = 0xf, /* Invalid Physical Address */
-    RET_POISON_E2BIG            = 0x10, /* Inject Poison Limit Reached */
-    RET_EIO                     = 0x11, /* Permanent Media Failure */
-    RET_ECANCELED               = 0x12, /* Aborted */
-    RET_EACCESS                 = 0x13, /* Invalid Security State */
-    RET_EPERM                   = 0x14, /* Incorrect Passphrase */
-    RET_EPROTONOSUPPORT         = 0x15, /* Unsupported Mailbox */
-    RET_EMSGSIZE                = 0x16, /* Invalid Payload Length */
-    RET_MAX                     = 0x17
-};
-
 /* XXX: actually a 64b register */
 REG32(CXL_DEV_MAILBOX_STS, 0x10)
     FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)
-- 
2.29.2




* [RFC PATCH 07/25] hw/cxl/device: Implement basic mailbox (8.2.8.4)
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (5 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 06/25] hw/cxl/device: Add device status (8.2.8.3) Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-16 13:46   ` Jonathan Cameron
  2020-11-11  5:47 ` [RFC PATCH 08/25] hw/cxl/device: Add memory devices (8.2.8.5) Ben Widawsky
                   ` (19 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky

This is the beginning of implementing mailbox support for CXL 2.0
devices.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
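Not part of the commit: a toy model of the doorbell flow that
mailbox_reg_write() implements, where every register write is followed
by a check of the DOORBELL bit. All names here (toy_mailbox,
toy_process, toy_write_cmd, toy_write_ctrl) are hypothetical. That the
device clears the doorbell once the command has been handled is my
reading of the spec and an assumption about process_mailbox(), so treat
this as a sketch rather than a description of the final code.

```c
#include <assert.h>
#include <stdint.h>

#define MBOX_CTRL_DOORBELL (1u << 0) /* CXL_DEV_MAILBOX_CTRL.DOORBELL */

struct toy_mailbox {
    uint32_t ctrl;        /* control register, offset 4 in the patch */
    uint64_t cmd;         /* command register, offset 8, opcode in bits 15:0 */
    unsigned processed;   /* number of commands the toy device has handled */
    uint16_t last_opcode; /* opcode of the most recent command */
};

/* Stand-in for process_mailbox(): handle the command, clear the doorbell. */
static void toy_process(struct toy_mailbox *mb)
{
    mb->last_opcode = (uint16_t)(mb->cmd & 0xffff);
    mb->processed++;
    mb->ctrl &= ~MBOX_CTRL_DOORBELL; /* assumption: device clears the bit */
}

/* Write the command register, then check the doorbell as the patch does. */
static void toy_write_cmd(struct toy_mailbox *mb, uint64_t value)
{
    mb->cmd = value;
    if (mb->ctrl & MBOX_CTRL_DOORBELL) {
        toy_process(mb);
    }
}

/* Write the control register; setting DOORBELL kicks off processing. */
static void toy_write_ctrl(struct toy_mailbox *mb, uint32_t value)
{
    mb->ctrl = value;
    if (mb->ctrl & MBOX_CTRL_DOORBELL) {
        toy_process(mb);
    }
}
```

A guest would first write the opcode (for example 0x40) to the command
register, which does nothing on its own, and then set the doorbell bit,
at which point the command is processed exactly once.
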
 hw/cxl/cxl-device-utils.c   | 131 ++++++++++++++++++++++++++++++++++++
 hw/cxl/cxl-mailbox-utils.c  |  93 +++++++++++++++++++++++++
 hw/cxl/meson.build          |   1 +
 include/hw/cxl/cxl.h        |   3 +
 include/hw/cxl/cxl_device.h |  10 ++-
 5 files changed, 237 insertions(+), 1 deletion(-)
 create mode 100644 hw/cxl/cxl-mailbox-utils.c

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index 78144e103c..aec8b0d421 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -55,6 +55,123 @@ static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
     return ldn_le_p(&retval, size);
 }
 
+static uint64_t mailbox_reg_read(void *opaque, hwaddr offset, unsigned size)
+{
+    CXLDeviceState *cxl_dstate = opaque;
+
+    switch (size) {
+    case 4:
+        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
+            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+            return 0;
+        }
+        break;
+    case 8:
+        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
+            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+            return 0;
+        }
+        break;
+    default:
+        qemu_log_mask(LOG_UNIMP, "%uB mailbox register read\n", size);
+        return 0;
+    }
+
+    return ldn_le_p(cxl_dstate->mbox_reg_state + offset, size);
+}
+
+static void mailbox_mem_writel(uint32_t *reg_state, hwaddr offset,
+                               uint64_t value)
+{
+    switch (offset) {
+    case A_CXL_DEV_MAILBOX_CTRL:
+        /* fallthrough */
+    case A_CXL_DEV_MAILBOX_CAP:
+        /* RO register */
+        break;
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "%s Unexpected 32-bit access to 0x%" PRIx64 " (WI)\n",
+                      __func__, offset);
+        break;
+    }
+
+    stl_le_p((uint8_t *)reg_state + offset, value);
+}
+
+static void mailbox_mem_writeq(uint64_t *reg_state, hwaddr offset,
+                               uint64_t value)
+{
+    switch (offset) {
+    case A_CXL_DEV_MAILBOX_CMD:
+        break;
+    case A_CXL_DEV_BG_CMD_STS:
+        /* BG not supported */
+        /* fallthrough */
+    case A_CXL_DEV_MAILBOX_STS:
+        /* Read only register, will get updated by the state machine */
+        return;
+    case A_CXL_DEV_MAILBOX_CAP:
+    case A_CXL_DEV_MAILBOX_CTRL:
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "%s Unexpected 64-bit access to 0x%" PRIx64 " (WI)\n",
+                      __func__, offset);
+        return;
+    }
+
+    stq_le_p((uint8_t *)reg_state + offset, value);
+}
+
+static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
+                              unsigned size)
+{
+    CXLDeviceState *cxl_dstate = opaque;
+
+    /*
+     * Lock is needed to prevent concurrent writes as well as to prevent writes
+     * coming in while the firmware is processing. Without background commands
+     * or the second mailbox implemented, this serves no purpose since the
+     * memory access is synchronized at a higher level (per memory region).
+     */
+    RCU_READ_LOCK_GUARD();
+
+    switch (size) {
+    case 4:
+        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
+            qemu_log_mask(LOG_UNIMP, "Unaligned register write\n");
+            return;
+        }
+        mailbox_mem_writel(cxl_dstate->mbox_reg_state32, offset, value);
+        break;
+    case 8:
+        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
+            qemu_log_mask(LOG_UNIMP, "Unaligned register write\n");
+            return;
+        }
+        mailbox_mem_writeq(cxl_dstate->mbox_reg_state64, offset, value);
+        break;
+    }
+
+    if (ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
+                         DOORBELL))
+        process_mailbox(cxl_dstate);
+}
+
+static const MemoryRegionOps mailbox_ops = {
+    .read = mailbox_reg_read,
+    .write = mailbox_reg_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+};
+
 static const MemoryRegionOps dev_ops = {
     .read = dev_reg_read,
     .write = NULL,
@@ -94,12 +211,23 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
                           "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
     memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
                           "device-status", CXL_DEVICE_REGISTERS_LENGTH);
+    memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
+                          "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
 
     memory_region_add_subregion(&cxl_dstate->device_registers, 0,
                                 &cxl_dstate->caps);
     memory_region_add_subregion(&cxl_dstate->device_registers,
                                 CXL_DEVICE_REGISTERS_OFFSET,
                                 &cxl_dstate->device);
+    memory_region_add_subregion(&cxl_dstate->device_registers,
+                                CXL_MAILBOX_REGISTERS_OFFSET, &cxl_dstate->mailbox);
+}
+
+static void mailbox_init_common(uint32_t *mbox_regs)
+{
+    /* 2048 payload size, with no interrupt or background support */
+    ARRAY_FIELD_DP32(mbox_regs, CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE,
+                     CXL_MAILBOX_PAYLOAD_SHIFT);
 }
 
 void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
@@ -113,4 +241,7 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
     ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY2, CAP_COUNT, cap_count);
 
     cxl_device_cap_init(cxl_dstate, DEVICE, 1);
+    cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
+
+    mailbox_init_common(cxl_dstate->mbox_reg_state32);
 }
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
new file mode 100644
index 0000000000..2d1b0ef9e4
--- /dev/null
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -0,0 +1,93 @@
+/*
+ * CXL Utility library for mailbox interface
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See the
+ * COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "hw/pci/pci.h"
+#include "hw/cxl/cxl.h"
+
+/* 8.2.8.4.5.1 Command Return Codes */
+enum {
+    RET_SUCCESS                 = 0x0,
+    RET_BG_STARTED              = 0x1, /* Background Command Started */
+    RET_EINVAL                  = 0x2, /* Invalid Input */
+    RET_ENOTSUP                 = 0x3, /* Unsupported */
+    RET_ENODEV                  = 0x4, /* Internal Error */
+    RET_ERESTART                = 0x5, /* Retry Required */
+    RET_EBUSY                   = 0x6, /* Busy */
+    RET_MEDIA_DISABLED          = 0x7, /* Media Disabled */
+    RET_FW_EBUSY                = 0x8, /* FW Transfer in Progress */
+    RET_FW_OOO                  = 0x9, /* FW Transfer Out of Order */
+    RET_FW_AUTH                 = 0xa, /* FW Authentication Failed */
+    RET_FW_EBADSLT              = 0xb, /* Invalid Slot */
+    RET_FW_ROLLBACK             = 0xc, /* Activation Failed, FW Rolled Back */
+    RET_FW_REBOOT               = 0xd, /* Activation Failed, Cold Reset Required */
+    RET_ENOENT                  = 0xe, /* Invalid Handle */
+    RET_EFAULT                  = 0xf, /* Invalid Physical Address */
+    RET_POISON_E2BIG            = 0x10, /* Inject Poison Limit Reached */
+    RET_EIO                     = 0x11, /* Permanent Media Failure */
+    RET_ECANCELED               = 0x12, /* Aborted */
+    RET_EACCESS                 = 0x13, /* Invalid Security State */
+    RET_EPERM                   = 0x14, /* Incorrect Passphrase */
+    RET_EPROTONOSUPPORT         = 0x15, /* Unsupported Mailbox */
+    RET_EMSGSIZE                = 0x16, /* Invalid Payload Length */
+    RET_MAX                     = 0x17
+};
+
+void process_mailbox(CXLDeviceState *cxl_dstate)
+{
+    uint16_t ret = RET_SUCCESS;
+    uint32_t ret_len = 0;
+    uint64_t status_reg;
+
+    /*
+     * current state of mailbox interface
+     *  uint32_t mbox_cap_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CAP];
+     *  uint32_t mbox_ctrl_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CTRL];
+     *  uint64_t status_reg = *(uint64_t *)&cxl_dstate->reg_state[A_CXL_DEV_MAILBOX_STS];
+     */
+    uint64_t command_reg =
+        ldq_le_p(cxl_dstate->mbox_reg_state + A_CXL_DEV_MAILBOX_CMD);
+
+    /* Check if we have to do anything */
+    if (!ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL, DOORBELL)) {
+        qemu_log_mask(LOG_UNIMP, "Corrupt internal state for firmware\n");
+        return;
+    }
+
+    uint8_t cmd_set = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET);
+    uint8_t cmd = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND);
+    (void)cmd;
+    switch (cmd_set) {
+    default:
+        ret = RET_ENOTSUP;
+    }
+
+    /*
+     * Set the return code
+     * XXX: it's a 64b register, but we're not setting the vendor, so we can get
+     * away with this
+     */
+    status_reg = FIELD_DP64(0, CXL_DEV_MAILBOX_STS, ERRNO, ret);
+
+    /*
+     * Set the return length
+     */
+    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET, 0);
+    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND, 0);
+    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH, ret_len);
+
+    stq_le_p(cxl_dstate->mbox_reg_state + A_CXL_DEV_MAILBOX_CMD, command_reg);
+    stq_le_p(cxl_dstate->mbox_reg_state + A_CXL_DEV_MAILBOX_STS, status_reg);
+
+    /* Tell the host we're done */
+    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
+                     DOORBELL, 0);
+}
+
diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
index 47154d6850..0eca715d10 100644
--- a/hw/cxl/meson.build
+++ b/hw/cxl/meson.build
@@ -1,4 +1,5 @@
 softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
   'cxl-component-utils.c',
   'cxl-device-utils.c',
+  'cxl-mailbox-utils.c',
 ))
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 23f52c4cf9..362cda40de 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -14,5 +14,8 @@
 #include "cxl_component.h"
 #include "cxl_device.h"
 
+#define COMPONENT_REG_BAR_IDX 0
+#define DEVICE_REG_BAR_IDX 2
+
 #endif
 
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index 2c674fdc9c..df00998def 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -87,6 +87,11 @@ typedef struct cxl_device_state {
         uint8_t caps_reg_state[CXL_DEVICE_CAP_REG_SIZE * 4]; /* ARRAY + 3 CAPS */
         uint32_t caps_reg_state32[0];
     };
+    union {
+        uint8_t mbox_reg_state[CXL_MAILBOX_REGISTERS_LENGTH];
+        uint32_t mbox_reg_state32[0];
+        uint64_t mbox_reg_state64[0];
+    };
 } CXLDeviceState;
 
 /* Initialize the register block for a device */
@@ -127,6 +132,8 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
 CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
                                                CXL_DEVICE_CAP_REG_SIZE)
 
+void process_mailbox(CXLDeviceState *cxl_dstate);
+
 #define cxl_device_cap_init(dstate, reg, cap_id)                                   \
     do {                                                                           \
         uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
@@ -155,7 +162,8 @@ REG32(CXL_DEV_MAILBOX_CTRL, 4)
     FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
 
 REG32(CXL_DEV_MAILBOX_CMD, 8)
-    FIELD(CXL_DEV_MAILBOX_CMD, OP, 0, 16)
+    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND, 0, 8)
+    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND_SET, 8, 8)
     FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
 
 /* XXX: actually a 64b register */
-- 
2.29.2




* [RFC PATCH 08/25] hw/cxl/device: Add memory devices (8.2.8.5)
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (6 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 07/25] hw/cxl/device: Implement basic mailbox (8.2.8.4) Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-16 16:37   ` Jonathan Cameron
  2020-11-11  5:47 ` [RFC PATCH 09/25] hw/pxb: Use a type for realizing expanders Ben Widawsky
                   ` (18 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky

Memory devices implement extra capabilities on top of CXL devices. This
adds the memory device status register (read-only) and the Identify
Memory Device mailbox command.
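
As a rough illustration of the status register this patch exposes, the
sketch below packs the two fields that mdev_reg_read() sets, using the
bit positions from the CXL_MEM_DEV_STS definition in this patch
(MEDIA_STATUS at bits 3:2, MBOX_READY at bit 4). field_dp64() is a
hypothetical standalone stand-in for QEMU's FIELD_DP64 macro, not the
real implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout copied from the CXL_MEM_DEV_STS definition in this patch. */
enum {
    MEDIA_STATUS_SHIFT = 2, MEDIA_STATUS_LEN = 2,
    MBOX_READY_SHIFT   = 4, MBOX_READY_LEN   = 1,
};

/*
 * Hypothetical stand-in for FIELD_DP64: deposit `val` into the `len`-bit
 * field that starts at bit `shift`, leaving all other bits untouched.
 */
static uint64_t field_dp64(uint64_t reg, unsigned shift, unsigned len,
                           uint64_t val)
{
    assert(len > 0 && len < 64 && shift + len <= 64);
    uint64_t mask = ((UINT64_C(1) << len) - 1) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Build the value the emulated device reports: media ready, mailbox ready. */
static uint64_t mem_dev_status(void)
{
    uint64_t v = 0;

    v = field_dp64(v, MEDIA_STATUS_SHIFT, MEDIA_STATUS_LEN, 1);
    v = field_dp64(v, MBOX_READY_SHIFT, MBOX_READY_LEN, 1);
    return v;
}
```

With both fields set to 1, the register reads back as 0x14.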

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/cxl/cxl-device-utils.c   | 48 ++++++++++++++++++++++++++++++++++++-
 hw/cxl/cxl-mailbox-utils.c  | 48 ++++++++++++++++++++++++++++++++++++-
 include/hw/cxl/cxl_device.h | 15 ++++++++++++
 3 files changed, 109 insertions(+), 2 deletions(-)

diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
index aec8b0d421..6544a68567 100644
--- a/hw/cxl/cxl-device-utils.c
+++ b/hw/cxl/cxl-device-utils.c
@@ -158,6 +158,45 @@ static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
         process_mailbox(cxl_dstate);
 }
 
+static uint64_t mdev_reg_read(void *opaque, hwaddr offset, unsigned size)
+{
+    uint64_t retval = 0;
+
+    retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MEDIA_STATUS, 1);
+    retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MBOX_READY, 1);
+
+    switch (size) {
+    case 4:
+        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
+            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+            return 0;
+        }
+        break;
+    case 8:
+        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
+            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
+            return 0;
+        }
+        break;
+    }
+
+    return ldn_le_p(&retval, size);
+}
+
+static const MemoryRegionOps mdev_ops = {
+    .read = mdev_reg_read,
+    .write = NULL,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+    .impl = {
+        .min_access_size = 4,
+        .max_access_size = 8,
+    },
+};
+
 static const MemoryRegionOps mailbox_ops = {
     .read = mailbox_reg_read,
     .write = mailbox_reg_write,
@@ -213,6 +252,9 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
                           "device-status", CXL_DEVICE_REGISTERS_LENGTH);
     memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
                           "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
+    memory_region_init_io(&cxl_dstate->memory_device, obj, &mdev_ops,
+                          cxl_dstate, "memory device caps",
+                          CXL_MEMORY_DEVICE_REGISTERS_LENGTH);
 
     memory_region_add_subregion(&cxl_dstate->device_registers, 0,
                                 &cxl_dstate->caps);
@@ -221,6 +263,9 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
                                 &cxl_dstate->device);
     memory_region_add_subregion(&cxl_dstate->device_registers,
                                 CXL_MAILBOX_REGISTERS_OFFSET, &cxl_dstate->mailbox);
+    memory_region_add_subregion(&cxl_dstate->device_registers,
+                                CXL_MEMORY_DEVICE_REGISTERS_OFFSET,
+                                &cxl_dstate->memory_device);
 }
 
 static void mailbox_init_common(uint32_t *mbox_regs)
@@ -233,7 +278,7 @@ static void mailbox_init_common(uint32_t *mbox_regs)
 void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
 {
     uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
-    const int cap_count = 1;
+    const int cap_count = 3;
 
     /* CXL Device Capabilities Array Register */
     ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
@@ -242,6 +287,7 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
 
     cxl_device_cap_init(cxl_dstate, DEVICE, 1);
     cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
+    cxl_device_cap_init(cxl_dstate, MEMORY_DEVICE, 0x4000);
 
     mailbox_init_common(cxl_dstate->mbox_reg_state32);
 }
diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
index 2d1b0ef9e4..5d2579800e 100644
--- a/hw/cxl/cxl-mailbox-utils.c
+++ b/hw/cxl/cxl-mailbox-utils.c
@@ -12,6 +12,12 @@
 #include "hw/pci/pci.h"
 #include "hw/cxl/cxl.h"
 
+enum cxl_opcode {
+    CXL_EVENTS      = 0x1,
+    CXL_IDENTIFY    = 0x40,
+        #define CXL_IDENTIFY_MEMORY_DEVICE 0x0
+};
+
 /* 8.2.8.4.5.1 Command Return Codes */
 enum {
     RET_SUCCESS                 = 0x0,
@@ -40,6 +46,43 @@ enum {
     RET_MAX                     = 0x17
 };
 
+/* 8.2.9.5.1.1 */
+static int cmd_set_identify(CXLDeviceState *cxl_dstate, uint8_t cmd,
+                            uint32_t *ret_size)
+{
+    struct identify {
+        char fw_revision[0x10];
+        uint64_t total_capacity;
+        uint64_t volatile_capacity;
+        uint64_t persistent_capacity;
+        uint64_t partition_align;
+        uint16_t info_event_log_size;
+        uint16_t warning_event_log_size;
+        uint16_t failure_event_log_size;
+        uint16_t fatal_event_log_size;
+        uint32_t lsa_size;
+        uint8_t poison_list_max_mer[3];
+        uint16_t inject_poison_limit;
+        uint8_t poison_caps;
+        uint8_t qos_telemetry_caps;
+    } __attribute__((packed)) *id;
+    _Static_assert(sizeof(struct identify) == 0x43, "Bad identify size");
+
+    if (memory_region_size(cxl_dstate->pmem) < (256 << 20)) {
+        return RET_ENODEV;
+    }
+
+    /* PMEM only */
+    id = (struct identify *)((void *)cxl_dstate->mbox_reg_state +
+                             A_CXL_DEV_CMD_PAYLOAD);
+    snprintf(id->fw_revision, 0x10, "BWFW VERSION %02d", 0);
+    id->total_capacity = memory_region_size(cxl_dstate->pmem);
+    id->persistent_capacity = memory_region_size(cxl_dstate->pmem);
+
+    *ret_size = 0x43;
+    return RET_SUCCESS;
+}
+
 void process_mailbox(CXLDeviceState *cxl_dstate)
 {
     uint16_t ret = RET_SUCCESS;
@@ -63,8 +106,11 @@ void process_mailbox(CXLDeviceState *cxl_dstate)
 
     uint8_t cmd_set = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET);
     uint8_t cmd = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND);
-    (void)cmd;
     switch (cmd_set) {
+    case CXL_IDENTIFY:
+        ret = cmd_set_identify(cxl_dstate, cmd, &ret_len);
+        /* Fill in payload here */
+        break;
     default:
         ret = RET_ENOTSUP;
     }
diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
index df00998def..2cb2a9af3c 100644
--- a/include/hw/cxl/cxl_device.h
+++ b/include/hw/cxl/cxl_device.h
@@ -69,6 +69,10 @@
 #define CXL_MAILBOX_REGISTERS_LENGTH \
     (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
 
+#define CXL_MEMORY_DEVICE_REGISTERS_OFFSET \
+    (CXL_MAILBOX_REGISTERS_OFFSET + CXL_MAILBOX_REGISTERS_LENGTH)
+#define CXL_MEMORY_DEVICE_REGISTERS_LENGTH 0x8
+
 typedef struct cxl_device_state {
     /* Boss container and caps registers */
     MemoryRegion device_registers;
@@ -76,6 +80,7 @@ typedef struct cxl_device_state {
     MemoryRegion caps;
     MemoryRegion device;
     MemoryRegion mailbox;
+    MemoryRegion memory_device;
 
     MemoryRegion *pmem;
     MemoryRegion *vmem;
@@ -131,6 +136,8 @@ REG32(CXL_DEV_CAP_ARRAY2, 4) /* We're going to pretend it's 64b */
 CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
 CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
                                                CXL_DEVICE_CAP_REG_SIZE)
+CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MEMORY_DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET + \
+                                                     CXL_DEVICE_CAP_REG_SIZE * 2)
 
 void process_mailbox(CXLDeviceState *cxl_dstate);
 
@@ -181,4 +188,12 @@ REG32(CXL_DEV_BG_CMD_STS, 0x18)
 
 REG32(CXL_DEV_CMD_PAYLOAD, 0x20)
 
+/* XXX: actually a 64b register */
+REG32(CXL_MEM_DEV_STS, 0)
+    FIELD(CXL_MEM_DEV_STS, FATAL, 0, 1)
+    FIELD(CXL_MEM_DEV_STS, FW_HALT, 1, 1)
+    FIELD(CXL_MEM_DEV_STS, MEDIA_STATUS, 2, 2)
+    FIELD(CXL_MEM_DEV_STS, MBOX_READY, 4, 1)
+    FIELD(CXL_MEM_DEV_STS, RESET_NEEDED, 5, 3)
+
 #endif
-- 
2.29.2




* [RFC PATCH 09/25] hw/pxb: Use a type for realizing expanders
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (7 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 08/25] hw/cxl/device: Add memory devices (8.2.8.5) Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 10/25] hw/pci/cxl: Create a CXL bus type Ben Widawsky
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky, Michael S. Tsirkin

This opens up the possibility for more types of expanders (other than
PCI and PCIe). We'll need this to create a CXL expander.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/pci-bridge/pci_expander_bridge.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index aedded1064..232b7ce305 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -24,6 +24,8 @@
 #include "hw/boards.h"
 #include "qom/object.h"
 
+enum BusType { PCI, PCIE };
+
 #define TYPE_PXB_BUS "pxb-bus"
 typedef struct PXBBus PXBBus;
 DECLARE_INSTANCE_CHECKER(PXBBus, PXB_BUS,
@@ -214,7 +216,8 @@ static gint pxb_compare(gconstpointer a, gconstpointer b)
            0;
 }
 
-static void pxb_dev_realize_common(PCIDevice *dev, bool pcie, Error **errp)
+static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
+                                   Error **errp)
 {
     PXBDev *pxb = convert_to_pxb(dev);
     DeviceState *ds, *bds = NULL;
@@ -239,7 +242,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, bool pcie, Error **errp)
     }
 
     ds = qdev_new(TYPE_PXB_HOST);
-    if (pcie) {
+    if (type == PCIE) {
         bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_PCIE_BUS);
     } else {
         bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
@@ -287,7 +290,7 @@ static void pxb_dev_realize(PCIDevice *dev, Error **errp)
         return;
     }
 
-    pxb_dev_realize_common(dev, false, errp);
+    pxb_dev_realize_common(dev, PCI, errp);
 }
 
 static void pxb_dev_exitfn(PCIDevice *pci_dev)
@@ -339,7 +342,7 @@ static void pxb_pcie_dev_realize(PCIDevice *dev, Error **errp)
         return;
     }
 
-    pxb_dev_realize_common(dev, true, errp);
+    pxb_dev_realize_common(dev, PCIE, errp);
 }
 
 static void pxb_pcie_dev_class_init(ObjectClass *klass, void *data)
-- 
2.29.2




* [RFC PATCH 10/25] hw/pci/cxl: Create a CXL bus type
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (8 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 09/25] hw/pxb: Use a type for realizing expanders Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 11/25] hw/pxb: Allow creation of a CXL PXB (host bridge) Ben Widawsky
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky, Michael S. Tsirkin

The easiest way to differentiate a CXL bus from a PCIe bus is with a
flag. A CXL bus, in hardware, is backward compatible with PCIe, so the
code tries to keep the two in sync as much as possible.

The alternative would be to cast the bus to the correct type. The flag
approach is less code, and it helps debugging because the bus type is
visible by simply looking at the flags.
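
A minimal sketch of the flag approach, outside of QEMU: the struct and
names below are hypothetical and only mirror the PCIBusFlags values
touched by this patch (0x0001, 0x0002, and the new 0x0004).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the PCIBusFlags values used in this patch. */
enum bus_flags {
    BUS_IS_ROOT               = 0x0001,
    BUS_EXTENDED_CONFIG_SPACE = 0x0002,
    BUS_CXL                   = 0x0004,
};

/* Hypothetical stand-in for PCIBus; only the flags word matters here. */
struct bus {
    uint32_t flags;
};

/* Same shape as pci_bus_is_cxl(): a single mask test, no type cast. */
static bool bus_is_cxl(const struct bus *b)
{
    assert(b != NULL);
    return (b->flags & BUS_CXL) != 0;
}
```

Marking a bus as CXL is then a single `flags |= BUS_CXL`, and the other
flag bits are unaffected.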

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/pci-bridge/pci_expander_bridge.c | 9 ++++++++-
 include/hw/pci/pci_bus.h            | 7 +++++++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index 232b7ce305..88c45dc3b5 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -24,7 +24,7 @@
 #include "hw/boards.h"
 #include "qom/object.h"
 
-enum BusType { PCI, PCIE };
+enum BusType { PCI, PCIE, CXL };
 
 #define TYPE_PXB_BUS "pxb-bus"
 typedef struct PXBBus PXBBus;
@@ -35,6 +35,10 @@ DECLARE_INSTANCE_CHECKER(PXBBus, PXB_BUS,
 DECLARE_INSTANCE_CHECKER(PXBBus, PXB_PCIE_BUS,
                          TYPE_PXB_PCIE_BUS)
 
+#define TYPE_PXB_CXL_BUS "pxb-cxl-bus"
+DECLARE_INSTANCE_CHECKER(PXBBus, PXB_CXL_BUS,
+                         TYPE_PXB_CXL_BUS)
+
 struct PXBBus {
     /*< private >*/
     PCIBus parent_obj;
@@ -244,6 +248,9 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
     ds = qdev_new(TYPE_PXB_HOST);
     if (type == PCIE) {
         bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_PCIE_BUS);
+    } else if (type == CXL) {
+        bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_CXL_BUS);
+        bus->flags |= PCI_BUS_CXL;
     } else {
         bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
         bds = qdev_new("pci-bridge");
diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
index 347440d42c..eb94e7e85c 100644
--- a/include/hw/pci/pci_bus.h
+++ b/include/hw/pci/pci_bus.h
@@ -24,6 +24,8 @@ enum PCIBusFlags {
     PCI_BUS_IS_ROOT                                         = 0x0001,
     /* PCIe extended configuration space is accessible on this bus */
     PCI_BUS_EXTENDED_CONFIG_SPACE                           = 0x0002,
+    /* This is a CXL Type BUS */
+    PCI_BUS_CXL                                             = 0x0004,
 };
 
 struct PCIBus {
@@ -53,6 +55,11 @@ struct PCIBus {
     Notifier machine_done;
 };
 
+static inline bool pci_bus_is_cxl(PCIBus *bus)
+{
+    return !!(bus->flags & PCI_BUS_CXL);
+}
+
 static inline bool pci_bus_is_root(PCIBus *bus)
 {
     return !!(bus->flags & PCI_BUS_IS_ROOT);
-- 
2.29.2




* [RFC PATCH 11/25] hw/pxb: Allow creation of a CXL PXB (host bridge)
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (9 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 10/25] hw/pci/cxl: Create a CXL bus type Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-16 16:44   ` Jonathan Cameron
  2020-11-11  5:47 ` [RFC PATCH 12/25] acpi/pci: Consolidate host bridge setup Ben Widawsky
                   ` (15 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky, Michael S. Tsirkin

This works like adding a typical pxb device, except the name is
'pxb-cxl' instead of 'pxb-pcie'. An example command line would be as
follows:
  -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1

A CXL PXB is backward compatible with PCIe. What this means in practice
is that an operating system that is unaware of CXL should still be able
to enumerate this topology as if it were PCIe.
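
The backward compatibility has one practical consequence for the code:
because TYPE_CXL_BUS derives from TYPE_PCIE_BUS in this patch, a CXL bus
presumably also passes the "is express" check, so the CXL test has to
come first. The sketch below shows that ordering with hypothetical names
(pxb_kind_for, legacy_kind_for); it is not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>

enum pxb_kind { PXB_KIND_PCI, PXB_KIND_PCIE, PXB_KIND_CXL };

/* CXL-aware classification: test the most specific type first. */
static enum pxb_kind pxb_kind_for(bool is_express, bool is_cxl)
{
    /* Assumption: in this model a CXL bus is always also a PCIe bus. */
    assert(!is_cxl || is_express);

    if (is_cxl) {
        return PXB_KIND_CXL;
    }
    return is_express ? PXB_KIND_PCIE : PXB_KIND_PCI;
}

/* What a CXL-unaware consumer concludes when looking at the same bus. */
static enum pxb_kind legacy_kind_for(bool is_express)
{
    return is_express ? PXB_KIND_PCIE : PXB_KIND_PCI;
}
```

A CXL-unaware caller sees the same bus as plain PCIe, which matches the
enumeration behavior described above.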

One can create multiple CXL PXB host bridges, but a host bridge can only
be connected to the main root bus. Host bridges cannot appear elsewhere
in the topology.

Note that as of this patch, the ACPI tables needed for the host bridge
(specifically, an ACPI object in _SB named ACPI0016 and the CEDT) aren't
created. So while this patch internally creates it, it cannot be
properly used by an operating system or other system software.

Upcoming patches will allow creating multiple host bridges.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/pci-bridge/pci_expander_bridge.c | 67 ++++++++++++++++++++++++++++-
 hw/pci/pci.c                        |  7 +++
 include/hw/pci/pci.h                |  6 +++
 3 files changed, 78 insertions(+), 2 deletions(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index 88c45dc3b5..3a8d815231 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -56,6 +56,10 @@ DECLARE_INSTANCE_CHECKER(PXBDev, PXB_DEV,
 DECLARE_INSTANCE_CHECKER(PXBDev, PXB_PCIE_DEV,
                          TYPE_PXB_PCIE_DEVICE)
 
+#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
+DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
+                         TYPE_PXB_CXL_DEVICE)
+
 struct PXBDev {
     /*< private >*/
     PCIDevice parent_obj;
@@ -67,6 +71,11 @@ struct PXBDev {
 
 static PXBDev *convert_to_pxb(PCIDevice *dev)
 {
+    /* A CXL PXB's parent bus is PCIe, so the normal check won't work */
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PXB_CXL_DEVICE)) {
+        return PXB_CXL_DEV(dev);
+    }
+
     return pci_bus_is_express(pci_get_bus(dev))
         ? PXB_PCIE_DEV(dev) : PXB_DEV(dev);
 }
@@ -111,11 +120,20 @@ static const TypeInfo pxb_pcie_bus_info = {
     .class_init    = pxb_bus_class_init,
 };
 
+static const TypeInfo pxb_cxl_bus_info = {
+    .name          = TYPE_PXB_CXL_BUS,
+    .parent        = TYPE_CXL_BUS,
+    .instance_size = sizeof(PXBBus),
+    .class_init    = pxb_bus_class_init,
+};
+
 static const char *pxb_host_root_bus_path(PCIHostState *host_bridge,
                                           PCIBus *rootbus)
 {
-    PXBBus *bus = pci_bus_is_express(rootbus) ?
-                  PXB_PCIE_BUS(rootbus) : PXB_BUS(rootbus);
+    PXBBus *bus = pci_bus_is_cxl(rootbus) ?
+                      PXB_CXL_BUS(rootbus) :
+                      pci_bus_is_express(rootbus) ? PXB_PCIE_BUS(rootbus) :
+                                                    PXB_BUS(rootbus);
 
     snprintf(bus->bus_path, 8, "0000:%02x", pxb_bus_num(rootbus));
     return bus->bus_path;
@@ -380,13 +398,58 @@ static const TypeInfo pxb_pcie_dev_info = {
     },
 };
 
+static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
+{
+    /* A CXL PXB's parent bus is still PCIe */
+    if (!pci_bus_is_express(pci_get_bus(dev))) {
+        error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
+        return;
+    }
+
+    pxb_dev_realize_common(dev, CXL, errp);
+}
+
+static void pxb_cxl_dev_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc   = DEVICE_CLASS(klass);
+    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+    k->realize             = pxb_cxl_dev_realize;
+    k->exit                = pxb_dev_exitfn;
+    k->vendor_id           = PCI_VENDOR_ID_INTEL;
+    k->device_id           = 0xabcd;
+    k->class_id            = PCI_CLASS_BRIDGE_HOST;
+    k->subsystem_vendor_id = PCI_VENDOR_ID_INTEL;
+
+    dc->desc = "CXL Host Bridge";
+    device_class_set_props(dc, pxb_dev_properties);
+    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
+
+    /* Host bridges aren't hotpluggable. FIXME: spec reference */
+    dc->hotpluggable = false;
+}
+
+static const TypeInfo pxb_cxl_dev_info = {
+    .name          = TYPE_PXB_CXL_DEVICE,
+    .parent        = TYPE_PCI_DEVICE,
+    .instance_size = sizeof(PXBDev),
+    .class_init    = pxb_cxl_dev_class_init,
+    .interfaces =
+        (InterfaceInfo[]){
+            { INTERFACE_CONVENTIONAL_PCI_DEVICE },
+            {},
+        },
+};
+
 static void pxb_register_types(void)
 {
     type_register_static(&pxb_bus_info);
     type_register_static(&pxb_pcie_bus_info);
+    type_register_static(&pxb_cxl_bus_info);
     type_register_static(&pxb_host_info);
     type_register_static(&pxb_dev_info);
     type_register_static(&pxb_pcie_dev_info);
+    type_register_static(&pxb_cxl_dev_info);
 }
 
 type_init(pxb_register_types)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index db88788c4b..67eed889a4 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -220,6 +220,12 @@ static const TypeInfo pcie_bus_info = {
     .class_init = pcie_bus_class_init,
 };
 
+static const TypeInfo cxl_bus_info = {
+    .name       = TYPE_CXL_BUS,
+    .parent     = TYPE_PCIE_BUS,
+    .class_init = pcie_bus_class_init,
+};
+
 static PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num);
 static void pci_update_mappings(PCIDevice *d);
 static void pci_irq_handler(void *opaque, int irq_num, int level);
@@ -2847,6 +2853,7 @@ static void pci_register_types(void)
 {
     type_register_static(&pci_bus_info);
     type_register_static(&pcie_bus_info);
+    type_register_static(&cxl_bus_info);
     type_register_static(&conventional_pci_interface_info);
     type_register_static(&cxl_interface_info);
     type_register_static(&pcie_interface_info);
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 4e6fd59fdd..52267ff69e 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -405,6 +405,7 @@ typedef PCIINTxRoute (*pci_route_irq_fn)(void *opaque, int pin);
 #define TYPE_PCI_BUS "PCI"
 OBJECT_DECLARE_TYPE(PCIBus, PCIBusClass, PCI_BUS)
 #define TYPE_PCIE_BUS "PCIE"
+#define TYPE_CXL_BUS "CXL"
 
 bool pci_bus_is_express(PCIBus *bus);
 
@@ -753,6 +754,11 @@ static inline void pci_irq_pulse(PCIDevice *pci_dev)
     pci_irq_deassert(pci_dev);
 }
 
+static inline int pci_is_cxl(const PCIDevice *d)
+{
+    return d->cap_present & QEMU_PCIE_CAP_CXL;
+}
+
 static inline int pci_is_express(const PCIDevice *d)
 {
     return d->cap_present & QEMU_PCI_CAP_EXPRESS;
-- 
2.29.2




* [RFC PATCH 12/25] acpi/pci: Consolidate host bridge setup
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (10 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 11/25] hw/pxb: Allow creation of a CXL PXB (host bridge) Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-12 17:46   ` Ben Widawsky
  2020-11-16 16:45   ` Jonathan Cameron
  2020-11-11  5:47 ` [RFC PATCH 13/25] hw/pci: Plumb _UID through host bridges Ben Widawsky
                   ` (14 subsequent siblings)
  26 siblings, 2 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, Eduardo Habkost, Michael S. Tsirkin, Vishal Verma,
	Paolo Bonzini, Igor Mammedov, Dan Williams, Richard Henderson

Move the _HID, _CID, _UID, and _OSC setup for PCI and PCIe host bridges
into a single helper. This cleanup will make it easier to add support
for CXL to the mix.
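
To show the shape of the consolidation without the AML builder, here is
a small standalone sketch that returns the identifiers each host bridge
type gets. The PNP IDs are the ones used in the diff below; hb_ids and
hb_ids_for are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum hb_type { HB_PCI, HB_PCIE };

/* The per-type data that would otherwise be emitted inline in three places. */
struct hb_ids {
    const char *hid;   /* _HID */
    const char *cid;   /* _CID, or NULL when none is emitted */
    bool needs_osc;    /* whether an _OSC method is attached */
};

static struct hb_ids hb_ids_for(enum hb_type type)
{
    assert(type == HB_PCI || type == HB_PCIE);

    if (type == HB_PCI) {
        return (struct hb_ids){ "PNP0A03", NULL, false };
    }
    /* PCIe: express _HID, with the PCI ID as a compatible ID. */
    return (struct hb_ids){ "PNP0A08", "PNP0A03", true };
}
```

Adding a third host bridge type would then mean one more enum value and
one more branch, rather than touching every call site.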

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/i386/acpi-build.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 4f66642d88..99b3088c9e 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1486,6 +1486,20 @@ static void build_smb0(Aml *table, I2CBus *smbus, int devnr, int func)
     aml_append(table, scope);
 }
 
+enum { PCI, PCIE };
+static void init_pci_acpi(Aml *dev, int uid, int type)
+{
+    if (type == PCI) {
+        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
+        aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
+    } else {
+        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
+        aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
+        aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
+        aml_append(dev, build_q35_osc_method());
+    }
+}
+
 static void
 build_dsdt(GArray *table_data, BIOSLinker *linker,
            AcpiPmInfo *pm, AcpiMiscInfo *misc,
@@ -1514,9 +1528,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
     if (misc->is_piix4) {
         sb_scope = aml_scope("_SB");
         dev = aml_device("PCI0");
-        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
+        init_pci_acpi(dev, 0, PCI);
         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
-        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
         aml_append(sb_scope, dev);
         aml_append(dsdt, sb_scope);
 
@@ -1530,11 +1543,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
     } else {
         sb_scope = aml_scope("_SB");
         dev = aml_device("PCI0");
-        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
-        aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
+        init_pci_acpi(dev, 0, PCIE);
         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
-        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
-        aml_append(dev, build_q35_osc_method());
         aml_append(sb_scope, dev);
 
         if (pm->smi_on_cpuhp) {
@@ -1636,15 +1646,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 
             scope = aml_scope("\\_SB");
             dev = aml_device("PC%.02X", bus_num);
-            aml_append(dev, aml_name_decl("_UID", aml_int(bus_num)));
             aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
-            if (pci_bus_is_express(bus)) {
-                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
-                aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
-                aml_append(dev, build_q35_osc_method());
-            } else {
-                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
-            }
+            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
 
             if (numa_node != NUMA_NODE_UNASSIGNED) {
                 aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
-- 
2.29.2




* [RFC PATCH 13/25] hw/pci: Plumb _UID through host bridges
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (11 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 12/25] acpi/pci: Consolidate host bridge setup Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 14/25] hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142) Ben Widawsky
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, Eduardo Habkost, Michael S. Tsirkin, Vishal Verma,
	Paolo Bonzini, Igor Mammedov, Dan Williams, Richard Henderson

Currently, QEMU makes _UID equivalent to the bus number (_BBN). While
there is nothing wrong with doing it this way, the CXL spec relies
heavily on _UID to identify host bridges, and nothing ties _UID to the
bus number. Having a distinct UID solves two problems. First, it gets
us around the limit of 256 values imposed by the current maximum bus
number. Second, it allows us to replicate hardware configurations where
the bus number and the UID aren't equivalent. The latter helps our
development and debugging using QEMU.

The other way to do this would be to implement the expanded bus
numbering, but having an explicit uid makes more sense when trying to
replicate real hardware configurations.

The QEMU command line to use this would be:
  -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1,uid=x

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

--

I'm guessing this patch will be somewhat controversial. For early CXL
work, this can be dropped without too much heartache.
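
For readers skimming the diff, here is a minimal standalone sketch of the
plumbing: a per-class hook whose default returns -1 and which the expander
bridge overrides with the user-supplied value. The demo_* names are
hypothetical stand-ins, not the QEMU types; only the shape of the dispatch
and the "negative means unset" check are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for PCIBus / PCIBusClass; not the QEMU types. */
struct demo_bus;

struct demo_bus_class {
    int32_t (*uid)(const struct demo_bus *bus);
};

struct demo_bus {
    const struct demo_bus_class *klass;
    uint8_t bus_nr; /* what _BBN would report */
    int32_t uid;    /* what _UID would report; -1 means "not set" */
};

/* Default hook: a plain bus has no UID of its own. */
static int32_t demo_default_uid(const struct demo_bus *bus)
{
    (void)bus;
    return -1;
}

/* Expander hook: report the uid stored on the device. */
static int32_t demo_pxb_uid(const struct demo_bus *bus)
{
    return bus->uid;
}

static const struct demo_bus_class demo_plain_class = { demo_default_uid };
static const struct demo_bus_class demo_pxb_class = { demo_pxb_uid };

/* Dispatch through the class, in the same spirit as pci_bus_uid(). */
static int32_t demo_bus_uid(const struct demo_bus *bus)
{
    assert(bus != NULL && bus->klass != NULL);
    return bus->klass->uid(bus);
}

/* Realize-time style check: a negative uid means it was never set. */
static int demo_uid_is_valid(int32_t uid)
{
    return uid >= 0;
}
```

The point of the separate field is visible in the struct: bus_nr stays an
8-bit value, while uid can exceed 255 and need not track bus_nr at all.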
---
 hw/i386/acpi-build.c                |  3 ++-
 hw/pci-bridge/pci_expander_bridge.c | 19 +++++++++++++++++++
 hw/pci/pci.c                        | 11 +++++++++++
 include/hw/pci/pci.h                |  1 +
 include/hw/pci/pci_bus.h            |  1 +
 5 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 99b3088c9e..aaed7da7dc 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1634,6 +1634,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         QLIST_FOREACH(bus, &bus->child, sibling) {
             uint8_t bus_num = pci_bus_num(bus);
             uint8_t numa_node = pci_bus_numa_node(bus);
+            int32_t uid = pci_bus_uid(bus);
 
             /* look only for expander root buses */
             if (!pci_bus_is_root(bus)) {
@@ -1647,7 +1648,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
             scope = aml_scope("\\_SB");
             dev = aml_device("PC%.02X", bus_num);
             aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
-            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
+            init_pci_acpi(dev, uid, pci_bus_is_express(bus) ? PCIE : PCI);
 
             if (numa_node != NUMA_NODE_UNASSIGNED) {
                 aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index 3a8d815231..d5b43a8a31 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -67,6 +67,7 @@ struct PXBDev {
 
     uint8_t bus_nr;
     uint16_t numa_node;
+    int32_t uid;
 };
 
 static PXBDev *convert_to_pxb(PCIDevice *dev)
@@ -98,12 +99,20 @@ static uint16_t pxb_bus_numa_node(PCIBus *bus)
     return pxb->numa_node;
 }
 
+static int32_t pxb_bus_uid(PCIBus *bus)
+{
+    PXBDev *pxb = convert_to_pxb(bus->parent_dev);
+
+    return pxb->uid;
+}
+
 static void pxb_bus_class_init(ObjectClass *class, void *data)
 {
     PCIBusClass *pbc = PCI_BUS_CLASS(class);
 
     pbc->bus_num = pxb_bus_num;
     pbc->numa_node = pxb_bus_numa_node;
+    pbc->uid = pxb_bus_uid;
 }
 
 static const TypeInfo pxb_bus_info = {
@@ -329,6 +338,7 @@ static Property pxb_dev_properties[] = {
     /* Note: 0 is not a legal PXB bus number. */
     DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
     DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
+    DEFINE_PROP_INT32("uid", PXBDev, uid, -1),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -400,12 +410,21 @@ static const TypeInfo pxb_pcie_dev_info = {
 
 static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
 {
+    PXBDev *pxb = convert_to_pxb(dev);
+
     /* A CXL PXB's parent bus is still PCIe */
     if (!pci_bus_is_express(pci_get_bus(dev))) {
         error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
         return;
     }
 
+    if (pxb->uid < 0) {
+        error_setg(errp, "pxb-cxl devices must have a valid uid (0-2147483647)");
+        return;
+    }
+
+    /* FIXME: Check that uid doesn't collide with UIDs of other host bridges */
+
     pxb_dev_realize_common(dev, CXL, errp);
 }
 
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 67eed889a4..f728975d32 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -168,6 +168,11 @@ static uint16_t pcibus_numa_node(PCIBus *bus)
     return NUMA_NODE_UNASSIGNED;
 }
 
+static int32_t pcibus_uid(PCIBus *bus)
+{
+    return -1;
+}
+
 static void pci_bus_class_init(ObjectClass *klass, void *data)
 {
     BusClass *k = BUS_CLASS(klass);
@@ -182,6 +187,7 @@ static void pci_bus_class_init(ObjectClass *klass, void *data)
 
     pbc->bus_num = pcibus_num;
     pbc->numa_node = pcibus_numa_node;
+    pbc->uid = pcibus_uid;
 }
 
 static const TypeInfo pci_bus_info = {
@@ -528,6 +534,11 @@ int pci_bus_numa_node(PCIBus *bus)
     return PCI_BUS_GET_CLASS(bus)->numa_node(bus);
 }
 
+int pci_bus_uid(PCIBus *bus)
+{
+    return PCI_BUS_GET_CLASS(bus)->uid(bus);
+}
+
 static int get_pci_config_device(QEMUFile *f, void *pv, size_t size,
                                  const VMStateField *field)
 {
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 52267ff69e..7a7b3da4df 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -462,6 +462,7 @@ static inline int pci_dev_bus_num(const PCIDevice *dev)
 }
 
 int pci_bus_numa_node(PCIBus *bus);
+int pci_bus_uid(PCIBus *bus);
 void pci_for_each_device(PCIBus *bus, int bus_num,
                          void (*fn)(PCIBus *bus, PCIDevice *d, void *opaque),
                          void *opaque);
diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
index eb94e7e85c..3c9fbc55bb 100644
--- a/include/hw/pci/pci_bus.h
+++ b/include/hw/pci/pci_bus.h
@@ -17,6 +17,7 @@ struct PCIBusClass {
 
     int (*bus_num)(PCIBus *bus);
     uint16_t (*numa_node)(PCIBus *bus);
+    int32_t (*uid)(PCIBus *bus);
 };
 
 enum PCIBusFlags {
-- 
2.29.2




* [RFC PATCH 14/25] hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (12 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 13/25] hw/pci: Plumb _UID through host bridges Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 15/25] acpi/pxb/cxl: Reserve host bridge MMIO Ben Widawsky
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky, Michael S. Tsirkin

CXL host bridges themselves may have MMIO. Since host bridges don't have
a BAR, their MMIO needs special handling: a dedicated sysbus device maps
the component registers at a fixed address.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

--

0xD0000000 is an arbitrary choice for the base of the host bridge MMIO.
I'm not sure what the right way is to find free space for
platform-hardcoded things like this.
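
To make the address arithmetic concrete, a small sketch of how each host
bridge's register block could be placed. It assumes a 64 KiB block per
bridge (the patch itself uses memory_region_size(mr); 0x10000 is the
value the later _CRS patch hard-codes) and adds an upper bound,
window_end, which is purely hypothetical and not something the patch
checks.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CXL_HOST_BASE 0xD0000000ULL /* same value as CXL_HOST_BASE */
#define DEMO_CRB_SIZE      0x10000ULL    /* assumed 64 KiB register block */

/*
 * Compute the MMIO base for the host bridge with the given uid.
 * window_end is a hypothetical exclusive upper bound on the region set
 * aside for these blocks. Returns false if the uid is unset (negative)
 * or the block would not fit below window_end.
 */
static bool demo_cxl_host_mmio_base(int32_t uid, uint64_t window_end,
                                    uint64_t *base)
{
    uint64_t start;

    if (uid < 0) {
        return false;
    }

    /* uid < 2^31 and the block size is 2^16, so this cannot overflow. */
    start = DEMO_CXL_HOST_BASE + (uint64_t)uid * DEMO_CRB_SIZE;
    if (start + DEMO_CRB_SIZE > window_end) {
        return false;
    }

    *base = start;
    return true;
}
```

With a large uid the computed base quickly leaves the 32-bit hole, which
is one reason the fixed base is flagged above as an open question.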
---
 hw/pci-bridge/pci_expander_bridge.c | 53 ++++++++++++++++++++++++++++-
 include/hw/cxl/cxl.h                |  2 ++
 2 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index d5b43a8a31..eca5c71d45 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -17,6 +17,7 @@
 #include "hw/pci/pci_host.h"
 #include "hw/qdev-properties.h"
 #include "hw/pci/pci_bridge.h"
+#include "hw/cxl/cxl.h"
 #include "qemu/range.h"
 #include "qemu/error-report.h"
 #include "qemu/module.h"
@@ -70,6 +71,12 @@ struct PXBDev {
     int32_t uid;
 };
 
+typedef struct CXLHost {
+    PCIHostState parent_obj;
+
+    CXLComponentState cxl_cstate;
+} CXLHost;
+
 static PXBDev *convert_to_pxb(PCIDevice *dev)
 {
     /* A CXL PXB's parent bus is PCIe, so the normal check won't work */
@@ -85,6 +92,9 @@ static GList *pxb_dev_list;
 
 #define TYPE_PXB_HOST "pxb-host"
 
+#define TYPE_PXB_CXL_HOST "pxb-cxl-host"
+#define PXB_CXL_HOST(obj) OBJECT_CHECK(CXLHost, (obj), TYPE_PXB_CXL_HOST)
+
 static int pxb_bus_num(PCIBus *bus)
 {
     PXBDev *pxb = convert_to_pxb(bus->parent_dev);
@@ -198,6 +208,46 @@ static const TypeInfo pxb_host_info = {
     .class_init    = pxb_host_class_init,
 };
 
+static void pxb_cxl_realize(DeviceState *dev, Error **errp)
+{
+    SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+    PCIHostState *phb = PCI_HOST_BRIDGE(dev);
+    CXLHost *cxl = PXB_CXL_HOST(dev);
+    CXLComponentState *cxl_cstate = &cxl->cxl_cstate;
+    struct MemoryRegion *mr = &cxl_cstate->crb.component_registers;
+
+    cxl_component_register_block_init(OBJECT(dev), cxl_cstate,
+                                      TYPE_PXB_CXL_HOST);
+    sysbus_init_mmio(sbd, mr);
+
+    /* FIXME: support multiple host bridges. */
+    sysbus_mmio_map(sbd, 0, CXL_HOST_BASE +
+                            memory_region_size(mr) * pci_bus_uid(phb->bus));
+}
+
+static void pxb_cxl_host_class_init(ObjectClass *class, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(class);
+    PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(class);
+
+    hc->root_bus_path = pxb_host_root_bus_path;
+    dc->fw_name = "cxl";
+    dc->realize = pxb_cxl_realize;
+    /* Reason: Internal part of the pxb/pxb-pcie device, not usable by itself */
+    dc->user_creatable = false;
+}
+
+/*
+ * This is a device to handle the MMIO for a CXL host bridge. It does nothing
+ * else.
+ */
+static const TypeInfo cxl_host_info = {
+    .name          = TYPE_PXB_CXL_HOST,
+    .parent        = TYPE_PCI_HOST_BRIDGE,
+    .instance_size = sizeof(CXLHost),
+    .class_init    = pxb_cxl_host_class_init,
+};
+
 /*
  * Registers the PXB bus as a child of pci host root bus.
  */
@@ -272,7 +322,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
         dev_name = dev->qdev.id;
     }
 
-    ds = qdev_new(TYPE_PXB_HOST);
+    ds = qdev_new(type == CXL ? TYPE_PXB_CXL_HOST : TYPE_PXB_HOST);
     if (type == PCIE) {
         bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_PCIE_BUS);
     } else if (type == CXL) {
@@ -466,6 +516,7 @@ static void pxb_register_types(void)
     type_register_static(&pxb_pcie_bus_info);
     type_register_static(&pxb_cxl_bus_info);
     type_register_static(&pxb_host_info);
+    type_register_static(&cxl_host_info);
     type_register_static(&pxb_dev_info);
     type_register_static(&pxb_pcie_dev_info);
     type_register_static(&pxb_cxl_dev_info);
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 362cda40de..6bc344f205 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -17,5 +17,7 @@
 #define COMPONENT_REG_BAR_IDX 0
 #define DEVICE_REG_BAR_IDX 2
 
+#define CXL_HOST_BASE 0xD0000000
+
 #endif
 
-- 
2.29.2




* [RFC PATCH 15/25] acpi/pxb/cxl: Reserve host bridge MMIO
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (13 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 14/25] hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142) Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-16 16:54   ` Jonathan Cameron
  2020-11-11  5:47 ` [RFC PATCH 16/25] hw/pxb/cxl: Add "windows" for host bridges Ben Widawsky
                   ` (11 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, Eduardo Habkost, Michael S. Tsirkin, Vishal Verma,
	Paolo Bonzini, Igor Mammedov, Dan Williams, Richard Henderson

For all CXL host bridges, reserve the MMIO space with _CRS. The MMIO for
a host bridge lives at a hard-coded location in the system's physical
address space. _CRS is the standard mechanism for telling the OS about
regions that are taken by host bridges and so can't be used for anything
else.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
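As a rough illustration of the two uid-derived values in this patch, the
sketch below computes the inclusive range handed to crs_range_insert()
and formats the device name the same way as "CXL%.01X". The demo_* names
are hypothetical. The length check reflects the ACPI rule that a name
segment holds at most four characters, so with this format only uid 0
through 15 produce a usable name; that limit is my reading, not
something the patch enforces.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct demo_range {
    uint64_t base;
    uint64_t limit; /* inclusive */
};

/* Same arithmetic as the hunk: one 64 KiB block per uid. */
static struct demo_range demo_cxl_crs_range(uint32_t uid)
{
    struct demo_range r;

    r.base = 0xD0000000ULL + (uint64_t)uid * 0x10000ULL;
    r.limit = r.base + 0x10000ULL - 1;
    return r;
}

/*
 * Format the device name as "CXL" plus the uid in hex. Returns false
 * when the result would exceed the four characters of an ACPI name
 * segment, i.e. for any uid above 0xF.
 */
static bool demo_cxl_acpi_name(uint32_t uid, char out[5])
{
    char buf[16];
    int n = snprintf(buf, sizeof(buf), "CXL%.01X", (unsigned int)uid);

    if (n != 4) {
        return false;
    }
    memcpy(out, buf, 5); /* four characters plus the terminating NUL */
    return true;
}
```
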
 hw/i386/acpi-build.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index aaed7da7dc..fae4fa28e1 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -28,6 +28,7 @@
 #include "qemu/bitmap.h"
 #include "qemu/error-report.h"
 #include "hw/pci/pci.h"
+#include "hw/cxl/cxl.h"
 #include "hw/core/cpu.h"
 #include "target/i386/cpu.h"
 #include "hw/misc/pvpanic.h"
@@ -1486,7 +1487,7 @@ static void build_smb0(Aml *table, I2CBus *smbus, int devnr, int func)
     aml_append(table, scope);
 }
 
-enum { PCI, PCIE };
+enum { PCI, PCIE, CXL };
 static void init_pci_acpi(Aml *dev, int uid, int type)
 {
     if (type == PCI) {
@@ -1635,20 +1636,28 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
             uint8_t bus_num = pci_bus_num(bus);
             uint8_t numa_node = pci_bus_numa_node(bus);
             int32_t uid = pci_bus_uid(bus);
+            int type;
 
             /* look only for expander root buses */
             if (!pci_bus_is_root(bus)) {
                 continue;
             }
 
+            type = pci_bus_is_cxl(bus) ? CXL :
+                                         pci_bus_is_express(bus) ? PCIE : PCI;
+
             if (bus_num < root_bus_limit) {
                 root_bus_limit = bus_num - 1;
             }
 
             scope = aml_scope("\\_SB");
-            dev = aml_device("PC%.02X", bus_num);
+            if (type == CXL) {
+                dev = aml_device("CXL%.01X", pci_bus_uid(bus));
+            } else {
+                dev = aml_device("PC%.02X", bus_num);
+            }
             aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
-            init_pci_acpi(dev, uid, pci_bus_is_express(bus) ? PCIE : PCI);
+            init_pci_acpi(dev, uid, type);
 
             if (numa_node != NUMA_NODE_UNASSIGNED) {
                 aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
@@ -1659,6 +1668,13 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
             aml_append(dev, aml_name_decl("_CRS", crs));
             aml_append(scope, dev);
             aml_append(dsdt, scope);
+
+            /* Handle the ranges for the PXB expanders */
+            if (type == CXL) {
+                uint64_t base = CXL_HOST_BASE + uid * 0x10000;
+                crs_range_insert(crs_range_set.mem_ranges, base,
+                                 base + 0x10000 - 1);
+            }
         }
     }
 
-- 
2.29.2




* [RFC PATCH 16/25] hw/pxb/cxl: Add "windows" for host bridges
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (14 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 15/25] acpi/pxb/cxl: Reserve host bridge MMIO Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-13  0:49   ` Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 17/25] hw/cxl/rp: Add a root port Ben Widawsky
                   ` (10 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky, Michael S. Tsirkin

In a bare-metal CXL-capable system, system firmware will program
physical address ranges on the host. This is done by programming
internal registers that aren't typically known to the OS. These address
ranges might be contiguous or interleaved across host bridges.

For a QEMU guest, a new construct is introduced that allows a memory
backend to be passed to the host bridge for the same purpose. Each
memory backend needs to be passed to the host bridge as well as to any
device that will be emulating that memory (not implemented here).

I'm hopeful the interleaving work in the link can be re-purposed here
(see Link).

An example that creates a host bridge with a 512M window at 0x4c0000000:
 -object memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M
 -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,len-window-base=1,window-base\[0\]=0x4c0000000,memdev\[0\]=cxl-mem1

Link: https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg03680.html
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
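The realize-time consistency rule (every memdev[N] link needs a matching
window-base[N] entry) can be sketched in isolation as follows. This is a
simplified model with hypothetical demo_* names: backends are opaque
pointers, and the three-way result replaces the warn/error reporting the
patch does.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_WINDOW_MAX 10 /* same value as CXL_WINDOW_MAX */

enum demo_window_status {
    DEMO_WINDOWS_OK,       /* at least one window, counts agree */
    DEMO_WINDOWS_NONE,     /* no windows at all: worth a warning */
    DEMO_WINDOWS_MISMATCH, /* backends and bases disagree: an error */
};

/* Count the populated backend slots; empty slots are NULL. */
static uint32_t demo_count_windows(const void *const *backends)
{
    uint32_t count = 0;

    for (size_t i = 0; i < DEMO_WINDOW_MAX; i++) {
        if (backends[i] != NULL) {
            count++;
        }
    }
    return count;
}

/* Compare the number of backends against the number of window bases. */
static enum demo_window_status
demo_check_windows(const void *const *backends, uint32_t num_bases)
{
    uint32_t count = demo_count_windows(backends);

    if (count != num_bases) {
        return DEMO_WINDOWS_MISMATCH;
    }
    if (count == 0) {
        return DEMO_WINDOWS_NONE;
    }
    return DEMO_WINDOWS_OK;
}
```

Note that counting alone does not prove slot N of one array pairs with
slot N of the other; a sparse memdev[] list would pass this check.
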
 hw/pci-bridge/pci_expander_bridge.c | 65 +++++++++++++++++++++++++++--
 include/hw/cxl/cxl.h                |  1 +
 2 files changed, 62 insertions(+), 4 deletions(-)

diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index eca5c71d45..75910f5870 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -69,12 +69,19 @@ struct PXBDev {
     uint8_t bus_nr;
     uint16_t numa_node;
     int32_t uid;
+    struct cxl_dev {
+        HostMemoryBackend *memory_window[CXL_WINDOW_MAX];
+
+        uint32_t num_windows;
+        hwaddr *window_base[CXL_WINDOW_MAX];
+    } cxl;
 };
 
 typedef struct CXLHost {
     PCIHostState parent_obj;
 
     CXLComponentState cxl_cstate;
+    PXBDev *dev;
 } CXLHost;
 
 static PXBDev *convert_to_pxb(PCIDevice *dev)
@@ -213,16 +220,31 @@ static void pxb_cxl_realize(DeviceState *dev, Error **errp)
     SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
     PCIHostState *phb = PCI_HOST_BRIDGE(dev);
     CXLHost *cxl = PXB_CXL_HOST(dev);
+    struct cxl_dev *cxl_dev = &cxl->dev->cxl;
     CXLComponentState *cxl_cstate = &cxl->cxl_cstate;
     struct MemoryRegion *mr = &cxl_cstate->crb.component_registers;
+    int uid = pci_bus_uid(phb->bus);
 
     cxl_component_register_block_init(OBJECT(dev), cxl_cstate,
                                       TYPE_PXB_CXL_HOST);
     sysbus_init_mmio(sbd, mr);
 
-    /* FIXME: support multiple host bridges. */
-    sysbus_mmio_map(sbd, 0, CXL_HOST_BASE +
-                            memory_region_size(mr) * pci_bus_uid(phb->bus));
+    sysbus_mmio_map(sbd, 0, CXL_HOST_BASE + memory_region_size(mr) * uid);
+
+    /*
+     * A CXL host bridge can exist without a fixed memory window, but it would
+     * only operate in legacy PCIe mode.
+     */
+    if (!cxl_dev->memory_window[uid]) {
+        warn_report(
+            "CXL expander bridge created without window. Consider using %s",
+            "memdev[0]=<memory_backend>");
+        return;
+    }
+
+    mr = host_memory_backend_get_memory(cxl_dev->memory_window[uid]);
+    sysbus_init_mmio(sbd, mr);
+    sysbus_mmio_map(sbd, 1 + uid, *cxl_dev->window_base[uid]);
 }
 
 static void pxb_cxl_host_class_init(ObjectClass *class, void *data)
@@ -328,6 +350,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
     } else if (type == CXL) {
         bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_CXL_BUS);
         bus->flags |= PCI_BUS_CXL;
+        PXB_CXL_HOST(ds)->dev = PXB_CXL_DEV(dev);
     } else {
         bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
         bds = qdev_new("pci-bridge");
@@ -389,6 +412,8 @@ static Property pxb_dev_properties[] = {
     DEFINE_PROP_UINT8("bus_nr", PXBDev, bus_nr, 0),
     DEFINE_PROP_UINT16("numa_node", PXBDev, numa_node, NUMA_NODE_UNASSIGNED),
     DEFINE_PROP_INT32("uid", PXBDev, uid, -1),
+    DEFINE_PROP_ARRAY("window-base", PXBDev, cxl.num_windows, cxl.window_base,
+                      qdev_prop_uint64, hwaddr),
     DEFINE_PROP_END_OF_LIST(),
 };
 
@@ -460,7 +485,9 @@ static const TypeInfo pxb_pcie_dev_info = {
 
 static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
 {
-    PXBDev *pxb = convert_to_pxb(dev);
+    PXBDev *pxb = PXB_CXL_DEV(dev);
+    struct cxl_dev *cxl = &pxb->cxl;
+    int count = 0;
 
     /* A CXL PXB's parent bus is still PCIe */
     if (!pci_bus_is_express(pci_get_bus(dev))) {
@@ -476,6 +503,23 @@ static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
     /* FIXME: Check that uid doesn't collide with UIDs of other host bridges */
 
     pxb_dev_realize_common(dev, CXL, errp);
+
+    for (unsigned i = 0; i < CXL_WINDOW_MAX; i++) {
+        if (!cxl->memory_window[i]) {
+            continue;
+        }
+
+        count++;
+    }
+
+    if (!count) {
+        warn_report("memory-windows should be set when creating CXL host bridges");
+    }
+
+    if (count != cxl->num_windows) {
+        error_setg(errp, "window bases count (%d) must match window count (%d)",
+                   cxl->num_windows, count);
+    }
 }
 
 static void pxb_cxl_dev_class_init(ObjectClass *klass, void *data)
@@ -496,6 +540,19 @@ static void pxb_cxl_dev_class_init(ObjectClass *klass, void *data)
 
     /* Host bridges aren't hotpluggable. FIXME: spec reference */
     dc->hotpluggable = false;
+
+    /*
+     * Below is moral equivalent of:
+     *   DEFINE_PROP_ARRAY("memdev", PXBDev, window_count, windows,
+     *                     qdev_prop_memory_device, HostMemoryBackend)
+     */
+    for (unsigned i = 0; i < CXL_WINDOW_MAX; i++) {
+        g_autofree char *name = g_strdup_printf("memdev[%u]", i);
+        object_class_property_add_link(klass, name, TYPE_MEMORY_BACKEND,
+                offsetof(PXBDev, cxl.memory_window[i]),
+                qdev_prop_allow_set_link_before_realize,
+                OBJ_PROP_LINK_STRONG);
+    }
 }
 
 static const TypeInfo pxb_cxl_dev_info = {
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 6bc344f205..b1e5f4a8fa 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -18,6 +18,7 @@
 #define DEVICE_REG_BAR_IDX 2
 
 #define CXL_HOST_BASE 0xD0000000
+#define CXL_WINDOW_MAX 10
 
 #endif
 
-- 
2.29.2




* [RFC PATCH 17/25] hw/cxl/rp: Add a root port
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (15 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 16/25] hw/pxb/cxl: Add "windows" for host bridges Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 18/25] hw/cxl/device: Add a memory device (8.2.8.5) Ben Widawsky
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Ben Widawsky, Michael S. Tsirkin

This adds just enough of a root port implementation to be able to
enumerate root ports (creating the required DVSEC entries). Not here yet
are the MMIO and the ability to write some of the DVSEC entries.

A root port can be added on the QEMU command line by attaching it to a
specific CXL host bridge. For example:
  -device cxl-rp,id=rp0,bus="cxl.0",addr=0.0,chassis=4

Like the host bridge patch, the ACPI tables aren't generated at this
point and so system software cannot use it.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
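For context on what "creating the required DVSEC entries" involves, here
is a hedged sketch of the header that precedes every DVSEC body in config
space, following the PCIe DVSEC layout as I understand it. The demo_*
names are hypothetical, and this is not how cxl_component_create_dvsec()
is implemented; the capability ID 0x0023 and the CXL vendor ID 0x1E98 are
the only constants taken as given.

```c
#include <stdint.h>

#define DEMO_EXT_CAP_ID_DVSEC 0x0023u /* PCIe extended capability ID */
#define DEMO_EXT_CAP_VERSION  0x1u    /* capability version for DVSEC */
#define DEMO_CXL_VENDOR_ID    0x1E98u /* vendor ID used in CXL DVSECs */

struct demo_dvsec_hdr {
    uint32_t cap_hdr; /* offset 0x0: cap ID, version, next pointer */
    uint32_t hdr1;    /* offset 0x4: vendor ID, revision, length */
    uint16_t hdr2;    /* offset 0x8: DVSEC ID */
};

/*
 * Pack the three header fields. next and length are 12-bit values and
 * rev is a 4-bit value; extra bits are masked off rather than rejected.
 */
static struct demo_dvsec_hdr demo_dvsec_header(uint16_t next, uint8_t rev,
                                               uint16_t length,
                                               uint16_t dvsec_id)
{
    struct demo_dvsec_hdr h;

    h.cap_hdr = DEMO_EXT_CAP_ID_DVSEC |
                (DEMO_EXT_CAP_VERSION << 16) |
                ((uint32_t)(next & 0xFFF) << 20);
    h.hdr1 = DEMO_CXL_VENDOR_ID |
             ((uint32_t)(rev & 0xF) << 16) |
             ((uint32_t)(length & 0xFFF) << 20);
    h.hdr2 = dvsec_id;
    return h;
}
```

An OS walking the extended capability list matches on the vendor ID and
DVSEC ID pair, which is why each entry in build_dvsecs() carries its own
ID, revision and length.
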
 hw/pci-bridge/Kconfig          |   5 +
 hw/pci-bridge/cxl_root_port.c  | 231 +++++++++++++++++++++++++++++++++
 hw/pci-bridge/meson.build      |   1 +
 hw/pci-bridge/pcie_root_port.c |   6 +-
 hw/pci/pci.c                   |   4 +-
 5 files changed, 245 insertions(+), 2 deletions(-)
 create mode 100644 hw/pci-bridge/cxl_root_port.c

diff --git a/hw/pci-bridge/Kconfig b/hw/pci-bridge/Kconfig
index a51ec716f5..a821b531da 100644
--- a/hw/pci-bridge/Kconfig
+++ b/hw/pci-bridge/Kconfig
@@ -27,3 +27,8 @@ config DEC_PCI
 
 config SIMBA
     bool
+
+config CXL
+    bool
+    default y if PCI_EXPRESS && PXB
+    depends on PCI_EXPRESS && MSI_NONBROKEN && PXB
diff --git a/hw/pci-bridge/cxl_root_port.c b/hw/pci-bridge/cxl_root_port.c
new file mode 100644
index 0000000000..e3a19dee6d
--- /dev/null
+++ b/hw/pci-bridge/cxl_root_port.c
@@ -0,0 +1,231 @@
+/*
+ * CXL 2.0 Root Port Implementation
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qemu/range.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/pci/pcie_port.h"
+#include "hw/qdev-properties.h"
+#include "hw/sysbus.h"
+#include "qapi/error.h"
+#include "hw/cxl/cxl.h"
+
+#define CXL_ROOT_PORT_DID 0x7075
+
+/* Copied from the gen root port which we derive */
+#define GEN_PCIE_ROOT_PORT_AER_OFFSET 0x100
+#define GEN_PCIE_ROOT_PORT_ACS_OFFSET \
+    (GEN_PCIE_ROOT_PORT_AER_OFFSET + PCI_ERR_SIZEOF)
+#define CXL_ROOT_PORT_DVSEC_OFFSET \
+    (GEN_PCIE_ROOT_PORT_ACS_OFFSET + PCI_ACS_SIZEOF)
+
+typedef struct CXLRootPort {
+    /*< private >*/
+    PCIESlot parent_obj;
+
+    CXLComponentState cxl_cstate;
+    PCIResReserve res_reserve;
+} CXLRootPort;
+
+#define TYPE_CXL_ROOT_PORT "cxl-rp"
+DECLARE_INSTANCE_CHECKER(CXLRootPort, CXL_ROOT_PORT, TYPE_CXL_ROOT_PORT)
+
+static void latch_registers(CXLRootPort *crp)
+{
+    uint32_t *reg_state = crp->cxl_cstate.crb.cache_mem_registers;
+
+    cxl_component_register_init_common(reg_state, CXL2_ROOT_PORT);
+}
+
+static void build_dvsecs(CXLComponentState *cxl)
+{
+    uint8_t *dvsec;
+
+    dvsec = (uint8_t *)&(struct dvsec_port){ 0 };
+    cxl_component_create_dvsec(cxl, EXTENSIONS_PORT_DVSEC_LENGTH,
+                               EXTENSIONS_PORT_DVSEC,
+                               EXTENSIONS_PORT_DVSEC_REVID, dvsec);
+
+    dvsec = (uint8_t *)&(struct dvsec_port_gpf){
+        .rsvd        = 0,
+        .phase1_ctrl = 1, /* 1μs timeout */
+        .phase2_ctrl = 1, /* 1μs timeout */
+    };
+    cxl_component_create_dvsec(cxl, GPF_PORT_DVSEC_LENGTH, GPF_PORT_DVSEC,
+                               GPF_PORT_DVSEC_REVID, dvsec);
+
+    dvsec = (uint8_t *)&(struct dvsec_port_flexbus){
+        .cap              = 0x26, /* IO, Mem, non-MLD */
+        .ctrl             = 0,
+        .status           = 0x26, /* same */
+        .rcvd_mod_ts_data = 0xef, /* WTF? */
+    };
+    cxl_component_create_dvsec(cxl, PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0,
+                               PCIE_FLEXBUS_PORT_DVSEC,
+                               PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0, dvsec);
+
+    dvsec = (uint8_t *)&(struct dvsec_register_locator){
+        .rsvd         = 0,
+        .reg0_base_lo = RBI_COMPONENT_REG | COMPONENT_REG_BAR_IDX,
+        .reg0_base_hi = 0,
+    };
+    cxl_component_create_dvsec(cxl, REG_LOC_DVSEC_LENGTH, REG_LOC_DVSEC,
+                               REG_LOC_DVSEC_REVID, dvsec);
+}
+
+static void cxl_rp_realize(DeviceState *dev, Error **errp)
+{
+    PCIDevice *pci_dev     = PCI_DEVICE(dev);
+    PCIERootPortClass *rpc = PCIE_ROOT_PORT_GET_CLASS(dev);
+    CXLRootPort *crp       = CXL_ROOT_PORT(dev);
+    CXLComponentState *cxl_cstate = &crp->cxl_cstate;
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+    MemoryRegion *component_bar = &cregs->component_registers;
+    Error *local_err = NULL;
+
+    rpc->parent_realize(dev, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    int rc =
+        pci_bridge_qemu_reserve_cap_init(pci_dev, 0, crp->res_reserve, errp);
+    if (rc < 0) {
+        rpc->parent_class.exit(pci_dev);
+        return;
+    }
+
+    if (!crp->res_reserve.io || crp->res_reserve.io == -1) {
+        pci_word_test_and_clear_mask(pci_dev->wmask + PCI_COMMAND,
+                                     PCI_COMMAND_IO);
+        pci_dev->wmask[PCI_IO_BASE]  = 0;
+        pci_dev->wmask[PCI_IO_LIMIT] = 0;
+    }
+
+    cxl_cstate->dvsec_offset = CXL_ROOT_PORT_DVSEC_OFFSET;
+    cxl_cstate->pdev = pci_dev;
+    build_dvsecs(&crp->cxl_cstate);
+
+    cxl_component_register_block_init(OBJECT(pci_dev), cxl_cstate,
+                                      TYPE_CXL_ROOT_PORT);
+
+    pci_register_bar(pci_dev, COMPONENT_REG_BAR_IDX,
+                     PCI_BASE_ADDRESS_SPACE_MEMORY |
+                         PCI_BASE_ADDRESS_MEM_TYPE_64,
+                     component_bar);
+}
+
+static void cxl_rp_reset(DeviceState *dev)
+{
+    PCIERootPortClass *rpc = PCIE_ROOT_PORT_GET_CLASS(dev);
+    CXLRootPort *crp = CXL_ROOT_PORT(dev);
+
+    rpc->parent_reset(dev);
+
+    latch_registers(crp);
+}
+
+static Property gen_rp_props[] = {
+    DEFINE_PROP_UINT32("bus-reserve", CXLRootPort, res_reserve.bus, -1),
+    DEFINE_PROP_SIZE("io-reserve", CXLRootPort, res_reserve.io, -1),
+    DEFINE_PROP_SIZE("mem-reserve", CXLRootPort, res_reserve.mem_non_pref, -1),
+    DEFINE_PROP_SIZE("pref32-reserve", CXLRootPort, res_reserve.mem_pref_32,
+                     -1),
+    DEFINE_PROP_SIZE("pref64-reserve", CXLRootPort, res_reserve.mem_pref_64,
+                     -1),
+    DEFINE_PROP_END_OF_LIST()
+};
+
+static void cxl_rp_dvsec_write_config(PCIDevice *dev, uint32_t addr,
+                                      uint32_t val, int len)
+{
+    CXLRootPort *crp = CXL_ROOT_PORT(dev);
+
+    if (range_contains(&crp->cxl_cstate.dvsecs[EXTENSIONS_PORT_DVSEC], addr)) {
+        uint8_t *reg = &dev->config[addr];
+        addr -= crp->cxl_cstate.dvsecs[EXTENSIONS_PORT_DVSEC].lob;
+        if (addr == PORT_CONTROL_OVERRIDE_OFFSET) {
+            if (pci_get_word(reg) & PORT_CONTROL_UNMASK_SBR) {
+                /* unmask SBR */
+            }
+            if (pci_get_word(reg) & PORT_CONTROL_ALT_MEMID_EN) {
+                /* Alt Memory & ID Space Enable */
+            }
+        }
+    }
+}
+
+static void cxl_rp_write_config(PCIDevice *d, uint32_t address, uint32_t val,
+                                int len)
+{
+    uint16_t slt_ctl, slt_sta;
+
+    pcie_cap_slot_get(d, &slt_ctl, &slt_sta);
+    pci_bridge_write_config(d, address, val, len);
+    pcie_cap_flr_write_config(d, address, val, len);
+    pcie_cap_slot_write_config(d, slt_ctl, slt_sta, address, val, len);
+    pcie_aer_write_config(d, address, val, len);
+
+    cxl_rp_dvsec_write_config(d, address, val, len);
+}
+
+static void cxl_root_port_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc        = DEVICE_CLASS(oc);
+    PCIDeviceClass *k      = PCI_DEVICE_CLASS(oc);
+    PCIERootPortClass *rpc = PCIE_ROOT_PORT_CLASS(oc);
+
+    k->vendor_id = PCI_VENDOR_ID_INTEL;
+    k->device_id = CXL_ROOT_PORT_DID;
+    dc->desc     = "CXL Root Port";
+    k->revision  = 0;
+    device_class_set_props(dc, gen_rp_props);
+    k->config_write = cxl_rp_write_config;
+
+    device_class_set_parent_realize(dc, cxl_rp_realize, &rpc->parent_realize);
+    device_class_set_parent_reset(dc, cxl_rp_reset, &rpc->parent_reset);
+
+    rpc->aer_offset = GEN_PCIE_ROOT_PORT_AER_OFFSET;
+    rpc->acs_offset = GEN_PCIE_ROOT_PORT_ACS_OFFSET;
+
+    /*
+     * Explain
+     */
+    dc->hotpluggable = false;
+}
+
+static const TypeInfo cxl_root_port_info = {
+    .name = TYPE_CXL_ROOT_PORT,
+    .parent = TYPE_PCIE_ROOT_PORT,
+    .instance_size = sizeof(CXLRootPort),
+    .class_init = cxl_root_port_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { INTERFACE_CXL_DEVICE },
+        { }
+    },
+};
+
+static void cxl_register(void)
+{
+    type_register_static(&cxl_root_port_info);
+}
+
+type_init(cxl_register);
diff --git a/hw/pci-bridge/meson.build b/hw/pci-bridge/meson.build
index daab8acf2a..b6d26a03d5 100644
--- a/hw/pci-bridge/meson.build
+++ b/hw/pci-bridge/meson.build
@@ -5,6 +5,7 @@ pci_ss.add(when: 'CONFIG_IOH3420', if_true: files('ioh3420.c'))
 pci_ss.add(when: 'CONFIG_PCIE_PORT', if_true: files('pcie_root_port.c', 'gen_pcie_root_port.c', 'pcie_pci_bridge.c'))
 pci_ss.add(when: 'CONFIG_PXB', if_true: files('pci_expander_bridge.c'))
 pci_ss.add(when: 'CONFIG_XIO3130', if_true: files('xio3130_upstream.c', 'xio3130_downstream.c'))
+pci_ss.add(when: 'CONFIG_CXL', if_true: files('cxl_root_port.c'))
 
 # NewWorld PowerMac
 pci_ss.add(when: 'CONFIG_DEC_PCI', if_true: files('dec.c'))
diff --git a/hw/pci-bridge/pcie_root_port.c b/hw/pci-bridge/pcie_root_port.c
index f1cfe9d14a..460e48269d 100644
--- a/hw/pci-bridge/pcie_root_port.c
+++ b/hw/pci-bridge/pcie_root_port.c
@@ -67,7 +67,11 @@ static void rp_realize(PCIDevice *d, Error **errp)
     int rc;
 
     pci_config_set_interrupt_pin(d->config, 1);
-    pci_bridge_initfn(d, TYPE_PCIE_BUS);
+    if (d->cap_present & QEMU_PCIE_CAP_CXL) {
+        pci_bridge_initfn(d, TYPE_CXL_BUS);
+    } else {
+        pci_bridge_initfn(d, TYPE_PCIE_BUS);
+    }
     pcie_port_init_reg(d);
 
     rc = pci_bridge_ssvid_init(d, rpc->ssvid_offset, dc->vendor_id,
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index f728975d32..3765e33581 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2690,7 +2690,9 @@ static void pci_device_class_base_init(ObjectClass *klass, void *data)
             object_class_dynamic_cast(klass, INTERFACE_CONVENTIONAL_PCI_DEVICE);
         ObjectClass *pcie =
             object_class_dynamic_cast(klass, INTERFACE_PCIE_DEVICE);
-        assert(conventional || pcie);
+        ObjectClass *cxl =
+            object_class_dynamic_cast(klass, INTERFACE_CXL_DEVICE);
+        assert(conventional || pcie || cxl);
     }
 }
 
-- 
2.29.2




* [RFC PATCH 18/25] hw/cxl/device: Add a memory device (8.2.8.5)
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (16 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 17/25] hw/cxl/rp: Add a root port Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-12 18:37   ` Eric Blake
  2020-11-11  5:47 ` [RFC PATCH 19/25] hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12) Ben Widawsky
                   ` (8 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, Eduardo Habkost, Michael S. Tsirkin, Vishal Verma,
	Dr. David Alan Gilbert, Markus Armbruster, Igor Mammedov,
	Paolo Bonzini, Dan Williams, Richard Henderson

A CXL memory device (AKA Type 3) is a CXL component that contains some
combination of volatile and persistent memory. It also implements the
previously defined mailbox interface as well as the memory device
firmware interface.

The following example will create a 256M device in a 512M window:

-object "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M"
-device "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"
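
As a rough illustration of the sizing in the example above (a 256M device
carved out of a 512M backing window), the fit check can be sketched in
plain C. The helper name cxl_window_fits and the MiB macro are
hypothetical and not part of this patch; the real code walks the
MemoryRegion to find the first free offset before comparing sizes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/*
 * Hypothetical helper (not in the patch): returns true if a device of
 * dev_size bytes still fits in a window of window_size bytes when `used`
 * bytes at the start of the window are already claimed.
 */
static bool cxl_window_fits(uint64_t window_size, uint64_t used,
                            uint64_t dev_size)
{
    if (used > window_size) {
        return false;
    }
    /* Compare against the remainder to avoid overflow in used + dev_size */
    return dev_size <= window_size - used;
}
```

With these assumptions, a 512M window has room for two 256M devices, and
a third request would be rejected.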

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/core/numa.c           |   3 +
 hw/i386/pc.c             |   1 +
 hw/mem/Kconfig           |   5 +
 hw/mem/cxl_type3.c       | 262 +++++++++++++++++++++++++++++++++++++++
 hw/mem/meson.build       |   1 +
 hw/pci/pcie.c            |  30 +++++
 include/hw/cxl/cxl.h     |   2 +
 include/hw/cxl/cxl_pci.h |  22 ++++
 include/hw/pci/pci_ids.h |   1 +
 monitor/hmp-cmds.c       |  15 +++
 qapi/machine.json        |   1 +
 11 files changed, 343 insertions(+)
 create mode 100644 hw/mem/cxl_type3.c

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 7c4dd4e68e..3ddeb23036 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -770,6 +770,9 @@ static void numa_stat_memory_devices(NumaNodeMem node_mem[])
                 node_mem[pcdimm_info->node].node_plugged_mem +=
                     pcdimm_info->size;
                 break;
+            case MEMORY_DEVICE_INFO_KIND_CXL:
+                /* FINISHME */
+                break;
             case MEMORY_DEVICE_INFO_KIND_VIRTIO_PMEM:
                 vpi = value->u.virtio_pmem.data;
                 /* TODO: once we support numa, assign to right node */
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 5e6c0023e0..ecfc497f71 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -79,6 +79,7 @@
 #include "acpi-build.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/cxl/cxl.h"
 #include "qapi/error.h"
 #include "qapi/qapi-visit-common.h"
 #include "qapi/visitor.h"
diff --git a/hw/mem/Kconfig b/hw/mem/Kconfig
index a0ef2cf648..7d9d1ced3e 100644
--- a/hw/mem/Kconfig
+++ b/hw/mem/Kconfig
@@ -10,3 +10,8 @@ config NVDIMM
     default y
     depends on (PC || PSERIES || ARM_VIRT)
     select MEM_DEVICE
+
+config CXL_MEM_DEVICE
+    bool
+    default y if CXL
+    select MEM_DEVICE
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
new file mode 100644
index 0000000000..48c25922f3
--- /dev/null
+++ b/hw/mem/cxl_type3.c
@@ -0,0 +1,262 @@
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qemu/error-report.h"
+#include "hw/mem/memory-device.h"
+#include "hw/mem/pc-dimm.h"
+#include "hw/pci/pci.h"
+#include "hw/qdev-properties.h"
+#include "qapi/error.h"
+#include "qemu/log.h"
+#include "qemu/module.h"
+#include "qemu/range.h"
+#include "qemu/rcu.h"
+#include "sysemu/hostmem.h"
+#include "hw/cxl/cxl.h"
+
+typedef struct cxl_type3_dev {
+    /* Private */
+    PCIDevice parent_obj;
+
+    /* Properties */
+    uint64_t size;
+    HostMemoryBackend *hostmem;
+
+    /* State */
+    CXLComponentState cxl_cstate;
+    CXLDeviceState cxl_dstate;
+} CXLType3Dev;
+
+#define CT3(obj) OBJECT_CHECK(CXLType3Dev, (obj), TYPE_CXL_TYPE3_DEV)
+
+static void build_dvsecs(CXLType3Dev *ct3d)
+{
+    CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
+    uint8_t *dvsec;
+
+    dvsec = (uint8_t *)&(struct dvsec_device){
+        .cap = 0x1e,
+        .ctrl = 0x6,
+        .status2 = 0x2,
+        .range1_size_hi = 0,
+        .range1_size_lo = (2 << 5) | (2 << 2) | 0x3 | ct3d->size,
+        .range1_base_hi = 0,
+        .range1_base_lo = 0,
+    };
+    cxl_component_create_dvsec(cxl_cstate, PCIE_CXL_DEVICE_DVSEC_LENGTH,
+                               PCIE_CXL_DEVICE_DVSEC,
+                               PCIE_CXL_DEVICE_DVSEC_REVID, dvsec);
+
+    dvsec = (uint8_t *)&(struct dvsec_register_locator){
+        .rsvd         = 0,
+        .reg0_base_lo = RBI_COMPONENT_REG | COMPONENT_REG_BAR_IDX,
+        .reg0_base_hi = 0,
+        .reg1_base_lo = RBI_CXL_DEVICE_REG | DEVICE_REG_BAR_IDX,
+        .reg1_base_hi = 0,
+    };
+    cxl_component_create_dvsec(cxl_cstate, REG_LOC_DVSEC_LENGTH, REG_LOC_DVSEC,
+                               REG_LOC_DVSEC_REVID, dvsec);
+}
+
+static void ct3_instance_init(Object *obj)
+{
+    /* MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(obj); */
+}
+
+static void ct3_finalize(Object *obj)
+{
+    CXLType3Dev *ct3d = CT3(obj);
+
+    g_free(ct3d->cxl_dstate.pmem);
+}
+
+static void cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
+{
+    MemoryRegionSection mrs;
+    MemoryRegion *mr;
+    uint64_t offset = 0;
+    size_t remaining_size;
+
+    if (!ct3d->hostmem) {
+        error_setg(errp, "memdev property must be set");
+        return;
+    }
+
+    /* FIXME: need to check mr is the host bridge's MR */
+    mr = host_memory_backend_get_memory(ct3d->hostmem);
+
+    /* Create our new subregion */
+    ct3d->cxl_dstate.pmem = g_new(MemoryRegion, 1);
+
+    /* Find the first free space in the window */
+    WITH_RCU_READ_LOCK_GUARD()
+    {
+        mrs = memory_region_find(mr, offset, 1);
+        while (mrs.mr && mrs.mr != mr) {
+            offset += memory_region_size(mrs.mr);
+            mrs = memory_region_find(mr, offset, 1);
+        }
+    }
+
+    remaining_size = memory_region_size(mr) - offset;
+    if (remaining_size < ct3d->size) {
+        g_free(ct3d->cxl_dstate.pmem);
+        error_setg(errp,
+                   "Not enough free space (%zd) required for device (%" PRId64  ")",
+                   remaining_size, ct3d->size);
+    }
+
+    /* Register our subregion as non-volatile */
+    memory_region_init_ram(ct3d->cxl_dstate.pmem, OBJECT(ct3d),
+                           "cxl_type3-memory", ct3d->size, errp);
+    memory_region_set_nonvolatile(ct3d->cxl_dstate.pmem, true);
+
+#ifdef SET_PMEM_PADDR
+    memory_region_add_subregion(mr, offset, ct3d->cxl_dstate.pmem);
+#endif
+}
+
+static MemoryRegion *cxl_md_get_memory_region(MemoryDeviceState *md,
+                                              Error **errp)
+{
+    CXLType3Dev *ct3d = CT3(md);
+
+    if (!ct3d->cxl_dstate.pmem) {
+        cxl_setup_memory(ct3d, errp);
+    }
+
+    return ct3d->cxl_dstate.pmem;
+}
+
+static void ct3_realize(PCIDevice *pci_dev, Error **errp)
+{
+    CXLType3Dev *ct3d = CT3(pci_dev);
+    CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
+    ComponentRegisters *regs = &cxl_cstate->crb;
+    MemoryRegion *mr = &regs->component_registers;
+    uint8_t *pci_conf = pci_dev->config;
+
+    if (!ct3d->cxl_dstate.pmem) {
+        cxl_setup_memory(ct3d, errp);
+    }
+
+    pci_config_set_prog_interface(pci_conf, 0x10);
+    pci_config_set_class(pci_conf, PCI_CLASS_MEMORY_CXL);
+
+    pcie_endpoint_cap_init(pci_dev, 0x80);
+    cxl_cstate->dvsec_offset = 0x100;
+
+    ct3d->cxl_cstate.pdev = pci_dev;
+    build_dvsecs(ct3d);
+
+    cxl_component_register_block_init(OBJECT(pci_dev), cxl_cstate,
+                                      TYPE_CXL_TYPE3_DEV);
+
+    pci_register_bar(
+        pci_dev, COMPONENT_REG_BAR_IDX,
+        PCI_BASE_ADDRESS_SPACE_MEMORY | PCI_BASE_ADDRESS_MEM_TYPE_64, mr);
+
+    cxl_device_register_block_init(OBJECT(pci_dev), &ct3d->cxl_dstate);
+    pci_register_bar(pci_dev, DEVICE_REG_BAR_IDX,
+                     PCI_BASE_ADDRESS_SPACE_MEMORY |
+                         PCI_BASE_ADDRESS_MEM_TYPE_64,
+                     &ct3d->cxl_dstate.device_registers);
+}
+
+static uint64_t cxl_md_get_addr(const MemoryDeviceState *md)
+{
+    CXLType3Dev *ct3d = CT3(md);
+
+    return memory_region_get_ram_addr(ct3d->cxl_dstate.pmem);
+}
+
+static void cxl_md_set_addr(MemoryDeviceState *md, uint64_t addr, Error **errp)
+{
+    object_property_set_uint(OBJECT(md), "paddr", addr, errp);
+}
+
+static void ct3d_reset(DeviceState *dev)
+{
+    CXLType3Dev *ct3d = CT3(dev);
+    uint32_t *reg_state = ct3d->cxl_cstate.crb.cache_mem_registers;
+
+    cxl_component_register_init_common(reg_state, CXL2_TYPE3_DEVICE);
+    cxl_device_register_init_common(&ct3d->cxl_dstate);
+}
+
+static Property ct3_props[] = {
+    DEFINE_PROP_SIZE("size", CXLType3Dev, size, -1),
+    DEFINE_PROP_LINK("memdev", CXLType3Dev, hostmem, TYPE_MEMORY_BACKEND,
+                     HostMemoryBackend *),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void pc_dimm_md_fill_device_info(const MemoryDeviceState *md,
+                                        MemoryDeviceInfo *info)
+{
+    PCDIMMDeviceInfo *di = g_new0(PCDIMMDeviceInfo, 1);
+    const DeviceClass *dc = DEVICE_GET_CLASS(md);
+    const DeviceState *dev = DEVICE(md);
+    CXLType3Dev *ct3d = CT3(md);
+
+    if (dev->id) {
+        di->has_id = true;
+        di->id = g_strdup(dev->id);
+    }
+    di->hotplugged = dev->hotplugged;
+    di->hotpluggable = dc->hotpluggable;
+    di->addr = cxl_md_get_addr(md);
+    di->slot = 0;
+    di->node = 0;
+    di->size = memory_device_get_region_size(md, NULL);
+    di->memdev = object_get_canonical_path(OBJECT(ct3d->hostmem));
+
+
+    info->u.cxl.data = di;
+    info->type = MEMORY_DEVICE_INFO_KIND_CXL;
+}
+
+static void ct3_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+    PCIDeviceClass *pc = PCI_DEVICE_CLASS(oc);
+    MemoryDeviceClass *mdc = MEMORY_DEVICE_CLASS(oc);
+
+    pc->realize = ct3_realize;
+    pc->class_id = PCI_CLASS_STORAGE_EXPRESS;
+    pc->vendor_id = PCI_VENDOR_ID_INTEL;
+    pc->device_id = 0xd93; /* LVF for now */
+    pc->revision = 1;
+
+    set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+    dc->desc = "CXL PMEM Device (Type 3)";
+    dc->reset = ct3d_reset;
+    device_class_set_props(dc, ct3_props);
+
+    mdc->get_memory_region = cxl_md_get_memory_region;
+    mdc->get_addr = cxl_md_get_addr;
+    mdc->fill_device_info = pc_dimm_md_fill_device_info;
+    mdc->get_plugged_size = memory_device_get_region_size;
+    mdc->set_addr = cxl_md_set_addr;
+}
+
+static const TypeInfo ct3d_info = {
+    .name = TYPE_CXL_TYPE3_DEV,
+    .parent = TYPE_PCI_DEVICE,
+    .class_init = ct3_class_init,
+    .instance_size = sizeof(CXLType3Dev),
+    .instance_init = ct3_instance_init,
+    .instance_finalize = ct3_finalize,
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_MEMORY_DEVICE },
+        { INTERFACE_CXL_DEVICE },
+        { INTERFACE_PCIE_DEVICE },
+        {}
+    },
+};
+
+static void ct3d_registers(void)
+{
+    type_register_static(&ct3d_info);
+}
+
+type_init(ct3d_registers);
diff --git a/hw/mem/meson.build b/hw/mem/meson.build
index 0d22f2b572..d13c3ed117 100644
--- a/hw/mem/meson.build
+++ b/hw/mem/meson.build
@@ -3,5 +3,6 @@ mem_ss.add(files('memory-device.c'))
 mem_ss.add(when: 'CONFIG_DIMM', if_true: files('pc-dimm.c'))
 mem_ss.add(when: 'CONFIG_NPCM7XX', if_true: files('npcm7xx_mc.c'))
 mem_ss.add(when: 'CONFIG_NVDIMM', if_true: files('nvdimm.c'))
+mem_ss.add(when: 'CONFIG_CXL_MEM_DEVICE', if_true: files('cxl_type3.c'))
 
 softmmu_ss.add_all(when: 'CONFIG_MEM_DEVICE', if_true: mem_ss)
diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
index d4010cf8f3..1ecf6f6a55 100644
--- a/hw/pci/pcie.c
+++ b/hw/pci/pcie.c
@@ -20,6 +20,7 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "hw/mem/memory-device.h"
 #include "hw/pci/pci_bridge.h"
 #include "hw/pci/pcie.h"
 #include "hw/pci/msix.h"
@@ -27,6 +28,8 @@
 #include "hw/pci/pci_bus.h"
 #include "hw/pci/pcie_regs.h"
 #include "hw/pci/pcie_port.h"
+#include "hw/cxl/cxl.h"
+#include "hw/boards.h"
 #include "qemu/range.h"
 
 //#define DEBUG_PCIE
@@ -419,6 +422,28 @@ void pcie_cap_slot_pre_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
     }
 
     pcie_cap_slot_plug_common(PCI_DEVICE(hotplug_dev), dev, errp);
+
+#ifdef CXL_MEM_DEVICE
+    /*
+     * FIXME:
+     * if (object_dynamic_cast(OBJECT(dev), TYPE_CXL_TYPE3_DEV)) {
+     *    HotplugHandler *hotplug_ctrl;
+     *   Error *local_err = NULL;
+     *  hotplug_ctrl = qdev_get_hotplug_handler(dev);
+     *  if (hotplug_ctrl) {
+     *      hotplug_handler_pre_plug(hotplug_ctrl, dev, &local_err);
+     *      if (local_err) {
+     *          error_propagate(errp, local_err);
+     *          return;
+     *      }
+     *  }
+     */
+
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CXL_TYPE3_DEV)) {
+        memory_device_pre_plug(MEMORY_DEVICE(dev), MACHINE(qdev_get_machine()),
+                               NULL, errp);
+    }
+#endif
 }
 
 void pcie_cap_slot_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
@@ -455,6 +480,11 @@ void pcie_cap_slot_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
         pcie_cap_slot_event(hotplug_pdev,
                             PCI_EXP_HP_EV_PDC | PCI_EXP_HP_EV_ABP);
     }
+
+#ifdef CXL_MEM_DEVICE
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CXL_TYPE3_DEV))
+        memory_device_plug(MEMORY_DEVICE(dev), MACHINE(qdev_get_machine()));
+#endif
 }
 
 void pcie_cap_slot_unplug_cb(HotplugHandler *hotplug_dev, DeviceState *dev,
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index b1e5f4a8fa..809ed7de60 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -17,6 +17,8 @@
 #define COMPONENT_REG_BAR_IDX 0
 #define DEVICE_REG_BAR_IDX 2
 
+#define TYPE_CXL_TYPE3_DEV "cxl-type3"
+
 #define CXL_HOST_BASE 0xD0000000
 #define CXL_WINDOW_MAX 10
 
diff --git a/include/hw/cxl/cxl_pci.h b/include/hw/cxl/cxl_pci.h
index b403770424..7a9ce71612 100644
--- a/include/hw/cxl/cxl_pci.h
+++ b/include/hw/cxl/cxl_pci.h
@@ -63,6 +63,28 @@ _Static_assert(sizeof(struct dvsec_header) == 10,
  * CXL 2.0 Downstream Port: 3, 4, 7, 8
  */
 
+/* CXL 2.0 - 8.1.3 (ID 0001) */
+struct dvsec_device {
+    struct dvsec_header hdr;
+    uint16_t cap;
+    uint16_t ctrl;
+    uint16_t status;
+    uint16_t ctrl2;
+    uint16_t status2;
+    uint16_t lock;
+    uint16_t cap2;
+    uint32_t range1_size_hi;
+    uint32_t range1_size_lo;
+    uint32_t range1_base_hi;
+    uint32_t range1_base_lo;
+    uint32_t range2_size_hi;
+    uint32_t range2_size_lo;
+    uint32_t range2_base_hi;
+    uint32_t range2_base_lo;
+};
+_Static_assert(sizeof(struct dvsec_device) == 0x38,
+               "dvsec device size incorrect");
+
 /* CXL 2.0 - 8.1.5 (ID 0003) */
 struct dvsec_port {
     struct dvsec_header hdr;
diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
index 11f8ab7149..76bf3ed590 100644
--- a/include/hw/pci/pci_ids.h
+++ b/include/hw/pci/pci_ids.h
@@ -53,6 +53,7 @@
 #define PCI_BASE_CLASS_MEMORY            0x05
 #define PCI_CLASS_MEMORY_RAM             0x0500
 #define PCI_CLASS_MEMORY_FLASH           0x0501
+#define PCI_CLASS_MEMORY_CXL             0x0502
 #define PCI_CLASS_MEMORY_OTHER           0x0580
 
 #define PCI_BASE_CLASS_BRIDGE            0x06
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 56e9bad33d..8de1959c68 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1880,6 +1880,21 @@ void hmp_info_memory_devices(Monitor *mon, const QDict *qdict)
                 monitor_printf(mon, "  hotpluggable: %s\n",
                                di->hotpluggable ? "true" : "false");
                 break;
+            case MEMORY_DEVICE_INFO_KIND_CXL:
+                di = value->u.cxl.data;
+                monitor_printf(mon, "Memory device [%s]: \"%s\"\n",
+                               MemoryDeviceInfoKind_str(value->type),
+                               di->id ? di->id : "");
+                monitor_printf(mon, "  addr: 0x%" PRIx64 "\n", di->addr);
+                monitor_printf(mon, "  slot: %" PRId64 "\n", di->slot);
+                monitor_printf(mon, "  node: %" PRId64 "\n", di->node);
+                monitor_printf(mon, "  size: %" PRIu64 "\n", di->size);
+                monitor_printf(mon, "  memdev: %s\n", di->memdev);
+                monitor_printf(mon, "  hotplugged: %s\n",
+                               di->hotplugged ? "true" : "false");
+                monitor_printf(mon, "  hotpluggable: %s\n",
+                               di->hotpluggable ? "true" : "false");
+                break;
             case MEMORY_DEVICE_INFO_KIND_VIRTIO_PMEM:
                 vpi = value->u.virtio_pmem.data;
                 monitor_printf(mon, "Memory device [%s]: \"%s\"\n",
diff --git a/qapi/machine.json b/qapi/machine.json
index 7c9a263778..c954950fdc 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1394,6 +1394,7 @@
 { 'union': 'MemoryDeviceInfo',
   'data': { 'dimm': 'PCDIMMDeviceInfo',
             'nvdimm': 'PCDIMMDeviceInfo',
+            'cxl': 'PCDIMMDeviceInfo',
             'virtio-pmem': 'VirtioPMEMDeviceInfo',
             'virtio-mem': 'VirtioMEMDeviceInfo'
           }
-- 
2.29.2




* [RFC PATCH 19/25] hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (17 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 18/25] hw/cxl/device: Add a memory device (8.2.8.5) Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 20/25] acpi/cxl: Add _OSC implementation (9.14.2) Ben Widawsky
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel
  Cc: Vishal Verma, Dan Williams, Ben Widawsky, Igor Mammedov,
	Michael S. Tsirkin

A device's volatile and persistent memory are known as Host-managed
Device Memory (HDM) regions. The device is programmed to claim the
addresses associated with those regions through dedicated logic known as
the HDM decoder. To allow the OS to properly program the HDMs, the HDM
decoders must be modeled.

There are two ways the HDM decoders can be implemented. The legacy
mechanism is PCIe DVSEC programming from CXL 1.1 (8.1.3.8); the MMIO
mechanism is described in 8.2.5.12 of the spec. For now, 8.1.3.8 is not
implemented.

Much of the CXL device logic is implemented in cxl-utils. The HDM
decoder, however, is implemented directly by the device implementation.
The generic cxl-utils is probably the correct place for it, since HDM
decoders aren't unique to a type3 device. For the moment, though, it is
easier and requires less design consideration to simply implement it in
the device and figure out how to consolidate it later.
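
The commit step described above can be sketched as a standalone C model.
This is a simplified illustration, not the QEMU code: struct hdm_decoder,
join64 and hdm_commit are hypothetical names, the register layout is
flattened into plain fields, and rejecting a zero-sized range is an
assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified decoder state; not the QEMU register layout. */
struct hdm_decoder {
    uint32_t base_lo, base_hi;
    uint32_t size_lo, size_hi;
    bool committed;
    bool error;
};

/* Combine the HI/LO register halves into one 64-bit value. */
static uint64_t join64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/*
 * Commit succeeds only if [base, base + size) lies entirely inside the
 * window [win_base, win_base + win_size). On failure the error flag is
 * set and committed stays clear.
 */
static bool hdm_commit(struct hdm_decoder *d, uint64_t win_base,
                       uint64_t win_size)
{
    uint64_t base = join64(d->base_hi, d->base_lo);
    uint64_t size = join64(d->size_hi, d->size_lo);

    d->committed = false;
    d->error = false;

    /* Written with subtractions so the checks cannot overflow */
    if (size == 0 || base < win_base || size > win_size ||
        base - win_base > win_size - size) {
        d->error = true;
        return false;
    }

    d->committed = true;
    return true;
}
```

In the patch itself the equivalent check uses range_contains_range() on
the backing MemoryRegion, and a successful commit also resizes and maps
the pmem subregion.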

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/mem/cxl_type3.c | 82 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 77 insertions(+), 5 deletions(-)

diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index 48c25922f3..00ab5044b1 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -57,6 +57,71 @@ static void build_dvsecs(CXLType3Dev *ct3d)
                                REG_LOC_DVSEC_REVID, dvsec);
 }
 
+static void hdm_decoder_commit(CXLType3Dev *ct3d, int which)
+{
+    MemoryRegion *pmem = ct3d->cxl_dstate.pmem;
+    MemoryRegion *mr = host_memory_backend_get_memory(ct3d->hostmem);
+    Range window, device;
+    ComponentRegisters *cregs = &ct3d->cxl_cstate.crb;
+    uint32_t *cache_mem = cregs->cache_mem_registers;
+    uint64_t offset, size;
+    Error *err = NULL;
+
+    assert(which == 0);
+
+    ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, COMMIT, 0);
+    ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, ERROR, 0);
+
+    offset = ((uint64_t)cache_mem[R_CXL_HDM_DECODER0_BASE_HI] << 32) |
+             cache_mem[R_CXL_HDM_DECODER0_BASE_LO];
+    size = ((uint64_t)cache_mem[R_CXL_HDM_DECODER0_SIZE_HI] << 32) |
+           cache_mem[R_CXL_HDM_DECODER0_SIZE_LO];
+
+    range_init_nofail(&window, mr->addr, memory_region_size(mr));
+    range_init_nofail(&device, offset, size);
+
+    if (!range_contains_range(&window, &device)) {
+        ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, ERROR, 1);
+        return;
+    }
+
+    memory_region_ram_resize(pmem, size, &err);
+    if (err) {
+        ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, ERROR, 1);
+        return;
+    }
+
+    offset -= mr->addr;
+    memory_region_add_subregion(mr, offset, pmem);
+
+    ARRAY_FIELD_DP32(cache_mem, CXL_HDM_DECODER0_CTRL, COMMITTED, 1);
+}
+
+static void ct3d_reg_write(void *opaque, hwaddr offset, uint64_t value, unsigned size)
+{
+    CXLComponentState *cxl_cstate = opaque;
+    ComponentRegisters *cregs = &cxl_cstate->crb;
+    CXLType3Dev *ct3d = container_of(cxl_cstate, CXLType3Dev, cxl_cstate);
+    uint32_t *cache_mem = cregs->cache_mem_registers;
+    bool should_commit = false;
+    int which_hdm = -1;
+
+    assert(size == 4);
+
+    switch (offset) {
+    case A_CXL_HDM_DECODER0_CTRL:
+        should_commit = FIELD_EX32(value, CXL_HDM_DECODER0_CTRL, COMMIT);
+        which_hdm = 0;
+        break;
+    default:
+        break;
+    }
+
+    stl_le_p((uint8_t *)cache_mem + offset, value);
+    if (should_commit)
+        hdm_decoder_commit(ct3d, which_hdm);
+}
+
 static void ct3_instance_init(Object *obj)
 {
     /* MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(obj); */
@@ -65,7 +130,10 @@ static void ct3_instance_init(Object *obj)
 static void ct3_finalize(Object *obj)
 {
     CXLType3Dev *ct3d = CT3(obj);
+    CXLComponentState *cxl_cstate = &ct3d->cxl_cstate;
+    ComponentRegisters *regs = &cxl_cstate->crb;
 
+    g_free((void *)regs->special_ops);
     g_free(ct3d->cxl_dstate.pmem);
 }
 
@@ -81,11 +149,12 @@ static void cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
         return;
     }
 
-    /* FIXME: need to check mr is the host bridge's MR */
-    mr = host_memory_backend_get_memory(ct3d->hostmem);
-
     /* Create our new subregion */
     ct3d->cxl_dstate.pmem = g_new(MemoryRegion, 1);
+    memory_region_set_nonvolatile(ct3d->cxl_dstate.pmem, true);
+
+    /* FIXME: need to check mr is the host bridge's MR */
+    mr = host_memory_backend_get_memory(ct3d->hostmem);
 
     /* Find the first free space in the window */
     WITH_RCU_READ_LOCK_GUARD()
@@ -108,8 +177,6 @@ static void cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
     /* Register our subregion as non-volatile */
     memory_region_init_ram(ct3d->cxl_dstate.pmem, OBJECT(ct3d),
                            "cxl_type3-memory", ct3d->size, errp);
-    memory_region_set_nonvolatile(ct3d->cxl_dstate.pmem, true);
-
 #ifdef SET_PMEM_PADDR
     memory_region_add_subregion(mr, offset, ct3d->cxl_dstate.pmem);
 #endif
@@ -148,6 +215,11 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp)
     ct3d->cxl_cstate.pdev = pci_dev;
     build_dvsecs(ct3d);
 
+#ifndef SET_PMEM_PADDR
+    regs->special_ops = g_new0(MemoryRegionOps, 1);
+    regs->special_ops->write = ct3d_reg_write;
+#endif
+
     cxl_component_register_block_init(OBJECT(pci_dev), cxl_cstate,
                                       TYPE_CXL_TYPE3_DEV);
 
-- 
2.29.2




* [RFC PATCH 20/25] acpi/cxl: Add _OSC implementation (9.14.2)
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (18 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 19/25] hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12) Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 21/25] acpi/cxl: Introduce a compat-driver UUID for CXL _OSC Ben Widawsky
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, Eduardo Habkost, Michael S. Tsirkin, Vishal Verma,
	Paolo Bonzini, Igor Mammedov, Dan Williams, Richard Henderson

The CXL 2.0 specification adds 2 new dwords to the existing _OSC
definition from PCIe. The new dwords are accessed with a new UUID. This
implementation supports what is in the specification.

We are currently working on a new definition for _OSC. See later work
for an explanation.
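
The intended _OSC flow can be modeled in plain C for readers less
familiar with AML. This is a sketch under assumptions: the enum, macro
and function names are hypothetical, the UUID comparison is reduced to an
enum, and the control mask is applied to the local copy of DWORD3, which
is how I read the intent of the method rather than a transcription of it.

```c
#include <assert.h>
#include <stdint.h>

/* Status bits returned in DWORD1 (CDW1) */
#define OSC_ERR_UNKNOWN_UUID 0x04u
#define OSC_ERR_UNKNOWN_REV  0x08u
#define OSC_ERR_CAPS_MASKED  0x10u

#define OSC_PCIE_CTRL_MASK   0x1Fu /* control bits the platform grants */
#define OSC_CXL_REG_ACCESS   0x01u /* CXL 2.0 Port/Device Register access */

/* Hypothetical stand-in for the UUID comparison done in AML */
enum osc_uuid { OSC_UUID_PCIE, OSC_UUID_CXL, OSC_UUID_OTHER };

/* cdw[] holds DWORD1..DWORD5 at indexes 0..4; it is updated in place. */
static void osc_eval(enum osc_uuid uuid, uint32_t revision, uint32_t cdw[5])
{
    uint32_t ctrl;

    if (uuid == OSC_UUID_OTHER) {
        cdw[0] |= OSC_ERR_UNKNOWN_UUID;
        return;
    }

    /* Grant only the control bits the platform supports */
    ctrl = cdw[2] & OSC_PCIE_CTRL_MASK;

    if (revision != 1) {
        cdw[0] |= OSC_ERR_UNKNOWN_REV;
    }
    if (cdw[2] != ctrl) {
        cdw[0] |= OSC_ERR_CAPS_MASKED;
    }
    if (uuid == OSC_UUID_CXL) {
        cdw[4] |= OSC_CXL_REG_ACCESS;
    }

    /* DWORD3 carries the granted control bits back to the OS */
    cdw[2] = ctrl;
}
```

A caller requesting control bits outside the mask gets them cleared in
DWORD3 and sees the "capabilities masked" bit set in DWORD1.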

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/acpi/Kconfig       |   5 ++
 hw/acpi/cxl.c         | 104 ++++++++++++++++++++++++++++++++++++++++++
 hw/acpi/meson.build   |   1 +
 hw/i386/acpi-build.c  |  12 ++++-
 include/hw/acpi/cxl.h |  23 ++++++++++
 5 files changed, 144 insertions(+), 1 deletion(-)
 create mode 100644 hw/acpi/cxl.c
 create mode 100644 include/hw/acpi/cxl.h

diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index 1932f66af8..b27907953e 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -5,6 +5,7 @@ config ACPI_X86
     bool
     select ACPI
     select ACPI_NVDIMM
+    select ACPI_CXL
     select ACPI_CPU_HOTPLUG
     select ACPI_MEMORY_HOTPLUG
     select ACPI_HMAT
@@ -42,3 +43,7 @@ config ACPI_VMGENID
     depends on PC
 
 config ACPI_HW_REDUCED
+
+config ACPI_CXL
+    bool
+    depends on ACPI
diff --git a/hw/acpi/cxl.c b/hw/acpi/cxl.c
new file mode 100644
index 0000000000..7124d5a1a3
--- /dev/null
+++ b/hw/acpi/cxl.c
@@ -0,0 +1,104 @@
+/*
+ * CXL ACPI Implementation
+ *
+ * Copyright(C) 2020 Intel Corporation.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include "qemu/osdep.h"
+#include "hw/cxl/cxl.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/aml-build.h"
+#include "hw/acpi/bios-linker-loader.h"
+#include "hw/acpi/cxl.h"
+#include "qapi/error.h"
+#include "qemu/uuid.h"
+
+static Aml *__build_cxl_osc_method(void)
+{
+    Aml *method, *if_uuid, *else_uuid, *if_arg1_not_1, *if_cxl, *if_caps_masked;
+    Aml *a_ctrl = aml_local(0);
+    Aml *a_cdw1 = aml_name("CDW1");
+
+    method = aml_method("_OSC", 4, AML_NOTSERIALIZED);
+    aml_append(method, aml_create_dword_field(aml_arg(3), aml_int(0), "CDW1"));
+
+    /* 9.14.2.1.4 */
+    if_uuid = aml_if(
+        aml_lor(aml_equal(aml_arg(0),
+                          aml_touuid("33DB4D5B-1FF7-401C-9657-7441C03DD766")),
+                aml_equal(aml_arg(0),
+                          aml_touuid("68F2D50B-C469-4D8A-BD3D-941A103FD3FC"))));
+    aml_append(if_uuid, aml_create_dword_field(aml_arg(3), aml_int(4), "CDW2"));
+    aml_append(if_uuid, aml_create_dword_field(aml_arg(3), aml_int(8), "CDW3"));
+
+    aml_append(if_uuid, aml_store(aml_name("CDW3"), a_ctrl));
+
+    /* This is all the same as what's used for PCIe */
+    aml_append(if_uuid,
+               aml_and(aml_name("CTRL"), aml_int(0x1F), aml_name("CTRL")));
+
+    if_arg1_not_1 = aml_if(aml_lnot(aml_equal(aml_arg(1), aml_int(0x1))));
+    /* Unknown revision */
+    aml_append(if_arg1_not_1, aml_or(a_cdw1, aml_int(0x08), a_cdw1));
+    aml_append(if_uuid, if_arg1_not_1);
+
+    if_caps_masked = aml_if(aml_lnot(aml_equal(aml_name("CDW3"), a_ctrl)));
+    /* Capability bits were masked */
+    aml_append(if_caps_masked, aml_or(a_cdw1, aml_int(0x10), a_cdw1));
+    aml_append(if_uuid, if_caps_masked);
+
+    aml_append(if_uuid, aml_store(aml_name("CDW2"), aml_name("SUPP")));
+    aml_append(if_uuid, aml_store(aml_name("CDW3"), aml_name("CTRL")));
+
+    if_cxl = aml_if(aml_equal(
+        aml_arg(0), aml_touuid("68F2D50B-C469-4D8A-BD3D-941A103FD3FC")));
+    /* CXL support field */
+    aml_append(if_cxl, aml_create_dword_field(aml_arg(3), aml_int(12), "CDW4"));
+    /* CXL capabilities */
+    aml_append(if_cxl, aml_create_dword_field(aml_arg(3), aml_int(16), "CDW5"));
+    aml_append(if_cxl, aml_store(aml_name("CDW4"), aml_name("SUPC")));
+    aml_append(if_cxl, aml_store(aml_name("CDW5"), aml_name("CTRC")));
+
+    /* CXL 2.0 Port/Device Register access */
+    aml_append(if_cxl,
+               aml_or(aml_name("CDW5"), aml_int(0x1), aml_name("CDW5")));
+    aml_append(if_uuid, if_cxl);
+
+    /* Update DWORD3 (the return value) */
+    aml_append(if_uuid, aml_store(a_ctrl, aml_name("CDW3")));
+
+    aml_append(if_uuid, aml_return(aml_arg(3)));
+    aml_append(method, if_uuid);
+
+    else_uuid = aml_else();
+
+    /* unrecognized uuid */
+    aml_append(else_uuid,
+               aml_or(aml_name("CDW1"), aml_int(0x4), aml_name("CDW1")));
+    aml_append(else_uuid, aml_return(aml_arg(3)));
+    aml_append(method, else_uuid);
+
+    return method;
+}
+
+void build_cxl_osc_method(Aml *dev)
+{
+    aml_append(dev, aml_name_decl("SUPP", aml_int(0)));
+    aml_append(dev, aml_name_decl("CTRL", aml_int(0)));
+    aml_append(dev, aml_name_decl("SUPC", aml_int(0)));
+    aml_append(dev, aml_name_decl("CTRC", aml_int(0)));
+    aml_append(dev, __build_cxl_osc_method());
+}
diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build
index dd69577212..9f5c5ced28 100644
--- a/hw/acpi/meson.build
+++ b/hw/acpi/meson.build
@@ -10,6 +10,7 @@ acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu_hotplug.c'))
 acpi_ss.add(when: 'CONFIG_ACPI_MEMORY_HOTPLUG', if_true: files('memory_hotplug.c'))
 acpi_ss.add(when: 'CONFIG_ACPI_NVDIMM', if_true: files('nvdimm.c'))
 acpi_ss.add(when: 'CONFIG_ACPI_PCI', if_true: files('pci.c'))
+acpi_ss.add(when: 'CONFIG_ACPI_CXL', if_true: files('cxl.c'))
 acpi_ss.add(when: 'CONFIG_ACPI_VMGENID', if_true: files('vmgenid.c'))
 acpi_ss.add(when: 'CONFIG_ACPI_HW_REDUCED', if_true: files('generic_event_device.c'))
 acpi_ss.add(when: 'CONFIG_ACPI_HMAT', if_true: files('hmat.c'))
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index fae4fa28e1..dd1f8b39d4 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -66,6 +66,7 @@
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/utils.h"
 #include "hw/acpi/pci.h"
+#include "hw/acpi/cxl.h"
 
 #include "qom/qom-qobject.h"
 #include "hw/i386/amd_iommu.h"
@@ -1493,11 +1494,20 @@ static void init_pci_acpi(Aml *dev, int uid, int type)
     if (type == PCI) {
         aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
         aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
-    } else {
+    } else if (type == PCIE) {
         aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
         aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
         aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
         aml_append(dev, build_q35_osc_method());
+    } else /* CXL */ {
+        struct Aml *pkg = aml_package(2);
+
+        aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0016")));
+        aml_append(pkg, aml_eisaid("PNP0A08"));
+        aml_append(pkg, aml_eisaid("PNP0A03"));
+        aml_append(dev, aml_name_decl("_CID", pkg));
+        aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
+        build_cxl_osc_method(dev);
     }
 }
 
diff --git a/include/hw/acpi/cxl.h b/include/hw/acpi/cxl.h
new file mode 100644
index 0000000000..7b8f3b8a2e
--- /dev/null
+++ b/include/hw/acpi/cxl.h
@@ -0,0 +1,23 @@
+/*
+ * Copyright (C) 2020 Intel Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_ACPI_CXL_H
+#define HW_ACPI_CXL_H
+
+void build_cxl_osc_method(Aml *dev);
+
+#endif
-- 
2.29.2




* [RFC PATCH 21/25] acpi/cxl: Introduce a compat-driver UUID for CXL _OSC
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (19 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 20/25] acpi/cxl: Add _OSC implementation (9.14.2) Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 22/25] acpi/cxl: Create the CEDT (9.14.1) Ben Widawsky
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: Vishal Verma, Dan Williams, Igor Mammedov, Michael S. Tsirkin

From: Vishal Verma <vishal.l.verma@intel.com>

Introduce a new UUID for CXL _OSC that only sets CXL related 'Support'
and 'Control' Dwords, independent of PCI/PCIe Dwords. This is a proposal
and an example AML implementation to demonstrate what such a compat UUID
would look like.

The AML resulting from this change is:

        Method (_OSC, 4, NotSerialized)  // _OSC: Operating System Capabilities
        {
            CreateDWordField (Arg3, Zero, CDW1)
            If ((((Arg0 == ToUUID ("33db4d5b-1ff7-401c-9657-7441c03dd766") /* PCI Host Bridge Device */) || (Arg0 == ToUUID ("68f2d50b-c469-4d8a-bd3d-941a103fd3fc"))) || (
                Arg0 == ToUUID ("a4d1629d-ff52-4888-be96-e5cade548db1"))))
            {
                If ((Arg0 == ToUUID ("a4d1629d-ff52-4888-be96-e5cade548db1")))
                {
                    CreateDWordField (Arg3, 0x04, CDW2)
                    CreateDWordField (Arg3, 0x08, CDW3)
                    SUPC = CDW2 /* \_SB_.CXL0._OSC.CDW2 */
                    CTRC = CDW3 /* \_SB_.CXL0._OSC.CDW3 */
                    CDW3 |= One
                    Return (Arg3)
                }
                Else
                {
                    CreateDWordField (Arg3, 0x04, CDW2)
                    CreateDWordField (Arg3, 0x08, CDW3)
                    Local0 = CDW3 /* \_SB_.CXL0._OSC.CDW3 */
                    CTRL &= 0x1F
                    If ((Arg1 != One))
                    {
                        CDW1 |= 0x08
                    }

                    If ((CDW3 != Local0))
                    {
                        CDW1 |= 0x10
                    }

                    SUPP = CDW2 /* \_SB_.CXL0._OSC.CDW2 */
                    CTRL = CDW3 /* \_SB_.CXL0._OSC.CDW3 */
                    If ((Arg0 == ToUUID ("68f2d50b-c469-4d8a-bd3d-941a103fd3fc")))
                    {
                        CreateDWordField (Arg3, 0x0C, CDW4)
                        CreateDWordField (Arg3, 0x10, CDW5)
                        SUPC = CDW4 /* \_SB_.CXL0._OSC.CDW4 */
                        CTRC = CDW5 /* \_SB_.CXL0._OSC.CDW5 */
                        CDW5 |= One
                    }

                    CDW3 = Local0
                    Return (Arg3)
                }
            }

            Return (Arg3)
            Else
            {
                CDW1 |= 0x04
            }
        }
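
The dispatch in the listing above can be read as the following C model. This
is only a sketch for illustration, not QEMU code: the names osc() and
struct osc_state are hypothetical, UUIDs are compared as lower-case strings,
and the CTRL mask and "capability bits were masked" check are left out. It
shows which Dwords of the capabilities buffer feed SUPP/CTRL/SUPC/CTRC for
each UUID and which bits come back set.

```c
#include <stdint.h>
#include <string.h>

/* UUIDs as they appear in the AML listing (lower case). */
#define UUID_PCI    "33db4d5b-1ff7-401c-9657-7441c03dd766"
#define UUID_CXL    "68f2d50b-c469-4d8a-bd3d-941a103fd3fc"
#define UUID_COMPAT "a4d1629d-ff52-4888-be96-e5cade548db1"

/* Hypothetical stand-in for the SUPP/CTRL/SUPC/CTRC named objects. */
struct osc_state {
    uint32_t supp, ctrl, supc, ctrc;
};

/*
 * cdw[] models the Arg3 buffer: cdw[0] is CDW1, cdw[1] is CDW2, and so on.
 * The caller provides 3 Dwords for the PCI and compat UUIDs, 5 for CXL.
 */
static void osc(struct osc_state *s, const char *uuid, uint32_t rev,
                uint32_t *cdw)
{
    if (strcmp(uuid, UUID_COMPAT) == 0) {
        /* Compat UUID: CDW2/CDW3 carry the CXL fields directly. */
        s->supc = cdw[1];
        s->ctrc = cdw[2];
        cdw[2] |= 0x1;          /* grant bit 0 of the CXL control field */
        return;
    }

    if (strcmp(uuid, UUID_PCI) != 0 && strcmp(uuid, UUID_CXL) != 0) {
        cdw[0] |= 0x04;         /* unrecognized UUID */
        return;
    }

    if (rev != 1) {
        cdw[0] |= 0x08;         /* unknown revision */
    }
    s->supp = cdw[1];
    s->ctrl = cdw[2];

    if (strcmp(uuid, UUID_CXL) == 0) {
        /* CXL UUID: PCIe fields in CDW2/CDW3, CXL fields in CDW4/CDW5. */
        s->supc = cdw[3];
        s->ctrc = cdw[4];
        cdw[4] |= 0x1;          /* CXL 2.0 Port/Device Register access */
    }
}
```

With the compat UUID a caller passes a three-Dword buffer and only SUPC and
CTRC change; SUPP and CTRL keep whatever an earlier PCIe _OSC call left there.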

Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 hw/acpi/cxl.c | 54 ++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 38 insertions(+), 16 deletions(-)

diff --git a/hw/acpi/cxl.c b/hw/acpi/cxl.c
index 7124d5a1a3..31ceaeecc3 100644
--- a/hw/acpi/cxl.c
+++ b/hw/acpi/cxl.c
@@ -29,6 +29,7 @@
 static Aml *__build_cxl_osc_method(void)
 {
     Aml *method, *if_uuid, *else_uuid, *if_arg1_not_1, *if_cxl, *if_caps_masked;
+    Aml *if_compat, *else_nocompat;
     Aml *a_ctrl = aml_local(0);
     Aml *a_cdw1 = aml_name("CDW1");
 
@@ -37,31 +38,51 @@ static Aml *__build_cxl_osc_method(void)
 
     /* 9.14.2.1.4 */
     if_uuid = aml_if(
-        aml_lor(aml_equal(aml_arg(0),
+        aml_lor(
+            aml_lor(aml_equal(aml_arg(0),
                           aml_touuid("33DB4D5B-1FF7-401C-9657-7441C03DD766")),
-                aml_equal(aml_arg(0),
-                          aml_touuid("68F2D50B-C469-4D8A-BD3D-941A103FD3FC"))));
-    aml_append(if_uuid, aml_create_dword_field(aml_arg(3), aml_int(4), "CDW2"));
-    aml_append(if_uuid, aml_create_dword_field(aml_arg(3), aml_int(8), "CDW3"));
-
-    aml_append(if_uuid, aml_store(aml_name("CDW3"), a_ctrl));
+                    aml_equal(aml_arg(0),
+                          aml_touuid("68F2D50B-C469-4D8A-BD3D-941A103FD3FC"))),
+                    aml_equal(aml_arg(0),
+                          aml_touuid("A4D1629D-FF52-4888-BE96-E5CADE548DB1"))));
+
+    if_compat = aml_if(aml_equal(aml_arg(0),
+                          aml_touuid("A4D1629D-FF52-4888-BE96-E5CADE548DB1")));
+    aml_append(if_compat,
+               aml_create_dword_field(aml_arg(3), aml_int(4), "CDW2"));
+    aml_append(if_compat,
+               aml_create_dword_field(aml_arg(3), aml_int(8), "CDW3"));
+    aml_append(if_compat, aml_store(aml_name("CDW2"), aml_name("SUPC")));
+    aml_append(if_compat, aml_store(aml_name("CDW3"), aml_name("CTRC")));
+    aml_append(if_compat,
+               aml_or(aml_name("CDW3"), aml_int(0x1), aml_name("CDW3")));
+    aml_append(if_compat, aml_return(aml_arg(3)));
+    aml_append(if_uuid, if_compat);
+
+    else_nocompat = aml_else();
+    aml_append(else_nocompat,
+               aml_create_dword_field(aml_arg(3), aml_int(4), "CDW2"));
+    aml_append(else_nocompat,
+               aml_create_dword_field(aml_arg(3), aml_int(8), "CDW3"));
+
+    aml_append(else_nocompat, aml_store(aml_name("CDW3"), a_ctrl));
 
     /* This is all the same as what's used for PCIe */
-    aml_append(if_uuid,
+    aml_append(else_nocompat,
                aml_and(aml_name("CTRL"), aml_int(0x1F), aml_name("CTRL")));
 
     if_arg1_not_1 = aml_if(aml_lnot(aml_equal(aml_arg(1), aml_int(0x1))));
     /* Unknown revision */
     aml_append(if_arg1_not_1, aml_or(a_cdw1, aml_int(0x08), a_cdw1));
-    aml_append(if_uuid, if_arg1_not_1);
+    aml_append(else_nocompat, if_arg1_not_1);
 
     if_caps_masked = aml_if(aml_lnot(aml_equal(aml_name("CDW3"), a_ctrl)));
     /* Capability bits were masked */
     aml_append(if_caps_masked, aml_or(a_cdw1, aml_int(0x10), a_cdw1));
-    aml_append(if_uuid, if_caps_masked);
+    aml_append(else_nocompat, if_caps_masked);
 
-    aml_append(if_uuid, aml_store(aml_name("CDW2"), aml_name("SUPP")));
-    aml_append(if_uuid, aml_store(aml_name("CDW3"), aml_name("CTRL")));
+    aml_append(else_nocompat, aml_store(aml_name("CDW2"), aml_name("SUPP")));
+    aml_append(else_nocompat, aml_store(aml_name("CDW3"), aml_name("CTRL")));
 
     if_cxl = aml_if(aml_equal(
         aml_arg(0), aml_touuid("68F2D50B-C469-4D8A-BD3D-941A103FD3FC")));
@@ -75,12 +96,13 @@ static Aml *__build_cxl_osc_method(void)
     /* CXL 2.0 Port/Device Register access */
     aml_append(if_cxl,
                aml_or(aml_name("CDW5"), aml_int(0x1), aml_name("CDW5")));
-    aml_append(if_uuid, if_cxl);
+    aml_append(else_nocompat, if_cxl);
 
     /* Update DWORD3 (the return value) */
-    aml_append(if_uuid, aml_store(a_ctrl, aml_name("CDW3")));
+    aml_append(else_nocompat, aml_store(a_ctrl, aml_name("CDW3")));
 
-    aml_append(if_uuid, aml_return(aml_arg(3)));
+    aml_append(else_nocompat, aml_return(aml_arg(3)));
+    aml_append(if_uuid, else_nocompat);
     aml_append(method, if_uuid);
 
     else_uuid = aml_else();
@@ -88,7 +110,7 @@ static Aml *__build_cxl_osc_method(void)
     /* unrecognized uuid */
     aml_append(else_uuid,
                aml_or(aml_name("CDW1"), aml_int(0x4), aml_name("CDW1")));
-    aml_append(else_uuid, aml_return(aml_arg(3)));
+    aml_append(method, aml_return(aml_arg(3)));
     aml_append(method, else_uuid);
 
     return method;
-- 
2.29.2




* [RFC PATCH 22/25] acpi/cxl: Create the CEDT (9.14.1)
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (20 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 21/25] acpi/cxl: Introduce a compat-driver UUID for CXL _OSC Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-16 17:15   ` Jonathan Cameron
  2020-11-11  5:47 ` [RFC PATCH 23/25] Temp: acpi/cxl: Add ACPI0017 (CEDT awareness) Ben Widawsky
                   ` (4 subsequent siblings)
  26 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, Eduardo Habkost, Michael S. Tsirkin, Vishal Verma,
	Paolo Bonzini, Igor Mammedov, Dan Williams, Richard Henderson

The CXL Early Discovery Table is defined in the CXL 2.0 specification as
a way for the OS to get CXL specific information from the system
firmware.

As of CXL 2.0 spec, only 1 sub structure is defined, the CXL Host Bridge
Structure (CHBS) which is primarily useful for telling the OS exactly
where the MMIO for the host bridge is.
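
For reference, the CHBS record emitted by this patch is 32 bytes long. The
sketch below serializes the same fields in the same order and widths as
cedt_build_chbs() in the diff, little-endian. It is a standalone
illustration, not QEMU code: put_le() and chbs_encode() are hypothetical
helpers, and the reserved-field values simply copy what the patch writes
rather than making any claim about the specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHBS_LEN 32

/* Store the low `size` bytes of `val` at buf[off], least significant first. */
static size_t put_le(uint8_t *buf, size_t off, uint64_t val, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        buf[off + i] = (uint8_t)(val >> (8 * i));
    }
    return off + size;
}

/* Encode one CHBS record; returns the number of bytes written (32). */
static size_t chbs_encode(uint8_t buf[CHBS_LEN], uint32_t uid,
                          uint64_t base, uint32_t len)
{
    size_t off = 0;

    off = put_le(buf, off, 0, 1);           /* Type: 0 (CHBS) */
    off = put_le(buf, off, 0xff, 1);        /* Reserved, as in the patch */
    off = put_le(buf, off, CHBS_LEN, 2);    /* Record Length */
    off = put_le(buf, off, uid, 4);         /* UID, matches the _UID */
    off = put_le(buf, off, 1, 4);           /* Version */
    off = put_le(buf, off, 0xffffffff, 4);  /* Reserved, as in the patch */
    off = put_le(buf, off, base, 8);        /* Base of host bridge MMIO */
    off = put_le(buf, off, len, 4);         /* Length of that region */
    off = put_le(buf, off, 0xffffffff, 4);  /* Reserved, as in the patch */

    assert(off == CHBS_LEN);
    return off;
}
```

The field widths add up as 1 + 1 + 2 + 4 + 4 + 4 + 8 + 4 + 4 = 32, which is
the value written into Record Length.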

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/acpi/cxl.c                       | 72 +++++++++++++++++++++++++++++
 hw/i386/acpi-build.c                |  6 ++-
 hw/pci-bridge/pci_expander_bridge.c | 21 +--------
 include/hw/acpi/cxl.h               |  4 ++
 include/hw/pci/pci_bridge.h         | 25 ++++++++++
 5 files changed, 107 insertions(+), 21 deletions(-)

diff --git a/hw/acpi/cxl.c b/hw/acpi/cxl.c
index 31ceaeecc3..c9631763ad 100644
--- a/hw/acpi/cxl.c
+++ b/hw/acpi/cxl.c
@@ -18,14 +18,86 @@
  */
 
 #include "qemu/osdep.h"
+#include "hw/sysbus.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/pci/pci_host.h"
 #include "hw/cxl/cxl.h"
+#include "hw/mem/memory-device.h"
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/aml-build.h"
 #include "hw/acpi/bios-linker-loader.h"
 #include "hw/acpi/cxl.h"
+#include "hw/acpi/cxl.h"
 #include "qapi/error.h"
 #include "qemu/uuid.h"
 
+static void cedt_build_chbs(GArray *table_data, PXBDev *cxl)
+{
+    SysBusDevice *sbd = SYS_BUS_DEVICE(cxl->cxl.cxl_host_bridge);
+    struct MemoryRegion *mr = sbd->mmio[0].memory;
+
+    /* Type */
+    build_append_int_noprefix(table_data, 0, 1);
+
+    /* Reserved */
+    build_append_int_noprefix(table_data, 0xff, 1);
+
+    /* Record Length */
+    build_append_int_noprefix(table_data, 32, 2);
+
+    /* UID */
+    build_append_int_noprefix(table_data, cxl->uid, 4);
+
+    /* Version */
+    build_append_int_noprefix(table_data, 1, 4);
+
+    /* Reserved */
+    build_append_int_noprefix(table_data, 0xffffffff, 4);
+
+    /* Base */
+    build_append_int_noprefix(table_data, mr->addr, 8);
+
+    /* Length */
+    build_append_int_noprefix(table_data, memory_region_size(mr), 4);
+
+    /* Reserved */
+    build_append_int_noprefix(table_data, 0xffffffff, 4);
+}
+
+static int cxl_foreach_pxb_hb(Object *obj, void *opaque)
+{
+    Aml *cedt = opaque;
+
+    if (object_dynamic_cast(obj, TYPE_PXB_CXL_DEVICE)) {
+        PXBDev *pxb = PXB_CXL_DEV(obj);
+
+        cedt_build_chbs(cedt->buf, pxb);
+    }
+
+    return 0;
+}
+
+void cxl_build_cedt(GArray *table_offsets, GArray *table_data,
+                    BIOSLinker *linker)
+{
+    const int cedt_start = table_data->len;
+    Aml *cedt;
+
+    cedt = init_aml_allocator();
+
+    /* reserve space for CEDT header */
+    acpi_add_table(table_offsets, table_data);
+    acpi_data_push(cedt->buf, sizeof(AcpiTableHeader));
+
+    object_child_foreach_recursive(object_get_root(), cxl_foreach_pxb_hb, cedt);
+
+    /* copy AML table into ACPI tables blob and patch header there */
+    g_array_append_vals(table_data, cedt->buf->data, cedt->buf->len);
+    build_header(linker, table_data, (void *)(table_data->data + cedt_start),
+                 "CEDT", table_data->len - cedt_start, 1, NULL, NULL);
+    free_aml_allocator();
+}
+
 static Aml *__build_cxl_osc_method(void)
 {
     Aml *method, *if_uuid, *else_uuid, *if_arg1_not_1, *if_cxl, *if_caps_masked;
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index dd1f8b39d4..eda62dcd6a 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -75,6 +75,8 @@
 #include "hw/acpi/ipmi.h"
 #include "hw/acpi/hmat.h"
 
+#include "hw/acpi/cxl.h"
+
 /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
  * -M pc-i440fx-2.0.  Even if the actual amount of AML generated grows
  * a little bit, there should be plenty of free space since the DSDT
@@ -1662,7 +1664,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 
             scope = aml_scope("\\_SB");
             if (type == CXL) {
-                dev = aml_device("CXL%.01X", pci_bus_uid(bus));
+                dev = aml_device("CXL%.01X", uid);
             } else {
                 dev = aml_device("PC%.02X", bus_num);
             }
@@ -2568,6 +2570,8 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
                           machine->nvdimms_state, machine->ram_slots);
     }
 
+    cxl_build_cedt(table_offsets, tables_blob, tables->linker);
+
     acpi_add_table(table_offsets, tables_blob);
     build_waet(tables_blob, tables->linker);
 
diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
index 75910f5870..b2c1d9056a 100644
--- a/hw/pci-bridge/pci_expander_bridge.c
+++ b/hw/pci-bridge/pci_expander_bridge.c
@@ -57,26 +57,6 @@ DECLARE_INSTANCE_CHECKER(PXBDev, PXB_DEV,
 DECLARE_INSTANCE_CHECKER(PXBDev, PXB_PCIE_DEV,
                          TYPE_PXB_PCIE_DEVICE)
 
-#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
-DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
-                         TYPE_PXB_CXL_DEVICE)
-
-struct PXBDev {
-    /*< private >*/
-    PCIDevice parent_obj;
-    /*< public >*/
-
-    uint8_t bus_nr;
-    uint16_t numa_node;
-    int32_t uid;
-    struct cxl_dev {
-        HostMemoryBackend *memory_window[CXL_WINDOW_MAX];
-
-        uint32_t num_windows;
-        hwaddr *window_base[CXL_WINDOW_MAX];
-    } cxl;
-};
-
 typedef struct CXLHost {
     PCIHostState parent_obj;
 
@@ -351,6 +331,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
         bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_CXL_BUS);
         bus->flags |= PCI_BUS_CXL;
         PXB_CXL_HOST(ds)->dev = PXB_CXL_DEV(dev);
+        PXB_CXL_DEV(dev)->cxl.cxl_host_bridge = ds;
     } else {
         bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
         bds = qdev_new("pci-bridge");
diff --git a/include/hw/acpi/cxl.h b/include/hw/acpi/cxl.h
index 7b8f3b8a2e..db2063f8c9 100644
--- a/include/hw/acpi/cxl.h
+++ b/include/hw/acpi/cxl.h
@@ -18,6 +18,10 @@
 #ifndef HW_ACPI_CXL_H
 #define HW_ACPI_CXL_H
 
+#include "hw/acpi/bios-linker-loader.h"
+
+void cxl_build_cedt(GArray *table_offsets, GArray *table_data,
+                    BIOSLinker *linker);
 void build_cxl_osc_method(Aml *dev);
 
 #endif
diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
index a94d350034..50dd7fdf33 100644
--- a/include/hw/pci/pci_bridge.h
+++ b/include/hw/pci/pci_bridge.h
@@ -28,6 +28,7 @@
 
 #include "hw/pci/pci.h"
 #include "hw/pci/pci_bus.h"
+#include "hw/cxl/cxl.h"
 #include "qom/object.h"
 
 typedef struct PCIBridgeWindows PCIBridgeWindows;
@@ -81,6 +82,30 @@ struct PCIBridge {
 #define PCI_BRIDGE_DEV_PROP_MSI        "msi"
 #define PCI_BRIDGE_DEV_PROP_SHPC       "shpc"
 
+struct PXBDev {
+    /*< private >*/
+    PCIDevice parent_obj;
+    /*< public >*/
+
+    uint8_t bus_nr;
+    uint16_t numa_node;
+    int32_t uid;
+
+    struct cxl_dev {
+        HostMemoryBackend *memory_window[CXL_WINDOW_MAX];
+
+        uint32_t num_windows;
+        hwaddr *window_base[CXL_WINDOW_MAX];
+
+        void *cxl_host_bridge; /* Pointer to a CXLHost */
+    } cxl;
+};
+
+typedef struct PXBDev PXBDev;
+#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
+DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
+                         TYPE_PXB_CXL_DEVICE)
+
 int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
                           uint16_t svid, uint16_t ssid,
                           Error **errp);
-- 
2.29.2




* [RFC PATCH 23/25] Temp: acpi/cxl: Add ACPI0017 (CEDT awareness)
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (21 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 22/25] acpi/cxl: Create the CEDT (9.14.1) Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 24/25] WIP: i386/cxl: Initialize a host bridge Ben Widawsky
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel
  Cc: Ben Widawsky, Eduardo Habkost, Michael S. Tsirkin, Vishal Verma,
	Paolo Bonzini, Igor Mammedov, Dan Williams, Richard Henderson

This represents Intel's proposal for how the system firmware can notify
Linux that the CEDT exists and provides a driver attach point. It is not
in the CXL 2.0 specification as of now.

The CXL 2.0 specification adds an _HID, ACPI0016, for CXL capable host
bridges, with a _CID of PNP0A08 (PCIe host bridge). CXL aware software
is able to use this to initiate the proper _OSC method, and get the _UID
which is referenced by the CEDT. Therefore the existence of an ACPI0016
device allows a CXL aware driver to perform the necessary actions. For a
CXL capable OS, this works. For a CXL unaware OS, this works.

The motivation for ACPI0017 is to provide the possibility of having a
Linux CXL module that can work on a legacy Linux kernel. Linux core
PCI/ACPI, which won't be built as a module, will see the _CID of PNP0A08
and bind a driver to it. If we later loaded a driver for ACPI0016, Linux
wouldn't be able to bind it to the hardware because it has already bound
the PNP0A08 driver. The ACPI0017 device is an opportunity to have an
object to bind a driver to; a Linux driver will use it to walk the CXL
topology and do everything that we would have preferred to do with
ACPI0016.
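
The binding problem can be illustrated with a toy model. This is not how
the Linux ACPI core is implemented; struct acpi_dev and try_bind() are
hypothetical, and the only rules modeled are the ones stated above: a driver
matches on _HID or _CID, and a device that already has a driver cannot get
a second one.

```c
#include <stddef.h>
#include <string.h>

/* Toy ACPI device: one _HID, an optional _CID, and the bound driver name. */
struct acpi_dev {
    const char *hid;
    const char *cid;    /* NULL if the device has no _CID */
    const char *bound;  /* NULL while no driver is attached */
};

/* A driver's ID matches if it equals the device's _HID or its _CID. */
static int ids_match(const struct acpi_dev *d, const char *id)
{
    return strcmp(d->hid, id) == 0 ||
           (d->cid != NULL && strcmp(d->cid, id) == 0);
}

/* Attach `drv` if the ID matches and the device is free; 1 on success. */
static int try_bind(struct acpi_dev *d, const char *drv, const char *id)
{
    if (d->bound != NULL || !ids_match(d, id)) {
        return 0;
    }
    d->bound = drv;
    return 1;
}
```

In this model the built-in PCI root driver claims the ACPI0016 host bridge
through its PNP0A08 _CID at boot, so a CXL module loaded later fails to bind
to ACPI0016 but can still bind to a separate ACPI0017 device.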

There is another motivation for an ACPI0017 device which isn't
implemented here. An operating system needs an attach point for a
non-volatile region provider that understands cross-hostbridge
interleaving. Since QEMU emulation doesn't support interleaving yet,
this is more important on the OS side, for now.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 hw/i386/acpi-build.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index eda62dcd6a..d080e24228 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1513,6 +1513,19 @@ static void init_pci_acpi(Aml *dev, int uid, int type)
     }
 }
 
+static void build_acpi0017(Aml *table)
+{
+    Aml *dev;
+    Aml *scope;
+
+    scope =  aml_scope("_SB");
+    dev = aml_device("CXLM");
+    aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0017")));
+
+    aml_append(scope, dev);
+    aml_append(table, scope);
+}
+
 static void
 build_dsdt(GArray *table_data, BIOSLinker *linker,
            AcpiPmInfo *pm, AcpiMiscInfo *misc,
@@ -1529,6 +1542,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
     int root_bus_limit = 0xFF;
     PCIBus *bus = NULL;
     TPMIf *tpm = tpm_find();
+    bool cxl_present = false;
     int i;
     VMBusBridge *vmbus_bridge = vmbus_bridge_find();
 
@@ -1683,6 +1697,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
 
             /* Handle the ranges for the PXB expanders */
             if (type == CXL) {
+                cxl_present = true;
                 uint64_t base = CXL_HOST_BASE + uid * 0x10000;
                 crs_range_insert(crs_range_set.mem_ranges, base,
                                  base + 0x10000 - 1);
@@ -1690,6 +1705,10 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         }
     }
 
+    if (cxl_present) {
+        build_acpi0017(dsdt);
+    }
+
     /*
      * At this point crs_range_set has all the ranges used by pci
      * busses *other* than PCI0.  These ranges will be excluded from
-- 
2.29.2




* [RFC PATCH 24/25] WIP: i386/cxl: Initialize a host bridge
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (22 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 23/25] Temp: acpi/cxl: Add ACPI0017 (CEDT awareness) Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-11  5:47 ` [RFC PATCH 25/25] qtest/cxl: Add very basic sanity tests Ben Widawsky
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Ben Widawsky, Eduardo Habkost, Sergio Lopez,
	Michael S. Tsirkin, Vishal Verma, open list:Virt,
	open list:sPAPR, Paolo Bonzini, Igor Mammedov, Dan Williams,
	David Gibson, Richard Henderson

This patch allows the primary host bridge to be initialized as a CXL
capable host bridge.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

--
This patch is WIP.
---
 hw/arm/virt.c        |  1 +
 hw/core/machine.c    | 26 ++++++++++++++++++++++++++
 hw/i386/acpi-build.c |  8 +++++++-
 hw/i386/microvm.c    |  1 +
 hw/i386/pc.c         |  1 +
 hw/ppc/spapr.c       |  2 ++
 include/hw/boards.h  |  2 ++
 include/hw/cxl/cxl.h |  4 ++++
 8 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 27dbeb549e..9d1dafea9f 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2475,6 +2475,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
     hc->unplug_request = virt_machine_device_unplug_request_cb;
     hc->unplug = virt_machine_device_unplug_cb;
     mc->nvdimm_supported = true;
+    mc->cxl_supported = false;
     mc->auto_enable_numa_with_memhp = true;
     mc->auto_enable_numa_with_memdev = true;
     mc->default_ram_id = "mach-virt.ram";
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 98b87f76cb..5f37d63da6 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -26,6 +26,7 @@
 #include "sysemu/qtest.h"
 #include "hw/pci/pci.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/cxl/cxl.h"
 #include "migration/vmstate.h"
 
 GlobalProperty hw_compat_5_1[] = {
@@ -491,6 +492,20 @@ static void machine_set_nvdimm_persistence(Object *obj, const char *value,
     nvdimms_state->persistence_string = g_strdup(value);
 }
 
+static bool machine_get_cxl(Object *obj, Error **errp)
+{
+    MachineState *ms = MACHINE(obj);
+
+    return ms->cxl_devices_state->is_enabled;
+}
+
+static void machine_set_cxl(Object *obj, bool value, Error **errp)
+{
+    MachineState *ms = MACHINE(obj);
+
+    ms->cxl_devices_state->is_enabled = value;
+}
+
 void machine_class_allow_dynamic_sysbus_dev(MachineClass *mc, const char *type)
 {
     strList *item = g_new0(strList, 1);
@@ -895,6 +910,16 @@ static void machine_initfn(Object *obj)
                                         "Valid values are cpu, mem-ctrl");
     }
 
+    if (mc->cxl_supported) {
+        Object *obj = OBJECT(ms);
+
+        ms->cxl_devices_state = g_new0(CXLState, 1);
+        object_property_add_bool(obj, "cxl", machine_get_cxl, machine_set_cxl);
+        object_property_set_description(obj, "cxl",
+                                        "Set on/off to enable/disable "
+                                        "CXL instantiation");
+    }
+
     if (mc->cpu_index_to_instance_props && mc->get_default_cpu_node_id) {
         ms->numa_state = g_new0(NumaState, 1);
         object_property_add_bool(obj, "hmat",
@@ -931,6 +956,7 @@ static void machine_finalize(Object *obj)
     g_free(ms->device_memory);
     g_free(ms->nvdimms_state);
     g_free(ms->numa_state);
+    g_free(ms->cxl_devices_state);
 }
 
 bool machine_usb(MachineState *machine)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index d080e24228..465bde0196 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -53,6 +53,7 @@
 #include "sysemu/numa.h"
 #include "sysemu/reset.h"
 #include "hw/hyperv/vmbus-bridge.h"
+#include "hw/cxl/cxl.h"
 
 /* Supported chipsets: */
 #include "hw/southbridge/piix.h"
@@ -1569,8 +1570,13 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
         build_piix4_pci0_int(dsdt);
     } else {
         sb_scope = aml_scope("_SB");
+        /*
+         * XXX: CXL spec calls this "CXL0", but that would require lots of
+         * changes throughout and so even for CXL enabled, we call it "PCI0"
+         */
         dev = aml_device("PCI0");
-        init_pci_acpi(dev, 0, PCIE);
+        init_pci_acpi(dev, 0,
+                machine->cxl_devices_state->is_enabled ? CXL : PCIE);
         aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
         aml_append(sb_scope, dev);
 
diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
index 5428448b70..ed2f992b2a 100644
--- a/hw/i386/microvm.c
+++ b/hw/i386/microvm.c
@@ -656,6 +656,7 @@ static void microvm_class_init(ObjectClass *oc, void *data)
     mc->auto_enable_numa_with_memdev = false;
     mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
     mc->nvdimm_supported = false;
+    mc->cxl_supported = false;
     mc->default_ram_id = "microvm.ram";
 
     /* Avoid relying too much on kernel components */
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index ecfc497f71..a962a77835 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1694,6 +1694,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     hc->unplug = pc_machine_device_unplug_cb;
     mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
     mc->nvdimm_supported = true;
+    mc->cxl_supported = true;
     mc->default_ram_id = "pc.ram";
 
     object_class_property_add(oc, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 227075103e..3d72bad5f2 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -4422,6 +4422,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power9_v2.0");
     mc->has_hotpluggable_cpus = true;
     mc->nvdimm_supported = true;
+    mc->cxl_supported = false;
     smc->resize_hpt_default = SPAPR_RESIZE_HPT_ENABLED;
     fwc->get_dev_path = spapr_get_fw_dev_path;
     nc->nmi_monitor_handler = spapr_nmi;
@@ -4571,6 +4572,7 @@ static void spapr_machine_4_2_class_options(MachineClass *mc)
     smc->default_caps.caps[SPAPR_CAP_FWNMI] = SPAPR_CAP_OFF;
     smc->rma_limit = 16 * GiB;
     mc->nvdimm_supported = false;
+    mc->cxl_supported = false;
 }
 
 DEFINE_SPAPR_MACHINE(4_2, "4.2", false);
diff --git a/include/hw/boards.h b/include/hw/boards.h
index a49e3a6b44..f20ccc15c6 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -205,6 +205,7 @@ struct MachineClass {
     bool ignore_boot_device_suffixes;
     bool smbus_no_migration_support;
     bool nvdimm_supported;
+    bool cxl_supported;
     bool numa_mem_supported;
     bool auto_enable_numa;
     const char *default_ram_id;
@@ -290,6 +291,7 @@ struct MachineState {
     CPUArchIdList *possible_cpus;
     CpuTopology smp;
     struct NVDIMMState *nvdimms_state;
+    struct CXLState *cxl_devices_state;
     struct NumaState *numa_state;
 };
 
diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
index 809ed7de60..6961e47076 100644
--- a/include/hw/cxl/cxl.h
+++ b/include/hw/cxl/cxl.h
@@ -22,5 +22,9 @@
 #define CXL_HOST_BASE 0xD0000000
 #define CXL_WINDOW_MAX 10
 
+typedef struct CXLState {
+    bool is_enabled;
+} CXLState;
+
 #endif
 
-- 
2.29.2




* [RFC PATCH 25/25] qtest/cxl: Add very basic sanity tests
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (23 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 24/25] WIP: i386/cxl: Initialize a host bridge Ben Widawsky
@ 2020-11-11  5:47 ` Ben Widawsky
  2020-11-16 17:21 ` [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Jonathan Cameron
  2020-12-04 14:27 ` Daniel P. Berrangé
  26 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-11  5:47 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laurent Vivier, Thomas Huth, Ben Widawsky, Vishal Verma,
	Paolo Bonzini, Dan Williams

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 tests/qtest/cxl-test.c  | 93 +++++++++++++++++++++++++++++++++++++++++
 tests/qtest/meson.build |  4 ++
 2 files changed, 97 insertions(+)
 create mode 100644 tests/qtest/cxl-test.c

diff --git a/tests/qtest/cxl-test.c b/tests/qtest/cxl-test.c
new file mode 100644
index 0000000000..00eca14faa
--- /dev/null
+++ b/tests/qtest/cxl-test.c
@@ -0,0 +1,93 @@
+/*
+ * QTest testcase for CXL
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest-single.h"
+
+#define QEMU_PXB_CMD "-machine q35 -object memory-backend-file,id=cxl-mem1," \
+                     "share,mem-path=%s,size=512M "                          \
+                     "-device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,"  \
+                     "len-window-base=1,window-base[0]=0x4c0000000,memdev[0]=cxl-mem1"
+#define QEMU_RP "-device cxl-rp,id=rp0,bus=cxl.0,addr=0.0,chassis=0,slot=0"
+
+#define QEMU_T3D "-device cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"
+
+static void cxl_basic_hb(void)
+{
+    qtest_start("-machine q35,cxl");
+    qtest_end();
+}
+
+static void cxl_basic_pxb(void)
+{
+    qtest_start("-machine q35 -device pxb-cxl,bus=pcie.0,uid=0");
+    qtest_end();
+}
+
+static void cxl_pxb_with_window(void)
+{
+    GString *cmdline;
+    char template[] = "/tmp/cxl-test-XXXXXX";
+    const char *tmpfs;
+
+    tmpfs = mkdtemp(template);
+
+    cmdline = g_string_new(NULL);
+    g_string_printf(cmdline, QEMU_PXB_CMD, tmpfs);
+
+    qtest_start(cmdline->str);
+    qtest_end();
+
+    g_string_free(cmdline, TRUE);
+}
+
+static void cxl_root_port(void)
+{
+    GString *cmdline;
+    char template[] = "/tmp/cxl-test-XXXXXX";
+    const char *tmpfs;
+
+    tmpfs = mkdtemp(template);
+
+    cmdline = g_string_new(NULL);
+    g_string_printf(cmdline, QEMU_PXB_CMD " %s", tmpfs, QEMU_RP);
+
+    qtest_start(cmdline->str);
+    qtest_end();
+
+    g_string_free(cmdline, TRUE);
+}
+
+static void cxl_t3d(void)
+{
+    GString *cmdline;
+    char template[] = "/tmp/cxl-test-XXXXXX";
+    const char *tmpfs;
+
+    tmpfs = mkdtemp(template);
+
+    cmdline = g_string_new(NULL);
+    g_string_printf(cmdline, QEMU_PXB_CMD " %s %s", tmpfs, QEMU_RP, QEMU_T3D);
+
+    qtest_start(cmdline->str);
+    qtest_end();
+
+    g_string_free(cmdline, TRUE);
+}
+
+int main(int argc, char **argv)
+{
+    g_test_init(&argc, &argv, NULL);
+
+    qtest_add_func("/pci/cxl/basic_hostbridge", cxl_basic_hb);
+    qtest_add_func("/pci/cxl/basic_pxb", cxl_basic_pxb);
+    qtest_add_func("/pci/cxl/pxb_with_window", cxl_pxb_with_window);
+    qtest_add_func("/pci/cxl/root_port", cxl_root_port);
+    qtest_add_func("/pci/cxl/type3_device", cxl_t3d);
+
+    return g_test_run();
+}
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index c19f1c8503..7c6439b45c 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -22,6 +22,9 @@ qtests_pci = \
   (config_all_devices.has_key('CONFIG_VGA') ? ['display-vga-test'] : []) +                  \
   (config_all_devices.has_key('CONFIG_IVSHMEM_DEVICE') ? ['ivshmem-test'] : [])
 
+qtests_cxl = \
+  (config_all_devices.has_key('CONFIG_CXL') ? ['cxl-test'] : [])
+
 qtests_i386 = \
   (slirp.found() ? ['pxe-test', 'test-netfilter'] : []) +             \
   (config_host.has_key('CONFIG_POSIX') ? ['test-filter-mirror'] : []) +                     \
@@ -47,6 +50,7 @@ qtests_i386 = \
   (config_all_devices.has_key('CONFIG_TPM_TIS_ISA') ? ['tpm-tis-swtpm-test'] : []) +        \
   (config_all_devices.has_key('CONFIG_RTL8139_PCI') ? ['rtl8139-test'] : []) +              \
   qtests_pci +                                                                              \
+  qtests_cxl +                                                                              \
   ['fdc-test',
    'ide-test',
    'hd-geo-test',
-- 
2.29.2




* Re: [RFC PATCH 12/25] acpi/pci: Consolidate host bridge setup
  2020-11-11  5:47 ` [RFC PATCH 12/25] acpi/pci: Consolidate host bridge setup Ben Widawsky
@ 2020-11-12 17:46   ` Ben Widawsky
  2020-11-16 16:45   ` Jonathan Cameron
  1 sibling, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-12 17:46 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, Vishal Verma, Paolo Bonzini,
	Igor Mammedov, Dan Williams, Richard Henderson

On 20-11-10 21:47:11, Ben Widawsky wrote:
> This cleanup will make it easier to add support for CXL to the mix.

This patch needs bios table updates for qtest... I am fixing it...

> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  hw/i386/acpi-build.c | 31 +++++++++++++++++--------------
>  1 file changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 4f66642d88..99b3088c9e 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1486,6 +1486,20 @@ static void build_smb0(Aml *table, I2CBus *smbus, int devnr, int func)
>      aml_append(table, scope);
>  }
>  
> +enum { PCI, PCIE };
> +static void init_pci_acpi(Aml *dev, int uid, int type)
> +{
> +    if (type == PCI) {
> +        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
> +    } else {
> +        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> +        aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
> +        aml_append(dev, build_q35_osc_method());
> +    }
> +}
> +
>  static void
>  build_dsdt(GArray *table_data, BIOSLinker *linker,
>             AcpiPmInfo *pm, AcpiMiscInfo *misc,
> @@ -1514,9 +1528,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>      if (misc->is_piix4) {
>          sb_scope = aml_scope("_SB");
>          dev = aml_device("PCI0");
> -        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> +        init_pci_acpi(dev, 0, PCI);
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
>  
> @@ -1530,11 +1543,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>      } else {
>          sb_scope = aml_scope("_SB");
>          dev = aml_device("PCI0");
> -        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> -        aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> +        init_pci_acpi(dev, 0, PCIE);
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> -        aml_append(dev, build_q35_osc_method());
>          aml_append(sb_scope, dev);
>  
>          if (pm->smi_on_cpuhp) {
> @@ -1636,15 +1646,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>  
>              scope = aml_scope("\\_SB");
>              dev = aml_device("PC%.02X", bus_num);
> -            aml_append(dev, aml_name_decl("_UID", aml_int(bus_num)));
>              aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
> -            if (pci_bus_is_express(bus)) {
> -                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> -                aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> -                aml_append(dev, build_q35_osc_method());
> -            } else {
> -                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> -            }
> +            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
>  
>              if (numa_node != NUMA_NODE_UNASSIGNED) {
>                  aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
> -- 
> 2.29.2
> 
> 



* Re: [RFC PATCH 18/25] hw/cxl/device: Add a memory device (8.2.8.5)
  2020-11-11  5:47 ` [RFC PATCH 18/25] hw/cxl/device: Add a memory device (8.2.8.5) Ben Widawsky
@ 2020-11-12 18:37   ` Eric Blake
  2020-11-13  7:47     ` Markus Armbruster
  0 siblings, 1 reply; 64+ messages in thread
From: Eric Blake @ 2020-11-12 18:37 UTC (permalink / raw)
  To: Ben Widawsky, qemu-devel
  Cc: Eduardo Habkost, Michael S. Tsirkin, Vishal Verma,
	Dr. David Alan Gilbert, Markus Armbruster, Igor Mammedov,
	Paolo Bonzini, Dan Williams, Richard Henderson

On 11/10/20 11:47 PM, Ben Widawsky wrote:
> A CXL memory device (AKA Type 3) is a CXL component that contains some
> combination of volatile and persistent memory. It also implements the
> previously defined mailbox interface as well as the memory device
> firmware interface.
> 
> The following example will create a 256M device in a 512M window:
> 
> -object "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M"
> -device "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---

> +++ b/qapi/machine.json
> @@ -1394,6 +1394,7 @@
>  { 'union': 'MemoryDeviceInfo',
>    'data': { 'dimm': 'PCDIMMDeviceInfo',
>              'nvdimm': 'PCDIMMDeviceInfo',
> +            'cxl': 'PCDIMMDeviceInfo',
>              'virtio-pmem': 'VirtioPMEMDeviceInfo',
>              'virtio-mem': 'VirtioMEMDeviceInfo'
>            }

Missing documentation of the new data type, and the fact that it will be
introduced in 6.0.  Also, Markus has been trying to get rid of so-called
"simple unions" in favor of "flat unions" - every time we modify an
existing simple union, it is worth asking if it is time to first flatten
this.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




* Re: [RFC PATCH 16/25] hw/pxb/cxl: Add "windows" for host bridges
  2020-11-11  5:47 ` [RFC PATCH 16/25] hw/pxb/cxl: Add "windows" for host bridges Ben Widawsky
@ 2020-11-13  0:49   ` Ben Widawsky
  2020-11-23 19:12     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-13  0:49 UTC (permalink / raw)
  To: qemu-devel, Philippe Mathieu-Daudé
  Cc: Vishal Verma, Dan Williams, Michael S. Tsirkin

On 20-11-10 21:47:15, Ben Widawsky wrote:
> In a bare metal CXL capable system, system firmware will program
> physical address ranges on the host. This is done by programming
> internal registers that aren't typically known to OS. These address
> ranges might be contiguous or interleaved across host bridges.
> 
> For a QEMU guest a new construct is introduced allowing passing a memory
> backend to the host bridge for this same purpose. Each memory backend
> needs to be passed to the host bridge as well as any device that will be
> emulating that memory (not implemented here).
> 
> I'm hopeful the interleaving work in the link can be re-purposed here
> (see Link).
> 
> An example that creates a host bridge with a 512M window at 0x4c0000000:
>  -object memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M
>  -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,len-memory-base=1,memory-base\[0\]=0x4c0000000,memory\[0\]=cxl-mem1
> 
> Link: https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg03680.html
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

Hi Phil, wanted to call you out specifically on this one.

The newly released CXL 2.0 specification (which from a topology perspective can
be thought of as very PCIe-like) allows for interleaving of memory access.

Below is an example of two host bridges, each with two root ports, and 5 devices
(two of which are behind a switch).

RP: Root Port
USP: Upstream Port
DSP: Downstream Port
Type 3 device: Memory Device with persistent or volatile memory.

+-------------------------+      +-------------------------+
|                         |      |                         |
|   CXL 2.0 Host Bridge   |      |   CXL 2.0 Host Bridge   |
|                         |      |                         |
|  +------+     +------+  |      |  +------+     +------+  |
|  |  RP  |     |  RP  |  |      |  |  RP  |     |  RP  |  |
+--+------+-----+------+--+      +--+------+-----+------+--+
      |            |                   |               \--
      |            |                   |        +-------+-\--+------+
   +------+    +-------+            +-------+   |       |USP |      |
   |Type 3|    |Type 3 |            |Type 3 |   |       +----+      |
   |Device|    |Device |            |Device |   |                   |
   +------+    +-------+            +-------+   | +----+     +----+ |
                                                | |DSP |     |DSP | |
                                                +-+----+-----+----+-+
                                                    |          |
                                                +------+    +-------+
                                                |Type 3|    |Type 3 |
                                                |Device|    |Device |
                                                +------+    +-------+

Considering this picture... interleaving of memory access can happen in all 3
layers in the topology.

- Memory access can be interleaved across host bridges. This is determined by
  the physical addresses chosen for the devices; those address ranges are
  platform specific and, for now, not part of the 2.0 spec.

- Memory access can be interleaved across root ports in a host bridge.

- Finally, memory access can be interleaved across downstream ports.

I'd like to start the discussion about how this might overlap with the patch
series you've last been working on to interleave memory. Do you have any
thoughts or ideas on how I should go about doing this?
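
To make the interleaving described above concrete, here is a minimal sketch of
how a target could be selected at any one of the three layers. It assumes plain
modulo interleaving over a power-of-two granularity; the helper name and its
parameters are hypothetical, it is not code from this series, and it does not
model how the spec encodes ways and granularity in the HDM decoder fields.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * hpa         - host physical address being accessed
 * base        - start of the interleaved window
 * granularity - bytes routed to one target before moving to the next
 * ways        - number of targets (host bridges, root ports, or DSPs)
 *
 * Returns the index of the target that would service the access.
 */
static inline unsigned interleave_target(uint64_t hpa, uint64_t base,
                                         uint64_t granularity, unsigned ways)
{
    assert(hpa >= base);
    assert(granularity != 0 && ways != 0);
    return (unsigned)(((hpa - base) / granularity) % ways);
}
```

With a window at 0x4c0000000, a 256B granularity and 2 ways, consecutive 256B
chunks would alternate between target 0 and target 1. The same calculation
could be repeated per layer with that layer's own granularity and ways.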



* Re: [RFC PATCH 18/25] hw/cxl/device: Add a memory device (8.2.8.5)
  2020-11-12 18:37   ` Eric Blake
@ 2020-11-13  7:47     ` Markus Armbruster
  2020-11-25 16:53       ` Ben Widawsky
  0 siblings, 1 reply; 64+ messages in thread
From: Markus Armbruster @ 2020-11-13  7:47 UTC (permalink / raw)
  To: Eric Blake
  Cc: Ben Widawsky, Eduardo Habkost, Michael S. Tsirkin, Vishal Verma,
	qemu-devel, Dr. David Alan Gilbert, Paolo Bonzini, Igor Mammedov,
	Dan Williams, Richard Henderson

Eric Blake <eblake@redhat.com> writes:

> On 11/10/20 11:47 PM, Ben Widawsky wrote:
>> A CXL memory device (AKA Type 3) is a CXL component that contains some
>> combination of volatile and persistent memory. It also implements the
>> previously defined mailbox interface as well as the memory device
>> firmware interface.
>> 
>> The following example will create a 256M device in a 512M window:
>> 
>> -object "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M"
>> -device "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"
>> 
>> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
>> ---
>
>> +++ b/qapi/machine.json
>> @@ -1394,6 +1394,7 @@
>>  { 'union': 'MemoryDeviceInfo',
>>    'data': { 'dimm': 'PCDIMMDeviceInfo',
>>              'nvdimm': 'PCDIMMDeviceInfo',
>> +            'cxl': 'PCDIMMDeviceInfo',
>>              'virtio-pmem': 'VirtioPMEMDeviceInfo',
>>              'virtio-mem': 'VirtioMEMDeviceInfo'
>>            }
>
> Missing documentation of the new data type, and the fact that it will be
> introduced in 6.0.

Old wish list item: improve the QAPI schema frontend to flag this.

>                     Also, Markus has been trying to get rid of so-called
> "simple unions" in favor of "flat unions" - every time we modify an
> existing simple union, it is worth asking if it is time to first flatten
> this.

0. Simple unions can be transformed into flat unions.  The
transformation can either preserve the nested wire format, or flatten
it.  See docs/devel/qapi-code-gen.txt "A simple union can always be
re-written as a flat union ..."

1. No new simple unions.

2. Existing simple unions that can be flattened without breaking
backward compatibility have long been flattened.

3. The remaining simple unions are part of QMP, where we need to
preserve the wire format.  We could turn them into flat unions preserving
the wire format.  Only worthwhile if we kill simple unions and simplify
scripts/qapi/.  Opportunity to make the flat union syntax less
cumbersome.  Not done due to lack of time.

4. Kevin and I have been experimenting with ways to provide both flat
and nested wire format.  This would pave the way for orderly deprecation
of the nested wire format.  May not be practical for QMP output.




* Re: [RFC PATCH 03/25] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
  2020-11-11  5:47 ` [RFC PATCH 03/25] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5) Ben Widawsky
@ 2020-11-16 12:03   ` Jonathan Cameron
  2020-11-16 19:19     ` Ben Widawsky
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-16 12:03 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Vishal Verma, Dan Williams, qemu-devel

On Tue, 10 Nov 2020 21:47:02 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> A CXL 2.0 component is any entity in the CXL topology. All components
> have an analogous function in PCIe. Except for the CXL host bridge, all
> have a PCIe config space that is accessible via the common PCIe
> mechanisms. CXL components are enumerated via DVSEC fields in the
> extended PCIe header space. CXL components will minimally implement some
> subset of CXL.mem and CXL.cache registers defined in 8.2.5 of the CXL
> 2.0 specification. Two headers and a utility library are introduced to
> support the minimum functionality needed to enumerate components.
> 
> The cxl_pci header manages bits associated with PCI, specifically the
> DVSEC and related fields. The cxl_component.h variant has data
> structures and APIs that are useful for drivers implementing any of the
> CXL 2.0 components. The library takes care of making use of the DVSEC
> bits and the CXL.[mem|cache] registers.
> 
> None of the mechanisms required to enumerate a CXL capable hostbridge
> are introduced at this point.
> 
> Note that the CXL.mem and CXL.cache registers used are always 4B wide.
> It's possible in the future that this constraint will not hold.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> --
> It's tempting to have a more generalized DVSEC infrastructure. As far as
> I can tell, the amount this would actually save in terms of code is
> minimal because most of DVSEC is vendor specific.

Agreed.  Probably not worth bothering with generic infrastructure for 2.5 DW.

A few comments inline.

Jonathan


> ---
>  MAINTAINERS                    |   6 ++
>  hw/Kconfig                     |   1 +
>  hw/cxl/Kconfig                 |   3 +
>  hw/cxl/cxl-component-utils.c   | 192 +++++++++++++++++++++++++++++++++
>  hw/cxl/cxl-device-utils.c      |   0
>  hw/cxl/meson.build             |   3 +
>  hw/meson.build                 |   1 +
>  include/hw/cxl/cxl.h           |  17 +++
>  include/hw/cxl/cxl_component.h | 181 +++++++++++++++++++++++++++++++
>  include/hw/cxl/cxl_pci.h       | 133 +++++++++++++++++++++++
>  10 files changed, 537 insertions(+)
>  create mode 100644 hw/cxl/Kconfig
>  create mode 100644 hw/cxl/cxl-component-utils.c
>  create mode 100644 hw/cxl/cxl-device-utils.c
>  create mode 100644 hw/cxl/meson.build
>  create mode 100644 include/hw/cxl/cxl.h
>  create mode 100644 include/hw/cxl/cxl_component.h
>  create mode 100644 include/hw/cxl/cxl_pci.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c1d16026ba..02b8e2274d 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2184,6 +2184,12 @@ F: qapi/block*.json
>  F: qapi/transaction.json
>  T: git https://repo.or.cz/qemu/armbru.git block-next
>  
> +Compute Express Link
> +M: Ben Widawsky <ben.widawsky@intel.com>
> +S: Supported
> +F: hw/cxl/
> +F: include/hw/cxl/
> +
>  Dirty Bitmaps
>  M: Eric Blake <eblake@redhat.com>
>  M: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> diff --git a/hw/Kconfig b/hw/Kconfig
> index 4de1797ffd..efed27805a 100644
> --- a/hw/Kconfig
> +++ b/hw/Kconfig
> @@ -6,6 +6,7 @@ source audio/Kconfig
>  source block/Kconfig
>  source char/Kconfig
>  source core/Kconfig
> +source cxl/Kconfig
>  source display/Kconfig
>  source dma/Kconfig
>  source gpio/Kconfig
> diff --git a/hw/cxl/Kconfig b/hw/cxl/Kconfig
> new file mode 100644
> index 0000000000..8e67519b16
> --- /dev/null
> +++ b/hw/cxl/Kconfig
> @@ -0,0 +1,3 @@
> +config CXL
> +    bool
> +    default y if PCI_EXPRESS
> diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
> new file mode 100644
> index 0000000000..c52bd5bfc7
> --- /dev/null
> +++ b/hw/cxl/cxl-component-utils.c
> @@ -0,0 +1,192 @@
> +/*
> + * CXL Utility library for components
> + *
> + * Copyright(C) 2020 Intel Corporation.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/log.h"
> +#include "hw/pci/pci.h"
> +#include "hw/cxl/cxl.h"
> +
> +static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
> +                                       unsigned size)
> +{
> +    CXLComponentState *cxl_cstate = opaque;
> +    ComponentRegisters *cregs = &cxl_cstate->crb;
> +    uint32_t *cache_mem = cregs->cache_mem_registers;
> +
> +    if (size != 4) {
> +        qemu_log_mask(LOG_UNIMP, "%uB component register read (RAZ)\n", size);
> +        return 0;
> +    }
> +
> +    if (cregs->special_ops && cregs->special_ops->read) {
> +        return cregs->special_ops->read(cxl_cstate, offset, size);
> +    } else {
> +        return cache_mem[offset >> 2];
> +    }
> +}
> +
> +static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
> +                                    unsigned size)
> +{
> +    CXLComponentState *cxl_cstate = opaque;
> +    ComponentRegisters *cregs = &cxl_cstate->crb;
> +
> +    if (size != 4) {
> +        qemu_log_mask(LOG_UNIMP, "%uB component register write (WI)\n", size);
> +        return;
> +    }
> +
> +    if (cregs->special_ops && cregs->special_ops->write) {
> +        cregs->special_ops->write(cxl_cstate, offset, value, size);
> +    }
> +}
> +
> +static const MemoryRegionOps cache_mem_ops = {
> +    .read = cxl_cache_mem_read_reg,
> +    .write = cxl_cache_mem_write_reg,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 4,
> +        .max_access_size = 4,
> +    },
> +    .impl = {
> +        .min_access_size = 4,
> +        .max_access_size = 4,
> +    },
> +};
> +
> +void cxl_component_register_block_init(Object *obj,
> +                                       CXLComponentState *cxl_cstate,
> +                                       const char *type)

What the type parameter means is not immediately obvious, so docs would be appreciated.

> +{
> +    ComponentRegisters *cregs = &cxl_cstate->crb;
> +
> +    memory_region_init(&cregs->component_registers, obj, type, 0x10000);
> +    memory_region_init_io(&cregs->io, obj, NULL, cregs, ".io", 0x1000);
> +    memory_region_init_io(&cregs->cache_mem, obj, &cache_mem_ops, cregs,
> +                          ".cache_mem", 0x1000);
> +
> +    memory_region_add_subregion(&cregs->component_registers, 0, &cregs->io);
> +    memory_region_add_subregion(&cregs->component_registers, 0x1000,
> +                                &cregs->cache_mem);
> +}
> +
> +static void ras_init_common(uint32_t *reg_state)
> +{
> +    reg_state[R_CXL_RAS_UNC_ERR_STATUS] = 0;
> +    reg_state[R_CXL_RAS_UNC_ERR_MASK] = 0x1efff;
> +    reg_state[R_CXL_RAS_UNC_ERR_SEVERITY] = 0x1efff;
> +    reg_state[R_CXL_RAS_COR_ERR_STATUS] = 0;
> +    reg_state[R_CXL_RAS_COR_ERR_MASK] = 0x3f;
> +    reg_state[R_CXL_RAS_ERR_CAP_CTRL] = 0; /* CXL switches and devices must set */
> +}
> +
> +static void hdm_init_common(uint32_t *reg_state)
> +{
> +    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0);
> +    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 0);
> +}
> +
> +void cxl_component_register_init_common(uint32_t *reg_state, enum reg_type type)
> +{
> +    int caps = 0;

> +    switch (type) {
> +    case CXL2_DOWNSTREAM_PORT:
> +    case CXL2_DEVICE:
> +        /* CAP, RAS, Link */
> +        caps = 3;
> +        break;
> +    case CXL2_UPSTREAM_PORT:
> +    case CXL2_TYPE3_DEVICE:
> +    case CXL2_LOGICAL_DEVICE:
> +        /* + HDM */
> +        caps = 4;
> +        break;
> +    case CXL2_ROOT_PORT:
> +        /* + Extended Security, + Snoop */
> +        caps = 6;
> +        break;
> +    default:
> +        abort();
> +    }
> +
> +    memset(reg_state, 0, 0x1000);
> +
> +    /* CXL Capability Header Register */
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ID, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, VERSION, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 1);
> +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
> +
> +
> +#define init_cap_reg(reg, id, version)                                        \
> +    do {                                                                      \
> +        int which = R_CXL_##reg##_CAPABILITY_HEADER;                          \
> +        reg_state[which] = FIELD_DP32(reg_state[which],                       \
> +                                      CXL_##reg##_CAPABILITY_HEADER, ID, id); \
> +        reg_state[which] =                                                    \
> +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER,       \
> +                       VERSION, version);                                     \
> +        reg_state[which] =                                                    \
> +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER, PTR,  \
> +                       CXL_##reg##_REGISTERS_OFFSET);                         \
> +    } while (0)
> +
> +    init_cap_reg(RAS, 2, 1);
> +    ras_init_common(reg_state);
> +
> +    init_cap_reg(LINK, 4, 2);
> +
> +    if (caps < 4) {
> +        return;
> +    }
> +
> +    init_cap_reg(HDM, 5, 1);
> +    hdm_init_common(reg_state);
> +
> +    if (caps < 6) {
> +        return;
> +    }
> +
> +    init_cap_reg(EXTSEC, 6, 1);
> +    init_cap_reg(SNOOP, 8, 1);
> +
> +#undef init_cap_reg
> +}
> +
> +/*
> + * Helper to create a DVSEC header for a CXL entity. The caller is responsible
> + * for tracking the valid offset.
> + *
> + * This function will build the DVSEC header on behalf of the caller and then
> + * copy in the remaining data for the vendor specific bits.
> + */
> +void cxl_component_create_dvsec(CXLComponentState *cxl, uint16_t length,
> +                                uint16_t type, uint8_t rev, uint8_t *body)
> +{
> +    PCIDevice *pdev = cxl->pdev;
> +    uint16_t offset = cxl->dvsec_offset;
> +
> +    assert(offset >= PCI_CFG_SPACE_SIZE && offset < PCI_CFG_SPACE_EXP_SIZE);

Better perhaps to check if offset + length < PCI_CFG_SPACE_EXP_SIZE?
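
A standalone sketch of the end-of-range check, for illustration only. The
helper name is hypothetical, the two constants repeat the values from the Linux
pci_regs.h copy (256 and 4096), and I use <= on the assumption that a DVSEC
ending exactly at the end of config space is acceptable; a strict < would be
the more conservative choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Repeated here so the sketch builds on its own */
#define PCI_CFG_SPACE_SIZE     0x100
#define PCI_CFG_SPACE_EXP_SIZE 0x1000

/*
 * Hypothetical helper: true if [offset, offset + length) lies entirely
 * within the PCIe extended configuration space.
 */
static inline bool dvsec_fits(uint16_t offset, uint16_t length)
{
    /* Widen before adding so the sum cannot wrap in 16 bits */
    uint32_t end = (uint32_t)offset + length;

    return offset >= PCI_CFG_SPACE_SIZE && end <= PCI_CFG_SPACE_EXP_SIZE;
}
```

The existing assert only looks at the start offset, so a DVSEC starting at
0xff0 with length 0x20 would pass it but be rejected here.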

> +    assert((length & 0xf000) == 0);
> +    assert((rev & 0xf0) == 0);

I'd prefer to express this as a mask of the bits we do want, so the reader
doesn't have to look back to see how wide rev is. Something like...

assert ((rev & ~0xf) == 0);
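
Applied to both fields, that could look like the sketch below. The helper name
is hypothetical, the field widths (12-bit length, 4-bit revision) are taken
from the shifts in the quoted code, and 0x1e98 is the CXL vendor ID as I
understand it.

```c
#include <assert.h>
#include <stdint.h>

#define CXL_VENDOR_ID 0x1e98

/*
 * Hypothetical helper: pack DVSEC header 1.
 * Bits 15:0 vendor ID, bits 19:16 revision, bits 31:20 length.
 */
static inline uint32_t dvsec_header1(uint16_t length, uint8_t rev)
{
    /* Each mask names the bits that are allowed, independent of the C type */
    assert((length & ~0xfffu) == 0);
    assert((rev & ~0xfu) == 0);

    return ((uint32_t)length << 20) | ((uint32_t)rev << 16) | CXL_VENDOR_ID;
}
```

Writing the checks this way keeps them correct even if the parameter types are
widened later.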

> +
> +    /* Create the DVSEC in the MCFG space */
> +    pcie_add_capability(pdev, PCI_EXT_CAP_ID_DVSEC, 1, offset, length);
> +    pci_set_long(pdev->config + offset + PCIE_DVSEC_HEADER_OFFSET,
> +                 (length << 20) | (rev << 16) | CXL_VENDOR_ID);
> +    pci_set_word(pdev->config + offset + PCIE_DVSEC_ID_OFFSET, type);
> +    memcpy(pdev->config + offset + sizeof(struct dvsec_header),
> +           body + sizeof(struct dvsec_header),
> +           length - sizeof(struct dvsec_header));
> +
> +    /* Update state for future DVSEC additions */
> +    range_init_nofail(&cxl->dvsecs[type], cxl->dvsec_offset, length);
> +    cxl->dvsec_offset += length;
> +}
> diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> new file mode 100644
> index 0000000000..e69de29bb2
> diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> new file mode 100644
> index 0000000000..00c3876a0f
> --- /dev/null
> +++ b/hw/cxl/meson.build
> @@ -0,0 +1,3 @@
> +softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
> +  'cxl-component-utils.c',
> +))
> diff --git a/hw/meson.build b/hw/meson.build
> index 010de7219c..3e440c341a 100644
> --- a/hw/meson.build
> +++ b/hw/meson.build
> @@ -6,6 +6,7 @@ subdir('block')
>  subdir('char')
>  subdir('core')
>  subdir('cpu')
> +subdir('cxl')
>  subdir('display')
>  subdir('dma')
>  subdir('gpio')
> diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> new file mode 100644
> index 0000000000..55f6cc30a5
> --- /dev/null
> +++ b/include/hw/cxl/cxl.h
> @@ -0,0 +1,17 @@
> +/*
> + * QEMU CXL Support
> + *
> + * Copyright (c) 2020 Intel
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#ifndef CXL_H
> +#define CXL_H
> +
> +#include "cxl_pci.h"
> +#include "cxl_component.h"
> +
> +#endif
> +
> diff --git a/include/hw/cxl/cxl_component.h b/include/hw/cxl/cxl_component.h
> new file mode 100644
> index 0000000000..014d9d10d3
> --- /dev/null
> +++ b/include/hw/cxl/cxl_component.h
> @@ -0,0 +1,181 @@
> +/*
> + * QEMU CXL Component
> + *
> + * Copyright (c) 2020 Intel
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#ifndef CXL_COMPONENT_H
> +#define CXL_COMPONENT_H
> +
> +/* CXL 2.0 - 8.2.4 */
> +#define CXL2_COMPONENT_IO_REGION_SIZE 0x1000
> +#define CXL2_COMPONENT_CM_REGION_SIZE 0x1000
> +#define CXL2_COMPONENT_BLOCK_SIZE 0x10000
> +
> +#include "qemu/range.h"
> +#include "qemu/typedefs.h"
> +#include "hw/register.h"
> +
> +enum reg_type {
> +    CXL2_DEVICE,
> +    CXL2_TYPE3_DEVICE,
> +    CXL2_LOGICAL_DEVICE,
> +    CXL2_ROOT_PORT,
> +    CXL2_UPSTREAM_PORT,
> +    CXL2_DOWNSTREAM_PORT
> +};
> +
> +/*
> + * Capability registers are defined at the top of the CXL.cache/mem region and
> + * are packed. For our purposes we will always define the caps in the same
> + * order.
> + * CXL 2.0 - 8.2.5 Table 142 for details.
> + */
> +
> +/* CXL 2.0 - 8.2.5.1 */
> +REG32(CXL_CAPABILITY_HEADER, 0)
> +    FIELD(CXL_CAPABILITY_HEADER, ID, 0, 16)
> +    FIELD(CXL_CAPABILITY_HEADER, VERSION, 16, 4)
> +    FIELD(CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 20, 4)
> +    FIELD(CXL_CAPABILITY_HEADER, ARRAY_SIZE, 24, 8)
> +
> +#define CXLx_CAPABILITY_HEADER(type, offset)                  \
> +    REG32(CXL_##type##_CAPABILITY_HEADER, offset)             \
> +        FIELD(CXL_##type##_CAPABILITY_HEADER, ID, 0, 16)      \
> +        FIELD(CXL_##type##_CAPABILITY_HEADER, VERSION, 16, 4) \
> +        FIELD(CXL_##type##_CAPABILITY_HEADER, PTR, 20, 12)
> +CXLx_CAPABILITY_HEADER(RAS, 0x4)
> +CXLx_CAPABILITY_HEADER(LINK, 0x8)
> +CXLx_CAPABILITY_HEADER(HDM, 0xc)
> +CXLx_CAPABILITY_HEADER(EXTSEC, 0x10)
> +CXLx_CAPABILITY_HEADER(SNOOP, 0x14)
> +
> +/*
> + * Capability structures contain the actual registers that the CXL component
> + * implements. Some of these are specific to certain types of components, but
> + * this implementation leaves enough space regardless.
> + */
> +/* 8.2.5.9 - CXL RAS Capability Structure */
> +#define CXL_RAS_REGISTERS_OFFSET 0x80 /* Give ample space for caps before this */
> +#define CXL_RAS_REGISTERS_SIZE   0x58
> +REG32(CXL_RAS_UNC_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET)
> +REG32(CXL_RAS_UNC_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x4)
> +REG32(CXL_RAS_UNC_ERR_SEVERITY, CXL_RAS_REGISTERS_OFFSET + 0x8)
> +REG32(CXL_RAS_COR_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET + 0xc)
> +REG32(CXL_RAS_COR_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x10)
> +REG32(CXL_RAS_ERR_CAP_CTRL, CXL_RAS_REGISTERS_OFFSET + 0x14)

Maybe a comment on header log registers coming after this?
Will make it obvious why the size is 0x58 above.
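
For reference, the arithmetic I have in mind, as a standalone sketch. It
assumes my reading of the spec, namely that sixteen 32-bit header log registers
start at offset 0x18 of the RAS capability; the two HEADER macro names are made
up for this example.

```c
#include <assert.h>

/* Assumption: header log follows ERR_CAP_CTRL (0x14) at 0x18, 16 dwords long */
#define CXL_RAS_ERR_HEADER_OFF 0x18
#define CXL_RAS_ERR_HEADER_NUM 16

/* Size of the whole RAS block: everything up to the end of the header log */
#define CXL_RAS_REGISTERS_SIZE \
    (CXL_RAS_ERR_HEADER_OFF + CXL_RAS_ERR_HEADER_NUM * 4)

/* 0x18 + 16 * 4 = 0x58, matching the value in the patch */
_Static_assert(CXL_RAS_REGISTERS_SIZE == 0x58, "unexpected RAS block size");
```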


> +
> +/* 8.2.5.10 - CXL Security Capability Structure */
> +#define CXL_SEC_REGISTERS_OFFSET (CXL_RAS_REGISTERS_OFFSET + CXL_RAS_REGISTERS_SIZE)
> +#define CXL_SEC_REGISTERS_SIZE   0 /* We don't implement 1.1 downstream ports */
> +
> +/* 8.2.5.11 - CXL Link Capability Structure */
> +#define CXL_LINK_REGISTERS_OFFSET (CXL_SEC_REGISTERS_OFFSET + CXL_SEC_REGISTERS_SIZE)
> +#define CXL_LINK_REGISTERS_SIZE   0x38

Bit odd to introduce this but not define anything... Can't we bring these
in when we need them later?

> +
> +/* 8.2.5.12 - CXL HDM Decoder Capability Structure */
> +#define HDM_DECODE_MAX 10 /* 8.2.5.12.1 */
> +#define CXL_HDM_REGISTERS_OFFSET \
> +    (CXL_LINK_REGISTERS_OFFSET + CXL_LINK_REGISTERS_SIZE) /* 8.2.5.12 */
> +#define CXL_HDM_REGISTERS_SIZE (0x20 + HDM_DECODE_MAX * 10)
> +#define HDM_DECODER_INIT(n, base)                          \
> +    REG32(CXL_HDM_DECODER##n##_BASE_LO, base + 0x10)       \

Offset n should be included in the address calc.  It's always 0 at the moment
but might as well put it in now.  Mind you there is something a bit odd
in the spec I'm looking at. Nothing defined at 0x2c but no reserved line
either in the table.
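
Roughly what I have in mind for folding n into the address. This is a sketch
only: the 0x20 stride per decoder is my assumption from the register table,
and the helper macro names are invented for the example.

```c
#include <assert.h>

/* Assumed: each decoder's register set repeats every 0x20 bytes. */
#define HDM_DECODER_STRIDE 0x20

/* Hypothetical helper: offset of a register within decoder n. */
#define HDM_DECODER_REG(base, n, reg_off) \
    ((base) + (n) * HDM_DECODER_STRIDE + (reg_off))

#define HDM_DECODER_BASE_LO(base, n) HDM_DECODER_REG(base, n, 0x10)
#define HDM_DECODER_CTRL(base, n)    HDM_DECODER_REG(base, n, 0x20)

/* Decoder 0 keeps the offsets used in the patch; decoder 1 lands 0x20 later. */
static_assert(HDM_DECODER_BASE_LO(0, 0) == 0x10, "decoder 0 unchanged");
static_assert(HDM_DECODER_BASE_LO(0, 1) == 0x30, "decoder 1 at +0x20");
```

With n == 0 the result is identical to what the patch computes today, so
nothing changes until a second decoder is instantiated.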


> +        FIELD(CXL_HDM_DECODER##n##_BASE_LO, L, 28, 4)      \
> +    REG32(CXL_HDM_DECODER##n##_BASE_HI, base + 0x14)       \
> +        FIELD(CXL_HDM_DECODER##n##_BASE_HI, H, 0, 32)      \
> +    REG32(CXL_HDM_DECODER##n##_SIZE_LO, base + 0x18)       \

Consistency would argue for fields for this and the next.

> +    REG32(CXL_HDM_DECODER##n##_SIZE_HI, base + 0x1C)       \
> +    REG32(CXL_HDM_DECODER##n##_CTRL, base + 0x20)          \
> +        FIELD(CXL_HDM_DECODER##n##_CTRL, IG, 0, 4)         \
> +        FIELD(CXL_HDM_DECODER##n##_CTRL, IW, 4, 4)         \
> +        FIELD(CXL_HDM_DECODER##n##_CTRL, LOCK, 8, 1)       \
LOCK_ON_COMMIT? The semantics of that bit are unusual enough that it's
probably worth naming the field to call them out.

> +        FIELD(CXL_HDM_DECODER##n##_CTRL, COMMIT, 9, 1)     \
> +        FIELD(CXL_HDM_DECODER##n##_CTRL, COMMITTED, 10, 1) \
> +        FIELD(CXL_HDM_DECODER##n##_CTRL, ERROR, 11, 1)     \
> +        FIELD(CXL_HDM_DECODER##n##_CTRL, TYPE, 12, 1)      \
> +    REG32(CXL_HDM_DECODER##n##_TARGET_LIST_LO, 0x24)       \
> +    REG32(CXL_HDM_DECODER##n##_TARGET_LIST_HI, 0x28)

Blank line here would make it easier to spot the end of the macro

> +REG32(CXL_HDM_DECODER_CAPABILITY, 0)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0, 4)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, TARGET_COUNT, 4, 4)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, INTERLEAVE_256B, 8, 1)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, INTELEAVE_4K, 9, 1)
> +    FIELD(CXL_HDM_DECODER_CAPABILITY, POISON_ON_ERR_CAP, 10, 1)
> +REG32(CXL_HDM_DECODER_GLOBAL_CONTROL, 0)
> +    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, POISON_ON_ERR_EN, 0, 1)
> +    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 1, 1)
> +
> +HDM_DECODER_INIT(0, CXL_HDM_REGISTERS_OFFSET);
> +
> +/* 8.2.5.13 - CXL Extended Security Capability Structure (Root complex only) */
> +#define EXTSEC_ENTRY_MAX        256
> +#define CXL_EXTSEC_REGISTERS_OFFSET (CXL_HDM_REGISTERS_OFFSET + CXL_HDM_REGISTERS_SIZE)
> +#define CXL_EXTSEC_REGISTERS_SIZE   (8 * EXTSEC_ENTRY_MAX + 4)
> +
> +/* 8.2.5.14 - CXL IDE Capability Structure */
> +#define CXL_IDE_REGISTERS_OFFSET (CXL_EXTSEC_REGISTERS_OFFSET + CXL_EXTSEC_REGISTERS_SIZE)
> +#define CXL_IDE_REGISTERS_SIZE   0
> +
> +/* 8.2.5.15 - CXL Snoop Filter Capability Structure */
> +#define CXL_SNOOP_REGISTERS_OFFSET (CXL_IDE_REGISTERS_OFFSET + CXL_IDE_REGISTERS_SIZE)
> +#define CXL_SNOOP_REGISTERS_SIZE   0x8
> +
> +_Static_assert((CXL_SNOOP_REGISTERS_OFFSET + CXL_SNOOP_REGISTERS_SIZE) < 0x1000,
> +               "No space for registers");
> +
> +typedef struct component_registers {
> +    /*
> +     * Main memory region to be registered with QEMU core.
> +     */
> +    MemoryRegion component_registers;
> +
> +    /*
> +     * 8.2.4 Table 141:
> +     *   0x0000 - 0x0fff CXL.io registers
> +     *   0x1000 - 0x1fff CXL.cache and CXL.mem
> +     *   0x2000 - 0xdfff Implementation specific
> +     *   0xe000 - 0xe3ff CXL ARB/MUX registers
> +     *   0xe400 - 0xffff RSVD
> +     */
> +    uint32_t io_registers[CXL2_COMPONENT_IO_REGION_SIZE >> 2];
> +    MemoryRegion io;
> +
> +    uint32_t cache_mem_registers[CXL2_COMPONENT_CM_REGION_SIZE >> 2];
> +    MemoryRegion cache_mem;
> +
> +    MemoryRegion impl_specific;
> +    MemoryRegion arb_mux;
> +    MemoryRegion rsvd;
> +
> +    /* special_ops is used for any component that needs any specific handling */
> +    MemoryRegionOps *special_ops;
> +} ComponentRegisters;
> +
> +/*
> + * A CXL component represents all entities in a CXL hierarchy. This includes,
> + * host bridges, root ports, upstream/downstream ports, and devices
> + */
> +typedef struct cxl_component {
> +    ComponentRegisters crb;
> +    union {
> +        struct {
> +            Range dvsecs[CXL20_MAX_DVSEC];
> +            uint16_t dvsec_offset;
> +            struct PCIDevice *pdev;
> +        };
> +    };
> +} CXLComponentState;
> +
> +void cxl_component_register_block_init(Object *obj,
> +                                       CXLComponentState *cxl_cstate,
> +                                       const char *type);
> +void cxl_component_register_init_common(uint32_t *reg_state,
> +                                        enum reg_type type);
> +
> +void cxl_component_create_dvsec(CXLComponentState *cxl_cstate, uint16_t length,
> +                                uint16_t type, uint8_t rev, uint8_t *body);
> +
> +#endif
> diff --git a/include/hw/cxl/cxl_pci.h b/include/hw/cxl/cxl_pci.h
> new file mode 100644
> index 0000000000..b403770424
> --- /dev/null
> +++ b/include/hw/cxl/cxl_pci.h
> @@ -0,0 +1,133 @@
> +/*
> + * QEMU CXL PCI interfaces
> + *
> + * Copyright (c) 2020 Intel
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#ifndef CXL_PCI_H
> +#define CXL_PCI_H
> +
> +#include "hw/pci/pci.h"
> +#include "hw/pci/pcie.h"
> +
> +#define CXL_VENDOR_ID 0x1e98
> +
> +#define PCIE_DVSEC_HEADER_OFFSET 0x4 /* Offset from start of extend cap */

To keep this clearly aligned with PCIe spec I'd call it HEADER_1_OFFSET

> +#define PCIE_DVSEC_ID_OFFSET     0x8
> +
> +#define PCIE_CXL_DEVICE_DVSEC_LENGTH 0x38
> +#define PCIE_CXL_DEVICE_DVSEC_REVID  1

Make it clear this is the CXL 2.0 revid.
It would be 0 for CXL 1.1 I think? (8.1.3 of CXL 2.0 spec)


> +
> +#define EXTENSIONS_PORT_DVSEC_LENGTH 0x28
> +#define EXTENSIONS_PORT_DVSEC_REVID  1

I'm assuming this is the CXL 2.0 extensions DVSEC for ports
in figure 128?

If so table 128 has dvsec revision as 0. 

> +
> +#define GPF_PORT_DVSEC_LENGTH 0x10
> +#define GPF_PORT_DVSEC_REVID  0
> +
> +#define PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0 0x14
> +#define PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0  1
> +
> +#define REG_LOC_DVSEC_LENGTH 0x24
> +#define REG_LOC_DVSEC_REVID  0

Whilst I appreciate this is an RFC it would seem more logical
to me to only list things in the following enum if we
have also defined them here.  E.g. GPF_DEVICE_DVSEC doesn't
have length and revid defines.

> +
> +enum {
> +    PCIE_CXL_DEVICE_DVSEC      = 0,
> +    NON_CXL_FUNCTION_MAP_DVSEC = 2,
> +    EXTENSIONS_PORT_DVSEC      = 3,
> +    GPF_PORT_DVSEC             = 4,
> +    GPF_DEVICE_DVSEC           = 5,
> +    PCIE_FLEXBUS_PORT_DVSEC    = 7,
> +    REG_LOC_DVSEC              = 8,
> +    MLD_DVSEC                  = 9,
> +    CXL20_MAX_DVSEC
> +};
> +
> +struct dvsec_header {
> +    uint32_t cap_hdr;
> +    uint32_t dv_hdr1;
> +    uint16_t dv_hdr2;
> +} __attribute__((__packed__));
> +_Static_assert(sizeof(struct dvsec_header) == 10,
> +               "dvsec header size incorrect");
> +
> +/*
> + * CXL 2.0 devices must implement certain DVSEC IDs, and can [optionally]
> + * implement others.
> + *
> + * CXL 2.0 Device: 0, [2], 5, 8
> + * CXL 2.0 RP: 3, 4, 7, 8
> + * CXL 2.0 Upstream Port: [2], 7, 8
> + * CXL 2.0 Downstream Port: 3, 4, 7, 8
> + */
> +
> +/* CXL 2.0 - 8.1.5 (ID 0003) */
> +struct dvsec_port {

I'd keep naming consistent.  It's called EXTENSIONS_PORT_DVSEC above
so extensions_dvsec_port here.

> +    struct dvsec_header hdr;
> +    uint16_t status;
> +    uint16_t control;
> +    uint8_t alt_bus_base;
> +    uint8_t alt_bus_limit;
> +    uint16_t alt_memory_base;
> +    uint16_t alt_memory_limit;
> +    uint16_t alt_prefetch_base;
> +    uint16_t alt_prefetch_limit;
> +    uint32_t alt_prefetch_base_high;
> +    uint32_t alt_prefetch_base_low;
> +    uint32_t rcrb_base;
> +    uint32_t rcrb_base_high;
> +};
> +_Static_assert(sizeof(struct dvsec_port) == 0x28, "dvsec port size incorrect");
> +#define PORT_CONTROL_OVERRIDE_OFFSET 0xc
I'm not totally sure what this define is, but seems
like it's simply the offset of the control field above.
If so can't we get it from there directly?
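
For example, something like the below. The structs are cut-down copies with
an _example suffix so the snippet stands alone, and it assumes GCC/Clang
handling of the packed attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down copy of the header from the patch (10 bytes, packed). */
struct dvsec_header_example {
    uint32_t cap_hdr;
    uint32_t dv_hdr1;
    uint16_t dv_hdr2;
} __attribute__((__packed__));

/* Only the leading fields of the port DVSEC are needed for this check. */
struct dvsec_port_example {
    struct dvsec_header_example hdr;
    uint16_t status;
    uint16_t control;
};

/* Hypothetical replacement for the hand-written 0xc. */
#define PORT_CONTROL_OFFSET_EXAMPLE \
    offsetof(struct dvsec_port_example, control)

static_assert(PORT_CONTROL_OFFSET_EXAMPLE == 0xc,
              "control sits right after the header and status");
```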

> +#define PORT_CONTROL_UNMASK_SBR      1
> +#define PORT_CONTROL_ALT_MEMID_EN    4

Use something to make it clear that 4 is simply bit 2. (1 << 2) maybe?

> +
> +/* CXL 2.0 - 8.1.6 GPF DVSEC (ID 0004) */
> +struct dvsec_port_gpf {
> +    struct dvsec_header hdr;
> +    uint16_t rsvd;
> +    uint16_t phase1_ctrl;
> +    uint16_t phase2_ctrl;
> +};
> +_Static_assert(sizeof(struct dvsec_port_gpf) == 0x10,
> +               "dvsec port GPF size incorrect");
> +
> +/* CXL 2.0 - 8.1.8/8.2.1.3 Flexbus DVSEC (ID 0007) */
> +struct dvsec_port_flexbus {
> +    struct dvsec_header hdr;
> +    uint16_t cap;
> +    uint16_t ctrl;
> +    uint16_t status;
> +    uint32_t rcvd_mod_ts_data;
> +};
> +_Static_assert(sizeof(struct dvsec_port_flexbus) == 0x14,
> +               "dvsec port flexbus size incorrect");
> +
> +/* CXL 2.0 - 8.1.9 Register Locator DVSEC (ID 0008) */
> +struct dvsec_register_locator {
> +    struct dvsec_header hdr;
> +    uint16_t rsvd;
> +    uint32_t reg0_base_lo;
> +    uint32_t reg0_base_hi;
> +    uint32_t reg1_base_lo;
> +    uint32_t reg1_base_hi;
> +    uint32_t reg2_base_lo;
> +    uint32_t reg2_base_hi;
> +};
> +_Static_assert(sizeof(struct dvsec_register_locator) == 0x24,
> +               "dvsec register locator size incorrect");
> +#define BEI_BAR_10H 0

BEI is obscure enough I'd add a comment giving full name
(BAR equivalent indicator)

> +#define BEI_BAR_14H 1
> +#define BEI_BAR_18H 2
> +#define BEI_BAR_1cH 3
> +#define BEI_BAR_20H 4
> +#define BEI_BAR_24H 5
> +
> +#define RBI_EMPTY          0

Likewise, RBI isn't actually used in the spec that I can see.
So call out that it is Register Block Identifier.

> +#define RBI_COMPONENT_REG  (1 << 8)
> +#define RBI_BAR_VIRT_ACL   (2 << 8)
> +#define RBI_CXL_DEVICE_REG (3 << 8)

Nice to treat these as value of field (0,1,2,3) and a macro
to put it in the right place rather than rolling them together
directly.
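
A sketch of what I mean. The names are invented here and the shift of 8 is
simply lifted from the defines above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the field position is taken from the patch's defines. */
#define RBI_SHIFT      8
#define RBI_FIELD(val) ((uint32_t)(val) << RBI_SHIFT)

/* Plain field values, as they would be listed in a spec table. */
enum {
    RBI_VAL_EMPTY          = 0,
    RBI_VAL_COMPONENT_REG  = 1,
    RBI_VAL_BAR_VIRT_ACL   = 2,
    RBI_VAL_CXL_DEVICE_REG = 3,
};

/* The macro reproduces the pre-shifted constants used in the patch. */
static_assert(RBI_FIELD(RBI_VAL_COMPONENT_REG) == (1 << 8), "component");
static_assert(RBI_FIELD(RBI_VAL_CXL_DEVICE_REG) == (3 << 8), "device");
```

That keeps the values readable against the spec and puts the field position
in exactly one place.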

> +
> +#endif




* Re: [RFC PATCH 04/25] hw/cxl/device: Introduce a CXL device (8.2.8)
  2020-11-11  5:47 ` [RFC PATCH 04/25] hw/cxl/device: Introduce a CXL device (8.2.8) Ben Widawsky
@ 2020-11-16 13:07   ` Jonathan Cameron
  2020-11-16 21:11     ` Ben Widawsky
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-16 13:07 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Vishal Verma, Dan Williams, qemu-devel

On Tue, 10 Nov 2020 21:47:03 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> A CXL device is a type of CXL component. Conceptually, a CXL device
> would be a leaf node in a CXL topology. From an emulation perspective,
> CXL devices are the most complex and so the actual implementation is
> reserved for discrete commits.
> 
> This new device type is specifically catered towards the eventual
> implementation of a Type3 CXL.mem device, 8.2.8.5 in the CXL 2.0
> specification.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

As an RFC, it would be good to have questions relevant to individual
patches if possible.  Makes it easier to know what you want feedback on.

The REG32 being used for 64 bit registers seems awkward. I'd suggest
we break them up into DWs and deal with the edge parts manually.

I'm not sure a REG64 definition would work due to lack of explicit alignment
guarantees.  Might be fine though.
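
To show what handling it per DW could look like, here is a rough standalone
sketch. The helper name is hypothetical, and it assumes the register is stored
as two little-endian dwords with the low half at the lower offset.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: extract a field from a 64-bit register that is kept
 * as two 32-bit values (lo at the register offset, hi at offset + 4).
 */
static inline uint64_t field_from_dwords(uint32_t lo, uint32_t hi,
                                         unsigned shift, unsigned len)
{
    uint64_t reg = ((uint64_t)hi << 32) | lo;
    /* Avoid shifting by 64, which is undefined behaviour. */
    uint64_t mask = (len >= 64) ? ~0ULL : ((1ULL << len) - 1);

    assert(shift < 64);
    return (reg >> shift) & mask;
}
```

With that, a field declared at bits 32..47 (such as ERRNO in the mailbox
status register) simply comes out of the low 16 bits of the high dword.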

One buglet inline and a few other comments.

Jonathan


> ---
>  include/hw/cxl/cxl.h        |   1 +
>  include/hw/cxl/cxl_device.h | 193 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 194 insertions(+)
>  create mode 100644 include/hw/cxl/cxl_device.h
> 
> diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> index 55f6cc30a5..23f52c4cf9 100644
> --- a/include/hw/cxl/cxl.h
> +++ b/include/hw/cxl/cxl.h
> @@ -12,6 +12,7 @@
>  
>  #include "cxl_pci.h"
>  #include "cxl_component.h"
> +#include "cxl_device.h"
>  
>  #endif
>  
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> new file mode 100644
> index 0000000000..491eca6e05
> --- /dev/null
> +++ b/include/hw/cxl/cxl_device.h
> @@ -0,0 +1,193 @@
> +/*
> + * QEMU CXL Devices
> + *
> + * Copyright (c) 2020 Intel
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#ifndef CXL_DEVICE_H
> +#define CXL_DEVICE_H
> +
> +#include "hw/register.h"
> +
> +/*
> + * The following is how a CXL device's MMIO space is laid out. The only
> + * requirement from the spec is that the capabilities array and the capability
> + * headers start at offset 0 and are contiguously packed. The headers themselves
> + * provide offsets to the register fields. For this emulation, registers will
> + * start at offset 0x80 (m == 0x80). No secondary mailbox is implemented which
> + * means that n = m + sizeof(mailbox registers) + sizeof(device registers).
> + *
> + * This is roughly described in 8.2.8 Figure 138 of the CXL 2.0 spec.
> + *
> + * n + PAYLOAD_SIZE_MAX  +---------------------------------+
> + *                       |                                 |
> + *                  ^    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |         Command Payload         |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  n    +---------------------------------+
> + *                  ^    |                                 |
> + *                  |    |    Device Capability Registers  |
> + *                  |    |    x, mailbox, y                |
> + *                  |    |                                 |
> + *                  m    +---------------------------------+
> + *                  ^    |     Device Capability Header y  |
> + *                  |    +---------------------------------+
> + *                  |    | Device Capability Header Mailbox|
> + *                  |    +---------------------------------+
> + *                  |    |     Device Capability Header x  |
> + *                  |    +---------------------------------+
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |      Device Cap Array[0..n]     |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  |    |                                 |
> + *                  0    +---------------------------------+
> + */
> +
> +#define CXL_DEVICE_CAP_HDR1_OFFSET 0x10 /* Figure 138 */
> +#define CXL_DEVICE_CAP_REG_SIZE 0x10 /* 8.2.8.2 */
> +
> +#define CXL_DEVICE_REGISTERS_OFFSET 0x80 /* Read comment above */
> +#define CXL_DEVICE_REGISTERS_LENGTH 0x8 /* 8.2.8.3.1 */
> +
> +#define CXL_MAILBOX_REGISTERS_OFFSET \
> +    (CXL_DEVICE_REGISTERS_OFFSET + CXL_DEVICE_REGISTERS_LENGTH)
> +#define CXL_MAILBOX_REGISTERS_SIZE 0x20
> +#define CXL_MAILBOX_PAYLOAD_SHIFT 11
> +#define CXL_MAILBOX_MAX_PAYLOAD_SIZE (1 << CXL_MAILBOX_PAYLOAD_SHIFT)
> +#define CXL_MAILBOX_REGISTERS_LENGTH \
> +    (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
> +
> +typedef struct cxl_device_state {
> +    /* Boss container and caps registers */
> +    MemoryRegion device_registers;
> +
> +    MemoryRegion caps;
> +    MemoryRegion device;
> +    MemoryRegion mailbox;
> +
> +    MemoryRegion *pmem;
> +    MemoryRegion *vmem;
> +
> +    bool active;
> +    uint16_t command;
> +    uint16_t payload_size;
> +    union {
> +        uint8_t caps_reg_state[CXL_DEVICE_CAP_REG_SIZE * 4]; /* ARRAY + 3 CAPS */
> +        uint32_t caps_reg_state32[0];
> +    };
> +} CXLDeviceState;
> +
> +/* Initialize the register block for a device */
> +void cxl_device_register_block_init(Object *obj, CXLDeviceState *dev);
> +
> +/* Set up default values for the register block */
> +void cxl_device_register_init_common(CXLDeviceState *dev);
> +
> +/* CXL 2.0 - 8.2.8.1 */
> +REG32(CXL_DEV_CAP_ARRAY, 0) /* 48b!?!?! */
> +    FIELD(CXL_DEV_CAP_ARRAY, CAP_ID, 0, 16)
> +    FIELD(CXL_DEV_CAP_ARRAY, CAP_VERSION, 16, 8)
> +REG32(CXL_DEV_CAP_ARRAY2, 4) /* We're going to pretend it's 64b */

Fair call.  Guess reserved 16 bits on the end will be the eventual fix.

> +    FIELD(CXL_DEV_CAP_ARRAY2, CAP_COUNT, 0, 16)
> +
> +/*
> + * In the 8.2.8.2, this is listed as a 128b register, but in 8.2.8, it says:
> + * > No registers defined in Section 8.2.8 are larger than 64-bits wide so that
> + * > is the maximum access size allowed for these registers. If this rule is not
> + * > followed, the behavior is undefined
> + *
> + * Here we've chosen to make it 4 dwords. The spec allows any pow2 multiple
> + * access to be used for a register (2 qwords, 8 words, 128 bytes).
> + *
> + * XXX: This is supposedly fixed for the release version of the spec. If this
> + * comment is still here, I've failed.

*sniggers in a sympathetic way*

> + */
> +#define CXL_DEVICE_CAPABILITY_HEADER_REGISTER(n, offset)                            \
> +    REG32(CXL_DEV_##n##_CAP_HDR0, offset)                 \
> +        FIELD(CXL_DEV_##n##_CAP_HDR0, CAP_ID, 0, 16)      \
> +        FIELD(CXL_DEV_##n##_CAP_HDR0, CAP_VERSION, 16, 8) \
> +    REG32(CXL_DEV_##n##_CAP_HDR1, offset + 4)             \
> +        FIELD(CXL_DEV_##n##_CAP_HDR1, CAP_OFFSET, 0, 32)  \
> +    REG32(CXL_DEV_##n##_CAP_HDR2, offset + 8)             \
> +        FIELD(CXL_DEV_##n##_CAP_HDR2, CAP_LENGTH, 0, 32)
> +
> +CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
> +CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
> +                                               CXL_DEVICE_CAP_REG_SIZE)
> +
> +REG32(CXL_DEV_MAILBOX_CAP, 0)
> +    FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5)
> +    FIELD(CXL_DEV_MAILBOX_CAP, INT_CAP, 5, 1)
> +    FIELD(CXL_DEV_MAILBOX_CAP, BG_INT_CAP, 6, 1)
> +    FIELD(CXL_DEV_MAILBOX_CAP, MSI_N, 7, 4)
> +
> +REG32(CXL_DEV_MAILBOX_CTRL, 4)
> +    FIELD(CXL_DEV_MAILBOX_CTRL, DOORBELL, 0, 1)
> +    FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 2)

1 bit field, not 2.
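
To spell out why it matters: with a length of 2 the field overlaps BG_INT_EN.
A quick standalone check, using a made-up mask macro rather than anything from
QEMU:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 32-bit mask for a field at 'shift' of 'len' bits. */
#define FIELD_MASK32(shift, len) \
    ((uint32_t)(((1ULL << (len)) - 1) << (shift)))

/* As posted: INT_EN is (1, 2) and BG_INT_EN is (2, 1), so they share bit 2. */
static_assert((FIELD_MASK32(1, 2) & FIELD_MASK32(2, 1)) != 0,
              "INT_EN overlaps BG_INT_EN as posted");

/* With INT_EN as a 1 bit field the two masks are disjoint. */
static_assert((FIELD_MASK32(1, 1) & FIELD_MASK32(2, 1)) == 0,
              "no overlap once INT_EN is 1 bit wide");
```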

> +    FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
> +
> +enum {
> +    CXL_CMD_EVENTS              = 0x1,

> +    CXL_CMD_IDENTIFY            = 0x40,
> +};
> +
> +REG32(CXL_DEV_MAILBOX_CMD, 8)
> +    FIELD(CXL_DEV_MAILBOX_CMD, OP, 0, 16)
> +    FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)

That takes us out to bit 36 in a 32 bit register?
Probably needs a comment like the ones below.
Wouldn't want to miss fixing this one later.
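
If it helps, a build-time guard along these lines would catch this sort of
thing. It is just a sketch and the macro name is invented.

```c
#include <assert.h>

/* Hypothetical check: does a (shift, len) field fit in a 32-bit register? */
#define FIELD_FITS_REG32(shift, len) (((shift) + (len)) <= 32)

/* OP at (0, 16) is fine. */
static_assert(FIELD_FITS_REG32(0, 16), "OP fits in 32 bits");

/* LENGTH at (16, 20) ends past bit 31, so it does not fit. */
static_assert(!FIELD_FITS_REG32(16, 20), "LENGTH spills past bit 31");
```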

> +
> +/* 8.2.8.4.5.1 Command Return Codes */
> +enum {
> +    RET_SUCCESS                 = 0x0,
> +    RET_BG_STARTED              = 0x1, /* Background Command Started */
> +    RET_EINVAL                  = 0x2, /* Invalid Input */
> +    RET_ENOTSUP                 = 0x3, /* Unsupported */
> +    RET_ENODEV                  = 0x4, /* Internal Error */

Mapping that to NODEV seems less than obvious.

> +    RET_ERESTART                = 0x5, /* Retry Required */
> +    RET_EBUSY                   = 0x6, /* Busy */
> +    RET_MEDIA_DISABLED          = 0x7, /* Media Disabled */
> +    RET_FW_EBUSY                = 0x8, /* FW Transfer in Progress */
> +    RET_FW_OOO                  = 0x9, /* FW Transfer Out of Order */
> +    RET_FW_AUTH                 = 0xa, /* FW Authentication Failed */
> +    RET_FW_EBADSLT              = 0xb, /* Invalid Slot */
> +    RET_FW_ROLLBACK             = 0xc, /* Activation Failed, FW Rolled Back */
> +    RET_FW_REBOOT               = 0xd, /* Activation Failed, Cold Reset Required */
> +    RET_ENOENT                  = 0xe, /* Invalid Handle */
> +    RET_EFAULT                  = 0xf, /* Invalid Physical Address */
> +    RET_POISON_E2BIG            = 0x10, /* Inject Poison Limit Reached */
> +    RET_EIO                     = 0x11, /* Permanent Media Failure */
> +    RET_ECANCELED               = 0x12, /* Aborted */
> +    RET_EACCESS                 = 0x13, /* Invalid Security State */
> +    RET_EPERM                   = 0x14, /* Incorrect Passphrase */
> +    RET_EPROTONOSUPPORT         = 0x15, /* Unsupported Mailbox */
> +    RET_EMSGSIZE                = 0x16, /* Invalid Payload Length */
> +    RET_MAX                     = 0x17
> +};
> +
> +/* XXX: actually a 64b register */
> +REG32(CXL_DEV_MAILBOX_STS, 0x10)
> +    FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)
> +    FIELD(CXL_DEV_MAILBOX_STS, ERRNO, 32, 16)
> +    FIELD(CXL_DEV_MAILBOX_STS, VENDOR_ERRNO, 48, 16)
> +
> +/* XXX: actually a 64b register */
> +REG32(CXL_DEV_BG_CMD_STS, 0x18)
> +    FIELD(CXL_DEV_BG_CMD_STS, BG, 0, 16)
> +    FIELD(CXL_DEV_BG_CMD_STS, DONE, 16, 7)
> +    FIELD(CXL_DEV_BG_CMD_STS, ERRNO, 32, 16)
> +    FIELD(CXL_DEV_BG_CMD_STS, VENDOR_ERRNO, 48, 16)
> +
> +REG32(CXL_DEV_CMD_PAYLOAD, 0x20)
> +
> +#endif




* Re: [RFC PATCH 05/25] hw/cxl/device: Implement the CAP array (8.2.8.1-2)
  2020-11-11  5:47 ` [RFC PATCH 05/25] hw/cxl/device: Implement the CAP array (8.2.8.1-2) Ben Widawsky
@ 2020-11-16 13:11   ` Jonathan Cameron
  2020-11-16 18:08     ` Ben Widawsky
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-16 13:11 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Vishal Verma, Dan Williams, qemu-devel

On Tue, 10 Nov 2020 21:47:04 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> This implements all device MMIO up to the first capability. That
> includes the CXL Device Capabilities Array Register, as well as all of
> the CXL Device Capability Header Registers. The latter are filled in as
> they are implemented in the following patches.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

Question below.

Thanks,

Jonathan

> ---
>  hw/cxl/cxl-device-utils.c | 73 +++++++++++++++++++++++++++++++++++++++
>  hw/cxl/meson.build        |  1 +
>  2 files changed, 74 insertions(+)
> 
> diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> index e69de29bb2..a391bb15c6 100644
> --- a/hw/cxl/cxl-device-utils.c
> +++ b/hw/cxl/cxl-device-utils.c
> @@ -0,0 +1,73 @@
> +/*
> + * CXL Utility library for devices
> + *
> + * Copyright(C) 2020 Intel Corporation.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/log.h"
> +#include "hw/cxl/cxl.h"
> +
> +static uint64_t caps_reg_read(void *opaque, hwaddr offset, unsigned size)
> +{
> +    CXLDeviceState *cxl_dstate = opaque;
> +
> +    switch (size) {
> +    case 4:
> +        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
> +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> +            return 0;
> +        }
> +        break;
> +    case 8:
> +        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
> +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> +            return 0;
> +        }
> +        break;

Seems unlikely but in theory we might get other sizes such as 2 and need
that to be aligned?

If we just don't want to support them perhaps a default with suitable error
print is appropriate?
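
Roughly like the below. It is a standalone sketch with an invented function
name, and fprintf stands in for whatever logging call the real code would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: report whether an access size is one the register
 * block handles, and complain about anything else instead of silently
 * falling through the switch.
 */
static bool reg_access_size_supported(unsigned size)
{
    switch (size) {
    case 4:
    case 8:
        return true;
    default:
        fprintf(stderr, "Unsupported register access size: %u\n", size);
        return false;
    }
}
```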

> +    }
> +
> +    return ldn_le_p(cxl_dstate->caps_reg_state + offset, size);
> +}
> +
> +static const MemoryRegionOps caps_ops = {
> +    .read = caps_reg_read,
> +    .write = NULL,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 4,
> +        .max_access_size = 8,
> +    },
> +    .impl = {
> +        .min_access_size = 4,
> +        .max_access_size = 8,
> +    },
> +};
> +
> +void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
> +{
> +    /* This will be a BAR, so needs to be rounded up to pow2 for PCI spec */
> +    memory_region_init(
> +        &cxl_dstate->device_registers, obj, "device-registers",
> +        pow2ceil(CXL_MAILBOX_REGISTERS_LENGTH + CXL_MAILBOX_REGISTERS_OFFSET));
> +
> +    memory_region_init_io(&cxl_dstate->caps, obj, &caps_ops, cxl_dstate,
> +                          "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
> +
> +    memory_region_add_subregion(&cxl_dstate->device_registers, 0,
> +                                &cxl_dstate->caps);
> +}
> +
> +void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> +{
> +    uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
> +    const int cap_count = 0;
> +
> +    /* CXL Device Capabilities Array Register */
> +    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
> +    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_VERSION, 1);
> +    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY2, CAP_COUNT, cap_count);
> +}
> diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> index 00c3876a0f..47154d6850 100644
> --- a/hw/cxl/meson.build
> +++ b/hw/cxl/meson.build
> @@ -1,3 +1,4 @@
>  softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
>    'cxl-component-utils.c',
> +  'cxl-device-utils.c',
>  ))




* Re: [RFC PATCH 06/25] hw/cxl/device: Add device status (8.2.8.3)
  2020-11-11  5:47 ` [RFC PATCH 06/25] hw/cxl/device: Add device status (8.2.8.3) Ben Widawsky
@ 2020-11-16 13:16   ` Jonathan Cameron
  2020-11-16 21:18     ` Ben Widawsky
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-16 13:16 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Vishal Verma, Dan Williams, qemu-devel

On Tue, 10 Nov 2020 21:47:05 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> This implements the CXL device status registers from 8.2.8.3.1 in the
> CXL 2.0 specification. It is capability ID 0001h.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

It does some other stuff it shouldn't as well.  Please tidy that up before
v2.  A few other passing comments inline.

Thanks,

Jonathan


> ---
>  hw/cxl/cxl-device-utils.c   | 45 +++++++++++++++++++++++++++++++++-
>  include/hw/cxl/cxl_device.h | 49 ++++++++++++-------------------------
>  2 files changed, 60 insertions(+), 34 deletions(-)
> 
> diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> index a391bb15c6..78144e103c 100644
> --- a/hw/cxl/cxl-device-utils.c
> +++ b/hw/cxl/cxl-device-utils.c
> @@ -33,6 +33,42 @@ static uint64_t caps_reg_read(void *opaque, hwaddr offset, unsigned size)
>      return ldn_le_p(cxl_dstate->caps_reg_state + offset, size);
>  }
>  
> +static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
> +{
> +    uint64_t retval = 0;

Doesn't seem to be used.

> +

Perhaps break the alignment check out to a utility function given this sanity check
is the same as in the previous patch.
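
A minimal version of such a helper might look like this. The name is
hypothetical and it assumes the size is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical shared helper: true if 'offset' is naturally aligned for an
 * access of 'size' bytes. 'size' must be a non-zero power of two.
 */
static inline bool reg_access_aligned(uint64_t offset, unsigned size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (offset & (size - 1)) == 0;
}
```

Both read handlers could then log and return 0 when it fails, instead of each
carrying its own copy of the switch.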

> +    switch (size) {
> +    case 4:
> +        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
> +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> +            return 0;
> +        }
> +        break;
> +    case 8:
> +        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
> +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> +            return 0;
> +        }
> +        break;
> +    }
> +
> +    return ldn_le_p(&retval, size);
> +}
> +
> +static const MemoryRegionOps dev_ops = {
> +    .read = dev_reg_read,
> +    .write = NULL,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 4,
> +        .max_access_size = 8,
> +    },
> +    .impl = {
> +        .min_access_size = 4,
> +        .max_access_size = 8,
> +    },
> +};
> +
>  static const MemoryRegionOps caps_ops = {
>      .read = caps_reg_read,
>      .write = NULL,
> @@ -56,18 +92,25 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
>  
>      memory_region_init_io(&cxl_dstate->caps, obj, &caps_ops, cxl_dstate,
>                            "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
> +    memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
> +                          "device-status", CXL_DEVICE_REGISTERS_LENGTH);
>  
>      memory_region_add_subregion(&cxl_dstate->device_registers, 0,
>                                  &cxl_dstate->caps);
> +    memory_region_add_subregion(&cxl_dstate->device_registers,
> +                                CXL_DEVICE_REGISTERS_OFFSET,
> +                                &cxl_dstate->device);
>  }
>  
>  void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
>  {
>      uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
> -    const int cap_count = 0;
> +    const int cap_count = 1;
>  
>      /* CXL Device Capabilities Array Register */
>      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
>      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_VERSION, 1);
>      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY2, CAP_COUNT, cap_count);
> +
> +    cxl_device_cap_init(cxl_dstate, DEVICE, 1);
>  }
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index 491eca6e05..2c674fdc9c 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -127,6 +127,22 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
>  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
>                                                 CXL_DEVICE_CAP_REG_SIZE)
>  
> +#define cxl_device_cap_init(dstate, reg, cap_id)                                   \
> +    do {                                                                           \
> +        uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
> +        int which = R_CXL_DEV_##reg##_CAP_HDR0;                                    \
> +        cap_hdrs[which] =                                                          \
> +            FIELD_DP32(cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_ID, cap_id); \
> +        cap_hdrs[which] = FIELD_DP32(                                              \
> +            cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_VERSION, 1);            \
> +        cap_hdrs[which + 1] =                                                      \
> +            FIELD_DP32(cap_hdrs[which + 1], CXL_DEV_##reg##_CAP_HDR1,              \
> +                       CAP_OFFSET, CXL_##reg##_REGISTERS_OFFSET);                  \
> +        cap_hdrs[which + 2] =                                                      \
> +            FIELD_DP32(cap_hdrs[which + 2], CXL_DEV_##reg##_CAP_HDR2,              \
> +                       CAP_LENGTH, CXL_##reg##_REGISTERS_LENGTH);                  \
> +    } while (0)
> +
>  REG32(CXL_DEV_MAILBOX_CAP, 0)
>      FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5)
>      FIELD(CXL_DEV_MAILBOX_CAP, INT_CAP, 5, 1)
> @@ -138,43 +154,10 @@ REG32(CXL_DEV_MAILBOX_CTRL, 4)
>      FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 2)
>      FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
>  
> -enum {
> -    CXL_CMD_EVENTS              = 0x1,
> -    CXL_CMD_IDENTIFY            = 0x40,
> -};
> -
>  REG32(CXL_DEV_MAILBOX_CMD, 8)
>      FIELD(CXL_DEV_MAILBOX_CMD, OP, 0, 16)
>      FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
>  
> -/* 8.2.8.4.5.1 Command Return Codes */

Umm. This enum was only introduced a few patches ago.  Please tidy that
up so the series doesn't end up adding things and then removing them again.

> -enum {
> -    RET_SUCCESS                 = 0x0,
> -    RET_BG_STARTED              = 0x1, /* Background Command Started */
> -    RET_EINVAL                  = 0x2, /* Invalid Input */
> -    RET_ENOTSUP                 = 0x3, /* Unsupported */
> -    RET_ENODEV                  = 0x4, /* Internal Error */
> -    RET_ERESTART                = 0x5, /* Retry Required */
> -    RET_EBUSY                   = 0x6, /* Busy */
> -    RET_MEDIA_DISABLED          = 0x7, /* Media Disabled */
> -    RET_FW_EBUSY                = 0x8, /* FW Transfer in Progress */
> -    RET_FW_OOO                  = 0x9, /* FW Transfer Out of Order */
> -    RET_FW_AUTH                 = 0xa, /* FW Authentication Failed */
> -    RET_FW_EBADSLT              = 0xb, /* Invalid Slot */
> -    RET_FW_ROLLBACK             = 0xc, /* Activation Failed, FW Rolled Back */
> -    RET_FW_REBOOT               = 0xd, /* Activation Failed, Cold Reset Required */
> -    RET_ENOENT                  = 0xe, /* Invalid Handle */
> -    RET_EFAULT                  = 0xf, /* Invalid Physical Address */
> -    RET_POISON_E2BIG            = 0x10, /* Inject Poison Limit Reached */
> -    RET_EIO                     = 0x11, /* Permanent Media Failure */
> -    RET_ECANCELED               = 0x12, /* Aborted */
> -    RET_EACCESS                 = 0x13, /* Invalid Security State */
> -    RET_EPERM                   = 0x14, /* Incorrect Passphrase */
> -    RET_EPROTONOSUPPORT         = 0x15, /* Unsupported Mailbox */
> -    RET_EMSGSIZE                = 0x16, /* Invalid Payload Length */
> -    RET_MAX                     = 0x17
> -};
> -
>  /* XXX: actually a 64b register */
>  REG32(CXL_DEV_MAILBOX_STS, 0x10)
>      FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)




* Re: [RFC PATCH 07/25] hw/cxl/device: Implement basic mailbox (8.2.8.4)
  2020-11-11  5:47 ` [RFC PATCH 07/25] hw/cxl/device: Implement basic mailbox (8.2.8.4) Ben Widawsky
@ 2020-11-16 13:46   ` Jonathan Cameron
  2020-11-16 21:42     ` Ben Widawsky
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-16 13:46 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Vishal Verma, Dan Williams, qemu-devel

On Tue, 10 Nov 2020 21:47:06 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> This is the beginning of implementing mailbox support for CXL 2.0
> devices.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Mostly patch set cleanup suggestions in here rather than anything
substantive.

Thanks,

Jonathan

> ---
>  hw/cxl/cxl-device-utils.c   | 131 ++++++++++++++++++++++++++++++++++++
>  hw/cxl/cxl-mailbox-utils.c  |  93 +++++++++++++++++++++++++
>  hw/cxl/meson.build          |   1 +
>  include/hw/cxl/cxl.h        |   3 +
>  include/hw/cxl/cxl_device.h |  10 ++-
>  5 files changed, 237 insertions(+), 1 deletion(-)
>  create mode 100644 hw/cxl/cxl-mailbox-utils.c
> 
> diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> index 78144e103c..aec8b0d421 100644
> --- a/hw/cxl/cxl-device-utils.c
> +++ b/hw/cxl/cxl-device-utils.c
> @@ -55,6 +55,123 @@ static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
>      return ldn_le_p(&retval, size);
>  }
>  
> +static uint64_t mailbox_reg_read(void *opaque, hwaddr offset, unsigned size)
> +{
> +    CXLDeviceState *cxl_dstate = opaque;
> +
> +    switch (size) {
> +    case 4:
> +        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
> +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> +            return 0;
> +        }
> +        break;
> +    case 8:
> +        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
> +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> +            return 0;
> +        }
> +        break;
> +    default:
> +        qemu_log_mask(LOG_UNIMP, "%uB component register read\n", size);
> +        return 0;
> +    }
> +
> +    return ldn_le_p(cxl_dstate->mbox_reg_state + offset, size);
> +}
> +
> +static void mailbox_mem_writel(uint32_t *reg_state, hwaddr offset,
> +                               uint64_t value)
> +{
> +    switch (offset) {
> +    case A_CXL_DEV_MAILBOX_CTRL:
> +        /* fallthrough */
> +    case A_CXL_DEV_MAILBOX_CAP:
> +        /* RO register */
> +        break;
> +    default:
> +        qemu_log_mask(LOG_UNIMP,
> +                      "%s Unexpected 32-bit access to 0x%" PRIx64 " (WI)\n",
> +                      __func__, offset);
> +        break;
> +    }
> +
> +    stl_le_p((uint8_t *)reg_state + offset, value);
> +}
> +
> +static void mailbox_mem_writeq(uint64_t *reg_state, hwaddr offset,
> +                               uint64_t value)
> +{
> +    switch (offset) {
> +    case A_CXL_DEV_MAILBOX_CMD:
> +        break;
> +    case A_CXL_DEV_BG_CMD_STS:
> +        /* BG not supported */
> +        /* fallthrough */
> +    case A_CXL_DEV_MAILBOX_STS:
> +        /* Read only register, will get updated by the state machine */
> +        return;
> +    case A_CXL_DEV_MAILBOX_CAP:
> +    case A_CXL_DEV_MAILBOX_CTRL:

I wouldn't bother listing these two cases here, given that the 32-bit version
doesn't list MAILBOX_STS etc. either. They fall through to the default anyway.

> +    default:
> +        qemu_log_mask(LOG_UNIMP,
> +                      "%s Unexpected 64-bit access to 0x%" PRIx64 " (WI)\n",
> +                      __func__, offset);
> +        return;
> +    }
> +
> +    stq_le_p((uint8_t *)reg_state + offset, value);
> +}
> +
> +static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
> +                              unsigned size)
> +{
> +    CXLDeviceState *cxl_dstate = opaque;
> +
> +    /*
> +     * Lock is needed to prevent concurrent writes as well as to prevent writes
> +     * coming in while the firmware is processing. Without background commands
> +     * or the second mailbox implemented, this serves no purpose since the
> +     * memory access is synchronized at a higher level (per memory region).
> +     */
> +    RCU_READ_LOCK_GUARD();
> +
> +    switch (size) {
> +    case 4:
> +        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
> +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> +            return;
> +        }
> +        mailbox_mem_writel(cxl_dstate->mbox_reg_state32, offset, value);
> +        break;
> +    case 8:
> +        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
> +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> +            return;
> +        }
> +        mailbox_mem_writeq(cxl_dstate->mbox_reg_state64, offset, value);
> +        break;
> +    }
> +
> +    if (ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> +                         DOORBELL))
> +        process_mailbox(cxl_dstate);
> +}
> +
> +static const MemoryRegionOps mailbox_ops = {
> +    .read = mailbox_reg_read,
> +    .write = mailbox_reg_write,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 4,
> +        .max_access_size = 8,
> +    },
> +    .impl = {
> +        .min_access_size = 4,
> +        .max_access_size = 8,
> +    },
> +};
> +
>  static const MemoryRegionOps dev_ops = {
>      .read = dev_reg_read,
>      .write = NULL,
> @@ -94,12 +211,23 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
>                            "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
>      memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
>                            "device-status", CXL_DEVICE_REGISTERS_LENGTH);
> +    memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
> +                          "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
>  
>      memory_region_add_subregion(&cxl_dstate->device_registers, 0,
>                                  &cxl_dstate->caps);
>      memory_region_add_subregion(&cxl_dstate->device_registers,
>                                  CXL_DEVICE_REGISTERS_OFFSET,
>                                  &cxl_dstate->device);
> +    memory_region_add_subregion(&cxl_dstate->device_registers,
> +                                CXL_MAILBOX_REGISTERS_OFFSET, &cxl_dstate->mailbox);
> +}
> +
> +static void mailbox_init_common(uint32_t *mbox_regs)
> +{
> +    /* 2048 payload size, with no interrupt or background support */
> +    ARRAY_FIELD_DP32(mbox_regs, CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE,
> +                     CXL_MAILBOX_PAYLOAD_SHIFT);
>  }
>  
>  void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> @@ -113,4 +241,7 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
>      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY2, CAP_COUNT, cap_count);
>  
>      cxl_device_cap_init(cxl_dstate, DEVICE, 1);
> +    cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
> +
> +    mailbox_init_common(cxl_dstate->mbox_reg_state32);
>  }
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> new file mode 100644
> index 0000000000..2d1b0ef9e4
> --- /dev/null
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -0,0 +1,93 @@
> +/*
> + * CXL Utility library for mailbox interface
> + *
> + * Copyright(C) 2020 Intel Corporation.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See the
> + * COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qemu/log.h"
> +#include "hw/pci/pci.h"
> +#include "hw/cxl/cxl.h"
> +
> +/* 8.2.8.4.5.1 Command Return Codes */
> +enum {
> +    RET_SUCCESS                 = 0x0,
> +    RET_BG_STARTED              = 0x1, /* Background Command Started */
> +    RET_EINVAL                  = 0x2, /* Invalid Input */
> +    RET_ENOTSUP                 = 0x3, /* Unsupported */
> +    RET_ENODEV                  = 0x4, /* Internal Error */
> +    RET_ERESTART                = 0x5, /* Retry Required */
> +    RET_EBUSY                   = 0x6, /* Busy */
> +    RET_MEDIA_DISABLED          = 0x7, /* Media Disabled */
> +    RET_FW_EBUSY                = 0x8, /* FW Transfer in Progress */
> +    RET_FW_OOO                  = 0x9, /* FW Transfer Out of Order */
> +    RET_FW_AUTH                 = 0xa, /* FW Authentication Failed */
> +    RET_FW_EBADSLT              = 0xb, /* Invalid Slot */
> +    RET_FW_ROLLBACK             = 0xc, /* Activation Failed, FW Rolled Back */
> +    RET_FW_REBOOT               = 0xd, /* Activation Failed, Cold Reset Required */
> +    RET_ENOENT                  = 0xe, /* Invalid Handle */
> +    RET_EFAULT                  = 0xf, /* Invalid Physical Address */
> +    RET_POISON_E2BIG            = 0x10, /* Inject Poison Limit Reached */
> +    RET_EIO                     = 0x11, /* Permanent Media Failure */
> +    RET_ECANCELED               = 0x12, /* Aborted */
> +    RET_EACCESS                 = 0x13, /* Invalid Security State */
> +    RET_EPERM                   = 0x14, /* Incorrect Passphrase */
> +    RET_EPROTONOSUPPORT         = 0x15, /* Unsupported Mailbox */
> +    RET_EMSGSIZE                = 0x16, /* Invalid Payload Length */
> +    RET_MAX                     = 0x17
> +};

Ah, back again.  Just drop the earlier add and remove of this list and
we are all good.

> +
> +void process_mailbox(CXLDeviceState *cxl_dstate)
> +{
> +    uint16_t ret = RET_SUCCESS;
> +    uint32_t ret_len = 0;
> +    uint64_t status_reg;
> +
> +    /*
> +     * current state of mailbox interface
> +     *  uint32_t mbox_cap_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CAP];
> +     *  uint32_t mbox_ctrl_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CTRL];
> +     *  uint64_t status_reg = *(uint64_t *)&cxl_dstate->reg_state[A_CXL_DEV_MAILBOX_STS];
> +     */
> +    uint64_t command_reg =
> +        *(uint64_t *)&cxl_dstate->mbox_reg_state[A_CXL_DEV_MAILBOX_CMD];
> +
> +    /* Check if we have to do anything */
> +    if (!ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL, DOORBELL)) {
> +        qemu_log_mask(LOG_UNIMP, "Corrupt internal state for firmware\n");
> +        return;
> +    }
> +
> +    uint8_t cmd_set = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET);
> +    uint8_t cmd = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND);
> +    (void)cmd;

? Is this only here to silence an unused-variable warning?

> +    switch (cmd_set) {
> +    default:
> +        ret = RET_ENOTSUP;
> +    }
> +
> +    /*
> +     * Set the return code
> +     * XXX: it's a 64b register, but we're not setting the vendor, so we can get
> +     * away with this

Also mention that the background operation bit isn't being set?

> +     */
> +    status_reg = FIELD_DP64(0, CXL_DEV_MAILBOX_STS, ERRNO, ret);
> +
> +    /*
> +     * Set the return length
> +     */
> +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET, 0);
> +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND, 0);
> +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH, ret_len);

Rather convoluted way of setting just the length field, I assume because there
are RsvdP fields in there we can't touch.
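
To illustrate the RsvdP point, here is a standalone sketch of why a
per-field read-modify-write leaves the remaining bits untouched. It is not
QEMU code: field_dp64() and mailbox_cmd_complete() are hypothetical
stand-ins for FIELD_DP64() and the tail of process_mailbox(), and the bit
positions are taken from the FIELD() definitions quoted in this patch, with
everything above LENGTH assumed to be reserved.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for QEMU's FIELD_DP64(): write `val` into the
 * `len`-bit field starting at bit `shift`, leaving all other bits alone.
 */
static uint64_t field_dp64(uint64_t reg, unsigned shift, unsigned len,
                           uint64_t val)
{
    assert(len > 0 && len < 64 && shift + len <= 64);
    uint64_t mask = ((UINT64_C(1) << len) - 1) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/*
 * Layout from the quoted header: COMMAND [7:0], COMMAND_SET [15:8],
 * LENGTH [35:16]. Bits above that are treated as RsvdP (an assumption).
 */
enum {
    CMD_SHIFT = 0,     CMD_LEN = 8,
    SET_SHIFT = 8,     SET_LEN = 8,
    LENGTH_SHIFT = 16, LENGTH_LEN = 20,
};

/* Clear the opcode fields and report the return length; keep the rest. */
static uint64_t mailbox_cmd_complete(uint64_t command_reg, uint32_t ret_len)
{
    command_reg = field_dp64(command_reg, SET_SHIFT, SET_LEN, 0);
    command_reg = field_dp64(command_reg, CMD_SHIFT, CMD_LEN, 0);
    command_reg = field_dp64(command_reg, LENGTH_SHIFT, LENGTH_LEN, ret_len);
    return command_reg;
}
```

Writing a freshly built value with only LENGTH set would be shorter, but it
would also zero whatever the host left in the reserved bits.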

> +
> +    stq_le_p(cxl_dstate->mbox_reg_state + A_CXL_DEV_MAILBOX_CMD, command_reg);
> +    stq_le_p(cxl_dstate->mbox_reg_state + A_CXL_DEV_MAILBOX_STS, status_reg);
> +
> +    /* Tell the host we're done */
> +    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> +                     DOORBELL, 0);
> +}
> +
> diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> index 47154d6850..0eca715d10 100644
> --- a/hw/cxl/meson.build
> +++ b/hw/cxl/meson.build
> @@ -1,4 +1,5 @@
>  softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
>    'cxl-component-utils.c',
>    'cxl-device-utils.c',
> +  'cxl-mailbox-utils.c',
>  ))
> diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> index 23f52c4cf9..362cda40de 100644
> --- a/include/hw/cxl/cxl.h
> +++ b/include/hw/cxl/cxl.h
> @@ -14,5 +14,8 @@
>  #include "cxl_component.h"
>  #include "cxl_device.h"
>  
> +#define COMPONENT_REG_BAR_IDX 0
> +#define DEVICE_REG_BAR_IDX 2
> +
>  #endif
>  
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index 2c674fdc9c..df00998def 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -87,6 +87,11 @@ typedef struct cxl_device_state {
>          uint8_t caps_reg_state[CXL_DEVICE_CAP_REG_SIZE * 4]; /* ARRAY + 3 CAPS */
>          uint32_t caps_reg_state32[0];
>      };
> +    union {
> +        uint8_t mbox_reg_state[CXL_MAILBOX_REGISTERS_LENGTH];
> +        uint32_t mbox_reg_state32[0];
> +        uint64_t mbox_reg_state64[0];
> +    };
>  } CXLDeviceState;
>  
>  /* Initialize the register block for a device */
> @@ -127,6 +132,8 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
>  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
>                                                 CXL_DEVICE_CAP_REG_SIZE)
>  
> +void process_mailbox(CXLDeviceState *cxl_dstate);
> +
>  #define cxl_device_cap_init(dstate, reg, cap_id)                                   \
>      do {                                                                           \
>          uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
> @@ -155,7 +162,8 @@ REG32(CXL_DEV_MAILBOX_CTRL, 4)
>      FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
>  
>  REG32(CXL_DEV_MAILBOX_CMD, 8)
> -    FIELD(CXL_DEV_MAILBOX_CMD, OP, 0, 16)
> +    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND, 0, 8)

Can we fix the original introduction of this so we don't end up modifying it here?
From the spec I can fully see how you ended up with this as you wrote the code,
but it would be nice to get rid of the two-step definition now anyway
(the field is first defined as 16 bits, then later the spec says there are two 8-bit fields).
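
For reference, a small standalone sketch of how the two views relate, going
by the FIELD() definitions in this patch (COMMAND in bits 7:0, COMMAND_SET
in bits 15:8). The helper names are made up for illustration and are not
part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Low byte of the 16-bit opcode: the command within its set. */
static uint8_t opcode_command(uint16_t opcode)
{
    return opcode & 0xff;
}

/* High byte of the 16-bit opcode: the command set. */
static uint8_t opcode_command_set(uint16_t opcode)
{
    return opcode >> 8;
}

/* Rebuild the 16-bit opcode from its two 8-bit halves. */
static uint16_t opcode_pack(uint8_t command_set, uint8_t command)
{
    return (uint16_t)((command_set << 8) | command);
}
```

So a 16-bit opcode of 0x4000 is command set 0x40, command 0x00, and defining
the two 8-bit fields from the start describes the same bits as the single
16-bit OP field.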

> +    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND_SET, 8, 8)
>      FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
>  
>  /* XXX: actually a 64b register */




* Re: [RFC PATCH 08/25] hw/cxl/device: Add memory devices (8.2.8.5)
  2020-11-11  5:47 ` [RFC PATCH 08/25] hw/cxl/device: Add memory devices (8.2.8.5) Ben Widawsky
@ 2020-11-16 16:37   ` Jonathan Cameron
  2020-11-16 21:45     ` Ben Widawsky
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-16 16:37 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Vishal Verma, Dan Williams, qemu-devel

On Tue, 10 Nov 2020 21:47:07 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Memory devices implement extra capabilities on top of CXL devices. This
> adds support for that.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  hw/cxl/cxl-device-utils.c   | 48 ++++++++++++++++++++++++++++++++++++-
>  hw/cxl/cxl-mailbox-utils.c  | 48 ++++++++++++++++++++++++++++++++++++-
>  include/hw/cxl/cxl_device.h | 15 ++++++++++++
>  3 files changed, 109 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> index aec8b0d421..6544a68567 100644
> --- a/hw/cxl/cxl-device-utils.c
> +++ b/hw/cxl/cxl-device-utils.c
> @@ -158,6 +158,45 @@ static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
>          process_mailbox(cxl_dstate);
>  }
>  
> +static uint64_t mdev_reg_read(void *opaque, hwaddr offset, unsigned size)
> +{
> +    uint64_t retval = 0;
> +
> +    retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MEDIA_STATUS, 1);
> +    retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MBOX_READY, 1);
> +
> +    switch (size) {
> +    case 4:
> +        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
> +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> +            return 0;
> +        }
> +        break;
> +    case 8:
> +        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
> +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> +            return 0;
> +        }
> +        break;
> +    }
> +
> +    return ldn_le_p(&retval, size);
> +}
> +
> +static const MemoryRegionOps mdev_ops = {
> +    .read = mdev_reg_read,
> +    .write = NULL,
> +    .endianness = DEVICE_LITTLE_ENDIAN,
> +    .valid = {
> +        .min_access_size = 4,
> +        .max_access_size = 8,
> +    },
> +    .impl = {
> +        .min_access_size = 4,
> +        .max_access_size = 8,
> +    },
> +};
> +
>  static const MemoryRegionOps mailbox_ops = {
>      .read = mailbox_reg_read,
>      .write = mailbox_reg_write,
> @@ -213,6 +252,9 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
>                            "device-status", CXL_DEVICE_REGISTERS_LENGTH);
>      memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
>                            "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
> +    memory_region_init_io(&cxl_dstate->memory_device, obj, &mdev_ops,
> +                          cxl_dstate, "memory device caps",
> +                          CXL_MEMORY_DEVICE_REGISTERS_LENGTH);
>  
>      memory_region_add_subregion(&cxl_dstate->device_registers, 0,
>                                  &cxl_dstate->caps);
> @@ -221,6 +263,9 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
>                                  &cxl_dstate->device);
>      memory_region_add_subregion(&cxl_dstate->device_registers,
>                                  CXL_MAILBOX_REGISTERS_OFFSET, &cxl_dstate->mailbox);
> +    memory_region_add_subregion(&cxl_dstate->device_registers,
> +                                CXL_MEMORY_DEVICE_REGISTERS_OFFSET,
> +                                &cxl_dstate->memory_device);
>  }
>  
>  static void mailbox_init_common(uint32_t *mbox_regs)
> @@ -233,7 +278,7 @@ static void mailbox_init_common(uint32_t *mbox_regs)
>  void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
>  {
>      uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
> -    const int cap_count = 1;

Guessing this should already have been 2 once the previous patch added the
mailbox capability?

> +    const int cap_count = 3;
>  
>      /* CXL Device Capabilities Array Register */
>      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
> @@ -242,6 +287,7 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
>  
>      cxl_device_cap_init(cxl_dstate, DEVICE, 1);
>      cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
> +    cxl_device_cap_init(cxl_dstate, MEMORY_DEVICE, 0x4000);
>  
>      mailbox_init_common(cxl_dstate->mbox_reg_state32);
>  }
> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> index 2d1b0ef9e4..5d2579800e 100644
> --- a/hw/cxl/cxl-mailbox-utils.c
> +++ b/hw/cxl/cxl-mailbox-utils.c
> @@ -12,6 +12,12 @@
>  #include "hw/pci/pci.h"
>  #include "hw/cxl/cxl.h"
>  
> +enum cxl_opcode {
> +    CXL_EVENTS      = 0x1,
> +    CXL_IDENTIFY    = 0x40,
> +        #define CXL_IDENTIFY_MEMORY_DEVICE = 0x0
> +};
> +
>  /* 8.2.8.4.5.1 Command Return Codes */
>  enum {
>      RET_SUCCESS                 = 0x0,
> @@ -40,6 +46,43 @@ enum {
>      RET_MAX                     = 0x17
>  };
>  
> +/* 8.2.9.5.1.1 */
> +static int cmd_set_identify(CXLDeviceState *cxl_dstate, uint8_t cmd,
> +                            uint32_t *ret_size)

I'm a bit confused by the naming here; perhaps rsp_set_identity makes
it clearer which direction this is going in?  I think this is
filling in the reply to a command from software running on the
system. The current name suggests to me that we are setting the
identity of the hardware.

> +{
> +    struct identify {
> +        char fw_revision[0x10];
> +        uint64_t total_capacity;
> +        uint64_t volatile_capacity;
> +        uint64_t persistent_capacity;
> +        uint64_t partition_align;
> +        uint16_t info_event_log_size;
> +        uint16_t warning_event_log_size;
> +        uint16_t failure_event_log_size;
> +        uint16_t fatal_event_log_size;
> +        uint32_t lsa_size;
> +        uint8_t poison_list_max_mer[3];
> +        uint16_t inject_poison_limit;
> +        uint8_t poison_caps;
> +        uint8_t qos_telemetry_caps;
> +    } __attribute__((packed)) *id;
> +    _Static_assert(sizeof(struct identify) == 0x43, "Bad identify size");
> +
> +    if (memory_region_size(cxl_dstate->pmem) < (256 << 20)) {
> +        return RET_ENODEV;
> +    }
> +
> +    /* PMEM only */
> +    id = (struct identify *)((void *)cxl_dstate->mbox_reg_state +
> +                             A_CXL_DEV_CMD_PAYLOAD);
> +    snprintf(id->fw_revision, 0x10, "BWFW VERSION %02d", 0);
> +    id->total_capacity = memory_region_size(cxl_dstate->pmem);
> +    id->persistent_capacity = memory_region_size(cxl_dstate->pmem);
> +
> +    *ret_size = 0x43;
> +    return RET_SUCCESS;
> +}
> +
>  void process_mailbox(CXLDeviceState *cxl_dstate)
>  {
>      uint16_t ret = RET_SUCCESS;
> @@ -63,8 +106,11 @@ void process_mailbox(CXLDeviceState *cxl_dstate)
>  
>      uint8_t cmd_set = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET);
>      uint8_t cmd = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND);
> -    (void)cmd;

Clean this stuff up before v2.

>      switch (cmd_set) {
> +    case CXL_IDENTIFY:
> +        ret = cmd_set_identify(cxl_dstate, cmd, &ret_len);
> +        /* Fill in payload here */
> +        break;
>      default:
>          ret = RET_ENOTSUP;
>      }
> diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> index df00998def..2cb2a9af3c 100644
> --- a/include/hw/cxl/cxl_device.h
> +++ b/include/hw/cxl/cxl_device.h
> @@ -69,6 +69,10 @@
>  #define CXL_MAILBOX_REGISTERS_LENGTH \
>      (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
>  
> +#define CXL_MEMORY_DEVICE_REGISTERS_OFFSET \
> +    (CXL_MAILBOX_REGISTERS_OFFSET + CXL_MAILBOX_REGISTERS_LENGTH)
> +#define CXL_MEMORY_DEVICE_REGISTERS_LENGTH 0x8
> +
>  typedef struct cxl_device_state {
>      /* Boss container and caps registers */
>      MemoryRegion device_registers;
> @@ -76,6 +80,7 @@ typedef struct cxl_device_state {
>      MemoryRegion caps;
>      MemoryRegion device;
>      MemoryRegion mailbox;
> +    MemoryRegion memory_device;
>  
>      MemoryRegion *pmem;
>      MemoryRegion *vmem;
> @@ -131,6 +136,8 @@ REG32(CXL_DEV_CAP_ARRAY2, 4) /* We're going to pretend it's 64b */
>  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
>  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
>                                                 CXL_DEVICE_CAP_REG_SIZE)
> +CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MEMORY_DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET + \
> +                                                     CXL_DEVICE_CAP_REG_SIZE * 2)
>  
>  void process_mailbox(CXLDeviceState *cxl_dstate);
>  
> @@ -181,4 +188,12 @@ REG32(CXL_DEV_BG_CMD_STS, 0x18)
>  
>  REG32(CXL_DEV_CMD_PAYLOAD, 0x20)
>  
> +/* XXX: actually a 64b registers */
> +REG32(CXL_MEM_DEV_STS, 0)
> +    FIELD(CXL_MEM_DEV_STS, FATAL, 0, 1)
> +    FIELD(CXL_MEM_DEV_STS, FW_HALT, 1, 1)
> +    FIELD(CXL_MEM_DEV_STS, MEDIA_STATUS, 2, 2)
> +    FIELD(CXL_MEM_DEV_STS, MBOX_READY, 4, 1)
> +    FIELD(CXL_MEM_DEV_STS, RESET_NEEDED, 5, 3)
> +
>  #endif




* Re: [RFC PATCH 11/25] hw/pxb: Allow creation of a CXL PXB (host bridge)
  2020-11-11  5:47 ` [RFC PATCH 11/25] hw/pxb: Allow creation of a CXL PXB (host bridge) Ben Widawsky
@ 2020-11-16 16:44   ` Jonathan Cameron
  2020-11-16 22:01     ` Ben Widawsky
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-16 16:44 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Vishal Verma, Dan Williams, qemu-devel, Michael S. Tsirkin

On Tue, 10 Nov 2020 21:47:10 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> This works like adding a typical pxb device, except the name is
> 'pxb-cxl' instead of 'pxb-pcie'. An example command line would be as
> follows:
>   -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1
> 
> A CXL PXB is backward compatible with PCIe. What this means in practice
> is that an operating system that is unaware of CXL should still be able
> to enumerate this topology as if it were PCIe.
> 
> One can create multiple CXL PXB host bridges, but a host bridge can only
> be connected to the main root bus. Host bridges cannot appear elsewhere
> in the topology.
> 
> Note that as of this patch, the ACPI tables needed for the host bridge
> (specifically, an ACPI object in _SB named ACPI0016 and the CEDT) aren't
> created. So while this patch internally creates it, it cannot be
> properly used by an operating system or other system software.
> 
> Upcoming patches will allow creating multiple host bridges.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Hi Ben,

A few minor things inline.

Jonathan

> ---
>  hw/pci-bridge/pci_expander_bridge.c | 67 ++++++++++++++++++++++++++++-
>  hw/pci/pci.c                        |  7 +++
>  include/hw/pci/pci.h                |  6 +++
>  3 files changed, 78 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> index 88c45dc3b5..3a8d815231 100644
> --- a/hw/pci-bridge/pci_expander_bridge.c
> +++ b/hw/pci-bridge/pci_expander_bridge.c
> @@ -56,6 +56,10 @@ DECLARE_INSTANCE_CHECKER(PXBDev, PXB_DEV,
>  DECLARE_INSTANCE_CHECKER(PXBDev, PXB_PCIE_DEV,
>                           TYPE_PXB_PCIE_DEVICE)
>  
> +#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
> +DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
> +                         TYPE_PXB_CXL_DEVICE)
> +
>  struct PXBDev {
>      /*< private >*/
>      PCIDevice parent_obj;
> @@ -67,6 +71,11 @@ struct PXBDev {
>  
>  static PXBDev *convert_to_pxb(PCIDevice *dev)
>  {
> +    /* A CXL PXB's parent bus is PCIe, so the normal check won't work */
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_PXB_CXL_DEVICE)) {
> +        return PXB_CXL_DEV(dev);
> +    }
> +
>      return pci_bus_is_express(pci_get_bus(dev))
>          ? PXB_PCIE_DEV(dev) : PXB_DEV(dev);
>  }
> @@ -111,11 +120,20 @@ static const TypeInfo pxb_pcie_bus_info = {
>      .class_init    = pxb_bus_class_init,
>  };
>  
> +static const TypeInfo pxb_cxl_bus_info = {
> +    .name          = TYPE_PXB_CXL_BUS,
> +    .parent        = TYPE_CXL_BUS,
> +    .instance_size = sizeof(PXBBus),
> +    .class_init    = pxb_bus_class_init,
> +};
> +
>  static const char *pxb_host_root_bus_path(PCIHostState *host_bridge,
>                                            PCIBus *rootbus)
>  {
> -    PXBBus *bus = pci_bus_is_express(rootbus) ?
> -                  PXB_PCIE_BUS(rootbus) : PXB_BUS(rootbus);
> +    PXBBus *bus = pci_bus_is_cxl(rootbus) ?
> +                      PXB_CXL_BUS(rootbus) :
> +                      pci_bus_is_express(rootbus) ? PXB_PCIE_BUS(rootbus) :
> +                                                    PXB_BUS(rootbus);

There comes a point where if / else is much more readable.
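
Roughly the shape I have in mind, as a standalone sketch. MockBus,
MockCast and pxb_pick_cast() are hypothetical stand-ins so that this
compiles outside of QEMU; in the real function the three branches would
assign PXB_CXL_BUS(rootbus), PXB_PCIE_BUS(rootbus) or PXB_BUS(rootbus).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the QEMU types; only the shape matters. */
typedef struct {
    bool is_cxl;      /* what pci_bus_is_cxl() would report */
    bool is_express;  /* what pci_bus_is_express() would report */
} MockBus;

typedef enum {
    CAST_PXB_BUS,
    CAST_PXB_PCIE_BUS,
    CAST_PXB_CXL_BUS,
} MockCast;

static MockCast pxb_pick_cast(const MockBus *rootbus)
{
    /*
     * Most specific type first: TYPE_CXL_BUS derives from TYPE_PCIE_BUS
     * in this patch, so a CXL bus is assumed to report express as well.
     */
    if (rootbus->is_cxl) {
        return CAST_PXB_CXL_BUS;
    } else if (rootbus->is_express) {
        return CAST_PXB_PCIE_BUS;
    } else {
        return CAST_PXB_BUS;
    }
}
```

Same ordering as the nested ternary, just easier to scan and to extend.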

>  
>      snprintf(bus->bus_path, 8, "0000:%02x", pxb_bus_num(rootbus));
>      return bus->bus_path;
> @@ -380,13 +398,58 @@ static const TypeInfo pxb_pcie_dev_info = {
>      },
>  };
>  
> +static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
> +{
> +    /* A CXL PXB's parent bus is still PCIe */
> +    if (!pci_bus_is_express(pci_get_bus(dev))) {
> +        error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
> +        return;
> +    }
> +
> +    pxb_dev_realize_common(dev, CXL, errp);
> +}
> +
> +static void pxb_cxl_dev_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc   = DEVICE_CLASS(klass);
> +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> +
> +    k->realize             = pxb_cxl_dev_realize;
> +    k->exit                = pxb_dev_exitfn;
> +    k->vendor_id           = PCI_VENDOR_ID_INTEL;
> +    k->device_id           = 0xabcd;

Just to check, is that an officially assigned device_id that we will never
have a clash with?  Nice ID to get if it is :)


> +    k->class_id            = PCI_CLASS_BRIDGE_HOST;
> +    k->subsystem_vendor_id = PCI_VENDOR_ID_INTEL;
> +
> +    dc->desc = "CXL Host Bridge";
> +    device_class_set_props(dc, pxb_dev_properties);
> +    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
> +
> +    /* Host bridges aren't hotpluggable. FIXME: spec reference */
> +    dc->hotpluggable = false;
> +}
> +
> +static const TypeInfo pxb_cxl_dev_info = {
> +    .name          = TYPE_PXB_CXL_DEVICE,
> +    .parent        = TYPE_PCI_DEVICE,
> +    .instance_size = sizeof(PXBDev),
> +    .class_init    = pxb_cxl_dev_class_init,
> +    .interfaces =
> +        (InterfaceInfo[]){
> +            { INTERFACE_CONVENTIONAL_PCI_DEVICE },
> +            {},
> +        },
> +};
> +
>  static void pxb_register_types(void)
>  {
>      type_register_static(&pxb_bus_info);
>      type_register_static(&pxb_pcie_bus_info);
> +    type_register_static(&pxb_cxl_bus_info);
>      type_register_static(&pxb_host_info);
>      type_register_static(&pxb_dev_info);
>      type_register_static(&pxb_pcie_dev_info);
> +    type_register_static(&pxb_cxl_dev_info);
>  }
>  
>  type_init(pxb_register_types)
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index db88788c4b..67eed889a4 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -220,6 +220,12 @@ static const TypeInfo pcie_bus_info = {
>      .class_init = pcie_bus_class_init,
>  };
>  
> +static const TypeInfo cxl_bus_info = {
> +    .name       = TYPE_CXL_BUS,
> +    .parent     = TYPE_PCIE_BUS,
> +    .class_init = pcie_bus_class_init,
> +};
> +
>  static PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num);
>  static void pci_update_mappings(PCIDevice *d);
>  static void pci_irq_handler(void *opaque, int irq_num, int level);
> @@ -2847,6 +2853,7 @@ static void pci_register_types(void)
>  {
>      type_register_static(&pci_bus_info);
>      type_register_static(&pcie_bus_info);
> +    type_register_static(&cxl_bus_info);
>      type_register_static(&conventional_pci_interface_info);
>      type_register_static(&cxl_interface_info);
>      type_register_static(&pcie_interface_info);
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index 4e6fd59fdd..52267ff69e 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -405,6 +405,7 @@ typedef PCIINTxRoute (*pci_route_irq_fn)(void *opaque, int pin);
>  #define TYPE_PCI_BUS "PCI"
>  OBJECT_DECLARE_TYPE(PCIBus, PCIBusClass, PCI_BUS)
>  #define TYPE_PCIE_BUS "PCIE"
> +#define TYPE_CXL_BUS "CXL"
>  
>  bool pci_bus_is_express(PCIBus *bus);
>  
> @@ -753,6 +754,11 @@ static inline void pci_irq_pulse(PCIDevice *pci_dev)
>      pci_irq_deassert(pci_dev);
>  }
>  
> +static inline int pci_is_cxl(const PCIDevice *d)
> +{
> +    return d->cap_present & QEMU_PCIE_CAP_CXL;
> +}
> +
>  static inline int pci_is_express(const PCIDevice *d)
>  {
>      return d->cap_present & QEMU_PCI_CAP_EXPRESS;



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [RFC PATCH 12/25] acpi/pci: Consolidate host bridge setup
  2020-11-11  5:47 ` [RFC PATCH 12/25] acpi/pci: Consolidate host bridge setup Ben Widawsky
  2020-11-12 17:46   ` Ben Widawsky
@ 2020-11-16 16:45   ` Jonathan Cameron
  1 sibling, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-16 16:45 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Eduardo Habkost, Michael S. Tsirkin, Vishal Verma, qemu-devel,
	Paolo Bonzini, Igor Mammedov, Dan Williams, Richard Henderson

On Tue, 10 Nov 2020 21:47:11 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> This cleanup will make it easier to add support for CXL to the mix.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>

Nice looking change.  Minor comment inline.

> ---
>  hw/i386/acpi-build.c | 31 +++++++++++++++++--------------
>  1 file changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 4f66642d88..99b3088c9e 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1486,6 +1486,20 @@ static void build_smb0(Aml *table, I2CBus *smbus, int devnr, int func)
>      aml_append(table, scope);
>  }
>  
> +enum { PCI, PCIE };
> +static void init_pci_acpi(Aml *dev, int uid, int type)
> +{
> +    if (type == PCI) {

switch?

> +        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
> +    } else {
> +        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> +        aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> +        aml_append(dev, aml_name_decl("_UID", aml_int(uid)));
> +        aml_append(dev, build_q35_osc_method());
> +    }
> +}
> +
>  static void
>  build_dsdt(GArray *table_data, BIOSLinker *linker,
>             AcpiPmInfo *pm, AcpiMiscInfo *misc,
> @@ -1514,9 +1528,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>      if (misc->is_piix4) {
>          sb_scope = aml_scope("_SB");
>          dev = aml_device("PCI0");
> -        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> +        init_pci_acpi(dev, 0, PCI);
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>          aml_append(sb_scope, dev);
>          aml_append(dsdt, sb_scope);
>  
> @@ -1530,11 +1543,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>      } else {
>          sb_scope = aml_scope("_SB");
>          dev = aml_device("PCI0");
> -        aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> -        aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> +        init_pci_acpi(dev, 0, PCIE);
>          aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
> -        aml_append(dev, aml_name_decl("_UID", aml_int(0)));
> -        aml_append(dev, build_q35_osc_method());
>          aml_append(sb_scope, dev);
>  
>          if (pm->smi_on_cpuhp) {
> @@ -1636,15 +1646,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>  
>              scope = aml_scope("\\_SB");
>              dev = aml_device("PC%.02X", bus_num);
> -            aml_append(dev, aml_name_decl("_UID", aml_int(bus_num)));
>              aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
> -            if (pci_bus_is_express(bus)) {
> -                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> -                aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
> -                aml_append(dev, build_q35_osc_method());
> -            } else {
> -                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
> -            }
> +            init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : PCI);
>  
>              if (numa_node != NUMA_NODE_UNASSIGNED) {
>                  aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [RFC PATCH 15/25] acpi/pxb/cxl: Reserve host bridge MMIO
  2020-11-11  5:47 ` [RFC PATCH 15/25] acpi/pxb/cxl: Reserve host bridge MMIO Ben Widawsky
@ 2020-11-16 16:54   ` Jonathan Cameron
  0 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-16 16:54 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Eduardo Habkost, Michael S. Tsirkin, Vishal Verma, qemu-devel,
	Paolo Bonzini, Igor Mammedov, Dan Williams, Richard Henderson

On Tue, 10 Nov 2020 21:47:14 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> For all host bridges, reserve MMIO space with _CRS. The MMIO for the
> host bridge lives in a magically hard coded space in the system's
> physical address space. The standard mechanism to tell the OS about
> regions which can't be used for host bridges is _CRS.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  hw/i386/acpi-build.c | 22 +++++++++++++++++++---
>  1 file changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index aaed7da7dc..fae4fa28e1 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -28,6 +28,7 @@
>  #include "qemu/bitmap.h"
>  #include "qemu/error-report.h"
>  #include "hw/pci/pci.h"
> +#include "hw/cxl/cxl.h"
>  #include "hw/core/cpu.h"
>  #include "target/i386/cpu.h"
>  #include "hw/misc/pvpanic.h"
> @@ -1486,7 +1487,7 @@ static void build_smb0(Aml *table, I2CBus *smbus, int devnr, int func)
>      aml_append(table, scope);
>  }
>  
> -enum { PCI, PCIE };
> +enum { PCI, PCIE, CXL };
>  static void init_pci_acpi(Aml *dev, int uid, int type)
>  {
>      if (type == PCI) {
> @@ -1635,20 +1636,28 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>              uint8_t bus_num = pci_bus_num(bus);
>              uint8_t numa_node = pci_bus_numa_node(bus);
>              int32_t uid = pci_bus_uid(bus);
> +            int type;
>  
>              /* look only for expander root buses */
>              if (!pci_bus_is_root(bus)) {
>                  continue;
>              }
>  
> +            type = pci_bus_is_cxl(bus) ? CXL :
> +                                         pci_bus_is_express(bus) ? PCIE : PCI;
> +
>              if (bus_num < root_bus_limit) {
>                  root_bus_limit = bus_num - 1;
>              }
>  
>              scope = aml_scope("\\_SB");
> -            dev = aml_device("PC%.02X", bus_num);
> +            if (type == CXL) {
> +                dev = aml_device("CXL%.01X", pci_bus_uid(bus));
> +            } else {
> +                dev = aml_device("PC%.02X", bus_num);
> +            }
>              aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
> -            init_pci_acpi(dev, uid, pci_bus_is_express(bus) ? PCIE : PCI);
> +            init_pci_acpi(dev, uid, type);

Ah, so you are relying on the fact you didn't do a switch in init_pci_acpi.
I'd rather see that called out explicitly there though, as it is then obvious
what subset of bus types share the same init. It isn't trivial to follow through
this code right now and work out what ends up in the AML for each type.
It might even be worth just allowing some repetition to make that easier
to see.
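
As a rough illustration of the explicit form, something along these lines
would make the sharing visible. This is only a sketch: the enum, the struct
and hb_ids_for() are hypothetical names invented here, and it records which
objects would be emitted rather than calling the real aml_* helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the patch's anonymous enum { PCI, PCIE, CXL }. */
enum host_bridge_type { HB_PCI, HB_PCIE, HB_CXL };

/* What a host bridge of a given type declares; NULL or 0 means "not emitted". */
struct hb_ids {
    const char *hid;   /* _HID */
    const char *cid;   /* _CID, or NULL if none */
    int has_osc;       /* non-zero if an _OSC method is appended */
};

static struct hb_ids hb_ids_for(enum host_bridge_type type)
{
    struct hb_ids ids = { NULL, NULL, 0 };

    switch (type) {
    case HB_PCI:
        ids.hid = "PNP0A03";
        break;
    case HB_PCIE:
    case HB_CXL: /* at this point in the series CXL shares the PCIe setup */
        ids.hid = "PNP0A08";
        ids.cid = "PNP0A03";
        ids.has_osc = 1;
        break;
    }

    /* Every known type must have produced a _HID. */
    assert(ids.hid != NULL);
    return ids;
}
```

With every type listed as its own case label, a later patch that gives CXL its
own _HID or _OSC only has to split the shared case.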

>  
>              if (numa_node != NUMA_NODE_UNASSIGNED) {
>                  aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node)));
> @@ -1659,6 +1668,13 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>              aml_append(dev, aml_name_decl("_CRS", crs));
>              aml_append(scope, dev);
>              aml_append(dsdt, scope);
> +
> +            /* Handle the ranges for the PXB expanders */
> +            if (type == CXL) {
> +                uint64_t base = CXL_HOST_BASE + uid * 0x10000;
> +                crs_range_insert(crs_range_set.mem_ranges, base,
> +                                 base + 0x10000 - 1);
> +            }
>          }
>      }
>  
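
For reference, the range arithmetic in the last hunk can be sketched in
isolation as below. The base address passed in is an arbitrary assumed value
(the real CXL_HOST_BASE is defined elsewhere in the series), and the struct
and function names are hypothetical. The sketch widens uid to 64 bits before
multiplying, which the hunk above does not do.

```c
#include <assert.h>
#include <stdint.h>

/* 64 KiB of MMIO per host bridge, matching the 0x10000 used in the hunk. */
#define CXL_HB_MMIO_SIZE 0x10000ULL

/* Inclusive [base, limit] range, the form crs_range_insert() is given. */
struct mmio_range {
    uint64_t base;
    uint64_t limit;
};

/* host_base stands in for CXL_HOST_BASE; its value here is an assumption. */
static struct mmio_range cxl_hb_mmio_range(uint64_t host_base, uint32_t uid)
{
    struct mmio_range r;

    /* Widen uid first so the multiply cannot overflow a 32-bit int. */
    r.base = host_base + (uint64_t)uid * CXL_HB_MMIO_SIZE;
    r.limit = r.base + CXL_HB_MMIO_SIZE - 1;

    assert(r.limit > r.base);
    return r;
}
```

With an assumed base of 0xD0000000, uid 2 would reserve 0xD0020000 through
0xD002FFFF.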



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [RFC PATCH 22/25] acpi/cxl: Create the CEDT (9.14.1)
  2020-11-11  5:47 ` [RFC PATCH 22/25] acpi/cxl: Create the CEDT (9.14.1) Ben Widawsky
@ 2020-11-16 17:15   ` Jonathan Cameron
  2020-11-16 22:05     ` Ben Widawsky
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-16 17:15 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Eduardo Habkost, Michael S. Tsirkin, Vishal Verma, qemu-devel,
	Paolo Bonzini, Igor Mammedov, Dan Williams, Richard Henderson

On Tue, 10 Nov 2020 21:47:21 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The CXL Early Discovery Table is defined in the CXL 2.0 specification as
> a way for the OS to get CXL specific information from the system
> firmware.
> 
> As of CXL 2.0 spec, only 1 sub structure is defined, the CXL Host Bridge
> Structure (CHBS) which is primarily useful for telling the OS exactly
> where the MMIO for the host bridge is.
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Trivial comments inline.

Jonathan

> ---
>  hw/acpi/cxl.c                       | 72 +++++++++++++++++++++++++++++
>  hw/i386/acpi-build.c                |  6 ++-
>  hw/pci-bridge/pci_expander_bridge.c | 21 +--------
>  include/hw/acpi/cxl.h               |  4 ++
>  include/hw/pci/pci_bridge.h         | 25 ++++++++++
>  5 files changed, 107 insertions(+), 21 deletions(-)
> 
> diff --git a/hw/acpi/cxl.c b/hw/acpi/cxl.c
> index 31ceaeecc3..c9631763ad 100644
> --- a/hw/acpi/cxl.c
> +++ b/hw/acpi/cxl.c
> @@ -18,14 +18,86 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "hw/sysbus.h"
> +#include "hw/pci/pci_bridge.h"
> +#include "hw/pci/pci_host.h"
>  #include "hw/cxl/cxl.h"
> +#include "hw/mem/memory-device.h"
>  #include "hw/acpi/acpi.h"
>  #include "hw/acpi/aml-build.h"
>  #include "hw/acpi/bios-linker-loader.h"
>  #include "hw/acpi/cxl.h"
> +#include "hw/acpi/cxl.h"
>  #include "qapi/error.h"
>  #include "qemu/uuid.h"
>  
> +static void cedt_build_chbs(GArray *table_data, PXBDev *cxl)
> +{
> +    SysBusDevice *sbd = SYS_BUS_DEVICE(cxl->cxl.cxl_host_bridge);
> +    struct MemoryRegion *mr = sbd->mmio[0].memory;
> +
> +    /* Type */
> +    build_append_int_noprefix(table_data, 0, 1);
> +
> +    /* Reserved */
> +    build_append_int_noprefix(table_data, 0xff, 1);

Why 0xff rather than 0x00?  ACPI uses a default of 0 for reserved bits
(5.2.1 in ACPI 6.3 spec)

> +
> +    /* Record Length */
> +    build_append_int_noprefix(table_data, 32, 2);
> +
> +    /* UID */
> +    build_append_int_noprefix(table_data, cxl->uid, 4);
> +
> +    /* Version */
> +    build_append_int_noprefix(table_data, 1, 4);
> +
> +    /* Reserved */
> +    build_append_int_noprefix(table_data, 0xffffffff, 4);
> +
> +    /* Base */
> +    build_append_int_noprefix(table_data, mr->addr, 8);
> +
> +    /* Length */
> +    build_append_int_noprefix(table_data, memory_region_size(mr), 4);

Better to just treat this as a 64 bit field as per the spec, even though
it can only contain 0x10000?
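
A minimal sketch of the record with both comments applied, i.e. reserved
fields written as zero and Length emitted as a single 8-byte field. It assumes
Length occupies the final 8 bytes of the 32-byte record, and it writes into a
plain byte buffer instead of using build_append_int_noprefix(); put_le() and
chbs_fill() are hypothetical helpers, not QEMU functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHBS_LEN 32

/* Store the low 'size' bytes of val at buf[off], little-endian. */
static size_t put_le(uint8_t *buf, size_t off, uint64_t val, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        buf[off + i] = (uint8_t)(val >> (8 * i));
    }
    return off + size;
}

/* Fill one CXL Host Bridge Structure; returns the number of bytes written. */
static size_t chbs_fill(uint8_t buf[CHBS_LEN], uint32_t uid,
                        uint64_t base, uint64_t length)
{
    size_t off = 0;

    off = put_le(buf, off, 0, 1);         /* Type: CHBS */
    off = put_le(buf, off, 0, 1);         /* Reserved, zero */
    off = put_le(buf, off, CHBS_LEN, 2);  /* Record Length */
    off = put_le(buf, off, uid, 4);       /* UID */
    off = put_le(buf, off, 1, 4);         /* Version, value as in the patch */
    off = put_le(buf, off, 0, 4);         /* Reserved, zero */
    off = put_le(buf, off, base, 8);      /* Base */
    off = put_le(buf, off, length, 8);    /* Length as one 64-bit field */

    assert(off == CHBS_LEN);
    return off;
}
```

The field sizes still add up to the 32 bytes written in Record Length, so
nothing else in the table would need to change.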

> +
> +    /* Reserved */
> +    build_append_int_noprefix(table_data, 0xffffffff, 4);
> +}
> +
> +static int cxl_foreach_pxb_hb(Object *obj, void *opaque)
> +{
> +    Aml *cedt = opaque;
> +
> +    if (object_dynamic_cast(obj, TYPE_PXB_CXL_DEVICE)) {
> +        PXBDev *pxb = PXB_CXL_DEV(obj);
> +
> +        cedt_build_chbs(cedt->buf, pxb);
> +    }
> +
> +    return 0;
> +}
> +
> +void cxl_build_cedt(GArray *table_offsets, GArray *table_data,
> +                    BIOSLinker *linker)
> +{
> +    const int cedt_start = table_data->len;
> +    Aml *cedt;
> +
> +    cedt = init_aml_allocator();
> +
> +    /* reserve space for CEDT header */
> +    acpi_add_table(table_offsets, table_data);
> +    acpi_data_push(cedt->buf, sizeof(AcpiTableHeader));
> +
> +    object_child_foreach_recursive(object_get_root(), cxl_foreach_pxb_hb, cedt);
> +
> +    /* copy AML table into ACPI tables blob and patch header there */
> +    g_array_append_vals(table_data, cedt->buf->data, cedt->buf->len);
> +    build_header(linker, table_data, (void *)(table_data->data + cedt_start),
> +                 "CEDT", table_data->len - cedt_start, 1, NULL, NULL);
> +    free_aml_allocator();
> +}
> +
>  static Aml *__build_cxl_osc_method(void)
>  {
>      Aml *method, *if_uuid, *else_uuid, *if_arg1_not_1, *if_cxl, *if_caps_masked;
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index dd1f8b39d4..eda62dcd6a 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -75,6 +75,8 @@
>  #include "hw/acpi/ipmi.h"
>  #include "hw/acpi/hmat.h"
>  
> +#include "hw/acpi/cxl.h"
> +
>  /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
>   * -M pc-i440fx-2.0.  Even if the actual amount of AML generated grows
>   * a little bit, there should be plenty of free space since the DSDT
> @@ -1662,7 +1664,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>  
>              scope = aml_scope("\\_SB");
>              if (type == CXL) {
> -                dev = aml_device("CXL%.01X", pci_bus_uid(bus));
> +                dev = aml_device("CXL%.01X", uid);
>              } else {
>                  dev = aml_device("PC%.02X", bus_num);
>              }
> @@ -2568,6 +2570,8 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
>                            machine->nvdimms_state, machine->ram_slots);
>      }
>  
> +    cxl_build_cedt(table_offsets, tables_blob, tables->linker);
> +
>      acpi_add_table(table_offsets, tables_blob);
>      build_waet(tables_blob, tables->linker);
>  
> diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> index 75910f5870..b2c1d9056a 100644
> --- a/hw/pci-bridge/pci_expander_bridge.c
> +++ b/hw/pci-bridge/pci_expander_bridge.c
> @@ -57,26 +57,6 @@ DECLARE_INSTANCE_CHECKER(PXBDev, PXB_DEV,
>  DECLARE_INSTANCE_CHECKER(PXBDev, PXB_PCIE_DEV,
>                           TYPE_PXB_PCIE_DEVICE)
>  
> -#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
> -DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
> -                         TYPE_PXB_CXL_DEVICE)
> -
> -struct PXBDev {
> -    /*< private >*/
> -    PCIDevice parent_obj;
> -    /*< public >*/
> -
> -    uint8_t bus_nr;
> -    uint16_t numa_node;
> -    int32_t uid;
> -    struct cxl_dev {
> -        HostMemoryBackend *memory_window[CXL_WINDOW_MAX];
> -
> -        uint32_t num_windows;
> -        hwaddr *window_base[CXL_WINDOW_MAX];
> -    } cxl;
> -};
> -
>  typedef struct CXLHost {
>      PCIHostState parent_obj;
>  
> @@ -351,6 +331,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
>          bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_CXL_BUS);
>          bus->flags |= PCI_BUS_CXL;
>          PXB_CXL_HOST(ds)->dev = PXB_CXL_DEV(dev);
> +        PXB_CXL_DEV(dev)->cxl.cxl_host_bridge = ds;
>      } else {
>          bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
>          bds = qdev_new("pci-bridge");
> diff --git a/include/hw/acpi/cxl.h b/include/hw/acpi/cxl.h
> index 7b8f3b8a2e..db2063f8c9 100644
> --- a/include/hw/acpi/cxl.h
> +++ b/include/hw/acpi/cxl.h
> @@ -18,6 +18,10 @@
>  #ifndef HW_ACPI_CXL_H
>  #define HW_ACPI_CXL_H
>  
> +#include "hw/acpi/bios-linker-loader.h"
> +
> +void cxl_build_cedt(GArray *table_offsets, GArray *table_data,
> +                    BIOSLinker *linker);
>  void build_cxl_osc_method(Aml *dev);
>  
>  #endif
> diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
> index a94d350034..50dd7fdf33 100644
> --- a/include/hw/pci/pci_bridge.h
> +++ b/include/hw/pci/pci_bridge.h
> @@ -28,6 +28,7 @@
>  
>  #include "hw/pci/pci.h"
>  #include "hw/pci/pci_bus.h"
> +#include "hw/cxl/cxl.h"
>  #include "qom/object.h"
>  
>  typedef struct PCIBridgeWindows PCIBridgeWindows;
> @@ -81,6 +82,30 @@ struct PCIBridge {
>  #define PCI_BRIDGE_DEV_PROP_MSI        "msi"
>  #define PCI_BRIDGE_DEV_PROP_SHPC       "shpc"
>  
> +struct PXBDev {
> +    /*< private >*/
> +    PCIDevice parent_obj;
> +    /*< public >*/
> +
> +    uint8_t bus_nr;
> +    uint16_t numa_node;
> +    int32_t uid;
> +
> +    struct cxl_dev {
> +        HostMemoryBackend *memory_window[CXL_WINDOW_MAX];
> +
> +        uint32_t num_windows;
> +        hwaddr *window_base[CXL_WINDOW_MAX];
> +
> +        void *cxl_host_bridge; /* Pointer to a CXLHost */
> +    } cxl;
> +};
> +
> +typedef struct PXBDev PXBDev;
> +#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
> +DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
> +                         TYPE_PXB_CXL_DEVICE)
> +

Seems like this could sensibly be on one line?
Could have been in earlier patch as well of course.

>  int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
>                            uint16_t svid, uint16_t ssid,
>                            Error **errp);



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [RFC PATCH 00/25] Introduce CXL 2.0 Emulation
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (24 preceding siblings ...)
  2020-11-11  5:47 ` [RFC PATCH 25/25] qtest/cxl: Add very basic sanity tests Ben Widawsky
@ 2020-11-16 17:21 ` Jonathan Cameron
  2020-11-16 18:06   ` Ben Widawsky
  2020-12-04 14:27 ` Daniel P. Berrangé
  26 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-16 17:21 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Vishal Verma, Dan Williams, qemu-devel

On Tue, 10 Nov 2020 21:46:59 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Introduce emulation of Compute Express Link 2.0, which was released
> today at https://www.computeexpresslink.org/.
> 
> I've pushed a branch here: https://gitlab.com/bwidawsk/qemu/-/tree/cxl-2.0
> 
> The emulation has been critical to get the Linux enabling started
> (https://lore.kernel.org/linux-cxl/), it would be an ideal place to land
> regression tests for different topology handling, and there may be applications
> for this emulation as a way for a guest to manipulate its address space relative
> to different performance memories. I am new to QEMU development, so please
> forgive and point me in the right direction if I severely misinterpreted where a
> piece of infrastructure belongs.
> 
> Three of the five CXL component types are emulated with some level of functionality:
> host bridge, root port, and memory device. Upstream ports and downstream ports
> aren't implemented (the two components needed to make up a switch).
> 
> CXL 2.0 is built on top of PCIe (see spec for details). As a result, much of the
> implementation utilizes existing PCI paradigms. To implement the host bridge,
> I've chosen to use PXB (PCI Expander Bridge). It seemed to be the most natural
> fit even though it doesn't directly map to how hardware will work. For
> persistent capacity of the memory device, I utilized the memory subsystem
> (hw/mem).
> 
> We have 3 reasons why this work is valuable:
> 1. OS driver development and testing
> 2. OS driver regression testing
> 3. Possible guest support for HDMs
> 
> As mentioned above there are three benefits to carrying this enabling in
> upstream QEMU:
> 
> 1. Linux driver feature development benefits from emulation both due to
> a lack of initial hardware availability, but also, as is seen with
> NVDIMM/PMEM emulation, there is value in being able to share
> topologies with system-software developers even after hardware is
> available.
> 
> 2. The Linux kernel's unit test suite for NVDIMM/PMEM ended up injecting fake
> resources via custom modules (nfit_test). In retrospect a QEMU emulation of
> nfit_test capabilities would have made the test environment more portable, and
> allowed for easier community contributions of example configurations.
> 
> 3. This is still being fleshed out, but in short it provides a standardized
> mechanism for the guest to provide feedback to the host about size and placement
> needs of the memory. After the host gives the guest a physical window mapping to
> the CXL device, the emulated HDM decoders allow the guest a way to tell the host
> how much it wants and where. There are likely simpler ways to do this, but
> they'd require inventing a new interface and you'd need to have diverging driver
> code in the guest programming of the HDM decoder vs. the host. Since we've
> already done this work, why not use it?
> 
> There is quite a long list of work to do for full spec compliance, but I don't
> believe that any of it precludes merging. Off the top of my head:
> - Main host bridge support (WIP)
> - Interleaving
> - Better Tests
> - Huge swaths of firmware functionality
> - Hot plug support
> - Emulating volatile capacity
> 
> The flow of the patches in general is to define all the data structures and
> registers associated with the various components in a top down manner. Host
> bridge, component, ports, devices. Then, the actual implementation is done in
> the same order.
> 
> The summary is:
> 1-8: Put infrastructure in place for emulation of the components.
> 9-11: Create the concept of a CXL bus and plumb into PXB
> 12-16: Implement host bridges
> 17: Implement a root port
> 18: Implement a memory device
> 19: Implement HDM decoders
> 20-24: ACPI bits
> 25: Start working on enabling the main host bridge

Hi Ben,

I've taken a look at the whole series and offered a few comments on things that
stood out.  Unfortunately I'm playing catchup on CXL 2.0 and my qemu knowledge
is not what I'd like it to be.

Having said that, this feels like a good start to me.  Please clean up
the few patch handling issues before a v2.  Code that appears, disappears and
reappears is a bit distracting :)

Next up, the kernel side.

Thanks,

Jonathan

> 
> Ben Widawsky (23):
>   hw/pci/cxl: Add a CXL component type (interface)
>   hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
>   hw/cxl/device: Introduce a CXL device (8.2.8)
>   hw/cxl/device: Implement the CAP array (8.2.8.1-2)
>   hw/cxl/device: Add device status (8.2.8.3)
>   hw/cxl/device: Implement basic mailbox (8.2.8.4)
>   hw/cxl/device: Add memory devices (8.2.8.5)
>   hw/pxb: Use a type for realizing expanders
>   hw/pci/cxl: Create a CXL bus type
>   hw/pxb: Allow creation of a CXL PXB (host bridge)
>   acpi/pci: Consolidate host bridge setup
>   hw/pci: Plumb _UID through host bridges
>   hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
>   acpi/pxb/cxl: Reserve host bridge MMIO
>   hw/pxb/cxl: Add "windows" for host bridges
>   hw/cxl/rp: Add a root port
>   hw/cxl/device: Add a memory device (8.2.8.5)
>   hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)
>   acpi/cxl: Add _OSC implementation (9.14.2)
>   acpi/cxl: Create the CEDT (9.14.1)
>   Temp: acpi/cxl: Add ACPI0017 (CEDT awareness)
>   WIP: i386/cxl: Initialize a host bridge
>   qtest/cxl: Add very basic sanity tests
> 
> Jonathan Cameron (1):
>   Temp: Add the PCI_EXT_ID_DVSEC definition to the qemu pci_regs.h copy.
> 
> Vishal Verma (1):
>   acpi/cxl: Introduce a compat-driver UUID for CXL _OSC
> 
>  MAINTAINERS                               |   6 +
>  hw/Kconfig                                |   1 +
>  hw/acpi/Kconfig                           |   5 +
>  hw/acpi/cxl.c                             | 198 +++++++++++++
>  hw/acpi/meson.build                       |   1 +
>  hw/arm/virt.c                             |   1 +
>  hw/core/machine.c                         |  26 ++
>  hw/core/numa.c                            |   3 +
>  hw/cxl/Kconfig                            |   3 +
>  hw/cxl/cxl-component-utils.c              | 192 +++++++++++++
>  hw/cxl/cxl-device-utils.c                 | 293 +++++++++++++++++++
>  hw/cxl/cxl-mailbox-utils.c                | 139 +++++++++
>  hw/cxl/meson.build                        |   5 +
>  hw/i386/acpi-build.c                      |  87 +++++-
>  hw/i386/microvm.c                         |   1 +
>  hw/i386/pc.c                              |   2 +
>  hw/mem/Kconfig                            |   5 +
>  hw/mem/cxl_type3.c                        | 334 ++++++++++++++++++++++
>  hw/mem/meson.build                        |   1 +
>  hw/meson.build                            |   1 +
>  hw/pci-bridge/Kconfig                     |   5 +
>  hw/pci-bridge/cxl_root_port.c             | 231 +++++++++++++++
>  hw/pci-bridge/meson.build                 |   1 +
>  hw/pci-bridge/pci_expander_bridge.c       | 209 +++++++++++++-
>  hw/pci-bridge/pcie_root_port.c            |   6 +-
>  hw/pci/pci.c                              |  32 ++-
>  hw/pci/pcie.c                             |  30 ++
>  hw/ppc/spapr.c                            |   2 +
>  include/hw/acpi/cxl.h                     |  27 ++
>  include/hw/boards.h                       |   2 +
>  include/hw/cxl/cxl.h                      |  30 ++
>  include/hw/cxl/cxl_component.h            | 181 ++++++++++++
>  include/hw/cxl/cxl_device.h               | 199 +++++++++++++
>  include/hw/cxl/cxl_pci.h                  | 155 ++++++++++
>  include/hw/pci/pci.h                      |  15 +
>  include/hw/pci/pci_bridge.h               |  25 ++
>  include/hw/pci/pci_bus.h                  |   8 +
>  include/hw/pci/pci_ids.h                  |   1 +
>  include/standard-headers/linux/pci_regs.h |   1 +
>  monitor/hmp-cmds.c                        |  15 +
>  qapi/machine.json                         |   1 +
>  tests/qtest/cxl-test.c                    |  93 ++++++
>  tests/qtest/meson.build                   |   4 +
>  43 files changed, 2547 insertions(+), 30 deletions(-)
>  create mode 100644 hw/acpi/cxl.c
>  create mode 100644 hw/cxl/Kconfig
>  create mode 100644 hw/cxl/cxl-component-utils.c
>  create mode 100644 hw/cxl/cxl-device-utils.c
>  create mode 100644 hw/cxl/cxl-mailbox-utils.c
>  create mode 100644 hw/cxl/meson.build
>  create mode 100644 hw/mem/cxl_type3.c
>  create mode 100644 hw/pci-bridge/cxl_root_port.c
>  create mode 100644 include/hw/acpi/cxl.h
>  create mode 100644 include/hw/cxl/cxl.h
>  create mode 100644 include/hw/cxl/cxl_component.h
>  create mode 100644 include/hw/cxl/cxl_device.h
>  create mode 100644 include/hw/cxl/cxl_pci.h
>  create mode 100644 tests/qtest/cxl-test.c
> 



^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [RFC PATCH 00/25] Introduce CXL 2.0 Emulation
  2020-11-16 17:21 ` [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Jonathan Cameron
@ 2020-11-16 18:06   ` Ben Widawsky
  2020-11-17 14:09     ` Jonathan Cameron
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-16 18:06 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: Vishal Verma, Dan Williams, qemu-devel

On 20-11-16 17:21:07, Jonathan Cameron wrote:
> On Tue, 10 Nov 2020 21:46:59 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > Introduce emulation of Compute Express Link 2.0, which was released
> > today at https://www.computeexpresslink.org/.
> > 
> > I've pushed a branch here: https://gitlab.com/bwidawsk/qemu/-/tree/cxl-2.0
> > 
> > The emulation has been critical to get the Linux enabling started
> > (https://lore.kernel.org/linux-cxl/), it would be an ideal place to land
> > regression tests for different topology handling, and there may be applications
> > for this emulation as a way for a guest to manipulate its address space relative
> > to different performance memories. I am new to QEMU development, so please
> > forgive and point me in the right direction if I severely misinterpreted where a
> > piece of infrastructure belongs.
> > 
> > Three of the five CXL component types are emulated with some level of functionality:
> > host bridge, root port, and memory device. Upstream ports and downstream ports
> > aren't implemented (the two components needed to make up a switch).
> > 
> > CXL 2.0 is built on top of PCIe (see spec for details). As a result, much of the
> > implementation utilizes existing PCI paradigms. To implement the host bridge,
> > I've chosen to use PXB (PCI Expander Bridge). It seemed to be the most natural
> > fit even though it doesn't directly map to how hardware will work. For
> > persistent capacity of the memory device, I utilized the memory subsystem
> > (hw/mem).
> > 
> > We have 3 reasons why this work is valuable:
> > 1. OS driver development and testing
> > 2. OS driver regression testing
> > 3. Possible guest support for HDMs
> > 
> > As mentioned above there are three benefits to carrying this enabling in
> > upstream QEMU:
> > 
> > 1. Linux driver feature development benefits from emulation both due to
> > a lack of initial hardware availability, but also, as is seen with
> > NVDIMM/PMEM emulation, there is value in being able to share
> > topologies with system-software developers even after hardware is
> > available.
> > 
> > 2. The Linux kernel's unit test suite for NVDIMM/PMEM ended up injecting fake
> > resources via custom modules (nfit_test). In retrospect a QEMU emulation of
> > nfit_test capabilities would have made the test environment more portable, and
> > allowed for easier community contributions of example configurations.
> > 
> > 3. This is still being fleshed out, but in short it provides a standardized
> > mechanism for the guest to provide feedback to the host about size and placement
> > needs of the memory. After the host gives the guest a physical window mapping to
> > the CXL device, the emulated HDM decoders allow the guest a way to tell the host
> > how much it wants and where. There are likely simpler ways to do this, but
> > they'd require inventing a new interface and you'd need to have diverging driver
> > code in the guest programming of the HDM decoder vs. the host. Since we've
> > already done this work, why not use it?
> > 
> > There is quite a long list of work to do for full spec compliance, but I don't
> > believe that any of it precludes merging. Off the top of my head:
> > - Main host bridge support (WIP)
> > - Interleaving
> > - Better Tests
> > - Huge swaths of firmware functionality
> > - Hot plug support
> > - Emulating volatile capacity
> > 
> > The flow of the patches in general is to define all the data structures and
> > registers associated with the various components in a top down manner. Host
> > bridge, component, ports, devices. Then, the actual implementation is done in
> > the same order.
> > 
> > The summary is:
> > 1-8: Put infrastructure in place for emulation of the components.
> > 9-11: Create the concept of a CXL bus and plumb into PXB
> > 12-16: Implement host bridges
> > 17: Implement a root port
> > 18: Implement a memory device
> > 19: Implement HDM decoders
> > 20-24: ACPI bits
> > 25: Start working on enabling the main host bridge
> 
> Hi Ben,
> 
> I've taken a look at the whole series and offered a few comments on things that
> stood out.  Unfortunately I'm playing catchup on CXL 2.0 and my qemu knowledge
> is not what I'd like it to be.
> 
> Having said that, this feels like a good start to me.  Please clean up
> the few patch handling issues before a v2.  Code that appears, disappears and
> reappears is a bit distracting :)
> 
> Next up, the kernel side.
> 
> Thanks,
> 
> Jonathan

Thanks very much for taking the time Jonathan. I saw your CCIX series early on
and it was definitely helpful to me, so thanks for that as well. As you can
probably tell, this series has been rebased to hell and back and you caught some
of that in the code churn. I'll work on fixing those. I foolishly did a pretty
major refactor just before submission.

I wanted to discuss the 'dump all the defines in a patch and use them later'
style I went for. In general, I don't do this and I leave feedback on patches
that do this. I had two reasons for doing it here:
1. I wanted to separate a 'go read the spec' review from actual functionality.
   I hope some of the issues you spotted were because of that.
2. Since I decided to make all the helper libraries first, many defines are
   needed for that.

For v2, I'll make sure there are no #define only patches, but I would still like
to introduce the helper libraries first which will leave some unused functions
and defines for a few patches.

Ben

> 
> > 
> > Ben Widawsky (23):
> >   hw/pci/cxl: Add a CXL component type (interface)
> >   hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
> >   hw/cxl/device: Introduce a CXL device (8.2.8)
> >   hw/cxl/device: Implement the CAP array (8.2.8.1-2)
> >   hw/cxl/device: Add device status (8.2.8.3)
> >   hw/cxl/device: Implement basic mailbox (8.2.8.4)
> >   hw/cxl/device: Add memory devices (8.2.8.5)
> >   hw/pxb: Use a type for realizing expanders
> >   hw/pci/cxl: Create a CXL bus type
> >   hw/pxb: Allow creation of a CXL PXB (host bridge)
> >   acpi/pci: Consolidate host bridge setup
> >   hw/pci: Plumb _UID through host bridges
> >   hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
> >   acpi/pxb/cxl: Reserve host bridge MMIO
> >   hw/pxb/cxl: Add "windows" for host bridges
> >   hw/cxl/rp: Add a root port
> >   hw/cxl/device: Add a memory device (8.2.8.5)
> >   hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)
> >   acpi/cxl: Add _OSC implementation (9.14.2)
> >   acpi/cxl: Create the CEDT (9.14.1)
> >   Temp: acpi/cxl: Add ACPI0017 (CEDT awareness)
> >   WIP: i386/cxl: Initialize a host bridge
> >   qtest/cxl: Add very basic sanity tests
> > 
> > Jonathan Cameron (1):
> >   Temp: Add the PCI_EXT_ID_DVSEC definition to the qemu pci_regs.h copy.
> > 
> > Vishal Verma (1):
> >   acpi/cxl: Introduce a compat-driver UUID for CXL _OSC
> > 
> >  MAINTAINERS                               |   6 +
> >  hw/Kconfig                                |   1 +
> >  hw/acpi/Kconfig                           |   5 +
> >  hw/acpi/cxl.c                             | 198 +++++++++++++
> >  hw/acpi/meson.build                       |   1 +
> >  hw/arm/virt.c                             |   1 +
> >  hw/core/machine.c                         |  26 ++
> >  hw/core/numa.c                            |   3 +
> >  hw/cxl/Kconfig                            |   3 +
> >  hw/cxl/cxl-component-utils.c              | 192 +++++++++++++
> >  hw/cxl/cxl-device-utils.c                 | 293 +++++++++++++++++++
> >  hw/cxl/cxl-mailbox-utils.c                | 139 +++++++++
> >  hw/cxl/meson.build                        |   5 +
> >  hw/i386/acpi-build.c                      |  87 +++++-
> >  hw/i386/microvm.c                         |   1 +
> >  hw/i386/pc.c                              |   2 +
> >  hw/mem/Kconfig                            |   5 +
> >  hw/mem/cxl_type3.c                        | 334 ++++++++++++++++++++++
> >  hw/mem/meson.build                        |   1 +
> >  hw/meson.build                            |   1 +
> >  hw/pci-bridge/Kconfig                     |   5 +
> >  hw/pci-bridge/cxl_root_port.c             | 231 +++++++++++++++
> >  hw/pci-bridge/meson.build                 |   1 +
> >  hw/pci-bridge/pci_expander_bridge.c       | 209 +++++++++++++-
> >  hw/pci-bridge/pcie_root_port.c            |   6 +-
> >  hw/pci/pci.c                              |  32 ++-
> >  hw/pci/pcie.c                             |  30 ++
> >  hw/ppc/spapr.c                            |   2 +
> >  include/hw/acpi/cxl.h                     |  27 ++
> >  include/hw/boards.h                       |   2 +
> >  include/hw/cxl/cxl.h                      |  30 ++
> >  include/hw/cxl/cxl_component.h            | 181 ++++++++++++
> >  include/hw/cxl/cxl_device.h               | 199 +++++++++++++
> >  include/hw/cxl/cxl_pci.h                  | 155 ++++++++++
> >  include/hw/pci/pci.h                      |  15 +
> >  include/hw/pci/pci_bridge.h               |  25 ++
> >  include/hw/pci/pci_bus.h                  |   8 +
> >  include/hw/pci/pci_ids.h                  |   1 +
> >  include/standard-headers/linux/pci_regs.h |   1 +
> >  monitor/hmp-cmds.c                        |  15 +
> >  qapi/machine.json                         |   1 +
> >  tests/qtest/cxl-test.c                    |  93 ++++++
> >  tests/qtest/meson.build                   |   4 +
> >  43 files changed, 2547 insertions(+), 30 deletions(-)
> >  create mode 100644 hw/acpi/cxl.c
> >  create mode 100644 hw/cxl/Kconfig
> >  create mode 100644 hw/cxl/cxl-component-utils.c
> >  create mode 100644 hw/cxl/cxl-device-utils.c
> >  create mode 100644 hw/cxl/cxl-mailbox-utils.c
> >  create mode 100644 hw/cxl/meson.build
> >  create mode 100644 hw/mem/cxl_type3.c
> >  create mode 100644 hw/pci-bridge/cxl_root_port.c
> >  create mode 100644 include/hw/acpi/cxl.h
> >  create mode 100644 include/hw/cxl/cxl.h
> >  create mode 100644 include/hw/cxl/cxl_component.h
> >  create mode 100644 include/hw/cxl/cxl_device.h
> >  create mode 100644 include/hw/cxl/cxl_pci.h
> >  create mode 100644 tests/qtest/cxl-test.c
> > 
> 



* Re: [RFC PATCH 05/25] hw/cxl/device: Implement the CAP array (8.2.8.1-2)
  2020-11-16 13:11   ` Jonathan Cameron
@ 2020-11-16 18:08     ` Ben Widawsky
  0 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-16 18:08 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: Vishal Verma, Dan Williams, qemu-devel

On 20-11-16 13:11:19, Jonathan Cameron wrote:
> On Tue, 10 Nov 2020 21:47:04 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > This implements all device MMIO up to the first capability. That
> > includes the CXL Device Capabilities Array Register, as well as all of
> > the CXL Device Capability Header Registers. The latter are filled in as
> > they are implemented in the following patches.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> Question below.
> 
> Thanks,
> 
> Jonathan
> 
> > ---
> >  hw/cxl/cxl-device-utils.c | 73 +++++++++++++++++++++++++++++++++++++++
> >  hw/cxl/meson.build        |  1 +
> >  2 files changed, 74 insertions(+)
> > 
> > diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> > index e69de29bb2..a391bb15c6 100644
> > --- a/hw/cxl/cxl-device-utils.c
> > +++ b/hw/cxl/cxl-device-utils.c
> > @@ -0,0 +1,73 @@
> > +/*
> > + * CXL Utility library for devices
> > + *
> > + * Copyright(C) 2020 Intel Corporation.
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/log.h"
> > +#include "hw/cxl/cxl.h"
> > +
> > +static uint64_t caps_reg_read(void *opaque, hwaddr offset, unsigned size)
> > +{
> > +    CXLDeviceState *cxl_dstate = opaque;
> > +
> > +    switch (size) {
> > +    case 4:
> > +        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
> > +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> > +            return 0;
> > +        }
> > +        break;
> > +    case 8:
> > +        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
> > +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> > +            return 0;
> > +        }
> > +        break;
> 
> Seems unlikely but in theory we might get other sizes such as 2 and need
> that to be aligned?
> 
> If we just don't want to support them perhaps a default with suitable error
> print is appropriate?
> 

It shouldn't happen - the .impl field below makes QEMU internals round access
sizes up/down so that the callback only sees 4 or 8. I've been assuming I don't
need to handle non powers of 2, but if I do, that'd be an issue.
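
To make the power-of-two assumption concrete: if other sizes ever did reach the
read callback, the two cases in the switch could collapse into a single check.
This is a rough sketch, not what the patch does. access_size_is_pow2() and
access_is_aligned() are made-up names for illustration and don't exist in QEMU:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when size is a non-zero power of two. */
static bool access_size_is_pow2(unsigned size)
{
    return size != 0 && (size & (size - 1)) == 0;
}

/*
 * Hypothetical helper: true when an access of `size` bytes at `offset` is
 * naturally aligned. Sizes that are not a power of two are rejected, since
 * the mask trick below only works for powers of two.
 */
static bool access_is_aligned(uint64_t offset, unsigned size)
{
    return access_size_is_pow2(size) &&
           (offset & ((uint64_t)size - 1)) == 0;
}
```

With something like this a 2 byte read at offset 2 would pass, and any non power
of 2 size would be rejected outright instead of falling through the switch.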

> > +    }
> > +
> > +    return ldn_le_p(cxl_dstate->caps_reg_state + offset, size);
> > +}
> > +
> > +static const MemoryRegionOps caps_ops = {
> > +    .read = caps_reg_read,
> > +    .write = NULL,
> > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > +    .valid = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 8,
> > +    },
> > +    .impl = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 8,
> > +    },
> > +};
> > +
> > +void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
> > +{
> > +    /* This will be a BAR, so needs to be rounded up to pow2 for PCI spec */
> > +    memory_region_init(
> > +        &cxl_dstate->device_registers, obj, "device-registers",
> > +        pow2ceil(CXL_MAILBOX_REGISTERS_LENGTH + CXL_MAILBOX_REGISTERS_OFFSET));
> > +
> > +    memory_region_init_io(&cxl_dstate->caps, obj, &caps_ops, cxl_dstate,
> > +                          "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
> > +
> > +    memory_region_add_subregion(&cxl_dstate->device_registers, 0,
> > +                                &cxl_dstate->caps);
> > +}
> > +
> > +void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> > +{
> > +    uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
> > +    const int cap_count = 0;
> > +
> > +    /* CXL Device Capabilities Array Register */
> > +    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
> > +    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_VERSION, 1);
> > +    ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY2, CAP_COUNT, cap_count);
> > +}
> > diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> > index 00c3876a0f..47154d6850 100644
> > --- a/hw/cxl/meson.build
> > +++ b/hw/cxl/meson.build
> > @@ -1,3 +1,4 @@
> >  softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
> >    'cxl-component-utils.c',
> > +  'cxl-device-utils.c',
> >  ))
> 



* Re: [RFC PATCH 03/25] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
  2020-11-16 12:03   ` Jonathan Cameron
@ 2020-11-16 19:19     ` Ben Widawsky
  2020-11-17 12:29       ` Jonathan Cameron
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-16 19:19 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: Vishal Verma, Dan Williams, qemu-devel

On 20-11-16 12:03:52, Jonathan Cameron wrote:
> On Tue, 10 Nov 2020 21:47:02 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > A CXL 2.0 component is any entity in the CXL topology. All components
> > have an analogous function in PCIe. Except for the CXL host bridge, all
> > have a PCIe config space that is accessible via the common PCIe
> > mechanisms. CXL components are enumerated via DVSEC fields in the
> > extended PCIe header space. CXL components will minimally implement some
> > subset of CXL.mem and CXL.cache registers defined in 8.2.5 of the CXL
> > 2.0 specification. Two headers and a utility library are introduced to
> > support the minimum functionality needed to enumerate components.
> > 
> > The cxl_pci header manages bits associated with PCI, specifically the
> > DVSEC and related fields. The cxl_component.h variant has data
> > structures and APIs that are useful for drivers implementing any of the
> > CXL 2.0 components. The library takes care of making use of the DVSEC
> > bits and the CXL.[mem|cache] registers.
> > 
> > None of the mechanisms required to enumerate a CXL capable hostbridge
> > are introduced at this point.
> > 
> > Note that the CXL.mem and CXL.cache registers used are always 4B wide.
> > It's possible in the future that this constraint will not hold.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > 
> > --
> > It's tempting to have a more generalized DVSEC infrastructure. As far as
> > I can tell, the amount this would actually save in terms of code is
> > minimal because most of DVSEC is vendor specific.
> 
> Agreed.  Probably not worth bothering with generic infrastructure for 2.5 DW.
> 
> A few comments inline.
> 
> Jonathan
> 

Anything I didn't respond to is accepted and will be in v2.

Thanks.
Ben

> 
> > ---
> >  MAINTAINERS                    |   6 ++
> >  hw/Kconfig                     |   1 +
> >  hw/cxl/Kconfig                 |   3 +
> >  hw/cxl/cxl-component-utils.c   | 192 +++++++++++++++++++++++++++++++++
> >  hw/cxl/cxl-device-utils.c      |   0
> >  hw/cxl/meson.build             |   3 +
> >  hw/meson.build                 |   1 +
> >  include/hw/cxl/cxl.h           |  17 +++
> >  include/hw/cxl/cxl_component.h | 181 +++++++++++++++++++++++++++++++
> >  include/hw/cxl/cxl_pci.h       | 133 +++++++++++++++++++++++
> >  10 files changed, 537 insertions(+)
> >  create mode 100644 hw/cxl/Kconfig
> >  create mode 100644 hw/cxl/cxl-component-utils.c
> >  create mode 100644 hw/cxl/cxl-device-utils.c
> >  create mode 100644 hw/cxl/meson.build
> >  create mode 100644 include/hw/cxl/cxl.h
> >  create mode 100644 include/hw/cxl/cxl_component.h
> >  create mode 100644 include/hw/cxl/cxl_pci.h
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index c1d16026ba..02b8e2274d 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -2184,6 +2184,12 @@ F: qapi/block*.json
> >  F: qapi/transaction.json
> >  T: git https://repo.or.cz/qemu/armbru.git block-next
> >  
> > +Compute Express Link
> > +M: Ben Widawsky <ben.widawsky@intel.com>
> > +S: Supported
> > +F: hw/cxl/
> > +F: include/hw/cxl/
> > +
> >  Dirty Bitmaps
> >  M: Eric Blake <eblake@redhat.com>
> >  M: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > diff --git a/hw/Kconfig b/hw/Kconfig
> > index 4de1797ffd..efed27805a 100644
> > --- a/hw/Kconfig
> > +++ b/hw/Kconfig
> > @@ -6,6 +6,7 @@ source audio/Kconfig
> >  source block/Kconfig
> >  source char/Kconfig
> >  source core/Kconfig
> > +source cxl/Kconfig
> >  source display/Kconfig
> >  source dma/Kconfig
> >  source gpio/Kconfig
> > diff --git a/hw/cxl/Kconfig b/hw/cxl/Kconfig
> > new file mode 100644
> > index 0000000000..8e67519b16
> > --- /dev/null
> > +++ b/hw/cxl/Kconfig
> > @@ -0,0 +1,3 @@
> > +config CXL
> > +    bool
> > +    default y if PCI_EXPRESS
> > diff --git a/hw/cxl/cxl-component-utils.c b/hw/cxl/cxl-component-utils.c
> > new file mode 100644
> > index 0000000000..c52bd5bfc7
> > --- /dev/null
> > +++ b/hw/cxl/cxl-component-utils.c
> > @@ -0,0 +1,192 @@
> > +/*
> > + * CXL Utility library for components
> > + *
> > + * Copyright(C) 2020 Intel Corporation.
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/log.h"
> > +#include "hw/pci/pci.h"
> > +#include "hw/cxl/cxl.h"
> > +
> > +static uint64_t cxl_cache_mem_read_reg(void *opaque, hwaddr offset,
> > +                                       unsigned size)
> > +{
> > +    CXLComponentState *cxl_cstate = opaque;
> > +    ComponentRegisters *cregs = &cxl_cstate->crb;
> > +    uint32_t *cache_mem = cregs->cache_mem_registers;
> > +
> > +    if (size != 4) {
> > +        qemu_log_mask(LOG_UNIMP, "%uB component register read (RAZ)\n", size);
> > +        return 0;
> > +    }
> > +
> > +    if (cregs->special_ops && cregs->special_ops->read) {
> > +        return cregs->special_ops->read(cxl_cstate, offset, size);
> > +    } else {
> > +        return cache_mem[offset >> 2];
> > +    }
> > +}
> > +
> > +static void cxl_cache_mem_write_reg(void *opaque, hwaddr offset, uint64_t value,
> > +                                    unsigned size)
> > +{
> > +    CXLComponentState *cxl_cstate = opaque;
> > +    ComponentRegisters *cregs = &cxl_cstate->crb;
> > +
> > +    if (size != 4) {
> > +        qemu_log_mask(LOG_UNIMP, "%uB component register write (WI)\n", size);
> > +        return;
> > +    }
> > +
> > +    if (cregs->special_ops && cregs->special_ops->write) {
> > +        cregs->special_ops->write(cxl_cstate, offset, value, size);
> > +    }
> > +}
> > +
> > +static const MemoryRegionOps cache_mem_ops = {
> > +    .read = cxl_cache_mem_read_reg,
> > +    .write = cxl_cache_mem_write_reg,
> > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > +    .valid = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 4,
> > +    },
> > +    .impl = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 4,
> > +    },
> > +};
> > +
> > +void cxl_component_register_block_init(Object *obj,

> > +                                       CXLComponentState *cxl_cstate,
> > +                                       const char *type)
> 
> What the type parameter means is not immediately obvious, so docs would be appreciated.

It's just the name associated with the memory region internal to QEMU. AFAICT,
it doesn't serve any purpose other than for debugging, but I also haven't looked
at exactly how memory access functions work. Since I have no callers of the
function I think it is indeed a good idea to document it.

> 
> > +{
> > +    ComponentRegisters *cregs = &cxl_cstate->crb;
> > +
> > +    memory_region_init(&cregs->component_registers, obj, type, 0x10000);
> > +    memory_region_init_io(&cregs->io, obj, NULL, cregs, ".io", 0x1000);
> > +    memory_region_init_io(&cregs->cache_mem, obj, &cache_mem_ops, cregs,
> > +                          ".cache_mem", 0x1000);
> > +
> > +    memory_region_add_subregion(&cregs->component_registers, 0, &cregs->io);
> > +    memory_region_add_subregion(&cregs->component_registers, 0x1000,
> > +                                &cregs->cache_mem);
> > +}
> > +
> > +static void ras_init_common(uint32_t *reg_state)
> > +{
> > +    reg_state[R_CXL_RAS_UNC_ERR_STATUS] = 0;
> > +    reg_state[R_CXL_RAS_UNC_ERR_MASK] = 0x1efff;
> > +    reg_state[R_CXL_RAS_UNC_ERR_SEVERITY] = 0x1efff;
> > +    reg_state[R_CXL_RAS_COR_ERR_STATUS] = 0;
> > +    reg_state[R_CXL_RAS_COR_ERR_MASK] = 0x3f;
> > +    reg_state[R_CXL_RAS_ERR_CAP_CTRL] = 0; /* CXL switches and devices must set */
> > +}
> > +
> > +static void hdm_init_common(uint32_t *reg_state)
> > +{
> > +    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0);
> > +    ARRAY_FIELD_DP32(reg_state, CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 0);
> > +}
> > +
> > +void cxl_component_register_init_common(uint32_t *reg_state, enum reg_type type)
> > +{
> > +    int caps = 0;
> 
> > +    switch (type) {
> > +    case CXL2_DOWNSTREAM_PORT:
> > +    case CXL2_DEVICE:
> > +        /* CAP, RAS, Link */
> > +        caps = 3;
> > +        break;
> > +    case CXL2_UPSTREAM_PORT:
> > +    case CXL2_TYPE3_DEVICE:
> > +    case CXL2_LOGICAL_DEVICE:
> > +        /* + HDM */
> > +        caps = 4;
> > +        break;
> > +    case CXL2_ROOT_PORT:
> > +        /* + Extended Security, + Snoop */
> > +        caps = 6;
> > +        break;
> > +    default:
> > +        abort();
> > +    }
> > +
> > +    memset(reg_state, 0, 0x1000);
> > +
> > +    /* CXL Capability Header Register */
> > +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ID, 1);
> > +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, VERSION, 1);
> > +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 1);
> > +    ARRAY_FIELD_DP32(reg_state, CXL_CAPABILITY_HEADER, ARRAY_SIZE, caps);
> > +
> > +
> > +#define init_cap_reg(reg, id, version)                                        \
> > +    do {                                                                      \
> > +        int which = R_CXL_##reg##_CAPABILITY_HEADER;                          \
> > +        reg_state[which] = FIELD_DP32(reg_state[which],                       \
> > +                                      CXL_##reg##_CAPABILITY_HEADER, ID, id); \
> > +        reg_state[which] =                                                    \
> > +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER,       \
> > +                       VERSION, version);                                     \
> > +        reg_state[which] =                                                    \
> > +            FIELD_DP32(reg_state[which], CXL_##reg##_CAPABILITY_HEADER, PTR,  \
> > +                       CXL_##reg##_REGISTERS_OFFSET);                         \
> > +    } while (0)
> > +
> > +    init_cap_reg(RAS, 2, 1);
> > +    ras_init_common(reg_state);
> > +
> > +    init_cap_reg(LINK, 4, 2);
> > +
> > +    if (caps < 4) {
> > +        return;
> > +    }
> > +
> > +    init_cap_reg(HDM, 5, 1);
> > +    hdm_init_common(reg_state);
> > +
> > +    if (caps < 6) {
> > +        return;
> > +    }
> > +
> > +    init_cap_reg(EXTSEC, 6, 1);
> > +    init_cap_reg(SNOOP, 8, 1);
> > +
> > +#undef init_cap_reg
> > +}
> > +
> > +/*
> > + * Helper to create a DVSEC header for a CXL entity. The caller is responsible
> > + * for tracking the valid offset.
> > + *
> > + * This function will build the DVSEC header on behalf of the caller and then
> > + * copy in the remaining data for the vendor specific bits.
> > + */
> > +void cxl_component_create_dvsec(CXLComponentState *cxl, uint16_t length,
> > +                                uint16_t type, uint8_t rev, uint8_t *body)
> > +{
> > +    PCIDevice *pdev = cxl->pdev;
> > +    uint16_t offset = cxl->dvsec_offset;
> > +
> > +    assert(offset >= PCI_CFG_SPACE_SIZE && offset < PCI_CFG_SPACE_EXP_SIZE);
> 
> Better perhaps to check if offset + length < PCI_CFG_SPACE_EXP_SIZE?
> 
> > +    assert((length & 0xf000) == 0);
> > +    assert((rev & 0xf0) == 0);
> 
> I'd prefer to express this as a mask of the bits we do want, as it doesn't rely on
> someone having to look back to see how large rev is.
> Something like...
> 
> assert ((rev & ~0xf) == 0);
> 
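
For reference, here is roughly how the two suggestions above could look
together, along with the header packing. This is only a sketch under
assumptions: dvsec_args_valid() and dvsec_header1() are hypothetical helpers,
the CFG_SPACE_* constants stand in for PCI_CFG_SPACE_SIZE and
PCI_CFG_SPACE_EXP_SIZE, and I've used <= for the upper bound on the assumption
that a structure ending exactly at the end of config space is fine:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CFG_SPACE_SIZE     0x100   /* stand-in for PCI_CFG_SPACE_SIZE */
#define CFG_SPACE_EXP_SIZE 0x1000  /* stand-in for PCI_CFG_SPACE_EXP_SIZE */
#define DVSEC_VENDOR_ID    0x1e98  /* CXL_VENDOR_ID in the patch */

/* Hypothetical helper: validate the arguments before touching config space. */
static bool dvsec_args_valid(uint16_t offset, uint16_t length, uint8_t rev)
{
    if (offset < CFG_SPACE_SIZE) {
        return false;               /* must live in extended config space */
    }
    if ((uint32_t)offset + length > CFG_SPACE_EXP_SIZE) {
        return false;               /* whole structure must fit, not just its start */
    }
    if ((length & ~0xfffu) != 0) {
        return false;               /* length is a 12 bit field */
    }
    if ((rev & ~0xfu) != 0) {
        return false;               /* revision is a 4 bit field */
    }
    return true;
}

/* Hypothetical helper: length in 31:20, revision in 19:16, vendor ID in 15:0. */
static uint32_t dvsec_header1(uint16_t length, uint8_t rev)
{
    return ((uint32_t)length << 20) | ((uint32_t)rev << 16) | DVSEC_VENDOR_ID;
}
```

The casts to uint32_t are there because length would otherwise be promoted to a
signed int before the shift, which is not well defined for the largest 12 bit
values.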
> > +
> > +    /* Create the DVSEC in the MCFG space */
> > +    pcie_add_capability(pdev, PCI_EXT_CAP_ID_DVSEC, 1, offset, length);
> > +    pci_set_long(pdev->config + offset + PCIE_DVSEC_HEADER_OFFSET,
> > +                 (length << 20) | (rev << 16) | CXL_VENDOR_ID);
> > +    pci_set_word(pdev->config + offset + PCIE_DVSEC_ID_OFFSET, type);
> > +    memcpy(pdev->config + offset + sizeof(struct dvsec_header),
> > +           body + sizeof(struct dvsec_header),
> > +           length - sizeof(struct dvsec_header));
> > +
> > +    /* Update state for future DVSEC additions */
> > +    range_init_nofail(&cxl->dvsecs[type], cxl->dvsec_offset, length);
> > +    cxl->dvsec_offset += length;
> > +}
> > diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> > new file mode 100644
> > index 0000000000..e69de29bb2
> > diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> > new file mode 100644
> > index 0000000000..00c3876a0f
> > --- /dev/null
> > +++ b/hw/cxl/meson.build
> > @@ -0,0 +1,3 @@
> > +softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
> > +  'cxl-component-utils.c',
> > +))
> > diff --git a/hw/meson.build b/hw/meson.build
> > index 010de7219c..3e440c341a 100644
> > --- a/hw/meson.build
> > +++ b/hw/meson.build
> > @@ -6,6 +6,7 @@ subdir('block')
> >  subdir('char')
> >  subdir('core')
> >  subdir('cpu')
> > +subdir('cxl')
> >  subdir('display')
> >  subdir('dma')
> >  subdir('gpio')
> > diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> > new file mode 100644
> > index 0000000000..55f6cc30a5
> > --- /dev/null
> > +++ b/include/hw/cxl/cxl.h
> > @@ -0,0 +1,17 @@
> > +/*
> > + * QEMU CXL Support
> > + *
> > + * Copyright (c) 2020 Intel
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#ifndef CXL_H
> > +#define CXL_H
> > +
> > +#include "cxl_pci.h"
> > +#include "cxl_component.h"
> > +
> > +#endif
> > +
> > diff --git a/include/hw/cxl/cxl_component.h b/include/hw/cxl/cxl_component.h
> > new file mode 100644
> > index 0000000000..014d9d10d3
> > --- /dev/null
> > +++ b/include/hw/cxl/cxl_component.h
> > @@ -0,0 +1,181 @@
> > +/*
> > + * QEMU CXL Component
> > + *
> > + * Copyright (c) 2020 Intel
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#ifndef CXL_COMPONENT_H
> > +#define CXL_COMPONENT_H
> > +
> > +/* CXL 2.0 - 8.2.4 */
> > +#define CXL2_COMPONENT_IO_REGION_SIZE 0x1000
> > +#define CXL2_COMPONENT_CM_REGION_SIZE 0x1000
> > +#define CXL2_COMPONENT_BLOCK_SIZE 0x10000
> > +
> > +#include "qemu/range.h"
> > +#include "qemu/typedefs.h"
> > +#include "hw/register.h"
> > +
> > +enum reg_type {
> > +    CXL2_DEVICE,
> > +    CXL2_TYPE3_DEVICE,
> > +    CXL2_LOGICAL_DEVICE,
> > +    CXL2_ROOT_PORT,
> > +    CXL2_UPSTREAM_PORT,
> > +    CXL2_DOWNSTREAM_PORT
> > +};
> > +
> > +/*
> > + * Capability registers are defined at the top of the CXL.cache/mem region and
> > + * are packed. For our purposes we will always define the caps in the same
> > + * order.
> > + * CXL 2.0 - 8.2.5 Table 142 for details.
> > + */
> > +
> > +/* CXL 2.0 - 8.2.5.1 */
> > +REG32(CXL_CAPABILITY_HEADER, 0)
> > +    FIELD(CXL_CAPABILITY_HEADER, ID, 0, 16)
> > +    FIELD(CXL_CAPABILITY_HEADER, VERSION, 16, 4)
> > +    FIELD(CXL_CAPABILITY_HEADER, CACHE_MEM_VERSION, 20, 4)
> > +    FIELD(CXL_CAPABILITY_HEADER, ARRAY_SIZE, 24, 8)
> > +
> > +#define CXLx_CAPABILITY_HEADER(type, offset)                  \
> > +    REG32(CXL_##type##_CAPABILITY_HEADER, offset)             \
> > +        FIELD(CXL_##type##_CAPABILITY_HEADER, ID, 0, 16)      \
> > +        FIELD(CXL_##type##_CAPABILITY_HEADER, VERSION, 16, 4) \
> > +        FIELD(CXL_##type##_CAPABILITY_HEADER, PTR, 20, 12)
> > +CXLx_CAPABILITY_HEADER(RAS, 0x4)
> > +CXLx_CAPABILITY_HEADER(LINK, 0x8)
> > +CXLx_CAPABILITY_HEADER(HDM, 0xc)
> > +CXLx_CAPABILITY_HEADER(EXTSEC, 0x10)
> > +CXLx_CAPABILITY_HEADER(SNOOP, 0x14)
> > +
> > +/*
> > + * Capability structures contain the actual registers that the CXL component
> > + * implements. Some of these are specific to certain types of components, but
> > + * this implementation leaves enough space regardless.
> > + */
> > +/* 8.2.5.9 - CXL RAS Capability Structure */
> > +#define CXL_RAS_REGISTERS_OFFSET 0x80 /* Give ample space for caps before this */
> > +#define CXL_RAS_REGISTERS_SIZE   0x58
> > +REG32(CXL_RAS_UNC_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET)
> > +REG32(CXL_RAS_UNC_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x4)
> > +REG32(CXL_RAS_UNC_ERR_SEVERITY, CXL_RAS_REGISTERS_OFFSET + 0x8)
> > +REG32(CXL_RAS_COR_ERR_STATUS, CXL_RAS_REGISTERS_OFFSET + 0xc)
> > +REG32(CXL_RAS_COR_ERR_MASK, CXL_RAS_REGISTERS_OFFSET + 0x10)
> > +REG32(CXL_RAS_ERR_CAP_CTRL, CXL_RAS_REGISTERS_OFFSET + 0x14)
> 
> Maybe a comment on header log registers coming after this?
> Will make it obvious why the size is 0x58 above.
> 
> 
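
To spell out where 0x58 comes from: a sketch of the layout, assuming the header
log is sixteen 32 bit words starting right after the six registers defined
above (that is my reading of 8.2.5.9, please check it against the spec).
struct cxl_ras_layout is purely illustrative and is not a type in the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: mirrors the register order used by the REG32() list. */
struct cxl_ras_layout {
    uint32_t unc_err_status;    /* 0x00 */
    uint32_t unc_err_mask;      /* 0x04 */
    uint32_t unc_err_severity;  /* 0x08 */
    uint32_t cor_err_status;    /* 0x0c */
    uint32_t cor_err_mask;      /* 0x10 */
    uint32_t err_cap_ctrl;      /* 0x14 */
    uint32_t header_log[16];    /* 0x18, assumed to be 64 bytes */
};
```

Six registers take 0x18 bytes and the assumed header log takes 0x40, which adds
up to the 0x58 used for CXL_RAS_REGISTERS_SIZE.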
> > +
> > +/* 8.2.5.10 - CXL Security Capability Structure */
> > +#define CXL_SEC_REGISTERS_OFFSET (CXL_RAS_REGISTERS_OFFSET + CXL_RAS_REGISTERS_SIZE)
> > +#define CXL_SEC_REGISTERS_SIZE   0 /* We don't implement 1.1 downstream ports */
> > +
> > +/* 8.2.5.11 - CXL Link Capability Structure */
> > +#define CXL_LINK_REGISTERS_OFFSET (CXL_SEC_REGISTERS_OFFSET + CXL_SEC_REGISTERS_SIZE)
> > +#define CXL_LINK_REGISTERS_SIZE   0x38
> 
> Bit odd to introduce this but not define anything... Can't we bring these
> in when we need them later?

Repeating my comment from 00/25...

For this specific patch series I liked providing #defines in bulk so that it's
easy enough to just bring up the spec and review them. Not sure if there is a
preference from others in the community on this.

I could also introduce the library that generates the capability headers with
this. Either is fine, I just wanted to point out the intent.

> 
> > +
> > +/* 8.2.5.12 - CXL HDM Decoder Capability Structure */
> > +#define HDM_DECODE_MAX 10 /* 8.2.5.12.1 */
> > +#define CXL_HDM_REGISTERS_OFFSET \
> > +    (CXL_LINK_REGISTERS_OFFSET + CXL_LINK_REGISTERS_SIZE) /* 8.2.5.12 */
> > +#define CXL_HDM_REGISTERS_SIZE (0x20 + HDM_DECODE_MAX * 10)
> > +#define HDM_DECODER_INIT(n, base)                          \
> > +    REG32(CXL_HDM_DECODER##n##_BASE_LO, base + 0x10)       \
> 
> Offset n should be included in the address calc.  It's always 0 at the moment
> but might as well put it in now.  Mind you there is something a bit odd
> in the spec I'm looking at. Nothing defined at 0x2c but no reserved line
> either in the table.

My guess is some earlier version of the spec had the decoder registers as 64b
and so they wanted to keep natural alignment.
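
For the offset calculation itself, something along these lines is what I'd
expect once n is included. It's a sketch that assumes each decoder takes a 0x20
byte slot (seven 32 bit registers plus the unused word at 0x2c), which is how I
read 8.2.5.12. hdm_decoder_reg_offset() and the enum are hypothetical names,
not part of this series:

```c
#include <assert.h>
#include <stdint.h>

#define HDM_DECODER0_START 0x10  /* decoder 0 base low, relative to the HDM cap */
#define HDM_DECODER_STRIDE 0x20  /* assumed size of one decoder's register slot */

/* Offsets of each register within one decoder's slot. */
enum hdm_decoder_reg {
    HDM_REG_BASE_LO        = 0x00,
    HDM_REG_BASE_HI        = 0x04,
    HDM_REG_SIZE_LO        = 0x08,
    HDM_REG_SIZE_HI        = 0x0c,
    HDM_REG_CTRL           = 0x10,
    HDM_REG_TARGET_LIST_LO = 0x14,
    HDM_REG_TARGET_LIST_HI = 0x18,
    /* 0x1c within the slot (0x2c for decoder 0) is left unused */
};

/* Hypothetical helper: offset of register `reg` of decoder `n`. */
static uint32_t hdm_decoder_reg_offset(uint32_t base, unsigned n,
                                       enum hdm_decoder_reg reg)
{
    return base + HDM_DECODER0_START + n * HDM_DECODER_STRIDE + reg;
}
```

For decoder 0 this gives the same 0x10 through 0x28 offsets as the macro, and
decoder 1 would start at base + 0x30.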

> 
> 
> > +        FIELD(CXL_HDM_DECODER##n##_BASE_LO, L, 28, 4)      \
> > +    REG32(CXL_HDM_DECODER##n##_BASE_HI, base + 0x14)       \
> > +        FIELD(CXL_HDM_DECODER##n##_BASE_HI, H, 0, 32)      \
> > +    REG32(CXL_HDM_DECODER##n##_SIZE_LO, base + 0x18)       \
> 
> Consistency would argue for fields for this and the next.
> 
> > +    REG32(CXL_HDM_DECODER##n##_SIZE_HI, base + 0x1C)       \
> > +    REG32(CXL_HDM_DECODER##n##_CTRL, base + 0x20)          \
> > +        FIELD(CXL_HDM_DECODER##n##_CTRL, IG, 0, 4)         \
> > +        FIELD(CXL_HDM_DECODER##n##_CTRL, IW, 4, 4)         \
> > +        FIELD(CXL_HDM_DECODER##n##_CTRL, LOCK, 8, 1)       \
> LOCK_ON_COMMIT?  The semantics are unusual enough that it's probably worth naming
> the field to call them out.
> 
> > +        FIELD(CXL_HDM_DECODER##n##_CTRL, COMMIT, 9, 1)     \
> > +        FIELD(CXL_HDM_DECODER##n##_CTRL, COMMITTED, 10, 1) \
> > +        FIELD(CXL_HDM_DECODER##n##_CTRL, ERROR, 11, 1)     \
> > +        FIELD(CXL_HDM_DECODER##n##_CTRL, TYPE, 12, 1)      \
> > +    REG32(CXL_HDM_DECODER##n##_TARGET_LIST_LO, 0x24)       \
> > +    REG32(CXL_HDM_DECODER##n##_TARGET_LIST_HI, 0x28)
> 
> Blank line here would make it easier to spot the end of the macro
> 
> > +REG32(CXL_HDM_DECODER_CAPABILITY, 0)
> > +    FIELD(CXL_HDM_DECODER_CAPABILITY, DECODER_COUNT, 0, 4)
> > +    FIELD(CXL_HDM_DECODER_CAPABILITY, TARGET_COUNT, 4, 4)
> > +    FIELD(CXL_HDM_DECODER_CAPABILITY, INTERLEAVE_256B, 8, 1)
> > +    FIELD(CXL_HDM_DECODER_CAPABILITY, INTELEAVE_4K, 9, 1)
> > +    FIELD(CXL_HDM_DECODER_CAPABILITY, POISON_ON_ERR_CAP, 10, 1)
> > +REG32(CXL_HDM_DECODER_GLOBAL_CONTROL, 0)
> > +    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, POISON_ON_ERR_EN, 0, 1)
> > +    FIELD(CXL_HDM_DECODER_GLOBAL_CONTROL, HDM_DECODER_ENABLE, 1, 1)
> > +
> > +HDM_DECODER_INIT(0, CXL_HDM_REGISTERS_OFFSET);
> > +
> > +/* 8.2.5.13 - CXL Extended Security Capability Structure (Root complex only) */
> > +#define EXTSEC_ENTRY_MAX        256
> > +#define CXL_EXTSEC_REGISTERS_OFFSET (CXL_HDM_REGISTERS_OFFSET + CXL_HDM_REGISTERS_SIZE)
> > +#define CXL_EXTSEC_REGISTERS_SIZE   (8 * EXTSEC_ENTRY_MAX + 4)
> > +
> > +/* 8.2.5.14 - CXL IDE Capability Structure */
> > +#define CXL_IDE_REGISTERS_OFFSET (CXL_EXTSEC_REGISTERS_OFFSET + CXL_EXTSEC_REGISTERS_SIZE)
> > +#define CXL_IDE_REGISTERS_SIZE   0
> > +
> > +/* 8.2.5.15 - CXL Snoop Filter Capability Structure */
> > +#define CXL_SNOOP_REGISTERS_OFFSET (CXL_IDE_REGISTERS_OFFSET + CXL_IDE_REGISTERS_SIZE)
> > +#define CXL_SNOOP_REGISTERS_SIZE   0x8
> > +
> > +_Static_assert((CXL_SNOOP_REGISTERS_OFFSET + CXL_SNOOP_REGISTERS_SIZE) < 0x1000,
> > +               "No space for registers");
> > +
> > +typedef struct component_registers {
> > +    /*
> > +     * Main memory region to be registered with QEMU core.
> > +     */
> > +    MemoryRegion component_registers;
> > +
> > +    /*
> > +     * 8.2.4 Table 141:
> > +     *   0x0000 - 0x0fff CXL.io registers
> > +     *   0x1000 - 0x1fff CXL.cache and CXL.mem
> > +     *   0x2000 - 0xdfff Implementation specific
> > +     *   0xe000 - 0xe3ff CXL ARB/MUX registers
> > +     *   0xe400 - 0xffff RSVD
> > +     */
> > +    uint32_t io_registers[CXL2_COMPONENT_IO_REGION_SIZE >> 2];
> > +    MemoryRegion io;
> > +
> > +    uint32_t cache_mem_registers[CXL2_COMPONENT_CM_REGION_SIZE >> 2];
> > +    MemoryRegion cache_mem;
> > +
> > +    MemoryRegion impl_specific;
> > +    MemoryRegion arb_mux;
> > +    MemoryRegion rsvd;
> > +
> > +    /* special_ops is used for any component that needs any specific handling */
> > +    MemoryRegionOps *special_ops;
> > +} ComponentRegisters;
> > +
> > +/*
> > + * A CXL component represents all entities in a CXL hierarchy. This includes,
> > + * host bridges, root ports, upstream/downstream ports, and devices
> > + */
> > +typedef struct cxl_component {
> > +    ComponentRegisters crb;
> > +    union {
> > +        struct {
> > +            Range dvsecs[CXL20_MAX_DVSEC];
> > +            uint16_t dvsec_offset;
> > +            struct PCIDevice *pdev;
> > +        };
> > +    };
> > +} CXLComponentState;
> > +
> > +void cxl_component_register_block_init(Object *obj,
> > +                                       CXLComponentState *cxl_cstate,
> > +                                       const char *type);
> > +void cxl_component_register_init_common(uint32_t *reg_state,
> > +                                        enum reg_type type);
> > +
> > +void cxl_component_create_dvsec(CXLComponentState *cxl_cstate, uint16_t length,
> > +                                uint16_t type, uint8_t rev, uint8_t *body);
> > +
> > +#endif
> > diff --git a/include/hw/cxl/cxl_pci.h b/include/hw/cxl/cxl_pci.h
> > new file mode 100644
> > index 0000000000..b403770424
> > --- /dev/null
> > +++ b/include/hw/cxl/cxl_pci.h
> > @@ -0,0 +1,133 @@
> > +/*
> > + * QEMU CXL PCI interfaces
> > + *
> > + * Copyright (c) 2020 Intel
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#ifndef CXL_PCI_H
> > +#define CXL_PCI_H
> > +
> > +#include "hw/pci/pci.h"
> > +#include "hw/pci/pcie.h"
> > +
> > +#define CXL_VENDOR_ID 0x1e98
> > +
> > +#define PCIE_DVSEC_HEADER_OFFSET 0x4 /* Offset from start of extend cap */
> 
> To keep this clearly aligned with PCIe spec I'd call it HEADER_1_OFFSET
> 
> > +#define PCIE_DVSEC_ID_OFFSET     0x8
> > +
> > +#define PCIE_CXL_DEVICE_DVSEC_LENGTH 0x38
> > +#define PCIE_CXL_DEVICE_DVSEC_REVID  1
> 
> Make it clear this is the CXL 2.0 revid.
> It would be 0 for CXL 1.1 I think? (8.1.3 of CXL 2.0 spec)

Got it. BTW, you're correct. It is in the verbiage there
"DVSEC Revision ID of 0h represents the structure as defined in CXL 1.1 specification."

A bit hidden IMO.

> 
> 
> > +
> > +#define EXTENSIONS_PORT_DVSEC_LENGTH 0x28
> > +#define EXTENSIONS_PORT_DVSEC_REVID  1
> 
> I'm assuming this is the CXL 2.0 extensions DVSEC for ports
> in figure 128?
> 
> If so table 128 has dvsec revision as 0. 
> 

Good catch, btw a shortcut is to look at Table 124.

> > +
> > +#define GPF_PORT_DVSEC_LENGTH 0x10
> > +#define GPF_PORT_DVSEC_REVID  0
> > +
> > +#define PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0 0x14
> > +#define PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0  1
> > +
> > +#define REG_LOC_DVSEC_LENGTH 0x24
> > +#define REG_LOC_DVSEC_REVID  0
> 
> Whilst I appreciate this is an RFC it would seem more logical
> to me to only list things in the following enum if we
> have also defined them here.  E.g. GPF_DEVICE_DVSEC doesn't
> have length and revid defines.
> 
> > +
> > +enum {
> > +    PCIE_CXL_DEVICE_DVSEC      = 0,
> > +    NON_CXL_FUNCTION_MAP_DVSEC = 2,
> > +    EXTENSIONS_PORT_DVSEC      = 3,
> > +    GPF_PORT_DVSEC             = 4,
> > +    GPF_DEVICE_DVSEC           = 5,
> > +    PCIE_FLEXBUS_PORT_DVSEC    = 7,
> > +    REG_LOC_DVSEC              = 8,
> > +    MLD_DVSEC                  = 9,
> > +    CXL20_MAX_DVSEC
> > +};
> > +
> > +struct dvsec_header {
> > +    uint32_t cap_hdr;
> > +    uint32_t dv_hdr1;
> > +    uint16_t dv_hdr2;
> > +} __attribute__((__packed__));
> > +_Static_assert(sizeof(struct dvsec_header) == 10,
> > +               "dvsec header size incorrect");
> > +
> > +/*
> > + * CXL 2.0 devices must implement certain DVSEC IDs, and can [optionally]
> > + * implement others.
> > + *
> > + * CXL 2.0 Device: 0, [2], 5, 8
> > + * CXL 2.0 RP: 3, 4, 7, 8
> > + * CXL 2.0 Upstream Port: [2], 7, 8
> > + * CXL 2.0 Downstream Port: 3, 4, 7, 8
> > + */
> > +
> > +/* CXL 2.0 - 8.1.5 (ID 0003) */
> > +struct dvsec_port {
> 
> I'd keep naming consistent.  It's called EXTENSIONS_PORT_DVSEC above
> so extensions_dvsec_port here.
> 
> > +    struct dvsec_header hdr;
> > +    uint16_t status;
> > +    uint16_t control;
> > +    uint8_t alt_bus_base;
> > +    uint8_t alt_bus_limit;
> > +    uint16_t alt_memory_base;
> > +    uint16_t alt_memory_limit;
> > +    uint16_t alt_prefetch_base;
> > +    uint16_t alt_prefetch_limit;
> > +    uint32_t alt_prefetch_base_high;
> > +    uint32_t alt_prefetch_base_low;
> > +    uint32_t rcrb_base;
> > +    uint32_t rcrb_base_high;
> > +};
> > +_Static_assert(sizeof(struct dvsec_port) == 0x28, "dvsec port size incorrect");
> > +#define PORT_CONTROL_OVERRIDE_OFFSET 0xc
> I'm not totally sure what this define is, but it seems
> like it's simply the offset of the control field above.
> If so, can't we get it from there directly?

Firstly, I only define these to show how one would handle DVSEC writes. I don't
actually have a use for them as of now. It is just the offset, but I don't know
what you mean by getting it from there directly. Could you elaborate a bit?
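
If the suggestion is to derive the offset from the struct layout rather than
hard-coding it, a rough sketch of that reading follows. The two struct
definitions are copied from the patch; the _DERIVED macro name is hypothetical
and only there for illustration, and whether this is what was meant is an
assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copied from the patch: common DVSEC header, 10 bytes when packed. */
struct dvsec_header {
    uint32_t cap_hdr;
    uint32_t dv_hdr1;
    uint16_t dv_hdr2;
} __attribute__((__packed__));

/* Copied from the patch: CXL 2.0 8.1.5 extensions DVSEC for ports. */
struct dvsec_port {
    struct dvsec_header hdr;
    uint16_t status;
    uint16_t control;
    uint8_t alt_bus_base;
    uint8_t alt_bus_limit;
    uint16_t alt_memory_base;
    uint16_t alt_memory_limit;
    uint16_t alt_prefetch_base;
    uint16_t alt_prefetch_limit;
    uint32_t alt_prefetch_base_high;
    uint32_t alt_prefetch_base_low;
    uint32_t rcrb_base;
    uint32_t rcrb_base_high;
};

/* Hypothetical name: the offset derived from the layout, not hard-coded. */
#define PORT_CONTROL_OVERRIDE_OFFSET_DERIVED offsetof(struct dvsec_port, control)

/* The derived value has to agree with the literal 0xc used in the patch. */
static_assert(PORT_CONTROL_OVERRIDE_OFFSET_DERIVED == 0xc,
              "control field offset does not match the literal");
```

The upside of this form is that the offset cannot drift away from the struct if
a field is ever added or resized.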


> 
> > +#define PORT_CONTROL_UNMASK_SBR      1
> > +#define PORT_CONTROL_ALT_MEMID_EN    4
> 
> Use something to make it clear that 4 is simply bit 2. (1 << 2) maybe?
> 
> > +
> > +/* CXL 2.0 - 8.1.6 GPF DVSEC (ID 0004) */
> > +struct dvsec_port_gpf {
> > +    struct dvsec_header hdr;
> > +    uint16_t rsvd;
> > +    uint16_t phase1_ctrl;
> > +    uint16_t phase2_ctrl;
> > +};
> > +_Static_assert(sizeof(struct dvsec_port_gpf) == 0x10,
> > +               "dvsec port GPF size incorrect");
> > +
> > +/* CXL 2.0 - 8.1.8/8.2.1.3 Flexbus DVSEC (ID 0007) */
> > +struct dvsec_port_flexbus {
> > +    struct dvsec_header hdr;
> > +    uint16_t cap;
> > +    uint16_t ctrl;
> > +    uint16_t status;
> > +    uint32_t rcvd_mod_ts_data;
> > +};
> > +_Static_assert(sizeof(struct dvsec_port_flexbus) == 0x14,
> > +               "dvsec port flexbus size incorrect");
> > +
> > +/* CXL 2.0 - 8.1.9 Register Locator DVSEC (ID 0008) */
> > +struct dvsec_register_locator {
> > +    struct dvsec_header hdr;
> > +    uint16_t rsvd;
> > +    uint32_t reg0_base_lo;
> > +    uint32_t reg0_base_hi;
> > +    uint32_t reg1_base_lo;
> > +    uint32_t reg1_base_hi;
> > +    uint32_t reg2_base_lo;
> > +    uint32_t reg2_base_hi;
> > +};
> > +_Static_assert(sizeof(struct dvsec_register_locator) == 0x24,
> > +               "dvsec register locator size incorrect");
> > +#define BEI_BAR_10H 0
> 
> BEI is obscure enough I'd add a comment giving full name
> (BAR equivalent indicator)
> 
> > +#define BEI_BAR_14H 1
> > +#define BEI_BAR_18H 2
> > +#define BEI_BAR_1cH 3
> > +#define BEI_BAR_20H 4
> > +#define BEI_BAR_24H 5
> > +
> > +#define RBI_EMPTY          0
> 
> Likewise, RBI isn't actually used in the spec that I can see.
> So call out that it is Register Block Identifier.
> 
> > +#define RBI_COMPONENT_REG  (1 << 8)
> > +#define RBI_BAR_VIRT_ACL   (2 << 8)
> > +#define RBI_CXL_DEVICE_REG (3 << 8)
> 
> Nice to treat these as value of field (0,1,2,3) and a macro
> to put it in the right place rather than rolling them together
> directly.
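
A minimal sketch of what that split could look like, assuming the identifier
sits at bit 8 as the shifts in the patch imply. The RBI_ID_* names, RBI_SHIFT
and RBI_FIELD() are hypothetical and not taken from the patch or the spec.

```c
#include <assert.h>
#include <stdint.h>

/* Register Block Identifier values as plain field values (0, 1, 2, 3). */
enum {
    RBI_ID_EMPTY          = 0,
    RBI_ID_COMPONENT_REG  = 1,
    RBI_ID_BAR_VIRT_ACL   = 2,
    RBI_ID_CXL_DEVICE_REG = 3,
};

/* Assumed bit position of the identifier, matching the "<< 8" in the patch. */
#define RBI_SHIFT 8

/* Place a Register Block Identifier value into its position in the dword. */
#define RBI_FIELD(id) ((uint32_t)(id) << RBI_SHIFT)

/* Sanity check against the rolled-together constant from the patch. */
static_assert(RBI_FIELD(RBI_ID_CXL_DEVICE_REG) == (3 << 8),
              "RBI_FIELD must reproduce the original encoding");
```

A register locator entry would then be built as something like
RBI_FIELD(RBI_ID_COMPONENT_REG) | BEI_BAR_10H, which keeps the field value and
its placement visibly separate.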
> 
> > +
> > +#endif
> 



* Re: [RFC PATCH 04/25] hw/cxl/device: Introduce a CXL device (8.2.8)
  2020-11-16 13:07   ` Jonathan Cameron
@ 2020-11-16 21:11     ` Ben Widawsky
  2020-11-17 14:21       ` Jonathan Cameron
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-16 21:11 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: Vishal Verma, Dan Williams, qemu-devel

On 20-11-16 13:07:56, Jonathan Cameron wrote:
> On Tue, 10 Nov 2020 21:47:03 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > A CXL device is a type of CXL component. Conceptually, a CXL device
> > would be a leaf node in a CXL topology. From an emulation perspective,
> > CXL devices are the most complex and so the actual implementation is
> > reserved for discrete commits.
> > 
> > This new device type is specifically catered towards the eventual
> > implementation of a Type3 CXL.mem device, 8.2.8.5 in the CXL 2.0
> > specification.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> As an RFC, would be good to have questions relevant to individual
> patches if possible.  Makes it easier to know what you want feedback on.
> 
> The REG32 being used for 64 bit registers seems awkward. I'd suggest
> we break them up into DWs and deal with the edge parts manually.
> 
> I'm not sure a REG64 definition would work due to lack of explicit alignment
> guarantees.  Might be fine though.

Agreed. Given how rarely I've had to do this so far, and with the XXX comments
in place, I think it's tolerable, but it's definitely a bit ugly. I found at
least two registers with a field that straddles the 32b boundary (I don't
recall one of them; the other is the very important command register that you
noticed below). Having to do an upper and lower field for those would kind of
stink.

Given that the codebase has gone on long enough without REG64, I didn't want to
poke that bear, although I had wired it up at some point.

So for now, I'd like to just leave these as they are.
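
To make the straddling case concrete, here is a rough sketch using the mailbox
command register layout from the patch (OP in bits 0..15, LENGTH in the 20 bits
starting at bit 16, so LENGTH crosses the 32b boundary). The helper names are
hypothetical and not QEMU API; the idea is just to join the two dwords and
extract from the 64b value instead of defining an upper and a lower field.

```c
#include <assert.h>
#include <stdint.h>

/* Join the low and high dwords of a 64b register into one value. */
static inline uint64_t join_dwords(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Extract len bits starting at bit shift; len must be in the range 1..64. */
static inline uint64_t field_get64(uint64_t reg, unsigned shift, unsigned len)
{
    return (reg >> shift) & (~0ULL >> (64 - len));
}

/* OP lives entirely in the low dword: bits 0..15. */
static inline uint16_t mbox_cmd_op(uint32_t lo)
{
    return (uint16_t)field_get64(lo, 0, 16);
}

/* LENGTH is bits 16..35, so it needs bits from both dwords. */
static inline uint32_t mbox_cmd_length(uint32_t lo, uint32_t hi)
{
    return (uint32_t)field_get64(join_dwords(lo, hi), 16, 20);
}
```

With lo = 0xABCD0040 and hi = 0x5, this gives an opcode of 0x40 and a length of
0x5ABCD: the low 16 bits of the length come from the top half of the low dword
and the remaining 4 bits from the bottom of the high dword.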

> 
> One buglet inline and a few other comments.
> 
> Jonathan

Thanks. Anything not responded to is acknowledged and will hopefully make its
way into v2.

> 
> 
> > ---
> >  include/hw/cxl/cxl.h        |   1 +
> >  include/hw/cxl/cxl_device.h | 193 ++++++++++++++++++++++++++++++++++++
> >  2 files changed, 194 insertions(+)
> >  create mode 100644 include/hw/cxl/cxl_device.h
> > 
> > diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> > index 55f6cc30a5..23f52c4cf9 100644
> > --- a/include/hw/cxl/cxl.h
> > +++ b/include/hw/cxl/cxl.h
> > @@ -12,6 +12,7 @@
> >  
> >  #include "cxl_pci.h"
> >  #include "cxl_component.h"
> > +#include "cxl_device.h"
> >  
> >  #endif
> >  
> > diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> > new file mode 100644
> > index 0000000000..491eca6e05
> > --- /dev/null
> > +++ b/include/hw/cxl/cxl_device.h
> > @@ -0,0 +1,193 @@
> > +/*
> > + * QEMU CXL Devices
> > + *
> > + * Copyright (c) 2020 Intel
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#ifndef CXL_DEVICE_H
> > +#define CXL_DEVICE_H
> > +
> > +#include "hw/register.h"
> > +
> > +/*
> > + * The following is how a CXL device's MMIO space is laid out. The only
> > + * requirement from the spec is that the capabilities array and the capability
> > + * headers start at offset 0 and are contiguously packed. The headers themselves
> > + * provide offsets to the register fields. For this emulation, registers will
> > + * start at offset 0x80 (m == 0x80). No secondary mailbox is implemented which
> > + * means that n = m + sizeof(mailbox registers) + sizeof(device registers).
> > + *
> > + * This is roughly described in 8.2.8 Figure 138 of the CXL 2.0 spec.
> > + *
> > + * n + PAYLOAD_SIZE_MAX  +---------------------------------+
> > + *                       |                                 |
> > + *                  ^    |                                 |
> > + *                  |    |                                 |
> > + *                  |    |                                 |
> > + *                  |    |                                 |
> > + *                  |    |         Command Payload         |
> > + *                  |    |                                 |
> > + *                  |    |                                 |
> > + *                  |    |                                 |
> > + *                  |    |                                 |
> > + *                  |    |                                 |
> > + *                  n    +---------------------------------+
> > + *                  ^    |                                 |
> > + *                  |    |    Device Capability Registers  |
> > + *                  |    |    x, mailbox, y                |
> > + *                  |    |                                 |
> > + *                  m    +---------------------------------+
> > + *                  ^    |     Device Capability Header y  |
> > + *                  |    +---------------------------------+
> > + *                  |    | Device Capability Header Mailbox|
> > + *                  |    +------------- --------------------
> > + *                  |    |     Device Capability Header x  |
> > + *                  |    +---------------------------------+
> > + *                  |    |                                 |
> > + *                  |    |                                 |
> > + *                  |    |      Device Cap Array[0..n]     |
> > + *                  |    |                                 |
> > + *                  |    |                                 |
> > + *                  |    |                                 |
> > + *                  0    +---------------------------------+
> > + */
> > +
> > +#define CXL_DEVICE_CAP_HDR1_OFFSET 0x10 /* Figure 138 */
> > +#define CXL_DEVICE_CAP_REG_SIZE 0x10 /* 8.2.8.2 */
> > +
> > +#define CXL_DEVICE_REGISTERS_OFFSET 0x80 /* Read comment above */
> > +#define CXL_DEVICE_REGISTERS_LENGTH 0x8 /* 8.2.8.3.1 */
> > +
> > +#define CXL_MAILBOX_REGISTERS_OFFSET \
> > +    (CXL_DEVICE_REGISTERS_OFFSET + CXL_DEVICE_REGISTERS_LENGTH)
> > +#define CXL_MAILBOX_REGISTERS_SIZE 0x20
> > +#define CXL_MAILBOX_PAYLOAD_SHIFT 11
> > +#define CXL_MAILBOX_MAX_PAYLOAD_SIZE (1 << CXL_MAILBOX_PAYLOAD_SHIFT)
> > +#define CXL_MAILBOX_REGISTERS_LENGTH \
> > +    (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
> > +
> > +typedef struct cxl_device_state {
> > +    /* Boss container and caps registers */
> > +    MemoryRegion device_registers;
> > +
> > +    MemoryRegion caps;
> > +    MemoryRegion device;
> > +    MemoryRegion mailbox;
> > +
> > +    MemoryRegion *pmem;
> > +    MemoryRegion *vmem;
> > +
> > +    bool active;
> > +    uint16_t command;
> > +    uint16_t payload_size;
> > +    union {
> > +        uint8_t caps_reg_state[CXL_DEVICE_CAP_REG_SIZE * 4]; /* ARRAY + 3 CAPS */
> > +        uint32_t caps_reg_state32[0];
> > +    };
> > +} CXLDeviceState;
> > +
> > +/* Initialize the register block for a device */
> > +void cxl_device_register_block_init(Object *obj, CXLDeviceState *dev);
> > +
> > +/* Set up default values for the register block */
> > +void cxl_device_register_init_common(CXLDeviceState *dev);
> > +
> > +/* CXL 2.0 - 8.2.8.1 */
> > +REG32(CXL_DEV_CAP_ARRAY, 0) /* 48b!?!?! */
> > +    FIELD(CXL_DEV_CAP_ARRAY, CAP_ID, 0, 16)
> > +    FIELD(CXL_DEV_CAP_ARRAY, CAP_VERSION, 16, 8)
> > +REG32(CXL_DEV_CAP_ARRAY2, 4) /* We're going to pretend it's 64b */
> 
> Fair call.  Guess reserved 16 bits on the end will be the eventual fix.

I reported it but I was too late to get it changed. Hopefully it will be fixed.

> 
> > +    FIELD(CXL_DEV_CAP_ARRAY2, CAP_COUNT, 0, 16)
> > +
> > +/*
> > + * In the 8.2.8.2, this is listed as a 128b register, but in 8.2.8, it says:
> > + * > No registers defined in Section 8.2.8 are larger than 64-bits wide so that
> > + * > is the maximum access size allowed for these registers. If this rule is not
> > + * > followed, the behavior is undefined
> > + *
> > + * Here we've chosen to make it 4 dwords. The spec allows any pow2 multiple
> > + * access to be used for a register (2 qwords, 8 words, 128 bytes).
> > + *
> > + * XXX: This is supposedly fixed for the release version of the spec. If this
> > + * comment is still here, I've failed.
> 
> *sniggers in a sympathetic way*

Notice that they didn't actually fix it in the release version of the spec, even
though I was assured they did!!!!

> 
> > + */
> > +#define CXL_DEVICE_CAPABILITY_HEADER_REGISTER(n, offset)                            \
> > +    REG32(CXL_DEV_##n##_CAP_HDR0, offset)                 \
> > +        FIELD(CXL_DEV_##n##_CAP_HDR0, CAP_ID, 0, 16)      \
> > +        FIELD(CXL_DEV_##n##_CAP_HDR0, CAP_VERSION, 16, 8) \
> > +    REG32(CXL_DEV_##n##_CAP_HDR1, offset + 4)             \
> > +        FIELD(CXL_DEV_##n##_CAP_HDR1, CAP_OFFSET, 0, 32)  \
> > +    REG32(CXL_DEV_##n##_CAP_HDR2, offset + 8)             \
> > +        FIELD(CXL_DEV_##n##_CAP_HDR2, CAP_LENGTH, 0, 32)
> > +
> > +CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
> > +CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
> > +                                               CXL_DEVICE_CAP_REG_SIZE)
> > +
> > +REG32(CXL_DEV_MAILBOX_CAP, 0)
> > +    FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5)
> > +    FIELD(CXL_DEV_MAILBOX_CAP, INT_CAP, 5, 1)
> > +    FIELD(CXL_DEV_MAILBOX_CAP, BG_INT_CAP, 6, 1)
> > +    FIELD(CXL_DEV_MAILBOX_CAP, MSI_N, 7, 4)
> > +
> > +REG32(CXL_DEV_MAILBOX_CTRL, 4)
> > +    FIELD(CXL_DEV_MAILBOX_CTRL, DOORBELL, 0, 1)
> > +    FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 2)
> 
> 1 bit field, not 2.
> 
> > +    FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
> > +
> > +enum {
> > +    CXL_CMD_EVENTS              = 0x1,
> 
> > +    CXL_CMD_IDENTIFY            = 0x40,
> > +};
> > +
> > +REG32(CXL_DEV_MAILBOX_CMD, 8)
> > +    FIELD(CXL_DEV_MAILBOX_CMD, OP, 0, 16)
> > +    FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
> 
> That takes us out to bit 36 in a 32 bit register?
> Probably needs a comment like the ones below.
> Wouldn't want to miss fixing this one later.

I really thought I had a comment here... it's definitely needed.

> 
> > +
> > +/* 8.2.8.4.5.1 Command Return Codes */
> > +enum {
> > +    RET_SUCCESS                 = 0x0,
> > +    RET_BG_STARTED              = 0x1, /* Background Command Started */
> > +    RET_EINVAL                  = 0x2, /* Invalid Input */
> > +    RET_ENOTSUP                 = 0x3, /* Unsupported */
> > +    RET_ENODEV                  = 0x4, /* Internal Error */
> 
> Mapping that to NODEV seems less than obvious.

I tried to be cute and map as many things to errno as possible. Suggestions?

> 
> > +    RET_ERESTART                = 0x5, /* Retry Required */
> > +    RET_EBUSY                   = 0x6, /* Busy */
> > +    RET_MEDIA_DISABLED          = 0x7, /* Media Disabled */
> > +    RET_FW_EBUSY                = 0x8, /* FW Transfer in Progress */
> > +    RET_FW_OOO                  = 0x9, /* FW Transfer Out of Order */
> > +    RET_FW_AUTH                 = 0xa, /* FW Authentication Failed */
> > +    RET_FW_EBADSLT              = 0xb, /* Invalid Slot */
> > +    RET_FW_ROLLBACK             = 0xc, /* Activation Failed, FW Rolled Back */
> > +    RET_FW_REBOOT               = 0xd, /* Activation Failed, Cold Reset Required */
> > +    RET_ENOENT                  = 0xe, /* Invalid Handle */
> > +    RET_EFAULT                  = 0xf, /* Invalid Physical Address */
> > +    RET_POISON_E2BIG            = 0x10, /* Inject Poison Limit Reached */
> > +    RET_EIO                     = 0x11, /* Permanent Media Failure */
> > +    RET_ECANCELED               = 0x12, /* Aborted */
> > +    RET_EACCESS                 = 0x13, /* Invalid Security State */
> > +    RET_EPERM                   = 0x14, /* Incorrect Passphrase */
> > +    RET_EPROTONOSUPPORT         = 0x15, /* Unsupported Mailbox */
> > +    RET_EMSGSIZE                = 0x16, /* Invalid Payload Length */
> > +    RET_MAX                     = 0x17
> > +};
> > +
> > +/* XXX: actually a 64b register */
> > +REG32(CXL_DEV_MAILBOX_STS, 0x10)
> > +    FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)
> > +    FIELD(CXL_DEV_MAILBOX_STS, ERRNO, 32, 16)
> > +    FIELD(CXL_DEV_MAILBOX_STS, VENDOR_ERRNO, 48, 16)
> > +
> > +/* XXX: actually a 64b register */
> > +REG32(CXL_DEV_BG_CMD_STS, 0x18)
> > +    FIELD(CXL_DEV_BG_CMD_STS, BG, 0, 16)
> > +    FIELD(CXL_DEV_BG_CMD_STS, DONE, 16, 7)
> > +    FIELD(CXL_DEV_BG_CMD_STS, ERRNO, 32, 16)
> > +    FIELD(CXL_DEV_BG_CMD_STS, VENDOR_ERRNO, 48, 16)
> > +
> > +REG32(CXL_DEV_CMD_PAYLOAD, 0x20)
> > +
> > +#endif
> 



* Re: [RFC PATCH 06/25] hw/cxl/device: Add device status (8.2.8.3)
  2020-11-16 13:16   ` Jonathan Cameron
@ 2020-11-16 21:18     ` Ben Widawsky
  2020-11-17 14:24       ` Jonathan Cameron
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-16 21:18 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: Vishal Verma, Dan Williams, qemu-devel

On 20-11-16 13:16:08, Jonathan Cameron wrote:
> On Tue, 10 Nov 2020 21:47:05 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > This implements the CXL device status registers from 8.2.8.3.1 in the
> > CXL 2.0 specification. It is capability ID 0001h.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> It does some other stuff it shouldn't as well.  Please tidy that up before
> v2.  A few other passing comments inline.
> 
> Thanks,
> 
> Jonathan
> 
> 
> > ---
> >  hw/cxl/cxl-device-utils.c   | 45 +++++++++++++++++++++++++++++++++-
> >  include/hw/cxl/cxl_device.h | 49 ++++++++++++-------------------------
> >  2 files changed, 60 insertions(+), 34 deletions(-)
> > 
> > diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> > index a391bb15c6..78144e103c 100644
> > --- a/hw/cxl/cxl-device-utils.c
> > +++ b/hw/cxl/cxl-device-utils.c
> > @@ -33,6 +33,42 @@ static uint64_t caps_reg_read(void *opaque, hwaddr offset, unsigned size)
> >      return ldn_le_p(cxl_dstate->caps_reg_state + offset, size);
> >  }
> >  
> > +static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
> > +{
> > +    uint64_t retval = 0;
> 
> Doesn't seem to be used.
> 

It's required for ldn_le_p, or did you mean something else?

> > +
> 
> Perhaps break the alignment check out to a utility function given this sanity check
> is same as in previous patch.
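
A sketch of what such a utility could look like, for illustration only. The
function name is hypothetical, and unlike the switch in the patch it rejects
any access size that is not a power of two rather than letting it fall through,
which is a deliberate difference and an assumption about the desired behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when an access of the given size at the given offset is
 * naturally aligned. Sizes that are zero or not a power of two are
 * treated as invalid.
 */
static inline bool reg_access_is_aligned(uint64_t offset, unsigned size)
{
    if (size == 0 || (size & (size - 1)) != 0) {
        return false;
    }
    return (offset & (size - 1)) == 0;
}
```

Each read and write handler would then start with a single check of
reg_access_is_aligned(offset, size), log the unaligned access, and bail out,
instead of repeating the per-size switch.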
> 
> > +    switch (size) {
> > +    case 4:
> > +        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
> > +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> > +            return 0;
> > +        }
> > +        break;
> > +    case 8:
> > +        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
> > +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> > +            return 0;
> > +        }
> > +        break;
> > +    }
> > +
> > +    return ldn_le_p(&retval, size);
> > +}
> > +
> > +static const MemoryRegionOps dev_ops = {
> > +    .read = dev_reg_read,
> > +    .write = NULL,
> > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > +    .valid = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 8,
> > +    },
> > +    .impl = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 8,
> > +    },
> > +};
> > +
> >  static const MemoryRegionOps caps_ops = {
> >      .read = caps_reg_read,
> >      .write = NULL,
> > @@ -56,18 +92,25 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
> >  
> >      memory_region_init_io(&cxl_dstate->caps, obj, &caps_ops, cxl_dstate,
> >                            "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
> > +    memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
> > +                          "device-status", CXL_DEVICE_REGISTERS_LENGTH);
> >  
> >      memory_region_add_subregion(&cxl_dstate->device_registers, 0,
> >                                  &cxl_dstate->caps);
> > +    memory_region_add_subregion(&cxl_dstate->device_registers,
> > +                                CXL_DEVICE_REGISTERS_OFFSET,
> > +                                &cxl_dstate->device);
> >  }
> >  
> >  void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> >  {
> >      uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
> > -    const int cap_count = 0;
> > +    const int cap_count = 1;
> >  
> >      /* CXL Device Capabilities Array Register */
> >      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
> >      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_VERSION, 1);
> >      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY2, CAP_COUNT, cap_count);
> > +
> > +    cxl_device_cap_init(cxl_dstate, DEVICE, 1);
> >  }
> > diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> > index 491eca6e05..2c674fdc9c 100644
> > --- a/include/hw/cxl/cxl_device.h
> > +++ b/include/hw/cxl/cxl_device.h
> > @@ -127,6 +127,22 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
> >  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
> >                                                 CXL_DEVICE_CAP_REG_SIZE)
> >  
> > +#define cxl_device_cap_init(dstate, reg, cap_id)                                   \
> > +    do {                                                                           \
> > +        uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
> > +        int which = R_CXL_DEV_##reg##_CAP_HDR0;                                    \
> > +        cap_hdrs[which] =                                                          \
> > +            FIELD_DP32(cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_ID, cap_id); \
> > +        cap_hdrs[which] = FIELD_DP32(                                              \
> > +            cap_hdrs[which], CXL_DEV_##reg##_CAP_HDR0, CAP_VERSION, 1);            \
> > +        cap_hdrs[which + 1] =                                                      \
> > +            FIELD_DP32(cap_hdrs[which + 1], CXL_DEV_##reg##_CAP_HDR1,              \
> > +                       CAP_OFFSET, CXL_##reg##_REGISTERS_OFFSET);                  \
> > +        cap_hdrs[which + 2] =                                                      \
> > +            FIELD_DP32(cap_hdrs[which + 2], CXL_DEV_##reg##_CAP_HDR2,              \
> > +                       CAP_LENGTH, CXL_##reg##_REGISTERS_LENGTH);                  \
> > +    } while (0)
> > +
> >  REG32(CXL_DEV_MAILBOX_CAP, 0)
> >      FIELD(CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE, 0, 5)
> >      FIELD(CXL_DEV_MAILBOX_CAP, INT_CAP, 5, 1)
> > @@ -138,43 +154,10 @@ REG32(CXL_DEV_MAILBOX_CTRL, 4)
> >      FIELD(CXL_DEV_MAILBOX_CTRL, INT_EN, 1, 2)
> >      FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
> >  
> > -enum {
> > -    CXL_CMD_EVENTS              = 0x1,
> > -    CXL_CMD_IDENTIFY            = 0x40,
> > -};
> > -
> >  REG32(CXL_DEV_MAILBOX_CMD, 8)
> >      FIELD(CXL_DEV_MAILBOX_CMD, OP, 0, 16)
> >      FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
> >  
> > -/* 8.2.8.4.5.1 Command Return Codes */
> 
> Umm. We only just introduced this a few patches ago.  Please tidy that
> up so we don't end up bringing things in and out again.
> 
> > -enum {
> > -    RET_SUCCESS                 = 0x0,
> > -    RET_BG_STARTED              = 0x1, /* Background Command Started */
> > -    RET_EINVAL                  = 0x2, /* Invalid Input */
> > -    RET_ENOTSUP                 = 0x3, /* Unsupported */
> > -    RET_ENODEV                  = 0x4, /* Internal Error */
> > -    RET_ERESTART                = 0x5, /* Retry Required */
> > -    RET_EBUSY                   = 0x6, /* Busy */
> > -    RET_MEDIA_DISABLED          = 0x7, /* Media Disabled */
> > -    RET_FW_EBUSY                = 0x8, /* FW Transfer in Progress */
> > -    RET_FW_OOO                  = 0x9, /* FW Transfer Out of Order */
> > -    RET_FW_AUTH                 = 0xa, /* FW Authentication Failed */
> > -    RET_FW_EBADSLT              = 0xb, /* Invalid Slot */
> > -    RET_FW_ROLLBACK             = 0xc, /* Activation Failed, FW Rolled Back */
> > -    RET_FW_REBOOT               = 0xd, /* Activation Failed, Cold Reset Required */
> > -    RET_ENOENT                  = 0xe, /* Invalid Handle */
> > -    RET_EFAULT                  = 0xf, /* Invalid Physical Address */
> > -    RET_POISON_E2BIG            = 0x10, /* Inject Poison Limit Reached */
> > -    RET_EIO                     = 0x11, /* Permanent Media Failure */
> > -    RET_ECANCELED               = 0x12, /* Aborted */
> > -    RET_EACCESS                 = 0x13, /* Invalid Security State */
> > -    RET_EPERM                   = 0x14, /* Incorrect Passphrase */
> > -    RET_EPROTONOSUPPORT         = 0x15, /* Unsupported Mailbox */
> > -    RET_EMSGSIZE                = 0x16, /* Invalid Payload Length */
> > -    RET_MAX                     = 0x17
> > -};
> > -
> >  /* XXX: actually a 64b register */
> >  REG32(CXL_DEV_MAILBOX_STS, 0x10)
> >      FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)
> 



* Re: [RFC PATCH 07/25] hw/cxl/device: Implement basic mailbox (8.2.8.4)
  2020-11-16 13:46   ` Jonathan Cameron
@ 2020-11-16 21:42     ` Ben Widawsky
  0 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-16 21:42 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: Vishal Verma, Dan Williams, qemu-devel

On 20-11-16 13:46:51, Jonathan Cameron wrote:
> On Tue, 10 Nov 2020 21:47:06 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > This is the beginning of implementing mailbox support for CXL 2.0
> > devices.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> Mostly patch set cleanup suggestions rather than anything meaningful
> in here.
> 
> Thanks,
> 
> Jonathan
> 
> > ---
> >  hw/cxl/cxl-device-utils.c   | 131 ++++++++++++++++++++++++++++++++++++
> >  hw/cxl/cxl-mailbox-utils.c  |  93 +++++++++++++++++++++++++
> >  hw/cxl/meson.build          |   1 +
> >  include/hw/cxl/cxl.h        |   3 +
> >  include/hw/cxl/cxl_device.h |  10 ++-
> >  5 files changed, 237 insertions(+), 1 deletion(-)
> >  create mode 100644 hw/cxl/cxl-mailbox-utils.c
> > 
> > diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> > index 78144e103c..aec8b0d421 100644
> > --- a/hw/cxl/cxl-device-utils.c
> > +++ b/hw/cxl/cxl-device-utils.c
> > @@ -55,6 +55,123 @@ static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
> >      return ldn_le_p(&retval, size);
> >  }
> >  
> > +static uint64_t mailbox_reg_read(void *opaque, hwaddr offset, unsigned size)
> > +{
> > +    CXLDeviceState *cxl_dstate = opaque;
> > +
> > +    switch (size) {
> > +    case 4:
> > +        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
> > +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> > +            return 0;
> > +        }
> > +        break;
> > +    case 8:
> > +        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
> > +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> > +            return 0;
> > +        }
> > +        break;
> > +    default:
> > +        qemu_log_mask(LOG_UNIMP, "%uB component register read\n", size);
> > +        return 0;
> > +    }
> > +
> > +    return ldn_le_p(cxl_dstate->mbox_reg_state + offset, size);
> > +}
> > +
> > +static void mailbox_mem_writel(uint32_t *reg_state, hwaddr offset,
> > +                               uint64_t value)
> > +{
> > +    switch (offset) {
> > +    case A_CXL_DEV_MAILBOX_CTRL:
> > +        /* fallthrough */
> > +    case A_CXL_DEV_MAILBOX_CAP:
> > +        /* RO register */
> > +        break;
> > +    default:
> > +        qemu_log_mask(LOG_UNIMP,
> > +                      "%s Unexpected 32-bit access to 0x%" PRIx64 " (WI)\n",
> > +                      __func__, offset);
> > +        break;
> > +    }
> > +
> > +    stl_le_p((uint8_t *)reg_state + offset, value);
> > +}
> > +
> > +static void mailbox_mem_writeq(uint64_t *reg_state, hwaddr offset,
> > +                               uint64_t value)
> > +{
> > +    switch (offset) {
> > +    case A_CXL_DEV_MAILBOX_CMD:
> > +        break;
> > +    case A_CXL_DEV_BG_CMD_STS:
> > +        /* BG not supported */
> > +        /* fallthrough */
> > +    case A_CXL_DEV_MAILBOX_STS:
> > +        /* Read only register, will get updated by the state machine */
> > +        return;
> > +    case A_CXL_DEV_MAILBOX_CAP:
> > +    case A_CXL_DEV_MAILBOX_CTRL:
> 
> I wouldn't bother listing these here given you don't list the MAILBOX_STS etc in
> the 32 bit version.
> 
> > +    default:
> > +        qemu_log_mask(LOG_UNIMP,
> > +                      "%s Unexpected 64-bit access to 0x%" PRIx64 " (WI)\n",
> > +                      __func__, offset);
> > +        return;
> > +    }
> > +
> > +    stq_le_p((uint8_t *)reg_state + offset, value);
> > +}
> > +
> > +static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
> > +                              unsigned size)
> > +{
> > +    CXLDeviceState *cxl_dstate = opaque;
> > +
> > +    /*
> > +     * Lock is needed to prevent concurrent writes as well as to prevent writes
> > +     * coming in while the firmware is processing. Without background commands
> > +     * or the second mailbox implemented, this serves no purpose since the
> > +     * memory access is synchronized at a higher level (per memory region).
> > +     */
> > +    RCU_READ_LOCK_GUARD();
> > +
> > +    switch (size) {
> > +    case 4:
> > +        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
> > +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> > +            return;
> > +        }
> > +        mailbox_mem_writel(cxl_dstate->mbox_reg_state32, offset, value);
> > +        break;
> > +    case 8:
> > +        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
> > +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> > +            return;
> > +        }
> > +        mailbox_mem_writeq(cxl_dstate->mbox_reg_state64, offset, value);
> > +        break;
> > +    }
> > +
> > +    if (ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> > +                         DOORBELL))
> > +        process_mailbox(cxl_dstate);
> > +}
> > +
> > +static const MemoryRegionOps mailbox_ops = {
> > +    .read = mailbox_reg_read,
> > +    .write = mailbox_reg_write,
> > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > +    .valid = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 8,
> > +    },
> > +    .impl = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 8,
> > +    },
> > +};
> > +
> >  static const MemoryRegionOps dev_ops = {
> >      .read = dev_reg_read,
> >      .write = NULL,
> > @@ -94,12 +211,23 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
> >                            "cap-array", CXL_DEVICE_REGISTERS_OFFSET - 0);
> >      memory_region_init_io(&cxl_dstate->device, obj, &dev_ops, cxl_dstate,
> >                            "device-status", CXL_DEVICE_REGISTERS_LENGTH);
> > +    memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
> > +                          "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
> >  
> >      memory_region_add_subregion(&cxl_dstate->device_registers, 0,
> >                                  &cxl_dstate->caps);
> >      memory_region_add_subregion(&cxl_dstate->device_registers,
> >                                  CXL_DEVICE_REGISTERS_OFFSET,
> >                                  &cxl_dstate->device);
> > +    memory_region_add_subregion(&cxl_dstate->device_registers,
> > +                                CXL_MAILBOX_REGISTERS_OFFSET, &cxl_dstate->mailbox);
> > +}
> > +
> > +static void mailbox_init_common(uint32_t *mbox_regs)
> > +{
> > +    /* 2048 payload size, with no interrupt or background support */
> > +    ARRAY_FIELD_DP32(mbox_regs, CXL_DEV_MAILBOX_CAP, PAYLOAD_SIZE,
> > +                     CXL_MAILBOX_PAYLOAD_SHIFT);
> >  }
> >  
> >  void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> > @@ -113,4 +241,7 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> >      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY2, CAP_COUNT, cap_count);
> >  
> >      cxl_device_cap_init(cxl_dstate, DEVICE, 1);
> > +    cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
> > +
> > +    mailbox_init_common(cxl_dstate->mbox_reg_state32);
> >  }
> > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> > new file mode 100644
> > index 0000000000..2d1b0ef9e4
> > --- /dev/null
> > +++ b/hw/cxl/cxl-mailbox-utils.c
> > @@ -0,0 +1,93 @@
> > +/*
> > + * CXL Utility library for mailbox interface
> > + *
> > + * Copyright(C) 2020 Intel Corporation.
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2. See the
> > + * COPYING file in the top-level directory.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/log.h"
> > +#include "hw/pci/pci.h"
> > +#include "hw/cxl/cxl.h"
> > +
> > +/* 8.2.8.4.5.1 Command Return Codes */
> > +enum {
> > +    RET_SUCCESS                 = 0x0,
> > +    RET_BG_STARTED              = 0x1, /* Background Command Started */
> > +    RET_EINVAL                  = 0x2, /* Invalid Input */
> > +    RET_ENOTSUP                 = 0x3, /* Unsupported */
> > +    RET_ENODEV                  = 0x4, /* Internal Error */
> > +    RET_ERESTART                = 0x5, /* Retry Required */
> > +    RET_EBUSY                   = 0x6, /* Busy */
> > +    RET_MEDIA_DISABLED          = 0x7, /* Media Disabled */
> > +    RET_FW_EBUSY                = 0x8, /* FW Transfer in Progress */
> > +    RET_FW_OOO                  = 0x9, /* FW Transfer Out of Order */
> > +    RET_FW_AUTH                 = 0xa, /* FW Authentication Failed */
> > +    RET_FW_EBADSLT              = 0xb, /* Invalid Slot */
> > +    RET_FW_ROLLBACK             = 0xc, /* Activation Failed, FW Rolled Back */
> > +    RET_FW_REBOOT               = 0xd, /* Activation Failed, Cold Reset Required */
> > +    RET_ENOENT                  = 0xe, /* Invalid Handle */
> > +    RET_EFAULT                  = 0xf, /* Invalid Physical Address */
> > +    RET_POISON_E2BIG            = 0x10, /* Inject Poison Limit Reached */
> > +    RET_EIO                     = 0x11, /* Permanent Media Failure */
> > +    RET_ECANCELED               = 0x12, /* Aborted */
> > +    RET_EACCESS                 = 0x13, /* Invalid Security State */
> > +    RET_EPERM                   = 0x14, /* Incorrect Passphrase */
> > +    RET_EPROTONOSUPPORT         = 0x15, /* Unsupported Mailbox */
> > +    RET_EMSGSIZE                = 0x16, /* Invalid Payload Length */
> > +    RET_MAX                     = 0x17
> > +};
> 
> Ah back again.  Just drop the earlier add and remove of this list and
> we are all good.
> 
> > +
> > +void process_mailbox(CXLDeviceState *cxl_dstate)
> > +{
> > +    uint16_t ret = RET_SUCCESS;
> > +    uint32_t ret_len = 0;
> > +    uint64_t status_reg;
> > +
> > +    /*
> > +     * current state of mailbox interface
> > +     *  uint32_t mbox_cap_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CAP];
> > +     *  uint32_t mbox_ctrl_reg = cxl_dstate->reg_state32[R_CXL_DEV_MAILBOX_CTRL];
> > +     *  uint64_t status_reg = *(uint64_t *)&cxl_dstate->reg_state[A_CXL_DEV_MAILBOX_STS];
> > +     */
> > +    uint64_t command_reg =
> > +        *(uint64_t *)&cxl_dstate->mbox_reg_state[A_CXL_DEV_MAILBOX_CMD];
> > +
> > +    /* Check if we have to do anything */
> > +    if (!ARRAY_FIELD_EX32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL, DOORBELL)) {
> > +        qemu_log_mask(LOG_UNIMP, "Corrupt internal state for firmware\n");
> > +        return;
> > +    }
> > +
> > +    uint8_t cmd_set = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET);
> > +    uint8_t cmd = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND);
> > +    (void)cmd;
> 
> ?

I have plans for this, so you can ignore it for now.

> 
> > +    switch (cmd_set) {
> > +    default:
> > +        ret = RET_ENOTSUP;
> > +    }
> > +
> > +    /*
> > +     * Set the return code
> > +     * XXX: it's a 64b register, but we're not setting the vendor, so we can get
> > +     * away with this
> 
> Also mention not setting background operation bit?
> 
> > +     */
> > +    status_reg = FIELD_DP64(0, CXL_DEV_MAILBOX_STS, ERRNO, ret);
> > +
> > +    /*
> > +     * Set the return length
> > +     */
> > +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET, 0);
> > +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND, 0);
> > +    command_reg = FIELD_DP64(command_reg, CXL_DEV_MAILBOX_CMD, LENGTH, ret_len);
> 
> Rather convoluted way of setting just the length field, I assume because there
> are RsvdP fields in there we can't touch.
> 
> > +
> > +    stq_le_p(cxl_dstate->mbox_reg_state + A_CXL_DEV_MAILBOX_CMD, command_reg);
> > +    stq_le_p(cxl_dstate->mbox_reg_state + A_CXL_DEV_MAILBOX_STS, status_reg);
> > +
> > +    /* Tell the host we're done */
> > +    ARRAY_FIELD_DP32(cxl_dstate->mbox_reg_state32, CXL_DEV_MAILBOX_CTRL,
> > +                     DOORBELL, 0);
> > +}
> > +
> > diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
> > index 47154d6850..0eca715d10 100644
> > --- a/hw/cxl/meson.build
> > +++ b/hw/cxl/meson.build
> > @@ -1,4 +1,5 @@
> >  softmmu_ss.add(when: 'CONFIG_CXL', if_true: files(
> >    'cxl-component-utils.c',
> >    'cxl-device-utils.c',
> > +  'cxl-mailbox-utils.c',
> >  ))
> > diff --git a/include/hw/cxl/cxl.h b/include/hw/cxl/cxl.h
> > index 23f52c4cf9..362cda40de 100644
> > --- a/include/hw/cxl/cxl.h
> > +++ b/include/hw/cxl/cxl.h
> > @@ -14,5 +14,8 @@
> >  #include "cxl_component.h"
> >  #include "cxl_device.h"
> >  
> > +#define COMPONENT_REG_BAR_IDX 0
> > +#define DEVICE_REG_BAR_IDX 2
> > +
> >  #endif
> >  
> > diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> > index 2c674fdc9c..df00998def 100644
> > --- a/include/hw/cxl/cxl_device.h
> > +++ b/include/hw/cxl/cxl_device.h
> > @@ -87,6 +87,11 @@ typedef struct cxl_device_state {
> >          uint8_t caps_reg_state[CXL_DEVICE_CAP_REG_SIZE * 4]; /* ARRAY + 3 CAPS */
> >          uint32_t caps_reg_state32[0];
> >      };
> > +    union {
> > +        uint8_t mbox_reg_state[CXL_MAILBOX_REGISTERS_LENGTH];
> > +        uint32_t mbox_reg_state32[0];
> > +        uint64_t mbox_reg_state64[0];
> > +    };
> >  } CXLDeviceState;
> >  
> >  /* Initialize the register block for a device */
> > @@ -127,6 +132,8 @@ CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
> >  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
> >                                                 CXL_DEVICE_CAP_REG_SIZE)
> >  
> > +void process_mailbox(CXLDeviceState *cxl_dstate);
> > +
> >  #define cxl_device_cap_init(dstate, reg, cap_id)                                   \
> >      do {                                                                           \
> >          uint32_t *cap_hdrs = dstate->caps_reg_state32;                             \
> > @@ -155,7 +162,8 @@ REG32(CXL_DEV_MAILBOX_CTRL, 4)
> >      FIELD(CXL_DEV_MAILBOX_CTRL, BG_INT_EN, 2, 1)
> >  
> >  REG32(CXL_DEV_MAILBOX_CMD, 8)
> > -    FIELD(CXL_DEV_MAILBOX_CMD, OP, 0, 16)
> > +    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND, 0, 8)
> 
> Can we fix the original introduction of this so we don't end up modifying it here?
> From spec I can fully see how you ended up with this as you wrote the code
> but nice to get rid of the two step definition now anyway.
> (the field is first defined as 16 bits, then later it says there are two 8 bit fields).

I'll change it...

Please see this commit (and branch) for what I'm planning to do here:
https://gitlab.com/bwidawsk/qemu/-/commit/bdb9f9a5337873aedb89558d28968caa130db05e#0d08a7f7c79cf3169f15ad6c4bb7a440c3603af6_47_65
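
For reference, the two-step definition discussed above (one 16-bit opcode that the
spec then splits into two 8-bit fields, followed by a 20-bit length) can be
sketched in plain C. This is only an illustration and is not taken from the
branch: the MBOX_CMD_* macros and mbox_* helpers are hypothetical names, and the
bit positions are the ones from the FIELD() lines in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the FIELD() lines in the patch. */
#define MBOX_CMD_COMMAND_SHIFT     0
#define MBOX_CMD_COMMAND_MASK      0xffull
#define MBOX_CMD_COMMAND_SET_SHIFT 8
#define MBOX_CMD_COMMAND_SET_MASK  0xffull
#define MBOX_CMD_LENGTH_SHIFT      16
#define MBOX_CMD_LENGTH_MASK       0xfffffull /* 20 bits */

/* Low byte of the 16-bit opcode. */
static inline uint8_t mbox_cmd(uint64_t reg)
{
    return (reg >> MBOX_CMD_COMMAND_SHIFT) & MBOX_CMD_COMMAND_MASK;
}

/* High byte of the 16-bit opcode. */
static inline uint8_t mbox_cmd_set(uint64_t reg)
{
    return (reg >> MBOX_CMD_COMMAND_SET_SHIFT) & MBOX_CMD_COMMAND_SET_MASK;
}

/* Payload length in bytes. */
static inline uint32_t mbox_cmd_length(uint64_t reg)
{
    return (reg >> MBOX_CMD_LENGTH_SHIFT) & MBOX_CMD_LENGTH_MASK;
}

/* Replace only the length field, leaving every other bit untouched. */
static inline uint64_t mbox_cmd_set_length(uint64_t reg, uint32_t len)
{
    assert(len <= MBOX_CMD_LENGTH_MASK);
    reg &= ~(MBOX_CMD_LENGTH_MASK << MBOX_CMD_LENGTH_SHIFT);
    return reg | ((uint64_t)len << MBOX_CMD_LENGTH_SHIFT);
}
```

With this layout, opcode 0x4000 decodes as command set 0x40 and command 0x00,
which matches the CXL_IDENTIFY value used later in the series.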

> 
> > +    FIELD(CXL_DEV_MAILBOX_CMD, COMMAND_SET, 8, 8)
> >      FIELD(CXL_DEV_MAILBOX_CMD, LENGTH, 16, 20)
> >  
> >  /* XXX: actually a 64b register */
> 



* Re: [RFC PATCH 08/25] hw/cxl/device: Add memory devices (8.2.8.5)
  2020-11-16 16:37   ` Jonathan Cameron
@ 2020-11-16 21:45     ` Ben Widawsky
  2020-11-17 14:31       ` Jonathan Cameron
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-16 21:45 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: Vishal Verma, Dan Williams, qemu-devel

On 20-11-16 16:37:22, Jonathan Cameron wrote:
> On Tue, 10 Nov 2020 21:47:07 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > Memory devices implement extra capabilities on top of CXL devices. This
> > adds support for that.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> >  hw/cxl/cxl-device-utils.c   | 48 ++++++++++++++++++++++++++++++++++++-
> >  hw/cxl/cxl-mailbox-utils.c  | 48 ++++++++++++++++++++++++++++++++++++-
> >  include/hw/cxl/cxl_device.h | 15 ++++++++++++
> >  3 files changed, 109 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> > index aec8b0d421..6544a68567 100644
> > --- a/hw/cxl/cxl-device-utils.c
> > +++ b/hw/cxl/cxl-device-utils.c
> > @@ -158,6 +158,45 @@ static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
> >          process_mailbox(cxl_dstate);
> >  }
> >  
> > +static uint64_t mdev_reg_read(void *opaque, hwaddr offset, unsigned size)
> > +{
> > +    uint64_t retval = 0;
> > +
> > +    retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MEDIA_STATUS, 1);
> > +    retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MBOX_READY, 1);
> > +
> > +    switch (size) {
> > +    case 4:
> > +        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
> > +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> > +            return 0;
> > +        }
> > +        break;
> > +    case 8:
> > +        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
> > +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> > +            return 0;
> > +        }
> > +        break;
> > +    }
> > +
> > +    return ldn_le_p(&retval, size);
> > +}
> > +
> > +static const MemoryRegionOps mdev_ops = {
> > +    .read = mdev_reg_read,
> > +    .write = NULL,
> > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > +    .valid = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 8,
> > +    },
> > +    .impl = {
> > +        .min_access_size = 4,
> > +        .max_access_size = 8,
> > +    },
> > +};
> > +
> >  static const MemoryRegionOps mailbox_ops = {
> >      .read = mailbox_reg_read,
> >      .write = mailbox_reg_write,
> > @@ -213,6 +252,9 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
> >                            "device-status", CXL_DEVICE_REGISTERS_LENGTH);
> >      memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
> >                            "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
> > +    memory_region_init_io(&cxl_dstate->memory_device, obj, &mdev_ops,
> > +                          cxl_dstate, "memory device caps",
> > +                          CXL_MEMORY_DEVICE_REGISTERS_LENGTH);
> >  
> >      memory_region_add_subregion(&cxl_dstate->device_registers, 0,
> >                                  &cxl_dstate->caps);
> > @@ -221,6 +263,9 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
> >                                  &cxl_dstate->device);
> >      memory_region_add_subregion(&cxl_dstate->device_registers,
> >                                  CXL_MAILBOX_REGISTERS_OFFSET, &cxl_dstate->mailbox);
> > +    memory_region_add_subregion(&cxl_dstate->device_registers,
> > +                                CXL_MEMORY_DEVICE_REGISTERS_OFFSET,
> > +                                &cxl_dstate->memory_device);
> >  }
> >  
> >  static void mailbox_init_common(uint32_t *mbox_regs)
> > @@ -233,7 +278,7 @@ static void mailbox_init_common(uint32_t *mbox_regs)
> >  void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> >  {
> >      uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
> > -    const int cap_count = 1;
> 
> Guessing this should previously have been 2?
> 
> > +    const int cap_count = 3;
> >  
> >      /* CXL Device Capabilities Array Register */
> >      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
> > @@ -242,6 +287,7 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> >  
> >      cxl_device_cap_init(cxl_dstate, DEVICE, 1);
> >      cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
> > +    cxl_device_cap_init(cxl_dstate, MEMORY_DEVICE, 0x4000);
> >  
> >      mailbox_init_common(cxl_dstate->mbox_reg_state32);
> >  }
> > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> > index 2d1b0ef9e4..5d2579800e 100644
> > --- a/hw/cxl/cxl-mailbox-utils.c
> > +++ b/hw/cxl/cxl-mailbox-utils.c
> > @@ -12,6 +12,12 @@
> >  #include "hw/pci/pci.h"
> >  #include "hw/cxl/cxl.h"
> >  
> > +enum cxl_opcode {
> > +    CXL_EVENTS      = 0x1,
> > +    CXL_IDENTIFY    = 0x40,
> > +        #define CXL_IDENTIFY_MEMORY_DEVICE = 0x0
> > +};
> > +
> >  /* 8.2.8.4.5.1 Command Return Codes */
> >  enum {
> >      RET_SUCCESS                 = 0x0,
> > @@ -40,6 +46,43 @@ enum {
> >      RET_MAX                     = 0x17
> >  };
> >  
> > +/* 8.2.9.5.1.1 */
> > +static int cmd_set_identify(CXLDeviceState *cxl_dstate, uint8_t cmd,
> > +                            uint32_t *ret_size)
> 
> I'm a bit confused on naming here, perhaps rsp_set_identity makes
> it clearer which direction this is going in?  I think this is
> filling in the reply for a command from software running on the
> system. Naming seems to me to suggest we are setting the identity
> of the hardware.  
> 

It sounds like maybe you read "identify" as "identity"?

You're correct, this represents the firmware running on the memory device that
is receiving the identify command from the host. I've been thinking about
renaming these based on what the underlying device is. For instance, this might
become:

mem_dev_identify()
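
A rough sketch of how such a handler could be shaped, outside of QEMU, is below.
It is only an illustration under stated assumptions: mem_dev_identify_fill() and
struct identify_payload are hypothetical names, the struct layout is copied from
the patch, the payload is zeroed before it is filled (the patch does not do
this), and the fields are stored in host byte order, so a little-endian host is
assumed.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Layout copied from the patch; the struct name is hypothetical. */
struct identify_payload {
    char fw_revision[0x10];
    uint64_t total_capacity;
    uint64_t volatile_capacity;
    uint64_t persistent_capacity;
    uint64_t partition_align;
    uint16_t info_event_log_size;
    uint16_t warning_event_log_size;
    uint16_t failure_event_log_size;
    uint16_t fatal_event_log_size;
    uint32_t lsa_size;
    uint8_t poison_list_max_mer[3];
    uint16_t inject_poison_limit;
    uint8_t poison_caps;
    uint8_t qos_telemetry_caps;
} __attribute__((packed));

_Static_assert(sizeof(struct identify_payload) == 0x43, "Bad identify size");

/*
 * Device-side reply to the identify command: write the payload the host
 * will read back and return its length, or 0 if the buffer is too small.
 */
static uint32_t mem_dev_identify_fill(uint8_t *payload, size_t payload_len,
                                      uint64_t pmem_size)
{
    struct identify_payload id;

    if (payload_len < sizeof(id)) {
        return 0;
    }

    /* Zero everything first so unreported fields read back as 0. */
    memset(&id, 0, sizeof(id));
    snprintf(id.fw_revision, sizeof(id.fw_revision), "BWFW VERSION %02d", 0);

    /* PMEM only, as in the patch. */
    id.total_capacity = pmem_size;
    id.persistent_capacity = pmem_size;

    memcpy(payload, &id, sizeof(id));
    return sizeof(id);
}
```
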

> > +{
> > +    struct identify {
> > +        char fw_revision[0x10];
> > +        uint64_t total_capacity;
> > +        uint64_t volatile_capacity;
> > +        uint64_t persistent_capacity;
> > +        uint64_t partition_align;
> > +        uint16_t info_event_log_size;
> > +        uint16_t warning_event_log_size;
> > +        uint16_t failure_event_log_size;
> > +        uint16_t fatal_event_log_size;
> > +        uint32_t lsa_size;
> > +        uint8_t poison_list_max_mer[3];
> > +        uint16_t inject_poison_limit;
> > +        uint8_t poison_caps;
> > +        uint8_t qos_telemetry_caps;
> > +    } __attribute__((packed)) *id;
> > +    _Static_assert(sizeof(struct identify) == 0x43, "Bad identify size");
> > +
> > +    if (memory_region_size(cxl_dstate->pmem) < (256 << 20)) {
> > +        return RET_ENODEV;
> > +    }
> > +
> > +    /* PMEM only */
> > +    id = (struct identify *)((void *)cxl_dstate->mbox_reg_state +
> > +                             A_CXL_DEV_CMD_PAYLOAD);
> > +    snprintf(id->fw_revision, 0x10, "BWFW VERSION %02d", 0);
> > +    id->total_capacity = memory_region_size(cxl_dstate->pmem);
> > +    id->persistent_capacity = memory_region_size(cxl_dstate->pmem);
> > +
> > +    *ret_size = 0x43;
> > +    return RET_SUCCESS;
> > +}
> > +
> >  void process_mailbox(CXLDeviceState *cxl_dstate)
> >  {
> >      uint16_t ret = RET_SUCCESS;
> > @@ -63,8 +106,11 @@ void process_mailbox(CXLDeviceState *cxl_dstate)
> >  
> >      uint8_t cmd_set = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET);
> >      uint8_t cmd = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND);
> > -    (void)cmd;
> 
> Clean this stuff up before v2.
> 
> >      switch (cmd_set) {
> > +    case CXL_IDENTIFY:
> > +        ret = cmd_set_identify(cxl_dstate, cmd, &ret_len);
> > +        /* Fill in payload here */
> > +        break;
> >      default:
> >          ret = RET_ENOTSUP;
> >      }
> > diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> > index df00998def..2cb2a9af3c 100644
> > --- a/include/hw/cxl/cxl_device.h
> > +++ b/include/hw/cxl/cxl_device.h
> > @@ -69,6 +69,10 @@
> >  #define CXL_MAILBOX_REGISTERS_LENGTH \
> >      (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
> >  
> > +#define CXL_MEMORY_DEVICE_REGISTERS_OFFSET \
> > +    (CXL_MAILBOX_REGISTERS_OFFSET + CXL_MAILBOX_REGISTERS_LENGTH)
> > +#define CXL_MEMORY_DEVICE_REGISTERS_LENGTH 0x8
> > +
> >  typedef struct cxl_device_state {
> >      /* Boss container and caps registers */
> >      MemoryRegion device_registers;
> > @@ -76,6 +80,7 @@ typedef struct cxl_device_state {
> >      MemoryRegion caps;
> >      MemoryRegion device;
> >      MemoryRegion mailbox;
> > +    MemoryRegion memory_device;
> >  
> >      MemoryRegion *pmem;
> >      MemoryRegion *vmem;
> > @@ -131,6 +136,8 @@ REG32(CXL_DEV_CAP_ARRAY2, 4) /* We're going to pretend it's 64b */
> >  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
> >  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
> >                                                 CXL_DEVICE_CAP_REG_SIZE)
> > +CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MEMORY_DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET + \
> > +                                                     CXL_DEVICE_CAP_REG_SIZE * 2)
> >  
> >  void process_mailbox(CXLDeviceState *cxl_dstate);
> >  
> > @@ -181,4 +188,12 @@ REG32(CXL_DEV_BG_CMD_STS, 0x18)
> >  
> >  REG32(CXL_DEV_CMD_PAYLOAD, 0x20)
> >  
> > +/* XXX: actually a 64b registers */
> > +REG32(CXL_MEM_DEV_STS, 0)
> > +    FIELD(CXL_MEM_DEV_STS, FATAL, 0, 1)
> > +    FIELD(CXL_MEM_DEV_STS, FW_HALT, 1, 1)
> > +    FIELD(CXL_MEM_DEV_STS, MEDIA_STATUS, 2, 2)
> > +    FIELD(CXL_MEM_DEV_STS, MBOX_READY, 4, 1)
> > +    FIELD(CXL_MEM_DEV_STS, RESET_NEEDED, 5, 3)
> > +
> >  #endif
> 



* Re: [RFC PATCH 11/25] hw/pxb: Allow creation of a CXL PXB (host bridge)
  2020-11-16 16:44   ` Jonathan Cameron
@ 2020-11-16 22:01     ` Ben Widawsky
  2020-11-17 14:33       ` Jonathan Cameron
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-16 22:01 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Vishal Verma, Dan Williams, qemu-devel, Michael S. Tsirkin

On 20-11-16 16:44:09, Jonathan Cameron wrote:
> On Tue, 10 Nov 2020 21:47:10 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > This works like adding a typical pxb device, except the name is
> > 'pxb-cxl' instead of 'pxb-pcie'. An example command line would be as
> > follows:
> >   -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1
> > 
> > A CXL PXB is backward compatible with PCIe. What this means in practice
> > is that an operating system that is unaware of CXL should still be able
> > to enumerate this topology as if it were PCIe.
> > 
> > One can create multiple CXL PXB host bridges, but a host bridge can only
> > be connected to the main root bus. Host bridges cannot appear elsewhere
> > in the topology.
> > 
> > Note that as of this patch, the ACPI tables needed for the host bridge
> > (specifically, an ACPI object in _SB named ACPI0016 and the CEDT) aren't
> > created. So while this patch internally creates it, it cannot be
> > properly used by an operating system or other system software.
> > 
> > Upcoming patches will allow creating multiple host bridges.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> Hi Ben,
> 
> Few minor things inline.
> 
> Jonathan
> 
> > ---
> >  hw/pci-bridge/pci_expander_bridge.c | 67 ++++++++++++++++++++++++++++-
> >  hw/pci/pci.c                        |  7 +++
> >  include/hw/pci/pci.h                |  6 +++
> >  3 files changed, 78 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> > index 88c45dc3b5..3a8d815231 100644
> > --- a/hw/pci-bridge/pci_expander_bridge.c
> > +++ b/hw/pci-bridge/pci_expander_bridge.c
> > @@ -56,6 +56,10 @@ DECLARE_INSTANCE_CHECKER(PXBDev, PXB_DEV,
> >  DECLARE_INSTANCE_CHECKER(PXBDev, PXB_PCIE_DEV,
> >                           TYPE_PXB_PCIE_DEVICE)
> >  
> > +#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
> > +DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
> > +                         TYPE_PXB_CXL_DEVICE)
> > +
> >  struct PXBDev {
> >      /*< private >*/
> >      PCIDevice parent_obj;
> > @@ -67,6 +71,11 @@ struct PXBDev {
> >  
> >  static PXBDev *convert_to_pxb(PCIDevice *dev)
> >  {
> > +    /* A CXL PXB's parent bus is PCIe, so the normal check won't work */
> > +    if (object_dynamic_cast(OBJECT(dev), TYPE_PXB_CXL_DEVICE)) {
> > +        return PXB_CXL_DEV(dev);
> > +    }
> > +
> >      return pci_bus_is_express(pci_get_bus(dev))
> >          ? PXB_PCIE_DEV(dev) : PXB_DEV(dev);
> >  }
> > @@ -111,11 +120,20 @@ static const TypeInfo pxb_pcie_bus_info = {
> >      .class_init    = pxb_bus_class_init,
> >  };
> >  
> > +static const TypeInfo pxb_cxl_bus_info = {
> > +    .name          = TYPE_PXB_CXL_BUS,
> > +    .parent        = TYPE_CXL_BUS,
> > +    .instance_size = sizeof(PXBBus),
> > +    .class_init    = pxb_bus_class_init,
> > +};
> > +
> >  static const char *pxb_host_root_bus_path(PCIHostState *host_bridge,
> >                                            PCIBus *rootbus)
> >  {
> > -    PXBBus *bus = pci_bus_is_express(rootbus) ?
> > -                  PXB_PCIE_BUS(rootbus) : PXB_BUS(rootbus);
> > +    PXBBus *bus = pci_bus_is_cxl(rootbus) ?
> > +                      PXB_CXL_BUS(rootbus) :
> > +                      pci_bus_is_express(rootbus) ? PXB_PCIE_BUS(rootbus) :
> > +                                                    PXB_BUS(rootbus);
> 
> There comes a point where if / else is much more readable.
> 
> >  
> >      snprintf(bus->bus_path, 8, "0000:%02x", pxb_bus_num(rootbus));
> >      return bus->bus_path;
> > @@ -380,13 +398,58 @@ static const TypeInfo pxb_pcie_dev_info = {
> >      },
> >  };
> >  
> > +static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
> > +{
> > +    /* A CXL PXB's parent bus is still PCIe */
> > +    if (!pci_bus_is_express(pci_get_bus(dev))) {
> > +        error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
> > +        return;
> > +    }
> > +
> > +    pxb_dev_realize_common(dev, CXL, errp);
> > +}
> > +
> > +static void pxb_cxl_dev_class_init(ObjectClass *klass, void *data)
> > +{
> > +    DeviceClass *dc   = DEVICE_CLASS(klass);
> > +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> > +
> > +    k->realize             = pxb_cxl_dev_realize;
> > +    k->exit                = pxb_dev_exitfn;
> > +    k->vendor_id           = PCI_VENDOR_ID_INTEL;
> > +    k->device_id           = 0xabcd;
> 
> Just to check, is that an officially assigned device_id that we will never
> have a clash with?  Nice ID to get if it is :)

No, not the real ID.

My understanding is that the host bridge won't exist at all in the PCI
hierarchy. So basically all of these can be undeclared. For testing/development
purposes I wanted to see this info.

Happily, if I remove the vendor ID, device ID, class ID, and subsystem vendor
ID, everything still works and no bridge device shows up in lspci. So v2 will
drop all of them.

Thanks.

> 
> 
> > +    k->class_id            = PCI_CLASS_BRIDGE_HOST;
> > +    k->subsystem_vendor_id = PCI_VENDOR_ID_INTEL;
> > +
> > +    dc->desc = "CXL Host Bridge";
> > +    device_class_set_props(dc, pxb_dev_properties);
> > +    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
> > +
> > +    /* Host bridges aren't hotpluggable. FIXME: spec reference */
> > +    dc->hotpluggable = false;
> > +}
> > +
> > +static const TypeInfo pxb_cxl_dev_info = {
> > +    .name          = TYPE_PXB_CXL_DEVICE,
> > +    .parent        = TYPE_PCI_DEVICE,
> > +    .instance_size = sizeof(PXBDev),
> > +    .class_init    = pxb_cxl_dev_class_init,
> > +    .interfaces =
> > +        (InterfaceInfo[]){
> > +            { INTERFACE_CONVENTIONAL_PCI_DEVICE },
> > +            {},
> > +        },
> > +};
> > +
> >  static void pxb_register_types(void)
> >  {
> >      type_register_static(&pxb_bus_info);
> >      type_register_static(&pxb_pcie_bus_info);
> > +    type_register_static(&pxb_cxl_bus_info);
> >      type_register_static(&pxb_host_info);
> >      type_register_static(&pxb_dev_info);
> >      type_register_static(&pxb_pcie_dev_info);
> > +    type_register_static(&pxb_cxl_dev_info);
> >  }
> >  
> >  type_init(pxb_register_types)
> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > index db88788c4b..67eed889a4 100644
> > --- a/hw/pci/pci.c
> > +++ b/hw/pci/pci.c
> > @@ -220,6 +220,12 @@ static const TypeInfo pcie_bus_info = {
> >      .class_init = pcie_bus_class_init,
> >  };
> >  
> > +static const TypeInfo cxl_bus_info = {
> > +    .name       = TYPE_CXL_BUS,
> > +    .parent     = TYPE_PCIE_BUS,
> > +    .class_init = pcie_bus_class_init,
> > +};
> > +
> >  static PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num);
> >  static void pci_update_mappings(PCIDevice *d);
> >  static void pci_irq_handler(void *opaque, int irq_num, int level);
> > @@ -2847,6 +2853,7 @@ static void pci_register_types(void)
> >  {
> >      type_register_static(&pci_bus_info);
> >      type_register_static(&pcie_bus_info);
> > +    type_register_static(&cxl_bus_info);
> >      type_register_static(&conventional_pci_interface_info);
> >      type_register_static(&cxl_interface_info);
> >      type_register_static(&pcie_interface_info);
> > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > index 4e6fd59fdd..52267ff69e 100644
> > --- a/include/hw/pci/pci.h
> > +++ b/include/hw/pci/pci.h
> > @@ -405,6 +405,7 @@ typedef PCIINTxRoute (*pci_route_irq_fn)(void *opaque, int pin);
> >  #define TYPE_PCI_BUS "PCI"
> >  OBJECT_DECLARE_TYPE(PCIBus, PCIBusClass, PCI_BUS)
> >  #define TYPE_PCIE_BUS "PCIE"
> > +#define TYPE_CXL_BUS "CXL"
> >  
> >  bool pci_bus_is_express(PCIBus *bus);
> >  
> > @@ -753,6 +754,11 @@ static inline void pci_irq_pulse(PCIDevice *pci_dev)
> >      pci_irq_deassert(pci_dev);
> >  }
> >  
> > +static inline int pci_is_cxl(const PCIDevice *d)
> > +{
> > +    return d->cap_present & QEMU_PCIE_CAP_CXL;
> > +}
> > +
> >  static inline int pci_is_express(const PCIDevice *d)
> >  {
> >      return d->cap_present & QEMU_PCI_CAP_EXPRESS;
> 



* Re: [RFC PATCH 22/25] acpi/cxl: Create the CEDT (9.14.1)
  2020-11-16 17:15   ` Jonathan Cameron
@ 2020-11-16 22:05     ` Ben Widawsky
  0 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-16 22:05 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Eduardo Habkost, Michael S. Tsirkin, Vishal Verma, qemu-devel,
	Paolo Bonzini, Igor Mammedov, Dan Williams, Richard Henderson

On 20-11-16 17:15:03, Jonathan Cameron wrote:
> On Tue, 10 Nov 2020 21:47:21 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > The CXL Early Discovery Table is defined in the CXL 2.0 specification as
> > a way for the OS to get CXL specific information from the system
> > firmware.
> > 
> > As of CXL 2.0 spec, only 1 sub structure is defined, the CXL Host Bridge
> > Structure (CHBS) which is primarily useful for telling the OS exactly
> > where the MMIO for the host bridge is.
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> Trivial comments inline.
> 
> Jonathan
> 
> > ---
> >  hw/acpi/cxl.c                       | 72 +++++++++++++++++++++++++++++
> >  hw/i386/acpi-build.c                |  6 ++-
> >  hw/pci-bridge/pci_expander_bridge.c | 21 +--------
> >  include/hw/acpi/cxl.h               |  4 ++
> >  include/hw/pci/pci_bridge.h         | 25 ++++++++++
> >  5 files changed, 107 insertions(+), 21 deletions(-)
> > 
> > diff --git a/hw/acpi/cxl.c b/hw/acpi/cxl.c
> > index 31ceaeecc3..c9631763ad 100644
> > --- a/hw/acpi/cxl.c
> > +++ b/hw/acpi/cxl.c
> > @@ -18,14 +18,86 @@
> >   */
> >  
> >  #include "qemu/osdep.h"
> > +#include "hw/sysbus.h"
> > +#include "hw/pci/pci_bridge.h"
> > +#include "hw/pci/pci_host.h"
> >  #include "hw/cxl/cxl.h"
> > +#include "hw/mem/memory-device.h"
> >  #include "hw/acpi/acpi.h"
> >  #include "hw/acpi/aml-build.h"
> >  #include "hw/acpi/bios-linker-loader.h"
> >  #include "hw/acpi/cxl.h"
> > +#include "hw/acpi/cxl.h"
> >  #include "qapi/error.h"
> >  #include "qemu/uuid.h"
> >  
> > +static void cedt_build_chbs(GArray *table_data, PXBDev *cxl)
> > +{
> > +    SysBusDevice *sbd = SYS_BUS_DEVICE(cxl->cxl.cxl_host_bridge);
> > +    struct MemoryRegion *mr = sbd->mmio[0].memory;
> > +
> > +    /* Type */
> > +    build_append_int_noprefix(table_data, 0, 1);
> > +
> > +    /* Reserved */
> > +    build_append_int_noprefix(table_data, 0xff, 1);
> 
> Why 0xff rather than 0x00?  ACPI uses default of 0 for reserved bits
> (5.2.1 in ACPI 6.3 spec)
> 
> > +
> > +    /* Record Length */
> > +    build_append_int_noprefix(table_data, 32, 2);
> > +
> > +    /* UID */
> > +    build_append_int_noprefix(table_data, cxl->uid, 4);
> > +
> > +    /* Version */
> > +    build_append_int_noprefix(table_data, 1, 4);
> > +
> > +    /* Reserved */
> > +    build_append_int_noprefix(table_data, 0xffffffff, 4);
> > +
> > +    /* Base */
> > +    build_append_int_noprefix(table_data, mr->addr, 8);
> > +
> > +    /* Length */
> > +    build_append_int_noprefix(table_data, memory_region_size(mr), 4);
> 
> Better to just treat this as a 64 bit field as per the spec, even though
> it can only contain 0x10000?
> 

Ah, I based this on a pre-release version where it was 32-bit. I'll fix it.
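
A rough standalone sketch of what the record could look like with both comments
applied (reserved bytes zeroed, Length widened to 8 bytes so the record stays at
32 bytes). append_le() and build_chbs() are hypothetical stand-ins for
build_append_int_noprefix() and cedt_build_chbs(), not the QEMU functions, and
the field order is simply the one from the patch above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for QEMU's build_append_int_noprefix():
 * appends the low `size` bytes of `value` in little-endian order.
 */
static size_t append_le(uint8_t *buf, size_t pos, uint64_t value, int size)
{
    for (int i = 0; i < size; i++) {
        buf[pos++] = (uint8_t)(value & 0xff);
        value >>= 8;
    }
    return pos;
}

/* Builds one CHBS record into buf (at least 32 bytes); returns its length. */
static size_t build_chbs(uint8_t *buf, uint32_t uid, uint64_t base,
                         uint64_t length)
{
    size_t pos = 0;

    pos = append_le(buf, pos, 0, 1);      /* Type: 0 = CHBS */
    pos = append_le(buf, pos, 0, 1);      /* Reserved, zero as ACPI expects */
    pos = append_le(buf, pos, 32, 2);     /* Record Length */
    pos = append_le(buf, pos, uid, 4);    /* UID */
    pos = append_le(buf, pos, 1, 4);      /* Version (value from the patch) */
    pos = append_le(buf, pos, 0, 4);      /* Reserved */
    pos = append_le(buf, pos, base, 8);   /* Base */
    pos = append_le(buf, pos, length, 8); /* Length, now a 64-bit field */

    assert(pos == 32);
    return pos;
}
```

The 8-byte Length takes the place of the old 4-byte Length plus the trailing
4-byte reserved field, so the Record Length of 32 does not change.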

> > +
> > +    /* Reserved */
> > +    build_append_int_noprefix(table_data, 0xffffffff, 4);
> > +}
> > +
> > +static int cxl_foreach_pxb_hb(Object *obj, void *opaque)
> > +{
> > +    Aml *cedt = opaque;
> > +
> > +    if (object_dynamic_cast(obj, TYPE_PXB_CXL_DEVICE)) {
> > +        PXBDev *pxb = PXB_CXL_DEV(obj);
> > +
> > +        cedt_build_chbs(cedt->buf, pxb);
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> > +void cxl_build_cedt(GArray *table_offsets, GArray *table_data,
> > +                    BIOSLinker *linker)
> > +{
> > +    const int cedt_start = table_data->len;
> > +    Aml *cedt;
> > +
> > +    cedt = init_aml_allocator();
> > +
> > +    /* reserve space for CEDT header */
> > +    acpi_add_table(table_offsets, table_data);
> > +    acpi_data_push(cedt->buf, sizeof(AcpiTableHeader));
> > +
> > +    object_child_foreach_recursive(object_get_root(), cxl_foreach_pxb_hb, cedt);
> > +
> > +    /* copy AML table into ACPI tables blob and patch header there */
> > +    g_array_append_vals(table_data, cedt->buf->data, cedt->buf->len);
> > +    build_header(linker, table_data, (void *)(table_data->data + cedt_start),
> > +                 "CEDT", table_data->len - cedt_start, 1, NULL, NULL);
> > +    free_aml_allocator();
> > +}
> > +
> >  static Aml *__build_cxl_osc_method(void)
> >  {
> >      Aml *method, *if_uuid, *else_uuid, *if_arg1_not_1, *if_cxl, *if_caps_masked;
> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > index dd1f8b39d4..eda62dcd6a 100644
> > --- a/hw/i386/acpi-build.c
> > +++ b/hw/i386/acpi-build.c
> > @@ -75,6 +75,8 @@
> >  #include "hw/acpi/ipmi.h"
> >  #include "hw/acpi/hmat.h"
> >  
> > +#include "hw/acpi/cxl.h"
> > +
> >  /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
> >   * -M pc-i440fx-2.0.  Even if the actual amount of AML generated grows
> >   * a little bit, there should be plenty of free space since the DSDT
> > @@ -1662,7 +1664,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
> >  
> >              scope = aml_scope("\\_SB");
> >              if (type == CXL) {
> > -                dev = aml_device("CXL%.01X", pci_bus_uid(bus));
> > +                dev = aml_device("CXL%.01X", uid);
> >              } else {
> >                  dev = aml_device("PC%.02X", bus_num);
> >              }
> > @@ -2568,6 +2570,8 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
> >                            machine->nvdimms_state, machine->ram_slots);
> >      }
> >  
> > +    cxl_build_cedt(table_offsets, tables_blob, tables->linker);
> > +
> >      acpi_add_table(table_offsets, tables_blob);
> >      build_waet(tables_blob, tables->linker);
> >  
> > diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> > index 75910f5870..b2c1d9056a 100644
> > --- a/hw/pci-bridge/pci_expander_bridge.c
> > +++ b/hw/pci-bridge/pci_expander_bridge.c
> > @@ -57,26 +57,6 @@ DECLARE_INSTANCE_CHECKER(PXBDev, PXB_DEV,
> >  DECLARE_INSTANCE_CHECKER(PXBDev, PXB_PCIE_DEV,
> >                           TYPE_PXB_PCIE_DEVICE)
> >  
> > -#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
> > -DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
> > -                         TYPE_PXB_CXL_DEVICE)
> > -
> > -struct PXBDev {
> > -    /*< private >*/
> > -    PCIDevice parent_obj;
> > -    /*< public >*/
> > -
> > -    uint8_t bus_nr;
> > -    uint16_t numa_node;
> > -    int32_t uid;
> > -    struct cxl_dev {
> > -        HostMemoryBackend *memory_window[CXL_WINDOW_MAX];
> > -
> > -        uint32_t num_windows;
> > -        hwaddr *window_base[CXL_WINDOW_MAX];
> > -    } cxl;
> > -};
> > -
> >  typedef struct CXLHost {
> >      PCIHostState parent_obj;
> >  
> > @@ -351,6 +331,7 @@ static void pxb_dev_realize_common(PCIDevice *dev, enum BusType type,
> >          bus = pci_root_bus_new(ds, dev_name, NULL, NULL, 0, TYPE_PXB_CXL_BUS);
> >          bus->flags |= PCI_BUS_CXL;
> >          PXB_CXL_HOST(ds)->dev = PXB_CXL_DEV(dev);
> > +        PXB_CXL_DEV(dev)->cxl.cxl_host_bridge = ds;
> >      } else {
> >          bus = pci_root_bus_new(ds, "pxb-internal", NULL, NULL, 0, TYPE_PXB_BUS);
> >          bds = qdev_new("pci-bridge");
> > diff --git a/include/hw/acpi/cxl.h b/include/hw/acpi/cxl.h
> > index 7b8f3b8a2e..db2063f8c9 100644
> > --- a/include/hw/acpi/cxl.h
> > +++ b/include/hw/acpi/cxl.h
> > @@ -18,6 +18,10 @@
> >  #ifndef HW_ACPI_CXL_H
> >  #define HW_ACPI_CXL_H
> >  
> > +#include "hw/acpi/bios-linker-loader.h"
> > +
> > +void cxl_build_cedt(GArray *table_offsets, GArray *table_data,
> > +                    BIOSLinker *linker);
> >  void build_cxl_osc_method(Aml *dev);
> >  
> >  #endif
> > diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
> > index a94d350034..50dd7fdf33 100644
> > --- a/include/hw/pci/pci_bridge.h
> > +++ b/include/hw/pci/pci_bridge.h
> > @@ -28,6 +28,7 @@
> >  
> >  #include "hw/pci/pci.h"
> >  #include "hw/pci/pci_bus.h"
> > +#include "hw/cxl/cxl.h"
> >  #include "qom/object.h"
> >  
> >  typedef struct PCIBridgeWindows PCIBridgeWindows;
> > @@ -81,6 +82,30 @@ struct PCIBridge {
> >  #define PCI_BRIDGE_DEV_PROP_MSI        "msi"
> >  #define PCI_BRIDGE_DEV_PROP_SHPC       "shpc"
> >  
> > +struct PXBDev {
> > +    /*< private >*/
> > +    PCIDevice parent_obj;
> > +    /*< public >*/
> > +
> > +    uint8_t bus_nr;
> > +    uint16_t numa_node;
> > +    int32_t uid;
> > +
> > +    struct cxl_dev {
> > +        HostMemoryBackend *memory_window[CXL_WINDOW_MAX];
> > +
> > +        uint32_t num_windows;
> > +        hwaddr *window_base[CXL_WINDOW_MAX];
> > +
> > +        void *cxl_host_bridge; /* Pointer to a CXLHost */
> > +    } cxl;
> > +};
> > +
> > +typedef struct PXBDev PXBDev;
> > +#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
> > +DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
> > +                         TYPE_PXB_CXL_DEVICE)
> > +
> 
> Seems like this could sensibly be on one line?
> Could have been in earlier patch as well of course.
> 

Yeah - this seems to be the convention throughout the code. I didn't follow the
reasoning when it was introduced (which was recently).

> >  int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
> >                            uint16_t svid, uint16_t ssid,
> >                            Error **errp);
> 



* Re: [RFC PATCH 03/25] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
  2020-11-16 19:19     ` Ben Widawsky
@ 2020-11-17 12:29       ` Jonathan Cameron
  2020-11-24 23:09         ` Ben Widawsky
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-17 12:29 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Vishal Verma, Dan Williams, qemu-devel

On Mon, 16 Nov 2020 11:19:36 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 20-11-16 12:03:52, Jonathan Cameron wrote:
> > On Tue, 10 Nov 2020 21:47:02 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> > > A CXL 2.0 component is any entity in the CXL topology. All components
> > > have an analogous function in PCIe. Except for the CXL host bridge, all
> > > have a PCIe config space that is accessible via the common PCIe
> > > mechanisms. CXL components are enumerated via DVSEC fields in the
> > > extended PCIe header space. CXL components will minimally implement some
> > > subset of CXL.mem and CXL.cache registers defined in 8.2.5 of the CXL
> > > 2.0 specification. Two headers and a utility library are introduced to
> > > support the minimum functionality needed to enumerate components.
> > > 
> > > The cxl_pci header manages bits associated with PCI, specifically the
> > > DVSEC and related fields. The cxl_component.h variant has data
> > > structures and APIs that are useful for drivers implementing any of the
> > > CXL 2.0 components. The library takes care of making use of the DVSEC
> > > bits and the CXL.[mem|cache] registers.
> > > 
> > > None of the mechanisms required to enumerate a CXL capable hostbridge
> > > are introduced at this point.
> > > 
> > > Note that the CXL.mem and CXL.cache registers used are always 4B wide.
> > > It's possible in the future that this constraint will not hold.
> > > 
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > 
> > > --
> > > It's tempting to have a more generalized DVSEC infrastructure. As far as
> > > I can tell, the amount this would actually save in terms of code is
> > > minimal because most of DVSEC is vendor specific.
> > 
> > Agreed.  Probably not worth bothering with generic infrastructure for 2.5 DW.
> > 
> > A few comments inline.
> > 
> > Jonathan
> >   
> 
> Anything I didn't respond to is accepted and will be in v2.
> 
> Thanks.
> Ben
> 
Hi Ben,

...

> >   
> > > +
> > > +/* 8.2.5.10 - CXL Security Capability Structure */
> > > +#define CXL_SEC_REGISTERS_OFFSET (CXL_RAS_REGISTERS_OFFSET + CXL_RAS_REGISTERS_SIZE)
> > > +#define CXL_SEC_REGISTERS_SIZE   0 /* We don't implement 1.1 downstream ports */
> > > +
> > > +/* 8.2.5.11 - CXL Link Capability Structure */
> > > +#define CXL_LINK_REGISTERS_OFFSET (CXL_SEC_REGISTERS_OFFSET + CXL_SEC_REGISTERS_SIZE)
> > > +#define CXL_LINK_REGISTERS_SIZE   0x38  
> > 
> > Bit odd to introduce this but not define anything... Can't we bring these
> > in when we need them later?  
> 
> Repeating my comment from 00/25...
> 
> For this specific patch series I liked providing #defines in bulk so that it's
> easy enough to just bring up the spec and review them. Not sure if there is a
> preference from others in the community on this.

Personally I'd prefer to see the lot if you are going to do that, on the basis
that it should only need reviewing against the spec once.
Not sure others will agree though, as "the lot" is an awful lot.

> 
> I could also introduce the library that generates the capability headers with
> this. Either is fine, I just wanted to point out the intent.
> 
> >   
> > > +
> > > +/* 8.2.5.12 - CXL HDM Decoder Capability Structure */
> > > +#define HDM_DECODE_MAX 10 /* 8.2.5.12.1 */
> > > +#define CXL_HDM_REGISTERS_OFFSET \
> > > +    (CXL_LINK_REGISTERS_OFFSET + CXL_LINK_REGISTERS_SIZE) /* 8.2.5.12 */
> > > +#define CXL_HDM_REGISTERS_SIZE (0x20 + HDM_DECODE_MAX * 10)
> > > +#define HDM_DECODER_INIT(n, base)                          \
> > > +    REG32(CXL_HDM_DECODER##n##_BASE_LO, base + 0x10)       \  
> > 
> > Offset n should be included in the address calc.  It's always 0 at the moment
> > but might as well put it in now.  Mind you there is something a bit odd
> > in the spec I'm looking at. Nothing defined at 0x2c but no reserved line
> > either in the table.  
> 
> My guess is some earlier version of the spec had the decoder registers as 64b
> and so they wanted to keep natural alignment.

Agreed, but having a hole in the spec isn't great.  Looks like a reserved
field should be inserted.
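
On the earlier point about folding n into the address calculation, a minimal
sketch of what I had in mind. The 0x20 stride is an assumption on my part,
inferred from decoder 0 ending at 0x2c (the hole mentioned above), and the
names below are hypothetical rather than taken from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: each decoder occupies a 0x20-byte block, the first at 0x10. */
#define HDM_DECODER_FIRST   0x10
#define HDM_DECODER_STRIDE  0x20

/* Offset of decoder n's BASE_LO register relative to the HDM capability. */
static inline uint32_t hdm_decoder_base_lo(uint32_t base, unsigned int n)
{
    return base + HDM_DECODER_FIRST + n * HDM_DECODER_STRIDE;
}
```

With n = 0 this gives base + 0x10, which matches what the macro computes today.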

> 
> > 
...

> >   
> > > +#define PCIE_DVSEC_ID_OFFSET     0x8
> > > +
> > > +#define PCIE_CXL_DEVICE_DVSEC_LENGTH 0x38
> > > +#define PCIE_CXL_DEVICE_DVSEC_REVID  1  
> > 
> > Make it clear this is the CXL 2.0 revid.
> > It would be 0 for CXL 1.1 I think? (8.1.3 of CXL 2.0 spec)  
> 
> Got it. BTW, you're correct. It is in the verbiage there
> "DVSEC Revision ID of 0h represents the structure as defined in CXL 1.1 specification."
> 
> A bit hidden IMO.

Yes, it's 'fun' finding some stuff in that doc, though most things you are looking
for turn out to be somewhere at least.

> 
> > 
> >   
> > > +
> > > +#define EXTENSIONS_PORT_DVSEC_LENGTH 0x28
> > > +#define EXTENSIONS_PORT_DVSEC_REVID  1  
> > 
> > > I'm assuming this is the CXL 2.0 extensions DVSEC for ports
> > in figure 128?
> > 
> > If so table 128 has dvsec revision as 0. 
> >   
> 
> Good catch, btw a shortcut is to look at Table 124.

Good point - I'd missed the revision column in that :)

> 
> > > +
> > > +#define GPF_PORT_DVSEC_LENGTH 0x10
> > > +#define GPF_PORT_DVSEC_REVID  0
> > > +
> > > +#define PCIE_FLEXBUS_PORT_DVSEC_LENGTH_2_0 0x14
> > > +#define PCIE_FLEXBUS_PORT_DVSEC_REVID_2_0  1
> > > +
> > > +#define REG_LOC_DVSEC_LENGTH 0x24
> > > +#define REG_LOC_DVSEC_REVID  0  
> > 
> > Whilst I appreciate this is an RFC it would seem more logical
> > to me to only list things in the following enum if we
> > have also defined them here.  E.g. GPF_DEVICE_DVSEC doesn't
> > have length and revid defines.
> >   
> > > +
> > > +enum {
> > > +    PCIE_CXL_DEVICE_DVSEC      = 0,
> > > +    NON_CXL_FUNCTION_MAP_DVSEC = 2,
> > > +    EXTENSIONS_PORT_DVSEC      = 3,
> > > +    GPF_PORT_DVSEC             = 4,
> > > +    GPF_DEVICE_DVSEC           = 5,
> > > +    PCIE_FLEXBUS_PORT_DVSEC    = 7,
> > > +    REG_LOC_DVSEC              = 8,
> > > +    MLD_DVSEC                  = 9,
> > > +    CXL20_MAX_DVSEC
> > > +};
> > > +
> > > +struct dvsec_header {
> > > +    uint32_t cap_hdr;
> > > +    uint32_t dv_hdr1;
> > > +    uint16_t dv_hdr2;
> > > +} __attribute__((__packed__));
> > > +_Static_assert(sizeof(struct dvsec_header) == 10,
> > > +               "dvsec header size incorrect");
> > > +
> > > +/*
> > > + * CXL 2.0 devices must implement certain DVSEC IDs, and can [optionally]
> > > + * implement others.
> > > + *
> > > + * CXL 2.0 Device: 0, [2], 5, 8
> > > + * CXL 2.0 RP: 3, 4, 7, 8
> > > + * CXL 2.0 Upstream Port: [2], 7, 8
> > > + * CXL 2.0 Downstream Port: 3, 4, 7, 8
> > > + */
> > > +
> > > +/* CXL 2.0 - 8.1.5 (ID 0003) */
> > > +struct dvsec_port {  
> > 
> > I'd keep naming consistent.  It's called EXTENSIONS_PORT_DVSEC above
> > so extensions_dvsec_port here.
> >   
> > > +    struct dvsec_header hdr;
> > > +    uint16_t status;
> > > +    uint16_t control;
> > > +    uint8_t alt_bus_base;
> > > +    uint8_t alt_bus_limit;
> > > +    uint16_t alt_memory_base;
> > > +    uint16_t alt_memory_limit;
> > > +    uint16_t alt_prefetch_base;
> > > +    uint16_t alt_prefetch_limit;
> > > +    uint32_t alt_prefetch_base_high;
> > > +    uint32_t alt_prefetch_base_low;
> > > +    uint32_t rcrb_base;
> > > +    uint32_t rcrb_base_high;
> > > +};
> > > +_Static_assert(sizeof(struct dvsec_port) == 0x28, "dvsec port size incorrect");
> > > +#define PORT_CONTROL_OVERRIDE_OFFSET 0xc  
> > I'm not totally sure what this define is, but seems
> > like it's simply the offset of the control field above.
> > > If so can't we get it from there directly?  
> 
> Firstly, I only define these to show how one would handle DVSEC writes. I don't
> actually have a use for them as of now. It is just the offset, but I don't know
> what you mean by getting it from there directly. Could you elaborate a bit?

As you have a packed representation you should be able to do some
address arithmetic to get it.  offsetof(struct dvsec_port, control) I think....
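
Roughly like this, as a standalone sketch. The structs are copied from the
patch; PORT_CONTROL_OFFSET is a hypothetical name for the derived define, and
the packed attribute assumes GCC or Clang:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct dvsec_header {
    uint32_t cap_hdr;
    uint32_t dv_hdr1;
    uint16_t dv_hdr2;
} __attribute__((__packed__));

struct dvsec_port {
    struct dvsec_header hdr;
    uint16_t status;
    uint16_t control;
    uint8_t alt_bus_base;
    uint8_t alt_bus_limit;
    uint16_t alt_memory_base;
    uint16_t alt_memory_limit;
    uint16_t alt_prefetch_base;
    uint16_t alt_prefetch_limit;
    uint32_t alt_prefetch_base_high;
    uint32_t alt_prefetch_base_low;
    uint32_t rcrb_base;
    uint32_t rcrb_base_high;
};

/* Derived from the layout instead of being hard-coded as 0xc. */
#define PORT_CONTROL_OFFSET offsetof(struct dvsec_port, control)

_Static_assert(sizeof(struct dvsec_port) == 0x28, "dvsec port size incorrect");
_Static_assert(PORT_CONTROL_OFFSET == 0xc, "control field offset changed");
```

The second static assert keeps the spec value visible while the define itself
follows the struct if fields are ever added or reordered.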

Thanks,

Jonathan



* Re: [RFC PATCH 00/25] Introduce CXL 2.0 Emulation
  2020-11-16 18:06   ` Ben Widawsky
@ 2020-11-17 14:09     ` Jonathan Cameron
  2020-11-25 18:29       ` Ben Widawsky
  0 siblings, 1 reply; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-17 14:09 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Vishal Verma, Dan Williams, qemu-devel

On Mon, 16 Nov 2020 10:06:26 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 20-11-16 17:21:07, Jonathan Cameron wrote:
> > On Tue, 10 Nov 2020 21:46:59 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> > > Introduce emulation of Compute Express Link 2.0, which was released
> > > today at https://www.computeexpresslink.org/.
> > > 
> > > I've pushed a branch here: https://gitlab.com/bwidawsk/qemu/-/tree/cxl-2.0
> > > 
> > > The emulation has been critical to get the Linux enabling started
> > > (https://lore.kernel.org/linux-cxl/), it would be an ideal place to land
> > > regression tests for different topology handling, and there may be applications
> > > for this emulation as a way for a guest to manipulate its address space relative
> > > to different performance memories. I am new to QEMU development, so please
> > > forgive and point me in the right direction if I severely misinterpreted where a
> > > piece of infrastructure belongs.
> > > 
> > > Three of the five CXL component types are emulated with some level of functionality:
> > > host bridge, root port, and memory device. Upstream ports and downstream ports
> > > aren't implemented (the two components needed to make up a switch).
> > > 
> > > CXL 2.0 is built on top of PCIe (see spec for details). As a result, much of the
> > > implementation utilizes existing PCI paradigms. To implement the host bridge,
> > > I've chosen to use PXB (PCI Expander Bridge). It seemed to be the most natural
> > > fit even though it doesn't directly map to how hardware will work. For
> > > persistent capacity of the memory device, I utilized the memory subsystem
> > > (hw/mem).
> > > 
> > > We have 3 reasons why this work is valuable:
> > > 1. OS driver development and testing
> > > 2. OS driver regression testing
> > > 3. Possible guest support for HDMs
> > > 
> > > As mentioned above there are three benefits to carrying this enabling in
> > > upstream QEMU:
> > > 
> > > 1. Linux driver feature development benefits from emulation both due to
> > > a lack of initial hardware availability, but also, as is seen with
> > > NVDIMM/PMEM emulation, there is value in being able to share
> > > topologies with system-software developers even after hardware is
> > > available.
> > > 
> > > 2. The Linux kernel's unit test suite for NVDIMM/PMEM ended up injecting fake
> > > resources via custom modules (nfit_test). In retrospect a QEMU emulation of
> > > nfit_test capabilities would have made the test environment more portable, and
> > > allowed for easier community contributions of example configurations.
> > > 
> > > 3. This is still being fleshed out, but in short it provides a standardized
> > > mechanism for the guest to provide feedback to the host about size and placement
> > > needs of the memory. After the host gives the guest a physical window mapping to
> > > the CXL device, the emulated HDM decoders allow the guest a way to tell the host
> > > how much it wants and where. There are likely simpler ways to do this, but
> > > they'd require inventing a new interface and you'd need to have diverging driver
> > > code in the guest programming of the HDM decoder vs. the host. Since we've
> > > already done this work, why not use it?
> > > 
> > > There is quite a long list of work to do for full spec compliance, but I don't
> > > believe that any of it precludes merging. Off the top of my head:
> > > - Main host bridge support (WIP)
> > > - Interleaving
> > > - Better Tests
> > > - Huge swaths of firmware functionality
> > > - Hot plug support
> > > - Emulating volatile capacity
> > > 
> > > The flow of the patches in general is to define all the data structures and
> > > registers associated with the various components in a top down manner. Host
> > > bridge, component, ports, devices. Then, the actual implementation is done in
> > > the same order.
> > > 
> > > The summary is:
> > > 1-8: Put infrastructure in place for emulation of the components.
> > > 9-11: Create the concept of a CXL bus and plumb into PXB
> > > 12-16: Implement host bridges
> > > 17: Implement a root port
> > > 18: Implement a memory device
> > > 19: Implement HDM decoders
> > > 20-24: ACPI bits
> > > 25: Start working on enabling the main host bridge  
> > 
> > Hi Ben,
> > 
> > > I've taken a look at the whole series and offered a few comments on things that
> > stood out.  Unfortunately I'm playing catchup on CXL 2.0 and my qemu knowledge
> > is not what I'd like it to be.
> > 
> > Having said that, this feels like a good start to me.  Please clean up
> > the few patch handling issues before a v2.  Code that appears, disappears and
> > reappears is a bit distracting :)
> > 
> > Next up, the kernel side.
> > 
> > Thanks,
> > 
> > Jonathan  
> 
> Thanks very much for taking the time Jonathan. I saw your CCIX series early on
> and it was definitely helpful to me, so thanks for that as well. As you can
> probably tell, this series has been rebased to hell and back and you caught some
> of that in the code churn. I'll work on fixing those. I foolishly did a pretty
> major refactor just before submission.
> 
> I wanted to discuss the 'dump all the defines in a patch and use them later'
> style I went for. In general, I don't do this and I leave feedback on patches
> that do this. I had two reasons for doing it here:
> 1. I wanted to separate a, 'go read the spec review' from actual functionality.
>    I hope some of the issues you spotted were because of that.

An aim I can definitely get behind.  However, at the moment it feels like a half
way stage.  Some sections are fully defined, others not.  Mind you I don't know
about how the qemu community feels about large definition sets that aren't going
to get used for a 'while'.

> 2. Since I decided to make all the helper libraries first, many defines are
>    needed for that.
> 
> For v2, I'll make sure there are no #define only patches, but I would still like
> to introduce the helper libraries first which will leave some unused functions
> and defines for a few patches.

Agreed, what I wasn't keen on was the intermediate state of structures being
defined but then given 0 size.  I'd rather just look at them all once.  If that
sometimes means introducing a file that isn't even referenced for a few patches,
that's fine by me.

Jonathan

> 
> Ben
> 
> >   
> > > 
> > > Ben Widawsky (23):
> > >   hw/pci/cxl: Add a CXL component type (interface)
> > >   hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
> > >   hw/cxl/device: Introduce a CXL device (8.2.8)
> > >   hw/cxl/device: Implement the CAP array (8.2.8.1-2)
> > >   hw/cxl/device: Add device status (8.2.8.3)
> > >   hw/cxl/device: Implement basic mailbox (8.2.8.4)
> > >   hw/cxl/device: Add memory devices (8.2.8.5)
> > >   hw/pxb: Use a type for realizing expanders
> > >   hw/pci/cxl: Create a CXL bus type
> > >   hw/pxb: Allow creation of a CXL PXB (host bridge)
> > >   acpi/pci: Consolidate host bridge setup
> > >   hw/pci: Plumb _UID through host bridges
> > >   hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
> > >   acpi/pxb/cxl: Reserve host bridge MMIO
> > >   hw/pxb/cxl: Add "windows" for host bridges
> > >   hw/cxl/rp: Add a root port
> > >   hw/cxl/device: Add a memory device (8.2.8.5)
> > >   hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)
> > >   acpi/cxl: Add _OSC implementation (9.14.2)
> > >   acpi/cxl: Create the CEDT (9.14.1)
> > >   Temp: acpi/cxl: Add ACPI0017 (CEDT awareness)
> > >   WIP: i386/cxl: Initialize a host bridge
> > >   qtest/cxl: Add very basic sanity tests
> > > 
> > > Jonathan Cameron (1):
> > >   Temp: Add the PCI_EXT_ID_DVSEC definition to the qemu pci_regs.h copy.
> > > 
> > > Vishal Verma (1):
> > >   acpi/cxl: Introduce a compat-driver UUID for CXL _OSC
> > > 
> > >  MAINTAINERS                               |   6 +
> > >  hw/Kconfig                                |   1 +
> > >  hw/acpi/Kconfig                           |   5 +
> > >  hw/acpi/cxl.c                             | 198 +++++++++++++
> > >  hw/acpi/meson.build                       |   1 +
> > >  hw/arm/virt.c                             |   1 +
> > >  hw/core/machine.c                         |  26 ++
> > >  hw/core/numa.c                            |   3 +
> > >  hw/cxl/Kconfig                            |   3 +
> > >  hw/cxl/cxl-component-utils.c              | 192 +++++++++++++
> > >  hw/cxl/cxl-device-utils.c                 | 293 +++++++++++++++++++
> > >  hw/cxl/cxl-mailbox-utils.c                | 139 +++++++++
> > >  hw/cxl/meson.build                        |   5 +
> > >  hw/i386/acpi-build.c                      |  87 +++++-
> > >  hw/i386/microvm.c                         |   1 +
> > >  hw/i386/pc.c                              |   2 +
> > >  hw/mem/Kconfig                            |   5 +
> > >  hw/mem/cxl_type3.c                        | 334 ++++++++++++++++++++++
> > >  hw/mem/meson.build                        |   1 +
> > >  hw/meson.build                            |   1 +
> > >  hw/pci-bridge/Kconfig                     |   5 +
> > >  hw/pci-bridge/cxl_root_port.c             | 231 +++++++++++++++
> > >  hw/pci-bridge/meson.build                 |   1 +
> > >  hw/pci-bridge/pci_expander_bridge.c       | 209 +++++++++++++-
> > >  hw/pci-bridge/pcie_root_port.c            |   6 +-
> > >  hw/pci/pci.c                              |  32 ++-
> > >  hw/pci/pcie.c                             |  30 ++
> > >  hw/ppc/spapr.c                            |   2 +
> > >  include/hw/acpi/cxl.h                     |  27 ++
> > >  include/hw/boards.h                       |   2 +
> > >  include/hw/cxl/cxl.h                      |  30 ++
> > >  include/hw/cxl/cxl_component.h            | 181 ++++++++++++
> > >  include/hw/cxl/cxl_device.h               | 199 +++++++++++++
> > >  include/hw/cxl/cxl_pci.h                  | 155 ++++++++++
> > >  include/hw/pci/pci.h                      |  15 +
> > >  include/hw/pci/pci_bridge.h               |  25 ++
> > >  include/hw/pci/pci_bus.h                  |   8 +
> > >  include/hw/pci/pci_ids.h                  |   1 +
> > >  include/standard-headers/linux/pci_regs.h |   1 +
> > >  monitor/hmp-cmds.c                        |  15 +
> > >  qapi/machine.json                         |   1 +
> > >  tests/qtest/cxl-test.c                    |  93 ++++++
> > >  tests/qtest/meson.build                   |   4 +
> > >  43 files changed, 2547 insertions(+), 30 deletions(-)
> > >  create mode 100644 hw/acpi/cxl.c
> > >  create mode 100644 hw/cxl/Kconfig
> > >  create mode 100644 hw/cxl/cxl-component-utils.c
> > >  create mode 100644 hw/cxl/cxl-device-utils.c
> > >  create mode 100644 hw/cxl/cxl-mailbox-utils.c
> > >  create mode 100644 hw/cxl/meson.build
> > >  create mode 100644 hw/mem/cxl_type3.c
> > >  create mode 100644 hw/pci-bridge/cxl_root_port.c
> > >  create mode 100644 include/hw/acpi/cxl.h
> > >  create mode 100644 include/hw/cxl/cxl.h
> > >  create mode 100644 include/hw/cxl/cxl_component.h
> > >  create mode 100644 include/hw/cxl/cxl_device.h
> > >  create mode 100644 include/hw/cxl/cxl_pci.h
> > >  create mode 100644 tests/qtest/cxl-test.c
> > >   
> >   




* Re: [RFC PATCH 04/25] hw/cxl/device: Introduce a CXL device (8.2.8)
  2020-11-16 21:11     ` Ben Widawsky
@ 2020-11-17 14:21       ` Jonathan Cameron
  0 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-17 14:21 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Vishal Verma, Dan Williams, qemu-devel

On Mon, 16 Nov 2020 13:11:16 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 20-11-16 13:07:56, Jonathan Cameron wrote:
> > On Tue, 10 Nov 2020 21:47:03 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> > > A CXL device is a type of CXL component. Conceptually, a CXL device
> > > would be a leaf node in a CXL topology. From an emulation perspective,
> > > CXL devices are the most complex and so the actual implementation is
> > > reserved for discrete commits.
> > > 
> > > > This new device type is specifically catered towards the eventual
> > > implementation of a Type3 CXL.mem device, 8.2.8.5 in the CXL 2.0
> > > specification.
> > > 
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>  
> > 
> > > As an RFC, would be good to have questions relevant to individual
> > patches if possible.  Makes it easier to know what you want feedback on.
> > 
> > > The REG32 being used for 64 bit registers seems awkward. I'd suggest
> > > we break them up into DWs and deal with the edge parts manually.
> > 
> > I'm not sure a REG64 definition would work due to lack of explicit alignment
> > guarantees.  Might be fine though.  
> 
> Agreed, although I think the current frequency with which I've had to do this,
> and the XXX comments are decent, it's definitely a bit ugly. I found at least
> two registers (I don't recall one, but the very important command register was
> the other that you noticed below) which have a field that straddles the 32b
> boundary. I think having to do an upper and lower field for that would kind of
> stink.
> 
> Given that the codebase has gone on long enough without REG64, I didn't want to
> poke that bear, although I had wired it up at some point.
> 
> So for now, I'd like to just leave these as they are.
> 
> > 
> > One buglet inline and a few other comments.
> > 
> > Jonathan  
> 
> Thanks. Anything not responded to is acknowledged and will hopefully make its
> way into v2.
> 

...

> 
> >   
> > > +
> > > +/* 8.2.8.4.5.1 Command Return Codes */
> > > +enum {
> > > +    RET_SUCCESS                 = 0x0,
> > > +    RET_BG_STARTED              = 0x1, /* Background Command Started */
> > > +    RET_EINVAL                  = 0x2, /* Invalid Input */
> > > +    RET_ENOTSUP                 = 0x3, /* Unsupported */
> > > +    RET_ENODEV                  = 0x4, /* Internal Error */  
> > 
> > Mapping that to NODEV seems less than obvious.  
> 
> I tried to be cute and map as many things to errno as possible. Suggestions?

Don't bother being cute? :)
More seriously, I'd carry them around named as in the spec until you actually
have to return a standard error.  Fine to have a conversion function
that does a best possible mapping though, so as to keep things consistent
across multiple locations.  Mind you, perhaps qemu has a standard idiom for this?

cxl_cmd_ret_to_errno()
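
Something along these lines, purely as a sketch. The CXL_RET_* names are
hypothetical spec-style names for a subset of the codes in 8.2.8.4.5.1, and the
errno choices are only one possible best-effort mapping, not a proposal for the
final one:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical spec-named return codes (subset of 8.2.8.4.5.1). */
enum cxl_cmd_ret {
    CXL_RET_SUCCESS        = 0x0,
    CXL_RET_BG_STARTED     = 0x1, /* Background Command Started */
    CXL_RET_INVALID_INPUT  = 0x2,
    CXL_RET_UNSUPPORTED    = 0x3,
    CXL_RET_INTERNAL_ERROR = 0x4,
    CXL_RET_RETRY_REQUIRED = 0x5,
    CXL_RET_BUSY           = 0x6,
};

/* Single place where spec codes get translated, so callers stay consistent. */
static int cxl_cmd_ret_to_errno(enum cxl_cmd_ret ret)
{
    switch (ret) {
    case CXL_RET_SUCCESS:
    case CXL_RET_BG_STARTED:
        return 0;
    case CXL_RET_INVALID_INPUT:
        return -EINVAL;
    case CXL_RET_UNSUPPORTED:
        return -ENOTSUP;
    case CXL_RET_RETRY_REQUIRED:
        return -EAGAIN;
    case CXL_RET_BUSY:
        return -EBUSY;
    case CXL_RET_INTERNAL_ERROR:
    default:
        /* No obvious errno for "Internal Error"; -EIO is a guess. */
        return -EIO;
    }
}
```

That way the enum reads straight off the spec table and the lossy mapping is
confined to one function.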


> 
> >   
> > > +    RET_ERESTART                = 0x5, /* Retry Required */
> > > +    RET_EBUSY                   = 0x6, /* Busy */
> > > +    RET_MEDIA_DISABLED          = 0x7, /* Media Disabled */
> > > +    RET_FW_EBUSY                = 0x8, /* FW Transfer in Progress */
> > > +    RET_FW_OOO                  = 0x9, /* FW Transfer Out of Order */
> > > +    RET_FW_AUTH                 = 0xa, /* FW Authentication Failed */
> > > +    RET_FW_EBADSLT              = 0xb, /* Invalid Slot */
> > > +    RET_FW_ROLLBACK             = 0xc, /* Activation Failed, FW Rolled Back */
> > > +    RET_FW_REBOOT               = 0xd, /* Activation Failed, Cold Reset Required */
> > > +    RET_ENOENT                  = 0xe, /* Invalid Handle */
> > > +    RET_EFAULT                  = 0xf, /* Invalid Physical Address */
> > > +    RET_POISON_E2BIG            = 0x10, /* Inject Poison Limit Reached */
> > > +    RET_EIO                     = 0x11, /* Permanent Media Failure */
> > > +    RET_ECANCELED               = 0x12, /* Aborted */
> > > +    RET_EACCESS                 = 0x13, /* Invalid Security State */
> > > +    RET_EPERM                   = 0x14, /* Incorrect Passphrase */
> > > +    RET_EPROTONOSUPPORT         = 0x15, /* Unsupported Mailbox */
> > > +    RET_EMSGSIZE                = 0x16, /* Invalid Payload Length */
> > > +    RET_MAX                     = 0x17
> > > +};
> > > +
> > > +/* XXX: actually a 64b register */
> > > +REG32(CXL_DEV_MAILBOX_STS, 0x10)
> > > +    FIELD(CXL_DEV_MAILBOX_STS, BG_OP, 0, 1)
> > > +    FIELD(CXL_DEV_MAILBOX_STS, ERRNO, 32, 16)
> > > +    FIELD(CXL_DEV_MAILBOX_STS, VENDOR_ERRNO, 48, 16)
> > > +
> > > +/* XXX: actually a 64b register */
> > > +REG32(CXL_DEV_BG_CMD_STS, 0x18)
> > > +    FIELD(CXL_DEV_BG_CMD_STS, BG, 0, 16)
> > > +    FIELD(CXL_DEV_BG_CMD_STS, DONE, 16, 7)
> > > +    FIELD(CXL_DEV_BG_CMD_STS, ERRNO, 32, 16)
> > > +    FIELD(CXL_DEV_BG_CMD_STS, VENDOR_ERRNO, 48, 16)
> > > +
> > > +REG32(CXL_DEV_CMD_PAYLOAD, 0x20)
> > > +
> > > +#endif  
> >   




* Re: [RFC PATCH 06/25] hw/cxl/device: Add device status (8.2.8.3)
  2020-11-16 21:18     ` Ben Widawsky
@ 2020-11-17 14:24       ` Jonathan Cameron
  0 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-17 14:24 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Vishal Verma, Dan Williams, qemu-devel

On Mon, 16 Nov 2020 13:18:41 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 20-11-16 13:16:08, Jonathan Cameron wrote:
> > On Tue, 10 Nov 2020 21:47:05 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> > > This implements the CXL device status registers from 8.2.8.3.1 in the
> > > CXL 2.0 specification. It is capability ID 0001h.
> > > 
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>  
> > 
> > It does some other stuff it shouldn't as well.  Please tidy that up before
> > v2.  A few other passing comments inline.
> > 
> > Thanks,
> > 
> > Jonathan
> > 
> >   
> > > ---
> > >  hw/cxl/cxl-device-utils.c   | 45 +++++++++++++++++++++++++++++++++-
> > >  include/hw/cxl/cxl_device.h | 49 ++++++++++++-------------------------
> > >  2 files changed, 60 insertions(+), 34 deletions(-)
> > > 
> > > diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> > > index a391bb15c6..78144e103c 100644
> > > --- a/hw/cxl/cxl-device-utils.c
> > > +++ b/hw/cxl/cxl-device-utils.c
> > > @@ -33,6 +33,42 @@ static uint64_t caps_reg_read(void *opaque, hwaddr offset, unsigned size)
> > >      return ldn_le_p(cxl_dstate->caps_reg_state + offset, size);
> > >  }
> > >  
> > > +static uint64_t dev_reg_read(void *opaque, hwaddr offset, unsigned size)
> > > +{
> > > +    uint64_t retval = 0;  
> > 
> > Doesn't seem to be used.
> >   
> 
> It's required for ldn_le_p, or did you mean something else?

Nope just failed to notice that use. oops

> 
> > > +  
> > 
> > Perhaps break the alignment check out to a utility function, given this sanity
> > check is the same as in the previous patch.
> >   
> > > +    switch (size) {
> > > +    case 4:
> > > +        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
> > > +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> > > +            return 0;
> > > +        }
> > > +        break;
> > > +    case 8:
> > > +        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
> > > +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> > > +            return 0;
> > > +        }
> > > +        break;
> > > +    }
> > > +
> > > +    return ldn_le_p(&retval, size);
> > > +}
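
The alignment check that was suggested for factoring out could be
sketched roughly as below. The helper name is hypothetical, and
uint64_t stands in for QEMU's hwaddr so that the snippet compiles on
its own. Like the quoted switch, it only checks 4- and 8-byte accesses
and lets any other size through.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when a register access of the given
 * size at the given offset is naturally aligned.
 */
static bool cxl_reg_access_aligned(uint64_t offset, unsigned size)
{
    switch (size) {
    case 4:
    case 8:
        /* size is a power of two here, so size - 1 is the low-bit mask */
        return (offset & (size - 1)) == 0;
    default:
        /* The quoted code does not check other sizes; keep that behaviour */
        return true;
    }
}
```

Each read handler would then do a single "if (!aligned) { log; return 0; }"
instead of repeating the switch.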



* Re: [RFC PATCH 08/25] hw/cxl/device: Add memory devices (8.2.8.5)
  2020-11-16 21:45     ` Ben Widawsky
@ 2020-11-17 14:31       ` Jonathan Cameron
  0 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-17 14:31 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Vishal Verma, Dan Williams, qemu-devel

On Mon, 16 Nov 2020 13:45:05 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 20-11-16 16:37:22, Jonathan Cameron wrote:
> > On Tue, 10 Nov 2020 21:47:07 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> > > Memory devices implement extra capabilities on top of CXL devices. This
> > > adds support for that.
> > > 
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > ---
> > >  hw/cxl/cxl-device-utils.c   | 48 ++++++++++++++++++++++++++++++++++++-
> > >  hw/cxl/cxl-mailbox-utils.c  | 48 ++++++++++++++++++++++++++++++++++++-
> > >  include/hw/cxl/cxl_device.h | 15 ++++++++++++
> > >  3 files changed, 109 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/hw/cxl/cxl-device-utils.c b/hw/cxl/cxl-device-utils.c
> > > index aec8b0d421..6544a68567 100644
> > > --- a/hw/cxl/cxl-device-utils.c
> > > +++ b/hw/cxl/cxl-device-utils.c
> > > @@ -158,6 +158,45 @@ static void mailbox_reg_write(void *opaque, hwaddr offset, uint64_t value,
> > >          process_mailbox(cxl_dstate);
> > >  }
> > >  
> > > +static uint64_t mdev_reg_read(void *opaque, hwaddr offset, unsigned size)
> > > +{
> > > +    uint64_t retval = 0;
> > > +
> > > +    retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MEDIA_STATUS, 1);
> > > +    retval = FIELD_DP64(retval, CXL_MEM_DEV_STS, MBOX_READY, 1);
> > > +
> > > +    switch (size) {
> > > +    case 4:
> > > +        if (unlikely(offset & (sizeof(uint32_t) - 1))) {
> > > +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> > > +            return 0;
> > > +        }
> > > +        break;
> > > +    case 8:
> > > +        if (unlikely(offset & (sizeof(uint64_t) - 1))) {
> > > +            qemu_log_mask(LOG_UNIMP, "Unaligned register read\n");
> > > +            return 0;
> > > +        }
> > > +        break;
> > > +    }
> > > +
> > > +    return ldn_le_p(&retval, size);
> > > +}
> > > +
> > > +static const MemoryRegionOps mdev_ops = {
> > > +    .read = mdev_reg_read,
> > > +    .write = NULL,
> > > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > > +    .valid = {
> > > +        .min_access_size = 4,
> > > +        .max_access_size = 8,
> > > +    },
> > > +    .impl = {
> > > +        .min_access_size = 4,
> > > +        .max_access_size = 8,
> > > +    },
> > > +};
> > > +
> > >  static const MemoryRegionOps mailbox_ops = {
> > >      .read = mailbox_reg_read,
> > >      .write = mailbox_reg_write,
> > > @@ -213,6 +252,9 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
> > >                            "device-status", CXL_DEVICE_REGISTERS_LENGTH);
> > >      memory_region_init_io(&cxl_dstate->mailbox, obj, &mailbox_ops, cxl_dstate,
> > >                            "mailbox", CXL_MAILBOX_REGISTERS_LENGTH);
> > > +    memory_region_init_io(&cxl_dstate->memory_device, obj, &mdev_ops,
> > > +                          cxl_dstate, "memory device caps",
> > > +                          CXL_MEMORY_DEVICE_REGISTERS_LENGTH);
> > >  
> > >      memory_region_add_subregion(&cxl_dstate->device_registers, 0,
> > >                                  &cxl_dstate->caps);
> > > @@ -221,6 +263,9 @@ void cxl_device_register_block_init(Object *obj, CXLDeviceState *cxl_dstate)
> > >                                  &cxl_dstate->device);
> > >      memory_region_add_subregion(&cxl_dstate->device_registers,
> > >                                  CXL_MAILBOX_REGISTERS_OFFSET, &cxl_dstate->mailbox);
> > > +    memory_region_add_subregion(&cxl_dstate->device_registers,
> > > +                                CXL_MEMORY_DEVICE_REGISTERS_OFFSET,
> > > +                                &cxl_dstate->memory_device);
> > >  }
> > >  
> > >  static void mailbox_init_common(uint32_t *mbox_regs)
> > > @@ -233,7 +278,7 @@ static void mailbox_init_common(uint32_t *mbox_regs)
> > >  void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> > >  {
> > >      uint32_t *cap_hdrs = cxl_dstate->caps_reg_state32;
> > > -    const int cap_count = 1;  
> > 
> > Guessing this should previously have been 2?
> >   
> > > +    const int cap_count = 3;
> > >  
> > >      /* CXL Device Capabilities Array Register */
> > >      ARRAY_FIELD_DP32(cap_hdrs, CXL_DEV_CAP_ARRAY, CAP_ID, 0);
> > > @@ -242,6 +287,7 @@ void cxl_device_register_init_common(CXLDeviceState *cxl_dstate)
> > >  
> > >      cxl_device_cap_init(cxl_dstate, DEVICE, 1);
> > >      cxl_device_cap_init(cxl_dstate, MAILBOX, 2);
> > > +    cxl_device_cap_init(cxl_dstate, MEMORY_DEVICE, 0x4000);
> > >  
> > >      mailbox_init_common(cxl_dstate->mbox_reg_state32);
> > >  }
> > > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> > > index 2d1b0ef9e4..5d2579800e 100644
> > > --- a/hw/cxl/cxl-mailbox-utils.c
> > > +++ b/hw/cxl/cxl-mailbox-utils.c
> > > @@ -12,6 +12,12 @@
> > >  #include "hw/pci/pci.h"
> > >  #include "hw/cxl/cxl.h"
> > >  
> > > +enum cxl_opcode {
> > > +    CXL_EVENTS      = 0x1,
> > > +    CXL_IDENTIFY    = 0x40,
> > > +        #define CXL_IDENTIFY_MEMORY_DEVICE 0x0
> > > +};
> > > +
> > >  /* 8.2.8.4.5.1 Command Return Codes */
> > >  enum {
> > >      RET_SUCCESS                 = 0x0,
> > > @@ -40,6 +46,43 @@ enum {
> > >      RET_MAX                     = 0x17
> > >  };
> > >  
> > > +/* 8.2.9.5.1.1 */
> > > +static int cmd_set_identify(CXLDeviceState *cxl_dstate, uint8_t cmd,
> > > +                            uint32_t *ret_size)  
> > 
> > I'm a bit confused on naming here, perhaps rsp_set_identity makes
> > it clearer which direction this is going in?  I think this is
> > filling in the reply for a command from software running on the
> > system. Naming seems to me to suggest we are setting the identity
> > of the hardware.  
> >   
> 
> It sounds like maybe you read "identify" as "identity"?

yup.  I guess my mind didn't want to parse it.

> 
> You're correct, this represents the firmware running on the memory device that
> is receiving the identify command from the host. I've been thinking about
> renaming these based on what the underlying device is. For instance, this might
> become:
> 
> mem_dev_identify()

Maybe a little more to make the point that it is filling in values that will get
sent back from here.  Maybe something like:

mem_dev_fillresp_identify()?

Meh. Bike-shedding time - can always fall back to a comment to clarify what
it is doing if we can't find a magic non-confusing name.

Jonathan

> 
> > > +{
> > > +    struct identify {
> > > +        char fw_revision[0x10];
> > > +        uint64_t total_capacity;
> > > +        uint64_t volatile_capacity;
> > > +        uint64_t persistent_capacity;
> > > +        uint64_t partition_align;
> > > +        uint16_t info_event_log_size;
> > > +        uint16_t warning_event_log_size;
> > > +        uint16_t failure_event_log_size;
> > > +        uint16_t fatal_event_log_size;
> > > +        uint32_t lsa_size;
> > > +        uint8_t poison_list_max_mer[3];
> > > +        uint16_t inject_poison_limit;
> > > +        uint8_t poison_caps;
> > > +        uint8_t qos_telemetry_caps;
> > > +    } __attribute__((packed)) *id;
> > > +    _Static_assert(sizeof(struct identify) == 0x43, "Bad identify size");
> > > +
> > > +    if (memory_region_size(cxl_dstate->pmem) < (256 << 20)) {
> > > +        return RET_ENODEV;
> > > +    }
> > > +
> > > +    /* PMEM only */
> > > +    id = (struct identify *)((void *)cxl_dstate->mbox_reg_state +
> > > +                             A_CXL_DEV_CMD_PAYLOAD);
> > > +    snprintf(id->fw_revision, 0x10, "BWFW VERSION %02d", 0);
> > > +    id->total_capacity = memory_region_size(cxl_dstate->pmem);
> > > +    id->persistent_capacity = memory_region_size(cxl_dstate->pmem);
> > > +
> > > +    *ret_size = 0x43;
> > > +    return RET_SUCCESS;
> > > +}
> > > +
> > >  void process_mailbox(CXLDeviceState *cxl_dstate)
> > >  {
> > >      uint16_t ret = RET_SUCCESS;
> > > @@ -63,8 +106,11 @@ void process_mailbox(CXLDeviceState *cxl_dstate)
> > >  
> > >      uint8_t cmd_set = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND_SET);
> > >      uint8_t cmd = FIELD_EX64(command_reg, CXL_DEV_MAILBOX_CMD, COMMAND);
> > > -    (void)cmd;  
> > 
> > Clean this stuff up before v2.
> >   
> > >      switch (cmd_set) {
> > > +    case CXL_IDENTIFY:
> > > +        ret = cmd_set_identify(cxl_dstate, cmd, &ret_len);
> > > +        /* Fill in payload here */
> > > +        break;
> > >      default:
> > >          ret = RET_ENOTSUP;
> > >      }
> > > diff --git a/include/hw/cxl/cxl_device.h b/include/hw/cxl/cxl_device.h
> > > index df00998def..2cb2a9af3c 100644
> > > --- a/include/hw/cxl/cxl_device.h
> > > +++ b/include/hw/cxl/cxl_device.h
> > > @@ -69,6 +69,10 @@
> > >  #define CXL_MAILBOX_REGISTERS_LENGTH \
> > >      (CXL_MAILBOX_REGISTERS_SIZE + CXL_MAILBOX_MAX_PAYLOAD_SIZE)
> > >  
> > > +#define CXL_MEMORY_DEVICE_REGISTERS_OFFSET \
> > > +    (CXL_MAILBOX_REGISTERS_OFFSET + CXL_MAILBOX_REGISTERS_LENGTH)
> > > +#define CXL_MEMORY_DEVICE_REGISTERS_LENGTH 0x8
> > > +
> > >  typedef struct cxl_device_state {
> > >      /* Boss container and caps registers */
> > >      MemoryRegion device_registers;
> > > @@ -76,6 +80,7 @@ typedef struct cxl_device_state {
> > >      MemoryRegion caps;
> > >      MemoryRegion device;
> > >      MemoryRegion mailbox;
> > > +    MemoryRegion memory_device;
> > >  
> > >      MemoryRegion *pmem;
> > >      MemoryRegion *vmem;
> > > @@ -131,6 +136,8 @@ REG32(CXL_DEV_CAP_ARRAY2, 4) /* We're going to pretend it's 64b */
> > >  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET)
> > >  CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MAILBOX, CXL_DEVICE_CAP_HDR1_OFFSET + \
> > >                                                 CXL_DEVICE_CAP_REG_SIZE)
> > > +CXL_DEVICE_CAPABILITY_HEADER_REGISTER(MEMORY_DEVICE, CXL_DEVICE_CAP_HDR1_OFFSET + \
> > > +                                                     CXL_DEVICE_CAP_REG_SIZE * 2)
> > >  
> > >  void process_mailbox(CXLDeviceState *cxl_dstate);
> > >  
> > > @@ -181,4 +188,12 @@ REG32(CXL_DEV_BG_CMD_STS, 0x18)
> > >  
> > >  REG32(CXL_DEV_CMD_PAYLOAD, 0x20)
> > >  
> > > +/* XXX: actually a 64b register */
> > > +REG32(CXL_MEM_DEV_STS, 0)
> > > +    FIELD(CXL_MEM_DEV_STS, FATAL, 0, 1)
> > > +    FIELD(CXL_MEM_DEV_STS, FW_HALT, 1, 1)
> > > +    FIELD(CXL_MEM_DEV_STS, MEDIA_STATUS, 2, 2)
> > > +    FIELD(CXL_MEM_DEV_STS, MBOX_READY, 4, 1)
> > > +    FIELD(CXL_MEM_DEV_STS, RESET_NEEDED, 5, 3)
> > > +
> > >  #endif  
> >   




* Re: [RFC PATCH 11/25] hw/pxb: Allow creation of a CXL PXB (host bridge)
  2020-11-16 22:01     ` Ben Widawsky
@ 2020-11-17 14:33       ` Jonathan Cameron
  0 siblings, 0 replies; 64+ messages in thread
From: Jonathan Cameron @ 2020-11-17 14:33 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Vishal Verma, Dan Williams, qemu-devel, Michael S. Tsirkin

On Mon, 16 Nov 2020 14:01:40 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 20-11-16 16:44:09, Jonathan Cameron wrote:
> > On Tue, 10 Nov 2020 21:47:10 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> > > This works like adding a typical pxb device, except the name is
> > > 'pxb-cxl' instead of 'pxb-pcie'. An example command line would be as
> > > follows:
> > >   -device pxb-cxl,id=cxl.0,bus="pcie.0",bus_nr=1
> > > 
> > > A CXL PXB is backward compatible with PCIe. What this means in practice
> > > is that an operating system that is unaware of CXL should still be able
> > > to enumerate this topology as if it were PCIe.
> > > 
> > > One can create multiple CXL PXB host bridges, but a host bridge can only
> > > be connected to the main root bus. Host bridges cannot appear elsewhere
> > > in the topology.
> > > 
> > > Note that as of this patch, the ACPI tables needed for the host bridge
> > > (specifically, an ACPI object in _SB named ACPI0016 and the CEDT) aren't
> > > created. So while this patch internally creates it, it cannot be
> > > properly used by an operating system or other system software.
> > > 
> > > Upcoming patches will allow creating multiple host bridges.
> > > 
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>  
> > Hi Ben,
> > 
> > Few minor things inline.
> > 
> > Jonathan
> >   
> > > ---
> > >  hw/pci-bridge/pci_expander_bridge.c | 67 ++++++++++++++++++++++++++++-
> > >  hw/pci/pci.c                        |  7 +++
> > >  include/hw/pci/pci.h                |  6 +++
> > >  3 files changed, 78 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c
> > > index 88c45dc3b5..3a8d815231 100644
> > > --- a/hw/pci-bridge/pci_expander_bridge.c
> > > +++ b/hw/pci-bridge/pci_expander_bridge.c
> > > @@ -56,6 +56,10 @@ DECLARE_INSTANCE_CHECKER(PXBDev, PXB_DEV,
> > >  DECLARE_INSTANCE_CHECKER(PXBDev, PXB_PCIE_DEV,
> > >                           TYPE_PXB_PCIE_DEVICE)
> > >  
> > > +#define TYPE_PXB_CXL_DEVICE "pxb-cxl"
> > > +DECLARE_INSTANCE_CHECKER(PXBDev, PXB_CXL_DEV,
> > > +                         TYPE_PXB_CXL_DEVICE)
> > > +
> > >  struct PXBDev {
> > >      /*< private >*/
> > >      PCIDevice parent_obj;
> > > @@ -67,6 +71,11 @@ struct PXBDev {
> > >  
> > >  static PXBDev *convert_to_pxb(PCIDevice *dev)
> > >  {
> > > +    /* A CXL PXB's parent bus is PCIe, so the normal check won't work */
> > > +    if (object_dynamic_cast(OBJECT(dev), TYPE_PXB_CXL_DEVICE)) {
> > > +        return PXB_CXL_DEV(dev);
> > > +    }
> > > +
> > >      return pci_bus_is_express(pci_get_bus(dev))
> > >          ? PXB_PCIE_DEV(dev) : PXB_DEV(dev);
> > >  }
> > > @@ -111,11 +120,20 @@ static const TypeInfo pxb_pcie_bus_info = {
> > >      .class_init    = pxb_bus_class_init,
> > >  };
> > >  
> > > +static const TypeInfo pxb_cxl_bus_info = {
> > > +    .name          = TYPE_PXB_CXL_BUS,
> > > +    .parent        = TYPE_CXL_BUS,
> > > +    .instance_size = sizeof(PXBBus),
> > > +    .class_init    = pxb_bus_class_init,
> > > +};
> > > +
> > >  static const char *pxb_host_root_bus_path(PCIHostState *host_bridge,
> > >                                            PCIBus *rootbus)
> > >  {
> > > -    PXBBus *bus = pci_bus_is_express(rootbus) ?
> > > -                  PXB_PCIE_BUS(rootbus) : PXB_BUS(rootbus);
> > > +    PXBBus *bus = pci_bus_is_cxl(rootbus) ?
> > > +                      PXB_CXL_BUS(rootbus) :
> > > +                      pci_bus_is_express(rootbus) ? PXB_PCIE_BUS(rootbus) :
> > > +                                                    PXB_BUS(rootbus);  
> > 
> > There comes a point where if / else is much more readable.
> >   
> > >  
> > >      snprintf(bus->bus_path, 8, "0000:%02x", pxb_bus_num(rootbus));
> > >      return bus->bus_path;
> > > @@ -380,13 +398,58 @@ static const TypeInfo pxb_pcie_dev_info = {
> > >      },
> > >  };
> > >  
> > > +static void pxb_cxl_dev_realize(PCIDevice *dev, Error **errp)
> > > +{
> > > +    /* A CXL PXB's parent bus is still PCIe */
> > > +    if (!pci_bus_is_express(pci_get_bus(dev))) {
> > > +        error_setg(errp, "pxb-cxl devices cannot reside on a PCI bus");
> > > +        return;
> > > +    }
> > > +
> > > +    pxb_dev_realize_common(dev, CXL, errp);
> > > +}
> > > +
> > > +static void pxb_cxl_dev_class_init(ObjectClass *klass, void *data)
> > > +{
> > > +    DeviceClass *dc   = DEVICE_CLASS(klass);
> > > +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> > > +
> > > +    k->realize             = pxb_cxl_dev_realize;
> > > +    k->exit                = pxb_dev_exitfn;
> > > +    k->vendor_id           = PCI_VENDOR_ID_INTEL;
> > > +    k->device_id           = 0xabcd;  
> > 
> > Just to check, is that an officially assigned device_id that we will never
> > have a clash with?  Nice ID to get if it is :)  
> 
> No, not the real ID.
> 
> My understanding is that the host bridge won't exist at all in the PCI
> hierarchy. So basically all of these can be undeclared. For testing/development
> purposes I wanted to see this info.
> 
> Awesomely, it appears if I remove vendor, device, class, and subsystem
> everything still works and I do not see a bridge device in lspci. So v2 will
> have this all gone.

Pity - it is always fun to track down the holder of the magic list of
device IDs and try to explain to them why you want one for a device
that doesn't "exist" and isn't visible anyway.

I guess no ID is less confusing :) 

Jonathan
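
On the nested ternary in pxb_host_root_bus_path() quoted in this
message, an if / else chain might read as in the sketch below. To keep
it compilable on its own it takes the results of pci_bus_is_cxl() and
pci_bus_is_express() as plain booleans and returns a hypothetical enum
instead of doing the PXB_*_BUS() casts; those stand-ins are
assumptions, not QEMU API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for "which PXB_*_BUS() cast would be used" */
typedef enum {
    PXB_KIND_PCI,
    PXB_KIND_PCIE,
    PXB_KIND_CXL,
} PXBKind;

static PXBKind pxb_kind_for_bus(bool is_cxl, bool is_express)
{
    /*
     * CXL has to be tested first: in the patch TYPE_CXL_BUS derives from
     * TYPE_PCIE_BUS, so a CXL bus would also pass the express check.
     */
    if (is_cxl) {
        return PXB_KIND_CXL;    /* PXB_CXL_BUS(rootbus) in the patch */
    } else if (is_express) {
        return PXB_KIND_PCIE;   /* PXB_PCIE_BUS(rootbus) */
    } else {
        return PXB_KIND_PCI;    /* PXB_BUS(rootbus) */
    }
}
```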

> 
> Thanks.
> 
> > 
> >   
> > > +    k->class_id            = PCI_CLASS_BRIDGE_HOST;
> > > +    k->subsystem_vendor_id = PCI_VENDOR_ID_INTEL;
> > > +
> > > +    dc->desc = "CXL Host Bridge";
> > > +    device_class_set_props(dc, pxb_dev_properties);
> > > +    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
> > > +
> > > +    /* Host bridges aren't hotpluggable. FIXME: spec reference */
> > > +    dc->hotpluggable = false;
> > > +}
> > > +
> > > +static const TypeInfo pxb_cxl_dev_info = {
> > > +    .name          = TYPE_PXB_CXL_DEVICE,
> > > +    .parent        = TYPE_PCI_DEVICE,
> > > +    .instance_size = sizeof(PXBDev),
> > > +    .class_init    = pxb_cxl_dev_class_init,
> > > +    .interfaces =
> > > +        (InterfaceInfo[]){
> > > +            { INTERFACE_CONVENTIONAL_PCI_DEVICE },
> > > +            {},
> > > +        },
> > > +};
> > > +
> > >  static void pxb_register_types(void)
> > >  {
> > >      type_register_static(&pxb_bus_info);
> > >      type_register_static(&pxb_pcie_bus_info);
> > > +    type_register_static(&pxb_cxl_bus_info);
> > >      type_register_static(&pxb_host_info);
> > >      type_register_static(&pxb_dev_info);
> > >      type_register_static(&pxb_pcie_dev_info);
> > > +    type_register_static(&pxb_cxl_dev_info);
> > >  }
> > >  
> > >  type_init(pxb_register_types)
> > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > index db88788c4b..67eed889a4 100644
> > > --- a/hw/pci/pci.c
> > > +++ b/hw/pci/pci.c
> > > @@ -220,6 +220,12 @@ static const TypeInfo pcie_bus_info = {
> > >      .class_init = pcie_bus_class_init,
> > >  };
> > >  
> > > +static const TypeInfo cxl_bus_info = {
> > > +    .name       = TYPE_CXL_BUS,
> > > +    .parent     = TYPE_PCIE_BUS,
> > > +    .class_init = pcie_bus_class_init,
> > > +};
> > > +
> > >  static PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num);
> > >  static void pci_update_mappings(PCIDevice *d);
> > >  static void pci_irq_handler(void *opaque, int irq_num, int level);
> > > @@ -2847,6 +2853,7 @@ static void pci_register_types(void)
> > >  {
> > >      type_register_static(&pci_bus_info);
> > >      type_register_static(&pcie_bus_info);
> > > +    type_register_static(&cxl_bus_info);
> > >      type_register_static(&conventional_pci_interface_info);
> > >      type_register_static(&cxl_interface_info);
> > >      type_register_static(&pcie_interface_info);
> > > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > > index 4e6fd59fdd..52267ff69e 100644
> > > --- a/include/hw/pci/pci.h
> > > +++ b/include/hw/pci/pci.h
> > > @@ -405,6 +405,7 @@ typedef PCIINTxRoute (*pci_route_irq_fn)(void *opaque, int pin);
> > >  #define TYPE_PCI_BUS "PCI"
> > >  OBJECT_DECLARE_TYPE(PCIBus, PCIBusClass, PCI_BUS)
> > >  #define TYPE_PCIE_BUS "PCIE"
> > > +#define TYPE_CXL_BUS "CXL"
> > >  
> > >  bool pci_bus_is_express(PCIBus *bus);
> > >  
> > > @@ -753,6 +754,11 @@ static inline void pci_irq_pulse(PCIDevice *pci_dev)
> > >      pci_irq_deassert(pci_dev);
> > >  }
> > >  
> > > +static inline int pci_is_cxl(const PCIDevice *d)
> > > +{
> > > +    return d->cap_present & QEMU_PCIE_CAP_CXL;
> > > +}
> > > +
> > >  static inline int pci_is_express(const PCIDevice *d)
> > >  {
> > >      return d->cap_present & QEMU_PCI_CAP_EXPRESS;  
> >   




* Re: [RFC PATCH 16/25] hw/pxb/cxl: Add "windows" for host bridges
  2020-11-13  0:49   ` Ben Widawsky
@ 2020-11-23 19:12     ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 64+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-11-23 19:12 UTC (permalink / raw)
  To: Ben Widawsky, qemu-devel; +Cc: Vishal Verma, Dan Williams, Michael S. Tsirkin

On 11/13/20 1:49 AM, Ben Widawsky wrote:
> On 20-11-10 21:47:15, Ben Widawsky wrote:
>> In a bare metal CXL capable system, system firmware will program
>> physical address ranges on the host. This is done by programming
>> internal registers that aren't typically known to OS. These address
>> ranges might be contiguous or interleaved across host bridges.
>>
>> For a QEMU guest a new construct is introduced allowing passing a memory
>> backend to the host bridge for this same purpose. Each memory backend
>> needs to be passed to the host bridge as well as any device that will be
>> emulating that memory (not implemented here).
>>
>> I'm hopeful the interleaving work in the link can be re-purposed here
>> (see Link).
>>
>> An example to create a host bridge with a 512M window at 0x4c0000000
>>  -object memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M
>>  -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52,uid=0,len-memory-base=1,memory-base\[0\]=0x4c0000000,memory\[0\]=cxl-mem1
>>
>> Link: https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg03680.html
>> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> 
> Hi Phil, wanted to call you out specifically on this one.
> 
> The newly released CXL 2.0 specification (which from a topology perspective can
> be thought of as very PCIe-like) allows for interleaving of memory access.
> 
> Below is an example of two host bridges, each with two root ports, and 5 devices
> (two of which are behind a switch).
> 
> RP: Root Port
> USP: Upstream Port
> DSP: Downstream Port
> Type 3 device: Memory Device with persistent or volatile memory.
> 
> +-------------------------+      +-------------------------+
> |                         |      |                         |
> |   CXL 2.0 Host Bridge   |      |   CXL 2.0 Host Bridge   |
> |                         |      |                         |
> |  +------+     +------+  |      |  +------+     +------+  |
> |  |  RP  |     |  RP  |  |      |  |  RP  |     |  RP  |  |
> +--+------+-----+------+--+      +--+------+-----+------+--+
>       |            |                   |               \--
>       |            |                   |        +-------+-\--+------+
>    +------+    +-------+            +-------+   |       |USP |      |
>    |Type 3|    |Type 3 |            |Type 3 |   |       +----+      |
>    |Device|    |Device |            |Device |   |                   |
>    +------+    +-------+            +-------+   | +----+     +----+ |
>                                                 | |DSP |     |DSP | |
>                                                 +-+----+-----+----+-+
>                                                     |          |
>                                                 +------+    +-------+
>                                                 |Type 3|    |Type 3 |
>                                                 |Device|    |Device |
>                                                 +------+    +-------+
> 
> Considering this picture... interleaving of memory access can happen in all 3
> layers in the topology.
> 
> - Memory access can be interleaved across host bridges (this is accomplished
>   based on the physical address chosen for the devices, those address ranges are
>   platform specific and not part of the 2.0 spec, for now).
> 
> - Memory access can be interleaved across root ports in a host bridge.
> 
> - Finally, memory access can be interleaved across downstream ports.
> 
> I'd like to start the discussion about how this might overlap with the patch
> series you've last been working on to interleave memory. Do you have any
> thoughts or ideas on how I should go about doing this?

Main case:

 +-------------------------+
 |                         |
 |   CXL 2.0 Host Bridge   |
 |                         |
 |  +------+     +------+  |
 |  |  RP  |     |  RP  |  |
 +--+------+-----+------+--+
       |            |
       |            |
    +------+    +-------+
    |Type 3|    |Type 3 |
    |Device|    |Device |
    +------+    +-------+

// cxl device state
s = qdev_create(TYPE_CXL20_HB_DEV)

cxl_memsize = 2 * memsize(Type3Dev);

// container for cxl
memory_region_init(&s->container, OBJECT(s),
                   "container", cxl_memsize);

// create 2 slots, interleaved each 2k
s->interleaver = qdev_create(INTERLEAVER_DEV,
                             slotsize=2k,
                             max_slots=2)
qdev_prop_set_uint64(s->interleaver, "size",
                     cxl_memsize);

// connect each device to the interleaver
object_property_set_link(OBJECT(interleaver),
                         "mr0", OBJECT(RP0))
object_property_set_link(OBJECT(interleaver),
                         "mr1", OBJECT(RP1))
sysbus_realize_and_unref(SYS_BUS_DEVICE(interleaver))

// we can probably avoid this container
memory_region_add_subregion(&s->container, 0,
                            sysbus_mmio_get_region(interleaver, 0));
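
To make the "2 slots, interleaved each 2k" idea concrete, here is a
small sketch of the address decode such an interleaver would perform.
It assumes plain round-robin interleaving with a fixed slot size; the
struct and function names are hypothetical, and the actual interleaver
device may decode differently.

```c
#include <stdint.h>

typedef struct {
    unsigned slot;    /* which backing region is hit (mr0, mr1, ...) */
    uint64_t offset;  /* offset inside that backing region */
} InterleaveTarget;

/*
 * Round-robin decode: consecutive slot_size chunks of the interleaved
 * window go to slot 0, 1, ..., num_slots - 1, then wrap around.
 */
static InterleaveTarget interleave_decode(uint64_t addr, uint64_t slot_size,
                                          unsigned num_slots)
{
    uint64_t chunk = addr / slot_size;
    InterleaveTarget t = {
        .slot   = (unsigned)(chunk % num_slots),
        .offset = (chunk / num_slots) * slot_size + addr % slot_size,
    };
    return t;
}
```

With slot_size = 2048 and num_slots = 2, offset 0 lands in mr0 at 0,
offset 2048 lands in mr1 at 0, and offset 4096 is back in mr0 at 2048.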


For the 2nd case, USP can be created the same way as in case 1
(as a 2nd interleaver) then the main CXL is created with the
minor difference of mr1 now being the USP:

object_property_set_link(OBJECT(interleaver),
                         "mr1", OBJECT(USP))
sysbus_realize_and_unref(SYS_BUS_DEVICE(interleaver))

Regards,

Phil.



* Re: [RFC PATCH 03/25] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
  2020-11-17 12:29       ` Jonathan Cameron
@ 2020-11-24 23:09         ` Ben Widawsky
  0 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-24 23:09 UTC (permalink / raw)
  To: Jonathan Cameron; +Cc: Vishal Verma, Dan Williams, qemu-devel

On 20-11-17 12:29:40, Jonathan Cameron wrote:

[snip]

> > >   
> > > > +
> > > > +/* 8.2.5.10 - CXL Security Capability Structure */
> > > > +#define CXL_SEC_REGISTERS_OFFSET (CXL_RAS_REGISTERS_OFFSET + CXL_RAS_REGISTERS_SIZE)
> > > > +#define CXL_SEC_REGISTERS_SIZE   0 /* We don't implement 1.1 downstream ports */
> > > > +
> > > > +/* 8.2.5.11 - CXL Link Capability Structure */
> > > > +#define CXL_LINK_REGISTERS_OFFSET (CXL_SEC_REGISTERS_OFFSET + CXL_SEC_REGISTERS_SIZE)
> > > > +#define CXL_LINK_REGISTERS_SIZE   0x38  
> > > 
> > > Bit odd to introduce this but not define anything... Can't we bring these
> > > in when we need them later?  
> > 
> > Repeating my comment from 00/25...
> > 
> > For this specific patch series I liked providing #defines in bulk so that it's
> > easy enough to just bring up the spec and review them. Not sure if there is a
> > preference from others in the community on this.
> 
> Personally I'd prefer to see the lot if you are going to do that, on the basis
> that it should only need reviewing against the spec once.
> Not sure others will agree though as "the lot" is an awful lot.
> 

I took a shot at stripping some of this out, but it turns out I already use all
of it for the cxl-component-utils. While some of them aren't directly used, the
space reservations for the various caps make sense here IMO.

So for v2, I'm going to leave this as is, and if there is a desire to do things
differently, I think I'd need a suggestion of how to do so.

[snip]


> 
> Thanks,
> 
> Jonathan
> 



* Re: [RFC PATCH 18/25] hw/cxl/device: Add a memory device (8.2.8.5)
  2020-11-13  7:47     ` Markus Armbruster
@ 2020-11-25 16:53       ` Ben Widawsky
  2020-11-26  6:36         ` Markus Armbruster
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-25 16:53 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Eduardo Habkost, Michael S. Tsirkin, Vishal Verma, qemu-devel,
	Dr. David Alan Gilbert, Paolo Bonzini, Igor Mammedov,
	Dan Williams, Richard Henderson

On 20-11-13 08:47:59, Markus Armbruster wrote:
> Eric Blake <eblake@redhat.com> writes:
> 
> > On 11/10/20 11:47 PM, Ben Widawsky wrote:
> >> A CXL memory device (AKA Type 3) is a CXL component that contains some
> >> combination of volatile and persistent memory. It also implements the
> >> previously defined mailbox interface as well as the memory device
> >> firmware interface.
> >> 
> >> The following example will create a 256M device in a 512M window:
> >> 
> >> -object "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M"
> >> -device "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"
> >> 
> >> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> >> ---
> >
> >> +++ b/qapi/machine.json
> >> @@ -1394,6 +1394,7 @@
> >>  { 'union': 'MemoryDeviceInfo',
> >>    'data': { 'dimm': 'PCDIMMDeviceInfo',
> >>              'nvdimm': 'PCDIMMDeviceInfo',
> >> +            'cxl': 'PCDIMMDeviceInfo',
> >>              'virtio-pmem': 'VirtioPMEMDeviceInfo',
> >>              'virtio-mem': 'VirtioMEMDeviceInfo'
> >>            }
> >
> > Missing documentation of the new data type, and the fact that it will be
> > introduced in 6.0.
> 
> Old wish list item: improve the QAPI schema frontend to flag this.
> 

"Introduced in 6.0" - quite the optimist. Kidding aside, this is the area where
I could use some feedback. CXL Type 3 memory devices can contain both volatile
and persistent memory at the same time. As such, I think I'll need a new type to
represent that, but I'd love to know how best to accomplish that.

> >                     Also, Markus has been trying to get rid of so-called
> > "simple unions" in favor of "flat unions" - every time we modify an
> > existing simple union, it is worth asking if it is time to first flatten
> > this.
> 
> 0. Simple unions can be transformed into flat unions.  The
> transformation can either preserve the nested wire format, or flatten
> it.  See docs/devel/qapi-code-gen.txt "A simple union can always be
> re-written as a flat union ..."
> 
> 1. No new simple unions.
> 
> 2. Existing simple unions that can be flattened without breaking
> backward compatibility have long been flattened.
> 
> 3. The remaining simple unions are part of QMP, where we need to
> preserve the wire format.  We could turn them into flat union preserving
> the wire format.  Only worthwhile if we kill simple unions and simplify
> scripts/qapi/.  Opportunity to make the flat union syntax less
> cumbersome.  Not done due to lack of time.
> 
> 4. Kevin and I have been experimenting with ways to provide both flat
> and nested wire format.  This would pave the way for orderly deprecation
> of the nested wire format.  May not be practical for QMP output.
> 

So is there anything for me to do here?



* Re: [RFC PATCH 00/25] Introduce CXL 2.0 Emulation
  2020-11-17 14:09     ` Jonathan Cameron
@ 2020-11-25 18:29       ` Ben Widawsky
  0 siblings, 0 replies; 64+ messages in thread
From: Ben Widawsky @ 2020-11-25 18:29 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Vishal Verma, Dan Williams, qemu-devel, Philippe Mathieu-Daudé

On 20-11-17 14:09:14, Jonathan Cameron wrote:

[snip]

> 
> Agreed, it was the intermediate state that I wasn't keen on of structures defined
> but then given 0 size.  I'd rather just look at them all once.  If that sometimes
> means introducing a file that isn't even referenced for a few patches, that's
> fine by me.

I've pushed v2 which hopefully addressed most of the feedback from you (it also
should fix some of the BIOS table failures from v1). My next plan is to
implement a few more firmware commands, and work on supporting interleaving
using the work from Phil. I will repost to the list after that.

https://gitlab.com/bwidawsk/qemu/-/tree/cxl-2.0v2



* Re: [RFC PATCH 18/25] hw/cxl/device: Add a memory device (8.2.8.5)
  2020-11-25 16:53       ` Ben Widawsky
@ 2020-11-26  6:36         ` Markus Armbruster
  2020-11-30 17:07           ` Ben Widawsky
  0 siblings, 1 reply; 64+ messages in thread
From: Markus Armbruster @ 2020-11-26  6:36 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Eduardo Habkost, Michael S. Tsirkin, Vishal Verma, qemu-devel,
	Dr. David Alan Gilbert, Igor Mammedov, Paolo Bonzini,
	Dan Williams, Richard Henderson

Ben Widawsky <ben.widawsky@intel.com> writes:

> On 20-11-13 08:47:59, Markus Armbruster wrote:
>> Eric Blake <eblake@redhat.com> writes:
>> 
>> > On 11/10/20 11:47 PM, Ben Widawsky wrote:
>> >> A CXL memory device (AKA Type 3) is a CXL component that contains some
>> >> combination of volatile and persistent memory. It also implements the
>> >> previously defined mailbox interface as well as the memory device
>> >> firmware interface.
>> >> 
>> >> The following example will create a 256M device in a 512M window:
>> >> 
>> >> -object "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M"
>> >> -device "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"
>> >> 
>> >> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
>> >> ---
>> >
>> >> +++ b/qapi/machine.json
>> >> @@ -1394,6 +1394,7 @@
>> >>  { 'union': 'MemoryDeviceInfo',
>> >>    'data': { 'dimm': 'PCDIMMDeviceInfo',
>> >>              'nvdimm': 'PCDIMMDeviceInfo',
>> >> +            'cxl': 'PCDIMMDeviceInfo',
>> >>              'virtio-pmem': 'VirtioPMEMDeviceInfo',
>> >>              'virtio-mem': 'VirtioMEMDeviceInfo'
>> >>            }
>> >
>> > Missing documentation of the new data type, and the fact that it will be
>> > introduced in 6.0.
>> 
>> Old wish list item: improve the QAPI schema frontend to flag this.
>> 
>
> "Introduced in 6.0" - quite the optimist. Kidding aside, this is the area where
> I could use some feedback. CXL Type 3 memory devices can contain both volatile
> and persistent memory at the same time. As such, I think I'll need a new type to
> represent that, but I'd love to know how best to accomplish that.

We can help.  Tell us what information you want to provide in variant
'cxl'.  If it's a superset of an existing variant, give us just the
delta.

>> >                     Also, Markus has been trying to get rid of so-called
>> > "simple unions" in favor of "flat unions" - every time we modify an
>> > existing simple union, it is worth asking if it is time to first flatten
>> > this.
>> 
>> 0. Simple unions can be transformed into flat unions.  The
>> transformation can either preserve the nested wire format, or flatten
>> it.  See docs/devel/qapi-code-gen.txt "A simple union can always be
>> re-written as a flat union ..."
>> 
>> 1. No new simple unions.
>> 
>> 2. Existing simple unions that can be flattened without breaking
>> backward compatibility have long been flattened.
>> 
>> 3. The remaining simple unions are part of QMP, where we need to
>> preserve the wire format.  We could turn them into flat union preserving
>> the wire format.  Only worthwhile if we kill simple unions and simplify
>> scripts/qapi/.  Opportunity to make the flat union syntax less
>> cumbersome.  Not done due to lack of time.
>> 
>> 4. Kevin and I have been experimenting with ways to provide both flat
>> and nested wire format.  This would pave the way for orderly deprecation
>> of the nested wire format.  May not be practical for QMP output.
>> 
>
> So is there anything for me to do here?

No.  Extending an existing simple union is okay.

We should not add new ones.  We should think twice before we add new
uses of existing ones.




* Re: [RFC PATCH 18/25] hw/cxl/device: Add a memory device (8.2.8.5)
  2020-11-26  6:36         ` Markus Armbruster
@ 2020-11-30 17:07           ` Ben Widawsky
  2020-12-01 17:06             ` Markus Armbruster
  0 siblings, 1 reply; 64+ messages in thread
From: Ben Widawsky @ 2020-11-30 17:07 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Eduardo Habkost, Michael S. Tsirkin, Vishal Verma, qemu-devel,
	Dr. David Alan Gilbert, Paolo Bonzini, Igor Mammedov,
	Dan Williams, Richard Henderson

On 20-11-26 07:36:23, Markus Armbruster wrote:
> Ben Widawsky <ben.widawsky@intel.com> writes:
> 
> > On 20-11-13 08:47:59, Markus Armbruster wrote:
> >> Eric Blake <eblake@redhat.com> writes:
> >> 
> >> > On 11/10/20 11:47 PM, Ben Widawsky wrote:
> >> >> A CXL memory device (AKA Type 3) is a CXL component that contains some
> >> >> combination of volatile and persistent memory. It also implements the
> >> >> previously defined mailbox interface as well as the memory device
> >> >> firmware interface.
> >> >> 
> >> >> The following example will create a 256M device in a 512M window:
> >> >> 
> >> >> -object "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M"
> >> >> -device "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"
> >> >> 
> >> >> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> >> >> ---
> >> >
> >> >> +++ b/qapi/machine.json
> >> >> @@ -1394,6 +1394,7 @@
> >> >>  { 'union': 'MemoryDeviceInfo',
> >> >>    'data': { 'dimm': 'PCDIMMDeviceInfo',
> >> >>              'nvdimm': 'PCDIMMDeviceInfo',
> >> >> +            'cxl': 'PCDIMMDeviceInfo',
> >> >>              'virtio-pmem': 'VirtioPMEMDeviceInfo',
> >> >>              'virtio-mem': 'VirtioMEMDeviceInfo'
> >> >>            }
> >> >
> >> > Missing documentation of the new data type, and the fact that it will be
> >> > introduced in 6.0.
> >> 
> >> Old wish list item: improve the QAPI schema frontend to flag this.
> >> 
> >
> > "Introduced in 6.0" - quite the optimist. Kidding aside, this is the area where
> > I could use some feedback. CXL Type 3 memory devices can contain both volatile
> > and persistent memory at the same time. As such, I think I'll need a new type to
> > represent that, but I'd love to know how best to accomplish that.
> 
> We can help.  Tell us what information you want to provide in variant
> 'cxl'.  If it's a superset of an existing variant, give us just the
> delta.
> 

I'm not sure what the best way to do this in QEMU is, so I'm not sure what to
specify as the delta. A CXL memory device can have both volatile and persistent
memory. Currently, a CXL memory device implements the TYPE_MEMORY_DEVICE
interface. So I believe the shortest path is a MemoryDeviceInfo that can have
two memory devices associated with it, but I don't know if that's the best
path.


> >> >                     Also, Markus has been trying to get rid of so-called
> >> > "simple unions" in favor of "flat unions" - every time we modify an
> >> > existing simple union, it is worth asking if it is time to first flatten
> >> > this.
> >> 
> >> 0. Simple unions can be transformed into flat unions.  The
> >> transformation can either preserve the nested wire format, or flatten
> >> it.  See docs/devel/qapi-code-gen.txt "A simple union can always be
> >> re-written as a flat union ..."
> >> 
> >> 1. No new simple unions.
> >> 
> >> 2. Existing simple unions that can be flattened without breaking
> >> backward compatibility have long been flattened.
> >> 
> >> 3. The remaining simple unions are part of QMP, where we need to
> >> preserve the wire format.  We could turn them into flat union preserving
> >> the wire format.  Only worthwhile if we kill simple unions and simplify
> >> scripts/qapi/.  Opportunity to make the flat union syntax less
> >> cumbersome.  Not done due to lack of time.
> >> 
> >> 4. Kevin and I have been experimenting with ways to provide both flat
> >> and nested wire format.  This would pave the way for orderly deprecation
> >> of the nested wire format.  May not be practical for QMP output.
> >> 
> >
> > So is there anything for me to do here?
> 
> No.  Extending an existing simple union is okay.
> 
> We should not add new ones.  We should think twice before we add new
> uses of existing ones.
> 
> 



* Re: [RFC PATCH 18/25] hw/cxl/device: Add a memory device (8.2.8.5)
  2020-11-30 17:07           ` Ben Widawsky
@ 2020-12-01 17:06             ` Markus Armbruster
  0 siblings, 0 replies; 64+ messages in thread
From: Markus Armbruster @ 2020-12-01 17:06 UTC (permalink / raw)
  To: Ben Widawsky
  Cc: Eduardo Habkost, Michael S. Tsirkin, Vishal Verma, qemu-devel,
	Dr. David Alan Gilbert, Igor Mammedov, Paolo Bonzini,
	Dan Williams, Richard Henderson

Ben Widawsky <ben@bwidawsk.net> writes:

> On 20-11-26 07:36:23, Markus Armbruster wrote:
>> Ben Widawsky <ben.widawsky@intel.com> writes:
>> 
>> > On 20-11-13 08:47:59, Markus Armbruster wrote:
>> >> Eric Blake <eblake@redhat.com> writes:
>> >> 
>> >> > On 11/10/20 11:47 PM, Ben Widawsky wrote:
>> >> >> A CXL memory device (AKA Type 3) is a CXL component that contains some
>> >> >> combination of volatile and persistent memory. It also implements the
>> >> >> previously defined mailbox interface as well as the memory device
>> >> >> firmware interface.
>> >> >> 
>> >> >> The following example will create a 256M device in a 512M window:
>> >> >> 
>> >> >> -object "memory-backend-file,id=cxl-mem1,share,mem-path=cxl-type3,size=512M"
>> >> >> -device "cxl-type3,bus=rp0,memdev=cxl-mem1,id=cxl-pmem0,size=256M"
>> >> >> 
>> >> >> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
>> >> >> ---
>> >> >
>> >> >> +++ b/qapi/machine.json
>> >> >> @@ -1394,6 +1394,7 @@
>> >> >>  { 'union': 'MemoryDeviceInfo',
>> >> >>    'data': { 'dimm': 'PCDIMMDeviceInfo',
>> >> >>              'nvdimm': 'PCDIMMDeviceInfo',
>> >> >> +            'cxl': 'PCDIMMDeviceInfo',
>> >> >>              'virtio-pmem': 'VirtioPMEMDeviceInfo',
>> >> >>              'virtio-mem': 'VirtioMEMDeviceInfo'
>> >> >>            }
>> >> >
>> >> > Missing documentation of the new data type, and the fact that it will be
>> >> > introduced in 6.0.
>> >> 
>> >> Old wish list item: improve the QAPI schema frontend to flag this.
>> >> 
>> >
>> > "Introduced in 6.0" - quite the optimist. Kidding aside, this is the area where
>> > I could use some feedback. CXL Type 3 memory devices can contain both volatile
>> > and persistent memory at the same time. As such, I think I'll need a new type to
>> > represent that, but I'd love to know how best to accomplish that.
>> 
>> We can help.  Tell us what information you want to provide in variant
>> 'cxl'.  If it's a superset of an existing variant, give us just the
>> delta.
>> 
>
> I'm not sure what the best way to do this in QEMU is, so I'm not sure what to
> specify as the delta. A CXL memory device can have both volatile and persistent
> memory. Currently, a CXL memory device implements the TYPE_MEMORY_DEVICE
> interface. So I believe the shortest path is a MemoryDeviceInfo that can have
> two memory devices associated with it, but I don't know if that's the best
> path.

Perhaps a CXL device should contain two sub-devices implementing
TYPE_MEMORY_DEVICE.  Paolo, what do you think?

If yes, one of the existing variants fits the bill, I guess.

If no, I have more ramblings to offer.

query-memory-devices returns a MemoryDeviceInfo for each realized device
that implements interface TYPE_MEMORY_DEVICE.  The interface provides
callback ->fill_device_info() to fill in the MemoryDeviceInfo.  This is
its only use.

This means TYPE_MEMORY_DEVICE places no obvious constraints on how the
callbacks use MemoryDeviceInfo.  Each callback can pick whatever variant
it wants.  This also means *your* callback can pick a new one, which you
define however you want.
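
The dispatch described above can be sketched as a toy model. This is a
sketch only, not QEMU code: the Python classes, the attribute names, and the
contents of the 'cxl' variant (including "volatile-size" and
"persistent-size") are hypothetical, and the 'dimm' payload shows only a
subset of the real members. What it is meant to show is that each device's
callback decides on its own which variant it emits.

```python
# Toy model of the query-memory-devices dispatch.  This is NOT QEMU code:
# the classes, attribute names and the 'cxl' payload are hypothetical.

class MemoryDevice:
    """Stand-in for a device implementing TYPE_MEMORY_DEVICE."""

    def __init__(self, dev_id, addr, size, memdev, realized=True):
        self.dev_id = dev_id
        self.addr = addr
        self.size = size
        self.memdev = memdev
        self.realized = realized

    def fill_device_info(self):
        raise NotImplementedError


class Dimm(MemoryDevice):
    def fill_device_info(self):
        # Picks the existing 'dimm' variant (only a subset of members shown).
        return {"type": "dimm",
                "data": {"id": self.dev_id, "addr": self.addr,
                         "size": self.size, "memdev": self.memdev}}


class CxlType3(MemoryDevice):
    def __init__(self, dev_id, addr, volatile_size, persistent_size, memdev):
        super().__init__(dev_id, addr, volatile_size + persistent_size, memdev)
        self.volatile_size = volatile_size
        self.persistent_size = persistent_size

    def fill_device_info(self):
        # Picks a new, made-up 'cxl' variant that reports both capacities.
        return {"type": "cxl",
                "data": {"id": self.dev_id, "addr": self.addr,
                         "size": self.size, "memdev": self.memdev,
                         "volatile-size": self.volatile_size,
                         "persistent-size": self.persistent_size}}


def query_memory_devices(devices):
    """One info record per realized device, built by that device's callback."""
    return [dev.fill_device_info() for dev in devices if dev.realized]
```

Nothing in query_memory_devices() inspects the variant, which is the point:
the only contract in this model is "return some MemoryDeviceInfo".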

What if there are unobvious (and unwritten) constraints?

The existing variants overlap:

* All provide the device's ID: optional member @id

* All provide a physical address (base address, I suppose) and size,
  but the member names differ (argh!): @addr, @size vs. @memaddr, @size

* All provide the memory backend: member @memdev

The members that overlap by necessity (i.e. any conceivable
TYPE_MEMORY_DEVICE will have them) should be common members, not variant
members.  Requires converting the simple union to the equivalent flat
union.
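
To illustrate the difference on the wire, here is a small Python sketch. The
sample values (IDs, addresses, backend paths) are made up, and the two
payloads list only a subset of the members of the 'dimm' and 'virtio-pmem'
variants. The helper names flatten() and common_members() are hypothetical
and exist only for this example.

```python
# Nested (simple union) records: the payload sits under "data".
# Sample values are invented for illustration.
dimm = {"type": "dimm",
        "data": {"id": "dimm0", "addr": 0x100000000,
                 "size": 0x8000000, "memdev": "/objects/mem0"}}
pmem = {"type": "virtio-pmem",
        "data": {"id": "pmem0", "memaddr": 0x200000000,
                 "size": 0x10000000, "memdev": "/objects/mem1"}}


def flatten(nested):
    """Turn {"type": T, "data": {...}} into {"type": T, ...} (flat format)."""
    flat = {"type": nested["type"]}
    flat.update(nested["data"])
    return flat


def common_members(infos):
    """Member names that appear in every record's payload."""
    return set.intersection(*(set(info["data"]) for info in infos))
```

With these two samples, common_members([dimm, pmem]) yields id, size and
memdev; the address drops out only because of the @addr vs. @memaddr naming
difference noted above.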

Do these members overlap by necessity?  Paolo, Igor?

[...]




* Re: [RFC PATCH 00/25] Introduce CXL 2.0 Emulation
  2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
                   ` (25 preceding siblings ...)
  2020-11-16 17:21 ` [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Jonathan Cameron
@ 2020-12-04 14:27 ` Daniel P. Berrangé
  26 siblings, 0 replies; 64+ messages in thread
From: Daniel P. Berrangé @ 2020-12-04 14:27 UTC (permalink / raw)
  To: Ben Widawsky, Michael S. Tsirkin, Marcel Apfelbaum
  Cc: Vishal Verma, Dan Williams, qemu-devel

Just copying in the two primary QEMU maintainers for the PCI subsystem
to bring it to their attention.

On Tue, Nov 10, 2020 at 09:46:59PM -0800, Ben Widawsky wrote:
> Introduce emulation of Compute Express Link 2.0, which was released
> today at https://www.computeexpresslink.org/.
> 
> I've pushed a branch here: https://gitlab.com/bwidawsk/qemu/-/tree/cxl-2.0
> 
> The emulation has been critical to get the Linux enabling started
> (https://lore.kernel.org/linux-cxl/), it would be an ideal place to land
> regression tests for different topology handling, and there may be applications
> for this emulation as a way for a guest to manipulate its address space relative
> to different performance memories. I am new to QEMU development, so please
> forgive and point me in the right direction if I severely misinterpreted where a
> piece of infrastructure belongs.
> 
> Three of the five CXL component types are emulated with some level of functionality:
> host bridge, root port, and memory device. Upstream ports and downstream ports
> aren't implemented (the two components needed to make up a switch).
> 
> CXL 2.0 is built on top of PCIe (see spec for details). As a result, much of the
> implementation utilizes existing PCI paradigms. To implement the host bridge,
> I've chosen to use PXB (PCI Expander Bridge). It seemed to be the most natural
> fit even though it doesn't directly map to how hardware will work. For
> persistent capacity of the memory device, I utilized the memory subsystem
> (hw/mem).
> 
> We have 3 reasons why this work is valuable:
> 1. OS driver development and testing
> 2. OS driver regression testing
> 3. Possible guest support for HDMs
> 
> As mentioned above there are three benefits to carrying this enabling in
> upstream QEMU:
> 
> 1. Linux driver feature development benefits from emulation both due to
> a lack of initial hardware availability, but also, as is seen with
> NVDIMM/PMEM emulation, there is value in being able to share
> topologies with system-software developers even after hardware is
> available.
> 
> 2. The Linux kernel's unit test suite for NVDIMM/PMEM ended up injecting fake
> resources via custom modules (nfit_test). In retrospect a QEMU emulation of
> nfit_test capabilities would have made the test environment more portable, and
> allowed for easier community contributions of example configurations.
> 
> 3. This is still being fleshed out, but in short it provides a standardized
> mechanism for the guest to provide feedback to the host about size and placement
> needs of the memory. After the host gives the guest a physical window mapping to
> the CXL device, the emulated HDM decoders allow the guest a way to tell the host
> how much it wants and where. There are likely simpler ways to do this, but
> they'd require inventing a new interface and you'd need to have diverging driver
> code in the guest programming of the HDM decoder vs. the host. Since we've
> already done this work, why not use it?
> 
> There is quite a long list of work to do for full spec compliance, but I don't
> believe that any of it precludes merging. Off the top of my head:
> - Main host bridge support (WIP)
> - Interleaving
> - Better Tests
> - Huge swaths of firmware functionality
> - Hot plug support
> - Emulating volatile capacity
> 
> The flow of the patches in general is to define all the data structures and
> registers associated with the various components in a top down manner. Host
> bridge, component, ports, devices. Then, the actual implementation is done in
> the same order.
> 
> The summary is:
> 1-8: Put infrastructure in place for emulation of the components.
> 9-11: Create the concept of a CXL bus and plumb into PXB
> 12-16: Implement host bridges
> 17: Implement a root port
> 18: Implement a memory device
> 19: Implement HDM decoders
> 20-24: ACPI bits
> 25: Start working on enabling the main host bridge
> 
> Ben Widawsky (23):
>   hw/pci/cxl: Add a CXL component type (interface)
>   hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5)
>   hw/cxl/device: Introduce a CXL device (8.2.8)
>   hw/cxl/device: Implement the CAP array (8.2.8.1-2)
>   hw/cxl/device: Add device status (8.2.8.3)
>   hw/cxl/device: Implement basic mailbox (8.2.8.4)
>   hw/cxl/device: Add memory devices (8.2.8.5)
>   hw/pxb: Use a type for realizing expanders
>   hw/pci/cxl: Create a CXL bus type
>   hw/pxb: Allow creation of a CXL PXB (host bridge)
>   acpi/pci: Consolidate host bridge setup
>   hw/pci: Plumb _UID through host bridges
>   hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142)
>   acpi/pxb/cxl: Reserve host bridge MMIO
>   hw/pxb/cxl: Add "windows" for host bridges
>   hw/cxl/rp: Add a root port
>   hw/cxl/device: Add a memory device (8.2.8.5)
>   hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12)
>   acpi/cxl: Add _OSC implementation (9.14.2)
>   acpi/cxl: Create the CEDT (9.14.1)
>   Temp: acpi/cxl: Add ACPI0017 (CEDT awareness)
>   WIP: i386/cxl: Initialize a host bridge
>   qtest/cxl: Add very basic sanity tests
> 
> Jonathan Cameron (1):
>   Temp: Add the PCI_EXT_ID_DVSEC definition to the qemu pci_regs.h copy.
> 
> Vishal Verma (1):
>   acpi/cxl: Introduce a compat-driver UUID for CXL _OSC
> 
>  MAINTAINERS                               |   6 +
>  hw/Kconfig                                |   1 +
>  hw/acpi/Kconfig                           |   5 +
>  hw/acpi/cxl.c                             | 198 +++++++++++++
>  hw/acpi/meson.build                       |   1 +
>  hw/arm/virt.c                             |   1 +
>  hw/core/machine.c                         |  26 ++
>  hw/core/numa.c                            |   3 +
>  hw/cxl/Kconfig                            |   3 +
>  hw/cxl/cxl-component-utils.c              | 192 +++++++++++++
>  hw/cxl/cxl-device-utils.c                 | 293 +++++++++++++++++++
>  hw/cxl/cxl-mailbox-utils.c                | 139 +++++++++
>  hw/cxl/meson.build                        |   5 +
>  hw/i386/acpi-build.c                      |  87 +++++-
>  hw/i386/microvm.c                         |   1 +
>  hw/i386/pc.c                              |   2 +
>  hw/mem/Kconfig                            |   5 +
>  hw/mem/cxl_type3.c                        | 334 ++++++++++++++++++++++
>  hw/mem/meson.build                        |   1 +
>  hw/meson.build                            |   1 +
>  hw/pci-bridge/Kconfig                     |   5 +
>  hw/pci-bridge/cxl_root_port.c             | 231 +++++++++++++++
>  hw/pci-bridge/meson.build                 |   1 +
>  hw/pci-bridge/pci_expander_bridge.c       | 209 +++++++++++++-
>  hw/pci-bridge/pcie_root_port.c            |   6 +-
>  hw/pci/pci.c                              |  32 ++-
>  hw/pci/pcie.c                             |  30 ++
>  hw/ppc/spapr.c                            |   2 +
>  include/hw/acpi/cxl.h                     |  27 ++
>  include/hw/boards.h                       |   2 +
>  include/hw/cxl/cxl.h                      |  30 ++
>  include/hw/cxl/cxl_component.h            | 181 ++++++++++++
>  include/hw/cxl/cxl_device.h               | 199 +++++++++++++
>  include/hw/cxl/cxl_pci.h                  | 155 ++++++++++
>  include/hw/pci/pci.h                      |  15 +
>  include/hw/pci/pci_bridge.h               |  25 ++
>  include/hw/pci/pci_bus.h                  |   8 +
>  include/hw/pci/pci_ids.h                  |   1 +
>  include/standard-headers/linux/pci_regs.h |   1 +
>  monitor/hmp-cmds.c                        |  15 +
>  qapi/machine.json                         |   1 +
>  tests/qtest/cxl-test.c                    |  93 ++++++
>  tests/qtest/meson.build                   |   4 +
>  43 files changed, 2547 insertions(+), 30 deletions(-)
>  create mode 100644 hw/acpi/cxl.c
>  create mode 100644 hw/cxl/Kconfig
>  create mode 100644 hw/cxl/cxl-component-utils.c
>  create mode 100644 hw/cxl/cxl-device-utils.c
>  create mode 100644 hw/cxl/cxl-mailbox-utils.c
>  create mode 100644 hw/cxl/meson.build
>  create mode 100644 hw/mem/cxl_type3.c
>  create mode 100644 hw/pci-bridge/cxl_root_port.c
>  create mode 100644 include/hw/acpi/cxl.h
>  create mode 100644 include/hw/cxl/cxl.h
>  create mode 100644 include/hw/cxl/cxl_component.h
>  create mode 100644 include/hw/cxl/cxl_device.h
>  create mode 100644 include/hw/cxl/cxl_pci.h
>  create mode 100644 tests/qtest/cxl-test.c
> 
> -- 
> 2.29.2
> 
> 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




Thread overview: 64+ messages
2020-11-11  5:46 [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 01/25] Temp: Add the PCI_EXT_ID_DVSEC definition to the qemu pci_regs.h copy Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 02/25] hw/pci/cxl: Add a CXL component type (interface) Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 03/25] hw/cxl/component: Introduce CXL components (8.1.x, 8.2.5) Ben Widawsky
2020-11-16 12:03   ` Jonathan Cameron
2020-11-16 19:19     ` Ben Widawsky
2020-11-17 12:29       ` Jonathan Cameron
2020-11-24 23:09         ` Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 04/25] hw/cxl/device: Introduce a CXL device (8.2.8) Ben Widawsky
2020-11-16 13:07   ` Jonathan Cameron
2020-11-16 21:11     ` Ben Widawsky
2020-11-17 14:21       ` Jonathan Cameron
2020-11-11  5:47 ` [RFC PATCH 05/25] hw/cxl/device: Implement the CAP array (8.2.8.1-2) Ben Widawsky
2020-11-16 13:11   ` Jonathan Cameron
2020-11-16 18:08     ` Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 06/25] hw/cxl/device: Add device status (8.2.8.3) Ben Widawsky
2020-11-16 13:16   ` Jonathan Cameron
2020-11-16 21:18     ` Ben Widawsky
2020-11-17 14:24       ` Jonathan Cameron
2020-11-11  5:47 ` [RFC PATCH 07/25] hw/cxl/device: Implement basic mailbox (8.2.8.4) Ben Widawsky
2020-11-16 13:46   ` Jonathan Cameron
2020-11-16 21:42     ` Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 08/25] hw/cxl/device: Add memory devices (8.2.8.5) Ben Widawsky
2020-11-16 16:37   ` Jonathan Cameron
2020-11-16 21:45     ` Ben Widawsky
2020-11-17 14:31       ` Jonathan Cameron
2020-11-11  5:47 ` [RFC PATCH 09/25] hw/pxb: Use a type for realizing expanders Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 10/25] hw/pci/cxl: Create a CXL bus type Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 11/25] hw/pxb: Allow creation of a CXL PXB (host bridge) Ben Widawsky
2020-11-16 16:44   ` Jonathan Cameron
2020-11-16 22:01     ` Ben Widawsky
2020-11-17 14:33       ` Jonathan Cameron
2020-11-11  5:47 ` [RFC PATCH 12/25] acpi/pci: Consolidate host bridge setup Ben Widawsky
2020-11-12 17:46   ` Ben Widawsky
2020-11-16 16:45   ` Jonathan Cameron
2020-11-11  5:47 ` [RFC PATCH 13/25] hw/pci: Plumb _UID through host bridges Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 14/25] hw/cxl/component: Implement host bridge MMIO (8.2.5, table 142) Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 15/25] acpi/pxb/cxl: Reserve host bridge MMIO Ben Widawsky
2020-11-16 16:54   ` Jonathan Cameron
2020-11-11  5:47 ` [RFC PATCH 16/25] hw/pxb/cxl: Add "windows" for host bridges Ben Widawsky
2020-11-13  0:49   ` Ben Widawsky
2020-11-23 19:12     ` Philippe Mathieu-Daudé
2020-11-11  5:47 ` [RFC PATCH 17/25] hw/cxl/rp: Add a root port Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 18/25] hw/cxl/device: Add a memory device (8.2.8.5) Ben Widawsky
2020-11-12 18:37   ` Eric Blake
2020-11-13  7:47     ` Markus Armbruster
2020-11-25 16:53       ` Ben Widawsky
2020-11-26  6:36         ` Markus Armbruster
2020-11-30 17:07           ` Ben Widawsky
2020-12-01 17:06             ` Markus Armbruster
2020-11-11  5:47 ` [RFC PATCH 19/25] hw/cxl/device: Implement MMIO HDM decoding (8.2.5.12) Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 20/25] acpi/cxl: Add _OSC implementation (9.14.2) Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 21/25] acpi/cxl: Introduce a compat-driver UUID for CXL _OSC Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 22/25] acpi/cxl: Create the CEDT (9.14.1) Ben Widawsky
2020-11-16 17:15   ` Jonathan Cameron
2020-11-16 22:05     ` Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 23/25] Temp: acpi/cxl: Add ACPI0017 (CEDT awareness) Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 24/25] WIP: i386/cxl: Initialize a host bridge Ben Widawsky
2020-11-11  5:47 ` [RFC PATCH 25/25] qtest/cxl: Add very basic sanity tests Ben Widawsky
2020-11-16 17:21 ` [RFC PATCH 00/25] Introduce CXL 2.0 Emulation Jonathan Cameron
2020-11-16 18:06   ` Ben Widawsky
2020-11-17 14:09     ` Jonathan Cameron
2020-11-25 18:29       ` Ben Widawsky
2020-12-04 14:27 ` Daniel P. Berrangé
